Tuesday, May 3, 2016

Allergens: Types, sources.......

GENERAL
######PLANTS#####
Ole e: Olea europaea (Common olive)
Sin a: Sinapis alba (White mustard)
2S albumin: Ricinus communis (Castor bean)
Pectate lyase: Cryptomeria japonica (Japanese cedar) (Cupressus japonica)
Expansin-B1: Zea mays (Maize)
Superoxide dismutase: Olea europaea (Common olive)
Small rubber particle protein: Hevea brasiliensis (Para rubber tree)
Exopolygalacturonase: Platanus acerifolia (London plane tree)
Major pollen allergen Bet v 1-A: Betula pendula (European white birch) (Betula verrucosa)
Profilin-2: Phleum pratense (Common timothy)
Pectinesterase 1: Olea europaea (Common olive)
Non-specific lipid-transfer protein: Ambrosia artemisiifolia (Short ragweed)
Profilin-1: Phleum pratense (Common timothy)
Pectate lyase 1: Ambrosia artemisiifolia (Short ragweed)
Pectate lyase 2: Ambrosia artemisiifolia (Short ragweed)
Bet v 1-L: Betula pendula (European white birch) (Betula verrucosa)
Amb a 3: Ambrosia artemisiifolia var. elatior (Short ragweed)
Pectinesterase 2: Olea europaea (Common olive)
Phl p 5b: Phleum pratense (Common timothy)
Polygalacturonase: Cryptomeria japonica (Japanese cedar)
Expansin-B11: Zea mays (Maize)
Lol p 1: Lolium perenne (Perennial ryegrass)
Profilin-4: Corylus avellana (European hazel) (Corylus maxima)
Actinidain: Actinidia deliciosa (Kiwi)
Polygalacturonase: Juniperus ashei (Ozark white cedar)
Esterase: Hevea brasiliensis (Para rubber tree)
Protein DOWNSTREAM OF FLC: Arabidopsis thaliana (Mouse-ear cress)
Major allergen Api g 1: Apium graveolens (Celery)
Alpha-amylase inhibitor BMAI-1: Hordeum vulgare (Barley)
Superoxide dismutase [Cu-Zn]: Olea europaea (Common olive)
Lactoylglutathione lyase: Oryza sativa subsp. japonica (Rice)
Profilin-1: Zea mays (Maize)
Ambrosia artemisiifolia (Short ragweed)
Bra j 1-E: Brassica juncea (Indian mustard) (Sinapis juncea)
Glucan endo-1,3-beta-glucosidase: Prunus avium (Cherry)
Non-specific lipid-transfer protein: Apium graveolens (Celery)
Dau c 1: Daucus carota (Wild carrot)
Pollen allergen KBG 41: Poa pratensis (Kentucky bluegrass)
Lol p 5a: Lolium perenne (Perennial ryegrass)
Profilin-2-5: Olea europaea (Common olive)
######FUNGI#####
60S acidic ribosomal protein P2: Alternaria alternata (Alternaria rot fungus)
Alcohol dehydrogenase 1: Candida albicans (Yeast)
Enolase: Cladosporium herbarum
Glucoamylase: Trichophyton mentagrophytes
Cla h 7: Cladosporium herbarum
Ribonuclease mitogillin: Aspergillus fumigatus
Fructose-bisphosphate aldolase: Candida albicans (strain SC5314 / ATCC MYA-2876) (Yeast)
60S acidic ribosomal protein P2: Cladosporium herbarum
Enolase: Alternaria alternata (Alternaria rot fungus)

######NEMATODE#####
Polyprotein ABA-1: Ascaris suum (Pig roundworm)
Major allergen Ani s 1: Anisakis simplex (Herring worm)
######ARTHROPODS#####
Pilosulin-3a: Myrmecia pilosula (Jack jumper ant) (Australian jumper ant)
Peptidase 1: Psoroptes ovis (Sheep scab mite)
Hyaluronidase A: Vespula vulgaris (Yellow jacket) (Wasp)
Eur m 3: Euroglyphus maynei (Mayne's house dust mite)
Peptidase 1: Dermatophagoides pteronyssinus (European house dust mite)
Mite group 2 allergen Lep d: Lepidoglyphus destructor (Storage mite)
Peptidase 1: Dermatophagoides farinae (American house dust mite)
Mite group 2 allergen Der p 2: Dermatophagoides pteronyssinus (European house dust mite)
Phospholipase A1: Solenopsis invicta (Red imported fire ant)
Melittin: Apis mellifera (Honeybee)
Pilosulin-1: Myrmecia pilosula (Jack jumper ant) (Australian jumper ant)
Hyaluronidase: Apis mellifera (Honeybee)
Aspartic protease Bla g 2: Blattella germanica (German cockroach) (Blatta germanica)
Peptidase 1: Euroglyphus maynei (Mayne's house dust mite)
Phospholipase A1 1: Dolichovespula maculata (Bald-faced hornet)
Venom allergen 3: Solenopsis invicta (Red imported fire ant)
Der p 3: Dermatophagoides pteronyssinus (European house dust mite)
Der f 3: Dermatophagoides farinae (American house dust mite)
Arginine kinase AK: Penaeus monodon (Giant tiger prawn)
Venom dipeptidyl peptidase 4: Apis mellifera (Honeybee)
Phospholipase A1: Vespula maculifrons (Eastern yellow jacket) (Wasp)
######FISH#####
Parvalbumin beta: Gadus morhua subsp. callarias (Baltic cod) 
######BIRDS#####
Ovalbumin: Gallus gallus (Chicken)
Ovotransferrin: Gallus gallus (Chicken)
Lysozyme C: Gallus gallus (Chicken)
Ovomucoid: Gallus gallus (Chicken)
######MAMMALS#####
Minor allergen Can f 2: Canis lupus familiaris (Dog) (Canis familiaris)
Major allergen I polypeptide chain: Felis catus (Cat)
Allergen Fel d 4: Felis catus (Cat) (Felis silvestris catus)
Major urinary protein: Rattus norvegicus (Rat)
Allergen Bos d 2: Bos taurus (Bovine)
Protein S100-A7: Bos taurus (Bovine)
Latherin: Equus caballus (Horse)
Major allergen Equ c 1: Equus caballus (Horse)
-----------------
SPECIFIC
#Cashew, Pistachio
Vicilin-like protein, 2S albumin, Ana o 2, 11S globulin
#Almond, peach
pru1, Pru du, Non-specific lipid-transfer protein
#Tomato
Profilin, pectate lyase
#Peanut
Conglutin-7, Defensin, Ara h, Profilin, Non-specific lipid-transfer protein
#Avocado
Endochitinase
#Kiwi
Actinidain, Cysteine proteinase inhibitor, Thaumatin-like protein, Act d
Kiwellin, Kirola, Non-specific lipid-transfer protein, Endochitinase, Bet v
#Persimmon
Expansin, Non-specific lipid-transfer protein
#Celery
Non-specific lipid-transfer protein, Chlorophyll a-b binding protein, Api g, Profilin
#Kidney bean
Pathogenesis-related protein 1
Pectate lyase
#Egg
Ovalbumin
Ovotransferrin
Lysozyme C
Ovomucoid
Serum albumin
#Shrimp, lobster
Tropomyosin
Arginine kinase
Pen a
Lit v
Sarcoplasmic calcium-binding protein
#Mussel
Tropomyosin
Endo-beta-1,4-glucanase
#Fish
Alpha-enolase
Beta-enolase
Parvalbumin beta
Fructose-bisphosphate aldolase A
#Octopus 
Arginine kinase
#Silk moth
SCP-related protein
Arginine kinase
Apolipoprotein of lipid transfer
#Rubber
Patatin

Monday, May 2, 2016

Tools to learn and work to do......

Alignment...
Alignment of sequencing reads to a reference genome is a core step in the analysis workflows for many high-throughput sequencing assays, including ChIP-Seq, RNA-seq, ribosome profiling and others.
Bowtie stores the reference genome sequence in an extremely economical data structure called the FM index, which allows it to be searched rapidly.
TopHat uses Bowtie as its alignment ‘engine’.
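
To make the FM index idea concrete, here is a toy sketch of backward search over a Burrows-Wheeler transform. It is not Bowtie's implementation (the real index is compressed and sampled); the function names build_fm_index and find, and the short example sequence, are made up for illustration.

```python
def build_fm_index(text):
    """Build a toy, uncompressed FM index for text."""
    text = text + "$"  # sentinel that sorts before A, C, G, T
    # Suffix array by naive sorting; fine for a toy, far too slow for genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    # first[c]: number of characters in text that sort before c.
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return first, occ, sa


def find(pattern, index):
    """Return sorted 0-based start positions of pattern in the indexed text."""
    first, occ, sa = index
    lo, hi = 0, len(sa)  # current range of matching suffixes
    for ch in reversed(pattern):  # extend the match one character leftwards
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


index = build_fm_index("ACGTACGA")
print(find("ACG", index))
```

Each pattern character costs two table lookups regardless of genome length, which is the property that makes this kind of index fast to search.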

Mauve?
#To run the Mauve GUI from within Terminal 
#Add directory with executables to Mauve path
cd Mauve/
ls
cd mauve_2.3.1/
./Mauve 

File
Align with progressive Mauve
Select the executable folder (by navigation)
Mauve Console starts running (1-2 minutes for two full genomes)
Viewing the alignment
Zoom in    Ctrl + Up
Scroll display left    Ctrl + Left
Scroll display right    Ctrl + Right
Large left scroll    Shift + Ctrl + Left
Large right scroll    Shift + Ctrl + Right
Tools ---------> Export ---------> Export SNPs

Indel determination..
What's the logic used to pull information from a VCF file?
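
One possible answer, as a rough sketch only: read the REF and ALT columns of each VCF record and compare their lengths. The function names and the example records below are hypothetical; real variant callers and tools apply more rules (normalisation, symbolic alleles, filters) than this.

```python
def classify_variant(ref, alt):
    """Label one REF/ALT pair as 'SNP', 'insertion', 'deletion' or 'other'."""
    if alt.startswith("<") or alt in (".", "*"):
        return "other"  # symbolic, missing or spanning-deletion allele
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "other"  # same length but longer than 1 base, e.g. an MNP


def parse_vcf_lines(lines):
    """Yield (chrom, pos, ref, alt, label) for every ALT allele in the records."""
    for line in lines:
        if line.startswith("#"):  # skip meta lines and the column header
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        chrom, pos, _id, ref, alts = fields[:5]
        for alt in alts.split(","):  # one record may list several ALT alleles
            yield chrom, int(pos), ref, alt, classify_variant(ref, alt)


# Made-up records, only to show the shape of the data.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tAT\tA\t50\tPASS\t.",
    "chr1\t300\t.\tC\tCGG,T\t50\tPASS\t.",
]
for row in parse_vcf_lines(example):
    print(row)
```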


RPS-BLAST
#Reverse Position-Specific BLAST (RPS-BLAST), used at the command line
#extract just these *.smp files from the large archive (cdd.tar.gz).
#run the formatrpsdb tool to build a database:
formatrpsdb -t Sigma.v001 -i Sigma.pn -o T -f 9.82 -n Sigma -S 100.0
#This creates eight files (Sigma.aux, Sigma.loo, Sigma.phr, Sigma.pin, Sigma.psd, Sigma.psi, Sigma.psq and Sigma.rps), which together make up the database.
#Compare
rpsblast -i rpoD.faa -d Sigma -e 0.00001
rpsblast -i rpoD.faa -d Sigma -e 0.00001 -o rpoD.txt
rpsblast -i rpoD.faa -d Sigma -e 0.00001 -m 7 -o rpoD.xml
#If comparing with Pfam database
rpsblast -i rpoD.faa -d Pfam -e 0.00001
#comparing entire genome with the Sigma database made earlier.
rpsblast -i NC_003197.faa -d Sigma -e 0.00001 -o NC_003197.txt
rpsblast -i NC_003197.faa -d Sigma -e 0.00001 -m 7 -o NC_003197.xml

#Analyzing RPS-BLAST output with Biopython
#For the smaller xml file
from Bio.Blast import NCBIXML
for record in NCBIXML.parse(open("rpoD.xml")):
    print("QUERY: %s" % record.query)
    for align in record.alignments:
        print(" MATCH: %s..." % align.title[:60])
        for hsp in align.hsps:
            print(" HSP, e=%f, from position %i to %i"
                  % (hsp.expect, hsp.query_start, hsp.query_end))
            #Short alignments are printed in full, long ones are truncated
            if hsp.align_length < 60:
                print(" Query: %s" % hsp.query)
                print(" Match: %s" % hsp.match)
                print(" Sbjct: %s" % hsp.sbjct)
            else:
                print(" Query: %s..." % hsp.query[:57])
                print(" Match: %s..." % hsp.match[:57])
                print(" Sbjct: %s..." % hsp.sbjct[:57])
print("Done")


#For the large xml file
from Bio.Blast import NCBIXML
for record in NCBIXML.parse(open("NC_003197.xml")):
    #We want to ignore any queries with no search results:
    if record.alignments:
        print("QUERY: %s..." % record.query[:60])
        for align in record.alignments:
            for hsp in align.hsps:
                print(" %s HSP, e=%f, from position %i to %i"
                      % (align.hit_id, hsp.expect, hsp.query_start, hsp.query_end))
print("Done")
That should give you the following output - note there is only 



#Running RPS-BLAST from Biopython
#Adjust the file locations to match your own:
rpsblast_db = "C:\\Blast\\cdd\\Sigma"
rpsblast_exe = "C:\\Blast\\bin\\rpsblast.exe"

query_filename = "rpoD.faa"
#query_filename = "NC_003197.faa"

E_VALUE_THRESH = 0.00001 #Adjust the expectation cut-off here

#NCBIStandalone wraps the legacy BLAST tools and is only found in older Biopython releases
from Bio.Blast import NCBIStandalone
output_handle, error_handle = NCBIStandalone.rpsblast(
    rpsblast_exe, rpsblast_db, query_filename, expectation=E_VALUE_THRESH)


from Bio.Blast import NCBIXML
for record in NCBIXML.parse(output_handle):
    #We want to ignore any queries with no search results:
    if record.alignments:
        print("QUERY: %s..." % record.query[:60])
        for align in record.alignments:
            for hsp in align.hsps:
                print(" %s HSP, e=%f, from position %i to %i"
                      % (align.hit_id, hsp.expect, hsp.query_start, hsp.query_end))
                assert hsp.expect <= E_VALUE_THRESH
print("Done")
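
#Alternative without NCBIStandalone
If the NCBIStandalone module is not available (as far as I know it is missing from recent Biopython releases), the same legacy rpsblast command line used earlier in these notes can be launched with Python's standard subprocess module. This is only a sketch: build_rpsblast_cmd and run_rpsblast are made-up helper names, and the flags (-i, -d, -e, -m 7, -o) are simply the ones from the commands above, so they assume the legacy rpsblast binary.

```python
import subprocess


def build_rpsblast_cmd(exe, db, query, evalue="0.00001", xml_out=None):
    """Return the legacy rpsblast command line as a list of arguments."""
    cmd = [exe, "-i", query, "-d", db, "-e", evalue]
    if xml_out:
        # -m 7 asks for XML output, -o names the output file
        cmd += ["-m", "7", "-o", xml_out]
    return cmd


def run_rpsblast(exe, db, query, xml_out, evalue="0.00001"):
    """Run rpsblast and return the XML file name; raises if the tool fails."""
    subprocess.check_call(build_rpsblast_cmd(exe, db, query, evalue, xml_out))
    return xml_out


# Only show the command here; calling run_rpsblast needs the binary and database.
print(" ".join(build_rpsblast_cmd("rpsblast", "Sigma", "rpoD.faa", xml_out="rpoD.xml")))
```

The XML file written this way can then be opened and passed to NCBIXML.parse as in the earlier examples.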