I often get questions related to Blastocystis epidemiology research, and many of these are 'how-to' questions.
As announced, I've chosen to dedicate a separate post to listing some easy-to-use tools for subtyping Blastocystis from humans and animals.
First, I want to draw your attention to the YouTube video that I made; it takes you through various important steps of subtyping and introduces you to the online database that can be used to call subtypes by BLASTing batches of fasta files - provided that they are the right ones! And what do I mean by 'right ones'? Well, in order to get subtype information in a split second, you need DNA sequences covering the first 500 base pairs (5'-end) of the Blastocystis small subunit (SSU) rRNA gene.
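Before pasting a batch into the database, it can help to check that each sequence is at least long enough to cover that region. Below is a minimal sketch in Python; the function names and the 500-base cut-off constant are my own illustrative choices (the cut-off is simply taken from the figure above), and length alone does not prove that a sequence covers the right part of the gene.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


# Assumption: cut-off taken from the "first 500 base pairs" mentioned above.
MIN_BARCODE_LENGTH = 500


def long_enough(records, min_length=MIN_BARCODE_LENGTH):
    """Return {header: True/False}: does each sequence reach min_length bases?"""
    return {name: len(seq) >= min_length for name, seq in records.items()}
```

Sequences flagged as too short are worth a second look before you submit them; sequences that pass still need to be the right gene region.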
The online query database can be found here, and as you can see, it has a 'Sequence and profiles definition' section and an 'Isolates database' section; for now, never mind the latter. Now, to test this, click 'Sequence and profiles definition', click the 'Sequence query' link, then copy the following fasta sequence and paste it into the query box:
>gi|359391562|gb|JN682513.1|
CTGCCAGTAGTCATACGCTCGTCTCAAAGATTAAGCCATGCATGTGTAAGTATAAATATTTGACTTTGAA
ACTGCGAATGGCTCATTATATCAGTTATAGTTTATTTGATGAACAATACTACTTGGATAACCGTAGTAAT
TCTAGAGCTAATACATGACAAAATCCTCGACTTTGAAGAGGTGTATTTATTAGAATGAAACCAAGAGACT
TCGGTCTATTTGTGAGTAATAATAACTAATCGTATCGCATGCTTAGGTAGCGATATGTCTTTCAAGTTTC
TGCCCTATCAGCTTTGGATGGTAGTGTATTGGACTACCATGGCAGTAACGGGTAACGAAGAATTTGGGTT
CGATTTCGGAGAGGGAGCCTGAGAGATGGCTACCACATCCAAGGAAGGCAGCAGGCGCGTAAATTACCCA
ATCCTGACATAGGGAGGTAGTGACAATAAATCACAATGCGGAACTATTAGTTTTGCAATTGGATTGAGAA
CAATGTACAAATGTTATCGATAAACAATTGGAGGGCAAGTCTGGTGCCAGCAGCCGCGGTAATTCCAGCT
CCAATAGCGTATATTAACGTTGTTGCAGTTAAAAAGCTCGTAGTTGAATTGAAGTGAACTTGGATTGATG
TGATCTTCGGATGACGTGAATCAAAGTTGACTCTTTCCAAAGTCAATACATTGGTATTCATTTATCTTTG
TAT
Submit your query, and then what you see is this:
This means that a 100% identity match was found and that what you pasted in was ST4, allele no. 94. This allele belongs to the rare genotype of Blastocystis sp. ST4.
Now, even if you have a non-Blastocystis sequence, you will sometimes get a result, provided the gene region is the correct one, and this is where you need to be very careful. Below is a sequence of Saccharomyces cerevisiae, which may be amplified by the barcoding primers; try pasting it into the query box and submitting it for analysis:
>Saccharomyces_cerevisiae_(J01353)
TATCTGGTTGATCCTGCCAGTAGTCATATGCTTGTCTCAAAGATTAAGCCATGCATGTCTAAGTATAAGCAATTTATACAGTGAAACTGCGAATGGCTCATTAAATCAGTTATCGTTTATTTGATAGTTCCTTTACTACA
TGGTATAACCGTGGTAATTCTAGAGCTAATACATGCTTAAAATCTCGACCCTTTGGAAGAGATGTATTTATTAGATAAAAAATCAATGTCTTCGGACTCTTTGATGATTCATAATAACTTTTCGAATCGCATGGCCTTGT
GCTGGCGATGGTTCATTCAAATTTCTGCCCTATCAACTTTCGATGGTAGGATAGTGGCCTACCATGGTTTCAACGGGTAACGGGGAATAAGGGTTCGATTCCGGAGAGGGAGCCTGAGAAACGGCTACCACATCCAAGGA
AGGCAGCAGGCGCGCAAATTACCCAATCCTAATTCAGGGAGGTAGTGACAATAAATAACGATACAGGGCCCATTCGGGTCTTGTAATTGGAATGAGTACAATGTAAATACCTTAACGAGGAACAATTGGAGGGCAAGTCT
GGTGCCAGCAGCCGCGGTAATTCCAGCTCCAATAGCGTATATTAAAGTTGTTGCAGTTAAAAAGCTCGTAGTTGAACTTTGGGCCCGGTTGGCCGGTCCGATTTTTTCGTGTACTGGATTTCCAACGGGGCCTTTCCTTC
What you'll see is this:
As you can see, there are many mismatches in the alignment, so this is not allele 42 (ST4). Of course not; it's not even Blastocystis! This is why I suggest you always run a nucleotide BLAST of your fasta files against the NCBI database (use this link). Only if they match Blastocystis should you go ahead and call the subtype and the allele using the pubmlst.org/blastocystis database.
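The 'BLAST first, subtype second' rule can be written down as a small filter. This is only a sketch: it assumes you have already run the NCBI BLAST yourself and noted the title of the top hit for each sequence, and the function names and the dictionary layout are hypothetical. It only checks the organism name in the hit title, not how good the match is.

```python
def is_blastocystis_hit(top_hit_title):
    """True if the title of the top BLAST hit mentions Blastocystis."""
    return "blastocystis" in top_hit_title.lower()


def sequences_to_subtype(top_hits):
    """Given {sequence name: title of top BLAST hit}, return the names
    (sorted) that should go on to subtype and allele calling."""
    return sorted(name for name, title in top_hits.items()
                  if is_blastocystis_hit(title))
```

With the two examples from this post, only the Blastocystis sequence would be passed on; the yeast sequence would be dropped before it ever reaches the subtype database.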
If you have a Blastocystis sequence that exhibits polymorphism compared to the reference sequences in the Blastocystis database, it may be due to one of two reasons: 1) The sequence may be unclear and/or edited erroneously, or 2) the sequence represents a new allele or a new subtype.
This means that if your sequence does not match those in the database 100%, I suggest you have a meticulous look at it, and if there are unclear sections, then re-sequence the whole lot - preferably bidirectionally. If you end up with a clear sequence which still exhibits one or more polymorphisms, then please submit it to the database - you can do so by contacting the curator, who is basically me.
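The decision steps above can be sketched in a few lines of Python. The function names and the returned labels are illustrative only, and I am assuming here that 'unclear sections' show up as anything other than A, C, G or T in the edited sequence (for instance N or other ambiguity codes); looking at the chromatogram itself is still the real test.

```python
UNAMBIGUOUS = set("ACGT")


def ambiguous_positions(sequence):
    """Return the 1-based positions of bases other than A, C, G or T."""
    return [i for i, base in enumerate(sequence.upper(), start=1)
            if base not in UNAMBIGUOUS]


def triage(sequence, identity_percent):
    """Suggest the next step for a Blastocystis sequence, given the best
    percent identity reported by the subtype database."""
    if identity_percent >= 100.0:
        return "call allele"
    if ambiguous_positions(sequence):
        return "re-sequence"
    return "submit to curator"
```

For example, `triage("ACNT", 99.5)` returns `"re-sequence"`, whereas a clean sequence with the same identity returns `"submit to curator"`.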
What you want is sequences looking like this:
For sequence editing you may want to use CHROMAS or FinchTV. These are good for editing individual nucleotide sequences. When I do bidirectional sequencing, or when I have multiple sequences covering a gene (for instance when I'm sequencing complete SSU rRNA genes), I use the STADEN Package. Installing it may be a pain, though; for starters, make sure you use the right browser. Once it has been installed, it works brilliantly, and the SOP I made for it is available below (please note that I made this SOP a couple of years ago; more recent software versions are available).
When is a subtype a novel subtype? Well, we addressed this question in our recent review in Advances in Parasitology. If you cannot access this journal, I suggest you look it up in the LSHTM Online Library, where you can find the pre-print version (go here to download). If you think you're dealing with a new subtype (less than 97-98% identity to reference sequences in GenBank), I suggest you look up this blog post. Importantly, please note that there is an alignment of reference sequences (representing all the 17 subtypes currently known) here; however, it requires access to the journal (look up 'Supplementary content', where there's a notepad file you can download). I hope that colleagues will use this alignment for phylogenetic analysis of Blastocystis SSU rRNA genes, since this is one important step towards further standardisation of Blastocystis terminology.
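To make the 97-98% rule of thumb concrete, here is a rough sketch of how percent identity between two already-aligned sequences could be computed and compared with a cut-off. The function names are mine, the 97% default is just the lower end of the range given above, and skipping gapped columns is a simplifying assumption; BLAST computes identity over its own local alignment, so its numbers may differ slightly.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two sequences from the same alignment.
    Assumption: columns with a gap ('-') in either sequence are skipped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Assumption: lower end of the 97-98% range mentioned above.
NOVELTY_THRESHOLD = 97.0


def possibly_novel_subtype(identities, threshold=NOVELTY_THRESHOLD):
    """True if the best identity to any reference sequence is below threshold."""
    return max(identities) < threshold
```

A sequence whose best match is, say, 95% would be flagged as a candidate, but that is only a prompt to do proper phylogenetic analysis against the reference alignment, not a subtype call in itself.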
Other useful free online software:
For quick nucleotide alignments (which group your sequences in clusters) you can use MultAlin - choose the 'DNA - 5-0' option from the alignment parameters drop-down menu. Trick: I usually do alignments in MultAlin, and once I get the alignment, I choose the 'Results as fasta files' option (scroll to the bottom of the page). This gives you a set of aligned fasta sequences that you can copy and paste directly into the 'build DNA alignment' function in MEGA6. There you can, for instance, search for specific DNA signatures (an option that is unfortunately not available in the MultAlin output), and you can do phylogeny too.
And so, for alignment and phylogeny, I recommend MEGA6 or any more recent version.
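If you just want to know which sequences in an aligned fasta export carry a given DNA signature, without opening MEGA, something like the following sketch would do. It is not how MEGA does its search; the function name and the input format (a dictionary of sequence name to aligned sequence, with '-' for gaps) are assumptions for illustration.

```python
def find_signature(aligned_records, signature):
    """Return {name: 1-based start position in the ungapped sequence}
    for every aligned sequence that contains the signature."""
    hits = {}
    target = signature.upper()
    for name, aligned_seq in aligned_records.items():
        # Remove alignment gaps so a signature split by gaps is still found.
        ungapped = aligned_seq.replace("-", "").upper()
        index = ungapped.find(target)
        if index != -1:
            hits[name] = index + 1
    return hits
```

Note that the reported position refers to the ungapped sequence, not to the alignment column, and that only exact matches are found.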
Useful papers:
Scicluna SM, Tawari B, & Clark CG (2006). DNA barcoding of Blastocystis. Protist, 157 (1), 77-85 PMID: 16431158
Stensvold CR (2013). Comparison of sequencing (barcode region) and sequence-tagged-site PCR for Blastocystis subtyping. Journal of Clinical Microbiology, 51 (1), 190-4 PMID: 23115257
Alfellani MA, Taner-Mulla D, Jacob AS, Imeede CA, Yoshikawa H, Stensvold CR, & Clark CG (2013). Genetic diversity of Blastocystis in livestock and zoo animals. Protist, 164 (4), 497-509 PMID: 23770574
Stensvold CR (2013). Blastocystis: Genetic diversity and molecular methods for diagnosis and epidemiology. Tropical Parasitology, 3 (1), 26-34 PMID: 23961438
Alfellani MA, Stensvold CR, Vidal-Lapiedra A, Onuoha ES, Fagbenro-Beyioku AF, & Clark CG (2013). Variable geographic distribution of Blastocystis subtypes and its potential implications. Acta Tropica, 126 (1), 11-8 PMID: 23290980
Clark CG, van der Giezen M, Alfellani MA, & Stensvold CR (2013). Recent developments in Blastocystis research. Advances in Parasitology, 82, 1-32 PMID: 23548084
Stensvold CR, Ahmed UN, Andersen LO, & Nielsen HV (2012). Development and evaluation of a genus-specific, probe-based, internal-process-controlled real-time PCR assay for sensitive and specific detection of Blastocystis spp. Journal of Clinical Microbiology, 50 (6), 1847-51 PMID: 22422846
Stensvold CR, Suresh GK, Tan KS, Thompson RC, Traub RJ, Viscogliosi E, Yoshikawa H, & Clark CG (2007). Terminology for Blastocystis subtypes--a consensus. Trends in Parasitology, 23 (3), 93-6 PMID: 17241816
Moreover, the London School of Hygiene and Tropical Medicine Online Library currently comprises 25 papers on Blastocystis, most of which can be accessed for free (pre-print version) here.
This blog post might be updated later on, and so you may want to subscribe to blog updates - you can do so using the designated function in the sidebar. If you have any suggestions on how to improve this post, feel free to contact me.