PECAN: A fast, clustering-free 16S ribosomal RNA gene sequence taxonomic assignment tool
Clustering of sequences into Operational Taxonomic Units (OTUs) has become a mainstream approach to facilitate taxonomic classification of large numbers of 16S rRNA gene sequences. This is partly due to the high computational cost of processing every sequence individually in increasingly large datasets. A primary focus of the field has been the development and improvement of OTU-based sequence clustering methods that rely on distances between each pair of sequences in a dataset. Following OTU-based clustering, representative sequences are commonly classified using tools such as the RDP Naïve Bayesian Classifier (Wang et al., 2007), and the resulting classification is transitively assigned to all sequences in that OTU. However, problems with this strategy exist (Nguyen et al., 2016). We have developed PECAN, a novel per-sequence taxonomic assigner that quickly and accurately classifies millions of 16S rRNA gene sequences using higher-order Markov chain models built from a user-specified set of reference sequences, and therefore does not require OTU clustering.
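To make the idea of per-sequence classification with higher-order Markov chain models concrete, the following is a minimal Python sketch. It is not PECAN's implementation: the function names, the model order of 3, the add-one smoothing, and the toy reference sequences and taxon labels are all assumptions made for illustration. It trains one model per taxon from reference sequences and assigns a query to the taxon whose model gives it the highest log-likelihood, with no clustering step.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_model(sequences, order=3, pseudocount=1.0):
    """Return a function giving log P(base | preceding `order` bases).

    Probabilities are estimated from transition counts in the reference
    sequences, with a pseudocount so unseen transitions are not impossible.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1.0

    def log_prob(context, base):
        seen = counts.get(context, {})
        total = sum(seen.values()) + pseudocount * len(ALPHABET)
        return math.log((seen.get(base, 0.0) + pseudocount) / total)

    return log_prob


def log_likelihood(log_prob, seq, order=3):
    """Sum the log transition probabilities along the query sequence."""
    return sum(log_prob(seq[i - order:i], seq[i]) for i in range(order, len(seq)))


def classify(query, models, order=3):
    """Assign the query to the taxon whose model scores it highest."""
    return max(models, key=lambda taxon: log_likelihood(models[taxon], query, order))


# Hypothetical toy reference set: two made-up taxa with made-up sequences.
references = {
    "taxon_A": ["ACGTACGTACGTACGT", "ACGTACGTACGTACGA"],
    "taxon_B": ["TTGGCCAATTGGCCAA", "TTGGCCAATTGGCCAT"],
}
models = {taxon: train_model(seqs) for taxon, seqs in references.items()}

if __name__ == "__main__":
    print(classify("ACGTACGTACGT", models))
```

Because each query is scored independently against every model, the work scales with the number of sequences rather than with the number of sequence pairs, which is the property that makes a clustering-free approach attractive for large datasets.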
You can download PECAN from GitHub.
Download the poster presented at the 16th International Symposium on Microbial Ecology (ISME-16, August 21-26, 2016) in Montreal, Canada, Session PS08: Cutting-edge methods in microbial ecology. "PECAN: A fast, novel 16S rRNA gene sequence non-clustering based taxonomic assignment tool". Johanna Holm, Pawel Gajer and Jacques Ravel.
The development of PECAN was supported by the National Institute of Allergy and Infectious Diseases and the National Institute of General Medical Sciences of the National Institutes of Health under award numbers U19AI084044, R01AI116799 and R01GM103604.