*Adapted from an article by Ben Callahan appearing in The ISME Journal: Multidisciplinary Journal of Microbial Ecology.
The most common technique used to analyze microbiomes is to perform DNA sequencing on specific “barcode” genes that allow researchers to create a sort of census of the bacteria (or archaea or fungi) present in a community. However, sequencing errors, together with the sheer complexity of microbial communities, present a daunting challenge for microbiome analysis.
For years now most researchers have employed “operational taxonomic units” or OTUs as a tool to reduce the impact of errors and overall complexity. In short, researchers group similar DNA sequences together into OTUs, and then study those OTUs rather than the DNA sequences directly.
In this paper we make the case that this standard practice of studying OTUs is standing in the way of really great science, and that we should go back to studying DNA sequences (or “exact sequence variants”) directly.
Recent scientific progress has made this transition possible. Algorithms developed in recent years address sequencing errors powerfully enough that researchers can work with exact DNA sequences without being overwhelmed by those errors.
We argue that, now that working with exact DNA sequences is possible, it is also what we should do, because exact sequences lead to more robust science.
The key reason behind our argument is that OTUs in one study are not the same as OTUs in another study, and therefore findings based on OTUs are hard to directly replicate or refute, even though replication and refutation are a key part of the scientific method! This is because OTUs are made by grouping similar sequences together, and the exact groups you construct will depend in potentially sensitive ways on the set of DNA sequences in your dataset. Consider the problem of creating voting districts: there are a hundred ways to divide a state into equal-population districts, and everyone who tries will end up with a different set of districts. So it is with OTUs in different studies.
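To make the dependence on the dataset concrete, here is a minimal sketch of a greedy, abundance-ordered clustering. It is a toy illustration, not the method of any particular OTU tool: the function names, the eight-letter sequences, the read counts, and the one-mismatch threshold (a stand-in for the similarity cutoffs used in practice) are all assumptions made up for this example.

```python
def hamming(a, b):
    """Count mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def greedy_otus(counts, max_mismatches=1):
    """Toy greedy clustering (hypothetical, for illustration only).

    Sequences are visited from most to least abundant. Each one joins the
    first existing centroid within max_mismatches, otherwise it founds a
    new OTU and becomes that OTU's centroid.
    """
    centroids = []
    otus = []
    # Sort by descending abundance; break ties by sequence for determinism.
    for seq in sorted(counts, key=lambda s: (-counts[s], s)):
        for i, centroid in enumerate(centroids):
            if hamming(seq, centroid) <= max_mismatches:
                otus[i].add(seq)
                break
        else:
            centroids.append(seq)
            otus.append({seq})
    return otus


# Three made-up sequences forming a chain: A-B and B-C differ at one
# position, A-C differ at two.
A = "ACGTACGT"
B = "ACGTACGA"
C = "ACGTACCA"

# The same three sequences appear in both studies; only abundances differ.
study_1 = {A: 100, B: 10, C: 5}
study_2 = {A: 5, B: 10, C: 100}

otus_1 = greedy_otus(study_1)  # B is grouped with A, C stands alone
otus_2 = greedy_otus(study_2)  # B is grouped with C, A stands alone
```

In this sketch sequence B lands in a different OTU in each study even though the sequences themselves are identical, so a finding about "the OTU containing B" means something different in each dataset.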
DNA sequences are different, because a DNA sequence represents a real physical object rather than an abstract clustering. As a result, findings based on exact DNA sequences can be directly tested in subsequent experiments, which lets our science progress rigorously, and results from different labs or studies can be easily combined.
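As a rough sketch of what "easily combined" can look like in practice, the example below merges two count tables by using the exact sequence itself as the key. The sequences, counts, and names (`lab_a`, `lab_b`, `merge_tables`) are hypothetical, and the sketch assumes both labs sequenced the same gene region and trimmed reads to the same length, so that identical organisms yield identical strings.

```python
from collections import Counter


def merge_tables(*tables):
    """Sum read counts across studies, keyed by exact sequence."""
    combined = Counter()
    for table in tables:
        combined.update(table)  # adds counts for sequences already seen
    return combined


# Made-up count tables from two labs (sequence -> number of reads).
lab_a = {"ACGTACGT": 120, "ACGTACGA": 30}
lab_b = {"ACGTACGA": 45, "TTGTACGA": 10}

combined = merge_tables(lab_a, lab_b)

# Sequences observed by both labs can be identified by simple set overlap.
shared = set(lab_a) & set(lab_b)
```

No re-clustering step is needed here, because the sequence string is a label that means the same thing in every dataset.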
In a sign that the time is right for exact sequences to replace OTUs, our arguments were echoed in a recent study [1] published in Nature that was the largest and broadest investigation to date into microbial life on Earth. Citing many of the same reasons we did, especially the ability to directly compare exact sequences between different samples and experiments, this landmark study adopted exact sequences instead of OTUs.
[1] Thompson, Luke R., et al. “A communal catalogue reveals Earth’s multiscale microbial diversity.” Nature 551.7681 (2017).