This study aims to increase the efficiency and accuracy of metabarcoding for worldwide biodiversity documentation. Biodiversity supports society through medicinal breakthroughs, ecotourism, mental health benefits, and more. To conserve biodiversity, we must understand which species exist and how they are distributed across the globe. One tool used for biodiversity assessment is metabarcoding, in which species information is recovered from DNA sequenced from environmental samples such as soil or water. Species delimitation methods, which computationally determine the cutoff between interspecific and intraspecific diversity from DNA sequence data, are often used to assign DNA sequences to species. While these methods are widely used with DNA barcoding data (single-species sequencing), knowledge of their performance with metabarcoding data is limited. We will assess the accuracy of four common molecular species delimitation methods in estimating species richness from metabarcoding data. First, we used the current literature to define simulation parameters for 1000 metabarcoding DNA sequence datasets. Next, we will subset the simulated data to model various scenarios, including different taxonomic ranks, different patterns of missing data, and data skewed towards a few common species. We will then apply the species delimitation methods to each dataset and compare the expected species richness to the species richness estimated by each method. This will allow us to assess the strengths and weaknesses of each method under the different scenarios and determine the best protocol for using species delimitation with metabarcoding data. Pinpointing the advantages and disadvantages of the different species delimitation methods will allow for better use of metabarcoding data and, in turn, increase the efficiency of biodiversity documentation across the globe.
Assessing Species Delimitation Performance with Metabarcoding Data
Category
Student Abstract Submission