COI barcoding

According to Hebert et al. (2003), animal species can be distinguished by comparing a 648-bp fragment of the mitochondrial cytochrome c oxidase subunit I (COI) gene. The fragment length is a technical choice that reflects the read-length limits of Sanger sequencing. COI barcoding serves two purposes:

  • detection (delimitation) of new species
  • identification of known species

In both cases, query sequences are compared with sequences stored in reference libraries like GenBank and BOLD (Barcode of Life Data Systems).
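As a rough illustration, comparing a query with a reference library can be sketched as a nearest-match search. The sketch below assumes pre-aligned sequences of equal length and uses the uncorrected p-distance (the fraction of differing sites) as a simplification. The function names `p_distance` and `best_match` and the toy library are hypothetical and are not part of GenBank or BOLD.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of aligned sites at which two sequences differ (uncorrected p-distance)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def best_match(query: str, library: dict[str, str]) -> tuple[str, float]:
    """Return the label of the closest reference sequence and its distance to the query."""
    label = min(library, key=lambda name: p_distance(query, library[name]))
    return label, p_distance(query, library[label])


# Hypothetical toy library: made-up 20-bp fragments, not real COI data.
toy_library = {
    "Species A": "ACGTACGTACGTACGTACGT",
    "Species B": "ACGTTCGTACGAACGTACCT",
}
query = "ACGTACGTACGTACGTACGA"

label, distance = best_match(query, toy_library)
print(label, distance)
```

For identification, the closest reference supplies the species name. For detection, a query that is far from every reference is a candidate for a species not yet in the library.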

COI sequences were chosen as species markers because most animal species (cnidarians excepted) are separated from their congeners by COI sequence divergences greater than 2%, whereas divergences among conspecifics are usually below 2% (Hebert et al., 2003). Misleadingly, this phenomenon is referred to as the “barcode gap” (Meyer & Paulay, 2005). More than 94% of morphologically defined bird species have been confirmed using COI as a species-level marker gene (Wang et al., 2020). In a comparative avian mitogenomic study, COI proved to be the gene with the least rate heterogeneity across avian orders, and thus the closest to a “molecular clock” (Pacheco et al., 2011). The suitability of COI sequences as indicators of species limits therefore simply reflects the fact that they serve as a reliable time scale. [This should be seen as a strong argument for my plea to rigorously apply Chrono-Taxonomics.]
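The 2% criterion can be made concrete with a small sketch that compares the largest distance within species to the smallest distance between species. It assumes pre-aligned sequences and uses the uncorrected p-distance rather than a model-corrected distance. The sequences are synthetic, and the names `mutate`, `barcode_gap` and `THRESHOLD` are hypothetical, chosen only for this illustration.

```python
from itertools import combinations

THRESHOLD = 0.02  # the 2% divergence figure quoted in the text

# Deterministic substitution table used only to build synthetic test data.
SWAP = {"A": "C", "C": "G", "G": "T", "T": "A"}


def mutate(seq: str, positions: list[int]) -> str:
    """Return a copy of seq with a substitution at each listed position."""
    chars = list(seq)
    for i in positions:
        chars[i] = SWAP[chars[i]]
    return "".join(chars)


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned sites at which two sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(samples: list[tuple[str, str]]) -> tuple[float, float]:
    """Return (max intraspecific distance, min interspecific distance) over all pairs."""
    intra, inter = [], []
    for (sp1, seq1), (sp2, seq2) in combinations(samples, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(seq1, seq2))
    return max(intra), min(inter)


# Synthetic 100-bp sequences: two individuals for each of two made-up species.
base = "ACGT" * 25
samples = [
    ("Species A", base),
    ("Species A", mutate(base, [10])),
    ("Species B", mutate(base, [0, 20, 40, 60, 80])),
    ("Species B", mutate(base, [0, 20, 40, 60, 80, 90])),
]

max_intra, min_inter = barcode_gap(samples)
has_gap = max_intra < THRESHOLD < min_inter
print(max_intra, min_inter, has_gap)
```

In this toy data set the individuals within each species differ at 1% of sites and the two species differ at 5% or more, so the 2% threshold separates them cleanly. Real data sets do not always show such a clean separation.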


Čandek K, and Kuntner M (2014), DNA barcoding gap: reliable species identification over morphological and geographical scales, Mol. Ecol. Resour. 15, 268-277. (abstract)

DeSalle R, and Goldstein P (2019), Review and interpretation of trends in DNA barcoding, Front. Ecol. Evol. 7, e:302. (free pdf)

Hebert PDN, Cywinska A, Ball SL, and deWaard JR (2003), Biological identifications through DNA barcodes, Proc. R. Soc. Lond. B 270, 313-321. (pdf)

Hebert PDN, Stoeckle MY, Zemlak TS, and Francis CM (2004), Identification of birds through DNA barcodes, PLOS Biol. 2, e:312. (pdf)

Kekkonen M, and Hebert PDN (2014), DNA barcode-based delineation of putative species: efficient start for taxonomic workflows, Mol. Ecol. Resour. 14, 706-715. (pdf)

Lijtmaer DA, Kerr KCR, Stoeckle MY, and Tubaro PL (2012), DNA barcoding birds: from field collection to data analysis, in: “DNA Barcodes” (Kress W, and Erickson D, eds.), Methods in Molecular Biology, vol. 858. (link)

Mallo D, and Posada D (2016), Multilocus inference of species trees and DNA barcoding, Philos. Trans. R. Soc. Lond. B 371, e:20150335. (pdf)

Meiklejohn KA, Damaso N, and Robertson JM (2019), Assessment of BOLD and GenBank – their accuracy and reliability for the identification of biological materials, PLOS ONE 14, e:0217084. (pdf)

Meyer CP, and Paulay G (2005), DNA barcoding: error rates based on comprehensive sampling, PLOS Biol. 3, e:422. (pdf)

Pons J, Barraclough TG, Gomez-Zurita J, Cardoso A, Duran DP, Hazell S, Kamoun S, Sumlin WD, and Vogler AP (2006), Sequence-based species delimitation for the DNA taxonomy of undescribed species, Syst. Biol. 55, 595-609. (free pdf)

Ratnasingham S, and Hebert PDN (2007), BOLD: the Barcode of Life Data System, Mol. Ecol. Notes 7, 355-364. (pdf)

Savolainen V, Cowan RS, Vogler AP, Roderick GK, and Lane R (2005), Towards writing the encyclopaedia of life: an introduction to DNA barcoding, Philos. Trans. R. Soc. Lond. B 360, 1805-1811. (abstract)

Shen YY, Chen X, and Murphy RW (2013), Assessing DNA barcoding as a tool for species identification and data quality control, PLOS ONE 8, e:57125. (pdf)