Mitochondrial DNA (mtDNA)

In the avian ground pattern, the mitochondrial genome (mitogenome) consists of 2 rRNA genes, 22 tRNA genes, 13 protein-coding genes, a control region, and (putatively) a tandem duplication.

Circular map depicting the putative ancestral avian mitogenome organisation (not to scale). The tandem duplication (TD), which extends between CR and F, is shown separately. When fully developed, it contains a pseudogene Ψ (considered a degenerate partial copy of CYB), four functional genes, and an extended control region (Urantówka et al., 2020). 

The 22 tRNA genes are short, ranging from 64 to 78 base pairs (bp). Because tRNAs play an essential role in translating mRNA into protein, their sequences are comparatively conserved.

The 13 protein-coding genes are much longer and exhibit a higher rate of nucleotide substitution. For decades, some of these genes (CO1, CYB, ND2) have routinely been used in phylogenetic studies. While individual gene trees derived from the protein-coding genes and control region usually differ from each other and from species trees based on nuclear DNA, phylogenies that are based on entire mitogenomes are mostly concordant with nuclear DNA-based species trees. Because of the observed gene-tree discordance between individual mtDNA genes, phylogenetic studies should no longer rely on limited sets of mitochondrial genes but on the mitogenome as a whole (Meiklejohn et al., 2014; Campillo et al., 2019). Since protein-coding mtDNA has a higher mutation rate than nuclear DNA, mitochondrial genes are particularly suitable for studying shallow (intra- and interspecific) phylogenetic relationships. 

The protein-coding gene ND3 is peculiar in having an extra nucleotide (mostly cytosine) at position 174. Its presence probably belongs to the avian ground pattern, but it has been lost several times during avian evolution. The extra base appears not to be translated, however, as the downstream reading frame and amino-acid sequence are conserved (Mindell et al., 1998b).
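
To illustrate why the downstream reading frame stays intact, the following minimal sketch removes the extra nucleotide before splitting the sequence into codons. It assumes that position 174 is counted 1-based from the first nucleotide of ND3 and that the base is simply skipped. The function names are hypothetical and the procedure is not taken from Mindell et al. (1998b).

```python
from typing import List


def skip_extra_base(nd3: str, position: int = 174) -> str:
    """Return the ND3 sequence without the untranslated extra nucleotide.

    `position` is 1-based (an assumption made for this sketch).
    """
    if not 1 <= position <= len(nd3):
        raise ValueError("position lies outside the sequence")
    # Python slices are 0-based, so position 174 corresponds to index 173.
    return nd3[: position - 1] + nd3[position:]


def codons(seq: str) -> List[str]:
    """Split a sequence into complete codons, ignoring any trailing 1-2 bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i : i + 3] for i in range(0, usable, 3)]
```

With the extra base left in place, every codon downstream of position 174 would be shifted by one nucleotide. After `skip_extra_base`, the codons downstream are the same as in a sequence that lacks the extra nucleotide.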

Summary of avian mitochondrial annotations. Duplicated regions are not considered. In protein-coding genes, partial stop codons (TA and T) serve as stop signals after they are completed to UAA by posttranscriptional polyadenylation. 
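
The completion of partial stop codons can be sketched in a few lines. The example below is a simplified illustration, not an annotation tool: it works on the DNA alphabet (so the completed codon reads TAA, corresponding to UAA in the mRNA), assumes the coding sequence starts in frame, and uses a hypothetical function name.

```python
def complete_stop_codon(cds: str) -> str:
    """Pad a trailing partial stop codon (T or TA) with A residues to give TAA.

    The added A residues stand in for the first bases of the poly-A tail.
    A sequence whose length is already a multiple of three is returned unchanged.
    """
    cds = cds.upper()
    remainder = len(cds) % 3
    if remainder == 0:
        return cds  # the sequence already ends on a complete codon
    tail = cds[-remainder:]
    if tail not in ("T", "TA"):
        raise ValueError(f"unexpected incomplete terminal codon: {tail!r}")
    return cds + "A" * (3 - remainder)
```

For example, `complete_stop_codon("ATGGCCT")` and `complete_stop_codon("ATGGCCTA")` both yield `"ATGGCCTAA"`.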

Tandem duplication

Avian mitogenomes are distinguished from typical vertebrate mitogenomes by the presence of an additional tandem duplication (Urantówka et al., 2018, 2020, 2021; Mackiewicz et al., 2019). Following Haring et al. (1999), this region is sometimes referred to as the “pseudo-control region”. With the exception of the duplicated control region (CR2), which differs slightly from CR1, duplicated genes are often identical to their counterparts, a phenomenon referred to as “concerted evolution” (Urantówka et al., 2021). However, numerous deviations from the original configuration are observed among and within avian orders. Tandem duplications seem to be entirely absent in Galloanserae.

CO1 barcoding

DNA barcoding identifies species by comparing the DNA sequence of an unknown sample with sequences of known species held in public online reference databases. In animals, the sequence used for DNA barcoding is a 648-bp fragment of the mitochondrial CO1 gene. The length of the fragment is determined by the limits of Sanger sequencing.

CO1 was chosen as the barcode marker because most animal species (except cnidarians) are separated from congeneric species by CO1 sequence divergences higher than 2%, while sequence divergences among conspecifics are usually less than 2% (Hebert et al., 2003). This observation is referred to as the “barcode gap” (Meyer & Paulay, 2005). More than 94% of morphologically defined bird species have been confirmed with CO1 as a species-level marker gene (Wang et al., 2020).
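
The 2% criterion can be illustrated with a small calculation. The sketch below computes the uncorrected proportion of differing sites (p-distance) between two aligned barcode sequences and compares it with the threshold. This is a simplification: published barcoding studies often use model-corrected distances instead, and the function names and the default threshold parameter are assumptions made for this example.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Only unambiguous bases are compared; gaps and ambiguity codes are skipped.
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def exceeds_threshold(seq_a: str, seq_b: str, threshold: float = 0.02) -> bool:
    """True if the divergence is above the (assumed) 2% species-level threshold."""
    return p_distance(seq_a, seq_b) > threshold
```

Two aligned 648-bp barcodes that differ at 20 sites have a p-distance of about 3.1% and would fall above the threshold, whereas two that differ at 5 sites (about 0.8%) would fall below it.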

In a comparative avian mitogenomic study, CO1 showed the least rate heterogeneity across avian orders of all genes examined and thus came closest to behaving as a “molecular clock” (Pacheco et al., 2011). This explains its suitability as an indicator of species limits.


