Protein-coding genes

The avian mitochondrial genome (mitogenome), like that of most vertebrates, is highly conserved and typically contains 13 protein-coding genes. These genes are essential for oxidative phosphorylation (i.e. ATP production) in the mitochondria. They include:

  • ND1, ND2, ND3, ND4, ND4L, ND5, ND6 (NADH dehydrogenase subunits)
  • CYB, also written CYTB (cytochrome b)
  • CO1, CO2, CO3, also written COX1, COX2, COX3 (cytochrome c oxidase subunits)
  • ATP6, ATP8 (ATP synthase subunits)

ATP6

  • ATP8 connection: 10 bp overlap (ATG.AAC.CTA.A, sometimes T replacing C)
  • Start codon: ATG
  • Length: 684 bp (227 amino acids)
  • Indels: none
  • Stop codon: TAA
  • CO3 connection: 1 bp overlap (A)

ATP8

  • tRNA Lys (K) connection: none
  • Start codon: ATG
  • Length: 165 bp or 168 bp (54 or 55 amino acids)
  • Indels: 3 bp between 132 and 133 (codons #44 and #45)
  • Stop codon: TAA
  • ATP6 connection: 10 bp overlap (A.TGA.ACC.TAA, sometimes T replacing C)
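
The 10 bp shared by ATP8 and ATP6 are read in two different frames: ATG.AAC.CTA.A from the ATP6 side and A.TGA.ACC.TAA from the ATP8 side. The following is a minimal Python sketch of that reading, using only the consensus overlap given above; the helper name codons is hypothetical and chosen for illustration.

```python
# The 10 bp shared by the 3' end of ATP8 and the 5' end of ATP6 (consensus from the list above).
OVERLAP = "ATGAACCTAA"

def codons(seq, offset=0):
    """Split seq into complete triplets, starting at the given 0-based offset."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]

# ATP6 frame: the overlap begins with the ATP6 start codon ATG.
atp6_frame = codons(OVERLAP, offset=0)
# ATP8 frame: shifted by one base, so the overlap ends with the ATP8 stop codon TAA.
atp8_frame = codons(OVERLAP, offset=1)

print(atp6_frame)  # ['ATG', 'AAC', 'CTA']
print(atp8_frame)  # ['TGA', 'ACC', 'TAA']
```

In the ATP8 frame the triplet TGA is not a stop: in the vertebrate mitochondrial genetic code it encodes tryptophan.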

CO1

  • tRNA Tyr (Y) connection: none
  • Start codon: GTG (but ATG e.g. in Accipitridae, Jacanidae, Meropidae, Strigidae)
  • Length: 1551 bp (516 amino acids)
  • Indels: none
  • Stop codon: AGG
  • tRNA Ser-UCN (S1) connection: 9 bp overlap (CAA.GAA.AGG, sometimes G replacing A)
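
The AGG stop codon of CO1 only makes sense under the vertebrate mitochondrial genetic code (NCBI translation table 2), in which AGA and AGG terminate translation instead of encoding arginine. As a small sketch, the check below applies that stop set to the 9 bp CO1 tail quoted above; the function name is_mito_stop is hypothetical.

```python
# Stop codons of the vertebrate mitochondrial genetic code (NCBI translation table 2).
MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}

def is_mito_stop(codon):
    """True if the triplet is a complete stop codon under translation table 2."""
    return codon.upper() in MITO_STOPS

# Consensus 3' end of CO1 that overlaps tRNA Ser-UCN (from the list above).
co1_tail = "CAAGAAAGG"
last_codon = co1_tail[-3:]

print(last_codon, is_mito_stop(last_codon))  # AGG True
```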

CO2

  • tRNA Asp (D) connection: 1 bp spacer (C/T)
  • Start codon: ATG/GTG
  • Length: 684 bp or 687 bp (227 or 228 amino acids)
  • Indels: 3 bp between 678 and 682 (codons #226 and #228)
  • Stop codon: TAA
  • tRNA Lys (K) connection: 1 bp spacer (C/T, rarely G)

CO3

  • ATP6 connection: 1 bp overlap (A)
  • Start codon: ATG
  • Length: 784 bp (261 amino acids)
  • Indels: none
  • Stop codon: T
  • tRNA Gly (G) connection: 1 bp spacer (G)
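
A stop codon given as a single T, as here, is an incomplete stop; it is generally understood to be completed to TAA by polyadenylation of the transcript. This is why a length of 784 bp corresponds to 261 amino acids (3 x 261 + 1). The arithmetic relating gene length to amino-acid count can be sketched as follows; the function name amino_acid_count is hypothetical, and the lengths are those listed in this section.

```python
def amino_acid_count(length_bp):
    """Number of encoded amino acids for a gene of the given length.

    Remainder 0: the last triplet is a complete stop codon and is not counted.
    Remainder 1 or 2: the stop codon is incomplete (T or TA), so every
    complete triplet encodes an amino acid.
    """
    full_codons, remainder = divmod(length_bp, 3)
    return full_codons - 1 if remainder == 0 else full_codons

for gene, length_bp in [("ATP6", 684), ("CO1", 1551), ("CO3", 784), ("ND4", 1378)]:
    print(gene, length_bp, amino_acid_count(length_bp))
```

The rule does not apply to ND3 as listed, because its 352 bp include one untranslated nucleotide.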

CYB

  • ND5 connection: none
  • Start codon: ATG
  • Length: 1143 bp (380 amino acids)
  • Indels: none
  • Stop codon: TAA (sometimes TAG)
  • tRNA Thr (T) connection: none

ND1

  • tRNA Leu-UUR (L1) connection: variable (6-15 bp spacer)
  • Start codon: ATG
  • Length: 978 bp (325 amino acids), sometimes 975 bp (324 amino acids)
  • Indels: variable (3 bp deletion at positions 4-6; 3 bp deletion at 10-12; 1 bp deletion at 973, creating new stop codon)
  • Stop codon: AGG, sometimes TAA
  • tRNA Ile (I) connection: usually 2 bp overlap (GG), but none when stop codon TAA

ND2

  • tRNA Met (M) connection: none (rarely 1 bp spacer)
  • Start codon: ATG
  • Length: 1041 bp (346 amino acids)
  • Indels: none
  • Stop codon: TAG, sometimes TAA
  • tRNA Trp (W) connection: 2 bp overlap, usually (AG), sometimes (AA)

ND3

  • tRNA Gly (G) connection: none
  • Start codon: ATG, sometimes GTG
  • Length: 352 bp (116 amino acids)
  • Indels: untranslated 1 bp insertion (mostly C) at 174, sometimes absent
  • Stop codon: TAA, sometimes TAG
  • tRNA Arg (R) connection: none

Comment: ND3 is peculiar in having an extra nucleotide (mostly cytosine) at position 174. The insertion probably belongs to the avian ground pattern but has been lost many times during avian evolution (Jing et al., 2020, suppl. 12). The extra base appears not to be translated: the downstream reading frame and amino-acid sequence are conserved, which is attributed to a translational +1 frameshift (Mindell et al., 1998b; Al-Arab et al., 2017; Andreu-Sánchez et al., 2020).
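
For conceptual translation, the extra nucleotide has to be skipped so that the downstream codons stay in frame. The sketch below shows one way to do this on a synthetic sequence; the function name skip_extra_base and the toy sequence are assumptions for illustration, not real ND3 data.

```python
def skip_extra_base(cds, position=174):
    """Return cds without the nucleotide at the given 1-based position."""
    return cds[:position - 1] + cds[position:]

# Synthetic in-frame sequence: start codon, 115 sense codons, stop codon (351 bp, 116 amino acids).
in_frame = "ATG" + "GCC" * 115 + "TAA"
# Insert an extra C so that it occupies 1-based position 174, giving 352 bp as listed for ND3.
with_extra = in_frame[:173] + "C" + in_frame[173:]

restored = skip_extra_base(with_extra)
print(len(with_extra), len(restored), restored == in_frame)  # 352 351 True
```

Where the insertion is absent, the sequence is already in frame and must be left unchanged.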

ND4

  • ND4L connection: 7 bp overlap (ATG.CTA.A, sometimes T replacing C)
  • Start codon: ATG
  • Length: 1378 bp (459 amino acids)
  • Indels: none
  • Stop codon: T
  • tRNA His (H) connection: none

Comment: it remains unclear whether the three 1 bp insertions (at positions 180/181, 318/319, 390/391) reported in the non-annotated ND4 gene of Stictonetta naevosa (Anatidae, Oxyurinae; GenBank accession number CM021835) are genuine or due to sequencing errors.

ND4L

  • tRNA Arg (R) connection: none
  • Start codon: ATG, rarely GTG
  • Length: 297 bp (98 amino acids)
  • Indels: none
  • Stop codon: TAA
  • ND4 connection: 7 bp overlap (mostly A.TGC.TAA, sometimes T replacing C)

ND5

  • tRNA Leu-CUN (L2) connection: none
  • Start codon: ATG/GTG
  • Length: variable (1806, 1809, 1815, 1818, 1821, 1824 or 1827 bp)
  • Indels: variable (3 bp insertion between 9 and 10; 3 bp deletion at 13-15; 3 bp deletion at 91-93; 3 bp deletion at 619-621; 3 bp deletion at 1528-30; 3 bp deletion at 1531-33; 3 bp deletion at 1534-36; 3 bp deletion at 1540-42; 3 bp deletion at 1810-12; 2 bp deletion at 1816/1817)
  • Stop codon: AGA/TAA
  • CYB connection: highly variable (from 4-12 bp spacers to 1 bp overlap)

ND6

  • tRNA Pro (P) connection: none
  • Start codon: ATG
  • Length: 522 bp (173 amino acids)
  • Indels: none
  • Stop codon: TAG, sometimes AGG or TAA
  • tRNA Glu (E) connection: none

Comment: ND6 is the only protein-coding gene that is encoded on the secondary (-)-strand.
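
Because ND6 lies on the opposite strand, its coding sequence is obtained by reverse-complementing the corresponding stretch of the primary-strand sequence. A minimal sketch, assuming an unambiguous A/C/G/T input; the function name reverse_complement and the six-base example are hypothetical.

```python
# Translation table mapping each base to its complement (unambiguous bases only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

# On the primary strand, an ND6-like start (ATG) and stop (TAG) appear as CAT and CTA.
print(reverse_complement("CTACAT"))  # ATGTAG
```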