Phylogenetic analyses rely on the alignment of nucleotide sequences, in which presumably homologous nucleotides (orthologous sites) are juxtaposed. While protein-coding DNA sequences can usually be aligned without problems, the rRNA genes and, in particular, the control region may be difficult to align because insertions and deletions (indels) occur in their variable regions. In these cases, conserved sequence blocks serve as landmarks that aid correct alignment (Castresana, 2000). Sequences that cannot be aligned with certainty have to be excluded from the analysis.
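The idea of conserved blocks as alignment landmarks can be illustrated with a minimal Python sketch. It simply reports runs of columns in which every sequence carries the same base and no gap occurs. This is a deliberately strict toy criterion and not the Gblocks procedure of Castresana (2000); the function name, the `min_length` parameter, and the sequences are assumptions made up for illustration.

```python
def conserved_blocks(alignment, min_length=3):
    """Return (start, end) column ranges (0-based, end exclusive) in which
    all sequences share the same base and none has a gap ('-')."""
    n_cols = len(alignment[0])
    if any(len(seq) != n_cols for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    blocks, start = [], None
    for col in range(n_cols):
        column = {seq[col].upper() for seq in alignment}
        conserved = len(column) == 1 and "-" not in column
        if conserved and start is None:
            start = col                      # a conserved run begins here
        elif not conserved and start is not None:
            if col - start >= min_length:    # keep only sufficiently long runs
                blocks.append((start, col))
            start = None
    if start is not None and n_cols - start >= min_length:
        blocks.append((start, n_cols))       # run reaching the alignment end
    return blocks


# Toy alignment (invented sequences, not real data)
toy = [
    "ACGTAC--GGTTAACC",
    "ACGTACTTGGTTAACC",
    "ACGTAC-AGGTTAACC",
]
print(conserved_blocks(toy))  # [(0, 6), (8, 16)]
```

In this toy case the two conserved blocks flank a short variable stretch containing indels, which is the situation in which such blocks help to anchor the alignment.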
A variety of software is available to download nucleotide sequences from online databases, to align them, and to perform phylogenetic analyses (e.g. Cucini et al., 2021). Some programs are bundled into software packages that streamline the workflow (e.g. Román-Palacios, 2023). The resulting phylogenies are published either as simple phylogenetic trees in which branch length is arbitrary and meaningless (=cladograms, =dendrograms), as advanced phylogenetic trees in which branch length is proportional to the number of DNA-base substitutions (=phylograms), or as sophisticated time-calibrated phylogenetic trees in which branch length is proportional to time (=chronograms, =timetrees).
However, no clade-defining apomorphies are presented in molecular phylogenetic studies (whereas in traditional morphological studies clades always had to be justified by apomorphic characters or character states).
The Galloanserae project
To be able to extract clade-defining apomorphies from well-established phylogenies, I retrieved mitochondrial DNA sequences of a number of land- and waterfowl species (superorder Galloanserae) from GenBank for manual alignment in MS Excel. I chose this taxon because it is: (1) quite basal, (2) well studied (at least Galliformes), and (3) known to lack the tandem duplication. The species selection included representatives from all galliform and anseriform families:
In the nucleotide alignments, I arranged the species in columns according to their phylogenetic affinities. Any substitution of one nucleotide for another was highlighted in colour as follows: indels (grey), singletons (orange), convergences (yellow), and apomorphies (red). In those cases where nucleotides differed between Galliformes and Anseriformes, the respective line is marked in black to indicate that a broader outgroup comparison would be necessary to determine which character state is plesiomorphic and which is apomorphic. In addition, intergenic spacers are highlighted in blue and intergenic overlaps in green. Nucleotides that could not be aligned with certainty are shown in italics.
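The highlighting categories described above were assigned manually in MS Excel. As a rough illustration only, the following Python sketch shows how part of that sorting could be automated for a single alignment position. It recognises indels, singletons, and fixed differences between the two orders; any other variable site is merely labelled "shared substitution", because deciding between apomorphy and convergence requires the tree. The function name, the category labels, and the bases are assumptions invented for this example, not the procedure actually used.

```python
from collections import Counter


def classify_site(column, groups):
    """Classify one alignment position.

    column: dict mapping species -> character at this position
    groups: dict mapping species -> group label (here: the order)
    """
    states = {species: base.upper() for species, base in column.items()}
    values = list(states.values())
    if "-" in values:
        return "indel"
    counts = Counter(values)
    if len(counts) == 1:
        return "invariant"
    # Singleton: exactly one species deviates from an otherwise uniform site
    if len(counts) == 2 and min(counts.values()) == 1 and len(values) > 2:
        return "singleton"
    # Fixed difference: every group is uniform internally, but groups differ
    by_group = {}
    for species, base in states.items():
        by_group.setdefault(groups[species], set()).add(base)
    if all(len(bases) == 1 for bases in by_group.values()):
        return "fixed between groups"
    # Needs the tree to decide between apomorphy and convergence
    return "shared substitution"


# Toy data: real species names, but the bases are invented
orders = {
    "Gallus gallus": "Galliformes",
    "Meleagris gallopavo": "Galliformes",
    "Tragopan caboti": "Galliformes",
    "Aythya baeri": "Anseriformes",
    "Aix galericulata": "Anseriformes",
}
seqs = {
    "Gallus gallus": "A-TGT",
    "Meleagris gallopavo": "AACGT",
    "Tragopan caboti": "AACGC",
    "Aythya baeri": "AACAC",
    "Aix galericulata": "AACAC",
}
labels = [
    classify_site({sp: seq[i] for sp, seq in seqs.items()}, orders)
    for i in range(5)
]
print(labels)
```

In terms of the colour scheme, "indel" corresponds to grey, "singleton" to orange, and "fixed between groups" to the lines marked in black; sites labelled "shared substitution" would still have to be checked against the cladogram to separate apomorphies (red) from convergences (yellow).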
For each gene, the highlighted nucleotide alignment table is accompanied by a second table showing a cladogram of Galloanserae with all the identified apomorphies listed for each taxon. Those nucleotide substitutions that change the corresponding amino acid (see below) are highlighted in blue.
For the protein-coding genes, a third table shows the translation products (i.e. amino-acid sequences).
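Translation and the detection of amino-acid-changing substitutions can be sketched in a few lines of Python. The example below uses the vertebrate mitochondrial genetic code (NCBI translation table 2), which differs from the standard code in that AGA and AGG are stop codons, ATA codes for methionine, and TGA codes for tryptophan. The function names and the two short sequences are invented for illustration, and the comparison assumes two gap-free coding sequences in the same reading frame.

```python
from itertools import product

BASES = "TCAG"
# NCBI translation table 2 (vertebrate mitochondrial);
# codons are ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence; unknown codons become 'X'.
    A trailing incomplete codon (e.g. a truncated stop codon) is ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def compare_codons(ref_cds, other_cds):
    """List codon differences between two in-frame, gap-free sequences as
    (codon number, ref codon, other codon, ref aa, other aa, kind)."""
    results = []
    n = min(len(ref_cds), len(other_cds))
    for i in range(0, n - n % 3, 3):
        a, b = ref_cds[i:i + 3].upper(), other_cds[i:i + 3].upper()
        if a == b:
            continue
        aa_a, aa_b = CODON_TABLE.get(a, "X"), CODON_TABLE.get(b, "X")
        kind = "synonymous" if aa_a == aa_b else "nonsynonymous"
        results.append((i // 3 + 1, a, b, aa_a, aa_b, kind))
    return results


# Invented toy sequences, not real gene data
ref = "ATGATAGGATGA"
other = "ATGATTGGCTGA"
print(translate(ref))    # MMGW
print(translate(other))  # MIGW
for change in compare_codons(ref, other):
    print(change)
```

Here the ATA to ATT change in codon 2 replaces methionine with isoleucine (nonsynonymous), whereas the GGA to GGC change in codon 3 leaves glycine unchanged (synonymous). Only substitutions of the first kind would appear as differences in the translation tables.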
PLEASE NOTE THAT MOBILES AND TABLETS MAY EXPERIENCE PROBLEMS DISPLAYING THE MS EXCEL FILES CORRECTLY
Castresana J (2000), Selection of conserved blocks from multiple alignments for their use in phylogenetic systematics, Mol. Biol. Evol. 17, 540-552. (free pdf)
Cucini C, Leo C, Iannotti N, Boschi S, Brunetti C, Pons J, Fanciulli PP, Frati F, Carapelli A, and Nardi F (2021), EZmito: a simple and fast tool for multiple mitogenome analysis, Mitochondrial DNA B 6, 1101-1109. (pdf)
Guan X, Silva P, Gyenai KB, Xu J, Geng T, Tu Z, Samuels DC, and Smith EJ (2009), The mitochondrial genome sequence and molecular phylogeny of the turkey, Meleagris gallopavo, Anim. Genet. 40, 134-141. (abstract)
He L, Dai B, Zeng B, Zhang X, Chen B, Yue B, and Li J (2009), The complete mitochondrial genome of the Sichuan Hill Partridge (Arborophila rufipectus) and a phylogenetic analysis with related species, Gene 435, 23-28. (abstract)
Kan XZ, Li XF, Lei ZP, Wang M, Gao H, and Yang ZY (2010), Complete mitochondrial genome of Cabot’s tragopan, Tragopan caboti (Galliformes: Phasianidae), Genet. Mol. Res. 9, 1204-1216. (pdf)
Liu D, Zhou Y, Fei Y, Xie C, and Hou D (2021), Mitochondrial genome of the critically endangered Baer’s Pochard, Aythya baeri, and its phylogenetic relationship with other Anatidae species, Sci. Rep. 11, e:24302. (free pdf)
Liu G, Zhou L, Li B, and Zhang L (2014), The complete mitochondrial genome of Aix galericulata and Tadorna ferruginea: bearings on their phylogenetic position in the Anseriformes, PLoS ONE 9, e:109701. (pdf)
Patton J, and Avise J (1986), Evolutionary genetics of birds IV: rates of protein divergence in waterfowl (Anatidae), Genetica 68, 129-143. (pdf)
Román-Palacios C (2023), The 'phruta' R package and 'salphycon' shiny app: increasing access, reproducibility, and transparency in phylogenetic analyses, bioRxiv. (pdf)
Zhang L, Xia T, Gao X, Yang X, Sun G, Zhao C, Liu G, and Zhang H (2023), Characterization and phylogenetic analysis of the complete mitochondrial genome of Aythya marila, Genes 14, e:1205. (free pdf)