We announce the release of MEGA5 (Molecular Evolutionary Genetics Analysis version 5). MEGA5 is biologist-friendly software with a graphical user interface for comparative analysis of molecular sequence data. MEGA provides utilities for mining online databases and for building sequence alignments and phylogenetic trees, as well as many widely used statistical analyses such as tests of selection, variance estimation, and bootstrap confidence tests.
The newest addition in MEGA5 is a collection of Maximum Likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site by site. ML algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. MEGA5 is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available FREE of charge from www.megasoftware.net .
A detailed description of MEGA5 and new results are now published in MBE (2011; doi:10.1093/molbev/msr121). A preprint PDF is available from: www.kumarlab.net/pdf_new/TamuraKumar11.pdf .
The full complement of features implemented in MEGA5 includes:
[1] Sequence Alignments: Direct DNA, codon, and protein alignments; both manual and automated alignments with trace file Editor. Built-in automated aligners: CLUSTALW and MUSCLE.
[2] Select Best Fit Substitution Model (ML); Substitution Models (+F = With and Without Empirical Frequencies; REV = Reversible). Rate Variation and Base Compositions: Gamma rates (G) and Invariant sites (I) models; Incorporate Compositional Heterogeneity. DNA: General-Time-Reversible (GTR), Tamura-Nei, Hasegawa-Kishino-Yano, Tamura 3-Parameter, Kimura 2-Parameter, Tajima-Nei, Jukes-Cantor; Codons: Nei-Gojobori (original and modified), Li-Wu-Lou (original and modified); Protein: Poisson, Equal-Input, Dayhoff (+F), Jones-Taylor-Thornton (+F), Whelan-And-Goldman (+F), Mitochondrial REV (+F), Chloroplast REV (+F), Reverse Transcriptase REV (+F)
[3] Estimate Substitution Pattern (MCL, ML); Estimate Rate Variation among Sites (ML); Estimate Transition/Transversion Bias (MCL, ML); Estimate Site-by-Site Rates (ML)
[4] Infer Phylogenetic Trees (NJ, ML, ME, MP); Phylogeny Tests (Bootstrap and Branch-length tests); Branch-and-Bound Exact Search (MP); Heuristic Searches (ML, ME, MP), including Nearest-Neighbor-Interchange (NNI), Close-Neighbor-Interchange (CNI), and Max-Mini.
[5] Compute Distances: Pairwise and Diversity; Within- and Between-Group Distances; Bootstrap and Analytical Variances; Separate Distances by Site Degeneracy, Codon Sites; Separation of Distances in Transitions and Transversions; Separate Nonsynonymous and Synonymous Changes
[6] Tests of Selection: For Complete Sequences or Set of Codons; Sequence Pairs or Groups (Within and Between)
[7] Ancestral Sequences: Infer by ML with Relative Probabilities for states or by MP (all parsimonious pathways); both DNA and Protein
[8] Molecular Clocks: Tajima’s 3-Sequence Clock Test; Likelihood Ratio Test (ML) for a Topology; Estimate Branch Lengths under Clock
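For readers unfamiliar with the pairwise distances named in the Compute Distances entry above, the following is a minimal Python sketch of three textbook estimators: the p-distance, the Jukes-Cantor distance, and the Kimura 2-Parameter distance, with differences separated into transitions and transversions. It is not MEGA5 code and makes no claim about how MEGA5 implements these calculations. The function names are hypothetical, and the sketch assumes that sites containing a gap or an ambiguous base in either sequence are simply skipped.

```python
import math

PURINES = {"A", "G"}
BASES = {"A", "C", "G", "T"}


def count_differences(seq1, seq2):
    """Return (compared_sites, transitions, transversions) for two aligned DNA sequences.

    Assumption for this sketch: a site is skipped if either sequence
    has anything other than A, C, G, or T there (gaps, ambiguity codes).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue
        sites += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition;
        # any purine<->pyrimidine change is a transversion.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return sites, transitions, transversions


def p_distance(seq1, seq2):
    """Proportion of compared sites at which the two sequences differ."""
    sites, ts, tv = count_differences(seq1, seq2)
    return (ts + tv) / sites


def jukes_cantor(seq1, seq2):
    """Jukes-Cantor distance: d = -(3/4) ln(1 - 4p/3)."""
    p = p_distance(seq1, seq2)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p(seq1, seq2):
    """Kimura 2-Parameter distance: d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q),
    where P and Q are the proportions of transitions and transversions."""
    sites, ts, tv = count_differences(seq1, seq2)
    P, Q = ts / sites, tv / sites
    return -0.5 * math.log(1.0 - 2.0 * P - Q) - 0.25 * math.log(1.0 - 2.0 * Q)
```

Calling `kimura_2p("ACGTACGTAC", "GCGTACGTAA")` on this toy pair counts one transition and one transversion over ten sites. Both corrected distances are undefined (the logarithm's argument becomes non-positive) when the sequences are too divergent, in which case `math.log` raises a `ValueError`.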
Sudhir Kumar
Center for Evolutionary Medicine Informatics (CEMI)
Biodesign Institute School of Life Sciences
Arizona State University
Tempe, AZ 85287-5301, USA