Institut Pasteur | DBC | CRBIP | GIPhy

Empirical Models of Amino Acid Substitution



Description

This web page provides a complete list of amino acid replacement matrices for model-based sequence evolution analyses.
For each evolutionary model, the equilibrium amino acid frequencies and the exchangeability matrix are available in PAML format.
The full reference and the source of the data are also given for each model.
[last update: 22.09.30]
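
As an illustration of how such a file can be used, the following minimal sketch reads model text in PAML format and builds a normalized rate matrix from it. It assumes the usual PAML layout: 190 exchangeabilities given as the lower triangle of a 20 x 20 symmetric matrix, followed by 20 equilibrium frequencies, with amino acids in the order A R N D C Q E G H I L K M F P S T W Y V, and optional free-text comments at the end. The function names parse_paml and rate_matrix are hypothetical and do not come from any of the programs cited on this page.

```python
# PAML amino acid order, shared by the matrix rows and the frequencies.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def parse_paml(text):
    """Parse PAML-style model text into (S, pi).

    Assumed layout: 190 lower-triangle exchangeabilities (row by row),
    then 20 equilibrium frequencies, then optional comment text.
    """
    n = len(AMINO_ACIDS)
    n_rates = n * (n - 1) // 2
    values = []
    for token in text.split():
        try:
            values.append(float(token))
        except ValueError:
            break  # first non-numeric token: assume trailing comments begin
        if len(values) == n_rates + n:
            break  # everything needed has been read
    if len(values) < n_rates + n:
        raise ValueError("expected %d numbers, found %d"
                         % (n_rates + n, len(values)))

    # Fill the symmetric exchangeability matrix from its lower triangle.
    S = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(1, n):
        for j in range(i):
            S[i][j] = S[j][i] = values[k]
            k += 1

    # Renormalize the frequencies, since published values may be rounded.
    freqs = values[n_rates:]
    total = sum(freqs)
    pi = [f / total for f in freqs]
    return S, pi


def rate_matrix(S, pi):
    """Build Q with Q[i][j] = S[i][j] * pi[j], scaled to one expected
    substitution per unit time at equilibrium."""
    n = len(pi)
    Q = [[S[i][j] * pi[j] if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        Q[i][i] = -sum(Q[i])  # each row sums to zero
    scale = -sum(pi[i] * Q[i][i] for i in range(n))
    return [[q / scale for q in row] for row in Q]
```

With a downloaded file, for example one saved locally under the hypothetical name wag.dat, the two functions would be chained as S, pi = parse_paml(open("wag.dat").read()) followed by Q = rate_matrix(S, pi). Files that deviate from the assumed layout (for instance, comments placed before the numbers) would need extra handling.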


Amino Acid Replacement Matrices

  1. Dayhoff   GENERAL
    Dayhoff MO, Schwartz RM, Orcutt BC (1978) A model of evolutionary change in proteins.
    In: Dayhoff MO (ed.), Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, Washington DC, Vol. 5, pp. 345–352.
    data source: https://www.ebi.ac.uk/goldman-srv/dayhoff/dayhoff-paml.dat

  2. BLOSUM62   GENERAL
    Henikoff S, Henikoff JG (1992) Amino acid substitution matrices from protein blocks.
    Proceedings of the National Academy of Sciences of the USA, 89:10915–10919. doi:10.1073/pnas.89.22.10915
    data source: https://raw.githubusercontent.com/NBISweden/MrBayes/master/src/model.c (/* Blosum62 */)

  3. JTT   GENERAL
    Jones DT, Taylor WR, Thornton JM (1992) The rapid generation of mutation data matrices from protein sequences.
    Computer Applications in the Biosciences, 8:275–282. doi:10.1093/bioinformatics/8.3.275
    data source: https://raw.githubusercontent.com/NBISweden/MrBayes/master/src/model.c (/* jones */)

  4. mtREV   MITOCHONDRION
    Adachi J, Hasegawa M (1996) Model of amino acid substitution in proteins encoded by mitochondrial DNA.
    Journal of Molecular Evolution, 42:459–468. doi:10.1007/BF02498640
    data source: https://raw.githubusercontent.com/NBISweden/MrBayes/master/src/model.c (/* mtrev24 */)

  5. mtMam   MITOCHONDRION
    Yang Z, Nielsen R, Hasegawa M (1998) Models of amino acid substitution and applications to mitochondrial protein evolution.
    Molecular Biology and Evolution, 15:1600–1611. doi:10.1093/oxfordjournals.molbev.a025888
    data source: https://raw.githubusercontent.com/NBISweden/MrBayes/master/src/model.c (/* mtmam */)

  6. cpREV   PLASTID
    Adachi J, Waddell PJ, Martin W, Hasegawa M (2000) Plastid genome phylogeny and a model of amino acid substitution for proteins encoded by chloroplast DNA.
    Journal of Molecular Evolution, 50:348–358. doi:10.1007/s002399910038
    data source: https://github.com/xflouris/libpll-2/blob/master/src/maps.c (pll_aa_rates_cprev)

  7. VT   GENERAL
    Muller T, Vingron M (2000) Modeling amino acid replacement.
    Journal of Computational Biology, 7:761–776. doi:10.1089/10665270050514918
    data source: https://github.com/stephaneguindon/phyml/blob/master/src/init.c (Init_Qmat_VT)

  8. WAG   GENERAL
    Whelan S, Goldman N (2001) A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach.
    Molecular Biology and Evolution, 18:691–699. doi:10.1093/oxfordjournals.molbev.a003851
    data source: https://www.ebi.ac.uk/goldman-srv/WAG/wag.dat

  9. WAG*   GENERAL
    Whelan S, Goldman N (2001) A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach.
    Molecular Biology and Evolution, 18:691–699. doi:10.1093/oxfordjournals.molbev.a003851
    data source: https://www.ebi.ac.uk/goldman-srv/WAG/wagstar.dat

  10. rtREV   RETROVIRUS
    Dimmic MW, Rest JS, Mindell DP, Goldstein RA (2002) rtREV: an amino acid substitution matrix for inference of retrovirus and reverse transcriptase phylogeny.
    Journal of Molecular Evolution, 55:65–73. doi:10.1007/s00239-001-2304-y
    data source: https://link.springer.com/content/pdf/10.1007/s00239-001-2304-y.pdf

  11. PMB   GENERAL
    Veerassamy S, Smith A, Tillier ER (2003) A transition probability model for amino acid substitutions from blocks.
    Journal of Computational Biology, 10:997–1010. doi:10.1089/106652703322756195
    data source: wwwlabs.uhnresearch.ca/tillier/dependent_files/pmb/pmb.dat

  12. DCMut-Dayhoff   GENERAL
    Kosiol C, Goldman N (2005) Different versions of the Dayhoff rate matrix.
    Molecular Biology and Evolution, 22:193–199. doi:10.1093/molbev/msi005
    data source: https://www.ebi.ac.uk/goldman-srv/dayhoff/dayhoff-dcmut.dat

  13. DCMut-JTT   GENERAL
    Kosiol C, Goldman N (2005) Different versions of the Dayhoff rate matrix.
    Molecular Biology and Evolution, 22:193–199. doi:10.1093/molbev/msi005
    data source: https://www.ebi.ac.uk/goldman-srv/dayhoff/jtt-dcmut.dat

  14. HIVb   HUMAN IMMUNODEFICIENCY VIRUS
    Nickle DC, Heath L, Jensen MA, Gilbert PB, Mullins JI, Kosakovsky Pond SL (2007) HIV-specific probabilistic models of protein evolution.
    PLoS ONE, 2:e503. doi:10.1371/journal.pone.0000503
    data source: https://github.com/stephaneguindon/phyml/blob/master/src/init.c (Init_Qmat_HIVb)

  15. HIVw   HUMAN IMMUNODEFICIENCY VIRUS
    Nickle DC, Heath L, Jensen MA, Gilbert PB, Mullins JI, Kosakovsky Pond SL (2007) HIV-specific probabilistic models of protein evolution.
    PLoS ONE, 2:e503. doi:10.1371/journal.pone.0000503
    data source: https://github.com/stephaneguindon/phyml/blob/master/src/init.c (Init_Qmat_HIVw)

  16. MtArt   ARTHROPODA MITOCHONDRION
    Abascal F, Posada D, Zardoya R (2007) MtArt: a new model of amino acid replacement for Arthropoda.
    Molecular Biology and Evolution, 24:1–5. doi:10.1093/molbev/msl136
    data source: https://github.com/xflouris/libpll-2/blob/master/src/maps.c (pll_aa_rates_mtart)

  17. LG   GENERAL
    Le SQ, Gascuel O (2008) An improved general amino acid replacement matrix.
    Molecular Biology and Evolution, 25:1307–1320. doi:10.1093/molbev/msn067
    data source: http://www.atgc-montpellier.fr/download/datasets/models/lg_LG.PAML.txt

  18. MtZoa   MITOCHONDRION
    Rota-Stabelli O, Yang Z, Telford MJ (2009) MtZoa: a general mitochondrial amino acid substitutions model for animal evolutionary studies.
    Molecular Phylogenetics and Evolution, 52:268–272. doi:10.1016/j.ympev.2009.01.011
    data source: https://ars.els-cdn.com/content/image/1-s2.0-S1055790309000165-mmc1.txt

  19. cpREV64   PLASTID
    Zhong B, Yonezawa T, Zhong Y, Hasegawa M (2010) The position of Gnetales among seed plants: overcoming pitfalls of chloroplast phylogenomics.
    Molecular Biology and Evolution, 27:2855–2863. doi:10.1093/molbev/msq170
    data source: https://academic.oup.com/mbe/article/27/12/2855/1074835#supplementary-data

  20. FLU   INFLUENZA VIRUS
    Dang CC, Le SQ, Gascuel O, Le VS (2010) FLU, an amino acid substitution model for influenza proteins.
    BMC Evolutionary Biology, 10:99. doi:10.1186/1471-2148-10-99
    data source: ftp://ftp.sanger.ac.uk/pub/1000genomes/lsq/FLU/Flu_All_it2.txt_PAML.txt

  21. gcpREV   GREEN PLANT CHLOROPLAST
    Cox CJ, Foster PG (2013) A 20-state empirical amino-acid substitution model for green plant chloroplasts.
    Molecular Phylogenetics and Evolution, 68:218–220. doi:10.1016/j.ympev.2013.03.030
    data source: https://github.com/pgfoster/p4-phylogenetics/blob/master/Misc/gcpREV_model/gcpREV.dat

  22. stmtREV   LAND PLANT MITOCHONDRION
    Liu Y, Cox CJ, Wang W, Goffinet B (2014) Mitochondrial phylogenomics of early land plants: mitigating the effects of saturation, compositional heterogeneity, and codon-usage bias.
    Systematic Biology, 63:862–878. doi:10.1093/sysbio/syu049
    data source: https://datadryad.org/bitstream/handle/10255/dryad.58788/stmtREV_model.txt

  23. AB   ANTIBODY
    Mirsky A, Kazandjian L, Anisimova M (2015) Antibody-specific model of amino acid substitution for immunological inferences from alignments of antibody sequences.
    Molecular Biology and Evolution, 32:806–819. doi:10.1093/molbev/msu340
    data source: https://academic.oup.com/mbe/article/32/3/806/980410#supplementary-data

  24. mtInv   INVERTEBRATE MITOCHONDRION
    Le VS, Dang CC, Le SQ (2017) Improved mitochondrial amino acid substitution models for metazoan evolutionary studies.
    BMC Evolutionary Biology, 17:136. doi:10.1186/s12862-017-0987-y
    data source: https://github.com/Vinhbio/mt_metazoan_models/tree/master/mt_metazoan_models_from_all_data

  25. mtMet   METAZOAN MITOCHONDRION
    Le VS, Dang CC, Le SQ (2017) Improved mitochondrial amino acid substitution models for metazoan evolutionary studies.
    BMC Evolutionary Biology, 17:136. doi:10.1186/s12862-017-0987-y
    data source: https://github.com/Vinhbio/mt_metazoan_models/tree/master/mt_metazoan_models_from_all_data

  26. mtVer   VERTEBRATE MITOCHONDRION
    Le VS, Dang CC, Le SQ (2017) Improved mitochondrial amino acid substitution models for metazoan evolutionary studies.
    BMC Evolutionary Biology, 17:136. doi:10.1186/s12862-017-0987-y
    data source: https://github.com/Vinhbio/mt_metazoan_models/tree/master/mt_metazoan_models_from_all_data

  27. DEN   DENGUE VIRUS
    Le TK, Dang CC, Le SV (2018) Building a specific amino acid substitution model for Dengue viruses.
    In: Phuong TM, Nguyen ML (eds) Proceedings of 10th International Conference on Knowledge and Systems Engineering (KSE 2018), Ho Chi Minh City, Vietnam, pp. 242–246. doi:10.1109/KSE.2018.8573341
    data source: https://github.com/xflouris/libpll-2/blob/dev/src/maps.c (pll_aa_rates_den)

  28. mtOrt   ORTHOPTERA MITOCHONDRION
    Chang H, Nie Y, Zhang N, Zhang X, Sun H, Mao Y, Qiu Z, Huang Y (2020) MtOrt: an empirical mitochondrial amino acid substitution model for evolutionary studies of Orthoptera insects.
    BMC Ecology and Evolution, 20:57. doi:10.1186/s12862-020-01623-6
    data source: https://static-content.springer.com/esm/art%3A10.1186%2Fs12862-020-01623-6/MediaObjects/12862_2020_1623_MOESM2_ESM.txt (model mtOrt)

  29. FLAVI   FLAVIVIRUS
    Le TK, Vinh LS (2020) FLAVI: An Amino Acid Substitution Model for Flaviviruses.
    Journal of Molecular Evolution, 88:445–452. doi:10.1007/s00239-020-09943-3
    data source: https://github.com/thulekm/flavi/blob/master/FLAVI.PAML

  30. Q.LG   GENERAL
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.LG)

  31. Q.pfam   GENERAL
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.pfam)

  32. Q.bird   BIRD
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.bird)

  33. Q.insect   INSECT
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.insect)

  34. Q.mammal   MAMMAL
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.mammal)

  35. Q.plant   GREEN PLANT
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.plant)

  36. Q.yeast   YEAST
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.yeast)

  37. HIVin   HIV INTEGRASE
    Del Amparo R, Arenas M (2022) HIV Protease and Integrase Empirical Substitution Models of Evolution: Protein-Specific Models Outperform Generalist Models.
    Genes, 13:61. doi:10.3390/genes13010061
    data source: https://zenodo.org/record/5763867 (file HIVin.dat in Substitution models.zip)

  38. HIVpr   HIV PROTEASE
    Del Amparo R, Arenas M (2022) HIV Protease and Integrase Empirical Substitution Models of Evolution: Protein-Specific Models Outperform Generalist Models.
    Genes, 13:61. doi:10.3390/genes13010061
    data source: https://zenodo.org/record/5763867 (file HIVpr.dat in Substitution models.zip)

  39. VIRin   VIRUS INTEGRASE
    Del Amparo R, Arenas M (2022) HIV Protease and Integrase Empirical Substitution Models of Evolution: Protein-Specific Models Outperform Generalist Models.
    Genes, 13:61. doi:10.3390/genes13010061
    data source: https://zenodo.org/record/5763867 (file VIRin.dat in Substitution models.zip)

  40. VIRpr   VIRUS PROTEASE
    Del Amparo R, Arenas M (2022) HIV Protease and Integrase Empirical Substitution Models of Evolution: Protein-Specific Models Outperform Generalist Models.
    Genes, 13:61. doi:10.3390/genes13010061
    data source: https://zenodo.org/record/5763867 (file VIRpr.dat in Substitution models.zip)

