Institut Pasteur | DBC | Bioinformatics and Biostatistics Hub | GIPhy

Empirical Models of Amino Acid Substitution



Description

This web page provides a complete list of amino acid replacement matrices for model-based sequence evolution analyses.
For each model, the equilibrium amino acid frequencies and the exchangeability matrix are available in PAML format.
The full reference and the source of the data are also given for each model.
[last update: 22.09.30]
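
As an illustration of how such a file can be used, the sketch below reads a matrix in PAML format and derives a rate matrix from it. It is a minimal example, not the code used to build this page, and it rests on assumptions that should be checked against each file: the layout is taken to be the conventional one (19 rows of lower-triangular exchangeabilities, then 20 frequencies, with optional comments after the numbers), the amino acid order is taken to be A R N D C Q E G H I L K M F P S T W Y V, and the rate matrix uses the common convention q_ij = s_ij * pi_j, scaled to one expected substitution per unit time. The function names parse_paml and rate_matrix are hypothetical.

```python
# Assumed amino acid order of PAML-format matrices (check the file's own notes).
AA_ORDER = "ARNDCQEGHILKMFPSTWYV"
N = len(AA_ORDER)
N_EXCH = N * (N - 1) // 2  # 190 lower-triangular exchangeabilities


def parse_paml(text):
    """Return (S, pi): a symmetric 20x20 exchangeability matrix and 20 frequencies.

    Assumes the numbers come first and any comments follow them.
    """
    values = []
    for token in text.split():
        try:
            values.append(float(token))
        except ValueError:
            break  # first non-numeric token: start of trailing comments
        if len(values) == N_EXCH + N:
            break
    if len(values) < N_EXCH + N:
        raise ValueError("expected %d numbers, found %d" % (N_EXCH + N, len(values)))

    # Fill the symmetric matrix from the lower triangle, row by row.
    S = [[0.0] * N for _ in range(N)]
    k = 0
    for i in range(1, N):
        for j in range(i):
            S[i][j] = S[j][i] = values[k]
            k += 1

    # Renormalise the frequencies, since rounded values may not sum exactly to 1.
    pi = values[N_EXCH:N_EXCH + N]
    total = sum(pi)
    pi = [p / total for p in pi]
    return S, pi


def rate_matrix(S, pi):
    """Build Q with q_ij = s_ij * pi_j, rows summing to zero, mean rate 1."""
    Q = [[S[i][j] * pi[j] if i != j else 0.0 for j in range(N)] for i in range(N)]
    for i in range(N):
        Q[i][i] = -sum(Q[i])
    scale = -sum(pi[i] * Q[i][i] for i in range(N))
    return [[q / scale for q in row] for row in Q]


if __name__ == "__main__":
    # Toy input (all exchangeabilities 1, uniform frequencies), not a real model.
    rows = [" ".join(["1.0"] * i) for i in range(1, N)]
    toy = "\n".join(rows) + "\n\n" + " ".join(["0.05"] * N) + "\n"
    S, pi = parse_paml(toy)
    Q = rate_matrix(S, pi)
    a, r = AA_ORDER.index("A"), AA_ORDER.index("R")
    print("rate A->R:", Q[a][r])
```

To use it on one of the models listed below, the downloaded file content can be passed as a string to parse_paml. Files that place comments before the numbers would need those lines removed first.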


Amino Acid Replacement Matrices

  1. Dayhoff   GENERAL
    Dayhoff MO, Schwartz RM, Orcutt BC (1978) A model of evolutionary change in proteins.
    In: Dayhoff MO (ed.), Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, Washington DC, Vol. 5, pp. 345–352.
    data source: https://www.ebi.ac.uk/goldman-srv/dayhoff/dayhoff-paml.dat

  2. BLOSUM62   GENERAL
    Henikoff S, Henikoff JG (1992) Amino acid substitution matrices from protein blocks.
    Proceedings of the National Academy of Sciences of the USA, 89:10915–10919. doi:10.1073/pnas.89.22.10915
    data source: https://raw.githubusercontent.com/NBISweden/MrBayes/master/src/model.c (/* Blosum62 */)

  3. JTT   GENERAL
    Jones DT, Taylor WR, Thornton JM (1992) The rapid generation of mutation data matrices from protein sequences.
    Computer Applications in the Biosciences, 8:275–282. doi:10.1093/bioinformatics/8.3.275
    data source: https://raw.githubusercontent.com/NBISweden/MrBayes/master/src/model.c (/* jones */)

  4. mtREV   MITOCHONDRION
    Adachi J, Hasegawa M (1996) Model of amino acid substitution in proteins encoded by mitochondrial DNA.
    Journal of Molecular Evolution, 42:459–468. doi:10.1007/BF02498640
    data source: https://raw.githubusercontent.com/NBISweden/MrBayes/master/src/model.c (/* mtrev24 */)

  5. mtMam   MITOCHONDRION
    Yang Z, Nielsen R, Hasegawa M (1998) Models of amino acid substitution and applications to mitochondrial protein evolution.
    Molecular Biology and Evolution, 15:1600–1611. doi:10.1093/oxfordjournals.molbev.a025888
    data source: https://raw.githubusercontent.com/NBISweden/MrBayes/master/src/model.c (/* mtmam */)

  6. cpREV   PLASTID
    Adachi J, Waddell PJ, Martin W, Hasegawa M (2000) Plastid genome phylogeny and a model of amino acid substitution for proteins encoded by chloroplast DNA.
    Journal of Molecular Evolution, 50:348–358. doi:10.1007/s002399910038
    data source: https://github.com/xflouris/libpll-2/blob/master/src/maps.c (pll_aa_rates_cprev)

  7. VT   GENERAL
    Müller T, Vingron M (2000) Modeling amino acid replacement.
    Journal of Computational Biology, 7:761–776. doi:10.1089/10665270050514918
    data source: https://github.com/stephaneguindon/phyml/blob/master/src/init.c (Init_Qmat_VT)

  8. WAG   GENERAL
    Whelan S, Goldman N (2001) A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach.
    Molecular Biology and Evolution, 18:691–699. doi:10.1093/oxfordjournals.molbev.a003851
    data source: https://www.ebi.ac.uk/goldman-srv/WAG/wag.dat

  9. WAG*   GENERAL
    Whelan S, Goldman N (2001) A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach.
    Molecular Biology and Evolution, 18:691–699. doi:10.1093/oxfordjournals.molbev.a003851
    data source: https://www.ebi.ac.uk/goldman-srv/WAG/wagstar.dat

  10. rtREV   RETROVIRUS
    Dimmic MW, Rest JS, Mindell DP, Goldstein RA (2002) rtREV: an amino acid substitution matrix for inference of retrovirus and reverse transcriptase phylogeny.
    Journal of Molecular Evolution, 55:65–73. doi:10.1007/s00239-001-2304-y
    data source: https://link.springer.com/content/pdf/10.1007/s00239-001-2304-y.pdf

  11. PMB   GENERAL
    Veerassamy S, Smith A, Tillier ER (2003) A transition probability model for amino acid substitutions from blocks.
    Journal of Computational Biology, 10:997–1010. doi:10.1089/106652703322756195
    data source: wwwlabs.uhnresearch.ca/tillier/dependent_files/pmb/pmb.dat

  12. DCMut-Dayhoff   GENERAL
    Kosiol C, Goldman N (2005) Different versions of the Dayhoff rate matrix.
    Molecular Biology and Evolution, 22:193–199. doi:10.1093/molbev/msi005
    data source: https://www.ebi.ac.uk/goldman-srv/dayhoff/dayhoff-dcmut.dat

  13. DCMut-JTT   GENERAL
    Kosiol C, Goldman N (2005) Different versions of the Dayhoff rate matrix.
    Molecular Biology and Evolution, 22:193–199. doi:10.1093/molbev/msi005
    data source: https://www.ebi.ac.uk/goldman-srv/dayhoff/jtt-dcmut.dat

  14. HIVb   HUMAN IMMUNODEFICIENCY VIRUS
    Nickle DC, Heath L, Jensen MA, Gilbert PB, Mullins JI, Kosakovsky Pond SL (2007) HIV-specific probabilistic models of protein evolution.
    PLoS ONE, 2:e503. doi:10.1371/journal.pone.0000503
    data source: https://github.com/stephaneguindon/phyml/blob/master/src/init.c (Init_Qmat_HIVb)

  15. HIVw   HUMAN IMMUNODEFICIENCY VIRUS
    Nickle DC, Heath L, Jensen MA, Gilbert PB, Mullins JI, Kosakovsky Pond SL (2007) HIV-specific probabilistic models of protein evolution.
    PLoS ONE, 2:e503. doi:10.1371/journal.pone.0000503
    data source: https://github.com/stephaneguindon/phyml/blob/master/src/init.c (Init_Qmat_HIVw)

  16. MtArt   ARTHROPODA MITOCHONDRION
    Abascal F, Posada D, Zardoya R (2007) MtArt: a new model of amino acid replacement for Arthropoda.
    Molecular Biology and Evolution, 24:1–5. doi:10.1093/molbev/msl136
    data source: https://github.com/xflouris/libpll-2/blob/master/src/maps.c (pll_aa_rates_mtart)

  17. LG   GENERAL
    Le SQ, Gascuel O (2008) An improved general amino acid replacement matrix.
    Molecular Biology and Evolution, 25:1307–1320. doi:10.1093/molbev/msn067
    data source: http://www.atgc-montpellier.fr/download/datasets/models/lg_LG.PAML.txt

  18. MtZoa   MITOCHONDRION
    Rota-Stabelli O, Yang Z, Telford MJ (2009) MtZoa: a general mitochondrial amino acid substitutions model for animal evolutionary studies.
    Molecular Phylogenetics and Evolution, 52:268–272. doi:10.1016/j.ympev.2009.01.011
    data source: https://ars.els-cdn.com/content/image/1-s2.0-S1055790309000165-mmc1.txt

  19. cpREV64   PLASTID
    Zhong B, Yonezawa T, Zhong Y, Hasegawa M (2010) The position of Gnetales among seed plants: overcoming pitfalls of chloroplast phylogenomics.
    Molecular Biology and Evolution, 27:2855–2863. doi:10.1093/molbev/msq170
    data source: https://academic.oup.com/mbe/article/27/12/2855/1074835#supplementary-data

  20. FLU   INFLUENZA VIRUS
    Dang CC, Le SQ, Gascuel O, Le VS (2010) FLU, an amino acid substitution model for influenza proteins.
    BMC Evolutionary Biology, 10:99. doi:10.1186/1471-2148-10-99
    data source: ftp://ftp.sanger.ac.uk/pub/1000genomes/lsq/FLU/Flu_All_it2.txt_PAML.txt

  21. gcpREV   GREEN PLANT CHLOROPLAST
    Cox CJ, Foster PG (2013) A 20-state empirical amino-acid substitution model for green plant chloroplasts.
    Molecular Phylogenetics and Evolution, 68:218–220. doi:10.1016/j.ympev.2013.03.030
    data source: https://github.com/pgfoster/p4-phylogenetics/blob/master/Misc/gcpREV_model/gcpREV.dat

  22. stmtREV   LAND PLANT MITOCHONDRION
    Liu Y, Cox CJ, Wang W, Goffinet B (2014) Mitochondrial phylogenomics of early land plants: mitigating the effects of saturation, compositional heterogeneity, and codon-usage bias.
    Systematic Biology, 63:862–878. doi:10.1093/sysbio/syu049
    data source: https://datadryad.org/bitstream/handle/10255/dryad.58788/stmtREV_model.txt

  23. AB   ANTIBODY
    Mirsky A, Kazandjian L, Anisimova M (2015) Antibody-specific model of amino acid substitution for immunological inferences from alignments of antibody sequences.
    Molecular Biology and Evolution, 32:806–819. doi:10.1093/molbev/msu340
    data source: https://academic.oup.com/mbe/article/32/3/806/980410#supplementary-data

  24. mtInv   INVERTEBRATE MITOCHONDRION
    Le VS, Dang CC, Le SQ (2017) Improved mitochondrial amino acid substitution models for metazoan evolutionary studies.
    BMC Evolutionary Biology, 17:136. doi:10.1186/s12862-017-0987-y
    data source: https://github.com/Vinhbio/mt_metazoan_models/tree/master/mt_metazoan_models_from_all_data

  25. mtMet   METAZOAN MITOCHONDRION
    Le VS, Dang CC, Le SQ (2017) Improved mitochondrial amino acid substitution models for metazoan evolutionary studies.
    BMC Evolutionary Biology, 17:136. doi:10.1186/s12862-017-0987-y
    data source: https://github.com/Vinhbio/mt_metazoan_models/tree/master/mt_metazoan_models_from_all_data

  26. mtVer   VERTEBRATE MITOCHONDRION
    Le VS, Dang CC, Le SQ (2017) Improved mitochondrial amino acid substitution models for metazoan evolutionary studies.
    BMC Evolutionary Biology, 17:136. doi:10.1186/s12862-017-0987-y
    data source: https://github.com/Vinhbio/mt_metazoan_models/tree/master/mt_metazoan_models_from_all_data

  27. DEN   DENGUE VIRUS
    Le TK, Dang CC, Le SV (2018) Building a specific amino acid substitution model for Dengue viruses.
    In: Phuong TM, Nguyen ML (eds) Proceedings of 10th International Conference on Knowledge and Systems Engineering (KSE 2018), Ho Chi Minh City, Vietnam, pp. 242–246. doi:10.1109/KSE.2018.8573341
    data source: https://github.com/xflouris/libpll-2/blob/dev/src/maps.c (pll_aa_rates_den)

  28. mtOrt   ORTHOPTERA MITOCHONDRION
    Chang H, Nie Y, Zhang N, Zhang X, Sun H, Mao Y, Qiu Z, Huang Y (2020) MtOrt: an empirical mitochondrial amino acid substitution model for evolutionary studies of Orthoptera insects.
    BMC Ecology and Evolution, 20:57. doi:10.1186/s12862-020-01623-6
    data source: https://static-content.springer.com/esm/art%3A10.1186%2Fs12862-020-01623-6/MediaObjects/12862_2020_1623_MOESM2_ESM.txt (model mtOrt)

  29. FLAVI   FLAVIVIRUS
    Le TK, Vinh LS (2020) FLAVI: An Amino Acid Substitution Model for Flaviviruses.
    Journal of Molecular Evolution, 88:445–452. doi:10.1007/s00239-020-09943-3
    data source: https://github.com/thulekm/flavi/blob/master/FLAVI.PAML

  30. Q.LG   GENERAL
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.LG)

  31. Q.pfam   GENERAL
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.pfam)

  32. Q.bird   BIRD
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.bird)

  33. Q.insect   INSECT
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.insect)

  34. Q.mammal   MAMMAL
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.mammal)

  35. Q.plant   GREEN PLANT
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.plant)

  36. Q.yeast   YEAST
    Minh BQ, Dang CC, Le SV, Lanfear R (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution.
    Systematic Biology, syab010. doi:10.1093/sysbio/syab010
    data source: https://figshare.com/articles/dataset/QMaker-datasets_zip/9768101?file=25875906 (file Q.yeast)

  37. HIVin   HIV INTEGRASE
    Del Amparo R, Arenas M (2022) HIV Protease and Integrase Empirical Substitution Models of Evolution: Protein-Specific Models Outperform Generalist Models.
    Genes, 13:61. doi:10.3390/genes13010061
    data source: https://zenodo.org/record/5763867 (file HIVin.dat in Substitution models.zip)

  38. HIVpr   HIV PROTEASE
    Del Amparo R, Arenas M (2022) HIV Protease and Integrase Empirical Substitution Models of Evolution: Protein-Specific Models Outperform Generalist Models.
    Genes, 13:61. doi:10.3390/genes13010061
    data source: https://zenodo.org/record/5763867 (file HIVpr.dat in Substitution models.zip)

  39. VIRin   VIRUS INTEGRASE
    Del Amparo R, Arenas M (2022) HIV Protease and Integrase Empirical Substitution Models of Evolution: Protein-Specific Models Outperform Generalist Models.
    Genes, 13:61. doi:10.3390/genes13010061
    data source: https://zenodo.org/record/5763867 (file VIRin.dat in Substitution models.zip)

  40. VIRpr   VIRUS PROTEASE
    Del Amparo R, Arenas M (2022) HIV Protease and Integrase Empirical Substitution Models of Evolution: Protein-Specific Models Outperform Generalist Models.
    Genes, 13:61. doi:10.3390/genes13010061
    data source: https://zenodo.org/record/5763867 (file VIRpr.dat in Substitution models.zip)

