Bibliography
A list of papers that we think are important for understanding divergence-time estimation, sometimes accompanied by our witty commentary. If there’s a paper that you’ve found super helpful, let us know and we’ll add it. If it is one of your own papers, all the better!!
Bayesian phylogenetic inference: General principles
- Baum, David A., and Stacey DeWitt Smith (2013). Tree thinking: An introduction to Phylogenetic Biology. (Book), Oxford University Press.
- Holder, Mark and Paul O. Lewis (2003). Phylogeny estimation: traditional and Bayesian approaches. Nature Reviews Genetics 4, 275-284. link.
Superb entry-level textbook introducing the principles of model-based phylogenetic inference.
These guys know a lot about Bayesian phylogenetics.
Divergence-time estimation: Non-identifiability/model sensitivity
- Rannala, Bruce (2002). Identifiability of parameters in MCMC Bayesian inference of phylogeny. Systematic Biology 51, 754--760.
- Dos Reis, Mario and Ziheng Yang (2013). The unbearable uncertainty of Bayesian divergence time estimation. Journal of Systematics and Evolution 51(1), 30-43. link.
- Condamine, Fabien L., Nathalie S. Nagalingum, Charles R. Marshall, and Hélène Morlon (2015). Origin and diversification of living cycads: a cautionary tale on the impact of the branching process prior in Bayesian molecular dating. BMC evolutionary biology 15(1), 1-18. link.
- Rothfels, Carl J. and Eric Schuettpelz (2014). Accelerated rate of molecular evolution for vittarioid ferns is strong and not driven by selection. Systematic Biology 63(1), 31-54. link.
- Sauquet, Hervé, Santiago Ramírez-Barahona, and Susana Magallón (2022). The age of flowering plants is unknown. BioArXv (preprint). link.
- May, Michael R., Dori L. Contreras, Michael A. Sundue, Nathalie S. Nagalingum, Cindy V. Looy, and Carl J. Rothfels (2021). Inferring the Total-Evidence Timescale of Marattialean Fern Evolution in the Face of Model Sensitivity. Systematic Biology 70(6), 1232--1255. link.
An introduction to the concept of non-identifiability, with a focus on Bayesian phylogenetics, and using relaxed clocks as an example.
Clear discussion of the basic non-identifiability inherent in divergence-time estimation (DTE), and how this feature means that DTE differs in a fundamental way from typical applications of Bayesian inference (the data can never overwhelm the prior).
Excellent empirical example of the strong influence of the tree model on divergence-time estimates.
Including this paper here because it provides (figure 11) a clear example of the strong influence of the clock model on divergence-time estimates.
The title says it all? This paper discusses the impact of non-identifiability on a focal problem---the crown age of the angiosperms---and points out that the estimates of that age are effectively determined by the priors used in the analysis. They also make the excellent point that so-called "molecular age estimates" are not molecular at all---effectively all the temporal data comes from the fossils (and the models), not from the molecular data.
The non-identifiability inherent in divergence-time estimation, and thus prior sensitivity, is a major theme of this paper. For our dataset, the tree model has a huge effect (more so than any other model component), and we show that the "uniform prior" on timetrees, as typically constructed, has some pathological behavior and should be avoided (in our opinion, at least).
Divergence-time estimation: Relaxed clocks
- Drummond, Alexei J. and Marc A. Suchard (2010). Bayesian random local clocks, or one rate to rule them all. BMC biology 8(1), 1-12. link.
- Drummond, Alexei J., Simon Y. W. Ho, Matthew J. Phillips, and Andrew Rambaut (2006). Relaxed phylogenetics and dating with confidence. PLoS biology 4(4), e88. link.
One of the few practical implementations of autocorrelated relaxed clocks.
Among the better article titles.
Divergence-time estimation: Node dating
- Graur, Dan and William Martin (2004). Reading the entrails of chickens: molecular timescales of evolution and the illusion of precision. TRENDS in Genetics 20(2), 80-86. link.
- Marshall, Charles R. (2008). A simple method for bracketing absolute divergence times on molecular phylogenies using multiple fossil calibration points. The American Naturalist 171(6), 726-742. link.
- Marshall, Charles R. (2019). Using the fossil record to evaluate timetree timescales. Frontiers in Genetics 10(1), 1049. link.
- May, Michael R., Dori L. Contreras, Michael A. Sundue, Nathalie S. Nagalingum, Cindy V. Looy, and Carl J. Rothfels (2021). Inferring the Total-Evidence Timescale of Marattialean Fern Evolution in the Face of Model Sensitivity. Systematic Biology 70(6), 1232--1255. link.
- Rothfels, Carl J., Anne K. Johnson, Peter H. Hovenkamp, David L. Swofford, Harry C. Roskam, Christopher R. Fraser-Jenkins, Michael D. Windham, and Kathleen M. Pryer (2015). Natural hybridization between genera that diverged from each other approximately 60 million years ago. The American Naturalist 185(3), 433-442. link.
- Parham, James F., Philip C.J. Donoghue, Christopher J. Bell, Tyler D. Calway, Jason J. Head, Patricia A. Holroyd, Jun G. Inoue, Randall B. Irmis, Walter G. Joyce, Daniel T. Ksepka, José S. L. Patané, Nathan D. Smith, James E. Tarver, Marcel van Tuinen, Ziheng Yang, Kenneth D. Angielczyk, Jenny M. Greenwood, Christy A. Hipsley, Louis Jacobs, Peter J. Makovicky, Johannes Müller, Krister T. Smith, Jessica M. Theodor, Rachel C. M. Warnock, and Michael J. Benton (2012). Best practices for justifying fossil calibrations. Systematic Biology 61(2), 346-359. link.
The "Spandrels" of divergence-time estimation. Not exactly the most friendly paper, and it terrified a generation of practitioners, but it is also full of good points (particularly that we can't ignore the uncertainty associated with our estimates).
A method for getting more informed calibration densities by better including the information provided by the fossil record.
Lots of discussion of calibration densities for node dating, among other descriptions of the application of the fossil record to evaluating (and thus, indirectly, to inferring) divergence-time estimates.
We'll shamelessly plug this paper multiple times in this list because we really are very proud of it, and because it is foundational to our developing the perspectives underlying this workshop. In the context of node dating, we show how that method, which requires a user to associate a fossil with a node a priori, would have been severely misleading in the case of divergence-time estimates in Marattiales.
This paper proposes a "sequential empirical Bayes" approach to secondary calibrations in node-dating analyses. Basically, you can use the full posterior distribution of a node age from a previous analysis as the prior in a focal analysis. This, we argue, is dramatically superior to, for example, applying some sort of uniform prior.
A guide to best practices on the inclusion of fossil data in divergence-time estimation (from a pre-FBD/total-evidence framework -- i.e., focused on node dating), from a group of phylogenetics-focused paleobiologists.
Divergence-time estimation: Total-evidence dating and the fossilized birth-death process
- Pyron, R. A. (2011). Divergence time estimation using fossils as terminal taxa and the origins of Lissamphibia. Systematic Biology 60, 466--481. link.
- Heath, Tracy A., John P. Huelsenbeck, and Tanja Stadler (2014). The fossilized birth--death process for coherent calibration of divergence-time estimates. Proceedings of the National Academy of Sciences 111(29), E2957-E2966. link.
- Zhang, Chi, Tanja Stadler, Seraina Klopfstein, Tracy A. Heath, and Fredrik Ronquist (2016). Total-evidence dating under the fossilized birth--death process. Systematic biology 65(2), 228-249. link.
- Gavryushkina, Alexandra, Tracy A. Heath, Daniel T. Ksepka, Tanja Stadler, David Welch, and Alexei J. Drummond (2017). Bayesian total-evidence dating reveals the recent crown radiation of penguins. Systematic biology 66(1), 57-73. link.
The first total-evidence dating paper.
Welcome to the FBD! It took me (Carl) a long time to appreciate the importance of this paper; it's a lot more than simply allowing for the inference of sampled ancestors. Hopefully this workshop has made some of these impacts more apparent. Regardless, if you have any questions, ask us!
With Gavryushkina et al. (below), the first major application of the fossilized birth-death model with total-evidence dating (TED). This paper also introduces a "diversified sampling" tree prior.
With Zhang et al. (above), the first major application of the fossilized birth-death model with TED.
The Fossil Record
- Quental, Tiago B. and Charles R. Marshall (2010). Diversity dynamics: molecular phylogenies need the fossil record. Trends in ecology & evolution 25(8), 434-441. link.
- Quental, Tiago B. and Charles R. Marshall (2009). Extinction during evolutionary radiations: reconciling the fossil record with molecular phylogenies. Evolution: International Journal of Organic Evolution 63(12), 3158-3167. link.
- Marshall, Charles R. (2019). Using the fossil record to evaluate timetree timescales. Frontiers in Genetics 10(1), 1049. link.
- Marshall, Charles R. (1997). Confidence intervals on stratigraphic ranges with nonrandom distributions of fossil horizons. Paleobiology 23(2), 165-173. link.
- Marshall, Charles R. (1994). Confidence intervals on stratigraphic ranges: partial relaxation of the assumption of randomly distributed fossil horizons. Paleobiology 20(4), 459-469. link.
- Marshall, Charles R. (1990). Confidence intervals on stratigraphic ranges. Paleobiology 16(1), 1-10. link.
- Parham, James F., Philip C.J. Donoghue, Christopher J. Bell, Tyler D. Calway, Jason J. Head, Patricia A. Holroyd, Jun G. Inoue, Randall B. Irmis, Walter G. Joyce, Daniel T. Ksepka, José S. L. Patané, Nathan D. Smith, James E. Tarver, Marcel van Tuinen, Ziheng Yang, Kenneth D. Angielczyk, Jenny M. Greenwood, Christy A. Hipsley, Louis Jacobs, Peter J. Makovicky, Johannes Müller, Krister T. Smith, Jessica M. Theodor, Rachel C. M. Warnock, and Michael J. Benton (2012). Best practices for justifying fossil calibrations. Systematic Biology 61(2), 346-359. link.
- Wang, Steve C. and Charles R. Marshall (2016). Estimating times of extinction in the fossil record. Biology letters 12(4), 20150989. link.
Includes a broad description of the application of the fossil record to evaluating (and thus, indirectly, to inferring) divergence-time estimates, with lots of useful material on first and last appearance datums (FADs and LADs) and other underlying issues that motivate FBD and total-evidence dating approaches (versus node dating).
A guide to best practices on the inclusion of fossil data in divergence-time estimation (from a pre-FBD/total-evidence framework -- i.e., focused on node dating), from a group of phylogenetics-focused paleobiologists.
Morphological Data
- Brazeau, Martin D. (2011). Problematic character coding methods in morphology and their effects. Biological Journal of the Linnean Society 104(3), 489-498. link.
- Brazeau, Martin D., Thomas Guillerme, and Martin R. Smith (2019). An algorithm for morphological phylogenetic analysis with inapplicable data. Systematic biology 68(4), 619-631. link.
- Forey, Peter L. and Ian J. Kitching (2014). Experiments in coding multistate. Homology and Systematics: Coding Characters for Phylogenetic Analysis (Book chapter), 54. link.
- Hawkins, Julie A. (2002). A survey of primary homology assessment: different botanists perceive and define characters in different ways. Homology and Systematics: Coding Characters for Phylogenetic Analysis (Book chapter), 217.
- Hawkins, Julie A., Colin E. Hughes, and Robert W. Scotland (1997). Primary homology assessment, characters and character states. Cladistics 13(3), 275-283. link.
- Maddison, Wayne P. (1993). Missing data versus missing characters in phylogenetic analysis. Systematic Biology 42(4), 576-581. link.
- Simões, Tiago R., Michael W. Caldwell, Alessandro Palci, and Randall L. Nydam (2017). Giant taxon-character matrices: quality of character constructions remains critical regardless of size. Cladistics 33(2), 198-219. link.
- Strong, Ellen E. and Diana Lipscomb (1999). Character coding and inapplicable data. Cladistics 15(4), 363-371. link.
- Tarasov, Sergei (2019). Integration of anatomy ontologies and evo-devo using structured Markov models suggests a new framework for modeling discrete phenotypic traits. Systematic biology 68(5), 698-716. link.
- Wright, April M., Graeme T. Lloyd, and David M. Hillis (2016). Modeling character change heterogeneity in phylogenetic analyses of morphology through the use of priors. Systematic Biology 65(4), 602-611. link.
Assessing results: Tree visualization
- Hillis, David M., Tracy A. Heath, and Katherine St. John (2005). Analysis and visualization of tree space. Systematic Biology 54(3), 471-482. link.
- Huang, Wen, Guifang Zhou, Melissa Marchand, Jeremy R. Ash, David Morris, Paul Van Dooren, Jeremy M. Brown, Kyle A. Gallivan, and Jim C. Wilgenbusch (2016). TreeScaper: visualizing and extracting phylogenetic signal from sets of trees. Molecular Biology and Evolution 33(12), 3314-3316. link.
- Tribble, Carrie M., William A. Freyman, Michael J. Landis, Jun Ying Lim, Joëlle Barido-Sottani, Bjørn Tore Kopperud, Sebastian Höhna, and Michael R. May (2022). RevGadgets: an R Package for visualizing Bayesian phylogenetic analyses from RevBayes. Methods in Ecology and Evolution 13(2), 314-323. link.
- Robinson, David F. and Leslie R. Foulds (1981). Comparison of phylogenetic trees. Mathematical biosciences 53(1-2), 131-147. link.
- Kuhner, Mary K. and Joseph Felsenstein (1994). A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates. Molecular biology and evolution 11(3), 459-468. link.
A foundational paper discussing ways to visualize treespace (and thus to tell if, for example, your different models are yielding different inferences). Promotes multidimensional scaling (MDS): those images that Mike showed with each tree represented by a colored dot arrayed in two-dimensional space.
RevGadgets! An R package designed for the convenient visualization of RevBayes output. Use this to make your gorgeous figures.
Introduces what we now call the "Robinson-Foulds distance" between phylogenetic trees.
Introduces what we now call the "Kuhner-Felsenstein distance" between phylogenetic trees.
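For readers who want to see what the Robinson-Foulds distance actually counts, here is a minimal sketch in Python. It assumes trees are written as nested tuples of tip names (a representation chosen for this illustration only, not the format used by RevBayes, RevGadgets, or TreeScaper), and it treats the trees as unrooted: the distance is the number of non-trivial bipartitions of the tips found in one tree but not the other. The Kuhner-Felsenstein distance additionally uses branch lengths and is not shown.

```python
def leaf_set(tree):
    """Return the set of tip names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions (splits) implied by a tree.

    Each split is stored as the side that does NOT contain an arbitrary
    anchor tip, so the two halves of the same split compare equal.
    """
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - below if anchor in below else below
        # Skip trivial splits (a single tip versus everything else)
        # and the empty split produced at the root.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: splits present in exactly one tree."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must have the same tips")
    return len(splits(tree_a) ^ splits(tree_b))


# Two hypothetical five-tip trees that disagree about the placement of B and C.
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(tree_1, tree_2))
```

The two example trees share the split separating D and E from the rest, and each has one split the other lacks, so they differ by two splits.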
Assessing results: Model comparison and model adequacy
- Kass, Robert E. and Adrian E. Raftery (1995). Bayes factors. Journal of the american statistical association 90(430), 773-795. link.
- Lartillot, Nicolas and Hervé Philippe (2006). Computing Bayes factors using thermodynamic integration. Systematic biology 55(2), 195-207. link.
- Xie, Wangang, Paul O. Lewis, Yu Fan, Lynn Kuo, and Ming-Hui Chen (2011). Improving marginal likelihood estimation for Bayesian phylogenetic model selection. Systematic biology 60(2), 150-160. link.
- May, Michael R. and Carl J. Rothfels (2021). Mistreating birth-death models as priors in phylogenetic analysis compromises our ability to compare models. bioRxiv (preprint). link.
- Bollback, Jonathan P (2015). Posterior mapping and posterior predictive distributions. Statistical methods in molecular evolution 54(Book chapter), 439-462. link.
- Brown, Jeremy M. (2014). Detection of implausible phylogenetic inferences using posterior predictive assessment of model fit. Systematic biology 63(3), 334-348. link.
- Höhna, Sebastian, Lyndon M. Coghill, Genevieve G. Mount, Robert C. Thomson, and Jeremy M. Brown (2018). P3: Phylogenetic posterior prediction in RevBayes. Molecular biology and evolution 35(4), 1028-1034. link.
Provides a reference for the interpretation of Bayes factors. How big should the Bayes factor be before you feel that it is important?
Introduces to phylogenetics the path-sampling (thermodynamic integration) estimator of marginal likelihoods, from which Bayes factors are computed.
Introduces to phylogenetics the stepping-stone estimator of marginal likelihoods, from which Bayes factors are computed.
This paper describes a strange and long-overlooked "glitch" in how we generally do Bayesian inference on tree models, a glitch that results in Bayes factors for tree models, as typically computed, being unreliable. It also discusses ways to address this issue.
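As a rough illustration of how the pieces above fit together, the sketch below turns two log marginal likelihoods (such as those produced by a path-sampling or stepping-stone analysis) into a Bayes factor and labels it using the evidence categories from Kass and Raftery (1995), which are stated on the 2 ln(BF) scale. The function names and the marginal-likelihood values are made up for this example, and the caveat about tree models noted above still applies to how the marginal likelihoods themselves are obtained.

```python
def two_ln_bayes_factor(log_ml_1, log_ml_0):
    """Return 2 * ln(BF_10) from two log marginal likelihoods."""
    return 2.0 * (log_ml_1 - log_ml_0)


def kass_raftery_label(two_ln_bf):
    """Label the strength of evidence using Kass and Raftery's categories.

    The sign only says which model is favored, so the label is based
    on the absolute value.
    """
    strength = abs(two_ln_bf)
    if strength < 2.0:
        return "not worth more than a bare mention"
    if strength < 6.0:
        return "positive"
    if strength < 10.0:
        return "strong"
    return "very strong"


# Hypothetical log marginal likelihoods for two competing models.
log_ml_model_1 = -10234.7
log_ml_model_0 = -10241.2

score = two_ln_bayes_factor(log_ml_model_1, log_ml_model_0)
favored = "model 1" if score > 0 else "model 0"
label = kass_raftery_label(score)
print(f"2 ln(BF) = {score:.1f}; evidence for {favored}: {label}")
```

Working on the log scale avoids exponentiating very large negative numbers, which is why the difference of log marginal likelihoods is used rather than a ratio of marginal likelihoods.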