Bibliography

A list of papers that we think are important for understanding divergence-time estimation, sometimes accompanied by our witty commentary. If there’s a paper that you’ve found super helpful, let us know and we’ll add it. If it is one of your own papers, all the better!!

Bayesian phylogenetic inference: General principles

Baum, David A. and Stacey DeWitt Smith (2013). Tree Thinking: An Introduction to Phylogenetic Biology (Book). Oxford University Press.

Superb entry-level textbook introducing the principles of model-based phylogenetic inference.

Holder, Mark and Paul O. Lewis (2003). Phylogeny estimation: traditional and Bayesian approaches. Nature Reviews Genetics 4, 275-284. link.

These guys know a lot about Bayesian phylogenetics.

Divergence-time estimation: Non-identifiability/model sensitivity

Rannala, Bruce (2002). Identifiability of parameters in MCMC Bayesian inference of phylogeny. Systematic Biology 51, 754-760.

An introduction to the concept of non-identifiability, with a focus on Bayesian phylogenetics, and using relaxed clocks as an example.

Dos Reis, Mario and Ziheng Yang (2013). The unbearable uncertainty of Bayesian divergence time estimation. Journal of Systematics and Evolution 51(1), 30-43. link.

Clear discussion of the basic non-identifiability inherent in divergence-time estimation, and how this feature means that DTE differs in a fundamental way from typical applications of Bayesian inference (the data can never overwhelm the prior).

Condamine, Fabien L., Nathalie S. Nagalingum, Charles R. Marshall, and Hélène Morlon (2015). Origin and diversification of living cycads: a cautionary tale on the impact of the branching process prior in Bayesian molecular dating. BMC Evolutionary Biology 15(1), 1-18. link.

Excellent empirical example of the strong influence of the tree model on divergence-time estimates.

Rothfels, Carl J. and Eric Schuettpelz (2014). Accelerated rate of molecular evolution for vittarioid ferns is strong and not driven by selection. Systematic Biology 63(1), 31-54. link.

Including this paper here because it provides (figure 11) a clear example of the strong influence of the clock model on divergence-time estimates.

Sauquet, Hervé, Santiago Ramírez-Barahona, and Susana Magallón (2022). The age of flowering plants is unknown. bioRxiv (preprint). link.

The title says it all? This paper discusses the impact of non-identifiability on a focal problem---the crown age of the angiosperms---and points out that the estimates of that age are effectively determined by the priors used in the analysis. They also make the excellent point that so-called "molecular age estimates" are not molecular at all---effectively all the temporal data comes from the fossils (and the models), not from the molecular data.

May, Michael R., Dori L. Contreras, Michael A. Sundue, Nathalie S. Nagalingum, Cindy V. Looy, and Carl J. Rothfels (2021). Inferring the Total-Evidence Timescale of Marattialean Fern Evolution in the Face of Model Sensitivity. Systematic Biology 70(6), 1232-1255. link.

The non-identifiability inherent in divergence-time estimation, and thus prior sensitivity, is a major theme of this paper. For our dataset, the tree model has a huge effect (more so than any other model component), and we show that the "uniform prior" on timetrees, as typically constructed, has some pathological behavior and should be avoided (in our opinion, at least).

Divergence-time estimation: Relaxed clocks

Drummond, Alexei J. and Marc A. Suchard (2010). Bayesian random local clocks, or one rate to rule them all. BMC Biology 8(1), 1-12. link.

One of the few practical implementations of autocorrelated relaxed clocks.

Drummond, Alexei J., Simon Y. W. Ho, Matthew J. Phillips, and Andrew Rambaut (2006). Relaxed phylogenetics and dating with confidence. PLoS Biology 4(4), e88. link.

Among the better article titles.

Divergence-time estimation: Node dating

Graur, Dan and William Martin (2004). Reading the entrails of chickens: molecular timescales of evolution and the illusion of precision. Trends in Genetics 20(2), 80-86. link.

The "Spandrels" of divergence-time estimation. Not exactly the most friendly paper, and it terrified a generation of practitioners, but it is also full of good points (particularly that we can't ignore the uncertainty associated with our estimates).

Marshall, Charles R. (2008). A simple method for bracketing absolute divergence times on molecular phylogenies using multiple fossil calibration points. The American Naturalist 171(6), 726-742. link.

A method for getting more informed calibration densities by better including the information provided by the fossil record.

Marshall, Charles R. (2019). Using the fossil record to evaluate timetree timescales. Frontiers in Genetics 10(1), 1049. link.

Lots of discussion of calibration densities for node dating, among other descriptions of the application of the fossil record to evaluating (and thus, indirectly, to inferring) divergence-time estimates.

May, Michael R., Dori L. Contreras, Michael A. Sundue, Nathalie S. Nagalingum, Cindy V. Looy, and Carl J. Rothfels (2021). Inferring the Total-Evidence Timescale of Marattialean Fern Evolution in the Face of Model Sensitivity. Systematic Biology 70(6), 1232-1255. link.

We'll shamelessly plug this paper multiple times in this list because we really are very proud of it, and because it is foundational to our developing the perspectives underlying this workshop. In the context of node dating, we show how that method (which requires a user to associate a fossil with a node a priori) would have been severely misleading in the case of divergence-time estimates in Marattiales.

Rothfels, Carl J., Anne K. Johnson, Peter H. Hovenkamp, David L. Swofford, Harry C. Roskam, Christopher R. Fraser-Jenkins, Michael D. Windham, and Kathleen M. Pryer (2015). Natural hybridization between genera that diverged from each other approximately 60 million years ago. The American Naturalist 185(3), 433-442. link.

This paper proposes a "sequential empirical Bayes" approach to secondary calibrations in node-dating analyses. Basically, you can use the full posterior distribution of a node age from a previous analysis as the prior in a focal analysis, which, we argue, is dramatically superior to, for example, applying some sort of uniform prior.

Parham, James F., Philip C.J. Donoghue, Christopher J. Bell, Tyler D. Calway, Jason J. Head, Patricia A. Holroyd, Jun G. Inoue, Randall B. Irmis, Walter G. Joyce, Daniel T. Ksepka, José S. L. Patané, Nathan D. Smith, James E. Tarver, Marcel van Tuinen, Ziheng Yang, Kenneth D. Angielczyk, Jenny M. Greenwood, Christy A. Hipsley, Louis Jacobs, Peter J. Makovicky, Johannes Müller, Krister T. Smith, Jessica M. Theodor, Rachel C. M. Warnock, and Michael J. Benton (2012). Best practices for justifying fossil calibrations. Systematic Biology 61(2), 346-359. link.

A guide to best practices on the inclusion of fossil data in divergence-time estimation (from a pre-FBD/total-evidence framework, i.e., focused on node dating), from a group of phylogenetics-focused paleobiologists.

Divergence-time estimation: Total-evidence dating and the fossilized birth-death process

Pyron, R. Alexander (2011). Divergence time estimation using fossils as terminal taxa and the origins of Lissamphibia. Systematic Biology 60, 466-481. link.

The first total-evidence dating paper.

Heath, Tracy A., John P. Huelsenbeck, and Tanja Stadler (2014). The fossilized birth-death process for coherent calibration of divergence-time estimates. Proceedings of the National Academy of Sciences 111(29), E2957-E2966. link.

Welcome to the FBD! It took me (Carl) a long time to appreciate the importance of this paper; it's a lot more than simply allowing for the inference of sampled ancestors. Hopefully this workshop has made some of these impacts more apparent. Regardless, if you have any questions, ask us!

Zhang, Chi, Tanja Stadler, Seraina Klopfstein, Tracy A. Heath, and Fredrik Ronquist (2016). Total-evidence dating under the fossilized birth-death process. Systematic Biology 65(2), 228-249. link.

Along with Gavryushkina et al. (below), the first major application of the fossilized birth-death model with total-evidence dating. This paper also introduces a "diversified sampling" tree prior.

Gavryushkina, Alexandra, Tracy A. Heath, Daniel T. Ksepka, Tanja Stadler, David Welch, and Alexei J. Drummond (2017). Bayesian total-evidence dating reveals the recent crown radiation of penguins. Systematic Biology 66(1), 57-73. link.

Along with Zhang et al. (above), the first major application of the fossilized birth-death model with total-evidence dating.

The Fossil Record

Quental, Tiago B. and Charles R. Marshall (2010). Diversity dynamics: molecular phylogenies need the fossil record. Trends in Ecology & Evolution 25(8), 434-441. link.
Quental, Tiago B. and Charles R. Marshall (2009). Extinction during evolutionary radiations: reconciling the fossil record with molecular phylogenies. Evolution: International Journal of Organic Evolution 63(12), 3158-3167. link.
Marshall, Charles R. (2019). Using the fossil record to evaluate timetree timescales. Frontiers in Genetics 10(1), 1049. link.

Includes a broad description of the application of the fossil record to evaluating (and thus, indirectly, to inferring) divergence-time estimates, with lots of useful stuff on FADs and LADs (first and last appearance datums) and other underlying issues that motivate FBD and total-evidence dating approaches (versus node dating).

Marshall, Charles R. (1997). Confidence intervals on stratigraphic ranges with nonrandom distributions of fossil horizons. Paleobiology 23(2), 165-173. link.
Marshall, Charles R. (1994). Confidence intervals on stratigraphic ranges: partial relaxation of the assumption of randomly distributed fossil horizons. Paleobiology 20(4), 459-469. link.
Marshall, Charles R. (1990). Confidence intervals on stratigraphic ranges. Paleobiology 16(1), 1-10. link.
Parham, James F., Philip C.J. Donoghue, Christopher J. Bell, Tyler D. Calway, Jason J. Head, Patricia A. Holroyd, Jun G. Inoue, Randall B. Irmis, Walter G. Joyce, Daniel T. Ksepka, José S. L. Patané, Nathan D. Smith, James E. Tarver, Marcel van Tuinen, Ziheng Yang, Kenneth D. Angielczyk, Jenny M. Greenwood, Christy A. Hipsley, Louis Jacobs, Peter J. Makovicky, Johannes Müller, Krister T. Smith, Jessica M. Theodor, Rachel C. M. Warnock, and Michael J. Benton (2012). Best practices for justifying fossil calibrations. Systematic Biology 61(2), 346-359. link.

Wang, Steve C. and Charles R. Marshall (2016). Estimating times of extinction in the fossil record. Biology Letters 12(4), 20150989. link.

Morphological Data

Brazeau, Martin D. (2011). Problematic character coding methods in morphology and their effects. Biological Journal of the Linnean Society 104(3), 489-498. link.
Brazeau, Martin D., Thomas Guillerme, and Martin R. Smith (2019). An algorithm for morphological phylogenetic analysis with inapplicable data. Systematic Biology 68(4), 619-631. link.
Forey, Peter L. and Ian J. Kitching (2014). Experiments in coding multistate. Homology and Systematics: Coding Characters for Phylogenetic Analysis (Book chapter), 54. link.
Hawkins, Julie A. (2002). A survey of primary homology assessment: different botanists perceive and define characters in different ways. Homology and Systematics: Coding Characters for Phylogenetic Analysis (Book chapter), 217.
Hawkins, Julie A., Colin E. Hughes, and Robert W. Scotland (1997). Primary homology assessment, characters and character states. Cladistics 13(3), 275-283. link.
Maddison, Wayne P. (1993). Missing data versus missing characters in phylogenetic analysis. Systematic Biology 42(4), 576-581. link.
Simões, Tiago R., Michael W. Caldwell, Alessandro Palci, and Randall L. Nydam (2017). Giant taxon-character matrices: quality of character constructions remains critical regardless of size. Cladistics 33(2), 198-219. link.
Strong, Ellen E. and Diana Lipscomb (1999). Character coding and inapplicable data. Cladistics 15(4), 363-371. link.
Tarasov, Sergei (2019). Integration of anatomy ontologies and evo-devo using structured Markov models suggests a new framework for modeling discrete phenotypic traits. Systematic Biology 68(5), 698-716. link.
Wright, April M., Graeme T. Lloyd, and David M. Hillis (2016). Modeling character change heterogeneity in phylogenetic analyses of morphology through the use of priors. Systematic Biology 65(4), 602-611. link.

Assessing results: Tree visualization

Hillis, David M., Tracy A. Heath, and Katherine St. John (2005). Analysis and visualization of tree space. Systematic Biology 54(3), 471-482. link.

A foundational paper discussing ways to visualize treespace (and thus to tell if, for example, your different models are yielding different inferences). Promotes multidimensional scaling (MDS): the images Mike showed, in which each tree is represented by a colored dot arrayed in two-dimensional space.

Huang, Wen, Guifang Zhou, Melissa Marchand, Jeremy R. Ash, David Morris, Paul Van Dooren, Jeremy M. Brown, Kyle A. Gallivan, and Jim C. Wilgenbusch (2016). TreeScaper: visualizing and extracting phylogenetic signal from sets of trees. Molecular Biology and Evolution 33(12), 3314-3316. link.
Tribble, Carrie M., William A. Freyman, Michael J. Landis, Jun Ying Lim, Joëlle Barido-Sottani, Bjorn Tore Kopperud, Sebastian Höhna, and Michael R. May (2022). RevGadgets: an R package for visualizing Bayesian phylogenetic analyses from RevBayes. Methods in Ecology and Evolution 13(2), 314-323. link.

RevGadgets! An R package designed for the convenient visualization of RevBayes output. Use this to make your gorgeous figures.

Robinson, David F. and Leslie R. Foulds (1981). Comparison of phylogenetic trees. Mathematical Biosciences 53(1-2), 131-147. link.

Introduces what we now call the "Robinson-Foulds distance" between phylogenetic trees.
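For readers who want to see the idea concretely, here is a minimal Python sketch of the Robinson-Foulds distance computed as the number of nontrivial bipartitions (splits) found in one tree but not the other. The nested-tuple tree representation and the function names are our own illustrative assumptions, not anything taken from the paper or from a particular software package, and note that some programs report half of this value.

```python
def leaf_set(tree):
    """Return the set of leaf names in a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Collect the nontrivial splits of the leaf set implied by each internal branch."""
    all_leaves = leaf_set(tree)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - below
        # A split is informative only if both sides contain at least two leaves.
        if len(below) > 1 and len(other) > 1:
            # Store the split as an unordered pair so rooting does not matter.
            splits.add(frozenset([below, other]))
        return below

    visit(tree)
    return splits


def robinson_foulds(tree_a, tree_b):
    """Count the splits present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must have the same leaf set")
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


# Hypothetical five-taxon example trees.
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
tree_1_rerooted = ("A", ("B", ("C", ("D", "E"))))

print(robinson_foulds(tree_1, tree_2))
print(robinson_foulds(tree_1, tree_1_rerooted))
```

Because splits are stored as unordered pairs, the sketch treats trees as unrooted: re-rooting a tree leaves the distance at zero.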

Kuhner, Mary K. and Joseph Felsenstein (1994). A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates. Molecular Biology and Evolution 11(3), 459-468. link.

Introduces what we now call the "Kuhner-Felsenstein distance" between phylogenetic trees.

Assessing results: Model comparison and model adequacy

Kass, Robert E. and Adrian E. Raftery (1995). Bayes factors. Journal of the American Statistical Association 90(430), 773-795. link.

Provides a reference for the interpretation of Bayes factors. How big should the Bayes factor be before you feel that it is important?
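As a rough illustration, the sketch below converts two log marginal likelihoods into 2 ln(Bayes factor) and attaches the verbal categories from the Kass and Raftery scale (0 to 2, 2 to 6, 6 to 10, and above 10). The function name, the handling of values that fall exactly on a boundary, and the example numbers are our own assumptions; treat the labels as guidance rather than hard thresholds.

```python
def kass_raftery_category(log_ml_1, log_ml_0):
    """Return (2 ln BF_10, strength label, favored model).

    log_ml_1 and log_ml_0 are log marginal likelihoods of models 1 and 0,
    e.g. as estimated by stepping-stone or path sampling.
    """
    two_ln_bf = 2.0 * (log_ml_1 - log_ml_0)
    strength = abs(two_ln_bf)

    # Category boundaries follow the 2 ln(BF) scale; assigning a boundary
    # value to the higher category is an assumption of this sketch.
    if strength < 2:
        label = "not worth more than a bare mention"
    elif strength < 6:
        label = "positive"
    elif strength < 10:
        label = "strong"
    else:
        label = "very strong"

    if two_ln_bf > 0:
        favored = "model 1"
    elif two_ln_bf < 0:
        favored = "model 0"
    else:
        favored = "neither"
    return two_ln_bf, label, favored


# Hypothetical log marginal likelihoods for two competing models.
print(kass_raftery_category(-1000.0, -1004.0))
```

Working on the log scale avoids exponentiating marginal likelihoods, which are typically far too small to represent directly.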

Lartillot, Nicolas and Hervé Philippe (2006). Computing Bayes factors using thermodynamic integration. Systematic Biology 55(2), 195-207. link.

Introduces to phylogenetics the path-sampling estimator for computing Bayes factors.

Xie, Wangang, Paul O. Lewis, Yu Fan, Lynn Kuo, and Ming-Hui Chen (2011). Improving marginal likelihood estimation for Bayesian phylogenetic model selection. Systematic Biology 60(2), 150-160. link.

Introduces to phylogenetics the stepping-stone estimator for computing Bayes factors.

May, Michael R. and Carl J. Rothfels (2021). Mistreating birth-death models as priors in phylogenetic analysis compromises our ability to compare models. bioRxiv (preprint). link.

This paper describes a strange and long-overlooked "glitch" in how we generally do Bayesian inference on tree models, a glitch that results in Bayes factors for tree models, as typically computed, being unreliable. It also discusses ways to address this issue.

Bollback, Jonathan P. (2015). Posterior mapping and posterior predictive distributions. Statistical Methods in Molecular Evolution 54 (Book chapter), 439-462. link.
Brown, Jeremy M. (2014). Detection of implausible phylogenetic inferences using posterior predictive assessment of model fit. Systematic Biology 63(3), 334-348. link.
Höhna, Sebastian, Lyndon M. Coghill, Genevieve G. Mount, Robert C. Thomson, and Jeremy M. Brown (2018). P3: Phylogenetic posterior prediction in RevBayes. Molecular Biology and Evolution 35(4), 1028-1034. link.