Increasingly accurate and massive data have recently shed light on the fundamental question of how cells maintain a stable size trajectory as they progress through the cell cycle. Microbes seem to use strategies ranging from a pure sizer, where the end of a given phase is triggered when the cell reaches a critical size, to a pure adder, where the cell adds a constant size during a phase. Yet the biological origins of the observed spectrum of behavior remain elusive. We analyze a molecular size-control mechanism, based on experimental data from the yeast S. cerevisiae, that gives rise to behaviors smoothly interpolating between adder and sizer. Size control arises from the accumulation of an activator protein that titrates an inhibitor protein. Strikingly, the mechanism has two distinct regimes: for small initial cell size it acts as a sizer, whereas for larger initial cell size it acts as an imperfect adder, in agreement with recent experiments. Our model thus indicates that the adder and critical-size behaviors may simply be different dynamical regimes of a single simple biophysical mechanism.
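The sizer and adder limits described above can be told apart by regressing final size on initial size: a pure sizer gives a slope of 0 (final size is independent of initial size), a pure adder gives a slope of 1 (a constant increment is added), and intermediate slopes correspond to imperfect control. The following is a minimal sketch of that diagnostic only, not the molecular titration model itself. The function name and the toy size values are illustrative assumptions.

```python
def size_control_slope(initial_sizes, final_sizes):
    """Least-squares slope of final size against initial size.

    A slope near 0 indicates sizer-like control and a slope near 1
    indicates adder-like control.
    """
    n = len(initial_sizes)
    mean_x = sum(initial_sizes) / n
    mean_y = sum(final_sizes) / n
    sxx = sum((x - mean_x) ** 2 for x in initial_sizes)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(initial_sizes, final_sizes))
    return sxy / sxx


# Toy values in arbitrary units (hypothetical, not measurements).
initial = [1.0, 1.5, 2.0, 2.5]
sizer_final = [3.0 for _ in initial]      # phase ends at a critical size
adder_final = [s + 1.2 for s in initial]  # constant size added per phase

print(size_control_slope(initial, sizer_final))
print(size_control_slope(initial, adder_final))
```

Fitting the slope separately to small and large initial sizes would be one way to look for the two regimes the abstract describes.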
Bacterial genomes exhibit widespread horizontal gene transfer, resulting in highly variable genome content that complicates the inference of genetic interactions. In this study, we develop a method for detecting coevolving genes from large datasets of bacterial genomes based on pairwise comparisons of closely related individuals, analogous to a pedigree study in eukaryotic populations. We apply our method to pairs of genes from the Staphylococcus aureus accessory genome of over 75,000 annotated gene families using a database of over 40,000 whole genomes. We find many pairs of genes that appear to be gained or lost in a coordinated manner, as well as pairs where the gain of one gene is associated with the loss of the other. These pairs form networks of rapidly coevolving genes, primarily consisting of genes involved in virulence, mechanisms of horizontal gene transfer, and antibiotic resistance, particularly the SCCmec complex. While we focus on gene gain and loss, our method can also detect genes that tend to acquire substitutions in tandem, as well as genotype-phenotype or phenotype-phenotype coevolution. Finally, we present the R package DeCoTUR, which implements our method.
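The idea of comparing closely related genomes can be illustrated with a simple count. For each close pair, record which genes differ in presence and in which direction. Two genes that differ in the same direction suggest coordinated gain or loss, and two that differ in opposite directions suggest that gain of one accompanies loss of the other. This is a hedged sketch under simplifying assumptions: it uses an unweighted count, and the function name, data layout, and gene labels are hypothetical. It does not reproduce the scoring used by the published method or the DeCoTUR API.

```python
from itertools import combinations


def coevolution_scores(presence, close_pairs):
    """Score gene pairs from presence/absence differences in close pairs.

    presence: dict mapping genome name -> dict mapping gene name -> 0 or 1
    close_pairs: list of (genome_a, genome_b) tuples of close relatives
    Returns a dict mapping (gene_i, gene_j) -> net score, where +1 is added
    for each close pair in which both genes differ in the same direction
    and -1 for each in which they differ in opposite directions.
    """
    scores = {}
    for a, b in close_pairs:
        # Signed difference (+1 or -1) for genes that differ within the pair.
        diffs = {g: presence[a][g] - presence[b][g]
                 for g in presence[a]
                 if presence[a][g] != presence[b][g]}
        for g1, g2 in combinations(sorted(diffs), 2):
            scores[(g1, g2)] = scores.get((g1, g2), 0) + diffs[g1] * diffs[g2]
    return scores


# Toy presence/absence table (hypothetical genomes and genes).
presence = {
    "genome1": {"geneA": 1, "geneB": 1, "geneC": 0},
    "genome2": {"geneA": 0, "geneB": 0, "geneC": 1},
    "genome3": {"geneA": 1, "geneB": 1, "geneC": 0},
    "genome4": {"geneA": 1, "geneB": 1, "geneC": 0},
}
close_pairs = [("genome1", "genome2"), ("genome3", "genome4")]
scores = coevolution_scores(presence, close_pairs)
print(scores)
```

Restricting the comparison to close relatives means each observed difference is likely to reflect a recent gain or loss event, which is what makes the analogy to a pedigree study useful.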
Samples of multiple complete genome sequences contain vast amounts of
information about the evolutionary history of populations, much of it in the
associations among polymorphisms at different loci. Current methods that take
advantage of this linkage information rely on models of recombination and
coalescence, limiting the sample sizes and populations that they can analyze.
We introduce a method, Minimal-Assumption Genomic Inference of Coalescence
(MAGIC), that reconstructs key features of the evolutionary history, including
the distribution of coalescence times, by integrating information across
genomic length scales without using an explicit model of recombination,
demography or selection. Using simulated data, we show that MAGIC's performance
is comparable to PSMC' on single diploid samples generated with standard
coalescent and recombination models. More importantly, MAGIC can also analyze
arbitrarily large samples and is robust to changes in the coalescent and
recombination processes. Using MAGIC, we show that the inferred coalescence
time histories of samples of multiple human genomes exhibit inconsistencies
with a description in terms of an effective population size based on
single-genome data.
The bottleneck governing infectious disease transmission describes the size of the pathogen population transferred from the donor to the recipient host. Accurate quantification of the bottleneck size is particularly important for rapidly evolving pathogens such as influenza virus, as narrow bottlenecks reduce the amount of transferred viral genetic diversity and, thus, may decrease the rate of viral adaptation. Previous studies have estimated bottleneck sizes governing viral transmission by using statistical analyses of variants identified in pathogen sequencing data. These analyses, however, did not account for variant calling thresholds and stochastic viral replication dynamics within recipient hosts. Because these factors can skew bottleneck size estimates, we introduce a new method for inferring bottleneck sizes that accounts for these factors. Through the use of a simulated data set, we first show that our method, based on beta-binomial sampling, accurately recovers transmission bottleneck sizes, whereas other methods fail to do so. We then apply our method to a data set of influenza A virus (IAV) infections for which viral deep-sequencing data from transmission pairs are available. We find that the IAV transmission bottleneck size estimates in this study are highly variable across transmission pairs, while the mean bottleneck size of 196 virions is consistent with a previous estimate for this data set. Furthermore, regression analysis shows a positive association between estimated bottleneck size and donor infection severity, as measured by temperature. These results support findings from experimental transmission studies showing that bottleneck sizes across transmission events can be variable and influenced in part by epidemiological factors.
IMPORTANCE The transmission bottleneck size describes the size of the pathogen population transferred from the donor to the recipient host and may affect the rate of pathogen adaptation within host populations. Recent advances in sequencing technology have enabled bottleneck size estimation from pathogen genetic data, although the statistical methods used are not yet consistent. Here, we introduce a new approach to infer the bottleneck size that accounts for variant identification protocols and noise during pathogen replication. We show that failing to account for these factors leads to an underestimation of bottleneck sizes. We apply this method to an existing data set of human influenza virus infections, showing that transmission is governed by a loose, but highly variable, transmission bottleneck whose size is positively associated with the severity of infection of the donor. Beyond advancing our understanding of influenza virus transmission, we hope that this work will provide a standardized statistical approach for bottleneck size estimation for viral pathogens.
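One way to read the beta-binomial sampling idea is as follows. For a variant at donor frequency p, the number k of founding virions carrying it is binomial with bottleneck size Nb, and stochastic growth in the recipient spreads the observed frequency as a Beta(k, Nb - k) distribution. The sketch below sums over k and picks the Nb that maximizes the likelihood on a grid. It is a simplified illustration under stated assumptions, not the authors' implementation: it drops the k = 0 and k = Nb terms, requires frequencies strictly between 0 and 1, and omits the variant-calling-threshold handling that the abstract emphasizes. All function names and input values are hypothetical.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def log_beta_pdf(x, a, b):
    """Log of the Beta(a, b) density at x, for 0 < x < 1."""
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def log_likelihood(nb, variant_pairs):
    """Log-likelihood of bottleneck size nb given (donor, recipient) frequencies."""
    total = 0.0
    for donor_freq, recipient_freq in variant_pairs:
        # Sum over the number k of founding virions carrying the variant.
        terms = [log_binom_pmf(k, nb, donor_freq)
                 + log_beta_pdf(recipient_freq, k, nb - k)
                 for k in range(1, nb)]
        peak = max(terms)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total


def estimate_bottleneck(variant_pairs, max_nb=200):
    """Grid-search maximum-likelihood bottleneck size between 2 and max_nb."""
    return max(range(2, max_nb + 1),
               key=lambda nb: log_likelihood(nb, variant_pairs))


# Hypothetical frequency pairs: large shifts suggest a tight bottleneck,
# near-identical frequencies suggest a loose one.
tight_pairs = [(0.3, 0.9), (0.4, 0.05), (0.2, 0.7)]
loose_pairs = [(0.3, 0.31), (0.4, 0.39), (0.2, 0.21)]
print(estimate_bottleneck(tight_pairs), estimate_bottleneck(loose_pairs))
```

Because the grid stops at `max_nb`, an estimate equal to that bound should be read as "at least this large" rather than as a point estimate.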
The transmission bottleneck is defined as the number of viral particles that transmit from one host to establish an infection in another. Genome sequence data have been used to evaluate the size of the transmission bottleneck between humans infected with the influenza virus; however, the methods used to make these estimates have some limitations. Specifically, viral allele frequencies, which form the basis of many calculations, may not fully capture a process which involves the transmission of entire viral genomes. Here, we set out a novel approach for inferring viral transmission bottlenecks; our method combines an algorithm for haplotype reconstruction with maximum likelihood methods for bottleneck inference.
This approach allows for rapid calculation and performs well when applied to data from simulated transmission events; errors in the haplotype reconstruction step did not adversely affect inferences of the population bottleneck. Applied to data from a previous household transmission study of influenza A infection, we confirm the result that the majority of transmission events involve a small number of viruses, albeit with slightly looser bottlenecks being inferred, with between 1 and 13 particles transmitted in the majority of cases. While influenza A transmission involves a tight population bottleneck, the bottleneck is not so tight as to universally prevent the transmission of within-host viral diversity.
IMPORTANCE
Viral populations undergo a repeated cycle of within-host growth followed by transmission. Viral evolution is affected by each stage of this cycle. The number of viral particles transmitted from one host to another, known as the transmission bottleneck, is an important factor in determining how the evolutionary dynamics of the population play out, restricting the extent to which the evolved diversity of the population can be passed from one host to another. Previous study of viral sequence data has suggested that the transmission bottleneck size for influenza A transmission between human hosts is small. Reevaluating these data using a novel and improved method, we largely confirm this result, although in some cases we infer a slightly larger bottleneck, of between 1 and 13 virions. While a tight bottleneck operates in human influenza transmission, it is not extreme in nature; some diversity can be meaningfully retained between hosts.
As an application of the transmission bottleneck size estimation method developed in this paper, we used a previously published influenza A data set first presented by L. L. M. Poon, T. Song, R. Rosenfeld, X. Lin, et al. [Nat Genet 48(2):195-200, 2016, https://doi.org/10.1038/ng.3479]. Recently, K. S. Xue and J. D. Bloom (Nat Genet, 25 February 2019, https://doi.org/10.1038/s41588-019-0349-3) have shown that the Poon et al. data set is “technically contaminated” with read pairs split between unrelated samples, which had the effect of inflating the similarities in allele frequencies between samples. As a result, when we applied our beta-binomial approach to the Poon et al. data set, it yielded transmission bottleneck size estimates that are incongruous with, and larger than, other transmission bottleneck size estimates for seasonal influenza A virus. The validity of the beta-binomial estimation method presented in our paper is itself unaffected. While we therefore continue to encourage the use of our developed estimation method on other data sets, we would like to caution the reader against citing our paper as providing evidence for a loose transmission bottleneck size for influenza A virus. Computer code for the beta-binomial transmission bottleneck size estimation method is available on GitHub at https://github.com/koellelab/betabinomial_bottleneck.