Fragile X mental retardation protein (FMRP) is a multifunctional RNA-binding protein with crucial roles in neuronal development and function. Efforts aimed at elucidating how FMRP target mRNAs are selected have produced divergent sets of target mRNAs and putative FMRP-bound motifs, and a clear understanding of FMRP's binding determinants has been lacking. To clarify FMRP's binding to its target mRNAs, we produced a shared dataset of FMRP consensus binding sequences (FCBS), which were reproducibly identified in two published FMRP CLIP sequencing datasets. This comparative dataset revealed that, of the various sequence and structural motifs that have been proposed to specify FMRP binding, the short sequence motifs TGGA and GAC were corroborated, and a novel TAY motif was identified. In addition, the distribution of the FCBS set demonstrates that FMRP preferentially binds to the coding region of its targets, but it also revealed binding along 3′ UTRs in a subset of target mRNAs. Beyond probing these putative motifs, the FCBS dataset of reproducibly identified FMRP binding sites is a valuable tool for investigating FMRP targets and function.
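As an illustration of how the short motifs named in this abstract (TGGA, GAC, TAY) could be located in a sequence, the following is a minimal sketch and not the authors' pipeline. It assumes that Y in TAY is the IUPAC code for a pyrimidine (C or T) and that sequences are written in the DNA alphabet; the function names and the example sequence are hypothetical.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes (DNA alphabet).
# Only the codes needed here are listed; Y = pyrimidine (C or T) is the
# assumption used for the TAY motif.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    # A zero-width lookahead lets matches overlap each other.
    return re.compile("(?=(" + body + "))")


def find_motif(sequence, motif):
    """Return the 0-based start position of every match of motif in sequence."""
    pattern = motif_to_regex(motif)
    return [m.start() for m in pattern.finditer(sequence.upper())]


def motif_counts(sequence, motifs=("TGGA", "GAC", "TAY")):
    """Count occurrences of each motif in a single sequence."""
    return {motif: len(find_motif(sequence, motif)) for motif in motifs}


# Hypothetical example sequence, made up for illustration only.
example = "ATGGACTACTGGATATGAC"
print(motif_counts(example))
```

A real analysis would run such a scan over each binding-site sequence and compare the counts with a background set; that comparison is omitted here.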
Revolutionary changes in sequencing technology and the desire to develop therapeutics for rare diseases have led to the generation of an enormous amount of genomic data in the last 5 years. Large-scale sequencing done in both research and diagnostic laboratories has linked many new genes to rare diseases, but has also generated a number of variants that we cannot interpret today. It is clear that we remain a long way from a complete understanding of the genomic variation in the human genome and its association with human health and disease. Recent studies identified susceptibility markers to infectious diseases and also the contribution of rare variants to complex diseases in different populations. The sequencing revolution has also led to the creation of a large number of databases that act as "keepers" of data, and in many cases give an interpretation of the effect of the variant. This interpretation is based on reports in the literature, prediction models, and in some cases is accompanied by functional evidence. As we move toward the practice of genomic medicine, and consider its place in "personalized medicine," it is time to ask ourselves how we can aggregate this wealth of data into a single database for multiple users with different goals.
Accurately selecting relevant alleles in large sequencing experiments remains technically challenging. Bystro (https://bystro.io/) is the first online, cloud-based application that makes variant annotation and filtering accessible to all researchers for terabyte-sized whole-genome experiments containing thousands of samples. Its key innovation is a general-purpose, natural-language search engine that enables users to identify and export alleles and samples of interest in milliseconds. The search engine dramatically simplifies complex filtering tasks that previously required programming experience or specialty command-line programs. Critically, Bystro's annotation and filtering capabilities are orders of magnitude faster than previous solutions, saving weeks of processing time for large experiments.
DNA methylation is a key epigenetic mark involved in both normal development and disease progression. Recent advances in high-throughput technologies have enabled genome-wide profiling of DNA methylation. However, DNA methylation profiling often employs different designs and platforms with varying resolution, which hinders joint analysis of methylation data from multiple platforms. In this study, we propose a penalized functional regression model to impute missing methylation data. By incorporating functional predictors, our model utilizes information from nonlocal probes to improve imputation quality. We compared the performance of our functional model with that of linear regression and the best single-probe surrogate, both in real data and via simulations. Specifically, we applied the different imputation approaches to an acute myeloid leukemia dataset consisting of 194 samples; our method showed higher imputation accuracy, manifested, for example, by a 94% relative increase in information content and up to 86% more CpG sites passing post-imputation filtering. Our simulated association study further demonstrated that our method substantially improves the statistical power to identify trait-associated methylation loci. These findings indicate that the penalized functional regression model is a convenient and valuable imputation tool for methylation data, and that it can boost statistical power in downstream epigenome-wide association studies (EWAS).
by
Emily Allen;
Stephanie Sherman;
Sarah L. Nolin;
Anne Glicksman;
Nicole Tortora;
James Macpherson;
Montserrat Mila;
Angela M. Vianna-Morgante;
Carl Dobkin;
Gary J. Latham;
Andrew G. Hadd
Instability of the FMR1 repeat, commonly observed in transmissions of premutation alleles (55–200 repeats), is influenced by the size of the repeat, its internal structure and the sex of the transmitting parent. We assessed these three factors in unstable transmissions of 14/3,335 normal (~5 to 44 repeats), 54/293 intermediate (45–54 repeats), and 1,561/1,880 premutation alleles. While most unstable transmissions led to expansions, contractions to smaller repeats were observed in all size classes. For normal alleles, instability was more frequent in paternal transmissions and in alleles with long 3′ uninterrupted repeat lengths. For premutation alleles, contractions also occurred more often in paternal than maternal transmissions and the frequency of paternal contractions increased linearly with repeat size. All paternal premutation allele contractions were transmitted as premutation alleles, but maternal premutation allele contractions were transmitted as premutation, intermediate, or normal alleles. The eight losses of AGG interruptions in the FMR1 repeat occurred exclusively in contractions of maternal premutation alleles. We propose a refined model of FMR1 repeat progression from normal to premutation size and suggest that most normal alleles without AGG interruptions are derived from contractions of maternal premutation alleles.
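The size classes and transmission counts quoted in this abstract can be turned into a small worked example. The sketch below is illustrative only: the function and variable names are hypothetical, the class boundaries are the ones stated in the abstract, and the label for alleles above 200 repeats is a placeholder because the abstract does not discuss that range.

```python
def classify_fmr1_allele(repeats):
    """Assign an FMR1 repeat count to the size classes used in the abstract."""
    if repeats < 45:
        return "normal"          # ~5 to 44 repeats
    if repeats <= 54:
        return "intermediate"    # 45-54 repeats
    if repeats <= 200:
        return "premutation"     # 55-200 repeats
    # Not covered by the abstract; placeholder label only.
    return "above premutation range"


# Unstable transmissions / total transmissions, as reported in the abstract.
transmissions = {
    "normal": (14, 3335),
    "intermediate": (54, 293),
    "premutation": (1561, 1880),
}

# Fraction of transmissions that were unstable in each size class.
instability_rate = {
    size_class: unstable / total
    for size_class, (unstable, total) in transmissions.items()
}

for size_class, rate in instability_rate.items():
    print(f"{size_class}: {rate:.2%} of transmissions unstable")
```

With these counts, instability rises from well under 1% of normal-allele transmissions to roughly 18% of intermediate and about 83% of premutation transmissions, which is the size dependence the abstract describes.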
by
Donna M. McDonald-McGinn;
Somayyeh Fahiminiya;
Timothee Revil;
Beata A. Nowakowska;
Joshua Suhl;
Alice Bailey;
Elisabeth Mlynarski;
David R. Lynch;
Albert C. Yan;
Larissa T. Bilaniuk;
Kathleen E. Sullivan;
Stephen Warren;
Beverly S. Emanuel;
Joris R. Vermeesch;
Elaine H. Zackai;
Loydie A. Jerome-Majewska
Background: 22q11.2 deletion syndrome (22q11.2DS) is the most common microdeletion disorder, affecting an estimated 1:2000–4000 live births. Patients with 22q11.2DS have a broad spectrum of phenotypic abnormalities, which generally includes congenital cardiac abnormalities, palatal anomalies, and immunodeficiency. Additional findings, such as skeletal anomalies and autoimmune disorders, can confer significant morbidity in a subset of patients. 22q11.2DS is a contiguous gene deletion syndrome in which over 40 genes are deleted; thus deletion of several genes within this region contributes to the clinical features. Mutations outside or on the remaining 22q11.2 allele are also known to modify the phenotype. Methods: We utilised whole exome, targeted exome and/or Sanger sequencing to examine the genome of 17 patients with 22q11.2 deletions and phenotypic features found in <10% of affected individuals. Results and conclusions: In four unrelated patients, we identified three novel mutations in SNAP29, the gene implicated in the autosomal recessive condition cerebral dysgenesis, neuropathy, ichthyosis and keratoderma (CEDNIK). SNAP29 maps to 22q11.2 and encodes a soluble SNARE protein that is predicted to mediate vesicle fusion at the endoplasmic reticulum or Golgi membranes. This work confirms that the phenotypic variability observed in a subset of patients with 22q11.2DS is due to mutations on the non-deleted chromosome, which leads to unmasking of autosomal recessive conditions such as CEDNIK, Kousseff, and a potentially autosomal recessive form of Opitz G/BBB syndrome. Furthermore, our work implicates SNAP29 as a major modifier of variable expressivity in 22q11.2DS patients.