TET2 is a dioxygenase that catalyses multiple steps of 5-methylcytosine oxidation. Although TET2 mutations frequently occur in various types of haematological malignancies, the mechanism by which they increase risk for these cancers remains poorly understood. Here we show that Tet2(-/-) mice develop spontaneous myeloid, T- and B-cell malignancies after long latencies. Exome sequencing of Tet2(-/-) tumours reveals accumulation of numerous mutations, including Apc, Nf1, Flt3, Cbl, Notch1 and Mll2, which are recurrently deleted/mutated in human haematological malignancies. Single-cell-targeted sequencing of wild-type and premalignant Tet2(-/-) Lin(-)c-Kit(+) cells shows higher mutation frequencies in Tet2(-/-) cells. We further show that the increased mutational burden is particularly high at genomic sites that gained 5-hydroxymethylcytosine, where TET2 normally binds. Furthermore, TET2-mutated myeloid malignancy patients have significantly more mutational events than patients with wild-type TET2. Thus, Tet2 loss leads to hypermutagenicity in haematopoietic stem/progenitor cells, suggesting a novel TET2 loss-mediated mechanism of haematological malignancy pathogenesis.
Background: Targeted resequencing offers a cost-effective alternative to whole-genome and whole-exome sequencing when investigating regions known to be associated with a trait or disease. There are a number of approaches to targeted resequencing, including microfluidic PCR amplification, which may be enhanced by multiplex PCR. Currently, no open-source software can design next-generation multiplex PCR experiments while ensuring that primers are unique at the genome level and that compatible primers are pooled efficiently. Results: We present MPD, a software package that automates the design of multiplex PCR primers for next-generation sequencing. The core of MPD is implemented in C for speed; it uses a hashed genome to ensure primer uniqueness, avoids placing primers over sites of known variation, and efficiently pools compatible primers. A JavaScript web application (http://multiplexprimer.io) utilizing the MPD Perl package provides a convenient platform for users to make designs. Using a realistic set of genes identified by genome-wide association studies (GWAS), we achieve 90% coverage of all exonic regions using stringent design criteria. Using the first 47 primer pools for wet-lab validation, we sequenced ~25 kb at 99.7% completeness with a mean coverage of 300X among 313 samples simultaneously and identified 224 variants. The number and nature of variants we observe are consistent with high-quality sequencing. Conclusions: MPD can successfully design multiplex PCR experiments suitable for next-generation sequencing, and simplifies retooling targeted resequencing pipelines to focus on new targets as new genetic evidence emerges.
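The idea of using a hashed genome to check primer uniqueness can be illustrated with a minimal Python sketch. This is not MPD's implementation (which is written in C and not described in detail here); the function names `build_kmer_index` and `is_unique_primer`, the toy sequence, and the fixed primer length are all assumptions made for illustration. The sketch counts every k-mer on both strands in a hash table and accepts a candidate primer only if it occurs exactly once.

```python
from collections import Counter

# Translation table for complementing DNA bases (hypothetical helper).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_kmer_index(genome: str, k: int) -> Counter:
    """Count every k-mer on both strands of `genome` in a hash table."""
    counts: Counter = Counter()
    for i in range(len(genome) - k + 1):
        kmer = genome[i:i + k]
        rc = reverse_complement(kmer)
        counts[kmer] += 1
        # A k-mer equal to its own reverse complement is one site, not two.
        if rc != kmer:
            counts[rc] += 1
    return counts


def is_unique_primer(primer: str, index: Counter) -> bool:
    """A primer is accepted only if it matches exactly one site."""
    return index[primer] == 1


# Toy example: a 15-base "genome" and 5-base candidate primers.
toy_genome = "ACGTACCTGAACGTA"
toy_index = build_kmer_index(toy_genome, k=5)
for candidate in ("CCTGA", "ACGTA", "GGGGG"):
    print(candidate, is_unique_primer(candidate, toy_index))
```

In this toy genome, `CCTGA` occurs once and is accepted, `ACGTA` occurs twice and is rejected, and `GGGGG` is absent and is rejected. A real design tool would also need to consider near matches, melting temperature, known variant sites, and primer compatibility within a pool, none of which this sketch attempts.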
by
Bing Bai;
Chad Hales;
Ping-Chung Chen;
Yair Gozal;
Eric B Dammer;
Jason Jon Fritz;
Xusheng Wang;
Qiangwei Xia;
Duc Duong;
Craig Street;
Gloria Cantero;
Dongmei Cheng;
Drew R. Jones;
Zhiping Wu;
Yuxin Li;
Ian Diner;
Craig Heilman;
Howard D Rees III;
Hao Wu;
Li Lin;
Keith E. Szulwach;
Marla Gearing;
Elliott J. Mufson;
David A. Bennett;
Thomas J. Montine;
Nicholas Seyfried;
Thomas Wingo;
Yi E. Sun;
Peng Jin;
John Hanfelt;
Donna M. Willcock;
Allan I Levey;
James J Lah;
Junmin Peng
Deposition of insoluble protein aggregates is a hallmark of neurodegenerative diseases. The universal presence of β-amyloid and tau in Alzheimer’s disease (AD) has facilitated advancement of the amyloid cascade and tau hypotheses that have dominated AD pathogenesis research and therapeutic development. However, the underlying etiology of the disease remains to be fully elucidated. Here we report a comprehensive study of the human brain-insoluble proteome in AD by mass spectrometry. We identify 4,216 proteins, among which 36 proteins accumulate in the disease, including U1-70K and other U1 small nuclear ribonucleoprotein (U1 snRNP) spliceosome components. Similar accumulations in mild cognitive impairment cases indicate that spliceosome changes occur in early stages of AD. Multiple U1 snRNP subunits form cytoplasmic tangle-like structures in AD but not in other examined neurodegenerative disorders, including Parkinson disease and frontotemporal lobar degeneration. Comparison of RNA from AD and control brains reveals dysregulated RNA processing with accumulation of unspliced RNA species in AD, including myc box-dependent-interacting protein 1, clusterin, and presenilin-1. U1-70K knockdown or antisense oligonucleotide inhibition of U1 snRNP increases the protein level of amyloid precursor protein. Thus, our results demonstrate unique U1 snRNP pathology and implicate abnormal RNA splicing in AD pathogenesis.
Accurately selecting relevant alleles in large sequencing experiments remains technically challenging. Bystro (https://bystro.io/) is the first online, cloud-based application that makes variant annotation and filtering accessible to all researchers for terabyte-sized whole-genome experiments containing thousands of samples. Its key innovation is a general-purpose, natural-language search engine that enables users to identify and export alleles and samples of interest in milliseconds. The search engine dramatically simplifies complex filtering tasks that previously required programming experience or specialty command-line programs. Critically, Bystro's annotation and filtering capabilities are orders of magnitude faster than previous solutions, saving weeks of processing time for large experiments.
Background: There is a widespread belief that dominant mutations cause most cases of early-onset Alzheimer's Disease (onset ≤ 60 years, EOAD) yet epidemiologic evidence suggests they explain ≤ 10% of all EOAD cases.
Objective: To determine the genetic contribution to the remaining ~90% of non-autosomal dominant EOAD cases and identify the likely mechanism of inheritance in those cases.
Design, Subjects: A liability threshold model of disease was used to estimate heritability of EOAD and late-onset AD (LOAD) using concordance for AD among parent–offspring pairs. Individuals with probable AD and detailed parental history (n = 5,370) were identified in the Uniform Dataset (UDS), whose participants were recruited from 32 Alzheimer's Disease Centers.
Results: For LOAD (n = 4,302), we found sex-specific parent–offspring concordance that ranged from ~10% to 30%, resulting in a heritability of 69.8% (95% CI: 64.6–75.0%) and equal heritability for both sexes regardless of parental gender. For EOAD (n = 702), we found that the parent–offspring concordance is ≤ 10% and concordance among siblings is 21.6%. EOAD heritability is 92–100% for all likely values of EOAD prevalence.
Conclusion: We confirm LOAD is a highly polygenic disease. By contrast, the data for EOAD suggest it is an almost entirely genetically based disease, and the pattern of observed concordance for parent–offspring pairs and among siblings leads us to reject the hypotheses that EOAD is a purely dominant, mitochondrial, X-linked, or polygenic disorder. The most likely explanation of the data is that ~90% of EOAD cases are due to autosomal recessive causes.
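For readers unfamiliar with liability threshold models, the following Python sketch shows one classic way to turn concordance among relatives into a heritability estimate. It uses Falconer's approximation, which is an assumption here: the abstract does not state which estimator the study used, and the function name `falconer_heritability` and the input values are hypothetical and chosen only for illustration. Liability is assumed to be standard normal, with disease occurring above a threshold set by prevalence.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # liability is assumed to be N(0, 1)


def falconer_heritability(pop_prevalence: float,
                          relative_prevalence: float,
                          relatedness: float = 0.5) -> float:
    """Falconer's approximation to heritability on the liability scale.

    pop_prevalence      -- lifetime prevalence in the general population
    relative_prevalence -- prevalence (concordance) among relatives of cases
    relatedness         -- coefficient of relationship (0.5 for first degree)
    """
    for p in (pop_prevalence, relative_prevalence):
        if not 0.0 < p < 1.0:
            raise ValueError("prevalences must lie strictly between 0 and 1")
    # Liability thresholds implied by each prevalence.
    x_pop = _STD_NORMAL.inv_cdf(1.0 - pop_prevalence)
    x_rel = _STD_NORMAL.inv_cdf(1.0 - relative_prevalence)
    # Mean liability of affected individuals in the general population.
    mean_affected = _STD_NORMAL.pdf(x_pop) / pop_prevalence
    # Regression of relatives' liability on probands', scaled by relatedness.
    return (x_pop - x_rel) / (mean_affected * relatedness)


# Hypothetical inputs: 1% population prevalence, 10% among first-degree relatives.
print(round(falconer_heritability(0.01, 0.10), 2))
```

The estimate is zero when relatives are affected no more often than the general population and rises as concordance among relatives increases. Because the formula is an approximation, extreme inputs can yield values above 1, which is one reason more refined estimators are often preferred.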
Positive affect denotes a state of pleasurable engagement with the environment eliciting positive emotion such as contentment, enthusiasm or happiness. Positive affect is associated with favorable psychological, physical and economic outcomes in many longitudinal studies. With a heritability of ⩽64%, positive affect is substantially influenced by genetic factors; however, our understanding of genetic pathways underlying individual differences in positive affect is still limited. Here, through a genome-wide association study of positive affect in African-American participants, we identify a single-nucleotide polymorphism, rs322931, significantly associated with positive affect at P < 5 × 10⁻⁸, and replicate this association in another cohort. Furthermore, we show that the minor allele of rs322931 predicts expression of microRNAs miR-181a and miR-181b in human brain and blood, greater nucleus accumbens reactivity to positive emotional stimuli and enhanced fear inhibition. Prior studies have suggested that miR-181a is part of the reward neurocircuitry. Taken together, we identify a novel genetic variant for further elucidation of genetic underpinning of positive affect that mediates positive emotionality potentially via the nucleus accumbens and miR-181.
Amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) share phenotypic and pathologic overlap. Recently, an expansion of GGGGCC repeats in the first intron of C9orf72 was found to be a common cause of both illnesses; however, the molecular pathogenesis of this expanded repeat is unknown. Here we developed both Drosophila and mammalian models of this expanded hexanucleotide repeat and showed that expression of the expanded GGGGCC repeat RNA (rGGGGCC) is sufficient to cause neurodegeneration. We further identified Pur α as the RNA-binding protein of rGGGGCC repeats and discovered that Pur α and rGGGGCC repeats interact in vitro and in vivo in a sequence-specific fashion that is conserved between mammals and Drosophila. Furthermore, overexpression of Pur α in mouse neuronal cells and Drosophila mitigates rGGGGCC repeat-mediated neurodegeneration, and Pur α forms inclusions in the fly eye expressing expanded rGGGGCC repeats, as well as in cerebellum of human carriers of expanded GGGGCC repeats. These data suggest that expanded rGGGGCC repeats could sequester specific RNA-binding protein from their normal functions, ultimately leading to cell death. Taken together, these findings suggest that the expanded rGGGGCC repeats could cause neurodegeneration, and that Pur α may play a role in the pathogenesis of amyotrophic lateral sclerosis and frontotemporal dementia.
Background
The genetic basis of amyotrophic lateral sclerosis (ALS) is not entirely clear. While there are families with rare highly penetrant mutations in Cu/Zn superoxide dismutase 1 and several other genes that cause apparent Mendelian inheritance of the disease, most ALS occurs in families without another affected individual. However, twin studies suggest that all ALS has a substantial genetic basis. Herein, we estimate the genetic contribution to ALS in a clinically ascertained case series from the United States.
Methodology/Principal Findings
We used the database of the Emory ALS Center to ascertain individuals with ALS along with their family histories to determine the concordance among parents and offspring for the disease. We found that concordance for all parent–offspring pairs was low (<2%). With this concordance we found that ALS heritability, or the proportion of the disease explained by genetic factors, is between 40 and 45% for all likely estimates of ALS lifetime prevalence.
Conclusions/Significance
We found the lifetime risk of ALS is 1.1% in first-degree relatives of those with ALS. Environmental and genetic factors appear nearly equally important for the development of ALS.
The analysis of human whole-genome sequencing data presents significant computational challenges. The sheer size of datasets places an enormous burden on computational, disk array, and network resources. Here, we present an integrated computational package, PEMapper/PECaller, that was designed specifically to minimize the burden on networks and disk arrays, create output files that are minimal in size, and run in a highly computationally efficient way, with the single goal of enabling whole-genome sequencing at scale. In addition to improved computational efficiency, we implement a statistical framework that allows for a base-by-base error model, allowing this package to perform as well as or better than the widely used Genome Analysis Toolkit (GATK) in all key measures of performance on human whole-genome sequences.