Publication

Differential gene expression analysis based on linear mixed model corrects false positive inflation for studying quantitative traits

Persistent URL
Last modified
  • 06/25/2025
Type of Material
Authors
  • Shizhen Tang, Emory University
  • Aron S Buchman, Rush University
  • Yanling Wang, Rush University
  • Denis Avey, Rush University
  • Jishu Xu, Rush University
  • Shinya Tasaki, Rush University
  • David A Bennett, Rush University
  • Qi Zheng, University of Louisville
  • Jingjing Yang, Emory University
Language
  • English
Date
  • 2023-10-03
Publisher
  • Springer Nature
Publication Version
Copyright Statement
  • © The Author(s) 2023
License
Final Published Version (URL)
Title of Journal or Parent Work
Volume
  • 15
Start Page
  • 16570
Grant/Funding Information
  • This work was supported by National Institutes of Health grants R35GM138313 (S.T., J.Y.), R21AG070659 (S.T., Q.Z., J.Y.), P30AG10161, P30AG72975, K01AG054700, R01AG15819, R01AG17917, R01AG56352, U01AG46152, and U01AG61356, and by National Science Foundation grant DMS-1952486 (Q.Z.).
Supplemental Material (URL)
Abstract
  • Differential gene expression (DGE) analysis has been widely employed to identify genes expressed differentially with respect to a trait of interest using RNA sequencing (RNA-Seq) data. Recent RNA-Seq datasets with large sample sizes pose challenges to existing DGE methods, which were mainly developed for dichotomous traits and small sample sizes. In particular, existing DGE methods are likely to result in inflated false positive rates. To address this gap for DGE analysis of quantitative traits, we employed a linear mixed model (LMM), a method widely used in genetic association studies. We first applied the LMM method to the discovery RNA-Seq data of dorsolateral prefrontal cortex (DLPFC) tissue (n = 632) with four continuous measures of Alzheimer’s Disease (AD) cognitive and neuropathologic traits. The quantile–quantile plots of p-values showed that false positive rates were well calibrated by LMM, whereas other methods not accounting for sample-specific mixed effects led to serious inflation. LMM identified 37 potentially significant genes with differential expression in DLPFC for at least one of the AD traits, 17 of which were replicated in the additional RNA-Seq data of DLPFC, supplemental motor area, spinal cord, and muscle tissues. This application study showed that LMM yields well-calibrated DGE results, and it suggested possibly shared gene regulatory mechanisms of AD traits across different relevant tissues.
Author Notes
Keywords
Research Categories
  • Biology, Genetics
  • Health Sciences, Medicine and Surgery
  • Biology, Bioinformatics
