About this item:

Author Notes:

E-mail: jiankang@umich.edu. Phone: +1 (734) 763-1607. *E-mail: tianwei.yu@emory.edu. Phone: +1 (404) 727-7671.

The authors declare no competing financial interest.

Research Funding:

This work has been partially supported by NIH Grants 1R01MH105561-02, P20HL113451, K01DK102851, and RF1AG051633.

Keywords:

  • Science & Technology
  • Life Sciences & Biomedicine
  • Biochemical Research Methods
  • Biochemistry & Molecular Biology
  • metabolic network data
  • optimal matching
  • feature selection
  • BODY-MASS INDEX
  • OXIDATIVE STRESS
  • ADIPOSE-TISSUE
  • HUMAN-DISEASE
  • INSULIN-RESISTANCE
  • RISK-FACTORS
  • OBESE WOMEN
  • INFLAMMATION
  • ASSOCIATION
  • ANNOTATION

Network Marker Selection for Untargeted LC–MS Metabolomics Data

Journal Title:

Journal of Proteome Research

Volume:

Volume 16, Number 3, Pages 1261-1269

Type of Work:

Article | Post-print: After Peer Review

Abstract:

Untargeted metabolomics using high-resolution liquid chromatography-mass spectrometry (LC-MS) is becoming one of the major areas of high-throughput biology. Functional analysis, that is, analyzing the data based on metabolic pathways or the genome-scale metabolic network, is critical in feature selection and interpretation of metabolomics data. One of the main challenges in functional analysis is the lack of feature identity in the LC-MS data itself. By matching mass-to-charge ratio (m/z) values of the features to theoretical values derived from known metabolites, some features can be matched to one or more known metabolites. When a feature matches multiple metabolites, in most cases only one of the matches can be true. At the same time, some known metabolites are missing from the measurements. Current network/pathway analysis methods ignore the uncertainty in metabolite identification and the missing observations, which could lead to errors in the selection of significant subnetworks/pathways. In this paper, we propose a flexible network feature selection framework that combines metabolomics data with the genome-scale metabolic network. The method adopts a sequential feature screening procedure and machine learning-based criteria to select important subnetworks and identify the optimal feature matching simultaneously. Simulation studies show that the proposed method has a much higher sensitivity than the commonly used maximal matching approach. For demonstration, we apply the method to a cohort of healthy subjects to detect subnetworks associated with the body mass index (BMI). The method identifies several subnetworks that are supported by the current literature and detects some subnetworks with plausible new functional implications. The R code is available at http://web1.sph.emory.edu/users/tyu8/MSS.
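
The matching ambiguity described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R implementation: the function name `match_features`, the 10 ppm tolerance, the feature IDs, and the toy m/z values (approximate [M+H]+ masses) are all assumptions chosen for illustration. It shows how one feature can match several metabolites while some known metabolites remain unobserved.

```python
# Hypothetical illustration only: names, tolerance, and values are assumptions,
# not taken from the paper or its R code.

def match_features(feature_mz, metabolite_mz, tol_ppm=10.0):
    """Map each feature ID to the metabolites whose theoretical m/z
    lies within tol_ppm (parts per million) of the observed m/z."""
    matches = {}
    for fid, mz in feature_mz.items():
        matches[fid] = [
            name
            for name, theo in metabolite_mz.items()
            if abs(mz - theo) / theo * 1e6 <= tol_ppm
        ]
    return matches

# Toy observed features (illustrative values).
feature_mz = {"f1": 181.0707, "f2": 147.0764, "f3": 300.0000}

# Toy theoretical m/z values; glucose and fructose are isomers,
# so they share the same mass and cannot be separated by m/z alone.
metabolite_mz = {
    "glucose [M+H]+": 181.0707,
    "fructose [M+H]+": 181.0707,
    "glutamine [M+H]+": 147.0764,
    "citrate [M+H]+": 193.0343,
}

matches = match_features(feature_mz, metabolite_mz)

# Features with more than one candidate metabolite (identity is uncertain).
ambiguous = [fid for fid, hits in matches.items() if len(hits) > 1]

# Known metabolites that no feature matched (missing observations).
unobserved = [
    name
    for name in metabolite_mz
    if all(name not in hits for hits in matches.values())
]

print(matches)
print("ambiguous features:", ambiguous)
print("unobserved metabolites:", unobserved)
```

In this toy case, feature `f1` has two candidate identities and citrate has no matching feature, which are the two sources of uncertainty the abstract says current network/pathway methods ignore.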

Copyright information:

© 2017 American Chemical Society.
