Publication

Optimizing parameter sensitivity analysis of large-scale microscopy image analysis workflows with multilevel computation reuse

Persistent URL
Last modified
  • 05/21/2025
Type of Material
Authors
  • Willian Barreiros Jr, University of Brasília
  • Jeremias Moreira, University of Brasília
  • Tahsin Kurc, Emory University
  • Jun Kong, Emory University
  • Alba C. M. A. Melo, University of Brasília
  • Joel Saltz, Emory University
  • George Teodoro, Emory University
Language
  • English
Date
  • 2020-01-25
Publisher
  • Wiley
Publication Version
Copyright Statement
  • © 2019 John Wiley & Sons, Ltd.
Final Published Version (URL)
Title of Journal or Parent Work
Volume
  • 32
Issue
  • 2
Grant/Funding Information
  • This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562. Specifically, it used the Bridges system, which is supported by NSF award number ACI-1445606, at the Pittsburgh Supercomputing Center (PSC).
  • This work was supported in part by U24CA180924, U24CA215109, and 1UG3CA225021 from the NCI, R01LM011119-01 and R01LM009239 from the NLM, CNPq, Capes/Brazil grant PROCAD-183794, and NIH K25CA181503.
Abstract
  • Parameter sensitivity analysis (SA) is an effective tool to gain knowledge about complex analysis applications and assess the variability in their analysis results. However, it is an expensive process as it requires the execution of the target application multiple times with a large number of different input parameter values. In this work, we propose optimizations to reduce the overall computation cost of SA in the context of analysis applications that segment high-resolution slide tissue images, i.e., images with resolutions of 100k × 100k pixels. Two cost-cutting techniques are combined to efficiently execute SA: use of distributed hybrid systems for parallel execution and computation reuse at multiple levels of an analysis pipeline to reduce the amount of computation. These techniques were evaluated using a cancer image analysis workflow on a hybrid cluster with 256 nodes, each with an Intel Phi and a dual-socket CPU. Our parallel execution method attained an efficiency of over 90% on 256 nodes. The hybrid execution on the CPU and Intel Phi improved the performance by 2×. Multilevel computation reuse led to performance gains of over 2.9×.
Author Notes
  • Correspondence: George Teodoro, Department of Computer Science, Federal University of Minas Gerais, 31270-901 Belo Horizonte-MG, Brazil. george@dcc.ufmg.br
Keywords
Research Categories
  • Biology, Biostatistics
  • Biology, Microbiology
  • Computer Science
  • Biology, Bioinformatics
