Author Notes:

Correspondence: tahsin.kurc@stonybrook.edu

TK, GT, MN, and FW designed the high-performance computing and data management components and carried out the performance evaluation experiments.

XQ, DW, and LY developed the content-based image retrieval methodologies.

XQ, DW, LY, and LC contributed image analysis expertise and provided the code used for image analysis.

JS and DF supervised the overall effort.

All authors read and approved the final manuscript.

The authors declare that they have no competing interests.


Research Funding:

This work was funded in part by HHSN261200800001E from the NCI, 1U24CA180924-01A1 from the NCI, 5R01LM011119-05 and 5R01LM009239-07 from the NLM, and CNPq.

This research used resources provided by the XSEDE Science Gateways program under grant TG-ASC130023, the Keeneland Computing Facility at the Georgia Institute of Technology, supported by the NSF under Contract OCI-0910735, and the Nautilus system at the University of Tennessee’s Center for Remote Data Analysis and Visualization supported by NSF Award ARRA-NSF-OCI-0906324.


  • Science & Technology
  • Life Sciences & Biomedicine
  • Biochemical Research Methods
  • Biotechnology & Applied Microbiology
  • Mathematical & Computational Biology
  • Biochemistry & Molecular Biology
  • High performance computing
  • GPUs
  • Databases

Scalable analysis of Big pathology image data cohorts using efficient methods and high-performance computing strategies


Journal Title:

BMC Bioinformatics


Volume 16, Page 399

Type of Work:

Article | Final Publisher PDF


Background: We describe a suite of tools and methods that form a core set of capabilities for researchers and clinical investigators to evaluate multiple analytical pipelines and quantify the sensitivity and variability of the results while conducting large-scale studies in investigative pathology and oncology. The overarching objective of the current investigation is to address the challenges of large data sizes and high computational demands.

Results: The proposed tools and methods take advantage of state-of-the-art parallel machines and efficient content-based image searching strategies. The content-based image retrieval (CBIR) algorithms can quickly detect and retrieve image patches similar to a query patch using a hierarchical analysis approach. The analysis component, based on high-performance computing, can carry out consensus clustering on 500,000 data points using a large shared-memory system.

Conclusions: Our work demonstrates that efficient CBIR algorithms and high-performance computing can be leveraged for efficient analysis of large microscopy images to meet the challenges of clinically salient applications in pathology. These technologies enable researchers and clinical investigators to make more effective use of the rich informational content contained within digitized microscopy specimens.
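The consensus clustering mentioned in the Results can be illustrated with a minimal sketch. The abstract does not say which base clustering algorithm or which consensus scheme the authors used, so everything here is an assumption made for illustration: the base clusterer is plain k-means, the consensus is a co-association matrix built from random subsamples, and the names `kmeans`, `consensus_matrix`, `runs`, `frac`, and `seed` are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means (an assumed base clusterer); returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consensus_matrix(points, k, runs=30, frac=0.8, seed=0):
    """Fraction of subsampled runs in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]  # times i and j were clustered together
    sampled = [[0] * n for _ in range(n)]   # times i and j were drawn in the same run
    for _ in range(runs):
        idx = rng.sample(range(n), max(k, int(frac * n)))
        labels = kmeans([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Toy data: two well-separated one-dimensional groups (hypothetical values).
points = [(0.0,), (1.0,), (2.0,), (20.0,), (21.0,), (22.0,)]
m = consensus_matrix(points, k=2)
```

Entries of `m` near 1 mark pairs that are grouped together consistently across runs, and entries near 0 mark pairs that are consistently separated. The pairwise matrix grows quadratically with the number of points, so a dense version like this one is only practical for small inputs.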

Copyright information:

© 2015 Kurc et al.

This is an Open Access work distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
