In the following narrative review, we discuss the potential role of large language models (LLMs) in medical device innovation, with examples using Generative Pre-trained Transformer 4 (GPT-4). Throughout the biodesign process, LLMs can offer prompt-driven insights, aiding problem identification, knowledge assimilation and decision-making. Intellectual property analysis, regulatory assessment and market analysis emerge as key LLM applications. Through case examples, we underscore LLMs’ transformative ability to democratise access to information and expertise, facilitating inclusive innovation in medical devices, as well as their effectiveness in providing real-time, individualised feedback to innovators of all experience levels. By lowering barriers to entry, LLMs accelerate transformative advancements and foster collaboration among established and emerging stakeholders.
INTRODUCTION: To date, most mammography-related AI models have been trained on either film or digital mammogram datasets, with little overlap. We investigated whether combining film and digital mammography during training helps or hinders modern models designed for use on digital mammograms. METHODS: To this end, a total of six binary classifiers were trained for comparison. The first three classifiers were trained using images only from the Emory Breast Imaging Dataset (EMBED) with ResNet50, ResNet101, and ResNet152 architectures. The next three classifiers were trained using images from the EMBED, Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM), and Digital Database for Screening Mammography (DDSM) datasets. All six models were tested only on digital mammograms from EMBED. RESULTS: Performance degradation of the customized ResNet models was statistically significant overall when the EMBED dataset was augmented with CBIS-DDSM/DDSM. While performance degradation was observed in all racial subgroups, some subgroups experienced a more severe drop than others. DISCUSSION: The degradation may potentially be due to (1) a mismatch in features between film-based and digital mammograms and (2) a mismatch in pathologic and radiological information. In conclusion, using both film and digital mammography during training may hinder modern models designed for breast cancer screening. Caution is required when combining film-based and digital mammograms or when utilizing pathologic and radiological information simultaneously.
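As an illustration of the training setup this abstract describes, the sketch below assembles a "digital-only" versus "combined film plus digital" training set and fine-tunes a torchvision ResNet50 as a binary classifier. This is a minimal sketch, not the study's code: the directory layout, label convention, image size and hyperparameters are all assumptions for demonstration.

```python
# Illustrative sketch only: one of six binary ResNet classifiers, trained on
# digital-only versus combined film+digital mammogram folders (paths assumed).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, ConcatDataset
from torchvision import datasets, models, transforms

def build_model():
    # Binary classifier: replace the ImageNet head with a single logit.
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    model.fc = nn.Linear(model.fc.in_features, 1)
    return model

def make_loader(roots, train=True):
    # Mammograms are grayscale; replicate to 3 channels for ResNet input.
    tfm = transforms.Compose([
        transforms.Grayscale(num_output_channels=3),
        transforms.Resize((512, 512)),
        transforms.ToTensor(),
    ])
    # "Digital-only" uses one root (e.g. EMBED); the "combined" setting also
    # concatenates film-based folders (CBIS-DDSM, DDSM). ImageFolder assumes
    # one subdirectory per class label.
    ds = ConcatDataset([datasets.ImageFolder(r, transform=tfm) for r in roots])
    return DataLoader(ds, batch_size=32, shuffle=train, num_workers=4)

def train(model, loader, epochs=10, device="cuda"):
    model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.float().unsqueeze(1).to(device)
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return model

if __name__ == "__main__":
    digital_only = make_loader(["data/embed"])                              # classifiers 1-3
    combined = make_loader(["data/embed", "data/cbis_ddsm", "data/ddsm"])   # classifiers 4-6
    model = train(build_model(), combined)
```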
by
Jack Gallifant;
Joe Zhang;
Stephen Whebell;
Justin Quion;
Braiam Escobar;
Judy Gichoya;
Karen Herrera;
Ruxana Jina;
Swathikan Chidambaram;
Abha Mehndiratta;
Richard Kimera;
Alvin Marcelo;
Portia Grace Fernandez-Marcelo;
Juan Sebastian Osorio;
Cleva Villanueva;
Lama Nazer;
Irene Dankwa-Mullan;
Leo Anthony Celi
Current methods to evaluate a journal's impact rely on the downstream citation mapping used to generate the Impact Factor. This approach is a fragile metric, prone to being skewed by outlier values, and does not speak to a researcher's contribution to furthering health outcomes for all populations. Therefore, we propose the implementation of a Diversity Factor to fulfill this need and supplement the current metrics. It is composed of four key elements: dataset properties, author country, author gender and departmental affiliation. Because each element is significant in its own right, the elements should be assessed independently of each other rather than combined into a simplified score to be optimized. Herein, we discuss the necessity of such metrics, provide a framework to build upon, evaluate the current landscape through the lens of each key element and publish the findings on a freely available website that enables further evaluation. The OpenAlex database was used to extract the metadata of all papers published from 2000 until August 2022, and natural language processing was used to identify individual elements. Features were then displayed individually on a static dashboard developed using Tableau Public, which is available at www.equitablescience.com. In total, 130,721 papers were identified from 7,462 journals, demonstrating significant underrepresentation of female authors and of authors from low- and middle-income countries (LMICs). These findings are pervasive and show no positive correlation with a journal's Impact Factor. Systematic collection of the Diversity Factor would allow for more detailed analysis, highlight gaps in knowledge, and reflect confidence in the translation of related research. Converting this metric into an active pipeline would account for the fact that how we define those most at risk will change over time, and would quantify responses to particular initiatives. Therefore, continuous measurement of outcomes across groups, and of those investigating those outcomes, will never lose importance. Moving forward, we encourage further revision and improvement by diverse author groups in order to better refine this concept.
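For readers interested in reproducing the kind of metadata extraction described here, the following sketch queries the public OpenAlex works endpoint for papers in the study window and collects the country codes of each author's affiliated institutions. It is illustrative only: the filters, record fields used and any downstream processing are assumptions, not the authors' pipeline.

```python
# Minimal sketch of pulling work metadata from the public OpenAlex API for a
# Diversity Factor-style analysis (filters and date range assumed).
import requests

OPENALEX_WORKS = "https://api.openalex.org/works"

def fetch_works(from_date="2000-01-01", to_date="2022-08-31", per_page=200):
    """Yield work records using OpenAlex cursor pagination."""
    cursor = "*"
    while cursor:
        resp = requests.get(OPENALEX_WORKS, params={
            "filter": f"from_publication_date:{from_date},to_publication_date:{to_date}",
            "per-page": per_page,
            "cursor": cursor,
        }, timeout=30)
        resp.raise_for_status()
        payload = resp.json()
        yield from payload["results"]
        cursor = payload["meta"].get("next_cursor")

def author_countries(work):
    """Collect institution country codes for each authorship on a work."""
    countries = []
    for authorship in work.get("authorships", []):
        for inst in authorship.get("institutions", []):
            if inst.get("country_code"):
                countries.append(inst["country_code"])
    return countries

if __name__ == "__main__":
    for i, work in enumerate(fetch_works()):
        print(work.get("display_name"), author_countries(work))
        if i >= 4:  # stop after a few records in this demo
            break
```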
The health needs of those living in resource-limited settings are a vastly overlooked and understudied area at the intersection of machine learning (ML) and health care. While the use of ML in health care has been popularized over the last few years by advances in deep learning, low- and middle-income countries (LMICs) have already been undergoing a digital transformation of their own in health care over the last decade, leapfrogging milestones through the adoption of mobile health (mHealth). With the introduction of new technologies, it is common to start afresh with a top-down approach and implement these technologies in isolation, leading to a lack of use and a waste of resources. In this paper, we outline the necessary considerations both from the perspective of current gaps in research and from the lived experiences of health care professionals in resource-limited settings. We also briefly outline several key components of successful implementation and deployment of technologies within health systems in LMICs, including technical and cultural considerations in the development process relevant to building machine learning solutions. We then draw on these experiences to address where key opportunities for impact exist in resource-limited settings, and where AI/ML can provide the most benefit.
by
J. Raymond Geis;
Adrian Brady;
Carol C. Wu;
Jack Spencer;
Erik Ranschaert;
Jacob L. Jaremko;
Steve G. Langer;
Andrea Borondy Kitts;
Judy Birch;
William F. Shields;
Robert van den Hoven van Genderen;
Elmar Kotter;
Judy Gichoya;
Tessa S. Cook;
Matthew B. Morgan;
An Tang;
Nabile Safdar;
Marc Kohli
This is a condensed summary of an international multisociety statement on ethics of artificial intelligence (AI) in radiology produced by the ACR, European Society of Radiology, RSNA, Society for Imaging Informatics in Medicine, European Society of Medical Imaging Informatics, Canadian Association of Radiologists, and American Association of Physicists in Medicine. AI has great potential to increase efficiency and accuracy throughout radiology, but also carries inherent pitfalls and biases. Widespread use of AI-based intelligent and autonomous systems in radiology can increase the risk of systemic errors with high consequence, and highlights complex ethical and societal issues. Currently, there is little experience using AI for patient care in diverse clinical settings. Extensive research is needed to understand how to best deploy AI in clinical practice. This statement highlights our consensus that ethical use of AI in radiology should promote well-being, minimize harm, and ensure that the benefits and harms are distributed among stakeholders in a just manner. We believe AI should respect human rights and freedoms, including dignity and privacy. It should be designed for maximum transparency and dependability. Ultimate responsibility and accountability for AI remain with its human designers and operators for the foreseeable future. The radiology community should start now to develop codes of ethics and practice for AI that promote any use that helps patients and the common good, and should block use of radiology data and algorithms for financial gain without those two attributes.
Introduction: In January 2023, the National Institutes of Health (NIH) implemented a Data Management and Sharing Policy aiming to leverage data collected during NIH-funded research. The COVID-19 pandemic illustrated that this practice is equally vital for augmenting patient research. In addition, data sharing acts as a necessary safeguard against the introduction of analytical biases. While the pandemic provided an opportunity to curtail critical research issues such as reproducibility and validity through data sharing, this did not materialise in practice and became an example of 'Open Data in Appearance Only' (ODIAO). Here, we define ODIAO as the intent of data sharing without the occurrence of actual data sharing (eg, material or digital data transfers). Objective: Propose a framework that states the main risks associated with data sharing, systematically present risk mitigation strategies and provide examples through a healthcare lens. Methods: This framework was informed by critical aspects of both the Open Data Institute and the NIH's 2023 Data Management and Sharing Policy plan guidelines. Results: Through our examination of legal, technical, reputational and commercial categories, we find barriers to data sharing ranging from misinterpretation of the General Data Protection Regulation (GDPR) to a lack of technical personnel able to execute large data transfers. From this, we deduce that at numerous touchpoints, data sharing is presently too disincentivised to become the norm. Conclusion: In order to move towards Open Data, we propose the creation of mechanisms for incentivisation, beginning with recentring data sharing on patient benefits, additional clauses in grant requirements and committees to encourage adherence to data reporting practices.
by
Nick Wilmes;
Charlotte WE Hendriks;
Caspar TA Viets;
Simon JWM Cornelissen;
Walther NKA Van Mook;
Josanne Cox-Brinkman;
Leo A Celi;
Nicole Martinez-Martin;
Judy Gichoya;
Craig Watkins;
Ferishta Bakhshi-Raiez;
Laure Wynants;
Iwan CC Van Der Horst;
Bas CT Van Bussel
Background: The COVID-19 pandemic required science to provide answers rapidly to combat the outbreak. Hence, the reproducibility and quality of research conduct may have been threatened, particularly regarding privacy and data protection, in varying ways around the globe. The objective was to investigate aspects of reporting informed consent and data handling as proxies for the quality of study conduct. Methods: A systematic scoping review was performed by searching PubMed and Embase. The search was performed on November 8th, 2020. Studies of hospitalised patients over 18 years old diagnosed with COVID-19 were eligible for inclusion. With a focus on informed consent, data were extracted on study design, prestudy protocol registration, ethical approval, data anonymisation, data sharing and data transfer as proxies for study quality. For comparison, data regarding country income level, study location and journal impact factor were also collected. Results: 972 studies were included. 21.3% of studies reported informed consent, 42.6% reported waivers of consent, 31.4% did not report consent information and 4.7% mentioned other types of consent. Informed consent reporting was highest in clinical trials (94.6%) and lowest in retrospective cohort studies (15.0%). The reporting of consent versus no consent did not differ significantly by journal impact factor (p=0.159). 16.8% of studies reported a prestudy protocol registration or design. Ethical approval was described in 90.9% of studies. Information on anonymisation was provided in 17.0% of studies. Of 257 multicentre studies, 1.2% reported on data sharing agreements, and none reported on the Findable, Accessible, Interoperable and Reusable (FAIR) data principles. 1.2% reported on open data. Consent was most often reported in the Middle East (42.4%) and least often in North America (4.7%). Only one report originated from a low-income country. Discussion: Informed consent and aspects of data handling and sharing were under-reported in publications concerning COVID-19 and differed between countries, which undermines the quality of study conduct at a time when answers were direly needed.
The ability of artificial intelligence to perpetuate bias at scale is increasingly recognized. Recently, proposals to implement regulation that safeguards against such discrimination have come under pressure, on the grounds that such restrictions could stifle innovation within the field. In this formal comment, we highlight the potential dangers of such views and explore key examples that define the relationship between health equity and innovation. We propose that health equity is a vital component of healthcare and should not be compromised to expedite advances for the few at the expense of vulnerable populations. A data-centered future that works for all will require funding bodies to incentivize equity-focused AI, and organizations must be held accountable for the differential impact of such algorithms post-deployment.
by
Leo Anthony Celi;
Jacqueline Cellini;
Marie-Laure Charpignon;
Edward Christopher Dee;
Franck Dernoncourt;
Rene Eber;
William Greig Mitchell;
Lama Moukheiber;
Julian Schirmer;
Julia Situ;
Joseph Paguio;
Joel Park;
Judy Gichoya Wawira;
Seth Yao
BACKGROUND: While artificial intelligence (AI) offers possibilities for advanced clinical prediction and decision-making in healthcare, models trained on relatively homogeneous datasets and on populations that poorly represent underlying diversity limit generalisability and risk biased AI-based decisions. Here, we describe the landscape of AI in clinical medicine to delineate population and data-source disparities. METHODS: We performed a scoping review of clinical papers published in PubMed in 2019 using AI techniques. We assessed differences in dataset country source, clinical specialty, and author nationality, sex, and expertise. A manually tagged subsample of PubMed articles was used to train a model, leveraging transfer-learning techniques (building upon an existing BioBERT model) to predict eligibility for inclusion (original, human, clinical AI literature). For all eligible articles, database country source and clinical specialty were manually labelled. A BioBERT-based model predicted first/last author expertise. Author nationality was determined from the corresponding affiliated institution using Entrez Direct, and first/last author sex was evaluated using the Gendarize.io API. RESULTS: Our search yielded 30,576 articles, of which 7,314 (23.9%) were eligible for further analysis. Most databases came from the US (40.8%) and China (13.7%). Radiology was the most represented clinical specialty (40.4%), followed by pathology (9.1%). Authors were primarily from either China (24.0%) or the US (18.4%). First and last authors were predominantly data experts (i.e., statisticians) (59.6% and 53.9%, respectively) rather than clinicians, and the majority of first/last authors were male (74.1%). INTERPRETATION: US and Chinese datasets and authors were disproportionately overrepresented in clinical AI, and almost all of the top 10 databases and author nationalities were from high-income countries (HICs). AI techniques were most commonly employed for image-rich specialties, and authors were predominantly male, with non-clinical backgrounds. Development of technological infrastructure in data-poor regions, and diligence in external validation and model recalibration prior to clinical implementation in the short term, are crucial to ensuring clinical AI is meaningful for broader populations and to avoiding the perpetuation of global health inequity.
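The eligibility-screening step described in the methods, fine-tuning an existing BioBERT model on a manually tagged subsample, could look roughly like the sketch below using the Hugging Face transformers library. The checkpoint name, label scheme and hyperparameters are assumptions rather than the authors' exact configuration.

```python
# Hedged sketch: fine-tuning a public BioBERT checkpoint as a binary
# "eligible clinical-AI article" classifier on manually tagged abstracts.
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

CHECKPOINT = "dmis-lab/biobert-v1.1"  # assumed public BioBERT weights

class AbstractDataset(Dataset):
    def __init__(self, texts, labels, tokenizer):
        self.enc = tokenizer(texts, truncation=True, padding=True, max_length=512)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

def fine_tune(train_texts, train_labels):
    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=2)
    args = TrainingArguments(output_dir="eligibility_model",
                             num_train_epochs=3,
                             per_device_train_batch_size=16)
    trainer = Trainer(model=model, args=args,
                      train_dataset=AbstractDataset(train_texts, train_labels, tokenizer))
    trainer.train()
    return tokenizer, model

# Hypothetical usage with a tiny tagged sample:
# tokenizer, model = fine_tune(["Deep learning for chest X-ray triage ..."], [1])
```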
Purpose: Existing anomaly detection methods focus on detecting interclass variations, while medical image novelty identification is more challenging in the presence of intraclass variations. For example, a model trained on normal chest X-rays and common lung abnormalities is expected to discover and flag idiopathic pulmonary fibrosis, a rare lung disease unseen during training. The nuances of intraclass variations and the lack of relevant training data in medical image analysis pose great challenges for existing anomaly detection methods. Approach: We address these challenges by proposing a hybrid model, transformation-based embedding learning for novelty detection (TEND), which combines the merits of classifier-based and autoencoder (AE)-based approaches. Training TEND consists of two stages. In the first stage, we learn in-distribution embeddings with an AE via unsupervised reconstruction. In the second stage, we learn a discriminative classifier to distinguish in-distribution data from their transformed counterparts. Additionally, we propose a margin-aware objective to pull in-distribution data into a hypersphere while pushing away the transformed data. Eventually, the weighted sum of the class probability and the distance to the margin constitutes the anomaly score. Results: Extensive experiments are performed on three public medical image datasets with the one-vs-rest setup (namely, one class as in-distribution data and the rest as intraclass out-of-distribution data) and the rest-vs-one setup. Additional experiments on generated intraclass out-of-distribution data with unused transformations are implemented on the datasets. The quantitative results show competitive performance compared to state-of-the-art approaches. Provided qualitative examples further demonstrate the effectiveness of TEND. Conclusion: Our anomaly detection model TEND can effectively identify challenging intraclass out-of-distribution medical images in an unsupervised fashion. It can be applied to discover unseen medical image classes and to serve as abnormal-data screening for downstream medical tasks. The corresponding code is available at https://github.com/XiaoyuanGuo/TEND_MedicalNoveltyDetection.
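For intuition about the margin-aware objective and the composite anomaly score described above, here is a hedged PyTorch sketch. It is an approximation for illustration only; the center, radius, margin and weighting are placeholder symbols, and the authors' exact formulation is in the linked TEND repository.

```python
# Illustrative sketch of a margin-aware hypersphere objective and a composite
# anomaly score mixing classifier probability with distance beyond the margin.
import torch
import torch.nn.functional as F

def margin_aware_loss(emb_in, emb_trans, center, radius, margin=1.0):
    """Pull in-distribution embeddings inside a hypersphere of the given radius
    around `center`; push transformed embeddings beyond radius + margin."""
    d_in = torch.norm(emb_in - center, dim=1)
    d_trans = torch.norm(emb_trans - center, dim=1)
    loss_in = F.relu(d_in - radius).mean()                  # keep inside the sphere
    loss_trans = F.relu(radius + margin - d_trans).mean()   # push outside the margin
    return loss_in + loss_trans

def anomaly_score(logit_trans, emb, center, radius, weight=0.5):
    """Weighted sum of the 'transformed' class probability and the embedding's
    (positive) distance beyond the hypersphere boundary."""
    p_trans = torch.sigmoid(logit_trans)                    # probability of being out-of-distribution
    dist = F.relu(torch.norm(emb - center, dim=1) - radius)
    return weight * p_trans + (1.0 - weight) * dist
```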