Background: Surveillance data are essential public health resources for guiding policy and allocation of human and capital resources. These data often consist of large collections of information based on nonrandom sample designs. Population estimates based on such data may be impacted by the underlying sample distribution compared to the true population of interest. In this study, we simulate a population of interest and allow response rates to vary in nonrandom ways to illustrate and measure the effect this has on population-based estimates of an important public health policy outcome. Objective: The aim of this study was to illustrate the effect of nonrandom missingness on population-based survey sample estimation. Methods: We simulated a population of respondents answering a survey question about their satisfaction with their community’s policy regarding vaccination mandates for government personnel. We allowed response rates to differ between the generally satisfied and dissatisfied and considered the effect of common efforts to control for potential bias such as sampling weights, sample size inflation, and hypothesis tests for determining missingness at random. We compared these conditions via mean squared errors and sampling variability to characterize the bias in estimation arising under these different approaches. Results: Sample estimates present clear and quantifiable bias, even in the most favorable response profile. On a 5-point Likert scale, nonrandom missingness resulted in errors averaging to almost a full point away from the truth. Efforts to mitigate bias through sample size inflation and sampling weights have negligible effects on the overall results. Additionally, hypothesis testing for departures from random missingness rarely detects the nonrandom missingness across the widest range of response profiles considered.
Conclusions: Our results suggest that assuming surveillance data are missing at random during analysis could provide estimates that are widely different from what we might see in the whole population. Policy decisions based on such potentially biased estimates could be devastating in terms of community disengagement and health disparities. Alternative approaches to analysis that move away from broad generalization of a mismeasured population at risk are necessary to identify the marginalized groups, where overall response may be very different from those observed in measured respondents.
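The mechanism described in this abstract can be illustrated with a minimal simulation sketch. It is not the authors' simulation: the uniform population of Likert scores, the response rates (0.6 for scores of 4 or 5, 0.2 otherwise), and the function name `simulate_survey` are all assumptions chosen only for illustration.

```python
import random


def simulate_survey(n_pop=100_000, n_sample=2_000,
                    rate_satisfied=0.6, rate_other=0.2, seed=0):
    """Return (true_mean, observed_mean, n_respondents) for one simulated survey."""
    rng = random.Random(seed)
    # Hypothetical population: 5-point Likert scores, uniform over 1..5.
    population = [rng.randint(1, 5) for _ in range(n_pop)]
    true_mean = sum(population) / n_pop

    # Simple random sample of invitees; whether they respond depends on
    # the answer itself, which is what makes the missingness nonrandom.
    invited = rng.sample(population, n_sample)
    responses = [
        score for score in invited
        if rng.random() < (rate_satisfied if score >= 4 else rate_other)
    ]
    observed_mean = sum(responses) / len(responses)
    return true_mean, observed_mean, len(responses)


if __name__ == "__main__":
    # Inflating the number of invitees does not remove the bias.
    for n in (2_000, 20_000):
        true_mean, observed_mean, n_resp = simulate_survey(n_sample=n)
        print(f"invited={n} respondents={n_resp} "
              f"true={true_mean:.2f} observed={observed_mean:.2f}")
```

Under these assumed rates the expected respondent mean is 6.6/1.8, about 3.67, against a true mean of 3. Inviting ten times as many people only tightens the estimate around that biased value.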
by
Lance Waller;
Thomas Clasen;
Lisa Thompson;
Ajay Pillarisetti;
KN Williams;
A Quinn;
H North;
J Wang;
A Díaz-Artiga;
K Balakrishnan;
G Thangavel;
G Rosa;
F Ndagijimana;
LJ Underhill;
MA Kirby;
E Puzzolo;
S Hossen;
JL Peel;
JP Rosenthal;
SA Harvey;
W Checkley
BACKGROUND: Reducing household air pollution (HAP) to levels associated with health benefits requires nearly exclusive use of clean cooking fuels and abandonment of traditional biomass fuels. METHODS: The Household Air Pollution Intervention Network (HAPIN) trial randomized 3,195 pregnant women in Guatemala, India, Peru, and Rwanda to receive a liquefied petroleum gas (LPG) stove intervention (n=1,590), with controls expected to continue cooking with biomass fuels (n=1,605). We assessed fidelity to intervention implementation and participant adherence to the intervention starting in pregnancy through the infant's first birthday using fuel delivery and repair records, surveys, observations, and temperature-logging stove use monitors (SUMs). RESULTS: Fidelity and adherence to the HAPIN intervention were high. Median time required to refill LPG cylinders was 1 day (interquartile range 0-2). Although 26% (n=410) of intervention participants reported running out of LPG at some point, the number of times was low (median: 1 day [Q1, Q3: 1, 2]) and mostly limited to the first four months of the COVID-19 pandemic. Most repairs were completed on the same day as problems were reported. Traditional stove use was observed in only 3% of observation visits, and 89% of these observations were followed up with behavioral reinforcement. According to SUMs data, intervention households used their traditional stove a median of 0.4% of all monitored days, and 81% used the traditional stove <1 day per month. Traditional stove use was slightly higher post-COVID-19 (detected on a median [Q1, Q3] of 0.0% [0.0%, 3.4%] of days) than pre-COVID-19 (0.0% [0.0%, 1.6%] of days). There was no significant difference in intervention adherence pre- and post-birth. 
CONCLUSION: Free stoves and an unlimited supply of LPG fuel delivered to participating homes combined with timely repairs, behavioral messaging, and comprehensive stove use monitoring contributed to high intervention fidelity and near-exclusive LPG use within the HAPIN trial.
by
Justin Remais;
Qu Cheng;
Philip A Collender;
Alexandra K Heaney;
Aidan McLoughlin;
Yang Yang;
Yuzi Zhang;
Jennifer R Head;
Rohini Dasan;
Song Liang;
Qiang Lv;
Yaqiong Liu;
Changhong Yang;
Howard Chang;
Lance Waller;
Jon Zelner;
Justin A Lewnard
With the aid of laboratory typing techniques, infectious disease surveillance networks have the opportunity to obtain powerful information on the emergence, circulation, and evolution of multiple genotypes, serotypes or other subtypes of pathogens, informing understanding of transmission dynamics and strategies for prevention and control. The volume of typing performed on clinical isolates is typically limited by its ability to inform clinical care, cost and logistical constraints, especially in comparison with the capacity to monitor clinical reports of disease occurrence, which remains the most widespread form of public health surveillance. Viewing clinical disease reports as arising from a latent mixture of pathogen subtypes, laboratory typing of a subset of clinical cases can provide inference on the proportion of clinical cases attributable to each subtype (i.e., the mixture components). Optimizing protocols for the selection of isolates for typing by weighting specific subpopulations, locations, time periods, or case characteristics (e.g., disease severity), may improve inference of the frequency and distribution of pathogen subtypes within and between populations. Here, we apply the Disease Surveillance Informatics Optimization and Simulation (DIOS) framework to simulate and optimize hand, foot, and mouth disease (HFMD) surveillance in a high-burden region of western China. We identify laboratory surveillance designs that significantly outperform the existing network: the optimal network reduced mean absolute error in estimated serotype-specific incidence rates by 14.1%; similarly, the optimal network for monitoring severe cases reduced mean absolute error in serotype-specific incidence rates by 13.3%. In both cases, the optimal network designs achieved improved inference without increasing subtyping effort.
We demonstrate how the DIOS framework can be used to optimize surveillance networks by augmenting clinical diagnostic data with limited laboratory typing resources, while adapting to specific, local surveillance objectives and constraints.
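The idea of treating clinical reports as a mixture of subtypes can be sketched in a few lines. This is not the DIOS framework: the function name `subtype_incidence`, the serotype labels, and all counts are hypothetical, and the estimate assumes the typed isolates are a representative sample of the clinical cases.

```python
def subtype_incidence(clinical_cases, population, typed_counts, per=100_000):
    """Estimate subtype-specific incidence from a typed subset of cases.

    Each subtype's share among typed isolates is applied to the overall
    clinical incidence (cases per `per` population).
    """
    total_typed = sum(typed_counts.values())
    if total_typed == 0:
        raise ValueError("at least one isolate must be typed")
    overall = clinical_cases / population * per
    return {
        subtype: count / total_typed * overall
        for subtype, count in typed_counts.items()
    }


if __name__ == "__main__":
    # Made-up numbers: 1,200 reported cases, 100 of which were typed.
    typed = {"serotype_A": 30, "serotype_B": 50, "other": 20}
    for name, rate in subtype_incidence(1_200, 2_000_000, typed).items():
        print(f"{name}: {rate:.1f} per 100,000")
```

If isolates are selected preferentially (for example, severe cases), the shares would need to be reweighted by the selection probabilities before being applied to the clinical counts.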
Measles is one of the best-documented and most-mechanistically-studied non-linear infectious disease dynamical systems. However, systematic investigations into the comparative performance of traditional mechanistic models and machine learning approaches in forecasting the transmission dynamics of this pathogen are still rare. Here, we compare one of the most widely used semi-mechanistic models for measles (TSIR) with a commonly used machine learning approach (LASSO), comparing performance and limits in predicting short to long term outbreak trajectories and seasonality for both regular and less regular measles outbreaks in England and Wales (E&W) and the United States. First, our results indicate that the proposed LASSO model can efficiently use data from multiple major cities and achieve similar short-to-medium term forecasting performance to semi-mechanistic models for E&W epidemics. Second, interestingly, the LASSO model also captures annual to biennial bifurcation of measles epidemics in E&W caused by susceptible response to the late 1940s baby boom. LASSO may also outperform TSIR for predicting less-regular dynamics such as those observed in major cities in the US between 1932–45. Although both approaches capture short-term forecasts, accuracy suffers for both methods as we attempt longer-term predictions in highly irregular, post-vaccination outbreaks in E&W. Finally, we illustrate that the LASSO model can both qualitatively and quantitatively reconstruct mechanistic assumptions, notably susceptible dynamics, in the TSIR model. Our results characterize the limits of predictability of infectious disease dynamics for strongly immunizing pathogens with both mechanistic and machine learning models, and identify connections between these two approaches.
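For orientation, the TSIR model named in this abstract is commonly written as the pair of discrete-time equations below. This is the standard textbook form and not necessarily the exact specification fitted in the study. The symbols are introduced here for illustration: S_t and I_t are susceptible and infected counts in time step t, B_t is births, N_t is population size, β_t is a seasonally varying transmission rate, and α is a mixing exponent.

```latex
\begin{align}
  I_{t+1} &= \beta_t \, \frac{S_t \, I_t^{\alpha}}{N_t}, \\
  S_{t+1} &= S_t + B_t - I_{t+1}, \\
  \log I_{t+1} &= \log \beta_t + \alpha \log I_t + \log S_t - \log N_t .
\end{align}
```

The third line is the first equation on the log scale. It is linear in the unknown parameters, which is one reason a penalized regression such as LASSO on lagged case counts can be compared directly with the semi-mechanistic model.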
Background: Air pollution and neighborhood socioeconomic status (nSES) have been shown to affect cognitive decline in older adults. In previous studies, nSES acts as both a confounder and an effect modifier between air pollution and cognitive decline. Objectives: This study aims to examine the individual and joint effects of air pollution and nSES on cognitive decline in adults 50 years and older in Metro Atlanta, USA. Methods: Perceived memory and cognitive decline were assessed in 11,897 participants aged 50+ years from the Emory Healthy Aging Study (EHAS) using the cognitive function instrument (CFI). Three-year average air pollution concentrations for 12 pollutants and 16 nSES characteristics were matched to participants using census tracts. Individual exposure linear regression and LASSO models explore individual exposure effects. Environmental mixture modeling methods, including self-organizing maps (SOM), Bayesian kernel machine regression (BKMR), and quantile-based G-computation, explore joint effects and effect modification between air pollutants and nSES characteristics on cognitive decline. Results: Participants living in areas with higher air pollution concentrations and lower nSES experienced higher CFI scores (beta: 0.121; 95% CI: 0.076, 0.167) compared to participants living in areas with low air pollution and high nSES. Additionally, the BKMR model showed a significant overall mixture effect on cognitive decline, suggesting synergy between air pollution and nSES. These joint effects explain protective effects observed in single-pollutant linear regression models, even after adjustment for confounding by nSES (e.g., an IQR increase in CO was associated with a 0.038-point lower (95% CI: −0.06, −0.01) CFI score). Discussion: Observed protective effects of single air pollutants on cognitive decline can be explained by joint effects and effect modification of air pollutants and nSES.
Researchers must consider nSES as an effect modifier if not a co-exposure to better understand the complex relationships between air pollution and nSES in urban settings.
by
Lance Waller;
Usha Ramakrishnan;
Penelope Howards;
Nelson Steenland;
Thomas Clasen;
Howard Chang;
Ajay Pillarisetti;
Lisa Thompson;
K Balakrishnan;
M Johnson;
W Ye;
LP Naeher;
A Diaz-Artiga;
JP McCracken;
G Rosa;
MA Kirby;
G Thangavel;
S Sambandam;
K Mukhopadhyay;
N Puttaswamy;
V Aravindalochanan;
S Garg;
F Ndagijimana;
S Hartinger;
LJ Underhill;
KA Kearns;
D Campbell;
J Kremer;
S Jabbarzadeh;
J Wang;
Y Chen;
J Rosenthal;
A Quinn;
AT Papageorghiou;
W Checkley;
JL Peel
Background: Household air pollution (HAP) from solid fuel use is associated with adverse birth outcomes, but data for exposure–response relationships are scarce. We examined associations between HAP exposures and birthweight in rural Guatemala, India, Peru, and Rwanda during the Household Air Pollution Intervention Network (HAPIN) trial. Methods: The HAPIN trial recruited pregnant women (9–<20 weeks of gestation) in rural Guatemala, India, Peru, and Rwanda and randomly allocated them to receive a liquefied petroleum gas stove or not (ie, to continue using biomass fuel). The primary outcomes were birthweight, length-for-age, severe pneumonia, and maternal systolic blood pressure. In this exposure–response subanalysis, we measured 24-h personal exposures to PM2·5, carbon monoxide, and black carbon once pre-intervention (baseline) and twice post-intervention (at 24–28 weeks and 32–36 weeks of gestation), as well as birthweight within 24 h of birth. We examined the relationship between the average prenatal exposure and birthweight or weight-for-gestational age Z scores using multivariate-regression models, controlling for the mother's age, nulliparity, diet diversity, food insecurity, BMI, the mother's education, neonate sex, haemoglobin, second-hand smoke, and geographical indicator for randomisation strata. Findings: Between March, 2018, and February, 2020, 3200 pregnant women were recruited. An interquartile increase in the average prenatal exposure to PM2·5 (74·5 μg/m3) was associated with a reduction in birthweight and gestational age Z scores (birthweight: –14·8 g [95% CI –28·7 to –0·8]; gestational age Z scores: –0·03 [–0·06 to 0·00]), as was an interquartile increase in black carbon (7·3 μg/m3; –21·9 g [–37·7 to –6·1]; –0·05 [–0·08 to –0·01]). Carbon monoxide exposure was not associated with these outcomes (1·7; –3·1 [–12·1 to 5·8]; –0·003 [–0·023 to 0·017]).
Interpretation: Continuing efforts are needed to reduce HAP exposure alongside other drivers of low birthweight in low-income and middle-income countries. Funding: US National Institutes of Health (1UM1HL134590) and the Bill & Melinda Gates Foundation (OPP1131279).
The mosquito Aedes aegypti is the vector of a number of medically-important viruses, including dengue virus, yellow fever virus, chikungunya virus, and Zika virus, and as such vector control is a key approach to managing the diseases they cause. Understanding the impact of vector control on these diseases is aided by first understanding its impact on Ae. aegypti population dynamics. A number of detail-rich models have been developed to couple the dynamics of the immature and adult stages of Ae. aegypti. The numerous assumptions of these models enable them to realistically characterize impacts of mosquito control, but they also constrain the ability of such models to reproduce empirical patterns that do not conform to the models’ behavior. In contrast, statistical models afford sufficient flexibility to extract nuanced signals from noisy data, yet they have limited ability to make predictions about impacts of mosquito control on disease caused by pathogens that the mosquitoes transmit without extensive data on mosquitoes and disease. Here, we demonstrate how the differing strengths of mechanistic realism and statistical flexibility can be fused into a single model. Our analysis utilizes data from 176,352 household-level Ae. aegypti aspirator collections conducted during 1999–2011 in Iquitos, Peru. The key step in our approach is to calibrate a single parameter of the model to spatio-temporal abundance patterns predicted by a generalized additive model (GAM). In effect, this calibrated parameter absorbs residual variation in the abundance time-series not captured by other features of the mechanistic model. We then used this calibrated parameter and the literature-derived parameters in the agent-based model to explore Ae. aegypti population dynamics and the impact of insecticide spraying to kill adult mosquitoes. The baseline abundance predicted by the agent-based model closely matched that predicted by the GAM. 
Following spraying, the agent-based model predicted that mosquito abundance rebounds within about two months, commensurate with recent experimental data from Iquitos. Our approach was able to accurately reproduce abundance patterns in Iquitos and produce a realistic response to adulticide spraying, while retaining sufficient flexibility to be applied across a range of settings.
by
Lance Waller;
Timothy Lash;
Benjamin Lopman;
AB Amin;
JE Tate;
ME Wikswo;
UD Parashar;
LS Stewart;
JD Chappell;
NB Halasa;
J Williams;
MG Michaels;
RW Hickey;
EJ Klein;
JA Englund;
GA Weinberg;
PG Szilagyi;
MA Staat;
MM McNeal;
JA Boom;
LC Sahni;
R Selvarangan;
CJ Harrison;
ME Moffatt;
JE Schuster;
BA Pahud;
GM Weddle;
PH Azimi;
SH Johnston;
DC Payne;
MD Bowen
Background: Estimates of rotavirus vaccine effectiveness (VE) in the United States appear higher in years with more rotavirus activity. We hypothesized rotavirus VE is constant over time but appears to vary as a function of temporal variation in local rotavirus cases and/or misclassified diagnoses. Methods: We analyzed 6 years of data from eight US surveillance sites on 8- to 59-month olds with acute gastroenteritis symptoms. Children's stool samples were tested via enzyme immunoassay (EIA); rotavirus-positive results were confirmed with molecular testing at the US Centers for Disease Control and Prevention. We defined rotavirus gastroenteritis cases by either positive on-site EIA results alone or positive EIA with Centers for Disease Control and Prevention confirmation. For each case definition, we estimated VE against any rotavirus gastroenteritis, moderate-to-severe disease, and hospitalization using two mixed-effect regression models: the first including year plus a year-vaccination interaction, and the second including the annual percent of rotavirus-positive tests plus a percent positive-vaccination interaction. We used multiple overimputation to bias-adjust for misclassification of cases defined by positive EIA alone. Results: Estimates of annual rotavirus VE against all outcomes fluctuated temporally, particularly when we defined cases by on-site EIA alone and used a year-vaccination interaction. Use of confirmatory testing to define cases reduced, but did not eliminate, fluctuations. Temporal fluctuations in VE estimates further attenuated when we used a percent positive-vaccination interaction. Fluctuations persisted until bias-adjustment for diagnostic misclassification. Conclusions: Both controlling for time-varying rotavirus activity and bias-adjusting for diagnostic misclassification are critical for estimating the most valid annual rotavirus VE.
Uganda reported cases of Rift Valley fever virus (RVFV) for the first time in almost 50 years in 2016, following an outbreak of Rift Valley fever (RVF) that caused four human infections, two of which resulted in death. Subsequent outbreak investigation serosurveys found high seroprevalence of IgG antibodies without evidence of acute infection or IgM antibodies, suggesting the possibility of undetected RVFV circulation prior to the outbreak. After the 2016 outbreak investigation, a serosurvey was conducted in 2017 among domesticated livestock herds across Uganda. Sampling data were incorporated into a geostatistical model to estimate RVF seroprevalence among cattle, sheep, and goats. Variables resulting in the best fit to RVF seroprevalence sampling data included annual variability in monthly precipitation and enhanced vegetation index, topographic wetness index, log human population density percent increase, and livestock species. Individual species RVF seroprevalence prediction maps were created for cattle, sheep, and goats, and a composite livestock prediction was created based on the estimated density of each species across the country. Seroprevalence was greater in cattle compared with sheep and goats. Predicted seroprevalence was greatest in the central and northwestern quadrant of the country, surrounding Lake Victoria, and along the Southern Cattle Corridor. We identified areas that experienced conditions conducive to potential increased RVFV circulation in 2021 in central Uganda. An improved understanding of the determinants of RVFV circulation and locations with high probability of elevated RVF seroprevalence can guide prioritization of disease surveillance and risk mitigation efforts.
by
Lance Waller;
Henry Blumberg;
Susan Ray;
Azhar Nizam;
Jyothi Rengarajan;
Chris Ibegbu;
Russell Kempker;
Neel Gandhi;
Matthew Magee;
Toidi Adekambi;
Lisa Elon;
Sarita Shah;
Cheryl Day;
Sara Auld;
Jeffrey Collins;
AGC Smith;
L Wassie;
K Bobosha;
J Ernst;
R Ahmed;
L Sharling;
D Columbus;
A Knezevic;
S Jabbarzadeh;
H Wu;
S Swanson;
Y Chen;
W Whatney;
M Quezada;
L Sasser;
RM Lala;
T Fergus;
P Ogongo;
A Tran;
D Kaushal;
N Golden;
T Foreman;
A Bucsan;
J Altman;
SC Alcantra;
A Sette;
CL Arlehamn;
S Allana;
A Campbell;
J Brust;
M Franczek;
J Daniel;
A Rao;
R Goldstein;
M Kabongo;
A Oladele;
A Aseffa;
M Hamza;
Y Abebe;
F Mulate;
M Wondiyfraw;
F Degaga;
D Getachew;
DT Bere;
M Zewdu;
D Mussa;
B Tesfaye;
S Jemberu;
A Tarekegn;
G Assefa;
G Jebessa;
Z Solomon;
S Neway;
J Hussein;
T Hailu;
A Geletu;
E Girma;
M Legesse;
M Wendaferew;
H Solomon;
Z Assefa;
M Mekuria;
M Kedir;
E Zeleke;
R Zerihun;
S Dechasa;
E Haile;
N Getachew;
F Wagari;
R Mekonnen;
S Bayu;
M Gebre-Medhin;
A Kifle
Background. It is uncertain whether diabetes affects the risk of developing latent tuberculosis infection (LTBI) following exposure to Mycobacterium tuberculosis (Mtb). We assessed the relationship of diabetes or prediabetes and LTBI among close and household contacts (HHCs) of patients with active pulmonary tuberculosis (TB) disease in Addis Ababa, Ethiopia. Methods. In this cross-sectional study, we performed interferon-γ release assays, TB symptom screening, and point-of-care glycated hemoglobin (HbA1c) testing among HHCs of active TB cases. Diabetes status was classified into diabetes (HbA1c ≥6.5% or self-reported diagnosis), prediabetes (5.7%-6.4%), and euglycemia (≤5.6%). Multivariable logistic regression was used to determine the association of diabetes with LTBI. Results. Among 597 study participants, 123 (21%) had dysglycemia including diabetes (n = 31) or prediabetes (n = 92); 423 (71%) participants were diagnosed with LTBI. Twelve of 31 (39%) HHCs with diabetes were previously undiagnosed with diabetes. The prevalence of LTBI among HHCs with diabetes, prediabetes, and euglycemia was 87% (27/31), 73% (67/92), and 69% (329/474), respectively. In multivariable analysis adjusted for age, sex, and HIV status, the odds of LTBI among HHCs with diabetes were 2.33 (95% confidence interval [CI], .76-7.08) times the odds of LTBI without diabetes. When assessing interaction with age, the association of diabetes and LTBI was robust among participants aged ≥40 years (adjusted odds ratio [aOR], 3.68 [95% CI, .77-17.6]) but not those <40 years (aOR, 1.15 [95% CI, .22-6.1]). Conclusions. HHCs with diabetes may be more likely to have LTBI than those with euglycemia. Further investigations are needed to assess mechanisms by which diabetes may increase risk of LTBI after Mtb exposure.
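The reported counts allow a quick check of the unadjusted association. The sketch below computes crude odds ratios directly from the prevalence figures in the abstract (27/31, 67/92, 329/474). The helper name `odds_ratio` is introduced here for illustration, and the result is not the study's adjusted estimate, which also accounts for age, sex, and HIV status.

```python
def odds_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Crude odds ratio from case counts and group totals."""
    odds_exposed = exposed_cases / (exposed_total - exposed_cases)
    odds_unexposed = unexposed_cases / (unexposed_total - unexposed_cases)
    return odds_exposed / odds_unexposed


if __name__ == "__main__":
    # LTBI counts by glycemic status, taken from the abstract.
    diabetes = (27, 31)
    prediabetes = (67, 92)
    euglycemia = (329, 474)

    # Diabetes versus everyone without diabetes (prediabetes + euglycemia).
    no_diabetes = (prediabetes[0] + euglycemia[0], prediabetes[1] + euglycemia[1])
    print(f"crude OR, diabetes vs no diabetes: {odds_ratio(*diabetes, *no_diabetes):.2f}")

    # Diabetes versus euglycemia only.
    print(f"crude OR, diabetes vs euglycemia: {odds_ratio(*diabetes, *euglycemia):.2f}")
```

Both crude ratios come out somewhat above the adjusted value of 2.33, consistent with adjustment for age, sex, and HIV status attenuating the association.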