About this item:

Author Notes:

Correspondence: Qian Di, qiandi@mail.harvard.edu

Research described in this article was also conducted under contract to the Health Effects Institute (HEI), an organization jointly funded by the U.S. EPA (Assistance Award No. CR-83467701) and certain motor vehicle and engine manufacturers.

The computations in this paper were run on the Odyssey cluster supported by the FAS Division of Science, Research Computing Group at Harvard University.


Research Funding:

This publication was made possible by U.S. EPA grant numbers RD-834798, RD-835872, and 83587201; HEI grant 4953-RFA14-3/16-4.


  • Air Pollutants
  • Air Pollution
  • Algorithms
  • Environmental Monitoring
  • Nitrogen Dioxide
  • Uncertainty
  • United States

Assessing NO2 Concentration and Model Uncertainty with High Spatiotemporal Resolution across the Contiguous United States Using Ensemble Model Averaging



Journal Title:

Environmental Science and Technology


Volume 54, Number 3, Pages 1372-1384

Type of Work:

Article | Post-print: After Peer Review


NO2 is a combustion byproduct that has been associated with multiple adverse health outcomes. To assess NO2 levels with high accuracy, we propose the use of an ensemble model to integrate multiple machine learning algorithms, including neural network, random forest, and gradient boosting, with a variety of predictor variables, including chemical transport models. This NO2 model covers the entire contiguous U.S. with daily predictions on 1-km-level grid cells from 2000 to 2016. The ensemble produced a cross-validated R2 of 0.788 overall, a spatial R2 of 0.844, and a temporal R2 of 0.729. The relationship between daily monitored and predicted NO2 is almost linear. We also estimated the associated monthly uncertainty level for the predictions and address-specific NO2 levels. This NO2 estimation has a very high spatiotemporal resolution and allows the examination of the health effects of NO2 in unmonitored areas. We found the highest NO2 levels along highways and in cities. We also observed that nationwide NO2 levels declined in early years and stagnated after 2007, in contrast to the trend at monitoring sites in urban areas, where the decline continued. Our research indicates that the integration of different predictor variables and fitting algorithms can achieve an improved air pollution modeling framework.
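
The idea of combining several fitted models into one ensemble prediction can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the base learners are combined, so the sketch assumes a simple least-squares blend with an intercept, and it stands in three noisy synthetic predictors for the neural network, random forest, and gradient boosting outputs. All function names and data below are hypothetical.

```python
import numpy as np


def fit_ensemble_weights(base_preds, observed):
    """Fit an intercept plus one weight per base learner by least squares.

    Assumption: a linear blend; the abstract does not specify the combiner.
    """
    design = np.column_stack([np.ones(len(observed)), base_preds])
    coef, *_ = np.linalg.lstsq(design, observed, rcond=None)
    return coef


def ensemble_predict(coef, base_preds):
    """Apply fitted blend coefficients to a matrix of base-learner predictions."""
    design = np.column_stack([np.ones(base_preds.shape[0]), base_preds])
    return design @ coef


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for monitored concentrations (not real data).
rng = np.random.default_rng(0)
truth = rng.gamma(shape=2.0, scale=10.0, size=500)

# Three hypothetical base learners, each equal to the truth plus noise.
base_preds = np.column_stack(
    [truth + rng.normal(0.0, sd, size=500) for sd in (4.0, 6.0, 8.0)]
)

# Fit the blend on one subset and evaluate it on held-out rows.
train, test = slice(0, 400), slice(400, 500)
coef = fit_ensemble_weights(base_preds[train], truth[train])
ens = ensemble_predict(coef, base_preds[test])

print(f"hold-out R2 of the blend: {r_squared(truth[test], ens):.3f}")
```

Evaluating on rows that were not used to fit the weights mirrors, in a very reduced form, the cross-validated R2 reported above. A full cross-validation would repeat the split so that every observation is held out once.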

Copyright information:

© 2019 American Chemical Society.
