Coupled machine learning–ecosystem ensemble models substantially improve predictions of nitrous oxide (N2O) fluxes from US croplands
Prateek Sharma, Bruno Basso, Aditya Manuraj, Michael S. Murillo, Neville Millar, Tommaso Tadiello, Mukta Sharma, Mathieu Delandmeter, G. Philip Robertson
PNAS; March 4, 2026; 123 (10) e2524808123; https://doi.org/10.1073/pnas.2524808123
Significance
Nitrous oxide (N2O) is a potent and increasingly important greenhouse gas currently responsible for ~7% of human-caused atmospheric warming. Agriculture is a major emitter of N2O globally, and agricultural soils are a major if still uncertain source. In large part this uncertainty stems from the challenge of accurately predicting emissions from fertilized crops. Here, we show how an ensemble modeling system that couples a group of ecosystem models with a group of machine learning models can substantially improve cropland N2O flux predictions. The system additionally generates insights that can improve existing ecosystem models, guide field measurement efforts, and advance N2O mitigation strategies under diverse soils and climates in food and bioenergy cropping systems.
Abstract
Nitrous oxide (N2O) is a potent and persistent greenhouse gas, with rising atmospheric concentrations driven in part by inefficient use of synthetic nitrogen (N) fertilizers in agriculture. Predicting soil N2O emissions is challenging due to high spatial and temporal variability arising from complex soil biogeochemical processes. Process-based ecosystem models and standalone machine learning (ML) approaches without extensive site-specific calibration often miss high-emission episodes. Here, we show how an Ensemble Modeling System (EMS) based on outputs from an ensemble of ecosystem models coupled to an ensemble of ML models can improve predictions and understanding of N2O fluxes from US cropland. Trained and validated on ~12,000 N2O chamber measurements at 17 US Midwest sites (six crops, 35 management practices), the EMS accurately predicted daily fluxes of N2O at both training (R2 = 0.84, RMSE = 16.4 g N ha−1 d−1) and held-out testing sites (R2 = 0.84, RMSE = 6.2 g N ha−1 d−1). Analyses identified six dominant N2O drivers: soil organic carbon (SOC), NH4+, NO3-, water-filled pore space, temperature, and aboveground biomass production. Wet, warm soils produced large N2O peaks only with sufficient SOC and mineral N; in low-SOC soils, fluxes remained low. Incorporating these drivers into process-based models might significantly improve their predictive capacity. The EMS demonstrates a strong potential to predict N2O fluxes at unseen sites, enabling more reliable regional inventories, improved gap-filling where measurements are sparse, and enhanced understanding of mechanisms to advance targeted mitigation strategies in food, feed, and bioenergy crops.
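The evaluation design described above, in which an ensemble of learners is trained on outputs from several ecosystem models and then scored on sites withheld entirely from training, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the ML models or explain how they are coupled, so bagged ridge regression stands in here as an assumption. The data are synthetic, and the function names (`split_by_site`, `fit_ridge`, `fit_bagged_ensemble`, `predict_ensemble`) are hypothetical. Only the count of 17 sites comes from the abstract, and the four test-site IDs (1, 8, 9, 14) from the Figure 2 caption.

```python
import numpy as np

def split_by_site(site_ids, test_sites):
    """Return boolean (train, test) masks that withhold whole sites."""
    site_ids = np.asarray(site_ids)
    test_mask = np.isin(site_ids, list(test_sites))
    return ~test_mask, test_mask

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def fit_bagged_ensemble(X, y, n_members=10, alpha=1.0, seed=0):
    """Fit one ridge model per bootstrap resample (a stand-in ML ensemble)."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(y), size=len(y))  # sample rows with replacement
        members.append(fit_ridge(X[idx], y[idx], alpha))
    return members

def predict_ensemble(members, X):
    """Average the predictions of all ensemble members."""
    preds = np.stack([X @ coef + intercept for coef, intercept in members])
    return preds.mean(axis=0)

# Synthetic data only: each column of X plays the role of one ecosystem
# model's simulated daily flux (an assumption made for illustration).
rng = np.random.default_rng(42)
n = 400
site_ids = rng.integers(1, 18, size=n)  # site IDs 1..17
X = rng.normal(size=(n, 3))
y = X @ np.array([0.5, 0.3, 0.2]) + rng.normal(scale=0.1, size=n)

train, test = split_by_site(site_ids, test_sites={1, 8, 9, 14})
members = fit_bagged_ensemble(X[train], y[train])
y_hat = predict_ensemble(members, X[test])
```

Splitting by site rather than by row keeps every measurement from a test site out of training, which is what makes a held-out score a measure of transfer to unseen sites.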
See https://www.pnas.org/doi/10.1073/pnas.2524808123

Figure 2:
Predictive performance of the EMS compared with observations. (A) Scatterplot of daily N2O fluxes for the 13 training sites (n = 11,936). (B) Scatterplot for the four fully withheld test sites (n = 260). (C) Violin-plus-box plots for each test site (IDs 1, 8, 9, and 14) comparing the distributions of observed N2O fluxes (gray) and EMS N2O predictions (red). Dashed lines in (A) and (B) are the 1:1 fits. Statistical fit is reported as the coefficient of determination (R2), RMSE (g N ha−1 d−1), and two-tailed significance (P).
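For readers who want to reproduce the two fit statistics named in the caption, the following is a minimal sketch. It assumes R2 is computed as 1 − SS_res/SS_tot against the observations. The caption does not say which definition was used, and a squared Pearson correlation would give a different value whenever predictions are biased. The function names and the four example values are illustrative and are not data from the paper.

```python
import numpy as np

def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up daily fluxes (g N ha-1 d-1), for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 6.0, 7.0]

fit_rmse = rmse(observed, predicted)     # sqrt(2 / 4), about 0.707
fit_r2 = r_squared(observed, predicted)  # 1 - 2 / 20 = 0.9
```

RMSE carries the units of the flux, so it scales with the magnitude and spread of fluxes in each dataset. This is one reason two datasets can share the same R2 while differing in RMSE, as the training and test sites do in the abstract.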