Data for A Hybrid Biophysical-Machine Learning Framework for Diurnal Surface Energy Flux Estimation Using Proximal Sensing

Themes: Sustainability

Keywords: AI/ML, Ecosystem Flux

Citation

Cross, J.F., Mallick, K., Aslan-Sungur, G., VanLoocke, A., Drewry, D.T. May 2, 2026. Data for: “A Hybrid Biophysical-Machine Learning Framework for Diurnal Surface Energy Flux Estimation in Bioenergy Agricultural Crops Using Proximal Sensing.” Dryad. DOI: 10.5061/dryad.r4xgxd2tm.

Overview

STIC(ML,Rn) predicted vs. observed LE for (a) corn, (b) miscanthus, (c) sorghum, and (d) soybean. Points are colored by day of year; errors are largest early and late in the season, when canopy conditions are changing rapidly. Dashed line = 1:1; solid line = regression fit.

Thermal-based remote sensing of surface energy fluxes has traditionally relied on high spatial resolution satellite data with revisit frequencies on the order of weeks. In this study, we evaluate a biophysics-based analytical surface energy balance model for predicting latent energy (LE) and sensible heat (H) fluxes using proximal sensing observations. The Surface Temperature Initiated Closure (STIC1.2) model has been extensively validated across a wide range of spatial and temporal scales using various satellite-derived thermal datasets. Here we extend this validation by applying STIC at sub-hourly temporal resolution over multiple growing seasons for four distinct agricultural systems. We further develop and evaluate novel STIC variants that use machine learning (ML) techniques to remove the need for two surface energy balance observations, net radiation and soil heat flux, thereby making the model more applicable in data-sparse settings. Integrating an ML component to estimate surface available energy yields strong predictive performance for both LE (R² = 0.81-0.94) and H (R² = 0.46-0.72) across all four agricultural systems, demonstrating the potential of hybrid biophysical-machine learning approaches for surface energy balance modeling with minimal data requirements. The study concludes with a novel application of explainable machine learning (exML) to diagnose sources of model error. This exML framework attributes residual prediction errors both to model input variables and to environmental drivers not explicitly included in the simulation experiments. The approach provides a new pathway for improving model design and for integrating previously overlooked yet influential variables into future model iterations.
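For readers working with the tower data, the quantities named above can be illustrated with a minimal Python sketch. It is not the STIC model or the study's ML component. It assumes the standard energy balance convention, in which available energy is net radiation minus soil heat flux and is partitioned into LE and H, and it assumes R² means the squared Pearson correlation between observed and predicted fluxes (the source does not state which definition was used). All function names and sample values are hypothetical.

```python
from statistics import mean


def available_energy(rn, g):
    """Available energy (W m^-2), assumed here to be Rn - G."""
    return [r - s for r, s in zip(rn, g)]


def sensible_heat_residual(available, le):
    """H (W m^-2) as the residual of the assumed closure: available = LE + H."""
    return [a - l for a, l in zip(available, le)]


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(observed, predicted):
    """Squared Pearson correlation (an assumed definition of R^2)."""
    mo, mp = mean(observed), mean(predicted)
    sxy = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    sxx = sum((o - mo) ** 2 for o in observed)
    syy = sum((p - mp) ** 2 for p in predicted)
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical half-hourly values in W m^-2, for illustration only.
    rn = [520.0, 480.0, 310.0, 150.0]
    g = [60.0, 55.0, 30.0, 10.0]
    le_obs = [330.0, 300.0, 200.0, 95.0]
    le_pred = [345.0, 290.0, 215.0, 90.0]

    avail = available_energy(rn, g)
    print("available energy:", avail)
    print("H residual:", sensible_heat_residual(avail, le_obs))
    print("fit (slope, intercept):", linear_fit(le_obs, le_pred))
    print("R^2:", r_squared(le_obs, le_pred))
```

A fitted slope near 1 and intercept near 0 correspond to points falling along the 1:1 line in a predicted-vs-observed plot.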

Data

Dryad: Model code, field tower data

Illinois Data Bank: Surface Temperature Initiated Closure (STIC) surface energy flux, model performance for PHI estimation

Related Publications