Generation of global 1 km daily soil moisture product from 2000 to 2020 using ensemble learning

Zhang, YF; Liang, SL; Ma, H; He, T; Wang, Q; Li, B; Xu, JL; Zhang, GD; Liu, XB; Xiong, CH

Links for fulltext

Publisher Website: 10.5194/essd-15-2055-2023
Scopus: eid_2-s2.0-85160943646
WOS: WOS:000993742800001
Appears in Collections:
- Geography: Journal/Magazine Articles

Title	Generation of global 1 km daily soil moisture product from 2000 to 2020 using ensemble learning
Authors	Zhang, YF; Liang, SL; Ma, H; He, T; Wang, Q; Li, B; Xu, JL; Zhang, GD; Liu, XB; Xiong, CH
Issue Date	23-May-2023
Publisher	Copernicus Publications
Citation	Earth System Science Data, 2023, v. 15, n. 5, p. 2055-2079. DOI: http://dx.doi.org/10.5194/essd-15-2055-2023
Abstract	Motivated by the lack of long-term global soil moisture products with both high spatial and temporal resolutions, a global 1 km daily spatiotemporally continuous soil moisture product (GLASS SM) was generated from 2000 to 2020 using an ensemble learning model (eXtreme Gradient Boosting – XGBoost). The model was developed by integrating multiple datasets, including albedo, land surface temperature, and leaf area index products from the Global Land Surface Satellite (GLASS) product suite, as well as the European reanalysis (ERA5-Land) soil moisture product, in situ soil moisture dataset from the International Soil Moisture Network (ISMN), and auxiliary datasets (Multi-Error-Removed Improved-Terrain (MERIT) DEM and Global gridded soil information (SoilGrids)). Given the relatively large-scale differences between point-scale in situ measurements and other datasets, the triple collocation (TC) method was adopted to select the representative soil moisture stations and their measurements for creating the training samples. To fully evaluate the model performance, three validation strategies were explored: random, site independent, and year independent. Results showed that although the XGBoost model achieved the highest accuracy on the random test samples, it was clearly a result of model overfitting. Meanwhile, training the model with representative stations selected by the TC method could considerably improve its performance for site- or year-independent test samples. The overall validation accuracy of the model trained using representative stations on the site-independent test samples, which was least likely to be overfitted, was a correlation coefficient (R) of 0.715 and root mean square error (RMSE) of 0.079 m³ m⁻³. 
Moreover, compared to the model developed without station filtering, the validation accuracies of the model trained with representative stations improved significantly for most stations, with the median R and unbiased RMSE (ubRMSE) of the model for each station increasing from 0.64 to 0.74 and decreasing from 0.055 to 0.052 m³ m⁻³, respectively. Further validation of the GLASS SM product across four independent soil moisture networks revealed its ability to capture the temporal dynamics of measured soil moisture (R = 0.69–0.89; ubRMSE = 0.033–0.048 m³ m⁻³). Lastly, the intercomparison between the GLASS SM product and two global microwave soil moisture datasets – the 1 km Soil Moisture Active Passive/Sentinel-1 L2 Radiometer/Radar soil moisture product and the European Space Agency Climate Change Initiative combined soil moisture product at 0.25° – indicated that the derived product maintained a more complete spatial coverage and exhibited high spatiotemporal consistency with those two soil moisture products. The annual average GLASS SM dataset from 2000 to 2020 can be freely downloaded from https://doi.org/10.5281/zenodo.7172664 (Zhang et al., 2022a), and the complete product at daily scale is available at http://glass.umd.edu/soil_moisture/ (last access: 12 May 2023).
Persistent Identifier	http://hdl.handle.net/10722/332217
ISSN	1866-3508 (2023 Impact Factor: 11.2; 2023 SCImago Journal Rankings: 4.231)
ISI Accession Number ID	WOS:000993742800001
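
Note on the accuracy metrics: the abstract reports a correlation coefficient (R), RMSE, and unbiased RMSE (ubRMSE). The sketch below shows how these are conventionally computed for paired predicted and in situ soil moisture values. It is an illustration only and is not taken from the article: the function name validation_metrics and the sample values are hypothetical, and it assumes the standard definition in which ubRMSE is the RMSE after removing the mean bias, so that RMSE² = bias² + ubRMSE².

```python
import math


def validation_metrics(predicted, observed):
    """Return R, RMSE, bias and ubRMSE for paired series (hypothetical helper)."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")

    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n

    # Mean bias: predicted minus observed.
    bias = mean_p - mean_o

    # RMSE on the raw differences.
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

    # ubRMSE: RMSE of the anomalies, i.e. after subtracting each series' mean.
    ubrmse = math.sqrt(
        sum(((p - mean_p) - (o - mean_o)) ** 2 for p, o in zip(predicted, observed)) / n
    )

    # Pearson correlation coefficient.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant series")
    r = cov / math.sqrt(var_p * var_o)

    return {"R": r, "RMSE": rmse, "bias": bias, "ubRMSE": ubrmse}


if __name__ == "__main__":
    # Made-up values in m^3 m^-3, for demonstration only.
    observed = [0.10, 0.20, 0.30, 0.25]
    predicted = [0.14, 0.22, 0.37, 0.28]
    print(validation_metrics(predicted, observed))
```

Because ubRMSE ignores a constant offset, a product can track the temporal dynamics of a station well (high R, low ubRMSE) while still carrying a larger RMSE, which is why the two are reported separately.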

DC Field	Value	Language
dc.contributor.author	Zhang, YF	-
dc.contributor.author	Liang, SL	-
dc.contributor.author	Ma, H	-
dc.contributor.author	He, T	-
dc.contributor.author	Wang, Q	-
dc.contributor.author	Li, B	-
dc.contributor.author	Xu, JL	-
dc.contributor.author	Zhang, GD	-
dc.contributor.author	Liu, XB	-
dc.contributor.author	Xiong, CH	-
dc.date.accessioned	2023-10-04T07:20:59Z	-
dc.date.available	2023-10-04T07:20:59Z	-
dc.date.issued	2023-05-23	-
dc.identifier.citation	Earth System Science Data, 2023, v. 15, n. 5, p. 2055-2079	-
dc.identifier.issn	1866-3508	-
dc.identifier.uri	http://hdl.handle.net/10722/332217	-
dc.language	eng	-
dc.publisher	Copernicus Publications	-
dc.relation.ispartof	Earth System Science Data	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.title	Generation of global 1 km daily soil moisture product from 2000 to 2020 using ensemble learning	-
dc.type	Article	-
dc.identifier.doi	10.5194/essd-15-2055-2023	-
dc.identifier.scopus	eid_2-s2.0-85160943646	-
dc.identifier.volume	15	-
dc.identifier.issue	5	-
dc.identifier.spage	2055	-
dc.identifier.epage	2079	-
dc.identifier.eissn	1866-3516	-
dc.identifier.isi	WOS:000993742800001	-
dc.identifier.issnl	1866-3508	-
