Data scientists use new techniques to identify lakes and reservoirs around the world

Lakes identified by both ReaLSAT and HydroLAKES (blue) and lakes present in ReaLSAT but not in HydroLAKES (red) for different regions of the world: a) small reservoirs in India, b) water-intensive agriculture in Vietnam, c) natural lakes in the U.S. and wetlands in Venezuela and d) shallow lakes in Australia. Credit: ReaLSAT, University of Minnesota.


An interdisciplinary team of researchers, led by University of Minnesota Twin Cities data scientists, has published a first-of-its-kind comprehensive global dataset of the lakes and reservoirs on Earth showing how they have changed over the last 30+ years.
The data will provide environmental researchers with new information about land and fresh water use as well as how lakes and reservoirs are being impacted by humans and climate change. The research is also a major advancement in machine learning techniques.

A paper highlighting the Reservoir and Lake Surface Area Timeseries (ReaLSAT) dataset was recently published in Scientific Data.

Highlights of the study include:

The ReaLSAT dataset contains the location and surface area variations of 681,137 lakes and reservoirs larger than 0.1 square kilometers that lie south of 50 degrees north latitude. The previous most comprehensive database, called HydroLAKES, had identified only 245,420 lakes and reservoirs for the same region and minimum size.

ReaLSAT provides data on the surface area of each body of water for each month from 1984 to 2015. This makes it possible to quantify changes in lake and reservoir area over time, which is key to understanding how changing climate and land use are altering bodies of fresh water. By contrast, the HydroLAKES data contain only a static shape for each water body.

The ReaLSAT dataset is the culmination of eight years of research. It represents a major milestone in the application of new knowledge-guided machine learning in the environmental sciences. Unlike existing efforts, this dataset can now be extended nearly automatically via machine learning, and the approach can be quickly replicated for a wide variety of earth observation data that are becoming available at increasingly high resolution.
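
As a rough illustration of what a monthly surface area record makes possible, the sketch below estimates a linear trend in area for a single water body and checks the ratio between the two lake counts quoted above. It is only a minimal example under stated assumptions: the function name `area_trend_per_year`, the plain-list input format, and the sample values are hypothetical and do not reflect how ReaLSAT data are actually distributed or analyzed, and the series is assumed to have no missing months.

```python
from statistics import mean


def area_trend_per_year(monthly_areas_km2):
    """Least-squares slope of surface area against time, in km^2 per year.

    Assumes one value per consecutive month with no gaps (a simplification).
    """
    n = len(monthly_areas_km2)
    if n < 2:
        raise ValueError("need at least two monthly values")
    # Express each month's position in years since the start of the series.
    years = [i / 12 for i in range(n)]
    t_bar = mean(years)
    a_bar = mean(monthly_areas_km2)
    num = sum((t - t_bar) * (a - a_bar) for t, a in zip(years, monthly_areas_km2))
    den = sum((t - t_bar) ** 2 for t in years)
    return num / den


# Made-up series (not real data): a lake that starts at 5.0 km^2
# and loses 0.01 km^2 every month for two years.
example_series = [5.0 - 0.01 * month for month in range(24)]
slope = area_trend_per_year(example_series)  # about -0.12 km^2 per year

# Lake counts quoted in the article for the same region and size threshold.
realsat_count = 681_137
hydrolakes_count = 245_420
ratio = realsat_count / hydrolakes_count  # a little under 2.8
```

A static outline of each water body, as in HydroLAKES, would support the count comparison but not the trend estimate, which needs repeated observations over time.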

An image of Minnesota lakes identified using the ReaLSAT dataset (red) is combined with a similar image of the same area showing lakes identified in the previous HydroLAKES dataset (blue). The ReaLSAT dataset identifies almost three times as many lakes and reservoirs worldwide as HydroLAKES. Credit: ReaLSAT, University of Minnesota

"Around the world, we are seeing lakes and reservoirs changing rapidly with seasonal precipitation patterns, long-term changes in climate, and human management decisions," said Vipin Kumar, the senior author of the study and Regents Professor and William Norris Endowed Chair in the University of Minnesota Twin Cities Department of Computer Science and Engineering. "This new dataset greatly improves the ability of scientists to understand the impact of changing climate and human actions on our fresh water across the globe."