Data drought or data flood?

By Anne van Loon

The basis for (almost) all scientific work, at least in the earth and environmental sciences, is DATA. We all need data to search for the answers to our questions. There are a number of ways to get hold of data: we can measure stuff ourselves in the field or in the lab, generate model data, process data measured by satellites, or use data that other people collected. The last option has the advantage that you can cover much larger temporal and spatial scales than if you do all the measurements yourself, but it is not necessarily much easier or quicker. In this blog post I take a quick-and-dirty tour of large-scale data collection initiatives in hydrology and introduce a new initiative focussing on groundwater drought.

“Hydrometeorological data…” (source: https://cloudtweaks.com/ )

The classical way for hydrologists to use other people’s data (also called “secondary data”) is to use national-scale government-funded hydrometeorological databases such as the National River Flow Archive (NRFA) and National Groundwater Level Archive (NGLA) in the UK and the US Geological Survey Water Data in the USA. This seems a good and reliable source of data, but there are worries, for example that the number of gauges worldwide is decreasing for various reasons (Mishra & Coulibaly, 2009; Hannah et al., 2011) and that paper or microfilm archives are at risk. These national data are collated in global databases like the Global Runoff Data Centre (GRDC) and the Global Groundwater Network (GGN), hosted by the International Groundwater Resources Assessment Centre (IGRAC). The problem there is that these databases depend heavily on the national hydrometeorological institutes to provide data, the records are not always up to date and quality checked, and important meta-data are not always available.

That is the reason that many researchers spend a lot of time combining and expanding these datasets. A few recent examples (NB: not at all an exhaustive list):

– A global streamflow dataset for baseflow and recession analysis by Hylke Beck and others
– The Global Streamflow Indices and Metadata Archive (GSIM) by Hong Xuan Do, Lukas Gudmundsson, and colleagues
– The CAMELS dataset by Nans Addor and others and its sister dataset CAMELS-CL by Camila Alvarez-Garreton and others

These datasets are very helpful, but compiling them is quite time-consuming for a single person (usually an early-career scientist) or a small group of people, and they easily become outdated.

On the other side of the spectrum is crowd-sourced or citizen science data. This is already quite common in meteorology, for weather observations (Weather Observations Website, WOW), historic weather data (for example Weather Rescue) and climate model data (weather@home, by Massey et al., 2014), but citizen science is starting to be used in hydrology as well. Some examples are (again not exhaustive):

– The CrowdHydrology initiative that asks people to text a river level reading from a gauging staff (see Lowry & Fienen, 2012)
– The CrowdWater App that lets people make observations of water level, streamflow and soil moisture, for example via a “wet boots” test
– Several projects and initiatives that use messages, photos and videos to crowdsource flood data (see Le Coz et al., 2016, for an overview)
– Or even a project that uses YouTube videos of a tourist cave in Saudi Arabia to reconstruct water level time series (Michelsen et al., 2016)

Example of crowd-sourcing hydrological data via an App (source: http://www.crowdhydrology.com/)

Most of these use citizens as passive data collectors, with the scientists doing the analysis and interpretation. The nice thing is that this creates lots of data, but the downside is that a lot of local knowledge is underused. There are, however, also initiatives that try to make use of this local knowledge, either from citizens themselves, from the experts in government agencies, or from local scientists who know much more about the local hydrological situation. Some of these are funded projects, such as:

– FLASH, the Flooded Locations And Simulated Hydrographs Project that combined flash flood data from streamflow gauges, government reports and public survey responses (Gourley et al., 2013)
– The Mountain EVO project that liaises with citizens to collect and use hydrological data in several mountain areas (see Buytaert et al., 2014)
– The European Drought Impact report Inventory (EDII) developed in the EU Drought-R&SPI project that collected drought impact information from a wealth of different sources (Stahl et al., 2016)

Others are not funded, like the UNESCO NE-FRIEND Low flow and Drought group, which produced an analysis of the 2015 streamflow drought in Europe after a community effort to collect streamflow data and drought characteristics from partners in countries around Europe (Laaha et al., 2017). Still others are only partly funded, for example by a COST action that only provides travel funding, as in the case of the FloodFreq initiative, in which researchers collected a dataset of long streamflow records for Europe to study floods (Mediero et al., 2015), or the European Flood Database, which was developed with the support of an ERC Advanced Grant (Hall et al., 2015).

The databases developed in funded projects are great because there is (researcher) time to develop something new. But it is also hard to maintain the database when the project funding stops and a permanent host then needs to be found. Unfunded projects can benefit from the enthusiasm and commitment of their collaborators, but have to rely on people spending time to provide data and be involved in the analysis and interpretation. These work best if they are rooted in active scientific communities or networks. I already mentioned the NE-FRIEND Low flow and Drought group, which developed into a nice group of scientific FRIENDs, but organisations like the International Association of Hydrological Sciences (IAHS) and the International Association of Hydrogeologists (IAH) also play an important role (see Bonell et al., 2006 – HELPing FRIENDs in PUBs). IAHS, for example, drives the Panta Rhei decade on Change in Hydrology and Society, which has a number of very active working groups that are driving data sharing initiatives. Another very successful example is HEPEX, which is a true bottom-up network with “friendly people who are full of energy”. These international networks can provide the framework for data sharing initiatives.

The value of international scientific networks for data sharing (source: https://hepex.irstea.fr/)

It also helps if there is one (funded) person driving the data collection and if there is a clear aim or research question that everyone involved is interested in. Also, a clear procedure and format for the data helps. With that in mind, portals have been developed specifically for data sharing in hydrology, for example:

– SWITCH-ON, which focusses on open data and virtual laboratories where people can do collective experiments

– HydroShare, which is a collaborative website where people can upload hydrological data and models

The most inclusive are the initiatives (either funded or unfunded) that manage to incorporate local knowledge, so those that not only collect data, but also work with the data providers on the interpretation of the data. This synthesis aspect is the main strength of these initiatives and a lot can be learned by bringing data and knowledge together, even if no new data is created.

In a NEW initiative we are hoping to combine some of the advantages of the above-mentioned data collection efforts. The Groundwater Drought Initiative (GDI) is a three-year initiative starting in April 2018 that aims to develop and support a network of European researchers and stakeholders with an interest in regional- to continental-scale groundwater droughts. Through the GDI network we will collect groundwater level data and groundwater drought impact information for Europe. This is needed because most of the data collection initiatives mentioned above are focussed on floods, not on drought, and most collate data on streamflow, not on groundwater. Since around 65% of Europe’s drinking water supply is obtained from groundwater and drought is (and will increasingly be) a threat to water security in Europe, it is essential to get a good understanding of groundwater drought and its impacts. Since groundwater drought is typically large-scale and transboundary, data on a pan-European scale is needed to increase this understanding.

The Groundwater Drought Initiative

The GDI is embedded in the NE-FRIEND Low flow and Drought group and has obtained a bit of funding from the UK Research Council for workshops and some researcher time. We hope, however, to arouse the interest and enthusiasm of even more scientists and government employees of various nationalities and regions, so that they get involved in the initiative and contribute data, meta-data, local knowledge and interpretation of data. In return the GDI will provide tools to visualise and analyse groundwater droughts, a regional- to continental-scale context for the groundwater drought information, insights into the impacts of major groundwater droughts, access to a network of international groundwater drought researchers and managers, and the opportunity to participate in joint scientific publications. The long-term sustainability of the initiative will hopefully be developed through the network that we will establish and through the link with formal organisations like the European Drought Centre (EDC) and IGRAC, where the groundwater drought data will be stored after the end of the funded project.

If you are interested, please get in touch:

– John Bloomfield, BGS: [email protected]

– Anne Van Loon, University of Birmingham: [email protected]

Or via Twitter: @GDI_Europe (https://twitter.com/GDI_Europe)

The article was originally published in the EGU blog Water Underground available at the following link:

https://blogs.agu.org/waterunderground/2018/05/28/data-drought-or-data-flood/
