
Most users of environmental datasets are trying to do reproducible and accountable science, but different post-processing workarounds and tools can lead to published results which are not repeatable or comparable.
Ideally, researchers would share value-added data processed to an agreed standard and format. Since legal restrictions currently forbid this type of redistribution, the next best solution is to share the processing workflow, including code and environment settings or parameters.
This is one of the lessons on building and documenting a usable and repeatable workflow published recently in the paper “Processing Conservation Indicators with Open Source Tools: Lessons Learned from the Digital Observatory for Protected Areas” [1][2], authored by JRC, the European Commission Joint Research Centre.
JRC delivers and maintains the Digital Observatory for Protected Areas (DOPA). The DOPA is a set of web services and applications primarily used to assess, monitor, report and possibly forecast the state of and the pressure on protected areas at multiple scales. The data, indicators, maps and tools provided by the DOPA are relevant, for example, to support spatial planning, resource allocation, protected area development and management, and national and international reporting. The DOPA was, in fact, acknowledged by the Convention on Biological Diversity (CBD) Secretariat as a reference tool for country reporting.
Maintaining the DOPA means managing large datasets with highly complex geometries, topological inconsistencies, multiple different representations of the same geographical entities (for example, coastlines) and licensing requirements, as well as continuously updating indicators to respond to monthly changes in the authoritative data.
To compute and publish these arrays of indicators, JRC uses a range of open source tools (including GRASS, R, Python, GDAL, PostGIS, geometry libraries for Hadoop, GeoServer, GeoNode and MapServer) coupled with some commercial software (such as ArcGIS Pro and the Google Earth Engine platform).
To make all of this reproducible, JRC is trying to move the entire processing chain to open source tools and share it as a versioned resource. The sharing will be done with the help of BlueBRIDGE, with which JRC is collaborating.
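As an illustration only, the idea of sharing a workflow together with its settings can be sketched as a small manifest written alongside each processing run. The sketch below is not the JRC or BlueBRIDGE implementation: the function names, the manifest layout, the workflow name and the parameter names are all hypothetical, and the tool versions are placeholders to be filled in by whoever runs the workflow. It uses only the Python standard library.

```python
import hashlib
import json
import platform
import sys


def build_manifest(workflow_name, parameters, tool_versions):
    """Return a dict describing one processing run (hypothetical layout)."""
    # Serialise the parameters deterministically so that the same
    # settings always produce the same fingerprint, whatever the key order.
    canonical = json.dumps(parameters, sort_keys=True, separators=(",", ":"))
    fingerprint = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        "workflow": workflow_name,
        "parameters": parameters,
        "parameters_sha256": fingerprint,
        "environment": {
            "python": platform.python_version(),
            "platform": platform.platform(),
            # Versions of external tools are supplied by the caller.
            "tools": tool_versions,
        },
    }


def write_manifest(manifest, path):
    """Write the manifest as JSON so it can be committed with the code."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(manifest, handle, indent=2, sort_keys=True)


if __name__ == "__main__":
    # All names and values below are illustrative assumptions.
    example = build_manifest(
        "protected_area_coverage",
        {"simplify_tolerance_m": 10, "target_crs": "EPSG:4326"},
        {"gdal": "<fill in>", "postgis": "<fill in>"},
    )
    json.dump(example, sys.stdout, indent=2, sort_keys=True)
```

Committing such a file next to the processing code would let two groups check whether they ran the same workflow with the same settings by comparing the `parameters_sha256` values, without redistributing the underlying data.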
JRC and BlueBRIDGE are developing a Virtual Research Environment aimed at reporting on which features are represented in protected area networks and other managed areas. In particular, the ongoing use case is being developed in the context of the Biodiversity and Protected Areas Management Programme (BIOPAMA; Reference Information System: http://rris.biopama.org), which aims to address threats to biodiversity in African, Caribbean and Pacific (ACP) countries while improving the socio-economic conditions of local communities in and around protected areas.
[1] Bastin, L., Mandrici, A., Battistella, L., Dubois, G. (2017). Processing Conservation Indicators with Open Source Tools: Lessons Learned from the Digital Observatory for Protected Areas. In: Free and Open Source Software for Geospatial (FOSS4G) Conference Proceedings: Vol. 17, Article 14. August 14-19, 2017, Boston, MA, USA. http://scholarworks.umass.edu/foss4g/vol17/iss1/14/
[2] Dubois, G., Bastin, L., Bertzky, B., Mandrici, A., Conti, M., Saura, S., Cottam, A., Battistella, L., Martínez-López, J., Boni, M., Graziano, M. (2016). Integrating multiple spatial datasets to assess protected areas: Lessons learnt from the Digital Observatory for Protected Areas (DOPA). International Journal of Geo-Information. http://dx.doi.org/10.3390/ijgi5120242