WP leader: UBA, Barbara Magagna/Johannes Peterseil
Data management is a key feature of a monitoring framework. One of its core tasks is to provide the data with sufficient meta-information for analysis and calculation. The EBONE data management framework should build on existing tools and adapt them to the needs of a European biodiversity monitoring framework. The goal of the EBONE work package on data management is to provide a data management system in which standardised parameters and methods for the European biodiversity monitoring network can be stored. This includes the following tasks: a) to determine essential operational core services, b) to determine relevant data flows and data according to INSPIRE and GEO data sharing principles, c) to provide a database for the collected field test data, and d) to design the data architecture and the technical tools for the services needed.
The data management has to deal with two types of data sources in the EBONE network:
- Data mapped according to the EBONE mapping procedure (GHC/species) on new sites. These data are fully compliant with the EBONE data structure, and raw data should be available in most cases.
- Data from existing monitoring schemes that are harmonised and transformed according to the EBONE transformation rules for GHC/species. These data follow different data models that comply with the EBONE data structure only to a certain degree. Furthermore, the raw data and their metadata often cannot be accessed directly; only aggregated values of different parameters for a defined analysis unit (e.g. landscape squares) are available.
The data management in EBONE has to cope with both types of sources. Aspects of data policy and data rights also need to be taken into account. Besides the different data sources, the data management also has to deal with different data levels. Three levels can be distinguished:
- Data level I: raw field data on the level of the landscape square. These are the mapped data (e.g. GHC or another habitat classification according to the mapping protocol) together with their exact location and shape (spatial information).
- Data level II: aggregated data on the level of the landscape square. These are transformed (according to the GHC) and aggregated values, e.g. the summed area (or share) of habitat categories or species per landscape square, which are the basis for further calculation. The exact spatial location of the landscape element within the landscape square is no longer provided. In some cases the exact location of the landscape square is not provided either, only its assignment to an Environmental Stratum or Zone.
- Data level III: aggregated data on the level of the reporting unit (e.g. Environmental Strata and Zones). The environmental stratification forms the basis for the calculation of the indicator values. This data level therefore consists of aggregated figures for selected indicators, derived from the entry values of data level II for each Environmental Stratum or Zone. In theory any other reporting unit is possible, provided the data meet the statistical requirements for calculating the indicator values for that reporting unit.
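The step from raw field data to aggregated data per landscape square and then per reporting unit can be illustrated with a small sketch. All identifiers and values below (`SQ-1`, `EnS-1`, `GHC-A`, the areas) are invented placeholders, and the unweighted mean used for the reporting unit is only an assumption for illustration; the statistical method actually used for the indicator calculation is not specified here.

```python
from collections import defaultdict

# Raw field data: one record per mapped landscape element.
# (square, stratum, GHC, area in ha) -- all values are invented placeholders.
raw_elements = [
    ("SQ-1", "EnS-1", "GHC-A", 30.0),
    ("SQ-1", "EnS-1", "GHC-B", 50.0),
    ("SQ-1", "EnS-1", "GHC-A", 20.0),
    ("SQ-2", "EnS-1", "GHC-A", 25.0),
    ("SQ-2", "EnS-1", "GHC-B", 75.0),
]


def aggregate_to_square(elements):
    """Sum the area per (square, GHC); the element geometry is dropped."""
    areas = defaultdict(float)
    for square, _stratum, ghc, area_ha in elements:
        areas[(square, ghc)] += area_ha
    return dict(areas)


def shares_per_square(square_areas):
    """Convert summed areas into shares of the mapped area of each square."""
    totals = defaultdict(float)
    for (square, _ghc), area in square_areas.items():
        totals[square] += area
    return {key: area / totals[key[0]] for key, area in square_areas.items()}


def mean_share_per_stratum(elements, shares):
    """Aggregate to the reporting unit: mean share per (stratum, GHC).

    Assumption: a plain unweighted mean over all squares of a stratum;
    a GHC that is absent from a square counts as a share of zero.
    """
    stratum_of = {square: stratum for square, stratum, _g, _a in elements}
    squares_in = defaultdict(set)
    for square, stratum in stratum_of.items():
        squares_in[stratum].add(square)
    sums = defaultdict(float)
    for (square, ghc), share in shares.items():
        sums[(stratum_of[square], ghc)] += share
    return {
        (stratum, ghc): total / len(squares_in[stratum])
        for (stratum, ghc), total in sums.items()
    }


level2 = aggregate_to_square(raw_elements)
shares = shares_per_square(level2)
level3 = mean_share_per_stratum(raw_elements, shares)
print(level3)  # {('EnS-1', 'GHC-A'): 0.375, ('EnS-1', 'GHC-B'): 0.625}
```

Note that the aggregated per-square values no longer carry any geometry, which mirrors the loss of exact spatial location described above.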
Based on these data levels, the data flows within EBONE can be identified. In principle, the process is the same for species and habitat information. The raw data are transformed according to the transformation rules defined by WP4, and the data are then harmonised. The result is unified data on an aggregated level (see data levels), represented in a common domain model for the habitat and species data. This common domain model plays a central role: the calculation of the indicator values and any further scientific analysis are based on the harmonised data. The schema in Figure 1 shows only the data flows, not the technical design. The common domain model can be implemented either in a central data store or in a virtual central data store using data warehouse or semantic data integration technologies. The design of the system architecture and the selection of the most appropriate toolkit is one of the next steps in the work package.
Figure 1. Schema of the data flows for the EBONE data management
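The transformation and harmonisation step of the data flow can be sketched as a lookup from source habitat codes to GHC. This is only an illustration: the actual transformation rules are defined by WP4 and are not reproduced here, so the scheme names, habitat codes, GHC labels and record fields below are all hypothetical.

```python
# Hypothetical transformation rules:
# (source scheme, original habitat code) -> GHC label.
# These entries are invented placeholders, not the WP4 rules.
TRANSFORMATION_RULES = {
    ("SCHEME_X", "arable"): "GHC-A",
    ("SCHEME_X", "broadleaved woodland"): "GHC-B",
    ("SCHEME_Y", "A1"): "GHC-A",
}


def harmonise(records, rules):
    """Translate source records into one common record structure.

    Records without a matching rule are returned separately so that
    they can be reviewed instead of being silently dropped.
    """
    harmonised, unmatched = [], []
    for rec in records:
        key = (rec["scheme"], rec["habitat"])
        if key in rules:
            harmonised.append({
                "square": rec["square"],
                "ghc": rules[key],
                "area_ha": rec["area_ha"],
                "original_habitat": rec["habitat"],  # keep the source value
            })
        else:
            unmatched.append(rec)
    return harmonised, unmatched


source_records = [
    {"scheme": "SCHEME_X", "square": "SQ-1", "habitat": "arable", "area_ha": 12.0},
    {"scheme": "SCHEME_Y", "square": "SQ-7", "habitat": "A1", "area_ha": 3.5},
    {"scheme": "SCHEME_Y", "square": "SQ-7", "habitat": "B9", "area_ha": 1.0},
]

harmonised, unmatched = harmonise(source_records, TRANSFORMATION_RULES)
print(len(harmonised), "harmonised,", len(unmatched), "without a rule")
```

Keeping the original habitat code next to the derived GHC makes a transformation traceable back to its source record, whichever storage option is chosen later.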
Based on existing habitat mapping data within the EBONE consortium, the steps of the data flow model were implemented using a simple Access database. The test used example data from existing habitat mapping projects, namely the British Countryside Survey (UK), the Northern Ireland Countryside Survey (N-IRL), NILS (Sweden) and SINUS (Austria), as well as example data from France and Israel for landscape squares mapped according to the EBONE habitat protocol.
A simple common domain model was set up consisting of the following schema elements: a) the landscape element (or polygon), having a name (identifier), an area (in ha), a GHC (resulting from the transformation) and, optionally, the original habitat recording and spatial information, b) the landscape square (or plot), having a name (identifier), a total area (in ha), an assignment to an Environmental Stratum and Zone and, optionally, spatial information about its location (often an issue of data policy), and c) the Environmental Stratum (EnS), having a name (identifier), the total area of the stratum (in square kilometres) and spatial information about its extent. This basic common domain model is the starting point for further work within the work package, which will be carried out during spring 2010.
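As a rough illustration, the three schema elements could be written down as plain data classes. This is a minimal sketch and not the Access implementation used in the test: the class and field names, the use of WKT strings for spatial information and the `mapped_area_ha` helper are assumptions made here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EnvironmentalStratum:
    name: str                            # identifier of the stratum (EnS)
    area_km2: float                      # total area in square kilometres
    extent_wkt: Optional[str] = None     # spatial extent (WKT is an assumption)


@dataclass
class LandscapeElement:
    name: str                            # identifier of the polygon
    area_ha: float                       # area in hectares
    ghc: str                             # GHC resulting from the transformation
    original_habitat: Optional[str] = None  # optional original recording
    geometry_wkt: Optional[str] = None      # optional spatial information


@dataclass
class LandscapeSquare:
    name: str                            # identifier of the square (plot)
    total_area_ha: float                 # total area in hectares
    stratum: str                         # name of the assigned stratum
    zone: str                            # name of the assigned zone
    location_wkt: Optional[str] = None   # optional; may be withheld by data policy
    elements: List[LandscapeElement] = field(default_factory=list)

    def mapped_area_ha(self) -> float:
        """Sum of the areas of all landscape elements in this square."""
        return sum(e.area_ha for e in self.elements)


# Invented example values, for illustration only.
stratum = EnvironmentalStratum(name="EnS-1", area_km2=1500.0)
square = LandscapeSquare(
    name="SQ-001", total_area_ha=100.0, stratum=stratum.name, zone="Zone-A"
)
square.elements.append(LandscapeElement("P1", 12.5, "GHC-A"))
square.elements.append(
    LandscapeElement("P2", 7.5, "GHC-B", original_habitat="source code X")
)
print(square.mapped_area_ha())  # 20.0
```

Making the spatial fields optional reflects the two situations described in the model: data sets where the location is available and data sets where it is withheld for data policy reasons.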
The next steps to be taken in the work package are: a) to compile an overview of the existing data management solutions dealing with habitat/species monitoring data or environmental monitoring data in Europe and to decide on the most appropriate one for EBONE, and b) to provide a quick first solution as a starting point and to develop the EBONE system further. The data management framework in EBONE consists of different elements (see Figure 2).
Figure 2. Elements of the EBONE data management framework
For each of these elements the most appropriate tool must be chosen and integrated into a consistent data management framework. The starting points for the evaluation of the data management solutions have been defined. The task for the coming month is to make a decision and start the implementation.
The EBONE data management system has the following components:
a) Metadata Component
b) Source Data Component
c) Data Integration Component
d) Data Presentation Component
e) Data Analysis Component
The Joint Research Centre (JRC) contributes to the data presentation component by developing a preliminary prototype map viewer for in situ habitat maps and associated indicators. The map viewer draws on JRC in-house expertise and capacities developed for the European Forest Data Center (http://efdac.jrc.ec.europa.eu/). The beta version of the prototype is available for internal use only; final delivery is due in March 2012.
The map viewer can be used to view the location of EBONE field-based samples, to view habitat maps, to query the presence and extent of habitats per sample and per environmental zone, and to view habitat pattern maps and related indicators of fragmentation and connectivity (available from EBONE WP5).