The observation schema stores actual soil property data. Data is organised in a strict hierarchy: an observation belongs to a sample, a sample belongs to a sampling log, a sampling log belongs to a campaign, a campaign belongs to a dataset. Every step in this chain must exist before the next can be created. All observation values link to reference records in observation_utility.

Process files

File Tables created Description
observation/observation_data_source_v10_sql.json data_source Simplified copy of community.organisation — passive data suppliers who need not be full system users
observation/observation_person_v10_sql.json person Simplified copy of community.user with no login credentials — allows attributing data to non-registered persons (GDPR consideration)
observation/observation_dataset_v10_sql.json dataset Top-level grouping of related data (e.g. LUCAS, AI4SoilHealth)
observation/observation_campaign_v10_sql.json campaign, campaign_meta, campaign_method_tier, campaign_location, campaign_provision, campaign_setting Child of dataset; captures details of a specific data collection effort
observation/observation_sampling_log_v10_sql.json sampling_log A sampling event — one or multiple samples taken with consistent methods over a period
observation/observation_sample_geolocation_v10_sql.json sample_geolocation Spatial coordinates for a geolocated sample
observation/observation_sample_v10_sql.json sample An individual sample acquired under a sampling log
observation/observation_observation_log_v10_sql.json observation_log Links a sampling log to a provision; records the logistics (preservation, storage, transportation) of getting the sample to analysis
observation/observation_observation_v10_sql.json observation The actual measured value for a specific indicator from a specific provision
observation/observation_infiltration_beerkan_v10_sql.json infiltration_beerkan BeerKan infiltration measurements
observation/observation_macrofauna_image_v10_sql.json macrofauna_image Image records for macrofauna observations
observation/observation_macrofauna_v10_sql.json macrofauna Macrofauna count and biomass observations
observation/observation_measurement_v10_sql.json measurement General sensor or instrument measurements

Table hierarchy

data_source
dataset (→ data_source)
  campaign (→ dataset)
    campaign_meta (→ campaign, observation_utility.*)
    campaign_method_tier (→ campaign)
    campaign_location (→ campaign, utility.territory, observation_utility.spatial_reference)
    campaign_provision (→ campaign, observation_utility.provision)
    campaign_setting (→ campaign, observation_utility.setting_system)
    sampling_log (→ campaign)
      sample_geolocation (→ sampling_log, observation_utility.spatial_reference)
      sample (→ sampling_log)
        observation_log (→ sampling_log, observation_utility.provision)
          observation (→ observation_log, observation_utility.provision_indicator)

Key tables in detail

dataset

A dataset is a coherent collection of data from one or more campaigns — for example, LUCAS 2009–2022 or the AI4SoilHealth project. Datasets are associated with a data_source.

Key columns: id, data_source_id, name, alias, status_code.

campaign

A campaign is a child of a dataset where sampling methods may vary in detail. For example, LUCAS 2009 and LUCAS 2015 are two campaigns within the LUCAS dataset. Each AI4SH pilot site is a separate campaign.

The campaign table has five companion meta tables:

  • campaign_meta — abstract, DOI, URL, keywords, time series flag, license, taxonomic scope
  • campaign_method_tier — boolean flags for which method tiers (in-situ, in-home, in-lab, from-drone, from-satellite, etc.) are used in the campaign
  • campaign_location — geographic bounding box and spatial reference
  • campaign_provision — which provisions (instrument+provider+method_tier combinations) are used
  • campaign_setting — which setting systems (field, forest, lake, etc.) apply

sampling_log

A sampling log is a sampling event within a campaign, covering one or multiple samples taken with consistent methods. It records the responsible person and time window.

sample and sample_geolocation

sample represents an individual physical sample. sample_geolocation optionally records its latitude, longitude, elevation, and spatial reference. Decoupled from sample so that non-geolocated samples are still representable.

observation_log and observation

observation_log bridges a sampling log to a provision (instrument + service + method tier) and records the sample handling chain (preservation, storage, transportation method). observation stores the actual numeric or text result for a specific provision_indicator (a measurable value from a specific provision).

Specialised observation tables

Three additional tables cover non-standard observation types:

  • infiltration_beerkan — BeerKan ring infiltration test results
  • macrofauna and macrofauna_image — macrofauna counts, biomass, and associated image records
  • measurement — direct sensor readings (e.g. from a handheld spectral sensor)

Updated: