Manage Observation Data

The manage observation step executes the JSON process files created in the import observation step, inserting their records into the AI4SH database. This populates the dataset metadata tables: data sources, persons, datasets, campaigns, and sampling logs.

Notebook cell

In import_ai4sh_data.ipynb, the Manage dataset meta cell runs this step:

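# Job file for this step
job_file = 'import_data/job_manage_dataset_meta.json'

# notebook_path, scheme_file, Initiate_process and Run_process are expected to be defined by earlier cells
structured_process_D, scheme_params_D = Initiate_process(notebook_path, scheme_file, job_file)

# Run only if initiation returned a structured process
if structured_process_D is not None:
    Run_process(structured_process_D, scheme_params_D)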

Job file

File: ./AI4SH/import_data/job_manage_dataset_meta.json

{
  "process": {
    "job_folder": "import_data/manage_data/observation/dataset_meta",
    "process_sub_folder": "process",
    "pilot_file": "manage_dataset_meta.txt"
  }
}

Pilot file

File: ./AI4SH/import_data/manage_data/observation/dataset_meta/manage_dataset_meta.txt

This file lists the absolute paths to the JSON process files from the translate step. Paste the output from the translate cell into this file before running the manage cell.
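
Before running the manage cell, it can help to confirm that the pilot file sits where the job file says it should and that every path it lists exists. The sketch below is not part of the AI4SH framework: the helper names `resolve_pilot_path`, `read_pilot_entries`, and `missing_process_files` are hypothetical. It assumes that `job_folder` is relative to the `./AI4SH` project root (as the file locations above suggest), that the pilot file holds one path per line, and that blank lines and lines starting with `#` can be skipped.

```python
import json
from pathlib import Path


def resolve_pilot_path(project_root, job_file):
    """Return the pilot file path described by a job file (hypothetical helper)."""
    project_root = Path(project_root)
    with open(project_root / job_file, encoding="utf-8") as f:
        process = json.load(f)["process"]
    # Assumption: job_folder is relative to the project root.
    return project_root / process["job_folder"] / process["pilot_file"]


def read_pilot_entries(pilot_path):
    """Return the non-empty, non-comment lines of a pilot file as paths."""
    entries = []
    for line in Path(pilot_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Assumption: one path per line; skipping '#' lines is illustrative only.
        if line and not line.startswith("#"):
            entries.append(Path(line))
    return entries


def missing_process_files(entries):
    """Return the listed process files that do not exist on disk."""
    return [p for p in entries if not p.is_file()]
```

Calling `resolve_pilot_path("./AI4SH", "import_data/job_manage_dataset_meta.json")` and passing the result through `read_pilot_entries` and `missing_process_files` would give a list of paths to fix before running the cell; an empty list means every listed file was found.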

What gets inserted

Running the manage cell executes all listed JSON process files. Each inserts records into its target table:

| Process file | Target table | Description |
| --- | --- | --- |
| `process_manage_data_source.json` | `observation.data_source` | Organisations or individuals providing data |
| `process_manage_person.json` | `observation.person` | Persons responsible for data collection |
| `process_manage_dataset.json` | `observation.dataset` | Dataset-level metadata |
| `process_manage_campaign.json` | `observation.campaign` | Campaign-level metadata |
| `process_manage_sampling_log.json` | `observation.sampling_log` | Sampling log records |

Insertion order

The pilot file must list the process files in dependency order. The framework executes them sequentially, so a process whose records reference a table that has not yet been populated will fail. The correct order mirrors the translate order:

process_manage_data_source.json
process_manage_person.json
process_manage_dataset.json
process_manage_campaign.json
process_manage_sampling_log.json

See Foreign key handling for more detail on managing dependency order.
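
A quick way to catch ordering mistakes before running the manage cell is to compare the pilot file against the order listed above. The following is a minimal sketch, not framework code: `out_of_order` is a hypothetical helper, it matches entries by file name only, it ignores files that are not in the list, and it treats the listed order as strict even though the actual foreign key constraints may be looser.

```python
from pathlib import Path

# Dependency order taken from the list above.
EXPECTED_ORDER = [
    "process_manage_data_source.json",
    "process_manage_person.json",
    "process_manage_dataset.json",
    "process_manage_campaign.json",
    "process_manage_sampling_log.json",
]


def out_of_order(pilot_lines):
    """Return (earlier, later) file name pairs that appear in the wrong order.

    Hypothetical helper: entries whose file name is not in EXPECTED_ORDER
    are ignored, and only adjacent known entries are compared.
    """
    rank = {name: i for i, name in enumerate(EXPECTED_ORDER)}
    names = [Path(line.strip()).name for line in pilot_lines if line.strip()]
    known = [n for n in names if n in rank]
    problems = []
    for prev, curr in zip(known, known[1:]):
        # A lower rank after a higher one means a dependency comes too late.
        if rank[curr] < rank[prev]:
            problems.append((prev, curr))
    return problems
```

Passing the lines of the pilot file to `out_of_order` returns an empty list when the entries follow the expected order; any returned pair names a file that is listed before one it should come after.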

After this step

With utility data and dataset metadata in the database, you can proceed to insert actual observation records (samples, observation logs, measurements). Those steps follow the same translate-then-manage pattern using job files for the observation and measurement tables.