The second import step inserts the utility catalogue into the AI4SH database by running the JSON process files generated in the translate utility step. This step must complete successfully before you can insert any observation or dataset data, because every observation record references at least one utility catalogue entry.

Notebook cell

In import_ai4sh_data.ipynb, the Manage utilities cell runs this step:

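# Job file for the manage utility step, relative to the AI4SH project folder
job_file = 'import_data/job_manage_utility.json'

# notebook_path and scheme_file are assumed to be set in an earlier notebook cell
structured_process_D, scheme_params_D = Initiate_process(notebook_path, scheme_file, job_file)

# Run the process only if initialisation returned a structured process
if structured_process_D is not None:
    Run_process(structured_process_D, scheme_params_D)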

Job file

File: ./AI4SH/import_data/job_manage_utility.json

{
  "process": {
    "job_folder": "import_data/manage_data/utility",
    "process_sub_folder": "process",
    "pilot_file": "manage_utility.txt"
  }
}
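
The relationship between the job file keys and the files on disk can be sketched as follows. This is an illustrative reading of the folder layout shown on this page, not the framework's own code: the helper name resolve_job_paths and the project_root argument are hypothetical, and the sketch assumes that paths in the job file are relative to the project folder.

```python
import json
from pathlib import Path


def resolve_job_paths(job: dict, project_root: str) -> dict:
    """Derive the pilot file and process folder paths from a job dictionary.

    Assumption: job_folder is relative to the project root, and both the
    pilot file and the process sub-folder live directly inside job_folder.
    """
    process = job["process"]
    job_folder = Path(project_root) / process["job_folder"]
    return {
        "pilot_file": job_folder / process["pilot_file"],
        "process_folder": job_folder / process["process_sub_folder"],
    }


# Same content as job_manage_utility.json
job = json.loads("""
{
  "process": {
    "job_folder": "import_data/manage_data/utility",
    "process_sub_folder": "process",
    "pilot_file": "manage_utility.txt"
  }
}
""")

paths = resolve_job_paths(job, "./AI4SH")
print(paths["pilot_file"].as_posix())
print(paths["process_folder"].as_posix())
```

Under these assumptions the pilot file resolves to the location given in the next section, and the process folder is the one that holds the JSON process files.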

Pilot file

File: ./AI4SH/import_data/manage_data/utility/manage_utility.txt

This file lists the absolute paths to all JSON process files created by the translate step, one per line. The translate cell prints these paths as its output; copy and paste that output into this file before running the manage cell.

Example content (paths will reflect your local installation):

/path/to/seed_ai4sh_db/ai4sh/import_data/manage_data/utility/process/process_manage_territory.json
/path/to/seed_ai4sh_db/ai4sh/import_data/manage_data/utility/process/process_manage_analysis_method.json
/path/to/seed_ai4sh_db/ai4sh/import_data/manage_data/utility/process/process_manage_apparatus.json
...
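
Because the pilot file is filled in by hand, it can be worth checking it before running the manage cell. The sketch below is not part of the framework; the function name check_pilot_file is hypothetical, and the demo files and their JSON content are invented for illustration. It skips blank lines and reports entries that are not absolute paths, do not exist, or do not contain valid JSON.

```python
import json
import tempfile
from pathlib import Path


def check_pilot_file(pilot_path: str) -> list:
    """Return (entry, problem) tuples for the paths listed in a pilot file."""
    problems = []
    for line in Path(pilot_path).read_text(encoding="utf-8").splitlines():
        entry = line.strip()
        if not entry:
            continue  # ignore blank lines between entries
        path = Path(entry)
        if not path.is_absolute():
            problems.append((entry, "not an absolute path"))
        elif not path.is_file():
            problems.append((entry, "file not found"))
        else:
            try:
                json.loads(path.read_text(encoding="utf-8"))
            except json.JSONDecodeError as err:
                problems.append((entry, f"invalid JSON: {err.msg}"))
    return problems


# Demo with temporary files; the JSON content here is a placeholder,
# not the real structure of a process file.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "process_manage_territory.json"
    good.write_text('{"process": []}', encoding="utf-8")
    missing = Path(tmp) / "process_manage_apparatus.json"  # never created
    pilot = Path(tmp) / "manage_utility.txt"
    pilot.write_text(f"{good}\n\n{missing}\n", encoding="utf-8")
    demo_problems = check_pilot_file(str(pilot))

for entry, problem in demo_problems:
    print(f"{problem}: {entry}")
```

In practice you would call check_pilot_file with the path to your own manage_utility.txt and resolve any reported problems before running the manage cell.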

What gets inserted

Running the manage cell executes all process files listed in manage_utility.txt. Each file inserts records into its target utility table. This step populates the following tables:

  • utility.territory
  • observation.analysis_method
  • observation.apparatus
  • observation.classification_order, observation.classification_family, observation.classification_genus
  • observation.license
  • observation.location_method
  • observation.method_tier
  • observation.preparation, observation.preservation
  • observation.provider
  • observation.quantity, observation.indicator
  • observation.setting_system, observation.juxtaposition
  • observation.spatial_reference
  • observation.storage, observation.transportation
  • observation.unit, observation.profiling
  • observation.provision, observation.provision_indicator

Column header validation

The framework does not validate column headers during the manage step. If an Excel source file had incorrect column headers when it was translated, the resulting JSON maps to the wrong column names and the manage step reports errors. To recover, fix the column headers in the source Excel file, re-run the translate cell, update the pilot file, and re-run the manage cell.
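
Since the framework does not check headers for you, a small pre-check before the translate step can catch misspelled headers early. The sketch below is an assumption-laden illustration rather than framework code: the function name compare_headers and the column names are hypothetical, and reading the header row from the Excel file is left out.

```python
def compare_headers(found, expected):
    """Return (missing, unexpected) header names.

    Headers are compared case-sensitively after stripping surrounding
    whitespace, so a trailing space alone is not reported.
    """
    found_clean = [str(header).strip() for header in found]
    missing = [h for h in expected if h not in found_clean]
    unexpected = [h for h in found_clean if h not in expected]
    return missing, unexpected


# Hypothetical column names, for illustration only
expected_columns = ["name", "title", "description"]
sheet_headers = ["name", "titel", "description "]

missing, unexpected = compare_headers(sheet_headers, expected_columns)
print("missing:", missing)
print("unexpected:", unexpected)
```

A non-empty result in either list suggests the source file should be corrected before it is translated.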

Next step

Once utility data is in the database, proceed to Import observation data to translate dataset metadata.
