The first import step translates utility catalogue data from Excel source files into JSON process files. Utility catalogues define the controlled vocabularies — quantities, units, methods, providers, and more — that all observation data must reference. They must be imported and inserted into the database before any observation data can be added.

Notebook cell

In import_ai4sh_data.ipynb, the Translate utilities cell runs this step:

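# Job file for this step, given relative to ./AI4SH
job_file = 'import_data/job_translate_utility.json'

# notebook_path and scheme_file are expected to be defined in an earlier cell
structured_process_D, scheme_params_D = Initiate_process(notebook_path, scheme_file, job_file)

# Run only if initiation returned a structured process
if structured_process_D is not None:
    Run_process(structured_process_D, scheme_params_D)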

Job file

File: ./AI4SH/import_data/job_translate_utility.json

{
  "process": {
    "job_folder": "import_data/translate_data/utility",
    "process_sub_folder": "process",
    "pilot_file": "translate_utility_data.txt"
  }
}

Pilot file

File: ./AI4SH/import_data/translate_data/utility/translate_utility_data.txt

This text file lists all JSON process files to execute, in the order they must run. The order respects foreign key dependencies — independent tables first, dependent tables after.
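The job file and the pilot file path fit together in a simple way: joining job_folder and pilot_file from the job file gives the pilot file location shown above. The sketch below illustrates that relationship only. The function names pilot_path_from_job and load_job are hypothetical and are not part of the import package, and the assumption that job_folder is relative to ./AI4SH is inferred from the paths on this page.

```python
import json
from pathlib import Path


def pilot_path_from_job(job: dict, project_root: Path) -> Path:
    """Join project root, job_folder and pilot_file into the pilot file path.

    Assumption: job_folder is relative to the project root (./AI4SH).
    """
    process = job["process"]
    return project_root / process["job_folder"] / process["pilot_file"]


def load_job(job_file: Path) -> dict:
    """Read a job file such as import_data/job_translate_utility.json."""
    with job_file.open(encoding="utf-8") as handle:
        return json.load(handle)


# The job file content shown above, written inline so the sketch is self-contained.
job = {
    "process": {
        "job_folder": "import_data/translate_data/utility",
        "process_sub_folder": "process",
        "pilot_file": "translate_utility_data.txt",
    }
}

print(pilot_path_from_job(job, Path("AI4SH")))
```

With the files on disk, load_job(Path("AI4SH/import_data/job_translate_utility.json")) could replace the inline dictionary.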

Independent utility tables (no foreign keys)

These can be translated in any order:

  • territory.json
  • analysis_method.json
  • apparatus.json
  • classification_order.json
  • license.json
  • location_method.json
  • method_tier.json
  • preparation.json
  • preservation.json
  • provider.json
  • quantity.json
  • setting_system.json
  • spatial_reference.json
  • storage.json
  • transportation.json
  • unit.json

Dependent utility tables

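Each of these tables has foreign keys to the tables listed under Requires. Some of the required tables are themselves dependent (for example, classification_genus.json needs classification_family.json), so list these files in the order shown: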

File                          Requires
classification_family.json    classification_order
classification_genus.json     classification_family
indicator.json                quantity
juxtaposition.json            setting_system
profiling.json                unit
provision.json                apparatus, provider, method_tier
provision_indicator.json      provision, indicator, unit
Process

The translate process reads each Excel source file from ./AI4SH/import_data/excel_src_data/observation_utility/ and ./AI4SH/import_data/excel_src_data/utility/, and writes one JSON file per source table into:

./AI4SH/import_data/manage_data/utility/process/

The cell output lists the absolute paths of all files created. Copy this list into the manage pilot file before running the next step — see Manage utility data.
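Copying the list by hand works, but a few lines of Python can also tidy it first. The sketch below assumes that the manage pilot file takes one path per line, which this page does not confirm. The helper names pilot_lines and write_pilot_file are hypothetical, and the pilot file path is left as a parameter because its name is defined in the next step.

```python
from pathlib import Path


def pilot_lines(created_paths):
    """Return cleaned paths in their original order, without blanks or duplicates."""
    seen = set()
    lines = []
    for path in created_paths:
        text = str(path).strip()
        if text and text not in seen:
            seen.add(text)
            lines.append(text)
    return lines


def write_pilot_file(created_paths, pilot_file: Path) -> int:
    """Write one path per line (assumed format) and return the number written."""
    lines = pilot_lines(created_paths)
    pilot_file.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```

The order of the input is preserved on purpose, since the order in which the files are processed must respect the foreign key dependencies.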

Source files

All Excel files are in ./AI4SH/import_data/excel_src_data/. See Tabular source data for a full list and descriptions.
