Foreign Key Handling
Foreign key constraints enforce referential integrity in the AI4SH database — a record can only reference another record that already exists. This means the order in which you import data matters: referenced tables must be populated before referencing tables.
The general rule
Import data in this global order:
- Utility catalogues (no foreign keys among themselves, except the dependent ones)
- Dependent utility catalogues (reference other utility tables)
- Dataset metadata (reference utility tables)
- Observation records (reference dataset metadata and utility tables)
Utility dependency tree
Within the utility catalogues, most tables are independent. The following have foreign key requirements:
classification_order
└── classification_family
└── classification_genus
quantity
└── indicator
setting_system
└── juxtaposition
unit
└── profiling
apparatus + provider + method_tier
└── provision
└── provision_indicator (also requires indicator)
The translate_utility_data.txt pilot file is already ordered to respect these dependencies. Independent tables appear first; dependent tables follow.
Dataset metadata dependency tree
(utility.territory)
data_source
person
↓
dataset ←── (observation.license from utility)
↓
campaign ←── person
↓
sampling_log ←── campaign, person
↓
sample ←── sampling_log
↓
observation_log ←── sample, provision (utility)
↓
measurement ←── observation_log, provision_indicator (utility)
What happens if order is wrong
If a manage process runs before its dependency is in the database, the framework will report a foreign key constraint violation error for that record. The record is not inserted. Subsequent records that do not have the same unresolved dependency will still be attempted.
To recover:
- Fix the pilot file order
- Set
"overwrite": falsein the scheme file to skip already-inserted records - Re-run the manage cell — only the previously failed records will be inserted
Using the foreign_key.xlsx file
The file ./AI4SH/import_data/excel_src_data/utility/foreign_key.xlsx is a reference sheet mapping columns in source Excel files to their foreign key targets in the database. Use it to verify that the values in your source data match existing records in the referenced tables before running the translate step.
Common issues:
- A
unit_idvalue inindicator.xlsxdoes not exist in theunittable yet - A
provider_idvalue inprovision.xlsxreferences a provider not yet imported - A
dataset_idincampaign.xlsxreferences a dataset not yet in the database
In all cases, ensure the referenced record is imported first, then re-run the translate and manage steps for the dependent table.
Overwrite and delete flags
The scheme file controls how the framework handles existing records:
"process": [
{
"execute": true,
"overwrite": false,
"delete": false
}
]
"overwrite": false— skip records that already exist (safe default for re-runs)"overwrite": true— update existing records with new values"delete": true— delete existing records before inserting (use with caution)