All source data for the AI4SH database import is stored as Excel (.xlsx) files under:

./AI4SH/import_data/excel_src_data/

The directory is split into two groups: utility catalogues and observation/dataset data.

Utility source files

Utility files are catalogue tables — they define the controlled vocabularies that all observation data must reference. They live under:

./AI4SH/import_data/excel_src_data/observation_utility/

| File | Database table | Description |
| --- | --- | --- |
| analysis_method.xlsx | observation.analysis_method | Laboratory and field analysis methods |
| apparatus.xlsx | observation.apparatus | Instruments and tools delivering data |
| classification_order.xlsx | observation.classification_order | Highest level of Linnean taxonomy |
| classification_family.xlsx | observation.classification_family | Second Linnean level (requires order) |
| classification_genus.xlsx | observation.classification_genus | Third Linnean level (requires family) |
| indicator.xlsx | observation.indicator | Refined quantity definitions (requires quantity) |
| juxtaposition.xlsx | observation.juxtaposition | Local thematic setting of samples (requires setting_system) |
| license.xlsx | observation.license | Dataset and campaign licenses |
| location_method.xlsx | observation.location_method | Geolocation methods |
| method_tier.xlsx | observation.method_tier | Professionality level of a method |
| preparation.xlsx | observation.preparation | Pre-analysis sample preparation methods |
| preservation.xlsx | observation.preservation | Sample preservation methods |
| profiling.xlsx | observation.profiling | Z-dimension profiling (requires unit) |
| provider.xlsx | observation.provider | Instruments, labs, services providing results |
| provision.xlsx | observation.provision | Combination of apparatus + provider + method_tier |
| provision_indicator.xlsx | observation.provision_indicator | Links provisions to quantities and indicators |
| quantity.xlsx | observation.quantity | Unambiguous quantity definitions |
| setting_system.xlsx | observation.setting_system | Thematic frame for juxtaposition definitions |
| spatial_reference.xlsx | observation.spatial_reference | EPSG-based spatial reference systems |
| storage.xlsx | observation.storage | Sample storage methods |
| transportation.xlsx | observation.transportation | Sample transportation conditions |
| unit.xlsx | observation.unit | Units of reported observation values |

General utility data lives in a separate subdirectory:

./AI4SH/import_data/excel_src_data/utility/

| File | Database table | Description |
| --- | --- | --- |
| territory.xlsx | utility.territory | Geographic territories (countries, regions) |
| foreign_key.xlsx | | Reference for foreign key resolution |

Observation and dataset source files

Files for datasets, campaigns, and observations live under:

./AI4SH/import_data/excel_src_data/dataset_campaign_sampling_log/

| File | Database table | Description |
| --- | --- | --- |
| data_source.xlsx | observation.data_source | Organisations or individuals providing data |
| person.xlsx | observation.person | Persons responsible for data collection and analysis |
| dataset.xlsx | observation.dataset | Dataset-level metadata (name, license, substance) |
| campaign.xlsx | observation.campaign | Campaign metadata (dates, geographic scope) |
| campaign_ai4sh.xlsx | observation.campaign | AI4SH-specific campaign records |
| campaign_lucas.xlsx | observation.campaign | LUCAS campaign records |

Additional observation files live under the following directories:

./AI4SH/import_data/excel_src_data/sample/
./AI4SH/import_data/excel_src_data/observation/
./AI4SH/import_data/excel_src_data/measurement/
./AI4SH/import_data/excel_src_data/observation_log/
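To check which source files are actually present in a local checkout, the directory tree can be walked with the standard library. This is a small sketch under the assumption that the layout matches the paths listed above; `list_excel_sources` is a hypothetical helper, not part of the framework.

```python
from pathlib import Path


def list_excel_sources(root):
    """Map each subdirectory (relative to root) to its sorted .xlsx file names."""
    root = Path(root)
    if not root.is_dir():
        return {}
    found = {}
    for path in sorted(root.rglob("*.xlsx")):
        # Skip lock files such as "~$unit.xlsx" that Excel leaves while a workbook is open.
        if path.name.startswith("~$"):
            continue
        group = path.parent.relative_to(root).as_posix()
        found.setdefault(group, []).append(path.name)
    return found


for group, files in list_excel_sources("./AI4SH/import_data/excel_src_data").items():
    print(f"{group}: {', '.join(files)}")
```

Comparing the printed groups with the tables above gives a quick view of missing or unexpected files before an import is attempted.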

Column headers

The Xspatula framework maps Excel column headers directly to database table column names. Headers must match exactly; there is no automatic header correction. If a header does not match, the translate step writes incorrect JSON and the manage step reports errors when it tries to insert the data. Always validate the column headers of each file against the target table definition before running the import.
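A header check of this kind can be sketched in a few lines. The example below is illustrative and not part of Xspatula: `compare_headers` and `read_header_row` are hypothetical helpers, the column names in the usage example are made up, and the reader assumes that the headers sit in the first row of the first sheet. It relies on the third-party `openpyxl` package only for reading the workbook.

```python
def compare_headers(headers, table_columns):
    """Compare Excel headers with table columns using exact string matching.

    Returns (missing, unexpected, near_misses). A near miss pairs an unexpected
    header with the missing column it equals after trimming and lowercasing.
    """
    header_set = set(headers)
    column_set = set(table_columns)
    missing = sorted(column_set - header_set)
    unexpected = sorted(header_set - column_set)
    normalised = {col.strip().lower(): col for col in missing}
    near_misses = [
        (hdr, normalised[hdr.strip().lower()])
        for hdr in unexpected
        if hdr.strip().lower() in normalised
    ]
    return missing, unexpected, near_misses


def read_header_row(xlsx_path):
    """Read the first row of the first sheet (assumed to hold the headers)."""
    from openpyxl import load_workbook  # third-party, imported only when needed

    workbook = load_workbook(xlsx_path, read_only=True)
    try:
        rows = workbook.active.iter_rows(min_row=1, max_row=1, values_only=True)
        first_row = next(rows, ())
    finally:
        workbook.close()
    return [str(cell) for cell in first_row if cell is not None]


# Hypothetical column names, for illustration only.
report = compare_headers(["name", "Unit ", "description"], ["name", "unit", "description"])
print(report)  # (['unit'], ['Unit '], [('Unit ', 'unit')])
```

A column reported as missing is not necessarily an error, since some table columns may be filled by the database itself; the near-miss list is the more useful signal, because it points at headers that differ only in case or stray whitespace.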
