The Xspatula framework comes with three Jupyter notebooks for managing its database: setup_db, delete_db and setup_processes. Before you can run the notebooks you need to create a Python environment. The only thing you need to edit in the framework itself is your credentials for the PostgreSQL database cluster you have access to.

Prerequisites

To build your own database, whether on your local machine or a remote server, you might need to install the following applications/files:

  • Visual Studio Code (VS Code) (or another environment for running Jupyter notebooks),
  • PostgreSQL (unless you have access to a server with PostgreSQL installed),
  • Anaconda or any other application for defining virtual Python environments, and/or
  • A .netrc file that defines your credentials for setting up new databases (to protect your login credentials).
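
How Xspatula itself reads the .netrc file is not shown here. As a minimal sketch, Python's standard netrc module can parse such a file. The host name, user name and password below are made-up placeholders, read_db_credentials is a hypothetical helper, and the example writes a throwaway file instead of touching your real ~/.netrc.

```python
import netrc
import os
import tempfile


def read_db_credentials(netrc_path, host):
    """Return (login, password) for host from a .netrc-style file."""
    auth = netrc.netrc(netrc_path).authenticators(host)
    if auth is None:
        raise KeyError(f"no .netrc entry for host {host!r}")
    login, _account, password = auth
    return login, password


# Demo with a temporary file; all values are made-up placeholders.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_netrc = os.path.join(tmp_dir, "demo.netrc")
    with open(demo_netrc, "w") as f:
        f.write("machine localhost login demo_user password demo_secret\n")
    user, password = read_db_credentials(demo_netrc, "localhost")

print(user)
```

On a real machine the file normally lives at ~/.netrc and should be readable only by your own user account.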

Python environment

To run a Jupyter notebook you need to define a Python environment that contains all Python packages used by the notebook and its underlying code. In the Xspatula package path ./anaconda there is a prepared .yml file and a short instruction on how to set up the virtual environment. The same information is available in the Anaconda document, which also links to the Anaconda download and installation guides.
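
Once the environment is activated, a quick check that the expected packages can be imported may save a failed notebook run. The sketch below uses only the standard library. The package names passed to it are placeholders, since the authoritative list is the .yml file in ./anaconda, and missing_packages is a hypothetical helper that is not part of Xspatula.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Placeholder names; replace them with the packages listed in the .yml file.
wanted = ["json", "a_package_that_is_not_installed"]
print(missing_packages(wanted))
```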

Notebook structure

All Xspatula framework notebooks have the same basic structure with only three code blocks. The first code block loads the underlying Python packages, the second defines the scheme file, and the third points to either a job or process file and the actual processes to run.

Code block 1 - imports and setup

The first code block in all notebooks defines the local path to the notebook itself and loads the necessary Python packages. Only the last item, the package application import, differs between the notebooks.

# Standard library imports
import sys
from os import path, getcwd

# Get the working directory of the notebook
notebook_path = getcwd()

# Set the path to the root directory of this jupyter notebook (python) project
module_path = path.abspath(path.join('..'))

if module_path not in sys.path:
    sys.path.append(module_path)

# Package application imports
from src_setup.lib_setup import Initiate_database
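
Note that path.abspath('..') is resolved against the current working directory, so the code in this block assumes the notebook is started from its own directory. The same idea can be wrapped in a small helper that takes the notebook directory explicitly. This is only a sketch, and add_parent_to_sys_path is a hypothetical name that is not part of the framework.

```python
import sys
from os import getcwd, path


def add_parent_to_sys_path(notebook_dir):
    """Append the parent of notebook_dir to sys.path once and return it."""
    parent = path.abspath(path.join(notebook_dir, ".."))
    if parent not in sys.path:
        sys.path.append(parent)
    return parent


module_path = add_parent_to_sys_path(getcwd())
# Calling it again does not add a duplicate entry.
add_parent_to_sys_path(getcwd())
print(module_path)
```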

Code block 2 - scheme file

The second code block points to the scheme file, which defines the project path, login credentials and default settings for execute, delete, overwrite and verbosity.

scheme_file = './zzz/scheme_xspatula_local_setup.json'

The default location of the scheme file for setting up the database is the (sleepy) directory zzz, under the directory that holds the notebooks. You can move the scheme file to any other location, and give either a path relative to the notebook itself or an absolute path that you have access to.

See the document on the scheme file for more information.
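
The choice between a relative and an absolute path can be illustrated with a short sketch. How the framework itself resolves the path is not documented here, so resolve_scheme_path is a hypothetical helper, and the directory names in the example are placeholders.

```python
from os import path


def resolve_scheme_path(notebook_path, scheme_file):
    """Return an absolute path; relative paths are taken relative to the notebook."""
    if path.isabs(scheme_file):
        return path.normpath(scheme_file)
    return path.normpath(path.join(notebook_path, scheme_file))


notebook_dir = path.abspath("notebooks")  # placeholder directory
relative = resolve_scheme_path(notebook_dir, "./zzz/scheme_xspatula_local_setup.json")
absolute = resolve_scheme_path(notebook_dir, path.abspath("elsewhere/scheme.json"))
print(relative)
print(absolute)
```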

Code block 3 - job or process file

The third code block points to either a job file or a process file (the framework will recognise which type it is and act accordingly).

job_file = 'job_setup_db.json'

Initiate_database(notebook_path, scheme_file, job_file)

The code in the block calling the job/process file differs between the notebooks that set up or delete a database and all other notebooks. The corresponding code block from the notebook setup_processes illustrates the latter:

job_file = 'job_setup_processes.json'

structured_process_D = Initiate_process(notebook_path, scheme_file, job_file)

if structured_process_D is not None:
    Run_process(structured_process_D)

See the documents on job file, pilot file and process file for more information.

Updated: