This document describes how to work with HIL’s database migration framework. It is aimed at developers; for instructions on upgrading a production database see UPGRADING.rst.

Introduction

HIL uses SQLAlchemy as a database abstraction layer, Flask-SQLAlchemy for integration with Flask, and Flask-Migrate/Alembic for migration support. This document explains the HIL-specific details, focusing mainly on Alembic and Flask-Migrate; refer to the libraries’ own documentation for general information about these tools.

HIL uses migration scripts for updating the database to work with a new version of the software. Any time the database schema changes, a migration script must be written so that existing installations may be upgraded to the next HIL release automatically.

Background

Developers unfamiliar with Alembic should at least skim the Alembic tutorial, which explains general Alembic concepts. Developers should also read the overview documentation for Flask-Migrate, which integrates Alembic with Flask.

Usage In HIL

HIL uses the migration libraries as follows:

The Flask-Script extension is used to expose Flask-Migrate’s functionality via the hil-admin db subcommand, implemented in hil/commands/. This command behaves almost exactly as described in the Flask-Migrate documentation, except that we have added a create sub-command (hil-admin db create), which initially creates and populates the database tables. create is idempotent, so it is safe to run it on an already-initialized database. When adding extensions to hil.cfg, re-run the create command to create any new tables; see UPGRADING for details.

We use Alembic’s branch support to handle tables created by extensions; developers should be familiar with this part of Alembic’s interface. Each extension should have its own head; we do not merge these heads.

When a new migration script needs to be written, the procedure depends on whether the script is for an extension that does not yet have any migration scripts, or for an extension (or HIL core) that already does.

If the extension does not have migration scripts already

First, choose a directory in which to store the migration scripts for the extension. Storing the migrations as close to the module as possible is recommended. Add code to your extension to register this directory with hil.migrations.paths, e.g.:

from hil.migrations import paths

paths[__name__] = 'Path to my migrations directory'

Make sure the value will be correct regardless of where the source tree is located; making it a function of __file__ is a good idea. Have a look at existing extensions, such as hil.ext.switches.dell, if you need an example.
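
For instance, a minimal sketch, assuming the migrations live in a migrations/ directory next to the extension module (adjust the directory name to whatever you chose):

    import os

    from hil.migrations import paths

    # Resolve the directory relative to this module's location, so the path
    # stays correct no matter where the source tree is installed.
    paths[__name__] = os.path.join(os.path.dirname(__file__), 'migrations')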

Also, make sure that the migration scripts will be included in the final package. This usually just means adding an appropriate entry to package_data in setup.py. If the extension is maintained outside of the HIL core source tree, make sure you pass zip_safe=False to setup.
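
For example, a hypothetical sketch of the relevant arguments to setup(); the package and directory names are placeholders, not HIL’s actual layout:

    from setuptools import setup, find_packages

    setup(
        name='hil-ext-myswitch',  # hypothetical out-of-tree extension
        packages=find_packages(),
        # Ship the migration scripts alongside the module:
        package_data={'hil_ext_myswitch': ['migrations/*.py']},
        # Needed so the migration files exist as real files on disk:
        zip_safe=False,
    )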

Next, to generate a template migration, execute:

hil-admin db migrate --head hil@head \
    --branch-label ${name_of_extension_module} \
    -m '<Summary of change>'

Alembic will look at the differences between the schema in the database and the one derived from the HIL source, and generate its best attempt at a migration script. The script will be stored in hil/migrations/versions. Move it to the directory you chose, and change the value of down_revision to None.
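
After moving and editing the script, the revision identifiers at its top should look roughly like this (the revision id and module name are invented for illustration):

    # revision identifiers, used by Alembic.
    revision = 'a1b2c3d4e5f6'
    down_revision = None  # edited by hand: this is the extension's first migration
    branch_labels = ('hil.ext.switches.myswitch',)  # hypothetical extension module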

Sanity check the output; Alembic often does a good job generating scripts, but it should not be trusted blindly.

Finally, run hil-admin db upgrade heads to upgrade the database according to your new migration.

If the extension already has migrations, or the script is for HIL core

To generate a new migration script, execute:

hil-admin db migrate --head ${branch_name}@head \
    -m '<Summary of change>'

Where ${branch_name} is either the name of the extension’s module (if the script is for an extension) or hil (if the script is for HIL core).

Alembic will look at the differences between the schema in the database and the one derived from the HIL source, and generate its best attempt at a migration script. The script will be stored in hil/migrations/versions.

The value of down_revision should be the identifier of the previous migration script. The value of branch_labels should be ('<branch_name>',), where <branch_name> matches what was used in the command to generate the migration script. Finally, the value of branch_labels in the previous migration script (the one named by down_revision in the new script) must be set to None; this needs to be changed manually.
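
For example, with made-up revision ids, the edited identifiers in the two scripts would look something like this:

    # In the new migration script:
    revision = 'b2c3d4e5f6a1'
    down_revision = 'a1b2c3d4e5f6'      # id of the previous migration in this branch
    branch_labels = ('my_extension',)   # matches ${branch_name} used above

    # In the previous migration script ('a1b2c3d4e5f6'):
    branch_labels = None                # edited by hand, as described above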

Sanity check the output; Alembic often does a good job generating scripts, but it should not be trusted blindly.

Notes on Generating Migrations and Checking the Output

State of the Database

For automatically generated migrations, the database used to generate the migration should match the schema of the current master branch. To ensure this, create a new PostgreSQL database and initialise it using hil-admin db create while on a branch that is up to date with the current HIL master branch. Then check out the branch containing the changes the script should be generated for, and run the command to generate the migration script.

Renaming Columns

Alembic will not rename a column; instead, it will drop the original column and create a new one with the new name. This is an issue because the data will then not be carried over into the new column (see Data Migration vs. Schema Migration below). To rename a column, edit the script manually: remove the lines dropping the old column and creating one with the new name, and replace them with a single line altering the column:

    op.alter_column(u'<table_name>', '<old_column_name>', new_column_name='<new_column_name>')
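
In context, the edited upgrade()/downgrade() pair might look roughly like this (the table and column names are placeholders):

    def upgrade():
        # Replaces the generated drop_column()/add_column() pair so the
        # existing data is preserved.
        op.alter_column(u'widget', 'old_name', new_column_name='new_name')


    def downgrade():
        op.alter_column(u'widget', 'new_name', new_column_name='old_name')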

Data Migration vs. Schema Migration

The migrations Alembic generates are schema migrations: they will create/delete tables, columns, and relationships, but they do not populate these with data. This can be an issue, particularly in the case of a new relationship that would apply to existing data. None of the existing data will be accessible via the new relationship unless the data itself is specifically migrated as well.

This can be done by encoding the data directly in the script and inserting it with an operation like bulk_insert(), by executing a SQL statement to SELECT the existing data and INSERT it into the new column or table, or by interacting with the database directly over a live connection.

Lines 28-32 in hil/migrations/versions/89630e3872ec_network_acl.py show an example of executing a SQL statement and then using bulk_insert() to migrate data.
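
The following is a generic sketch of that approach, not the script referenced above; the table and column names are invented. It copies rows from an existing column into a hypothetical new table before dropping the old column:

    import sqlalchemy as sa
    from alembic import op


    def upgrade():
        # Hypothetical new table replacing a single-valued column with a
        # many-to-many relationship.
        acl = op.create_table(
            'widget_acl',
            sa.Column('widget_id', sa.Integer(), nullable=False),
            sa.Column('project_id', sa.Integer(), nullable=False),
        )

        # The schema migration alone would leave this table empty; copy the
        # existing relationship data over so it is not lost.
        conn = op.get_bind()
        rows = conn.execute(sa.text(
            'SELECT id, project_id FROM widget WHERE project_id IS NOT NULL'
        )).fetchall()
        op.bulk_insert(acl, [
            {'widget_id': widget_id, 'project_id': project_id}
            for widget_id, project_id in rows
        ])

        op.drop_column('widget', 'project_id')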

Writing tests

The file tests/unit/migrations.py provides some basic infrastructure for testing migrations. The comments there describe things in full detail, but the basic strategy is to back up a database with a known set of objects using pg_dump and place the result in tests/unit/migrations/. To generate a dump, execute::

pg_dump -U $hil_user $database_name --column-inserts > filename.sql

--column-inserts is necessary to ensure that the result can be loaded via SQLAlchemy; pg_dump’s default output is not plain SQL and relies on features specific to the psql command. You will also need to edit the file manually, doing the following:

  • It is likely that the system you’re developing on will have a newer version of PostgreSQL than the one Travis CI uses, so you may need to remove statements setting options that don’t exist in the older versions. Delete any statements of the form SET option_name = ..., where option_name is any of:
    • idle_in_transaction_session_timeout
    • lock_timeout
    • row_security
  • Delete all statements using the keywords GRANT, REVOKE, or EXTENSION. These will cause permission errors if your database user is not a superuser.
  • Delete all statements of the form ALTER TABLE ... OWNER TO ...; these may cause failures if the connection details on the machine which runs the tests differ from yours, since they refer to specific database users/roles.
  • At the top of the file, document the following:
    • At roughly what stage of HIL’s development this dump was taken, e.g. “Just after re-integrating flask” or “first integration of openvpn”
    • The git commit hash of the revision of HIL which created the database.
    • A list of extensions that were loaded into HIL when the database was created.
    • Any other options in the config file that would have affected the contents of the database.
    • How the database was generated, e.g. “by the function hil.foo.bar as of commit $hash.”