Automating Partitioned Table Migrations with GitHub Actions
PostgreSQL supports table partitioning, which splits rows into multiple partitions for improved query performance and easier bulk data transfers. Partitioned tables are useful when data are regularly inserted into a table and only recently added rows are selected in active operation. Older table partitions can be detached, archived, and finally dropped. Changes to database schemas in production systems should be managed with schema migrations, version-controlled, and incrementally applied SQL scripts. Repeated schema migrations should be automated to avoid errors, save engineering time, and ensure they are run. GitHub Actions is a CI/CD automation service that executes jobs according to workflow definitions. This post shows how to use GitHub Actions to regularly open pull requests that manage table partition migrations.
Initial Database Schema with Partitioned Table
PostgreSQL supports three different ways to partition rows in a table:
- Range Partitioning: using one or more orderable columns
- List Partitioning: explicitly specifying row keys for each partition
- Hash Partitioning: using hash of partition key
Range partitioning is suitable for cases when rows include a timestamp and only more recently added rows are accessed regularly. A range-partitioned table is created by adding the PARTITION BY RANGE
clause at the end of a CREATE TABLE
statement.
CREATE TABLE samples (
time date not null,
data jsonb not null
) PARTITION BY RANGE (time);
It can be helpful to add an index on the key column.
CREATE INDEX samples_time_idx ON samples (time);
Schema Migrations with postgres-migrations
The migration library postgres-migrations, which I've chosen here for its simplicity, runs migration scripts sequentially and stores information on which migrations have already been run in the table migrations
the library creates. You specify the ordering for migrations in file names:
migrations/00001_initial_schema.sql
migrations/00002_first_migration.sql
migrations/00003_second_migration.sql
The library does not include a command-line utility, so a minimal migration script is required:
import pg from "pg";
import { migrate } from "postgres-migrations";
const client = new pg.Client()
await client.connect();
try {
await migrate({client}, "migrations");
} finally {
await client.end();
}
A PostgreSQL database can be created for testing in a Docker container with postgres image:
$ docker run --rm --name postgres -e POSTGRES_PASSWORD=password -p 5432:5432 postgres
Before running the initial migration, a role and a database need to be created:
$ docker exec -i postgres su -c psql postgres <<EOF
CREATE ROLE samples PASSWORD 'samples' LOGIN;
CREATE DATABASE samples OWNER samples;
EOF
Connection information can be supplied through environment variables:
$ PGHOST=localhost PGPORT=5432 PGUSER=samples PGPASSWORD=samples PGDATABASE=samples node scripts/migrate.mjs
After running the initial migration, the table samples
is present:
$ docker exec -it postgres su -c "psql samples -c '\d samples'" postgres
Partitioned table "public.samples"
Column | Type | Collation | Nullable | Default
--------+-------+-----------+----------+---------
time | date | | not null |
data | jsonb | | not null |
Partition key: RANGE ("time")
Indexes:
"samples_time_idx" btree ("time")
Number of partitions: 0
Creating Schema Migrations for Table Partitions
Picking a suitable number of partitions for a partitioned table depends on the requirements. If one partition contains rows for one calendar month and only the last three months are ever queried, four attached partitions can be a good starting point. When the oldest partition is detached, it can still be kept in the database for one month to archive its contents to separate storage before deletion.
The following script create-partition.ts
creates schema migration files to manage a configurable number of table partitions. It takes three command-line arguments:
- current date (in any format
Date
can parse), - name of the partitioned table
- number of active partitions
import fs from "fs";
import path from "path";
const date = process.argv[2];
const table = process.argv[3];
const partitions = parseInt(process.argv[4]);
if (!date || !table || !Number.isInteger(partitions)) {
console.log(
"Usage: create-partition.ts YYYY-MM TABLE PARTITION\nExample: create-partition.ts 2022-07 samples 4"
);
process.exit(1);
}
// finding existing partition migration files with a regex
const migrationFileNameRegex = new RegExp(
`^[0-9]{5}_${table}_[0-9]{4}_[0-9]{2}.sql$`
);
const partitionTableName = (monthStart: Date) =>
`${table}_${monthStart.getFullYear()}_${String(
monthStart.getMonth() + 1
).padStart(2, "0")}`;
const toDateString = (date: Date) => date.toISOString().slice(0, 10);
const createScript = (startDate: Date, endDate: Date) =>
`CREATE TABLE ${partitionTableName(startDate)}
PARTITION OF ${table}
FOR VALUES FROM ('${toDateString(startDate)}') TO ('${toDateString(endDate)}');
`;
const run = async () => {
const migrations = await fs.promises.readdir("migrations");
const numExistingPartitions = migrations.filter((fileName) =>
migrationFileNameRegex.test(fileName)
).length;
const migration = migrations.length + 1;
const partition = numExistingPartitions + 1;
const today = new Date(date);
const currentMonthStart = new Date(today.getFullYear(), today.getMonth(), 1);
const nextMonthStart = new Date(
currentMonthStart.getFullYear(),
currentMonthStart.getMonth() + 1,
1
);
const nextMonthEnd = new Date(
nextMonthStart.getFullYear(),
nextMonthStart.getMonth() + 1,
1
);
const partitionToDetachStartDate = new Date(
today.getFullYear(),
today.getMonth() - partitions,
1
);
const partitionToDropStartDate = new Date(
today.getFullYear(),
today.getMonth() - partitions - 1,
1
);
const detachScript = `ALTER TABLE ${table} DETACH PARTITION ${partitionTableName(
partitionToDetachStartDate
)};`;
const dropScript = `DROP TABLE ${partitionTableName(
partitionToDropStartDate
)};`;
const migrationScript = [
// first migration creates a partition for the current month
partition === 1 ? createScript(currentMonthStart, nextMonthStart) : "",
// all migrations create a partition for the next month
createScript(nextMonthStart, nextMonthEnd),
// after a desired number of partitions have been created, detach one partition
partition >= partitions ? detachScript : "",
// a detached partition from previous run can be dropped
partition >= partitions + 1 ? dropScript : "",
]
.filter((script) => script.length > 0)
.join("\n");
const migrationFileName = `${String(migration).padStart(
5,
"0"
)}_${partitionTableName(
partition === 1 ? currentMonthStart : nextMonthStart
)}.sql`;
await fs.promises.writeFile(
path.join("migrations", migrationFileName),
migrationScript
);
};
run();
Running the script five times and passing five consecutive months as parameters will create five migration files.
$ for month in 07 08 09 10 11; do
ts-node scripts/create-partition.ts 2022-${month} samples 4
done
The first partition table migration script creates two partitions, one for the current month and one for the next month:
CREATE TABLE samples_2022_07
PARTITION OF samples
FOR VALUES FROM ('2022-06-30') TO ('2022-07-31');
CREATE TABLE samples_2022_08
PARTITION OF samples
FOR VALUES FROM ('2022-07-31') TO ('2022-08-31');
The following two scripts both create one partition.
CREATE TABLE samples_2022_09
PARTITION OF samples
FOR VALUES FROM ('2022-08-31') TO ('2022-09-30');
CREATE TABLE samples_2022_10
PARTITION OF samples
FOR VALUES FROM ('2022-09-30') TO ('2022-10-31');
The fourth migration script creates one partition and detaches the first partition.
CREATE TABLE samples_2022_11
PARTITION OF samples
FOR VALUES FROM ('2022-10-31') TO ('2022-11-30');
ALTER TABLE samples DETACH PARTITION samples_2022_07;
The fifth migration script creates one partition, detaches the second partition, and drops the first partition.
CREATE TABLE samples_2022_12
PARTITION OF samples
FOR VALUES FROM ('2022-11-30') TO ('2022-12-31');
ALTER TABLE samples DETACH PARTITION samples_2022_08;
DROP TABLE samples_2022_07;
Now the migrations can be run against the local database:
PGHOST=localhost PGPORT=5432 PGUSER=samples PGPASSWORD=samples PGDATABASE=samples node scripts/migrate.mjs
After the migrations have been run, there are four attached partitions:
$ docker exec -it postgres su -c "psql samples -c '\d+ samples'" postgres
Partitioned table "public.samples"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stat
s target | Description
--------+-------+-----------+----------+---------+----------+-------------+-----
---------+-------------
time | date | | not null | | plain | |
|
data | jsonb | | not null | | extended | |
|
Partition key: RANGE ("time")
Indexes:
"samples_time_idx" btree ("time")
Partitions: samples_2022_09 FOR VALUES FROM ('2022-08-31') TO ('2022-09-30'),
samples_2022_10 FOR VALUES FROM ('2022-09-30') TO ('2022-10-31'),
samples_2022_11 FOR VALUES FROM ('2022-10-31') TO ('2022-11-30'),
samples_2022_12 FOR VALUES FROM ('2022-11-30') TO ('2022-12-31')
And one detached partition:
$ docker exec -it postgres su -c "psql samples -c '\d samples_2022_08'" postgres
Table "public.samples_2022_08"
Column | Type | Collation | Nullable | Default
--------+-------+-----------+----------+---------
time | date | | not null |
data | jsonb | | not null |
Indexes:
"samples_2022_08_time_idx" btree ("time")
Opening Pull Requests in GitHub Actions Workflow
To ensure table partition schema migration scripts are created every month GitHub Actions workflow can be dispatched using the schedule trigger that uses CRON syntax, e.g. 0 0 1 * *
triggers the workflow on the first day of every month. It can be helpful to dispatch it manually, too, with the workflow_dispatch trigger.
Database-specific configuration is better placed in a separate script than a workflow file.
#!/bin/sh
ts-node scripts/create-partition.ts $(date +%Y-%m) samples 4
The following workflow file .github/workflows/monthly-migrations.yaml
does the following:
- checks out the repository with actions/checkout,
- installs node.js with actions/setup-node,
- installs dependencies, e.g., ts-node, by running
npm ci
, - runs the script
scripts/create-monthly-migrations.sh
and - creates a pull request with peter-evans/create-pull-request.
name: Create Monthly PostgreSQL Migrations
on:
schedule:
- cron: "0 0 1 * *" # first day of every month
workflow_dispatch: # manual trigger
jobs:
monthly_migrations:
name: Create Monthly PostgreSQL Migrations
runs-on: ubuntu-latest # or use your own runners, e.g. [self-hosted, linux, x64]
steps:
- name: Checkout
uses: actions/checkout@v3 # consider using commit hashes instead of version tags for improved peace of mind
- name: Setup Node.js
uses: actions/setup-node@v3
with:
node-version: 16
- name: Install Dependencies
run: npm ci
- name: Create Monthly Migrations
run: ./scripts/create-monthly-migrations.sh
- name: Create Pull Request
uses: peter-evans/create-pull-request@v4
with:
author: ${{ github.actor }} <${{ github.actor }}@users.noreply.github.com>
title: "Monthly PostgreSQL Migrations"
body: "PostgreSQL migration scripts that should be run monthly against the database."
branch: monthly-postgresql-migrations
The script scripts/create-monthly-migrations.sh
calls scripts/create-partition.ts
with appropriate parameters:
#!/bin/sh
TS_NODE=./node_modules/ts-node/dist/bin.js
$TS_NODE scripts/create-partition.ts $(date +%Y-%m) samples 4
After committing the workflow file to the repository it is possible to dispatch it under the "Actions" tab.

A successful workflow run results in a newly opened pull request:

The pull request includes a commit to add the file migrations/00002_samples_2022_07.sql
:

It is possible to merge such pull requests automatically with peter-evans/enable-pull-request-automerge to avoid creating schema migration scripts with the same numeric prefix (00002 here). However, auto-merging is available only for public and non-free private repositories.
Conclusions
Partitioning a PostgreSQL table can speed up queries and help archive rows that are no longer used actively. Managing a partitioned table requires recurring updates to the database schema, and these should be handled as versioned and incremental schema migration scripts. Schema migration scripts to create, detach and drop partitions are highly regular, and their creation should be automated. Creating schema migration scripts programmatically is a good start, but it is better to push the automation further to ensure schema migrations are regularly and correctly created. GitHub Actions and postgres-migrations can be leveraged with the help of a short script and a workflow definition to periodically open pull requests that add appropriate schema migration scripts.