Annotations

Annotations are comments that are placed in SQL migration files to provide additional information to goose. They are used to parse SQL statements and optionally modify how migrations are executed.

To summarize, annotations are:

Case-insensitive
Placed on their own line (no leading whitespace)
Prefixed with -- +goose (^--\s\+goose\s.*$)
The only mandatory annotation is -- +goose up

There are currently seven annotations:

-- +goose up
-- +goose down
-- +goose statementbegin
-- +goose statementend
-- +goose no transaction
-- +goose envsub on
-- +goose envsub off

Quick start

Here's a copy/pasteable example:

-- +goose up
SELECT 'up SQL query';

-- +goose down
SELECT 'down SQL query';

Important, annotations are captured as comments and cannot have leading spaces:

-- +goose up ✅

    -- +goose up ❌ (invalid, because leading whitespace)

Basics

A SQL migration file must have a .sql extension and is prefixed with either a timestamp or a sequential number. There is a handy goose create command to stub out migration files in a consistent way.

Each SQL migration file is expected to have exactly one -- +goose up annotation.

The -- +goose down annotation is optional and must come after the -- +goose up annotation.

-- +goose up
SELECT 'up SQL query 1';
SELECT 'up SQL query 2';
SELECT 'up SQL query 3';

-- +goose down (1)
SELECT 'down SQL query 1';
SELECT 'down SQL query 2';

The down annotation is optional, and may be omitted entirely if there are no down migrations. Within the .sql file it must come after the -- +goose up annotation.

This is invalid:
```
-- +goose down
SELECT 'down SQL query';

-- +goose up
SELECT 'up SQL query';
```

Complex statements

By default, SQL statements are split by semicolons (;) and executed individually -- in fact, query statements must be separated by semicolons to be properly recognized by goose.

More complex statements, such as PL/pgSQL functions, typically have semicolons within them and must be wrapped with -- +goose statementbegin and -- +goose statementend annotations.

This pair of annotations tell goose to treat the entire block as a single statement. Comments, empty lines, and semicolons within the block are preserved.

-- +goose up

-- +goose statementbegin
CREATE OR REPLACE FUNCTION histories_partition_creation( DATE, DATE )
returns void AS $$
DECLARE
  create_query text;
BEGIN
-- This comment will be preserved.
  -- And so will this one.
  FOR create_query IN SELECT
      'CREATE TABLE IF NOT EXISTS histories_'
      || TO_CHAR( d, 'YYYY_MM' )
      || ' ( CHECK( created_at >= timestamp '''
      || TO_CHAR( d, 'YYYY-MM-DD 00:00:00' )
      || ''' AND created_at < timestamp '''
      || TO_CHAR( d + INTERVAL '1 month', 'YYYY-MM-DD 00:00:00' )
      || ''' ) ) inherits ( histories );'
    FROM generate_series( $1, $2, '1 month' ) AS d
  LOOP
    EXECUTE create_query;
  END LOOP;  -- LOOP END
END;         -- FUNCTION END
$$
language plpgsql;
-- +goose statementend

Multiple statements

The -- +goose statementbegin and -- +goose statementend annotations can also be used to combine multiple statements so they get sent as a single query string instead of being executed individually.

Example

This is best illustrated with a contrived example. Suppose we have a migration that creates a users table and adds 100,000 rows with distinct INSERT's.

-- +goose up
CREATE TABLE users (
    id int NOT NULL PRIMARY KEY,
    username text
);

-- (1)! Inserts:
INSERT INTO "users" ("id", "name") VALUES (1, 'gallant_almeida7');
INSERT INTO "users" ("id", "name") VALUES (2, 'brave_spence8');
.
.
INSERT INTO "users" ("id", "name") VALUES (99999, 'jovial_chaum1');
INSERT INTO "users" ("id", "name") VALUES (100000, 'goofy_ptolemy0');

-- +goose down
DROP TABLE users;

This is a contrived example. Normally this would be a set of batched INSERT's with multiple column values, each enclosed with parentheses and separated by commas, like so:
```
INSERT INTO "users" ("id", "username")
VALUES
  (1, 'gallant_almeida7'),
  (2, 'brave_spence8');
```

The up migration contains 100,001 unique statements, all executed within the same transaction, but sent to the database one-by-one. This migration will take ~30s to complete due to the number of round trips.

Using PostgreSQL as an example, here's what the database logs show:

LOG:  statement: INSERT INTO "users" ("id", "username") VALUES (1, 'gallant_almeida7');
LOG:  statement: INSERT INTO "users" ("id", "username") VALUES (2, 'brave_spence8');
LOG:  statement: INSERT INTO "users" ("id", "username") VALUES (3, 'lucid_bardeen6');
[...] 100,000 log statements

However, if we wrap the inserts with -- +goose statementbegin and -- +goose statementend annotations, the entire block will be sent to the server as a single command.

A single command still contains several statements separated by semicolons, but they get sent to the server in the same query string: "INSERT INTO ...; INSERT INTO ...;".

-- +goose up
CREATE TABLE users (
    id int NOT NULL PRIMARY KEY,
    username text,
    name text,
    surname text
);

-- +goose statementbegin
INSERT INTO "users" ("id", "username") VALUES (1, 'gallant_almeida7');
INSERT INTO "users" ("id", "username") VALUES (2, 'brave_spence8');
.
.
INSERT INTO "users" ("id", "username") VALUES (99999, 'jovial_chaum1');
INSERT INTO "users" ("id", "username") VALUES (100000, 'goofy_ptolemy0');
-- +goose statementend

-- +goose down
DROP TABLE users;

These annotations instruct goose to execute the entire block as a single statement. Yes, that's a larger payload, but that's fine and the migration will execute in ~3s, which is an order of magnitude faster as compared to the previous example that ran in ~30s.

No transaction

All statements within a single migration file are run within the same transaction. However, some statements, like CREATE DATABASE or CREATE INDEX CONCURRENTLY, cannot be run within a transaction block.

For such cases add the -- +goose no transaction annotation, usually placed at the top of the file.

This annotation instructs goose to run all statements within the file outside a transaction. This applies to all up and down statements within the file.

-- +goose no transaction

-- +goose up
CREATE INDEX CONCURRENTLY ON users (user_id);

-- +goose down
DROP INDEX users_user_id_idx;

Environment variable substitution

Goose supports environment variable substitution in SQL migration files. This is useful for parameterizing SQL queries with values that are not known at the time of writing the migration.

Substitution is disabled by default. To enable it, add the -- +goose envsub on annotation at the location where you want to start substituting environment variables. This could be at the top of the file, which enables substitution for the entire file, or at a specific location within the file.

goose will attempt to substitute environment variables until the end of the file, or until the annotation -- +goose envsub off is found.

For example, if the environment variable REGION is set to us_east_1, the following SQL migration will be substituted to SELECT * FROM regions WHERE name = 'us_east_1';.

-- +goose envsub on

-- +goose up
SELECT * FROM regions WHERE name = '${REGION}';

Supported expansions

${parameter} or $parameter
${parameter:-[word]}
${parameter-[word]}
${parameter:[offset]}
${parameter:[offset]:[length]}
${parameter?[word]}
${parameter:?[word]} (coming soon)

For an explanation of each expansion, refer to the mfridman/interpolate package. In due time, we'll update the documentation to reflect the supported expansions.