The Missing Manual for Hacking Postgres

The basics of Postgres development: how to build the project, run the test suite, and format a patch for submission.

It’s probably obvious that Postgres is my favorite database. One minor grievance that I have with the project is that its documentation is almost entirely optimized for people who will ultimately be users of the database rather than developers of it. An unfortunate side effect is that none of the repository’s standard files (e.g. README) give much insight into how to get started with the source code.

In numerous places, references in files and in errors generated by make tasks are actively misleading in that they point to an INSTALL file for further instructions. Some investigation reveals that INSTALL doesn’t actually exist on master; it’s only generated as part of a release.

The excellent Postgres documentation does of course contain all the information needed to get started with development, but if it has one weakness, it’s that its overwhelming verbosity tends to obscure information.

Here I’ve tried to assemble some instructions for getting started that are useful and, more importantly, succinct. I don’t expect most of them to change all that much, but I’ll try to keep the document up-to-date in case they do.

The Basics

Selecting Prefix and Port

It’s often desirable to have a stable release of Postgres running on your machine for day-to-day work along with your experimental build, so you may want to choose a non-standard install directory (the prefix) and port for development.

A prefix is passed during configure; I’ve chosen /opt/postgres below.

A port can be overridden with a command line argument to a server or client command like psql. It can also be overridden for an entire session by setting the PGPORT environment variable. I’ve chosen 5433 as my port below.
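As a minimal sketch of the session-wide approach, the lines below set PGPORT for the current shell. Prepending /opt/postgres/bin to PATH is my own assumption for convenience rather than something the steps here require, and it will shadow a stable install’s binaries for the rest of the session.

```shell
# Applies only to the current shell session.
# Client tools such as psql and createdb read PGPORT when no -p is given.
export PGPORT=5433

# Assumption: put the experimental build first on PATH so that a bare
# `psql` resolves to /opt/postgres/bin/psql. Skip this line if you would
# rather keep typing full paths.
export PATH="/opt/postgres/bin:$PATH"
```

With these set, commands like psql brandur-test no longer need an explicit -p 5433.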

Building

Clone the repository:

git clone https://github.com/postgres/postgres.git

Run configure with a prefix pointing to your chosen install directory:

./configure --prefix /opt/postgres

Then build it. The -j option tells make how many jobs to run in parallel, which shortens the build considerably on a multi-core machine.

make -j4

Install the result to the prefix configured above:

make install
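The three build steps above can be collected into one small function, sketched here. The function name and the PG_PREFIX and PG_JOBS variables are hypothetical. The --enable-debug and --enable-cassert flags are my own additions (they turn on debug symbols and assertion checks, which tend to be useful while hacking) and can be dropped to match the commands above exactly.

```shell
# Hypothetical names for this sketch; override them in the environment if needed.
PG_PREFIX="${PG_PREFIX:-/opt/postgres}"
PG_JOBS="${PG_JOBS:-4}"

build_postgres() {
    # Run from the root of the Postgres checkout.
    # Each step runs only if the previous one succeeded.
    ./configure --prefix="$PG_PREFIX" --enable-debug --enable-cassert &&
        make -j"$PG_JOBS" &&
        make install
}
```

Call build_postgres from the source root; it stops at the first step that fails.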

Running

Initialize a data directory and start an instance of Postgres right in your terminal. This is convenient because you can see any logging that it emits, and you can restart it easily by stopping it with Ctrl+C and running the command again.

/opt/postgres/bin/initdb -D data

/opt/postgres/bin/postgres -D data -p 5433

Now create a database and connect to it:

/opt/postgres/bin/createdb -p 5433 brandur-test

/opt/postgres/bin/psql -p 5433 brandur-test
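Since initdb only needs to run once per data directory, a small wrapper can skip it on later starts. This is a sketch under my own naming: the function and the PG_BIN, PG_DATA, and PG_DEV_PORT variables are hypothetical. It relies on the PG_VERSION file that initdb writes into every data directory.

```shell
# Hypothetical names for this sketch.
PG_BIN="${PG_BIN:-/opt/postgres/bin}"
PG_DATA="${PG_DATA:-data}"
PG_DEV_PORT="${PG_DEV_PORT:-5433}"

start_dev_postgres() {
    # initdb refuses to run on a non-empty directory, so call it only
    # when the data directory has not been initialized yet.
    if [ ! -f "$PG_DATA/PG_VERSION" ]; then
        "$PG_BIN/initdb" -D "$PG_DATA" || return 1
    fi
    # Runs in the foreground so log output stays visible; stop with Ctrl+C.
    "$PG_BIN/postgres" -D "$PG_DATA" -p "$PG_DEV_PORT"
}
```

Run start_dev_postgres in one terminal, then use createdb and psql from another as shown above.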

Testing

Postgres doesn’t have much in the way of standard unit testing, but instead relies heavily on a thorough regression suite. Run it with:

make check

The command will start a new server, set it up, run the suite, and then tear it down. This is a reliable way to get consistent results, but is somewhat slow. A faster version is also provided which can use a server that you already have running elsewhere:

PGPORT=5433 make installcheck

There’s also a parallel version available to further improve speed:

PGPORT=5433 make installcheck-parallel
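When a test fails, the regression driver records the difference between expected and actual output in a regression.diffs file. The sketch below runs the suite against an existing server and prints that file on failure. The function name is hypothetical, and the path assumes the failure came from the core suite in src/test/regress.

```shell
run_installcheck() {
    # Port of the already-running server; defaults to the one used above.
    port="${1:-5433}"
    if PGPORT="$port" make installcheck; then
        echo "regression tests passed"
    else
        # Core regression suite failures are written here.
        diffs="src/test/regress/regression.diffs"
        [ -f "$diffs" ] && cat "$diffs"
        return 1
    fi
}
```

Run it from the source root as run_installcheck or run_installcheck 5433.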

pgindent

Postgres has a slightly unusual tradition of code indentation which seems to have evolved to maximize the number of bytes saved at a time when that mattered, and which continues through to this day. A program similar to Go’s gofmt called pgindent ships with the Postgres source to help automatically reformat source files that are inconsistent.

You may be asked to run pgindent if someone notices that your patch isn’t compliant, and it’s generally a good idea to run it on any source files that you changed before producing a patch anyway.

A few dependencies need to be installed before pgindent can run. The most up-to-date instructions on how to do that can be found in its README.

After that’s done it can simply be run like so on a C file (where our current directory is the Postgres source root):

src/tools/pgindent/pgindent src/backend/utils/adt/mac.c

Given that pgindent is brittle Perl code and appears to have no test coverage whatsoever, I’d recommend committing changes before using it on any of your code.
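One way to follow that advice is to commit first and then run pgindent only on the C files your branch touched, as in this sketch. Both function names are hypothetical, and it assumes your work lives on a branch cut from master and that pgindent’s dependencies are already installed.

```shell
# Keep only C sources and headers from a list of paths on stdin.
filter_c_sources() {
    grep -E '\.(c|h)$'
}

indent_changed_files() {
    base="${1:-master}"
    # List files added, modified, or renamed since the branch diverged
    # from $base, then run pgindent on each C file in turn.
    git diff --name-only --diff-filter=AMR "$base"... | filter_c_sources |
        while IFS= read -r file; do
            src/tools/pgindent/pgindent "$file"
        done
}
```

After indent_changed_files finishes, git diff shows exactly what pgindent rewrote, and git checkout on a file discards the result if it looks wrong.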

Patch Formatting

Changes to Postgres are submitted as patch files attached to emails sent to the PG Hackers mailing list. Traditionally, Postgres required that patches be in a particular style called “context format” (as generated by the diff tool’s -c option), but that constraint has since loosened a bit as the “unified diff” (probably what you’re used to seeing from programs like git diff) has become widely considered to be more legible.

One good method for producing a patch that will be acceptable on the mailing list is git format-patch (see footnote 1). This command writes each commit to a separate file named after the commit’s subject line, and includes the entire commit message in the file for extra context. For example:

$ git format-patch master...
0001-Implement-SortSupport-for-macaddr-data-type.patch

Regardless of the tool you use, good commit hygiene is still of paramount importance, so remember to squash and fix using git rebase -i before producing patch files.
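Once the history has been cleaned up, a sketch like the following collects the patch files into their own directory so they are easy to attach to an email. The function name and the patches directory are hypothetical, and it assumes the branch was cut from master.

```shell
make_patches() {
    base="${1:-master}"
    outdir="${2:-patches}"
    # Write one numbered .patch file per commit on the current branch
    # that is not in $base; -o selects the output directory.
    git format-patch -o "$outdir" "$base"..HEAD
}
```

Running make_patches with no arguments writes files like the one shown above into ./patches.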

1 Note that git format-patch is not officially endorsed and so your mileage with its usage may vary.

The Missing Manual for Hacking Postgres was published on August 17, 2016 from San Francisco.

Find me on Twitter at @brandur.
