Building and Debugging Postgres

I recommend building Postgres from source so that you can add prints around the code to get a better idea of what is happening. Personally, I use a custom Docker image to explore code bases, especially if I need to install and build stuff that will touch multiple parts of my system. In this case it is even more useful because we need root access.1 For Postgres I start from a humble Ubuntu image that installs all the requirements:

FROM ubuntu:22.04

WORKDIR /app

RUN apt-get update && apt-get install -y \
    build-essential pkg-config libicu-dev \
    g++ make bison flex libreadline-dev \
    zlib1g-dev git llvm-dev clang \
  && rm -rf /var/lib/apt/lists/*

Most of the packages you see are required to build the base of Postgres. We also need llvm-dev to enable JIT compilation, which is now enabled by default in, for example, the Docker images. Once we have a setup with all the requirements, we download the Postgres source code; in this article, I use version 17.5. Instead of downloading the source tarball from the FTP server listed in the releases, I get it from GitHub. This way I can always undo any changes I make in a simple way. So, let's start by grabbing the source code: we clone only the release we want (REL_17_5) and only its last commit, as we only need to be able to go back to that commit:

git clone https://github.com/postgres/postgres --branch REL_17_5 --depth 1

Say we clone this in /app/postgres. We navigate there and then issue:

./configure --with-llvm

We need the --with-llvm flag as otherwise JIT compilation will not be available. Assuming this succeeds, the next step is to build Postgres with:

make -j4
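The steps so far can be collected into one function. This is only a sketch: PG_SRC, PG_TAG, DRY_RUN, run, and build_postgres are names made up for illustration, and the commands are the same ones shown above. Setting DRY_RUN=1 prints the commands without executing them.

```shell
# Sketch: clone, configure, and build in one go.
# PG_SRC, PG_TAG and DRY_RUN are hypothetical variables, not Postgres settings.
PG_SRC="${PG_SRC:-/app/postgres}"
PG_TAG="${PG_TAG:-REL_17_5}"

# Print each command; execute it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

build_postgres() {
  run git clone https://github.com/postgres/postgres \
      --branch "$PG_TAG" --depth 1 "$PG_SRC" || return 1
  run cd "$PG_SRC" || return 1
  run ./configure --with-llvm || return 1
  run make -j4
}

# Usage: build_postgres
```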

We then execute the following commands. As I said earlier, we need root access. The easiest way is to log in as root, which is what we do below. If you can't do su, but your user is in the sudoers, you can always add sudo in front of every command. Moreover, you don't have to add a new user called postgres. You can run all the other commands and use your own user instead (e.g., stef for me). The only limitation is that this user should not be root, because Postgres refuses to run as root for security reasons.

su
make install
adduser postgres
mkdir -p /usr/local/pgsql/data
chown postgres /usr/local/pgsql/data

We start by getting root access which we need for the following steps. We first install what we built. Then, we create the postgres user because Postgres can't run as root (for security reasons). At this step you will be asked to create a password and enter some other information. Normally you want a good password, but for this exploration you can use something simple like "a". Finally, we create the directory that Postgres will use to store all its files and make postgres its owner since this user will start the Postgres processes. We follow up with these commands:

su - postgres
/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data

Now we use the postgres user to initialize the data directory that the Postgres engine will work with. Up to now we've been following the Postgres documentation on how to build and run Postgres from source. At this point, however, we will deviate from it because I find this method easier for iteration. Instead of starting Postgres as a background process, we will start it directly as a normal foreground executable. So, in the same terminal in which we have logged in as postgres, we issue this:

/usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data

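Now, we leave this terminal running on the side. The benefit is that Postgres prints its log directly to this terminal, which is useful for iteration. Then, in a new terminal, we create a database for our experiments. Note that createdb connects as a database user named after your operating-system user unless told otherwise, so either run it as postgres or add -U postgres: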

/usr/local/pgsql/bin/createdb test

Now we can connect to this database with the following. If you did not create a postgres user earlier, you should omit -U postgres.

/usr/local/pgsql/bin/psql -U postgres -d test

At this point, the setup is basically done and we can start executing queries. But here's what my iteration cycle looks like. I have three terminals: (1) the postgres terminal, (2) the psql terminal, and (3) the make terminal. In the postgres terminal I am logged in as postgres and have the postgres executable running. The psql terminal, which is what we'll use in the rest of this article, is the user-facing one: here we execute queries, insert data into the database, and so on. In this terminal I'm logged in as root in Docker. Finally, I keep the make terminal only to run make install when I make a change. So, an iteration looks like this: after changing the Postgres code, I stop postgres with Ctrl+C in terminal (1), run make install in terminal (3), restart postgres in (1), and execute queries in (2) to see the changes.
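For the make terminal, the rebuild step can be wrapped in a small function so it works from any directory. This is a sketch under the assumption that the source lives in /app/postgres and that make install rebuilds whatever changed before installing; PG_SRC and reinstall are hypothetical names.

```shell
# Sketch for terminal (3): rebuild and reinstall after an edit.
# PG_SRC is a hypothetical variable pointing at the source checkout.
PG_SRC="${PG_SRC:-/app/postgres}"

reinstall() {
  # -C runs make inside the source tree, so the current directory
  # of this terminal does not matter.
  make -C "$PG_SRC" -j4 install
}

# Usage (as root, or prefixed with sudo): reinstall
```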

Debugging Postgres is not the easiest thing because, for example, it runs as many processes (each connection gets its own backend process) and because it involves multiple executables (e.g., postgres and psql). But here's a process that has worked for me. First of all, you need to compile Postgres with debug information. Start with:

./configure --enable-cassert --enable-debug CFLAGS="-O0 -g3 -ggdb -fno-omit-frame-pointer"

If you have already built Postgres before, do:

make clean

and then of course:

make -j4
sudo make install
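Before attaching a debugger, it can help to confirm that the installed binaries really come from the debug configuration. The sketch below inspects a configure-options string such as the one printed by pg_config --configure; built_for_debugging is a hypothetical helper name.

```shell
# Return success only if the configure options include both debug flags.
built_for_debugging() {
  case "$1" in
    *--enable-debug*) ;;
    *) return 1 ;;
  esac
  case "$1" in
    *--enable-cassert*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (requires an installed build):
#   built_for_debugging "$(/usr/local/pgsql/bin/pg_config --configure)" \
#     && echo "debug build" || echo "not a debug build"
```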

Now start your Postgres backend as we discussed above:

/usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data

Then, in a new terminal, start your psql executable:

/usr/local/pgsql/bin/psql -d test

Before doing anything else, do this:

SELECT pg_backend_pid();
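If you would rather look up the PID from outside the psql session you want to debug, the pg_stat_activity view lists the backends of all connected clients. A minimal sketch, assuming the test database and postgres user from above; PSQL and list_backend_pids are hypothetical names.

```shell
# Sketch: list the PIDs of client backends from a second session.
# PSQL is a hypothetical variable holding the path to psql.
PSQL="${PSQL:-/usr/local/pgsql/bin/psql}"

list_backend_pids() {
  # -A: unaligned output, -t: rows only, -c: run one command and exit.
  # The session running this query excludes itself via pg_backend_pid().
  "$PSQL" -U postgres -d test -Atc \
    "SELECT pid FROM pg_stat_activity WHERE backend_type = 'client backend' AND pid <> pg_backend_pid();"
}

# Usage: list_backend_pids
```

With a single psql session open, this should print exactly one PID, the one to hand to gdb.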

Then, in yet another terminal,2 start gdb like this:

sudo gdb /usr/local/pgsql/bin/postgres <PID>

This should give you an output that looks like this:

Reading symbols from /usr/local/pgsql/bin/postgres...
Attaching to program: /usr/local/pgsql/bin/postgres, process 235825
Reading symbols from /lib/x86_64-linux-gnu/libz.so.1...
(No debugging symbols found in /lib/x86_64-linux-gnu/libz.so.1)
...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x000073c17c325e5a in epoll_wait (epfd=6, events=0x58839c925e00, maxevents=1, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30	../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.

Do not worry about the error with epoll_wait.c. This is just because the backend is currently idle. Also, do not worry about all the “No debugging symbols found” messages. But, you should make sure that this:

Reading symbols from /usr/local/pgsql/bin/postgres...

is successful, meaning that it is not followed by a:

(No debugging symbols found in /usr/local/pgsql/bin/postgres)

Next, do this:

handle SIGPIPE SIGUSR1 SIGUSR2 SIGALRM nostop noprint pass

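so that gdb passes these signals on to the backend without stopping or printing a message every time one arrives; Postgres processes routinely receive signals such as SIGUSR1 during normal operation. Finally, add your breakpoints. If you add a breakpoint to a core part of Postgres, it should be added right away. For example: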

(gdb) b exec_simple_query
Breakpoint 1 at 0x588373babda3: file postgres.c, line 1018.

However, if you add a breakpoint to e.g., the PL/pgSQL part of the code, like exec_stmt_execsql, it may appear that gdb can't find any debug symbols:

(gdb) b exec_stmt_execsql
Function "exec_stmt_execsql" not defined.
Make breakpoint pending on future shared library load? (y or [n])

Again, don't worry about this at this point. This happens because the PL/pgSQL module is an extension and is installed as such.3 In practice, this means that it's built and loaded as a shared library.4 At this point in your GDB session, assuming you have not issued any queries that would involve PL/pgSQL, it is not loaded yet. So, you can make the breakpoint pending.
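To avoid retyping these gdb commands in every session, they can be kept in a command file and passed to gdb with -x. The following is a sketch; write_gdb_commands and the file name pg.gdb are arbitrary, and the breakpoints are just the two functions used as examples in this article.

```shell
# Sketch: write a gdb command file with the settings from this section.
write_gdb_commands() {
  cat > "$1" <<'EOF'
set pagination off
set breakpoint pending on
handle SIGPIPE SIGUSR1 SIGUSR2 SIGALRM nostop noprint pass
break exec_simple_query
break exec_stmt_execsql
EOF
}

# Usage:
#   write_gdb_commands pg.gdb
#   sudo gdb -x pg.gdb /usr/local/pgsql/bin/postgres <PID>
```

With set breakpoint pending on, gdb accepts the PL/pgSQL breakpoint without asking and resolves it once the shared library is loaded.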

To make sure this works, add a breakpoint to exec_simple_query and then hit c or:

(gdb) continue

Now, in the terminal that runs psql, issue something simple like:

test=# SELECT 1;

The query should now appear to hang in psql because, hopefully, the backend has stopped at your breakpoint in the terminal running gdb. And that's pretty much it! One useful note is that usually you'll want to run a whole .sql file as input while debugging. I've found that the -f flag to psql doesn't work for this, so instead I start psql, get the PID, start gdb, and then load the file with \i in psql.

Debugging in VS Code

As you'd expect, debugging in VS Code is slightly harder, but again, here's something that has worked for me. We will use gdbserver because, as we saw earlier, to have a successful debugging session, gdb needs to start with sudo. Doing that in the terminal is trivial, but doing it inside VS Code is a pain. So, we will start a GDB server with sudo in the terminal, and then make VS Code communicate with that. This is easier than it sounds. First, make sure gdbserver is installed, e.g.:

sudo apt-get install gdbserver

Then, start psql and grab the PID as before. In a third terminal, do this:

sudo gdbserver --attach :1234 <PID>

You should see:

Attached; pid = 246440
Listening on port 1234

Open the Postgres source code in VS Code and add a breakpoint anywhere you like, as long as it is in code that will be reached by the query we will execute in just a second. Go to the “Run and Debug” tab, and create a launch.json as you would for any debugging session with VS Code. Use this:

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Attach via gdbserver",
      "type": "cppdbg",
      "request": "launch",
      "program": "/usr/local/pgsql/bin/postgres",
      "cwd": "${workspaceFolder}",
      "MIMode": "gdb",
      "miDebuggerPath": "/usr/bin/gdb",
      "miDebuggerServerAddress": "localhost:1234",
      "setupCommands": [
        { "text": "-enable-pretty-printing" },
        { "text": "set pagination off" },
        { "text": "handle SIGALRM SIGUSR1 SIGPIPE nostop noprint pass" }
      ]
    }
  ]
}

Now, click Run. You may see some errors, but don't worry. In the terminal where you're running gdbserver you should now see something like:

Remote debugging from host 127.0.0.1, port 55466

Finally, go to the psql terminal and issue a query. Hopefully VS Code will stop at the breakpoint once you do. For me, stopping the debugger also terminates the psql session, for reasons that are unclear to me, but this is not a big deal.

Debugging in Single-User Mode

Hopefully you will not need the information in this section, but it's a workaround if debugging the standard Postgres execution doesn't work. “Single-User Mode”, as it's called, runs as a single process, has a single user, a single database, and a single executable. Everything happens in one terminal. You can start it with:

/usr/local/pgsql/bin/postgres --single -D /usr/local/pgsql/data test

Note that not only do we need to pass the --single flag, but we also need to pass the database to use, in this case test. This should take us to a prompt like:

backend>

As I said, the benefit of this mode is that everything happens with one executable, postgres. The executable psql is not involved. This lets us debug it as a plain simple executable with GDB... almost. The only issue is how to feed it a .sql file to run, which is useful when debugging as we don't want to retype the code for every iteration. The problem arises from the fact that in this mode we can only issue code by typing it directly, or... piping it. Long story short, I've had a sort of quick iteration time with the following script:

gdb -ex 'set breakpoint pending on' \
    -ex 'b exec_stmt_execsql' \
    -ex 'run --single -D /usr/local/pgsql/data test < ./regression-sl.sql' \
    -ex 'layout src' \
    --args /usr/local/pgsql/bin/postgres

You should be careful with the files you pipe in. In single-user mode, Postgres reads the input line by line and treats each newline as the end of a command, so things that normally work may not work here. For example, this will not work:

CREATE FUNCTION func() RETURNS VOID AS $$
BEGIN
  SELECT SUM(o_totalprice) FROM orders;
END
$$
LANGUAGE PLPGSQL;

Instead, you should define it in a single line, like this:

CREATE FUNCTION func() RETURNS VOID AS $$BEGIN SELECT SUM(o_totalprice) FROM orders; END$$ LANGUAGE plpgsql;
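An alternative to flattening by hand: the Postgres documentation for single-user mode says a command can continue across lines if every newline except the last is preceded by a backslash. The sketch below adds those backslashes with sed. It assumes the input holds exactly one statement, add_line_continuations is a hypothetical name, and I have not checked it against the gdb script above.

```shell
# Sketch: append a backslash to every line except the last, so that
# single-user mode reads the whole statement as one command.
# Assumes the input contains exactly one SQL statement.
add_line_continuations() {
  sed '$!s/$/\\/'
}

# Usage (func.sql is a hypothetical one-statement file):
#   add_line_continuations < func.sql > func-single-user.sql
```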

Printf vs. debugger debugging has been argued for years in the software community. I'm not interested in this debate simply because I find both useful in different scenarios. I will only add my two cents when it comes to debugging Postgres. As with most things I do, most of what I do with Postgres has to do with performance. To figure out if a query is faster than another, you can't use a debug build simply because you won't get accurate numbers. Thus, most of the time I use an optimized build and only add prints here and there to see roughly which code paths are taken. Only when I really need to take a deep look at the code paths the program is taking do I switch to a debug build. In short, prints are useful at least some of the time, so let's learn how to add them.

Postgres is written in C, and you might think that you can simply add printfs around the code. The problem with that, however, is that the output will appear in the terminal running postgres, which is not ideal because the postgres executable by default logs all kinds of other information there that is usually useless for iteration and clutters the output. Instead, you can use elog(INFO, "..."), whose messages are sent to the client and so show up in the psql terminal, e.g.:

elog(INFO, "STEF44: ncols: %d\n", ncols);
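A distinctive prefix such as STEF44 also makes the prints easy to isolate when a script produces a lot of output. A minimal sketch, assuming the messages arrive on psql's standard error as lines like "INFO:  STEF44: ..."; only_tagged and queries.sql are hypothetical names.

```shell
# Sketch: keep only the output lines that contain a given tag.
only_tagged() {
  # -F treats the tag as a fixed string; --line-buffered prints matches
  # as soon as they arrive instead of waiting for the pipe to fill.
  grep --line-buffered -F -- "$1"
}

# Usage:
#   /usr/local/pgsql/bin/psql -U postgres -d test < queries.sql 2>&1 \
#     | only_tagged STEF44
```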