pg_stat_statements is a fantastically useful PostgreSQL extension that collects execution statistics for all statements executed by the server, allowing the database administrator to monitor and analyze possible performance issues. For good reason, it is one of the most installed extensions, if not the most installed one, and even a cursory internet search reveals numerous sources extolling its virtues.
Of course, collecting those statistics has some small performance overhead, which is widely reported to be negligible. There's nothing wrong with checking the veracity of those claims ourselves, however: hardware and software differences can matter, after all, and the hardware I'm running this database on is not exactly usual.
The target machine
The particular machine I'm most interested in measuring the overhead on is a repurposed older thin client, so by server standards, it's laughably puny. It features such cutting-edge hardware as:
- A dual-core AMD Jaguar system-on-chip, clocked at 1.65 GHz.
Fun fact: AMD Jaguar is the CPU microarchitecture powering two of the eighth-generation video game consoles; its use outside of embedded spaces is otherwise rare.
- 4 GB of DDR3 RAM, at 1600 MT/s, single-channel. The SoC doesn't support more memory channels; at least it supports ECC.
- 128 GB SATA SSD.
Unlike typical server hardware, it is also completely silent, being 100% passively cooled, and draws very little power, even at full load.
Now that I think about it, compared to some of the smaller cloud instances this is honestly quite beefy: there are no noisy neighbors to worry about, and the storage performance is much higher than what basic cloud storage offers, unless you explicitly provision enough throughput and I/O operations per second, which of course costs extra… huh, I'm starting to think the entire cloud thing may be a bit of a raw deal.
Software-wise, there's nothing special: just a regular Debian bookworm with PostgreSQL 15.2 installed from Debian's repositories. The PostgreSQL settings are quite vanilla as well; I didn't do anything interesting in `postgresql.conf` apart from the usual stuff like changing `random_page_cost` and some logging settings.
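For reference, loading the extension later only takes a line or two in `postgresql.conf` plus a restart. A sketch — the module name is the essential part; the two `pg_stat_statements.*` settings shown are just its defaults, spelled out for illustration:

```ini
# postgresql.conf — load the statistics module at server start
shared_preload_libraries = 'pg_stat_statements'

# optional knobs (values shown are the defaults)
pg_stat_statements.max = 5000    # number of distinct statements tracked
pg_stat_statements.track = top   # track top-level statements only
```

After restarting, running `CREATE EXTENSION pg_stat_statements;` in the target database creates the view to query the collected statistics from.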
What I'm most interested in is the worst-case overhead I can realistically expect: assume all statements are simple and do very little, if any, I/O. For this reason, I used `pgbench` with a scale factor of 40 and in select-only mode. With a scale factor this small, the database fits entirely not just in RAM, but also in PostgreSQL's shared buffers, while `pgbench`'s select-only mode uses a simple, but still realistic, `SELECT` statement. While one could argue that something like `SELECT 1` would be even simpler, it does not quite satisfy the criterion of being a realistic query a program would execute against the database.
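Initializing the test database at that scale is a single command. A sketch — the database name `pgbench` is my assumption, not something from the setup above:

```shell
# create and populate the pgbench tables at scale factor 40
# (roughly 600 MB of data, well within 4 GB of RAM)
createdb pgbench
pgbench -i -s 40 pgbench
```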
With that in mind, let's run `pgbench` with two client connections — one per thread — for 5 minutes, using prepared statements, then restart the server with `pg_stat_statements` enabled and do it again.
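The two runs map to something like the following — a sketch whose flag values mirror the setup described above:

```shell
# baseline: select-only, 2 clients, 2 threads, 5 minutes, prepared statements
pgbench -c 2 -j 2 -T 300 -M prepared -S pgbench

# enable pg_stat_statements (shared_preload_libraries in postgresql.conf),
# restart, and repeat the identical run
sudo systemctl restart postgresql
pgbench -c 2 -j 2 -T 300 -M prepared -S pgbench
```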
[Chart: Transactions per second]
We can see that enabling `pg_stat_statements` resulted in a 2.5% hit to transaction throughput and a 3% hit to mean transaction latency. Well worth it, considering the utility of this extension, and in line with the expectation of negligible overhead.
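As a taste of that utility: once the extension is installed, a query like this lists the most expensive statements. A sketch using the PostgreSQL 13+ column names (`total_exec_time`, `mean_exec_time`):

```sql
-- top 5 statements by total execution time
SELECT query,
       calls,
       round(total_exec_time::numeric, 2) AS total_ms,
       round(mean_exec_time::numeric, 2)  AS mean_ms
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;
```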
We could also try using `pgbench`'s default transaction type, to see what happens when we add inserts and updates, albeit still simple ones, into the mix. Those operations perform more work than pure selects, for example updating indexes and appending every data modification to the write-ahead log to ensure crash resilience, so the time spent on collecting statistics should be proportionally smaller.
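Under the same assumptions as before, that run only differs by dropping the select-only flag:

```shell
# default TPC-B-like mix: three UPDATEs, one SELECT, one INSERT per transaction
pgbench -c 2 -j 2 -T 300 -M prepared pgbench
```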
[Chart: Transactions per second]
Indeed, the overhead is even smaller now, with only a 1.4% drop in both throughput and mean latency.