Measuring pg_stat_statements overhead
Topics: postgresql
`pg_stat_statements` is a fantastically useful PostgreSQL extension that collects execution statistics for all statements executed by the server, allowing the database administrator to monitor and analyze any possible performance issue. For good reason, it is one of the most installed extensions, if not the most installed one, and even a cursory internet search will reveal numerous sources extolling its virtues.
Of course, collecting those statistics has some small performance overhead, which is widely reported to be negligible. There's nothing wrong with verifying those claims ourselves, however: hardware and software differences can matter, after all, and the hardware I'm running this database on is not exactly typical.
The target machine
The particular machine I'm most interested in measuring the overhead on is a repurposed older thin client, so by server standards, it's laughably puny. It features such cutting-edge hardware as:
- A dual-core AMD Jaguar system-on-chip, clocked at 1.65 GHz.
Fun fact: AMD Jaguar is the CPU microarchitecture powering two of the eighth-generation video game consoles; outside of embedded applications, its use is otherwise rare.
- 4 GB of DDR3 RAM, at 1600 MT/s, single-channel. The SoC doesn't support more memory channels; at least it supports ECC.
- 128 GB SATA SSD.
Unlike typical server hardware, it is also completely silent, being 100% passively cooled, and draws very little power, even at full load.
Now that I think about it, compared to some of the smaller cloud instances, this is honestly quite beefy: there are no noisy neighbors to worry about, and the storage performance is much higher than what basic cloud storage offers, unless you explicitly provision enough throughput and I/O operations per second, which of course costs you extra… huh, I'm starting to think that the entire cloud thing may be a bit of a raw deal.
Software-wise, there's nothing special: just a regular Debian bookworm with PostgreSQL 15.2 installed from Debian's repositories. The PostgreSQL settings are quite vanilla as well; I didn't do anything interesting in `postgresql.conf` apart from the usual stuff like changing `shared_buffers`, `random_page_cost` and some logging settings.
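For reference, the kind of `postgresql.conf` tweaks meant here might look like the following; the exact values are my guesses for a 4 GB machine, not the ones actually used, since the post doesn't list them:

```ini
# Hypothetical values, not the actual settings from this machine
shared_buffers = 1GB                    # the 128MB default is far too small
random_page_cost = 1.1                  # SSD storage, random reads are cheap
log_min_duration_statement = 250ms      # log unusually slow statements
```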
The benchmark
What I'm most interested in is the worst-case overhead I can realistically expect: assume all statements are simple and do very little, if any, I/O. For this reason, I used `pgbench` with a scale factor of 40 and in select-only mode. With a scale factor this small, the database fits entirely not just in RAM, but also in PostgreSQL's shared buffers, while `pgbench`'s select-only mode uses a simple, but still realistic, `SELECT` statement. While one could argue that something like `SELECT 1` would be even simpler, it does not quite satisfy the criterion of being a realistic query a program would execute against the database.
With that in mind, let's run `pgbench` with two client connections (one per thread) for 5 minutes, using prepared statements, then restart the server with `pg_stat_statements` enabled and do it again.
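Concretely, the two runs boil down to something like this; the database name `bench` and the service name are stand-ins of mine, since the post doesn't specify them:

```shell
# Create and populate the test database at scale factor 40
pgbench -i -s 40 bench

# Baseline: 2 clients, 2 threads, 5 minutes, prepared statements, select-only
pgbench -c 2 -j 2 -T 300 -M prepared -S bench

# Enable pg_stat_statements (loading it requires a server restart), then repeat
psql -d bench -c "ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements'"
sudo systemctl restart postgresql
psql -d bench -c "CREATE EXTENSION pg_stat_statements"
pgbench -c 2 -j 2 -T 300 -M prepared -S bench
```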
Results of the `SELECT`-only run:

| pg_stat_statements state | Transactions per second | Mean latency | Standard deviation | Total transactions |
|---|---|---|---|---|
| off | 12235 | 0.162 ms | 0.051 ms | 3,670,487 |
| on | 11922 | 0.167 ms | 0.057 ms | 3,576,490 |
We can see that enabling `pg_stat_statements` resulted in a 2.6% drop in transaction throughput and a 3.1% increase in mean transaction latency. Well worth it, considering the utility of this extension, and in line with the expectations of it being negligible.
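As a sanity check, those percentages can be recomputed from the table with a quick `awk` one-liner:

```shell
# Overhead of pg_stat_statements in the select-only run,
# computed from the numbers in the table above
awk 'BEGIN {
    tps_off = 12235; tps_on = 11922     # transactions per second
    lat_off = 0.162; lat_on = 0.167     # mean latency, ms
    printf "throughput drop:  %.1f%%\n", (tps_off - tps_on) / tps_off * 100
    printf "latency increase: %.1f%%\n", (lat_on - lat_off) / lat_off * 100
}'
```

which prints a 2.6% throughput drop and a 3.1% latency increase.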
We could also try using `pgbench`'s default transaction type, to see what happens if we add inserts and updates, albeit still simple ones, into the mix. Those operations perform more work than pure selects, for example checking foreign key constraints, updating indexes and appending every data modification to the write-ahead log to ensure crash resilience, so the time spent on collecting statistics should be proportionally smaller.
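The invocation is the same as before, just without `-S`, which switches `pgbench` back to its default TPC-B-like transaction mix (assuming the same stand-in database name as above):

```shell
# Default TPC-B-like transaction: three updates, a select and an insert
pgbench -c 2 -j 2 -T 300 -M prepared bench
```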
| pg_stat_statements state | Transactions per second | Mean latency | Standard deviation | Total transactions |
|---|---|---|---|---|
| off | 1402.8 | 1.423 ms | 0.415 ms | 420,849 |
| on | 1384.2 | 1.443 ms | 0.424 ms | 415,244 |
Indeed, the overhead is even smaller now: a 1.3% drop in throughput and a 1.4% increase in mean latency.
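As a closing aside, once a benchmark run is done, the statistics `pg_stat_statements` collected can be inspected directly; a sketch, assuming the extension was created in the same stand-in `bench` database (the `*_exec_time` column names are the ones used since PostgreSQL 13):

```shell
# Show the five most expensive statements by total execution time
psql -d bench -c "
    SELECT substr(query, 1, 40) AS query,
           calls,
           round(mean_exec_time::numeric, 3) AS mean_ms
    FROM pg_stat_statements
    ORDER BY total_exec_time DESC
    LIMIT 5"
```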