I have found a memory leak in contrib/pg_stat_statements that occurs when the query text file (pgss_query_texts.stat) contains an invalid byte sequence. Each call to pg_stat_statements leaks the entire malloc'd file buffer and fails to release the held LWLock.

PostgreSQL version: Discovered against PostgreSQL 15.12; verified also present in PG18 (installed via Homebrew). The affected code path in pg_stat_statements_internal() is unchanged between these versions.

Platform: macOS 15.7.3 (aarch64).

Steps to reproduce:
Enable pg_stat_statements and populate it with a large number of structurally unique queries (I used 2000 unique CTE-based queries with identifiers padded to 63 characters each). This creates a query text file of approximately 600 KB.
Corrupt the query text file by injecting a null byte at an arbitrary offset (I used byte offset 500). This can be done with:

    printf '\x00' | dd of=<data_directory>/pg_stat_tmp/pgss_query_texts.stat bs=1 seek=500 count=1 conv=notrunc
Verify that querying pg_stat_statements now returns: ERROR: invalid byte sequence for encoding "UTF8": 0x00
In a single psql session, repeatedly query pg_stat_statements (I ran SELECT count(*) FROM pg_stat_statements 2000 times) while monitoring the backend process RSS using ps -o rss= -p <backend_pid>.
Output I got:

The backend's RSS grows linearly with each failing query. With a 600 KB query text file and 2000 iterations, the backend's RSS grew by approximately 1.2 GB. The per-error leak is approximately equal to the query text file size (600 KB), confirming the file buffer is leaked on every call. Sample RSS measurements over time:
0 seconds: 67 MB
8 seconds: 153 MB
20 seconds: 370 MB
38 seconds: 739 MB
50 seconds: 1028 MB
54 seconds: 1251 MB
Output I expected:

RSS should remain approximately constant across the failing queries. Each call should either succeed or fail cleanly without leaking memory. The LWLock should always be released regardless of whether the function succeeds or errors out.

Root cause:

In pg_stat_statements_internal() in pg_stat_statements.c, the function acquires pgss->lock via LWLockAcquire() and may allocate a file buffer via qtext_load_file() (which uses malloc). Inside the hash table iteration loop, pg_any_to_server() is called to convert each stored query text to the server encoding. If the query text file contains an invalid encoding (such as a null byte), pg_any_to_server() calls ereport(ERROR), which performs a longjmp out of the function. The cleanup code at the bottom of the function that calls LWLockRelease() and free(qbuffer) is never reached. On every subsequent call, the entire file buffer is leaked again, and the LWLock release is skipped.

Proposed fix:

Wrap the hash table iteration loop in PG_TRY/PG_FINALLY so that LWLockRelease(pgss->lock) and free(qbuffer) execute even when pg_any_to_server() throws an encoding error. This is a minimal change: no new allocations, no behavioral change on the success path. It only adds cleanup protection on the error path.

Gaurav Singh
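
For illustration, the control flow described under "Root cause" and "Proposed fix" can be modeled outside PostgreSQL with plain setjmp/longjmp. This is only a sketch, not the actual patch: every name in it (raise_error, convert_text, load_file, scan_unprotected, scan_protected, call_and_catch, live_bytes, lock_held) is hypothetical, and it deliberately ignores whatever cleanup the backend's own error recovery performs. It shows only why cleanup placed after the loop is skipped when the conversion step jumps out, and how a try/finally-shaped wrapper restores it.

```c
/* Standalone model of the control flow; none of this is PostgreSQL code. */
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static jmp_buf *error_context;  /* innermost active handler (hypothetical) */
static size_t   live_bytes;     /* bytes allocated and not yet freed */
static bool     lock_held;      /* stand-in for pgss->lock */

/* Stand-in for ereport(ERROR): jump to the innermost handler. */
static void raise_error(void)
{
    longjmp(*error_context, 1);
}

/* Stand-in for pg_any_to_server(): reject any embedded NUL byte. */
static void convert_text(const char *text, size_t len)
{
    if (memchr(text, '\0', len) != NULL)
        raise_error();
}

/* Stand-in for qtext_load_file(): a malloc'd buffer with one bad byte. */
static char *load_file(size_t file_size)
{
    assert(file_size > 0);
    char *buf = malloc(file_size);
    assert(buf != NULL);
    memset(buf, 'x', file_size);
    buf[file_size / 2] = '\0';      /* the injected corruption */
    live_bytes += file_size;
    return buf;
}

/* Current shape: cleanup sits after the work and is skipped on error. */
static void scan_unprotected(size_t file_size)
{
    char *qbuffer = load_file(file_size);
    lock_held = true;

    convert_text(qbuffer, file_size);   /* may longjmp past the cleanup */

    lock_held = false;
    free(qbuffer);
    live_bytes -= file_size;
}

/* Proposed shape: run the cleanup on both paths, then re-raise. */
static void scan_protected(size_t file_size)
{
    char *qbuffer = load_file(file_size);
    jmp_buf *saved = error_context;
    jmp_buf local;
    bool failed = false;

    lock_held = true;
    if (setjmp(local) == 0)
    {
        error_context = &local;
        convert_text(qbuffer, file_size);
    }
    else
        failed = true;              /* arrived here via longjmp */

    /* "finally" section: always executed */
    error_context = saved;
    lock_held = false;
    free(qbuffer);
    live_bytes -= file_size;

    if (failed)
        raise_error();              /* propagate to the outer handler */
}

/* Stand-in for the caller that catches the error; true means it failed. */
static bool call_and_catch(void (*fn)(size_t), size_t file_size)
{
    jmp_buf top;
    jmp_buf *saved = error_context;

    if (setjmp(top) != 0)
    {
        error_context = saved;
        return true;
    }
    error_context = &top;
    fn(file_size);
    error_context = saved;
    return false;
}
```

Calling call_and_catch(scan_unprotected, n) repeatedly makes live_bytes grow by n per call and leaves lock_held set, which mirrors the linear RSS growth reported above. With scan_protected, each call still reports the error to its caller, but live_bytes returns to its starting value and lock_held is cleared.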