On 8/15/22 1:58 PM, Nathan Bossart wrote:
> Hi hackers,
>
> As Greg Stark noted elsewhere [0], it is presently very difficult to
> identify the PID of the session using a temporary schema, which is
> particularly unfortunate when a temporary table is putting a cluster in
> danger of transaction ID wraparound. I noted [1] that the following query
> can be used to identify the PID for a given backend ID:
>
> SELECT bid, pg_stat_get_backend_pid(bid) AS pid FROM pg_stat_get_backend_idset() bid;
>
> But on closer inspection, this is just plain wrong. The backend IDs
> returned by pg_stat_get_backend_idset() might initially bear some
> resemblance to the backend IDs stored in PGPROC, so my suggested query
> might work some of the time, but the pg_stat_get_backend_* backend IDs
> typically diverge from the PGPROC backend IDs as sessions connect and
> disconnect.
I didn't review the patch itself yet, but I'd like to chime in with a
big "+1" for the idea. I've had several past experiences getting called
to help in situations where a database was getting close to wraparound
and the culprit was a temp table blocking vacuum. I went down this same
trail of pg_stat_get_backend_idset() and I can attest that it did work
once or twice, but it didn't work other times.
AFAIK, in PostgreSQL today, there's really no way to reliably get the
PID of the session holding particular temp tables. (The idea of
iterating through backends with gdb and trying to find & dump some
obscure data structure seems completely impractical for regular
production ops.)
I'll take a look at the patch if I can... and I'm hopeful that we're
able to move this idea forward and get this little gap in PG filled once
and for all!
-Jeremy
--
Jeremy Schneider
Database Engineer
Amazon Web Services