Thread: Stale entries in pg_stat_activity
PostgreSQL 8.0.0beta3 (CVS) Solaris 9 pg_stat_activity sometimes shows old connections that have already terminated. I've been able to reproduce this somewhat consistently by doing the following: createdb test pgbench -i test sudo /etc/init.d/postgresql restart pgbench -S -c 30 -t 20 test psql -c 'SELECT * from pg_stat_activity' (Sorry for the long lines, but on a wide terminal the output is easier to read than if I had used -x to enable expanded output.) datid | datname | procpid | usesysid | usename | current_query | query_start -------+---------+---------+----------+----------+-------------------------------------------------+------------------------------- 38894 | test | 28497 | 1 | postgres | <IDLE> | 2004-10-16 20:40:17.247942-06 38894 | test | 28494 | 1 | postgres | <IDLE> | 2004-10-16 20:40:17.247759-06 38894 | test | 28487 | 1 | postgres | <IDLE> | 2004-10-16 20:40:16.185687-06 38894 | test | 28481 | 1 | postgres | <IDLE> | 2004-10-16 20:40:17.120745-06 38894 | test | 28479 | 1 | postgres | <IDLE> | 2004-10-16 20:40:16.185514-06 38894 | test | 28477 | 1 | postgres | <IDLE> | 2004-10-16 20:40:17.119938-06 38894 | test | 28470 | 1 | postgres | select abalance from accounts where aid = 32673 | 2004-10-16 20:40:17.248236-06 38894 | test | 28462 | 1 | postgres | <IDLE> | 2004-10-16 20:40:16.200115-06 38894 | test | 28463 | 1 | postgres | <IDLE> | 2004-10-16 20:40:17.248175-06 (9 rows) This is a test box with no other activity. ps shows that the listed processes don't exist and netstat shows no open TCP or Unix socket connections to the database. Before running pgbench I restart the backend to be sure it has a clean slate. The number of rows varies from test to test -- sometimes there are none and I've seen as many as 19 with the pgbench options shown. Sometimes current_query shows <IDLE> for all connections and sometimes it shows a query for some or all of them. Any ideas? -- Michael Fuhr http://www.fuhr.org/~mfuhr/
Michael Fuhr <mike@fuhr.org> writes: > pg_stat_activity sometimes shows old connections that have already > terminated. Define "old"? There's a built-in delay of up to half a second (if memory serves) in the statistics views. regards, tom lane
On Sun, Oct 17, 2004 at 12:01:45AM -0400, Tom Lane wrote: > Michael Fuhr <mike@fuhr.org> writes: > > pg_stat_activity sometimes shows old connections that have already > > terminated. > > Define "old"? There's a built-in delay of up to half a second (if > memory serves) in the statistics views. Fourteen minutes and counting. I just noticed the problem this evening so I don't know how long the stale connections remain in the view. During my testing I was restarting the backend frequently to start each run afresh; I'll leave the current situation alone for a while and see what happens. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
On Sat, Oct 16, 2004 at 10:30:12PM -0600, Michael Fuhr wrote: > On Sun, Oct 17, 2004 at 12:01:45AM -0400, Tom Lane wrote: > > Michael Fuhr <mike@fuhr.org> writes: > > > pg_stat_activity sometimes shows old connections that have already > > > terminated. > > > > Define "old"? There's a built-in delay of up to half a second (if > > memory serves) in the statistics views. > > Fourteen minutes and counting. Two hours and counting. I've also reproduced the problem using pgbench against 8.0.0beta3 on FreeBSD 4.10-STABLE. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
Michael Fuhr <mike@fuhr.org> writes: >>>> pg_stat_activity sometimes shows old connections that have already >>>> terminated. >>> >>> Define "old"? There's a built-in delay of up to half a second (if >>> memory serves) in the statistics views. >> >> Fourteen minutes and counting. > Two hours and counting. I've also reproduced the problem using > pgbench against 8.0.0beta3 on FreeBSD 4.10-STABLE. The other likely theory is that the stats subsystem is under sufficiently heavy load that it drops a certain proportion of stats messages (as it is designed to do, rather than causing foreground activity to block). If it were to drop a backend-termination message then of course that connection would remain depicted in pg_stat_activity until the stats collector figures out that the backend is gone. Which IIRC would normally happen when some new backend re-uses its BackendId slot. You could check for log messages "statistics buffer is full" as a partial check of this theory. This is only partial because it isn't a direct test of whether messages are being dropped. regards, tom lane