BUG #10123: Weird entries in pg_stat_activity - Mailing list pgsql-bugs

From maxim.boguk@gmail.com
Subject BUG #10123: Weird entries in pg_stat_activity
Date
Msg-id 20140424101827.2714.39486@wrigleys.postgresql.org
Whole thread Raw
Responses Re: BUG #10123: Weird entries in pg_stat_activity  (Magnus Hagander <magnus@hagander.net>)
Re: BUG #10123: Weird entries in pg_stat_activity  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      10123
Logged by:          Maxim Boguk
Email address:      maxim.boguk@gmail.com
PostgreSQL version: 9.2.8
Operating system:   Linux
Description:

Hi everyone,

During debug of the weird nagios database alerts about long transactions I
found that sometime pg_stat_activity contains very weird entries with
xact_start NOT NULL and state='idle'... what's more they happened during
backend idle state.

To check what's going on I performed the following actions on production
database:
1)set log_min_duration_statement=0 to see ALL database queries
2)run single line bash script to check for weird entries:
while true; do  psql -x -X -c "select now() as ts,now()-xact_start as
tx_age, now()-state_change as change_age,* from pg_stat_activity where
state='idle' and xact_start is not null" |  grep -vE '^$|No rows'; sleep
0.1; done

And yes, once-twice-trice per minute script output some interesting results
such as:


-[ RECORD 1

]----+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ts               | 2014-04-24 13:49:03.498354+04
tx_age           | 00:04:01.283106
change_age       | 00:04:01.279373
datid            | 16385
datname          | *******
pid              | 685
usesysid         | 16384
usename          | *******
application_name |
client_addr      | 127.0.0.1
client_hostname  |
client_port      | 37461
backend_start    | 2014-04-24 13:30:03.275711+04
xact_start       | 2014-04-24 13:45:02.215248+04
query_start      | 2014-04-24 13:45:02.215248+04
state_change     | 2014-04-24 13:45:02.218981+04
waiting          | f
state            | idle
query            |
                 |                     SELECT cache.save_response(
                 |                      *****
                 |                     );


State=idle
xact_start=4 min ago

Now what I found in the full database log for this pid:

...

2014-04-24 13:45:02 MSK 685 ****@**** from 127.0.0.1 [vxid:24/0 txid:0]
[SELECT] LOG:  duration: 3.703 ms  statement:
                            SELECT cache.save_response(...)

-- no queries between 13:45 and 13:50 at all!

2014-04-24 13:50:01 MSK 685 ****@**** from 127.0.0.1 [vxid:24/0 txid:0]
[SELECT] LOG:  duration: 0.962 ms  statement: select 1
2014-04-24 13:50:01 MSK 685 ****@**** from 127.0.0.1 [vxid:24/0 txid:0]
[SELECT] LOG:  duration: 0.415 ms  statement: SELECT * FROM o_queries_queue
...

So it seems that sometime pg_stat_activity entry become alive with the old
information (it shows xact_start from the last executed query in backend,
even if that query finished long time ago).
I know about asynchronous behavior of pg_stat_activity but getting
xact_start from deep past seems wrong for me.

What's more: i tested that problem on another databases - and found they
have same effect (so it doesn't look like a problem local to that single
exact database).

Kind Regards,
Maksym

pgsql-bugs by date:

Previous
From: Claudio Nieder
Date:
Subject: Possible transaction bug with isolation level serialisable?
Next
From: Magnus Hagander
Date:
Subject: Re: BUG #10123: Weird entries in pg_stat_activity