Thread: pg_stat_get_backend_last_activity() ???
Is there any notion of pg_stat_get_backend_last_activity()?  I've got
a rogue process someplace and I'm having a devil of a time tracking
down which process is not committing its transaction and letting the
rest of the world continue to access the tables.  Based on the last
query performed, I'd have some idea of where to hunt and could figure
this out, but as best as I can tell, the only option available is to
set up a query log and cruise through it with grep/less.  Am I missing
something more programmatic and obvious?

-sc

PS: the archives didn't turn up anything for 'idle in transaction' ???

-- 
Sean Chittenden
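One low-tech starting point, sketched here with BSD-style ps flags since
the backtraces later in the thread suggest FreeBSD: backends rewrite
their process titles, so an idle-in-transaction backend is often visible
straight from ps, no query log needed.

    # Hedged sketch -- relies on the backend's updated process title
    # being visible to ps, which is platform-dependent.
    ps axww | grep -i 'idle in transaction'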
Sean Chittenden <sean@chittenden.org> writes:
> Is there any notion of pg_stat_get_backend_last_activity()? I've got
> a rogue process someplace and I'm having a devil of a time tracking
> down which process is not committing its transaction and letting the
> rest of the world continue to access the tables.

Perhaps look in pg_locks to see who's holding the oldest open
transaction number, or who's got the most locks that other people are
waiting for?  That should get you at least the PID of the offender.
Then use ps and/or pg_stat_activity to figure out what the offender is
doing.

			regards, tom lane
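A sketch of the pg_locks approach in SQL; the column names used here
(pid and transaction in pg_locks, procpid and current_query in
pg_stat_activity) are the 7.x-era ones, and current_query only shows
anything once stats_command_string is turned on.

    -- Every backend with an open transaction holds a lock on its own
    -- transaction ID, so this lists the candidates together with
    -- whatever the stats collector knows about them.
    SELECT l.pid, l.transaction, l.mode, l.granted,
           a.usename, a.current_query
      FROM pg_locks l
      LEFT JOIN pg_stat_activity a ON a.procpid = l.pid
     WHERE l.transaction IS NOT NULL;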
> > Is there any notion of pg_stat_get_backend_last_activity()? I've
> > got a rogue process someplace and I'm having a devil of a time
> > tracking down which process is not committing its transaction and
> > letting the rest of the world continue to access the tables.
>
> Perhaps look in pg_locks to see who's holding the oldest open
> transaction number, or who's got the most locks that other people
> are waiting for? That should get you at least the PID of the
> offender. Then use ps and/or pg_stat_activity to figure out what
> the offender is doing.

I can already identify that.  Using the PID of the backend + sockstat
on both the DB server and on the remote www server I can get the PID
of the webserver process causing the problem, but I have no clue which
customer, which page, which query, etc.  The www process is doing
something along the lines of:

BEGIN;
SELECT....
[waits forever]
END;

I could easily grep and figure out what the query is if I knew what
the prior query was, because that'd give me some indication as to
which page it was that was sitting there pouting inside of a
transaction.  Attaching GDB to random httpd processes isn't my idea of
fun. :-/

The other problem is that with the frequency being around once every
week or so, it's one of those largely annoying things that isn't too
critical in its frequency, but just critical enough to spend some
cycles on here and there.

-sc

-- 
Sean Chittenden
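The sockstat legwork described above looks roughly like this on
FreeBSD; the backend PID and client port below are placeholders:

    # On the DB server: which client address/port does backend 4242 talk to?
    sockstat -4 | grep 4242
    # On the www server: which httpd owns the matching source port?
    sockstat -4 | grep ':51833'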
Sean Chittenden <sean@chittenden.org> writes:
> I can already identify that. Using the PID of the backend + sockstat
> on both the DB server and on the remote www server I can get the PID
> of the webserver process causing the problem, but I have no clue which
> customer, which page, which query, etc. The www process is doing
> something along the lines of:

> BEGIN;
> SELECT....
> [waits forever]
> END;

Oh, you want to know the actual current query.  Turn on
stats_command_string or whatever it's called.  See the statistics
collector documentation.

If this is after-the-fact (ie, you want to know about a process that's
hung right now), attach to it with a debugger and print out
debug_query_string.

			regards, tom lane
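Spelled out, both suggestions look something like the following;
stats_command_string is the GUC name of this era, and
procpid/current_query are the 7.x pg_stat_activity column names:

    # postgresql.conf (then restart, or reload the config)
    stats_command_string = true

    -- afterwards, each backend's current command shows up in the view:
    SELECT procpid, usename, current_query FROM pg_stat_activity;

    # after the fact, from gdb attached to the suspect backend's PID:
    (gdb) print debug_query_string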
> > I can already identify that. Using the PID of the backend +
> > sockstat on both the DB server and on the remote www server I can
> > get the PID of the webserver process causing the problem, but I
> > have no clue which customer, which page, which query, etc. The
> > www process is doing something along the lines of:
>
> > BEGIN;
> > SELECT....
> > [waits forever]
> > END;
>
> Oh, you want to know the actual current query. Turn on
> stats_command_string or whatever it's called. See the statistics
> collector documentation.
>
> If this is after-the-fact (ie, you want to know about a process
> that's hung right now), attach to it with a debugger and print out
> debug_query_string.

Alright, turned on stats_command_string and am seeing nothing in any
of the stats tables, and debug_query_string is null.

(gdb) p debug_query_string
$1 = 0
(gdb) bt
#0  0x283d0c58 in recvfrom () from /usr/lib/libc.so.4
#1  0x283c092b in recv () from /usr/lib/libc.so.4
#2  0x80f11c6 in secure_read ()
#3  0x80f4557 in StreamConnection ()
#4  0x80f4935 in pq_getbyte ()
#5  0x81377b0 in PostgresMain ()
#6  0x811a63c in PostmasterMain ()
#7  0x8119e5b in PostmasterMain ()
#8  0x811934b in PostmasterMain ()
#9  0x8118e94 in PostmasterMain ()
#10 0x80f53d7 in main ()
#11 0x8069b3e in _start ()

Any ideas on how I can figure out what the mystery query/app/lib is
that's causing my backend to hang with a zillion connections that are
all idle?

-sc

-- 
Sean Chittenden
Sean Chittenden <sean@chittenden.org> writes:
> (gdb) bt
> #0  0x283d0c58 in recvfrom () from /usr/lib/libc.so.4
> #1  0x283c092b in recv () from /usr/lib/libc.so.4
> #2  0x80f11c6 in secure_read ()
> #3  0x80f4557 in StreamConnection ()
> #4  0x80f4935 in pq_getbyte ()
> #5  0x81377b0 in PostgresMain ()
> #6  0x811a63c in PostmasterMain ()
> #7  0x8119e5b in PostmasterMain ()
> #8  0x811934b in PostmasterMain ()
> #9  0x8118e94 in PostmasterMain ()
> #10 0x80f53d7 in main ()
> #11 0x8069b3e in _start ()

> Any ideas on how I can figure out what the mystery query/app/lib is
> that's causing my backend to hang with a zillion connections that are
> all idle?

Well, that particular backend is not hung: the stack trace proves it's
waiting for a command from its frontend.  Perhaps you need to look
through your other Postgres processes.

			regards, tom lane
> > (gdb) bt > > #0 0x283d0c58 in recvfrom () from /usr/lib/libc.so.4 > > #1 0x283c092b in recv () from /usr/lib/libc.so.4 > > #2 0x80f11c6 in secure_read () > > #3 0x80f4557 in StreamConnection () > > #4 0x80f4935 in pq_getbyte () > > #5 0x81377b0 in PostgresMain () > > #6 0x811a63c in PostmasterMain () > > #7 0x8119e5b in PostmasterMain () > > #8 0x811934b in PostmasterMain () > > #9 0x8118e94 in PostmasterMain () > > #10 0x80f53d7 in main () > > #11 0x8069b3e in _start () > > > Any ideas on how I can figure out what the mystery query/app/lib > > is that's causing my backend to hang with a zillion connections > > that are all idle? > > Well, that particular backend is not hung: the stack trace proves it's > waiting for a command from its frontend. Perhaps you need to look > through your other Postgres processes. Sorry, hang is the wrong word. Block is more correct. I know it's waiting for commands from the front end, but I'm trying to figure out what front end/query/lib/page is causing the problem. It's only once in few thousand queries that it happens and I'm going crazy trying to figure out what it is that's causing this. I've set stats_command_string to true and attached gdb to the backend, but that's null. Is there another var in the backend that contains a pointer to the last query performed? I haven't found it if there is... -sc -- Sean Chittenden
> > > Any ideas on how I can figure out what the mystery query/app/lib
> > > is that's causing my backend to hang with a zillion connections
> > > that are all idle?
> >
> > Well, that particular backend is not hung: the stack trace proves it's
> > waiting for a command from its frontend. Perhaps you need to look
> > through your other Postgres processes.
>
> Sorry, hang is the wrong word. Block is more correct. I know it's
> waiting for commands from the front end, but I'm trying to figure out
> what front end/query/lib/page is causing the problem. It's only once
> in a few thousand queries that it happens and I'm going crazy trying
> to figure out what it is that's causing this. I've set
> stats_command_string to true and attached gdb to the backend, but
> that's null. Is there another var in the backend that contains a
> pointer to the last query performed? I haven't found it if there
> is... -sc

Hrm, guess this'll work:

log_pid = true
log_statement = true
log_duration = true
log_timestamp = true

Lower tech, but it seems to work well enough.  I changed pg_ctl to
gzip my error logs, though, otherwise the log grows too quickly. :-/

- "$po_path" ${1+"$@"} ${PGDATAOPTS+$PGDATAOPTS} </dev/null >>$logfile 2>&1 &
+ "$po_path" ${1+"$@"} ${PGDATAOPTS+$PGDATAOPTS} </dev/null | $gzip -${ziplevel} -c >>$logfile 2>&1 &

-sc

-- 
Sean Chittenden
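With that logging in place, fishing the culprit back out of the log is
a grep away.  A hedged sketch: the PID and filename are placeholders,
the bracketed-PID pattern assumes the prefix that log_pid/log_timestamp
produce, and zgrep is only needed because of the gzip wrapper above.

    # Everything backend 4242 logged, most recent last -- the final
    # statement before it went quiet is the one that opened the stuck
    # transaction.
    zgrep '\[4242\]' server.log.gz | tail -n 40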