Thread: outdated (bad) information in pg_stat_activity

outdated (bad) information in pg_stat_activity

From
hubert depesz lubaczewski
Date:
hi
my pg_stat_activity is raporting process working on a query for 12 hours straight! (query_start).
2 problems:
1. this particular query runs typically between 80 and 200 miliseconds!
2. procpid of this backend does not exists anymore in system

i found this:
http://www.issociate.de/board/post/264560/possibly_outdated_info_in_pg_stat_activity.html

is there any workaround? what can be done to fix the data - i am using pg_stat_activity a lot, and this thing really ruined my day.

best regards

depesz

Re: outdated (bad) information in pg_stat_activity

From
Tom Lane
Date:
hubert depesz lubaczewski <depesz@gmail.com> writes:
> my pg_stat_activity is raporting process working on a query for 12 hours
> straight! (query_start).
> 2 problems:
> 1. this particular query runs typically between 80 and 200 miliseconds!
> 2. procpid of this backend does not exists anymore in system

It's possible that the "backend exited" message got dropped by the stats
mechanism --- the stats code is deliberately designed to lose messages
rather than delay the rest of the system, so it's not 100% reliable.

            regards, tom lane

Re: outdated (bad) information in pg_stat_activity

From
hubert depesz lubaczewski
Date:
On 12/14/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
hubert depesz lubaczewski <depesz@gmail.com> writes:
> my pg_stat_activity is raporting process working on a query for 12 hours
> straight! (query_start).
> 2 problems:
> 1. this particular query runs typically between 80 and 200 miliseconds!
> 2. procpid of this backend does not exists anymore in system
It's possible that the "backend exited" message got dropped by the stats
mechanism --- the stats code is deliberately designed to lose messages
rather than delay the rest of the system, so it's not 100% reliable.

is it possible to "by hand" remove bad entries from this stats?

depesz

Re: outdated (bad) information in pg_stat_activity

From
Chris Browne
Date:
depesz@gmail.com (hubert depesz lubaczewski) writes:
> On 12/14/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>
>           hubert depesz lubaczewski <depesz@gmail.com> writes:
>      > my pg_stat_activity is raporting process working on a query for 12 hours
>      > straight! (query_start).
>      > 2 problems:
>      > 1. this particular query runs typically between 80 and 200 miliseconds!
>      > 2. procpid of this backend does not exists anymore in system
>      It's possible that the "backend exited" message got dropped by the stats
>      mechanism --- the stats code is deliberately designed to lose messages
>      rather than delay the rest of the system, so it's not 100% reliable.
>
>
>
> is it possible to "by hand" remove bad entries from this stats?
> depesz

The only answer I have been able to come to is that restarting the
postmaster will clear this all up.

If there is some less intrusive way of accomplishing this, I'd be keen
on hearing about it...
--
"cbbrowne","@","cbbrowne.com"
http://cbbrowne.com/info/nonrdbms.html
"Funny, the  only thing that makes me  go Keanu about Microsoft is the
fact  that they are constantly  behind the times  and yet  claim to be
innovating." -- Steve Lamb <morpheus@despair.rpglink.com>

Re: outdated (bad) information in pg_stat_activity

From
"Gregory S. Williamson"
Date:
I too have had issues with "stuck" entries in the pg_stat_activity view -- the real pid is long gone but postgres still
thinksthe process is there. 

It would be real sweet to have a way of cleaning this table with bringing postgres off-line.

My $0.02 worth ...

Greg Williamson
DBA
GlobeXplorer LLC

-----Original Message-----
From:    pgsql-general-owner@postgresql.org on behalf of Chris Browne
Sent:    Wed 12/14/2005 11:06 AM
To:    pgsql-general@postgresql.org
Cc:
Subject:    Re: [GENERAL] outdated (bad) information in pg_stat_activity
depesz@gmail.com (hubert depesz lubaczewski) writes:
> On 12/14/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>
>           hubert depesz lubaczewski <depesz@gmail.com> writes:
>      > my pg_stat_activity is raporting process working on a query for 12 hours
>      > straight! (query_start).
>      > 2 problems:
>      > 1. this particular query runs typically between 80 and 200 miliseconds!
>      > 2. procpid of this backend does not exists anymore in system
>      It's possible that the "backend exited" message got dropped by the stats
>      mechanism --- the stats code is deliberately designed to lose messages
>      rather than delay the rest of the system, so it's not 100% reliable.
>
>
>
> is it possible to "by hand" remove bad entries from this stats?
> depesz

The only answer I have been able to come to is that restarting the
postmaster will clear this all up.

If there is some less intrusive way of accomplishing this, I'd be keen
on hearing about it...
--
"cbbrowne","@","cbbrowne.com"
http://cbbrowne.com/info/nonrdbms.html
"Funny, the  only thing that makes me  go Keanu about Microsoft is the
fact  that they are constantly  behind the times  and yet  claim to be
innovating." -- Steve Lamb <morpheus@despair.rpglink.com>

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org

!DSPAM:43a07aed148321697067737!






Re: outdated (bad) information in pg_stat_activity

From
"Jim C. Nasby"
Date:
On Wed, Dec 14, 2005 at 02:02:02PM -0800, Gregory S. Williamson wrote:
> I too have had issues with "stuck" entries in the pg_stat_activity view -- the real pid is long gone but postgres
stillthinks the process is there. 
>
> It would be real sweet to have a way of cleaning this table with bringing postgres off-line.

Rather than trying to clean things up by hand, could the stats system
instead periodically check to make sure that all the PIDs it knows about
actually still exist? I think that should be a pretty cheap check to
perform...

> Greg Williamson
> DBA
> GlobeXplorer LLC

On a side note, is GlobeXplorer using PostgreSQL? Would they be willing
to let us publicize that fact? Better yet, would they be willing to do a
case study?
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: outdated (bad) information in pg_stat_activity

From
"Gregory S. Williamson"
Date:
Jim Nasby wrote:

> > I too have had issues with "stuck" entries in the pg_stat_activity view -- > the real pid is long gone but postgres
stillthinks the process is there. 
> >
> > It would be real sweet to have a way of cleaning this table with bringing > postgres off-line.
>
> Rather than trying to clean things up by hand, could the stats system
> instead periodically check to make sure that all the PIDs it knows about
> actually still exist? I think that should be a pretty cheap check to
> perform...

This would certainly work for me, even as a function that a DBA might call manually.
<...>
> On a side note, is GlobeXplorer using PostgreSQL? Would they be willing
> to let us publicize that fact? Better yet, would they be willing to do a
> case study?

We are indeed using postgres (and postGIS) for both runtime data access and billing and other misc. data processing
requirements. 

We're be delighted to be publicized as happy users -- the transition from Informix was fairly smooth and performance is
solid. 

Feel free to contact me off-list if you like for more substantive comments. I am gsw @ globexplorer.com

Greg W.


Re: outdated (bad) information in pg_stat_activity

From
hubert depesz lubaczewski
Date:
On 12/14/05, Chris Browne <cbbrowne@acm.org> wrote:
The only answer I have been able to come to is that restarting the
postmaster will clear this all up.

this is actually not an option for me.
of course i can make myself a function which will check if pid exists, but it would generally be better if there was some way to fix the problem in postgresql itself.
a series of delete's from some specific place perhaps?

depesz

Re: outdated (bad) information in pg_stat_activity

From
Tom Lane
Date:
hubert depesz lubaczewski <depesz@gmail.com> writes:
> of course i can make myself a function which will check if pid exists, but
> it would generally be better if there was some way to fix the problem in
> postgresql itself.

Try updating to PG 8.1 ...

2005-08-09 17:14  tgl

    * src/backend/postmaster/pgstat.c: Make backends that are reading
    the pgstats file verify each backend PID against the PGPROC array.
    Anything in the file that isn't in PGPROC gets rejected as being a
    stale entry.  This should solve complaints about stale entries in
    pg_stat_activity after a BETERM message has been dropped due to
    overload.

            regards, tom lane

Re: outdated (bad) information in pg_stat_activity

From
hubert depesz lubaczewski
Date:
On 12/16/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
2005-08-09 17:14  tgl
        * src/backend/postmaster/pgstat.c: Make backends that are reading
        the pgstats file verify each backend PID against the PGPROC array.
        Anything in the file that isn't in PGPROC gets rejected as being a
        stale entry.  This should solve complaints about stale entries in
        pg_stat_activity after a BETERM message has been dropped due to
        overload.

must have missed that one. thanks - we are moving to 8.1 anyway (50-150% increase in performance for some specific tasks!).

depesz

Re: outdated (bad) information in pg_stat_activity

From
Harry Jackson
Date:
On 12/17/05, hubert depesz lubaczewski <depesz@gmail.com> wrote:
>  must have missed that one. thanks - we are moving to 8.1 anyway (50-150%
> increase in performance for some specific tasks!).

For which tasks in particular are you seeing this king of gain?

Harry

--
http://www.hjackson.org
http://www.uklug.co.uk

Re: outdated (bad) information in pg_stat_activity

From
hubert depesz lubaczewski
Date:
On 12/17/05, Harry Jackson <harryjackson@gmail.com> wrote:
On 12/17/05, hubert depesz lubaczewski <depesz@gmail.com> wrote:
>  must have missed that one. thanks - we are moving to 8.1 anyway (50-150%
> increase in performance for some specific tasks!).
For which tasks in particular are you seeing this king of gain?

so called "imports".
we get data from external source, check it's id number, fetch the same object from database (if it exists), compare, update changes. if it didn't exist earlier - insert.
such a procedure "touched" directly 7 tables, and about 20 other using triggers and other means of automation.

depesz