Thread: pg 8.1.2 performance issue

pg 8.1.2 performance issue

From

"Ed L."

Date:

26 March 2006, 00:36:07

I have a performance riddle, hoping someone can point me in a
helpful direction.  We have a pg 8.1.2 cluster using
Apache::Sessions and experiencing simple UPDATEs taking
sometimes 30+ seconds to do a very simply update, no foreign
keys, no triggers:

        Table "public.sessions"
  Column   |     Type      | Modifiers
-----------+---------------+-----------
 id        | character(32) | not null
 a_session | text          |
Indexes:
    "sessions_pkey" PRIMARY KEY, btree (id)


This is on an HP ia64 11.23 box with what appears to be gobs of
surplus CPU, I/O (it's on a SAN), and RAM, pretty high query
volume from 180 concurrent client connections.

Wondering if it is a locking issue, I set up logging to capture
existing locks every 10s with this query:

SELECT now(), dbu.usename as locker, l.mode as locktype, CASE
l.granted WHEN true THEN 'granted' ELSE 'pending' END as status,
pg_stat_get_backend_pid(S.backendid)
as pid, l.transaction as xid, db.datname||'.'||n.nspname||'.'||
r.relname as relation, case l.mode when 'AccessShareLock' then 1
when 'RowShareLock' then 2 when 'Row
ExclusiveLock' then 3 when 'ShareUpdateExclusiveLock' then 4 when
'ShareLock' then 5 when 'ShareRowExclusiveLock' then 6 when
'ExclusiveLock' then 7 else 100 end as
 exclusivity, pg_stat_get_backend_activity(S.backendid) as query
FROM pg_user dbu,
     (SELECT pg_stat_get_backend_idset() AS backendid) AS S,
     pg_database db, pg_locks l, pg_class r, pg_namespace n
WHERE db.oid = pg_stat_get_backend_dbid(S.backendid)
  AND dbu.usesysid = pg_stat_get_backend_userid(S.backendid)
  AND l.pid = pg_stat_get_backend_pid(S.backendid)
  AND l.relation = r.oid
  AND l.database = db.oid
  AND r.relnamespace = n.oid
ORDER BY exclusivity DESC, db.datname, n.nspname, r.relname,
l.mode;"


I see what appear to be many single transactions holding
RowExclusiveLocks for sometimes 40-50 seconds while their query
shows "<IDLE> in transaction".

 2006-03-25 20:04:01.063873-08 | www | RowExclusiveLock | granted
| 17192 | 270205914 | db1.public.sessions |           3 | <IDLE>
in transaction
 2006-03-25 20:04:11.128632-08 | www | RowExclusiveLock | granted
| 17192 | 270205914 | db1.public.sessions          |           3
| <IDLE> in transaction
 2006-03-25 20:04:21.215896-08 | www | RowExclusiveLock | granted
| 17192 | 270205914 | db1.public.sessions          |           3
| <IDLE> in transaction

I'm thinking that means the client is simply tweaking a row and
then failing to commit the change for 40-50 seconds.  Is that
consistent?  Is there something else obvious here to explain the
delays?  Does this sound like a browser stop button issue where
they may be aborting the query that has the lock, and then
issuing another that waits on the first?  Other suggestions?

Thanks,
Ed

Re: pg 8.1.2 performance issue

From

"Ed L."

Date:

26 March 2006, 00:49:55

On Saturday March 25 2006 9:36 pm, Ed L. wrote:
> I have a performance riddle, hoping someone can point me in a
> helpful direction.  We have a pg 8.1.2 cluster using
> Apache::Sessions and experiencing simple UPDATEs taking
> sometimes 30+ seconds to do a very simply update, no foreign
> keys, no triggers:
>
>         Table "public.sessions"
>   Column   |     Type      | Modifiers
> -----------+---------------+-----------
>  id        | character(32) | not null
>  a_session | text          |
> Indexes:
>     "sessions_pkey" PRIMARY KEY, btree (id)

The table has 6800 rows over 18000 pages, and is getting a
minimum of many tens of thousands of updates per day with
queries like this:

    UPDATE sessions SET a_session = ? WHERE id = ?

Ed

>
>
> This is on an HP ia64 11.23 box with what appears to be gobs
> of surplus CPU, I/O (it's on a SAN), and RAM, pretty high
> query volume from 180 concurrent client connections.
>
> Wondering if it is a locking issue, I set up logging to
> capture existing locks every 10s with this query:
>
> SELECT now(), dbu.usename as locker, l.mode as locktype, CASE
> l.granted WHEN true THEN 'granted' ELSE 'pending' END as
> status, pg_stat_get_backend_pid(S.backendid)
> as pid, l.transaction as xid,
> db.datname||'.'||n.nspname||'.'|| r.relname as relation, case
> l.mode when 'AccessShareLock' then 1 when 'RowShareLock' then
> 2 when 'Row
> ExclusiveLock' then 3 when 'ShareUpdateExclusiveLock' then 4
> when 'ShareLock' then 5 when 'ShareRowExclusiveLock' then 6
> when 'ExclusiveLock' then 7 else 100 end as
>  exclusivity, pg_stat_get_backend_activity(S.backendid) as
> query FROM pg_user dbu,
>      (SELECT pg_stat_get_backend_idset() AS backendid) AS S,
>      pg_database db, pg_locks l, pg_class r, pg_namespace n
> WHERE db.oid = pg_stat_get_backend_dbid(S.backendid)
>   AND dbu.usesysid = pg_stat_get_backend_userid(S.backendid)
>   AND l.pid = pg_stat_get_backend_pid(S.backendid)
>   AND l.relation = r.oid
>   AND l.database = db.oid
>   AND r.relnamespace = n.oid
> ORDER BY exclusivity DESC, db.datname, n.nspname, r.relname,
> l.mode;"
>
>
> I see what appear to be many single transactions holding
> RowExclusiveLocks for sometimes 40-50 seconds while their
> query shows "<IDLE> in transaction".
>
>  2006-03-25 20:04:01.063873-08 | www | RowExclusiveLock |
> granted
>
> | 17192 | 270205914 | db1.public.sessions |           3 |
> | <IDLE>
>
> in transaction
>  2006-03-25 20:04:11.128632-08 | www | RowExclusiveLock |
> granted
>
> | 17192 | 270205914 | db1.public.sessions          |
> | 3 <IDLE> in transaction
>
>  2006-03-25 20:04:21.215896-08 | www | RowExclusiveLock |
> granted
>
> | 17192 | 270205914 | db1.public.sessions          |
> | 3 <IDLE> in transaction
>
> I'm thinking that means the client is simply tweaking a row
> and then failing to commit the change for 40-50 seconds.  Is
> that consistent?  Is there something else obvious here to
> explain the delays?  Does this sound like a browser stop
> button issue where they may be aborting the query that has the
> lock, and then issuing another that waits on the first?  Other
> suggestions?
>
> Thanks,
> Ed

Re: pg 8.1.2 performance issue

From

"Ed L."

Date:

26 March 2006, 00:53:07

On Saturday March 25 2006 9:49 pm, Ed L. wrote:
> On Saturday March 25 2006 9:36 pm, Ed L. wrote:
> > I have a performance riddle, hoping someone can point me in
> > a helpful direction.  We have a pg 8.1.2 cluster using
> > Apache::Sessions and experiencing simple UPDATEs taking
> > sometimes 30+ seconds to do a very simply update, no foreign
> > keys, no triggers:
> >
> >         Table "public.sessions"
> >   Column   |     Type      | Modifiers
> > -----------+---------------+-----------
> >  id        | character(32) | not null
> >  a_session | text          |
> > Indexes:
> >     "sessions_pkey" PRIMARY KEY, btree (id)
>
> The table has 6800 rows over 18000 pages, and is getting a
> minimum of many tens of thousands of updates per day with
> queries like this:
>
>     UPDATE sessions SET a_session = ? WHERE id = ?

Sorry, it's getting late.  Yes, the table has been analyzed and
explain shows an index scan is occurring.

Ed

Re: pg 8.1.2 performance issue

From

"chris smith"

Date:

26 March 2006, 00:55:25

On 3/26/06, Ed L. <pgsql@bluepolka.net> wrote:
> On Saturday March 25 2006 9:36 pm, Ed L. wrote:
> > I have a performance riddle, hoping someone can point me in a
> > helpful direction.  We have a pg 8.1.2 cluster using
> > Apache::Sessions and experiencing simple UPDATEs taking
> > sometimes 30+ seconds to do a very simply update, no foreign
> > keys, no triggers:
> >
> >         Table "public.sessions"
> >   Column   |     Type      | Modifiers
> > -----------+---------------+-----------
> >  id        | character(32) | not null
> >  a_session | text          |
> > Indexes:
> >     "sessions_pkey" PRIMARY KEY, btree (id)
>
> The table has 6800 rows over 18000 pages, and is getting a
> minimum of many tens of thousands of updates per day with
> queries like this:

If you're updating that much, how often are you running 'analyze'? Are
you running autovacuum? How often?

--
Postgresql & php tutorials
http://www.designmagick.com/

Re: pg 8.1.2 performance issue

From

"Ed L."

Date:

26 March 2006, 10:22:47

On Saturday March 25 2006 9:55 pm, chris smith wrote:
> On 3/26/06, Ed L. <pgsql@bluepolka.net> wrote:
> > On Saturday March 25 2006 9:36 pm, Ed L. wrote:
> > > I have a performance riddle, hoping someone can point me
> > > in a helpful direction.  We have a pg 8.1.2 cluster using
> > > Apache::Sessions and experiencing simple UPDATEs taking
> > > sometimes 30+ seconds to do a very simply update, no
> > > foreign keys, no triggers:
> >
> > The table has 6800 rows over 18000 pages, and is getting a
> > minimum of many tens of thousands of updates per day with
> > queries like this:
>
> If you're updating that much, how often are you running
> 'analyze'? Are you running autovacuum? How often?

I count on the built-in autovacuum to do do analyzes (per 8.1.2
docs).  I'm running autovacuum with the following non-default
parameters:

    autovacuum = on
    autovacuum_naptime = 600 # 10 minutes

With our set of 4 DBs, that amounts to once every 40 minutes for
the given database.  I see "LOG:  autovacuum: processing
database "xyz"" in the log, but I do not see any analyze/vacuum
commands being issued at all (does it log when it
analyzes/vacuums?).

Thanks,
Ed

Re: pg 8.1.2 performance issue

From

"Ed L."

Date:

26 March 2006, 10:27:27

On Sunday March 26 2006 7:22 am, Ed L. wrote:
> On Saturday March 25 2006 9:55 pm, chris smith wrote:
> > On 3/26/06, Ed L. <pgsql@bluepolka.net> wrote:
> > > On Saturday March 25 2006 9:36 pm, Ed L. wrote:
> > > > I have a performance riddle, hoping someone can point me
> > > > in a helpful direction.  We have a pg 8.1.2 cluster
> > > > using Apache::Sessions and experiencing simple UPDATEs
> > > > taking sometimes 30+ seconds to do a very simply update,
> > > > no foreign keys, no triggers:
> > >
> > > The table has 6800 rows over 18000 pages, and is getting a
> > > minimum of many tens of thousands of updates per day with
> > > queries like this:
> >
> > If you're updating that much, how often are you running
> > 'analyze'? Are you running autovacuum? How often?
>
> I count on the built-in autovacuum to do do analyzes (per
> 8.1.2 docs).  I'm running autovacuum with the following
> non-default parameters:
>
>     autovacuum = on
>     autovacuum_naptime = 600 # 10 minutes

I also have these non-default settings:

    autovacuum_vacuum_cost_delay = 500
    autovacuum_vacuum_cost_limit = 200

Thanks,
Ed

Re: pg 8.1.2 performance issue

From

Tom Lane

Date:

26 March 2006, 12:15:06

"Ed L." <pgsql@bluepolka.net> writes:
> With our set of 4 DBs, that amounts to once every 40 minutes for
> the given database.  I see "LOG:  autovacuum: processing
> database "xyz"" in the log, but I do not see any analyze/vacuum
> commands being issued at all (does it log when it
> analyzes/vacuums?).

Not at the default logging level --- I think you have to go to DEBUG1
or DEBUG2 to get info about per-table actions.  (This ought to be
rethought, likely, as that will also clutter your log with a ton of
extraneous info...)

            regards, tom lane

Re: pg 8.1.2 performance issue

From

Tom Lane

Date:

26 March 2006, 12:16:30

"Ed L." <pgsql@bluepolka.net> writes:
> I see what appear to be many single transactions holding
> RowExclusiveLocks for sometimes 40-50 seconds while their query
> shows "<IDLE> in transaction".
> ...
> I'm thinking that means the client is simply tweaking a row and
> then failing to commit the change for 40-50 seconds.  Is that
> consistent?

That's what it sounds like to me.  You might consider logging all
commands from these clients for awhile so you can check that theory.

            regards, tom lane

Re: pg 8.1.2 performance issue

From

"Matthew T. O'Connor"

Date:

26 March 2006, 12:27:45

>>>> The table has 6800 rows over 18000 pages, and is getting a
>>>> minimum of many tens of thousands of updates per day with
>>>> queries like this:
>>> If you're updating that much, how often are you running
>>> 'analyze'? Are you running autovacuum? How often?
>> I count on the built-in autovacuum to do do analyzes (per
>> 8.1.2 docs).  I'm running autovacuum with the following
>> non-default parameters:
>>
>>     autovacuum = on
>>     autovacuum_naptime = 600 # 10 minutes
>
> I also have these non-default settings:
>
>     autovacuum_vacuum_cost_delay = 500
>     autovacuum_vacuum_cost_limit = 200

Not totally sure, but it sounds like the table isn't getting vacuumed
often enough.  To help, you might reduce the naptime from 10 minutes to
5.  But I think the cost_delay settings might be the larger problem.  If
I remember correctly, even small values here tend to greatly increase
the time it takes vacuum commands to complete, so you might try backing
down those settings.  Can you tell from the log files how often
autovacuum is actually taking actions, or how long those actions are taking?

Does anyone out there have any empirical data on good
autovacuum_*_cost_* settings?  I would be curious to hear about them.

Re: pg 8.1.2 performance issue

From

"Jim C. Nasby"

Date:

27 March 2006, 08:49:22

On Sun, Mar 26, 2006 at 11:27:33AM -0500, Matthew T. O'Connor wrote:
> >>>>The table has 6800 rows over 18000 pages, and is getting a
> >>>>minimum of many tens of thousands of updates per day with
> >>>>queries like this:
> >>>If you're updating that much, how often are you running
> >>>'analyze'? Are you running autovacuum? How often?
> >>I count on the built-in autovacuum to do do analyzes (per
> >>8.1.2 docs).  I'm running autovacuum with the following
> >>non-default parameters:
> >>
> >>    autovacuum = on
> >>    autovacuum_naptime = 600 # 10 minutes
> >
> >I also have these non-default settings:
> >
> >    autovacuum_vacuum_cost_delay = 500
> >    autovacuum_vacuum_cost_limit = 200
>
> Not totally sure, but it sounds like the table isn't getting vacuumed
> often enough.  To help, you might reduce the naptime from 10 minutes to

Just look at the number of rows an the number of pages; the table has a
huge amount of bloat, and all the indexes will as well. Best bet at this
point to get things under control is a VACUUM FULL and a REINDEX.

> 5.  But I think the cost_delay settings might be the larger problem.  If
> I remember correctly, even small values here tend to greatly increase
> the time it takes vacuum commands to complete, so you might try backing
> down those settings.  Can you tell from the log files how often
> autovacuum is actually taking actions, or how long those actions are taking?
>
> Does anyone out there have any empirical data on good
> autovacuum_*_cost_* settings?  I would be curious to hear about them.

Even 5 minutes is too long to sleep in this case; I'd go with something
closer to 10 seconds. The default threasholds won't work either; I'd cut
autovacuum_*_(scale_factor|threshold) in half. And yes, that cost delay
is way too large, I'd go with 20-50, given that you've got "gobs of IO".

Finally, PostgreSQL just flat-out doesn't handle things like webapp
session tables well at all, because of the high update volume. You
should consider other alternatives. Someone did point me to a session
manager that allows you to do most work in memory, only flushing to
permanent storage (ie: the database) periodically, that would probably
be a good option for you. Another possibility is using something like
SQLite just for storing the session info (though I think it's MVCC based
as well, so it might have just as much difficulty with this as
PostgreSQL does).
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: pg 8.1.2 performance issue

From

"Ed L."

Date:

28 March 2006, 12:40:48

On Sunday March 26 2006 9:14 am, Tom Lane wrote:
> "Ed L." <pgsql@bluepolka.net> writes:
> > With our set of 4 DBs, that amounts to once every 40 minutes
> > for the given database.  I see "LOG:  autovacuum: processing
> > database "xyz"" in the log, but I do not see any
> > analyze/vacuum commands being issued at all (does it log
> > when it
> > analyzes/vacuums?).
>
> Not at the default logging level --- I think you have to go to
> DEBUG1 or DEBUG2 to get info about per-table actions.  (This
> ought to be rethought, likely, as that will also clutter your
> log with a ton of extraneous info...)

Yes, I'd definitely like to see those vacuums and analyzes logged
similarly to user-generated queries.

Ed

Re: pg 8.1.2 performance issue

From

"Ed L."

Date:

31 March 2006, 14:51:27

On Sunday March 26 2006 9:16 am, Tom Lane wrote:
> "Ed L." <pgsql@bluepolka.net> writes:
> > I see what appear to be many single transactions holding
> > RowExclusiveLocks for sometimes 40-50 seconds while their
> > query shows "<IDLE> in transaction".
> > ...
> > I'm thinking that means the client is simply tweaking a row
> > and then failing to commit the change for 40-50 seconds.  Is
> > that consistent?
>
> That's what it sounds like to me.  You might consider logging
> all commands from these clients for awhile so you can check
> that theory.

This indeed appears to be locking problem from within
Apache::Session where it deletes a row from the DB but fails to
commit the change for an extended period while another
transaction waits on the same row.  Not sure how/why that's
happening in the client code, but it's not a DB issue.

Thanks,
Ed

Re: pg 8.1.2 performance issue

From

Vivek Khera

Date:

03 April 2006, 13:33:29

On Mar 31, 2006, at 1:51 PM, Ed L. wrote:

> This indeed appears to be locking problem from within
> Apache::Session where it deletes a row from the DB but fails to
> commit the change for an extended period while another

And you should read well the notes in the Pg driver for
Apache::Session where it requires you to define your transaction model.

Ie, this is pilot error more than Apache::Session problem.
Apache::Session works extremely well with Postgres.

Re: pg 8.1.2 performance issue

From

Bruce Momjian

Date:

10 April 2006, 18:13:29

Tom Lane wrote:
> "Ed L." <pgsql@bluepolka.net> writes:
> > With our set of 4 DBs, that amounts to once every 40 minutes for
> > the given database.  I see "LOG:  autovacuum: processing
> > database "xyz"" in the log, but I do not see any analyze/vacuum
> > commands being issued at all (does it log when it
> > analyzes/vacuums?).
>
> Not at the default logging level --- I think you have to go to DEBUG1
> or DEBUG2 to get info about per-table actions.  (This ought to be
> rethought, likely, as that will also clutter your log with a ton of
> extraneous info...)

Added to TODO:

    * Allow log_min_messages to be specified on a per-module basis

      This would allow administrators to see more detailed information from
      specific sections of the backend, e.g. checkpoints, autovacuum, etc.

--
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +