Thread: unusual performance for vac following 8.2 upgrade

unusual performance for vac following 8.2 upgrade

From
Kim
Date:
Hello all!
Running a vac on an 8.2 client connecting to an 8.2 database (following
example was for a 14 row static table) - vacuums will sit (for lack of a
better word) for anywhere from ten to twenty minutes before taking a
lock out on the table and doing work there. Once the vacuum does
complete, I noticed that no timestamp is registered in
pg_stat_all_tables for that relation for the last-vacuum'd timestamp
(however analyze does seem to set it's timestamp). I asked it to run a
vacuum on an index (knowing it would fail out), and again, the vacuum
sat for several minutes before finally erroring out saying that it
couldn't vacuum an index. Out of curiosity I tried the vacuum on an 8.1
client connected to the 8.2 db, same delay.

In running a truss on the process while it is running, there is over
five minutes where the process seems to be scanning pg_class (at least
thats the only table listed in pg_locks for this process). Following
this it drops into a cycle of doing the same send() command with several
seconds lag between each one, and every so often it catches the same
interrupt (SIGUSR1) and then goes back into the same cycle of send()
calls. Also, whatever it is doing during this stage, it isn't checking
for process-cancelled interrupts, as the process won't recognize it's
been requested to cancel until it breaks out of this cycle of send()s
and SIGUSR1s (which can go for another several minutes). I'm happy to
send along the gore of the truss call if you think it would be helpful...

Any ideas what the vac is prepping for that it could become bogged down
in before finally taking the lock on the table?

Is the lack of a timestamp set for last_vacuum in pg_stat_all_tables an
indication that there may be something incomplete about our install?

Since the upgrade, we've also seen unusual lag time in simple inserts
into tables (atomic inserts have been seen running for several seconds),
and also extreme delays in running \d on tables (I got tired of counting
past 2 minutes, connecting with an 8.1 client gives immediate response
on this command). We plan to upgrade to 8.2.1 as soon as possible, and
also to drop into single user mode and run a reindex system, but any
suggestions in the meantime as to a potential cause or a way to further
debug the vacs would be greatly appreciated.

OS: Solaris 10
write transactions/hr: 1.5 million
size of pg_class: 535,226
number of relations: 108,694

Thanks to all,

Kim

Re: unusual performance for vac following 8.2 upgrade

From
Russell Smith
Date:
Kim wrote:

<snip>
> OS: Solaris 10
> write transactions/hr: 1.5 million
> size of pg_class: 535,226
> number of relations: 108,694
>
That is a huge pg_class.  I remember some discussion recently about
problems with 8.2 and the way it scans pg_class.  I also believe it's
fixed in 8.2.1.  Are you running that.  If not, I suggest you upgrade
and see if the fault still exists.

Regards

Russell Smith
> Thanks to all,
>
> Kim
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
>                http://www.postgresql.org/about/donate
>
>


Re: unusual performance for vac following 8.2 upgrade

From
Tom Lane
Date:
Kim <kim@myemma.com> writes:
> Running a vac on an 8.2 client connecting to an 8.2 database (following
> example was for a 14 row static table) - vacuums will sit (for lack of a
> better word) for anywhere from ten to twenty minutes before taking a
> lock out on the table and doing work there.

How big is this database (how many pg_class entries)?  What do you get
from "VACUUM VERBOSE pg_class"?  The truss results make it sound like
the problem is pgstat_vacuum_tabstat() taking a long time, but that code
has not changed since 8.1 ...

            regards, tom lane

Re: unusual performance for vac following 8.2 upgrade

From
Tom Lane
Date:
Kim <kim@myemma.com> writes:
> size of pg_class: 535,226
> number of relations: 108,694

Oh, I shoulda read all the way to the bottom of your email :-(.  What
version of PG were you running before?  I would think that pretty much
any version of pgstat_vacuum_tabstats would have had a performance issue
with pg_class that large.  Also, could we see

    select relkind, count(*) from pg_class group by relkind;

            regards, tom lane

Re: unusual performance for vac following 8.2 upgrade

From
Kim
Date:
Hey Tom,
We were running on 8.1.1 previous to upgrading to 8.2, and yes, we definitely have a heafty pg_class. The inheritance model is heavily used in our schema (the results of the group by you wanted to see are down below).  However, no significant problems were seen with vacs while we were on 8.1. Execution time for the vac seemed more linked to large table size and how active the table was with updates, rather than being universally over 10 minutes regardless of the vac's object. We will be doing an audit of our 8.2 install to try and make sure that it looks like a complete install, any tests you can think of that may further narrow things down for us?



 relkind | count
---------+--------
 v       |   1740
 t       |  49986
 c       |      4
 S       |     57
 r       | 108689
 i       | 374723
(6 rows)



Tom Lane wrote:
Kim <kim@myemma.com> writes: 
size of pg_class: 535,226
number of relations: 108,694   
Oh, I shoulda read all the way to the bottom of your email :-(.  What
version of PG were you running before?  I would think that pretty much
any version of pgstat_vacuum_tabstats would have had a performance issue
with pg_class that large.  Also, could we see
select relkind, count(*) from pg_class group by relkind;
		regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
              http://archives.postgresql.org
 

Re: unusual performance for vac following 8.2 upgrade

From
Tom Lane
Date:
Kim <kim@myemma.com> writes:
> We were running on 8.1.1 previous to upgrading to 8.2, and yes, we
> definitely have a heafty pg_class. The inheritance model is heavily used
> in our schema (the results of the group by you wanted to see are down
> below).  However, no significant problems were seen with vacs while we
> were on 8.1.

Odd, because the 8.1 code looks about the same, and it is perfectly
obvious in hindsight that its runtime is about O(N^2) in the number of
relations :-(.  At least that'd be the case if the stats collector
output were fully populated.  Did you have either stats_block_level or
stats_row_level turned on in 8.1?  If not, maybe the reason for the
change is that in 8.2, that table *will* be pretty fully populated,
because now it's got a last-vacuum-time entry that gets made even if the
stats are otherwise turned off.  Perhaps making that non-disablable
wasn't such a hot idea :-(.

What I think we need to do about this is

(1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking
of using a hash table for the OIDs instead of a linear list.  Should be
a pretty small change; I'll work on it today.

(2) Reconsider whether last-vacuum-time should be sent to the collector
unconditionally.

Comments from hackers?

            regards, tom lane

Re: [HACKERS] unusual performance for vac following 8.2 upgrade

From
Alvaro Herrera
Date:
Tom Lane wrote:

> What I think we need to do about this is
>
> (1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking
> of using a hash table for the OIDs instead of a linear list.  Should be
> a pretty small change; I'll work on it today.
>
> (2) Reconsider whether last-vacuum-time should be sent to the collector
> unconditionally.

(2) seems a perfectly reasonably answer, but ISTM (1) would be good to
have anyway (at least in HEAD).

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: unusual performance for vac following 8.2 upgrade

From
"Simon Riggs"
Date:
On Thu, 2007-01-11 at 14:45 -0500, Tom Lane wrote:
> Kim <kim@myemma.com> writes:
> > We were running on 8.1.1 previous to upgrading to 8.2, and yes, we
> > definitely have a heafty pg_class. The inheritance model is heavily used
> > in our schema (the results of the group by you wanted to see are down
> > below).  However, no significant problems were seen with vacs while we
> > were on 8.1.
>
> Odd, because the 8.1 code looks about the same, and it is perfectly
> obvious in hindsight that its runtime is about O(N^2) in the number of
> relations :-(.  At least that'd be the case if the stats collector
> output were fully populated.  Did you have either stats_block_level or
> stats_row_level turned on in 8.1?  If not, maybe the reason for the
> change is that in 8.2, that table *will* be pretty fully populated,
> because now it's got a last-vacuum-time entry that gets made even if the
> stats are otherwise turned off.  Perhaps making that non-disablable
> wasn't such a hot idea :-(.
>
> What I think we need to do about this is
>
> (1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking
> of using a hash table for the OIDs instead of a linear list.  Should be
> a pretty small change; I'll work on it today.
>
> (2) Reconsider whether last-vacuum-time should be sent to the collector
> unconditionally.
>
> Comments from hackers?

It's not clear to me how this fix will alter the INSERT issue Kim
mentions. Are those issues connected? Or are you thinking that handling
stats in a tight loop is slowing down other aspects of the system?

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] unusual performance for vac following 8.2 upgrade

From
Tom Lane
Date:
"Simon Riggs" <simon@2ndquadrant.com> writes:
> It's not clear to me how this fix will alter the INSERT issue Kim
> mentions.

I didn't say that it would; we have no information on the INSERT issue,
so I'm just concentrating on the problem that he did provide info on.

(BTW, I suppose the slow-\d issue is the regex planning problem we
already knew about.)

I'm frankly not real surprised that there are performance issues with
such a huge pg_class; it's not a regime that anyone's spent any time
optimizing.  It is interesting that 8.2 seems to have regressed but
I can think of several places that would've been bad before.  One is
that there are seqscans of pg_inherits ...

            regards, tom lane

Re: [HACKERS] unusual performance for vac following 8.2 upgrade

From
"Jim C. Nasby"
Date:
On Thu, Jan 11, 2007 at 04:49:28PM -0300, Alvaro Herrera wrote:
> Tom Lane wrote:
>
> > What I think we need to do about this is
> >
> > (1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking
> > of using a hash table for the OIDs instead of a linear list.  Should be
> > a pretty small change; I'll work on it today.
> >
> > (2) Reconsider whether last-vacuum-time should be sent to the collector
> > unconditionally.
>
> (2) seems a perfectly reasonably answer, but ISTM (1) would be good to
> have anyway (at least in HEAD).

Actually, I'd rather see the impact #1 has before adding #2... If #1
means we're good for even someone with 10M relations, I don't see much
point in #2.

BTW, we're now starting to see more users with a large number of
relations, thanks to partitioning. It would probably be wise to expand
test coverage for that case, especially when it comes to performance.
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: unusual performance for vac following 8.2 upgrade

From
Kim
Date:
For 8.1, we did have stats_block_level and stats_row_level on, so thats not it either :-/ However, I did go on to an alternate database of ours on the same machine, using the same install, same postmaster - that holds primarily static relations, and not many of those (16 relations total). The response of running a vac for a 1.3k static table was quick (6 seconds - but it still did not set the last_vacuum field). Not sure why we weren't seeing more probs with this on 8.1 for the full db, but from the looks of things I think your theory on the primary problem with our vacs is solid. I'm hoping we can fire up our old 8.1 dataset and run some tests on there to confirm/reject the idea that it was doing any better, but that will require quieter times on the machine than we've got right now :)

We are going to try and upgrade to 8.2.1 as soon as we can, and if we continue to see some of the other problems I mentioned as side-notes, we'll build some information on those and pass it along...

Thanks so much!

Kim


Tom Lane wrote:
Kim <kim@myemma.com> writes: 
We were running on 8.1.1 previous to upgrading to 8.2, and yes, we 
definitely have a heafty pg_class. The inheritance model is heavily used 
in our schema (the results of the group by you wanted to see are down 
below).  However, no significant problems were seen with vacs while we 
were on 8.1.   
Odd, because the 8.1 code looks about the same, and it is perfectly
obvious in hindsight that its runtime is about O(N^2) in the number of
relations :-(.  At least that'd be the case if the stats collector
output were fully populated.  Did you have either stats_block_level or
stats_row_level turned on in 8.1?  If not, maybe the reason for the
change is that in 8.2, that table *will* be pretty fully populated,
because now it's got a last-vacuum-time entry that gets made even if the
stats are otherwise turned off.  Perhaps making that non-disablable
wasn't such a hot idea :-(.

What I think we need to do about this is

(1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking
of using a hash table for the OIDs instead of a linear list.  Should be
a pretty small change; I'll work on it today.

(2) Reconsider whether last-vacuum-time should be sent to the collector
unconditionally.

Comments from hackers?
		regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
 

Re: unusual performance for vac following 8.2 upgrade

From
Tom Lane
Date:
I wrote:
> (2) Reconsider whether last-vacuum-time should be sent to the collector
> unconditionally.

Actually, now that I look, the collector already contains this logic:

    /*
     * Don't create either the database or table entry if it doesn't already
     * exist.  This avoids bloating the stats with entries for stuff that is
     * only touched by vacuum and not by live operations.
     */

and ditto for analyze messages.  So my idea that the addition of
last-vac-time was causing an increase in the statistics file size
compared to 8.1 seems wrong.

How large is your $PGDATA/global/pgstat.stat file, anyway?

            regards, tom lane

Re: [HACKERS] unusual performance for vac following 8.2 upgrade

From
Tom Lane
Date:
I wrote:
> What I think we need to do about this is
> (1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking
> of using a hash table for the OIDs instead of a linear list.  Should be
> a pretty small change; I'll work on it today.

I've applied the attached patch to 8.2 to do the above.  Please give it
a try and see how much it helps for you.  Some limited testing here
confirms a noticeable improvement in VACUUM startup time at 10000
tables, and of course it should be 100X worse with 100000 tables.

I am still confused why you didn't see the problem in 8.1, though.
This code is just about exactly the same in 8.1.  Maybe you changed
your stats collector settings when moving to 8.2?

            regards, tom lane

Index: pgstat.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.140
diff -c -r1.140 pgstat.c
*** pgstat.c    21 Nov 2006 20:59:52 -0000    1.140
--- pgstat.c    11 Jan 2007 22:32:30 -0000
***************
*** 159,164 ****
--- 159,165 ----
  static void pgstat_read_statsfile(HTAB **dbhash, Oid onlydb);
  static void backend_read_statsfile(void);
  static void pgstat_read_current_status(void);
+ static HTAB *pgstat_collect_oids(Oid catalogid);

  static void pgstat_setheader(PgStat_MsgHdr *hdr, StatMsgType mtype);
  static void pgstat_send(void *msg, int len);
***************
*** 657,666 ****
  void
  pgstat_vacuum_tabstat(void)
  {
!     List       *oidlist;
!     Relation    rel;
!     HeapScanDesc scan;
!     HeapTuple    tup;
      PgStat_MsgTabpurge msg;
      HASH_SEQ_STATUS hstat;
      PgStat_StatDBEntry *dbentry;
--- 658,664 ----
  void
  pgstat_vacuum_tabstat(void)
  {
!     HTAB       *htab;
      PgStat_MsgTabpurge msg;
      HASH_SEQ_STATUS hstat;
      PgStat_StatDBEntry *dbentry;
***************
*** 679,693 ****
      /*
       * Read pg_database and make a list of OIDs of all existing databases
       */
!     oidlist = NIL;
!     rel = heap_open(DatabaseRelationId, AccessShareLock);
!     scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
!     while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL)
!     {
!         oidlist = lappend_oid(oidlist, HeapTupleGetOid(tup));
!     }
!     heap_endscan(scan);
!     heap_close(rel, AccessShareLock);

      /*
       * Search the database hash table for dead databases and tell the
--- 677,683 ----
      /*
       * Read pg_database and make a list of OIDs of all existing databases
       */
!     htab = pgstat_collect_oids(DatabaseRelationId);

      /*
       * Search the database hash table for dead databases and tell the
***************
*** 698,709 ****
      {
          Oid            dbid = dbentry->databaseid;

!         if (!list_member_oid(oidlist, dbid))
              pgstat_drop_database(dbid);
      }

      /* Clean up */
!     list_free(oidlist);

      /*
       * Lookup our own database entry; if not found, nothing more to do.
--- 688,701 ----
      {
          Oid            dbid = dbentry->databaseid;

!         CHECK_FOR_INTERRUPTS();
!
!         if (hash_search(htab, (void *) &dbid, HASH_FIND, NULL) == NULL)
              pgstat_drop_database(dbid);
      }

      /* Clean up */
!     hash_destroy(htab);

      /*
       * Lookup our own database entry; if not found, nothing more to do.
***************
*** 717,731 ****
      /*
       * Similarly to above, make a list of all known relations in this DB.
       */
!     oidlist = NIL;
!     rel = heap_open(RelationRelationId, AccessShareLock);
!     scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
!     while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL)
!     {
!         oidlist = lappend_oid(oidlist, HeapTupleGetOid(tup));
!     }
!     heap_endscan(scan);
!     heap_close(rel, AccessShareLock);

      /*
       * Initialize our messages table counter to zero
--- 709,715 ----
      /*
       * Similarly to above, make a list of all known relations in this DB.
       */
!     htab = pgstat_collect_oids(RelationRelationId);

      /*
       * Initialize our messages table counter to zero
***************
*** 738,750 ****
      hash_seq_init(&hstat, dbentry->tables);
      while ((tabentry = (PgStat_StatTabEntry *) hash_seq_search(&hstat)) != NULL)
      {
!         if (list_member_oid(oidlist, tabentry->tableid))
              continue;

          /*
           * Not there, so add this table's Oid to the message
           */
!         msg.m_tableid[msg.m_nentries++] = tabentry->tableid;

          /*
           * If the message is full, send it out and reinitialize to empty
--- 722,738 ----
      hash_seq_init(&hstat, dbentry->tables);
      while ((tabentry = (PgStat_StatTabEntry *) hash_seq_search(&hstat)) != NULL)
      {
!         Oid            tabid = tabentry->tableid;
!
!         CHECK_FOR_INTERRUPTS();
!
!         if (hash_search(htab, (void *) &tabid, HASH_FIND, NULL) != NULL)
              continue;

          /*
           * Not there, so add this table's Oid to the message
           */
!         msg.m_tableid[msg.m_nentries++] = tabid;

          /*
           * If the message is full, send it out and reinitialize to empty
***************
*** 776,782 ****
      }

      /* Clean up */
!     list_free(oidlist);
  }


--- 764,813 ----
      }

      /* Clean up */
!     hash_destroy(htab);
! }
!
!
! /* ----------
!  * pgstat_collect_oids() -
!  *
!  *    Collect the OIDs of either all databases or all tables, according to
!  *    the parameter, into a temporary hash table.  Caller should hash_destroy
!  *    the result when done with it.
!  * ----------
!  */
! static HTAB *
! pgstat_collect_oids(Oid catalogid)
! {
!     HTAB       *htab;
!     HASHCTL        hash_ctl;
!     Relation    rel;
!     HeapScanDesc scan;
!     HeapTuple    tup;
!
!     memset(&hash_ctl, 0, sizeof(hash_ctl));
!     hash_ctl.keysize = sizeof(Oid);
!     hash_ctl.entrysize = sizeof(Oid);
!     hash_ctl.hash = oid_hash;
!     htab = hash_create("Temporary table of OIDs",
!                        PGSTAT_TAB_HASH_SIZE,
!                        &hash_ctl,
!                        HASH_ELEM | HASH_FUNCTION);
!
!     rel = heap_open(catalogid, AccessShareLock);
!     scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
!     while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL)
!     {
!         Oid        thisoid = HeapTupleGetOid(tup);
!
!         CHECK_FOR_INTERRUPTS();
!
!         (void) hash_search(htab, (void *) &thisoid, HASH_ENTER, NULL);
!     }
!     heap_endscan(scan);
!     heap_close(rel, AccessShareLock);
!
!     return htab;
  }



Re: unusual performance for vac following 8.2 upgrade

From
Kim
Date:
Our pgstats.stat file is 40M for 8.2, on 8.1 it was 33M. Our schema size hasn't grown *that* much in the two weeks since we upgraded

I'm not sure if this sheds any more light on the situation, but in scanning down through the process output from truss, it looks like the first section of output was a large chunk of reads on pgstat.stat, followed by a larger chunk of reads on the global directory and directories under base - this whole section probably went on for a good 6-7 minutes, though I would say the reads on pgstat likely finished within a couple of minutes or so. Following this there was a phase were it did a lot of seeks and reads on files under pg_clog, and it was while doing this (or perhaps it had finished whatever it wanted with clogs) it dropped into the send()/SIGUSR1 loop that goes for another several minutes.

Kim


Tom Lane wrote:
I wrote: 
(2) Reconsider whether last-vacuum-time should be sent to the collector
unconditionally.   
Actually, now that I look, the collector already contains this logic:
   /*    * Don't create either the database or table entry if it doesn't already    * exist.  This avoids bloating the stats with entries for stuff that is    * only touched by vacuum and not by live operations.    */

and ditto for analyze messages.  So my idea that the addition of
last-vac-time was causing an increase in the statistics file size
compared to 8.1 seems wrong.

How large is your $PGDATA/global/pgstat.stat file, anyway?
		regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
 

Re: [HACKERS] unusual performance for vac following 8.2upgrade

From
"Simon Riggs"
Date:
On Thu, 2007-01-11 at 16:11 -0500, Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > It's not clear to me how this fix will alter the INSERT issue Kim
> > mentions.
>
> I didn't say that it would; we have no information on the INSERT issue,
> so I'm just concentrating on the problem that he did provide info on.

OK.

> I'm frankly not real surprised that there are performance issues with
> such a huge pg_class; it's not a regime that anyone's spent any time
> optimizing.

Yeh, I saw a pg_class that big once, but it just needed a VACUUM.

Temp relations still make pg_class entried don't they? Is that on the
TODO list to change?

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] unusual performance for vac following 8.2upgrade

From
Alvaro Herrera
Date:
Simon Riggs wrote:

> Temp relations still make pg_class entried don't they? Is that on the
> TODO list to change?

Yeah, and pg_attribute entries as well, which may be more problematic
because they are a lot.  Did we get rid of pg_attribute entries for
system attributes already?

Can we actually get rid of pg_class entries for temp tables.  Maybe
creating a "temp pg_class" which would be local to each session?  Heck,
it doesn't even have to be an actual table -- it just needs to be
somewhere from where we can load entries into the relcache.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: [HACKERS] unusual performance for vac following 8.2upgrade

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Can we actually get rid of pg_class entries for temp tables.  Maybe
> creating a "temp pg_class" which would be local to each session?  Heck,
> it doesn't even have to be an actual table -- it just needs to be
> somewhere from where we can load entries into the relcache.

A few things to think about:

1. You'll break a whole lotta client-side code if temp tables disappear
from pg_class.  This is probably solvable --- one thought is to give
pg_class an inheritance child that is a view on a SRF that reads out the
stored-in-memory rows for temp pg_class entries.  Likewise for
pg_attribute and everything else related to a table definition.

2. How do you keep the OIDs for temp tables (and their associated
rowtypes) from conflicting with OIDs for real tables?  Given the way
that OID generation works, there wouldn't be any real problem unless a
temp table survived for as long as it takes the OID counter to wrap all
the way around --- but in a database that has WITH OIDS user tables,
that might not be impossibly long ...

3. What about dependencies on user-defined types, functions, etc?
How will you get things to behave sanely if one backend tries to drop a
type that some other backend is using in a column of a temp table?  Even
if you put entries into pg_depend, which would kind of defeat the point
of not having on-disk catalog entries for temp tables, I don't see how
the other backend figures out what the referencing object is.

I don't really see any solution to that last point :-(

            regards, tom lane

Re: [HACKERS] unusual performance for vac following 8.2upgrade

From
Richard Huxton
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Can we actually get rid of pg_class entries for temp tables.  Maybe
>> creating a "temp pg_class" which would be local to each session?  Heck,
>> it doesn't even have to be an actual table -- it just needs to be
>> somewhere from where we can load entries into the relcache.
>
> A few things to think about:
>
> 1. You'll break a whole lotta client-side code if temp tables disappear
> from pg_class.

> 2. How do you keep the OIDs for temp tables (and their associated
> rowtypes) from conflicting with OIDs for real tables?

> 3. What about dependencies on user-defined types, functions, etc?

Is there not some gain from just a "standard" partitioning of pg_class
into: (system-objects, user-objects, temp-objects)? I'd expect them to
form a hierarchy of change+vacuum rates (if you see what I mean).

--
   Richard Huxton
   Archonet Ltd

Re: [HACKERS] unusual performance for vac following 8.2

From
Kim
Date:
Hello again Tom,

We have our upgrade to 8.2.1 scheduled for this weekend, and we noticed your message regarding the vacuum patch being applied to 8.2 and back-patched. I expect I know the answer to this next question :) but I was wondering if the patch referenced below has also been bundled into the normal source download of 8.2.1 or if we would still need to manually apply it?

- Fix a performance problem in databases with large numbers of tables
  (or other types of pg_class entry): the function
  pgstat_vacuum_tabstat, invoked during VACUUM startup, had runtime
  proportional to the number of stats table entries times the number
  of pg_class rows; in other words O(N^2) if the stats collector's
  information is reasonably complete.  Replace list searching with a
  hash table to bring it back to O(N) behavior.  Per report from kim
  at myemma.com.  Back-patch as far as 8.1; 8.0 and before use
  different coding here.

Thanks,
Kim


Tom Lane wrote:
I wrote: 
What I think we need to do about this is
(1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking
of using a hash table for the OIDs instead of a linear list.  Should be
a pretty small change; I'll work on it today.   
I've applied the attached patch to 8.2 to do the above.  Please give it
a try and see how much it helps for you.  Some limited testing here
confirms a noticeable improvement in VACUUM startup time at 10000
tables, and of course it should be 100X worse with 100000 tables.

I am still confused why you didn't see the problem in 8.1, though.
This code is just about exactly the same in 8.1.  Maybe you changed
your stats collector settings when moving to 8.2?
		regards, tom lane
 

Index: pgstat.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/postmaster/pgstat.c,v retrieving revision 1.140 diff -c -r1.140 pgstat.c *** pgstat.c 21 Nov 2006 20:59:52 -0000 1.140 --- pgstat.c 11 Jan 2007 22:32:30 -0000 *************** *** 159,164 **** --- 159,165 ---- static void pgstat_read_statsfile(HTAB **dbhash, Oid onlydb); static void backend_read_statsfile(void); static void pgstat_read_current_status(void); + static HTAB *pgstat_collect_oids(Oid catalogid); static void pgstat_setheader(PgStat_MsgHdr *hdr, StatMsgType mtype); static void pgstat_send(void *msg, int len); *************** *** 657,666 **** void pgstat_vacuum_tabstat(void) { ! List *oidlist; ! Relation rel; ! HeapScanDesc scan; ! HeapTuple tup; PgStat_MsgTabpurge msg; HASH_SEQ_STATUS hstat; PgStat_StatDBEntry *dbentry; --- 658,664 ---- void pgstat_vacuum_tabstat(void) { ! HTAB *htab; PgStat_MsgTabpurge msg; HASH_SEQ_STATUS hstat; PgStat_StatDBEntry *dbentry; *************** *** 679,693 **** /* * Read pg_database and make a list of OIDs of all existing databases */ ! oidlist = NIL; ! rel = heap_open(DatabaseRelationId, AccessShareLock); ! scan = heap_beginscan(rel, SnapshotNow, 0, NULL); ! while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL) ! { ! oidlist = lappend_oid(oidlist, HeapTupleGetOid(tup)); ! } ! heap_endscan(scan); ! heap_close(rel, AccessShareLock); /* * Search the database hash table for dead databases and tell the --- 677,683 ---- /* * Read pg_database and make a list of OIDs of all existing databases */ ! htab = pgstat_collect_oids(DatabaseRelationId); /* * Search the database hash table for dead databases and tell the *************** *** 698,709 **** { Oid dbid = dbentry->databaseid; ! if (!list_member_oid(oidlist, dbid)) pgstat_drop_database(dbid); } /* Clean up */ ! list_free(oidlist); /* * Lookup our own database entry; if not found, nothing more to do. --- 688,701 ---- { Oid dbid = dbentry->databaseid; ! CHECK_FOR_INTERRUPTS(); ! ! if (hash_search(htab, (void *) &dbid, HASH_FIND, NULL) == NULL) pgstat_drop_database(dbid); } /* Clean up */ ! hash_destroy(htab); /* * Lookup our own database entry; if not found, nothing more to do. *************** *** 717,731 **** /* * Similarly to above, make a list of all known relations in this DB. */ ! oidlist = NIL; ! rel = heap_open(RelationRelationId, AccessShareLock); ! scan = heap_beginscan(rel, SnapshotNow, 0, NULL); ! while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL) ! { ! oidlist = lappend_oid(oidlist, HeapTupleGetOid(tup)); ! } ! heap_endscan(scan); ! heap_close(rel, AccessShareLock); /* * Initialize our messages table counter to zero --- 709,715 ---- /* * Similarly to above, make a list of all known relations in this DB. */ ! htab = pgstat_collect_oids(RelationRelationId); /* * Initialize our messages table counter to zero *************** *** 738,750 **** hash_seq_init(&hstat, dbentry->tables); while ((tabentry = (PgStat_StatTabEntry *) hash_seq_search(&hstat)) != NULL) { ! if (list_member_oid(oidlist, tabentry->tableid)) continue; /* * Not there, so add this table's Oid to the message */ ! msg.m_tableid[msg.m_nentries++] = tabentry->tableid; /* * If the message is full, send it out and reinitialize to empty --- 722,738 ---- hash_seq_init(&hstat, dbentry->tables); while ((tabentry = (PgStat_StatTabEntry *) hash_seq_search(&hstat)) != NULL) { ! Oid tabid = tabentry->tableid; ! ! CHECK_FOR_INTERRUPTS(); ! ! if (hash_search(htab, (void *) &tabid, HASH_FIND, NULL) != NULL) continue; /* * Not there, so add this table's Oid to the message */ ! msg.m_tableid[msg.m_nentries++] = tabid; /* * If the message is full, send it out and reinitialize to empty *************** *** 776,782 **** } /* Clean up */ ! list_free(oidlist); } --- 764,813 ---- } /* Clean up */ ! hash_destroy(htab); ! } ! ! ! /* ---------- ! * pgstat_collect_oids() - ! * ! * Collect the OIDs of either all databases or all tables, according to ! * the parameter, into a temporary hash table. Caller should hash_destroy ! * the result when done with it. ! * ---------- ! */ ! static HTAB * ! pgstat_collect_oids(Oid catalogid) ! { ! HTAB *htab; ! HASHCTL hash_ctl; ! Relation rel; ! HeapScanDesc scan; ! HeapTuple tup; ! ! memset(&hash_ctl, 0, sizeof(hash_ctl)); ! hash_ctl.keysize = sizeof(Oid); ! hash_ctl.entrysize = sizeof(Oid); ! hash_ctl.hash = oid_hash; ! htab = hash_create("Temporary table of OIDs", ! PGSTAT_TAB_HASH_SIZE, ! &hash_ctl, ! HASH_ELEM | HASH_FUNCTION); ! ! rel = heap_open(catalogid, AccessShareLock); ! scan = heap_beginscan(rel, SnapshotNow, 0, NULL); ! while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL) ! { ! Oid thisoid = HeapTupleGetOid(tup); ! ! CHECK_FOR_INTERRUPTS(); ! ! (void) hash_search(htab, (void *) &thisoid, HASH_ENTER, NULL); ! } ! heap_endscan(scan); ! heap_close(rel, AccessShareLock); ! ! return htab; }