Re: GLOBAL vs LOCAL temp tables - Mailing list pgsql-hackers

From Kevin Brown
Subject Re: GLOBAL vs LOCAL temp tables
Date
Msg-id 20030417065322.GH1833@filer
Whole thread Raw
In response to Re: GLOBAL vs LOCAL temp tables  (Alvaro Herrera <alvherre@dcc.uchile.cl>)
Responses Re: GLOBAL vs LOCAL temp tables  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Alvaro Herrera wrote:
> On Wed, Apr 16, 2003 at 06:42:36PM -0700, Kevin Brown wrote:
> 
> > Hmm...could VACUUM FULL then take an additional option, REINDEX (not
> > sure if calling it that will be possible if it'll cause ambiguities in
> > the parser), so that you don't have to do REINDEX separately?
> 
> Actually, the need to REINDEX should be pretty minimal.  The command was
> originally invented to recover from corrupted indexes, but was used as
> a regular maintenance because of the ever-growing indexes problem.  The
> problem is mostly gone now, and REINDEX is again only meant to the
> corrupted indexes scenario.

But as Tom pointed out, if you delete a bunch of data from a table
then insert a fresh set of data, but don't end up inserting much data
with roughly the same keys that were in the original batch of data,
you'll get a lot of empty areas in your index that are unused.  VACUUM
marks them as being available for reuse, of course, but that doesn't
help you unless you insert data containing values that are appropriate
to the unused areas.

An example would be an index on a serial column, where a periodic
delete of old data happens.  The values in the serial column are never
reused, so their locations in the index never get reused, either,
right?  So the index on that serial column grows without bound.  I
imagine that situation is actually rather common.

Unless that behavior has changed, you will need to reindex on a
periodic basis to keep your index from growing indefinitely, won't
you?

> VACUUM generally won't free disk space, but it will mark pages free so
> subsequent index growth will use them.  Periodic VACUUM usage should be
> enough to keep indexes in control.

Even in situations as I describe above?

> > On a slightly different note, from what I've seen of the source,
> > REINDEX TABLE acquires an access exclusive lock on the table, thus
> > preventing concurrent SELECTs against the table while REINDEX TABLE is
> > running.
> 
> Yeah, it's difficult to recreate an index keeping concurrency.  

I don't quite understand this.  I'm certainly not advocating allowing
writes while reindexing!  The data in the table isn't changing at all
because you've acquired an exclusive lock on the table (which permits
reads but not writes) -- but not a shared exclusive lock (which would
prevent reads).

Why would you need to prevent reads of the data in the table or the
old index, aside from performance or corruption considerations (which
is an evaluation that belongs in the hands the DBA)?  I can certainly
see the need to prevent reads (and thus acquire a shared exclusive
lock) once the new index is built and needs to be swapped in for the
old one, but not before then.

> I had originally meant to implement this (concurrent index rebuild),
> but got stuck in the freelist thing, and now concurrent index
> rebuild is not needed as much.  Because of this I selected a
> different project that is probably more useful.

I can certainly understand that.  :-)

I'm just wondering what would happen if you changed the code to
acquire a standard exclusive lock on the table instead of a shared
exclusive one, and added another line to acquire a shared exclusive
lock once the new index is built...


-- 
Kevin Brown                          kevin@sysexperts.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg_clog woes with 7.3.2 - Episode 2
Next
From: Oliver Elphick
Date:
Subject: New flex