Re: Index bloat problem? - Mailing list pgsql-performance

From Alex Turner
Subject Re: Index bloat problem?
Date
Msg-id 33c6269f0504211212b0163ac@mail.gmail.com
Whole thread Raw
In response to Re: Index bloat problem?  (Bill Chandler <billybobc1210@yahoo.com>)
List pgsql-performance
Same thing happens in Oracle

ALTER INDEX <blah> rebuild

To force a rebuild.  It will mark the free blocks as 'free' below the
PCTFREE value for the tablespace.

Basically If you build an index with 9999 entries.  and each entry is
1/4 of a block, the database will write 2500 blocks to the disk.  If
you delete a random 75% of the index values, you will now have 2500
blocks that have 75% free space.  The database will reuse that free
space in those blocks as you insert new values, but until then, you
still have 2500 blocks worth of data on a disk, that is only 25% full.
 Rebuilding the index forces the system to physically re-allocate all
that data space, and now you have just 2499 entries, that use 625
blocks.

I'm not sure that 'blocks' is the correct term in postgres, it's
segments in Oracle, but the concept remains the same.

Alex Turner
netEconomist

On 4/21/05, Bill Chandler <billybobc1210@yahoo.com> wrote:
>
> --- Josh Berkus <josh@agliodbs.com> wrote:
> > Bill,
> >
> > > What about if an out-of-the-ordinary number of
> > rows
> > > were deleted (say 75% of rows in the table, as
> > opposed
> > > to normal 5%) followed by a 'VACUUM ANALYZE'?
> > Could
> > > things get out of whack because of that situation?
> >
> > Yes.  You'd want to run REINDEX after and event like
> > that.  As you should now.
> >
> > --
> > Josh Berkus
> > Aglio Database Solutions
> > San Francisco
> >
>
> Thank you.  Though I must say, that is very
> discouraging.  REINDEX is a costly operation, timewise
> and due to the fact that it locks out other processes
> from proceeding.  Updates are constantly coming in and
> queries are occurring continuously.  A REINDEX could
> potentially bring the whole thing to a halt.
>
> Honestly, this seems like an inordinate amount of
> babysitting for a production application.  I'm not
> sure if the client will be willing to accept it.
>
> Admittedly my knowledge of the inner workings of an
> RDBMS is limited, but could somebody explain to me why
> this would be so?  If you delete a bunch of rows why
> doesn't the index get updated at the same time?  Is
> this a common issue among all RDBMSs or is it
> something that is PostgreSQL specific?  Is there any
> way around it?
>
> thanks,
>
> Bill
>
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>

pgsql-performance by date:

Previous
From: Enrico Weigelt
Date:
Subject: index not used
Next
From: Enrico Weigelt
Date:
Subject: Re: immutable functions vs. join for lookups ?