Thread: db grows and grows

db grows and grows

From

"Bjoern Metzdorf"

Date:

18 June 2002, 10:56:20

Hi,

I have a 3 GB (fs based) large pgdata directory. I regularly do vacuums
every 15 minutes and vacuums with analyzing every night.

After dumping the whole db (pg_dump -c db), dropping and creating the db,
reinserting the dump and vacuuming again, my pgdata directory only contains
1 GB. The dump had no errors, all data has been saved and reinserted.

The xlogs/clogs didnt take up 2 GB, so I am wondering what has happened.

Shouldn't the vacuuming take care of this?

A (desired) sideeffect is, that the postmaster runs much faster now. Queries
get executed much faster.

If I compare the relpages from before and after, I see the difference there
also.

Any hints?

Greetings,
Bjoern

Re: db grows and grows

From

Alvaro Herrera

Date:

18 June 2002, 13:31:51

On Tue, 18 Jun 2002, Bjoern Metzdorf wrote:

> I have a 3 GB (fs based) large pgdata directory. I regularly do vacuums
> every 15 minutes and vacuums with analyzing every night.
>
> After dumping the whole db (pg_dump -c db), dropping and creating the db,
> reinserting the dump and vacuuming again, my pgdata directory only contains
> 1 GB. The dump had no errors, all data has been saved and reinserted.

Your indexes are probably growing. In extant versions, VACUUM doesn't
shrink them, so you should do it manually using REINDEX.  There's a
contributed script that does it in a reasonably automated way, like the
vacuum script does.  Look in the archives.

> Shouldn't the vacuuming take care of this?

Yes IMVHO, but at present it doesn't.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)

Re: db grows and grows

From

Scott Marlowe

Date:

18 June 2002, 13:32:44

Look through the mailing list archives for a program called "reindexdb"
that will likely fix your problems.  There was a whole thread on this not
more than a month or so ago.

Basically, large updates can cause uncontrolled index growth that vacuum
doesn't fix.

On Tue, 18 Jun 2002, Bjoern Metzdorf wrote:

> Hi,
>
> I have a 3 GB (fs based) large pgdata directory. I regularly do vacuums
> every 15 minutes and vacuums with analyzing every night.
>
> After dumping the whole db (pg_dump -c db), dropping and creating the db,
> reinserting the dump and vacuuming again, my pgdata directory only contains
> 1 GB. The dump had no errors, all data has been saved and reinserted.
>
> The xlogs/clogs didnt take up 2 GB, so I am wondering what has happened.
>
> Shouldn't the vacuuming take care of this?
>
> A (desired) sideeffect is, that the postmaster runs much faster now. Queries
> get executed much faster.
>
> If I compare the relpages from before and after, I see the difference there
> also.
>
> Any hints?
>
> Greetings,
> Bjoern
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>

--
"Force has no place where there is need of skill.", "Haste in every
business brings failures.", "This is the bitterest pain among men, to have
much knowledge but no power." -- Herodotus

Re: db grows and grows

From

terry@greatgulfhomes.com

Date:

18 June 2002, 13:51:29

I had a similar issue:

Look at your indexes, I'll bet they are growing and not recovering unused
pages.

Consider scheduling the REINDEX command like you do for VACUUM.


Terry Fielder
Network Engineer
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org]On Behalf Of
> Bjoern Metzdorf
> Sent: Tuesday, June 18, 2002 10:28 AM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] db grows and grows
>
>
> Hi,
>
> I have a 3 GB (fs based) large pgdata directory. I regularly
> do vacuums
> every 15 minutes and vacuums with analyzing every night.
>
> After dumping the whole db (pg_dump -c db), dropping and
> creating the db,
> reinserting the dump and vacuuming again, my pgdata directory
> only contains
> 1 GB. The dump had no errors, all data has been saved and reinserted.
>
> The xlogs/clogs didnt take up 2 GB, so I am wondering what
> has happened.
>
> Shouldn't the vacuuming take care of this?
>
> A (desired) sideeffect is, that the postmaster runs much
> faster now. Queries
> get executed much faster.
>
> If I compare the relpages from before and after, I see the
> difference there
> also.
>
> Any hints?
>
> Greetings,
> Bjoern
>
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>

Re: db grows and grows

From

Bill Moran

Date:

18 June 2002, 14:24:44

Bjoern Metzdorf wrote:
> Hi,
>
> I have a 3 GB (fs based) large pgdata directory. I regularly do vacuums
> every 15 minutes and vacuums with analyzing every night.
>
> After dumping the whole db (pg_dump -c db), dropping and creating the db,
> reinserting the dump and vacuuming again, my pgdata directory only contains
> 1 GB. The dump had no errors, all data has been saved and reinserted.
>
> The xlogs/clogs didnt take up 2 GB, so I am wondering what has happened.
>
> Shouldn't the vacuuming take care of this?
>
> A (desired) sideeffect is, that the postmaster runs much faster now. Queries
> get executed much faster.
>
> If I compare the relpages from before and after, I see the difference there
> also.
>
> Any hints?

Have you tried "VACUUM FULL"?
Note that this needs exclusive table access, so run it when usage is low
or (preferrably) during scheduled downtime.
Maybe if you could squeeze this in once a week or so ...

--
Bill Moran
Potential Technologies
http://www.potentialtech.com

Re: db grows and grows

From

Jason Earl

Date:

18 June 2002, 14:37:10

"Bjoern Metzdorf" <bm@turtle-entertainment.de> writes:

> Hi,
>
> I have a 3 GB (fs based) large pgdata directory. I regularly do vacuums
> every 15 minutes and vacuums with analyzing every night.
>
> After dumping the whole db (pg_dump -c db), dropping and creating the db,
> reinserting the dump and vacuuming again, my pgdata directory only contains
> 1 GB. The dump had no errors, all data has been saved and reinserted.
>
> The xlogs/clogs didnt take up 2 GB, so I am wondering what has happened.
>
> Shouldn't the vacuuming take care of this?
>
> A (desired) sideeffect is, that the postmaster runs much faster now. Queries
> get executed much faster.
>
> If I compare the relpages from before and after, I see the difference there
> also.
>
> Any hints?
>
> Greetings,
> Bjoern

Chances are good that your indexes are growing out of control.  If you
have tables with a lot of turnover that is almost certainly the
problem.  Shaun Thomas has written a script that will reindex your
database.  It works very well, but it does lock tables, so it might
not be appropriate for your environment.

The script was posted to this list (a search for reindex turned it up
in my local mirror of the mailing list).  If you can't find it feel
free to contact me.

Jason

Re: db grows and grows

From

"Bjoern Metzdorf"

Date:

19 June 2002, 06:21:36

Thanks for all pointers to REINDEX. Seems to work perfectly now.

Greetings,
Bjoern


----- Original Message -----
From: "Bjoern Metzdorf" <bm@turtle-entertainment.de>
To: <pgsql-general@postgresql.org>
Sent: Tuesday, June 18, 2002 4:28 PM
Subject: [GENERAL] db grows and grows


> Hi,
>
> I have a 3 GB (fs based) large pgdata directory. I regularly do vacuums
> every 15 minutes and vacuums with analyzing every night.
>
> After dumping the whole db (pg_dump -c db), dropping and creating the db,
> reinserting the dump and vacuuming again, my pgdata directory only
contains
> 1 GB. The dump had no errors, all data has been saved and reinserted.
>
> The xlogs/clogs didnt take up 2 GB, so I am wondering what has happened.
>
> Shouldn't the vacuuming take care of this?
>
> A (desired) sideeffect is, that the postmaster runs much faster now.
Queries
> get executed much faster.
>
> If I compare the relpages from before and after, I see the difference
there
> also.
>
> Any hints?
>
> Greetings,
> Bjoern
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>
>

Re: db grows and grows

From

Shaun Thomas

Date:

19 June 2002, 11:00:49

On 18 Jun 2002, Jason Earl wrote:

> The script was posted to this list (a search for reindex turned it up
> in my local mirror of the mailing list).  If you can't find it feel
> free to contact me.

Oh come now, I troll the list once in a while.  I sent a copy to him.
My script doesn't really lock anything, it must just be a natural
side-effect of the REINDEX command, kinda like the old VACUUM in
Postgres < 7.2.  I can see why, though.  It looks like it drops the
index and then recreates it.  Not good for integrity if you don't lock.

Ah well, it'll get fixed eventually.  How about postgres 8.9?

^_^

--
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
| Shaun M. Thomas                INN Database Administrator           |
| Phone: (309) 743-0812          Fax  : (309) 743-0830                |
| Email: sthomas@townnews.com    AIM  : trifthen                      |
| Web  : www.townnews.com                                             |
|                                                                     |
|     "Most of our lives are about proving something, either to       |
|      ourselves or to someone else."                                 |
|                                           -- Anonymous              |
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+

Re: db grows and grows

From

Andrew Sullivan

Date:

19 June 2002, 13:09:21

On Wed, Jun 19, 2002 at 09:30:23AM -0500, Shaun Thomas wrote:
> On 18 Jun 2002, Jason Earl wrote:

> My script doesn't really lock anything, it must just be a natural
> side-effect of the REINDEX command,

According to

<http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/locking-tables.html>

CREATE INDEX gets a ShareLock.  Which will block UPDATE, DELETE and
INSERT.  So, not _quite_ as bad as VACUUM FULL, but not something to
do in the middle of a busy day.

A

--
----
Andrew Sullivan                               87 Mowat Avenue
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M6K 3E3
                                         +1 416 646 3304 x110

Re: db grows and grows

From

"Bjoern Metzdorf"

Date:

19 June 2002, 15:26:11

> > My script doesn't really lock anything, it must just be a natural
> > side-effect of the REINDEX command,
>
> According to
>
>
<http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/locking-tables.
html>
>
> CREATE INDEX gets a ShareLock.  Which will block UPDATE, DELETE and
> INSERT.  So, not _quite_ as bad as VACUUM FULL, but not something to
> do in the middle of a busy day.

During the REINDEX-run I see plenty "SELECT (waiting)" with ps. So I can
confirm this.

Greetings,
Bjoern

Re: db grows and grows

From

Andrew Sullivan

Date:

19 June 2002, 16:38:30

On Wed, Jun 19, 2002 at 09:09:32PM +0200, Bjoern Metzdorf wrote:
>
> During the REINDEX-run I see plenty "SELECT (waiting)" with ps. So I can
> confirm this.

But why would a ShareLock cause SELECT to wait?  Anyone?  ShareLock
isn't supposed to prevent that.  It _should_ cause UPDATE to wait,
but not SELECT (or at least, that's what I think the docs say).

A

--
----
Andrew Sullivan                               87 Mowat Avenue
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M6K 3E3
                                         +1 416 646 3304 x110

Re: db grows and grows

From

Varun Kacholia

Date:

19 June 2002, 17:08:50

> > During the REINDEX-run I see plenty "SELECT (waiting)" with ps. So I can
> > confirm this.
>
> But why would a ShareLock cause SELECT to wait?  Anyone?  ShareLock
> isn't supposed to prevent that.  It _should_ cause UPDATE to wait,
> but not SELECT (or at least, that's what I think the docs say).

Exaclty i also feel that sharelock shud not make select wait..
(unless pgsql goes bunkers? :-D )


> A
>
> --
> ----
> Andrew Sullivan                               87 Mowat Avenue
> Liberty RMS                           Toronto, Ontario Canada
> <andrew@libertyrms.info>                              M6K 3E3
>                                          +1 416 646 3304 x110
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

--
------
Varun
  Why doesn't DOS ever say "EXCELLENT command or filename!"

Re: db grows and grows

From

Tom Lane

Date:

19 June 2002, 17:55:27

Andrew Sullivan <andrew@libertyrms.info> writes:
> On Wed, Jun 19, 2002 at 09:09:32PM +0200, Bjoern Metzdorf wrote:
>> During the REINDEX-run I see plenty "SELECT (waiting)" with ps. So I can
>> confirm this.

> But why would a ShareLock cause SELECT to wait?

He was doing REINDEX, which grabs an exclusive lock.

The reason CREATE INDEX can run concurrently with SELECTs is that the
SELECTs won't see (and perhaps try to use) the new index, because its
catalog entries aren't committed yet.  REINDEX needs an exclusive lock
to prevent other transactions from trying to use the
index-under-reconstruction, since they certainly would see it as
an available index.

While one can think of hacks that might be able to work around that,
it's not clear to me that we should expend development time on making
REINDEX slicker --- fixing the underlying space-management problem in
btree indexes would be a better expenditure of time IMHO.  REINDEX
is really intended for disaster recovery, not routine space management.

            regards, tom lane

Re: db grows and grows

From

Varun Kacholia

Date:

19 June 2002, 18:19:25

> While one can think of hacks that might be able to work around that,
> it's not clear to me that we should expend development time on making
> REINDEX slicker --- fixing the underlying space-management problem in
> btree indexes would be a better expenditure of time IMHO.  REINDEX
> is really intended for disaster recovery, not routine space management.

ah.. well if you say that REINDEX is for recovery then what should be
used for routine space management? some thing is needed if you need
regular and heavy updates and still dont wanna compromise speed.

>
>             regards, tom lane
>
--
------
Varun
If Bill Gates had a penny for every time Windows crashed......Oh wait, he does.

Re: db grows and grows

From

terry@greatgulfhomes.com

Date:

19 June 2002, 18:53:10

If he needs to REINDEX live without locking out selects, then simply DROP
INDEX <indexname>; and then do a CREATE INDEX ...  to recreate it.  Drop is
fast, and the create does not acquire the exclusive lock.

The only down side of delete/recreate is that with REINDEX one did not need
to know the statement to recreate the index, and the latter does.

Terry Fielder
Network Engineer
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Tom Lane
> Sent: Wednesday, June 19, 2002 5:32 PM
> To: Andrew Sullivan
> Cc: PostgreSQL general list
> Subject: Re: [GENERAL] db grows and grows
>
>
> Andrew Sullivan <andrew@libertyrms.info> writes:
> > On Wed, Jun 19, 2002 at 09:09:32PM +0200, Bjoern Metzdorf wrote:
> >> During the REINDEX-run I see plenty "SELECT (waiting)"
> with ps. So I can
> >> confirm this.
>
> > But why would a ShareLock cause SELECT to wait?
>
> He was doing REINDEX, which grabs an exclusive lock.
>
> The reason CREATE INDEX can run concurrently with SELECTs is that the
> SELECTs won't see (and perhaps try to use) the new index, because its
> catalog entries aren't committed yet.  REINDEX needs an exclusive lock
> to prevent other transactions from trying to use the
> index-under-reconstruction, since they certainly would see it as
> an available index.
>
> While one can think of hacks that might be able to work around that,
> it's not clear to me that we should expend development time on making
> REINDEX slicker --- fixing the underlying space-management problem in
> btree indexes would be a better expenditure of time IMHO.  REINDEX
> is really intended for disaster recovery, not routine space
> management.
>
>             regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
>

Re: db grows and grows

From

Martijn van Oosterhout

Date:

19 June 2002, 19:41:35

On Wed, Jun 19, 2002 at 06:37:38PM -0400, terry@greatgulfhomes.com wrote:
> If he needs to REINDEX live without locking out selects, then simply DROP
> INDEX <indexname>; and then do a CREATE INDEX ...  to recreate it.  Drop is
> fast, and the create does not acquire the exclusive lock.
>
> The only down side of delete/recreate is that with REINDEX one did not need
> to know the statement to recreate the index, and the latter does.

A while ago I did post a script that did this, only it did the create first,
then the drop and then renamed the new index to the old one. All within a
transaction so other queries wouldn't be left without an index.

It used pg_dump to get the CREATE INDEX command.

Should be in the archive somewhere...
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.

Re: db grows and grows

From

Andrew Sullivan

Date:

20 June 2002, 08:28:31

On Thu, Jun 20, 2002 at 09:38:41AM +1000, Martijn van Oosterhout wrote:
>
> A while ago I did post a script that did this, only it did the create first,

Yes, actually, that's why I was confused.  I thought it was that
script that was being run.

A

--
----
Andrew Sullivan                               87 Mowat Avenue
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M6K 3E3
                                         +1 416 646 3304 x110

Re: db grows and grows

From

Fran Fabrizio

Date:

20 June 2002, 09:21:32

>
>
>A while ago I did post a script that did this, only it did the create first,
>then the drop and then renamed the new index to the old one. All within a
>transaction so other queries wouldn't be left without an index.
>
>It used pg_dump to get the CREATE INDEX command.
>
>Should be in the archive somewhere...
>
>
Any hints?  I'm not having much luck searching the archives for this
script.

Thanks,
Fran

Re: db grows and grows

From

Fran Fabrizio

Date:

20 June 2002, 09:21:49

Nevermind, I just found it.

http://archives.postgresql.org/pgsql-general/2002-05/msg00242.php

Thanks!

-Fran

>On Wed, Jun 19, 2002 at 06:37:38PM -0400, terry@greatgulfhomes.com wrote:
>
>
>>If he needs to REINDEX live without locking out selects, then simply DROP
>>INDEX <indexname>; and then do a CREATE INDEX ...  to recreate it.  Drop is
>>fast, and the create does not acquire the exclusive lock.
>>
>>The only down side of delete/recreate is that with REINDEX one did not need
>>to know the statement to recreate the index, and the latter does.
>>
>>
>
>A while ago I did post a script that did this, only it did the create first,
>then the drop and then renamed the new index to the old one. All within a
>transaction so other queries wouldn't be left without an index.
>
>It used pg_dump to get the CREATE INDEX command.
>
>Should be in the archive somewhere...
>
>

Re: db grows and grows

From

Shaun Thomas

Date:

20 June 2002, 09:32:03

On Wed, 19 Jun 2002, Tom Lane wrote:

> REINDEX is really intended for disaster recovery, not routine space
> management.

That's nice.  Except the fact Postgres is so bad at maintaining its own
indexes makes us have to *use* it as routine space management.  Having a
200MB index that's still getting bigger, and really should only be 200k,
tells me that there's no better use for it.  Sure, I could wait until
the index takes up my entire disk, and then consider it disaster
recovery, but that's just silly.

Besides that, if REINDEX isn't meant for this, what is?  Nothing?  So if
nothing is meant to fix this, we're just supposed to let this continue
and only use things for their intended purpose?  Sure, writing something
that reads the pg_* tables to get index information and drops/creates
them from scratch is nice, but that would also mean recreating foreign
keys.  Basically everything aside from basic database structure would
have to be abandoned while your script was running.  Or it would have to
be in one huge transaction or something.  Either way, that sounds like a
lot of work I'd have to do to fix something that is inherantly
Postgres's fault.

So, I'll at least continue to use REINDEX.  I'll encourage other people
to use REINDEX, and I'll even tell my dog to use REINDEX, because right
now, that's the only adequate tool we have.

Besides, I'm not talking about the "REINDEX DATABASE blah" command that
redoes the system-table indexes.  That, I agree, is really only
system-recovery, because I can only run it in standalone mode.  But
REINDEX TABLE and REINDEX INDEX do not have that problem, and so I'll
use them.  Bob will use them, Fred will use them, and my dog will use
them until something better comes along, or it's no longer necessary.

So enjoy the vacuumdb script everyone, it's likely all you'll get for a
long time, since not even the developers themselves know how to keep
indexes from growing.  Maybe over the next few weeks, I'll delve into
the code and see if I can't come up with something.  More eyes, right?

--
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
| Shaun M. Thomas                INN Database Administrator           |
| Phone: (309) 743-0812          Fax  : (309) 743-0830                |
| Email: sthomas@townnews.com    AIM  : trifthen                      |
| Web  : www.townnews.com                                             |
|                                                                     |
|     "Most of our lives are about proving something, either to       |
|      ourselves or to someone else."                                 |
|                                           -- Anonymous              |
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+

Re: db grows and grows

From

"Bjoern Metzdorf"

Date:

20 June 2002, 09:57:04

> > REINDEX is really intended for disaster recovery, not routine space
> > management.
>
> That's nice.  Except the fact Postgres is so bad at maintaining its own
> indexes makes us have to *use* it as routine space management.  Having a
> 200MB index that's still getting bigger, and really should only be 200k,
> tells me that there's no better use for it.  Sure, I could wait until
> the index takes up my entire disk, and then consider it disaster
> recovery, but that's just silly.

I think was Tom meant is, that REINDEX should not have to be used for
routine space management. It *should* be only there for disaster recovery.
Unfortunately this is not the case in the latest releases, so we all use it
now temporarily for space management, although this is surely going to be
fixed in one of the next releases.

Greetings,
Bjoern

Re: db grows and grows

From

Varun Kacholia

Date:

20 June 2002, 10:22:52

> I think was Tom meant is, that REINDEX should not have to be used for
> routine space management. It *should* be only there for disaster recovery.
> Unfortunately this is not the case in the latest releases, so we all use it
> now temporarily for space management, although this is surely going to be
> fixed in one of the next releases.

well when is the next release scheduled?
BTW i hope you will also fix the IN bug :)

>
> Greetings,
> Bjoern
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

--
------
Varun
It is easier to write an incorrect program than understand a correct one

Re: db grows and grows

From

Bruce Momjian

Date:

20 June 2002, 11:43:03

Bjoern Metzdorf wrote:
> > > REINDEX is really intended for disaster recovery, not routine space
> > > management.
> >
> > That's nice.  Except the fact Postgres is so bad at maintaining its own
> > indexes makes us have to *use* it as routine space management.  Having a
> > 200MB index that's still getting bigger, and really should only be 200k,
> > tells me that there's no better use for it.  Sure, I could wait until
> > the index takes up my entire disk, and then consider it disaster
> > recovery, but that's just silly.
>
> I think was Tom meant is, that REINDEX should not have to be used for
> routine space management. It *should* be only there for disaster recovery.
> Unfortunately this is not the case in the latest releases, so we all use it
> now temporarily for space management, although this is surely going to be
> fixed in one of the next releases.

Yes, but the problem is that we don't have a plan on how to fix the
index growth problem right now, so if people want to prevent index
growth, reindex is the only solution.

The TODO item says:

    * Certain indexes will not shrink, e.g. indexes on ever-increasing
      columns and indexes with many duplicate keys

but in fact that wording is misleading.  >80% of index are on
ever-increasing columns, so it isn't really 'certain index' but more
accurately 'most indexes'.

I am planning to add the reindex script to /contrib, document its need
in the maintenance docs, and add an FAQ item.  If it gets fixed in 7.3,
great.  If not, we will have communicated to users and given them the
tools then need.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: db grows and grows

From

Scott Marlowe

Date:

20 June 2002, 16:06:14

On Thu, 20 Jun 2002, Martijn van Oosterhout wrote:

> On Wed, Jun 19, 2002 at 06:37:38PM -0400, terry@greatgulfhomes.com wrote:
> > If he needs to REINDEX live without locking out selects, then simply DROP
> > INDEX <indexname>; and then do a CREATE INDEX ...  to recreate it.  Drop is
> > fast, and the create does not acquire the exclusive lock.
> >
> > The only down side of delete/recreate is that with REINDEX one did not need
> > to know the statement to recreate the index, and the latter does.
>
> A while ago I did post a script that did this, only it did the create first,
> then the drop and then renamed the new index to the old one. All within a
> transaction so other queries wouldn't be left without an index.
>
> It used pg_dump to get the CREATE INDEX command.

You can do 'select indexdef from pg_indexes where tablename='tablename' to
get the create index command without using pg_dump.

I've written a small PHP script that uses that and transactions to do
dynamic index recreation on my box at work.

--
"Force has no place where there is need of skill.", "Haste in every
business brings failures.", "This is the bitterest pain among men, to have
much knowledge but no power." -- Herodotus

Re: db grows and grows

From

Robert Treat

Date:

20 June 2002, 16:15:30

On Thu, 2002-06-20 at 11:40, Bruce Momjian wrote:
> Bjoern Metzdorf wrote:
>
> Yes, but the problem is that we don't have a plan on how to fix the
> index growth problem right now, so if people want to prevent index
> growth, reindex is the only solution.
>
> The TODO item says:
>
>     * Certain indexes will not shrink, e.g. indexes on ever-increasing
>       columns and indexes with many duplicate keys
>
> but in fact that wording is misleading.  >80% of index are on
> ever-increasing columns, so it isn't really 'certain index' but more
> accurately 'most indexes'.
>
> I am planning to add the reindex script to /contrib, document its need
> in the maintenance docs, and add an FAQ item.  If it gets fixed in 7.3,
> great.  If not, we will have communicated to users and given them the
> tools then need.
>

Would you say the reindex command/script is the recommended way of
dealing with the issue, rather than the create/rename method some have
suggested? Or maybe the difference is negligible? Does either method
have an upside in regards to the query planner statistics generated via
routine vacuum analyzes?

Robert Treat

Re: db grows and grows

From

terry@greatgulfhomes.com

Date:

20 June 2002, 18:32:21

REINDEX causes a lock, which pauses selects on that table, bad thing if your
DB is 24/7, or tables are very large so REINDEX takes a long time to run.

DROP only momentarily (if at all) locks the table.
CREATE does not cause a lock because the index being CREATE'd is "invisible"
until it is fully created.

Down side of DROP/CREATE is your shrinking script has to know the exact code
to recreate the index, which is an issue if your database is evolving,
you'll have to keep updating the script as you add tables/indexes.


Terry Fielder
Network Engineer
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Robert Treat
> Sent: Thursday, June 20, 2002 3:58 PM
> To: Bruce Momjian
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] db grows and grows
>
>
> On Thu, 2002-06-20 at 11:40, Bruce Momjian wrote:
> > Bjoern Metzdorf wrote:
> >
> > Yes, but the problem is that we don't have a plan on how to fix the
> > index growth problem right now, so if people want to prevent index
> > growth, reindex is the only solution.
> >
> > The TODO item says:
> >
> >     * Certain indexes will not shrink, e.g. indexes on
> ever-increasing
> >       columns and indexes with many duplicate keys
> >
> > but in fact that wording is misleading.  >80% of index are on
> > ever-increasing columns, so it isn't really 'certain index' but more
> > accurately 'most indexes'.
> >
> > I am planning to add the reindex script to /contrib,
> document its need
> > in the maintenance docs, and add an FAQ item.  If it gets
> fixed in 7.3,
> > great.  If not, we will have communicated to users and
> given them the
> > tools then need.
> >
>
> Would you say the reindex command/script is the recommended way of
> dealing with the issue, rather than the create/rename method some have
> suggested? Or maybe the difference is negligible? Does either method
> have an upside in regards to the query planner statistics
> generated via
> routine vacuum analyzes?
>
> Robert Treat
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
>

Re: db grows and grows

From

Martijn van Oosterhout

Date:

20 June 2002, 21:15:37

On Thu, Jun 20, 2002 at 05:33:28PM -0400, terry@greatgulfhomes.com wrote:
> Down side of DROP/CREATE is your shrinking script has to know the exact code
> to recreate the index, which is an issue if your database is evolving,
> you'll have to keep updating the script as you add tables/indexes.

My script-that-makes-the-script used pg_dump to get the appropriate
statement. As someone pointed out, it is also stored in plain text within
the database.

> > -----Original Message-----
> > From: pgsql-general-owner@postgresql.org
> > [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Robert Treat
> > Sent: Thursday, June 20, 2002 3:58 PM
> > To: Bruce Momjian
> > Cc: pgsql-general@postgresql.org
> > Subject: Re: [GENERAL] db grows and grows
> >
> >
> > On Thu, 2002-06-20 at 11:40, Bruce Momjian wrote:
> > > Bjoern Metzdorf wrote:
> > >
> > > Yes, but the problem is that we don't have a plan on how to fix the
> > > index growth problem right now, so if people want to prevent index
> > > growth, reindex is the only solution.
> > >
> > > The TODO item says:
> > >
> > >     * Certain indexes will not shrink, e.g. indexes on
> > ever-increasing
> > >       columns and indexes with many duplicate keys
> > >
> > > but in fact that wording is misleading.  >80% of index are on
> > > ever-increasing columns, so it isn't really 'certain index' but more
> > > accurately 'most indexes'.
> > >
> > > I am planning to add the reindex script to /contrib,
> > document its need
> > > in the maintenance docs, and add an FAQ item.  If it gets
> > fixed in 7.3,
> > > great.  If not, we will have communicated to users and
> > given them the
> > > tools then need.
> > >
> >
> > Would you say the reindex command/script is the recommended way of
> > dealing with the issue, rather than the create/rename method some have
> > suggested? Or maybe the difference is negligible? Does either method
> > have an upside in regards to the query planner statistics
> > generated via
> > routine vacuum analyzes?
> >
> > Robert Treat
> >
> >
> > ---------------------------(end of
> > broadcast)---------------------------
> > TIP 3: if posting/reading through Usenet, please send an appropriate
> > subscribe-nomail command to majordomo@postgresql.org so that your
> > message can get through to the mailing list cleanly
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.

Re: db grows and grows

From

Robert Treat

Date:

21 June 2002, 11:06:25

Perhaps your script or one that grabs the information internally should
be added to the contrib section as well?

Robert Treat

On Thu, 2002-06-20 at 20:27, Martijn van Oosterhout wrote:
> On Thu, Jun 20, 2002 at 05:33:28PM -0400, terry@greatgulfhomes.com wrote:
> > Down side of DROP/CREATE is your shrinking script has to know the exact code
> > to recreate the index, which is an issue if your database is evolving,
> > you'll have to keep updating the script as you add tables/indexes.
>
> My script-that-makes-the-script used pg_dump to get the appropriate
> statement. As someone pointed out, it is also stored in plain text within
> the database.
>
> > > -----Original Message-----
> > > From: pgsql-general-owner@postgresql.org
> > > [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Robert Treat
> > > Sent: Thursday, June 20, 2002 3:58 PM
> > > To: Bruce Momjian
> > > Cc: pgsql-general@postgresql.org
> > > Subject: Re: [GENERAL] db grows and grows
> > >
> > >
> > > On Thu, 2002-06-20 at 11:40, Bruce Momjian wrote:
> > > > Bjoern Metzdorf wrote:
> > > >
> > > > Yes, but the problem is that we don't have a plan on how to fix the
> > > > index growth problem right now, so if people want to prevent index
> > > > growth, reindex is the only solution.
> > > >
> > > > The TODO item says:
> > > >
> > > >     * Certain indexes will not shrink, e.g. indexes on
> > > ever-increasing
> > > >       columns and indexes with many duplicate keys
> > > >
> > > > but in fact that wording is misleading.  >80% of index are on
> > > > ever-increasing columns, so it isn't really 'certain index' but more
> > > > accurately 'most indexes'.
> > > >
> > > > I am planning to add the reindex script to /contrib,
> > > document its need
> > > > in the maintenance docs, and add an FAQ item.  If it gets
> > > fixed in 7.3,
> > > > great.  If not, we will have communicated to users and
> > > given them the
> > > > tools then need.
> > > >
> > >
> > > Would you say the reindex command/script is the recommended way of
> > > dealing with the issue, rather than the create/rename method some have
> > > suggested? Or maybe the difference is negligible? Does either method
> > > have an upside in regards to the query planner statistics
> > > generated via
> > > routine vacuum analyzes?
> > >
> > > Robert Treat
> > >
> > >
> > > ---------------------------(end of
> > > broadcast)---------------------------
> > > TIP 3: if posting/reading through Usenet, please send an appropriate
> > > subscribe-nomail command to majordomo@postgresql.org so that your
> > > message can get through to the mailing list cleanly
> > >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 5: Have you checked our extensive FAQ?
> >
> > http://www.postgresql.org/users-lounge/docs/faq.html
>
> --
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> > There are 10 kinds of people in the world, those that can do binary
> > arithmetic and those that can't.

Re: db grows and grows

From

Shaun Thomas

Date:

21 June 2002, 13:20:26

On Thu, 20 Jun 2002, Bjoern Metzdorf wrote:

> It *should* be only there for disaster recovery.  Unfortunately this
> is not the case in the latest releases, so we all use it now
> temporarily for space management, although this is surely going to be
> fixed in one of the next releases.

True...

<deletes rant>


--
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
| Shaun M. Thomas                INN Database Administrator           |
| Phone: (309) 743-0812          Fax  : (309) 743-0830                |
| Email: sthomas@townnews.com    AIM  : trifthen                      |
| Web  : www.townnews.com                                             |
|                                                                     |
|     "Most of our lives are about proving something, either to       |
|      ourselves or to someone else."                                 |
|                                           -- Anonymous              |
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+