Subject Re: create index concurrently - duplicate index to reduce time without an index
Msg-id 007DECE986B47F4EABF823C1FBB19C6201026E19AF06@exvic-mbx04.nexus.csiro.au
In response to Re: create index concurrently - duplicate index to reduce time without an index  (Greg Smith <greg@2ndquadrant.com>)
List pgsql-general
> -----Original Message-----
> From: Greg Smith [mailto:greg@2ndquadrant.com]
-snip-
>
> Gareth.Williams wrote:
> > So the rest of the question is, if I have two indexes with identical
> definitions, what happens?  I've confirmed that I can create indexes with
> identical definitions (except name) without postgres complaining - and
> without breaking the client on my test system - but I am wary of trying it
> on my production system where there is much more data (8GB) and I care
> about its integrity so much more.
> >
>
> The database doesn't care one bit if you create a bunch of redundant
> indexes.  So long as one of them is around to satisfy the queries that
> need the index to run well, you're fine.
>
> The main thing you can't do with the index concurrently/rename shuffle
> you've discovered here is use that approach to concurrently rebuild an
> index that enforces a constraint or unique index.  If your index is
> enforcing a PRIMARY KEY for example, you'll discover a major roadblock
> were you to try this same technique to rebuild it.  Those are tied into
> the constraint portion of the system catalogs and manipulating them
> isn't so easy.
>
> Regular indexes that exist just to speed up queries, those you can
> rename around as you've been figuring out without any downside.  From a
> general paranoia perspective, you should run an explicit ANALYZE on the
> underlying table after you finish the shuffle, just to make absolutely
> sure the right statistics are available afterwards.
>
> --
> Greg Smith  2ndQuadrant US  Baltimore, MD
> PostgreSQL Training, Services and Support
> greg@2ndQuadrant.com   www.2ndQuadrant.us

Thanks Greg, Alban and others,

This has cleared up a misunderstanding I had about why one should reindex.  Re-reading the documentation at
http://www.postgresql.org/docs/8.4/interactive/sql-reindex.html it is clear now that REINDEX, or recreating an index,
should not normally be needed - certainly not to keep an index up-to-date.  I would have guessed that VACUUM or VACUUM
ANALYSE on the table the index belongs to would be sufficient to reclaim the space in a 'bloated' index (maybe only
VACUUM FULL would help).  In any case we can leave reindexing or full vacuums for outages where we are interrupting
service anyway.
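
For anyone reading this in the archives, the shuffle Greg describes for a plain (non-constraint) index is roughly the
following - the table and index names here are only illustrative, not our real schema:

    -- build a duplicate index without blocking normal writes
    CREATE INDEX CONCURRENTLY files_path_idx_new ON files (path);

    -- swap it in for the old one
    DROP INDEX files_path_idx;
    ALTER INDEX files_path_idx_new RENAME TO files_path_idx;

    -- as Greg suggests, refresh statistics on the underlying table
    ANALYZE files;

(Note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block, and DROP INDEX still needs an exclusive
lock on the table - which is where I ran into trouble below.)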

I was heartened by the responses and tried further testing (if it could not hurt, why not try it and see whether it could
be faster), but ran into a problem.  A few times when I was trying to drop an index (before or after creating a duplicate
index with 'concurrently'), the DROP INDEX stalled.  It seems this was because of an existing connection:

    postgres: rods ICAT 130.102.163.141(58061) idle in transaction

New clients then block as well.

Is that expected?  Should we look for 'bugs' in our client, which seems to be holding a connection open in an idle transaction?
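
In case it is useful to anyone else, this is roughly how I have been spotting the offending session - this assumes the
8.4 layout of pg_stat_activity, where an idle-in-transaction backend shows '<IDLE> in transaction' as its current_query:

    SELECT procpid, usename, client_addr, xact_start, current_query
      FROM pg_stat_activity
     WHERE current_query = '<IDLE> in transaction'
     ORDER BY xact_start;   -- oldest open transaction first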

For the application we have, I'm ready to give up on this train of investigation for optimization and just vacuum
analyse the key tables regularly, with a fuller vacuum and maybe a reindex during outages (sketched below) - though we
aim for outages to be infrequent.  The database holds data representing a virtual filesystem structure with millions of
files (plus the associated access controls, and information on the underlying storage resources and replication).  There
is probably not much update or delete activity on the main data - at least compared with the total holdings and the new
data/files which are regularly being added to the system.
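
Concretely, the routine I have in mind is something like the following (again with illustrative table names), keeping
the commands that take exclusive locks for a planned outage:

    -- regular, online maintenance on the busiest tables
    VACUUM ANALYSE files;
    VACUUM ANALYSE access_controls;

    -- during a planned outage only: these lock the table exclusively
    VACUUM FULL files;
    REINDEX TABLE files;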

Thanks again,

Gareth

PS. Greg, I don't understand the issue with concurrently rebuilding 'an index that enforces a constraint or unique
index'.  I don't think I care much right at the moment, but I'm generally interested and others might be too.  Would you
expect the CREATE INDEX to fail, to cause locking, or just a transient performance degradation?
