Thread: Online index builds (was: [ANNOUNCE] PostgreSQL 8.2 Now Available)

Online index builds (was: [ANNOUNCE] PostgreSQL 8.2 Now Available)

From

Bill Moran

Date:

05 December 2006, 17:06:36

In response to Josh Berkus <josh@postgresql.org>:

> -- Online index builds

I'm particularly curious about this feature.  Does this mean that
PostgreSQL 8.2 can perform a REINDEX without blocking the relevant
table from writes?

If so, the 8.2 docs are a bit out of date:
http://www.postgresql.org/docs/8.2/static/sql-reindex.html

--
Bill Moran
Collaborative Fusion Inc.

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

****************************************************************
IMPORTANT: This message contains confidential information and is
intended only for the individual named. If the reader of this
message is not an intended recipient (or the individual
responsible for the delivery of this message to an intended
recipient), please be advised that any re-use, dissemination,
distribution or copying of this message is prohibited. Please
notify the sender immediately by e-mail if you have received
this e-mail by mistake and delete this e-mail from your system.
E-mail transmission cannot be guaranteed to be secure or
error-free as information could be intercepted, corrupted, lost,
destroyed, arrive late or incomplete, or contain viruses. The
sender therefore does not accept liability for any errors or
omissions in the contents of this message, which arise as a
result of e-mail transmission.
****************************************************************

Re: Online index builds (was: [ANNOUNCE] PostgreSQL 8.2

From

"Joshua D. Drake"

Date:

05 December 2006, 17:12:04

On Tue, 2006-12-05 at 16:06 -0500, Bill Moran wrote:
> In response to Josh Berkus <josh@postgresql.org>:
>
> > -- Online index builds
>
> I'm particularly curious about this feature.  Does this mean that
> PostgreSQL 8.2 can perform a REINDEX without blocking the relevant
> table from writes?

I don't know about reindex, but it is possible to drop and create the
index.

Sincerely,

Joshua D. Drake


>
> If so, the 8.2 docs are a bit out of date:
> http://www.postgresql.org/docs/8.2/static/sql-reindex.html
>
--

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

Re: Online index builds (was: [ANNOUNCE] PostgreSQL 8.2 Now Available)

From

Alvaro Herrera

Date:

05 December 2006, 17:12:21

Bill Moran wrote:
> In response to Josh Berkus <josh@postgresql.org>:
>
> > -- Online index builds
>
> I'm particularly curious about this feature.  Does this mean that
> PostgreSQL 8.2 can perform a REINDEX without blocking the relevant
> table from writes?
>
> If so, the 8.2 docs are a bit out of date:
> http://www.postgresql.org/docs/8.2/static/sql-reindex.html

No, it means you can do CREATE INDEX CONCURRENTLY.

http://www.postgresql.org/docs/8.2/static/sql-createindex.html

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Online index builds (was: [ANNOUNCE] PostgreSQL 8.2

From

Bill Moran

Date:

05 December 2006, 17:29:49

In response to Alvaro Herrera <alvherre@commandprompt.com>:

> Bill Moran wrote:
> > In response to Josh Berkus <josh@postgresql.org>:
> >
> > > -- Online index builds
> >
> > I'm particularly curious about this feature.  Does this mean that
> > PostgreSQL 8.2 can perform a REINDEX without blocking the relevant
> > table from writes?
> >
> > If so, the 8.2 docs are a bit out of date:
> > http://www.postgresql.org/docs/8.2/static/sql-reindex.html
>
> No, it means you can do CREATE INDEX CONCURRENTLY.
>
> http://www.postgresql.org/docs/8.2/static/sql-createindex.html

Ahh ... and the text there specifically states that REINDEX does
_not_ work concurrently.

Thanks.

--
Bill Moran
Collaborative Fusion Inc.

Re: Online index builds

From

Chris Browne

Date:

06 December 2006, 16:04:16

wmoran@collaborativefusion.com (Bill Moran) writes:
> In response to Alvaro Herrera <alvherre@commandprompt.com>:
>
>> Bill Moran wrote:
>> > In response to Josh Berkus <josh@postgresql.org>:
>> >
>> > > -- Online index builds
>> >
>> > I'm particularly curious about this feature.  Does this mean that
>> > PostgreSQL 8.2 can perform a REINDEX without blocking the relevant
>> > table from writes?
>> >
>> > If so, the 8.2 docs are a bit out of date:
>> > http://www.postgresql.org/docs/8.2/static/sql-reindex.html
>>
>> No, it means you can do CREATE INDEX CONCURRENTLY.
>>
>> http://www.postgresql.org/docs/8.2/static/sql-createindex.html
>
> Ahh ... and the text there specifically states that REINDEX does
> _not_ work concurrently.
>
> Thanks.

Let me add another question to this; this might possibly be worthy of
a TODO for 8.3 or so...

What if I wanted to:
    ALTER TABLE distributors ADD PRIMARY KEY CONCURRENTLY (dist_id);
?

We have a number of cases where there isn't a true primary key on
tables.  It would be very attractive to have a non-blocking way of
getting one, perhaps to be combined with letting Slony-I know about
it...

Or is it a better answer to look more deeply into the index
configuration, creating a suitably named UNIQUE index on NOT NULL
fields, and fiddling it into being the primary key?
--
let name="cbbrowne" and tld="linuxfinances.info" in name ^ "@" ^ tld;;
http://linuxfinances.info/info/advocacy.html
"Marketing Division, Sirius    Cybernetics Corp: A  bunch of  mindless
jerks who'll be the first against the wall when the revolution comes."
-- The Hitchhiker's Guide to the Galaxy

Re: Online index builds

From

Jeff Davis

Date:

06 December 2006, 22:22:48

On Wed, 2006-12-06 at 15:00 -0500, Chris Browne wrote:
> Let me add another question to this; this might possibly be worthy of
> a TODO for 8.3 or so...
>
> What if I wanted to:
>     ALTER TABLE distributors ADD PRIMARY KEY CONCURRENTLY (dist_id);
> ?
>
> We have a number of cases where there isn't a true primary key on
> tables.  It would be very attractive to have a non-blocking way of
> getting one, perhaps to be combined with letting Slony-I know about
> it...
>
> Or is it a better answer to look more deeply into the index
> configuration, creating a suitably named UNIQUE index on NOT NULL
> fields, and fiddling it into being the primary key?

Interesting, I was just thinking about this today as well. I am thinking
it would be nice if we could:

ALTER TABLE SET PRIMARY KEY INDEX foo_pkey;

If it's already got a primary key we switch the primary key to be the
new primary key (throwing an error if the columns don't match up to the
existing primary key, or if it's not unique). If not, the primary key
attribute is added to the existing index and the columns in the index
now make up the primary key (throwing an error if the index is not
unique).

It makes CREATE INDEX CONCURRENTLY more useful for reindexing a primary
key on a live database: you could just create the new index, switch it
to be the primary key, and drop the old index.

Regards,
    Jeff Davis

Re: Online index builds

From

Ragnar

Date:

07 December 2006, 08:17:34

On mið, 2006-12-06 at 18:22 -0800, Jeff Davis wrote:
> On Wed, 2006-12-06 at 15:00 -0500, Chris Browne wrote:
> > Let me add another question to this; this might possibly be worthy of
> > a TODO for 8.3 or so...
> >
> > What if I wanted to:
> >     ALTER TABLE distributors ADD PRIMARY KEY CONCURRENTLY (dist_id);

> Interesting, I was just thinking about this today as well. I am thinking
> it would be nice if we could:
>
> ALTER TABLE SET PRIMARY KEY INDEX foo_pkey;
>
> If it's already got a primary key we switch the primary key to be the
> new primary key
>
> (throwing an error if the columns don't match up to the
> existing primary key,

not sure what you mean by this

>  or if it's not unique).

must also be NOT NULL

>  If not, the primary key
> attribute is added to the existing index and the columns in the index
> now make up the primary key (throwing an error if the index is not
> unique).

What about existing foreign key constraints ?
as the only function of the PRIMARY key property of an
index is making it the default target of a foreign key
reference, you would have to decide what implications
this has. Possibly none, as I am not sure the foreign
key constraint remembers if the target was a primary key
or not.

also, your proposed syntax muddies the relationship
between the PRIMARY KEY constraint and the existence
of an INDEX. There is no such relationship in the SQL
standards.

possibly more appropriate would be

ALTER TABLE SET PRIMARY KEY (columns)
and an error issued if no UNIQUE NOT NULL index
is found on the relevant columns

one other question is what shuld happen to the original index that was
implicitly created. should it be dropped
automatically ?

gnari

Re: Online index builds

From

Jeff Davis

Date:

07 December 2006, 13:27:38

On Thu, 2006-12-07 at 12:26 +0000, Ragnar wrote:
> On mið, 2006-12-06 at 18:22 -0800, Jeff Davis wrote:
> > On Wed, 2006-12-06 at 15:00 -0500, Chris Browne wrote:
> > > Let me add another question to this; this might possibly be worthy of
> > > a TODO for 8.3 or so...
> > >
> > > What if I wanted to:
> > >     ALTER TABLE distributors ADD PRIMARY KEY CONCURRENTLY (dist_id);
>
> > Interesting, I was just thinking about this today as well. I am thinking
> > it would be nice if we could:
> >
> > ALTER TABLE SET PRIMARY KEY INDEX foo_pkey;
> >
> > If it's already got a primary key we switch the primary key to be the
> > new primary key
> >
> > (throwing an error if the columns don't match up to the
> > existing primary key,
>
> not sure what you mean by this

In my suggestion, if the table already has a primary key, then you can
only set the primary key index to be an index with exactly the same
columns as the existing primary key index.

> >  or if it's not unique).
>
> must also be NOT NULL

Indexes can't be NOT NULL; NOT NULL is a constraint. You're right
though, if it was a new primary key, the column must already have the
NOT NULL constraint on it.

> >  If not, the primary key
> > attribute is added to the existing index and the columns in the index
> > now make up the primary key (throwing an error if the index is not
> > unique).
>
> What about existing foreign key constraints ?
> as the only function of the PRIMARY key property of an
> index is making it the default target of a foreign key
> reference, you would have to decide what implications
> this has. Possibly none, as I am not sure the foreign
> key constraint remembers if the target was a primary key
> or not.

Doesn't matter. Foreign keys don't reference an index, they reference a
set of attributes. I am just trying to provide an ability to change the
underlying unique index that is used to implement the unique constraint
that is necessary for all primary keys.

>
> also, your proposed syntax muddies the relationship
> between the PRIMARY KEY constraint and the existence
> of an INDEX. There is no such relationship in the SQL
> standards.

The index is an important implementation detail of a primary key,
because it is necessary to implement the UNIQUE constraint. Many PG DBAs
need to reindex the primary key on a large table as part of regular
maintenance. I am trying to provide a way to do this without locking our
reads or writes, using the already-existing CREATE INDEX CONCURRENTLY.

> possibly more appropriate would be
>
> ALTER TABLE SET PRIMARY KEY (columns)
> and an error issued if no UNIQUE NOT NULL index
> is found on the relevant columns

That doesn't solve the problem, because that doesn't allow you to choose
the index that the primary key will use, which was the whole point of my
suggestion.

> one other question is what shuld happen to the original index that was
> implicitly created. should it be dropped
> automatically ?
>

Good question. Either way should be fine, as long as it is documented.
It should probably not be automatically dropped, but maybe issue a
NOTICE, like when the index is implicitly created.

Regards,
    Jeff Davis

Re: Online index builds

From

Ragnar

Date:

07 December 2006, 15:59:10

On fim, 2006-12-07 at 09:27 -0800, Jeff Davis wrote:
> On Thu, 2006-12-07 at 12:26 +0000, Ragnar wrote:
> > On mið, 2006-12-06 at 18:22 -0800, Jeff Davis wrote:
> >
> > > Interesting, I was just thinking about this today as well. I am thinking
> > > it would be nice if we could:
> > >
> > > ALTER TABLE SET PRIMARY KEY INDEX foo_pkey;
> > >
> > > If it's already got a primary key we switch the primary key to be the
> > > new primary key
> > >
> > > (throwing an error if the columns don't match up to the
> > > existing primary key,
> >
> > not sure what you mean by this
>
> In my suggestion, if the table already has a primary key, then you can
> only set the primary key index to be an index with exactly the same
> columns as the existing primary key index.

Why would you do that?

I saw the use-case of when you have a primary key and a
surrogate key , and decided you wanted the surrogate key to be the
primary key after all, maybe because the
natural key you had used turned out not to be a good
candidate.

>
> > >  or if it's not unique).
> >
> > must also be NOT NULL
>
> Indexes can't be NOT NULL; NOT NULL is a constraint.

Sorry, I got confused by the UNIQUE in the create index syntax:

CREATE [ UNIQUE ] INDEX name ON table [ USING method ]
    ( { column | ( expression ) } [ opclass ] [, ...] )
    [ TABLESPACE tablespace ]
    [ WHERE predicate ]


> > ...
> > What about existing foreign key constraints ?
> > as the only function of the PRIMARY key property of an
> > index is making it the default target of a foreign key
> > reference, you would have to decide what implications
> > this has. Possibly none, as I am not sure the foreign
> > key constraint remembers if the target was a primary key
> > or not.
>
> Doesn't matter. Foreign keys don't reference an index, they reference a
> set of attributes. I am just trying to provide an ability to change the
> underlying unique index that is used to implement the unique constraint
> that is necessary for all primary keys.

I was still imagining here that you would want a
different set of attributes froyour primary key.

gnari

Re: Online index builds

From

Jeff Davis

Date:

07 December 2006, 17:58:09

On Thu, 2006-12-07 at 20:07 +0000, Ragnar wrote:
> On fim, 2006-12-07 at 09:27 -0800, Jeff Davis wrote:
> > On Thu, 2006-12-07 at 12:26 +0000, Ragnar wrote:
> > > On mið, 2006-12-06 at 18:22 -0800, Jeff Davis wrote:
> > >
> > > > Interesting, I was just thinking about this today as well. I am thinking
> > > > it would be nice if we could:
> > > >
> > > > ALTER TABLE SET PRIMARY KEY INDEX foo_pkey;
> > > >
> > > > If it's already got a primary key we switch the primary key to be the
> > > > new primary key
> > > >
> > > > (throwing an error if the columns don't match up to the
> > > > existing primary key,
> > >
> > > not sure what you mean by this
> >
> > In my suggestion, if the table already has a primary key, then you can
> > only set the primary key index to be an index with exactly the same
> > columns as the existing primary key index.
>
> Why would you do that?
>
> I saw the use-case of when you have a primary key and a
> surrogate key , and decided you wanted the surrogate key to be the
> primary key after all, maybe because the
> natural key you had used turned out not to be a good
> candidate.
>

You've got a valid use-case, but it's completely different from the one
I suggested. I wanted to be able to build an index concurrently (with
the new functionality in 8.2) and then switch the primary key to use
that new index, and then drop the old index.

The reason is because that allows a 0-downtime index rebuild on a
primary key's index without losing it's primary key status.

I think all you need to do what you want is something like:
ALTER TABLE foo DROP CONSTRAINT foo_pkey KEEP INDEX;

Because then you could drop the primary key status on a column without
affecting the column or the index, then use my suggested syntax to
switch the primary key status to a different index like so:
ALTER TABLE foo SET PRIMARY KEY INDEX foo_othercolumn_index;

Regards,
    Jeff Davis

Re: Online index builds

From

Tom Lane

Date:

07 December 2006, 19:11:13

Jeff Davis <pgsql@j-davis.com> writes:
> I think all you need to do what you want is something like:
> ALTER TABLE foo DROP CONSTRAINT foo_pkey KEEP INDEX;

> Because then you could drop the primary key status on a column without
> affecting the column or the index, then use my suggested syntax to
> switch the primary key status to a different index like so:
> ALTER TABLE foo SET PRIMARY KEY INDEX foo_othercolumn_index;

That seems like an awful lot of uglification simply to let the index be
marked as "primary key" rather than just "unique".

            regards, tom lane

Re: Online index builds

From

Jeff Davis

Date:

07 December 2006, 19:34:26

On Thu, 2006-12-07 at 18:11 -0500, Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> > I think all you need to do what you want is something like:
> > ALTER TABLE foo DROP CONSTRAINT foo_pkey KEEP INDEX;
>
> > Because then you could drop the primary key status on a column without
> > affecting the column or the index, then use my suggested syntax to
> > switch the primary key status to a different index like so:
> > ALTER TABLE foo SET PRIMARY KEY INDEX foo_othercolumn_index;
>
> That seems like an awful lot of uglification simply to let the index be
> marked as "primary key" rather than just "unique".
>

Agreed. It's just a thought.

The reason it came to my mind is because some applications, like Slony,
use the primary key by default.

After reading through the archives, it looks like Gregory Stark
suggested a REINDEX CONCURRENTLY, which would certainly solve the
awkwardness of maintenance on a primary key. I didn't see much
objection, maybe it's worth consideration for 8.3?

Regards,
    Jeff Davis

Re: Online index builds

From

Tom Lane

Date:

07 December 2006, 19:51:49

Jeff Davis <pgsql@j-davis.com> writes:
> After reading through the archives, it looks like Gregory Stark
> suggested a REINDEX CONCURRENTLY, which would certainly solve the
> awkwardness of maintenance on a primary key. I didn't see much
> objection, maybe it's worth consideration for 8.3?

That idea was bounced on the grounds that it requires a DROP INDEX to
occur somewhere, and that can't be concurrent, and you'd surely not like
to go through all the work of a CONCURRENTLY rebuild only to get a
deadlock failure at the very end.

            regards, tom lane

Re: Online index builds

From

Ragnar

Date:

07 December 2006, 19:56:26

On fim, 2006-12-07 at 13:57 -0800, Jeff Davis wrote:
> On Thu, 2006-12-07 at 20:07 +0000, Ragnar wrote:
> > On fim, 2006-12-07 at 09:27 -0800, Jeff Davis wrote:
> > > On Thu, 2006-12-07 at 12:26 +0000, Ragnar wrote:
> > > > On mið, 2006-12-06 at 18:22 -0800, Jeff Davis wrote:
> > > >
> > > > > Interesting, I was just thinking about this today as well. I am thinking
> > > > > it would be nice if we could:
> > > > >
> > > > > ALTER TABLE SET PRIMARY KEY INDEX foo_pkey;
> > > > >
>
> You've got a valid use-case, but it's completely different from the one
> I suggested. I wanted to be able to build an index concurrently (with
> the new functionality in 8.2) and then switch the primary key to use
> that new index, and then drop the old index.
>
> The reason is because that allows a 0-downtime index rebuild on a
> primary key's index without losing it's primary key status.

my point was just that 'primary key' is really just
a property of a set of attributes, and it is just
incidental that postgres enforces this property
with an index.

so if if a  ALTER TABLE SET PRIMARY KEY is implemented,
it should involve a set of attributes, but not an index.

in your use case, the ALTER should not really be needed.
lets say you have PRIMARY KEY (a,b) on some table.
you decide you want to rebuild the primary key concurrently. just build
a new index on (a,b).
if you then drop the old index, the primary key constraint can still be
enforced by the new index, so
the DROP should be allowed to proceed, without affecting
the constraint.

on the other hand, the PRIMARY KEY property is really
only there because the standards say so, but does not
have a great value in my opinion, so the ability to
alter it would not be high on my priority lists.

gnari

Re: Online index builds

From

Jeff Davis

Date:

07 December 2006, 20:32:29

On Thu, 2006-12-07 at 18:51 -0500, Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> > After reading through the archives, it looks like Gregory Stark
> > suggested a REINDEX CONCURRENTLY, which would certainly solve the
> > awkwardness of maintenance on a primary key. I didn't see much
> > objection, maybe it's worth consideration for 8.3?
>
> That idea was bounced on the grounds that it requires a DROP INDEX to
> occur somewhere, and that can't be concurrent, and you'd surely not like
> to go through all the work of a CONCURRENTLY rebuild only to get a
> deadlock failure at the very end.
>

I don't understand. CREATE INDEX CONCURRENTLY can't be run in a
transaction block anyway, so a REINDEX CONCURRENTLY wouldn't either. So
how (or when) would you deadlock?

I see it as the following logical operations:
(1) CREATE INDEX CONCURRENTLY tmp;
(2) swap the relfilenode of the old index and new index
(3) DROP INDEX tmp;

If this was all already hashed out on -hackers, you can point me to the
discussion if it's easier.

Regards,
    Jeff Davis

Re: Online index builds

From

Tom Lane

Date:

07 December 2006, 20:44:18

Jeff Davis <pgsql@j-davis.com> writes:
> I don't understand. CREATE INDEX CONCURRENTLY can't be run in a
> transaction block anyway, so a REINDEX CONCURRENTLY wouldn't either. So
> how (or when) would you deadlock?

The problem is you need to upgrade from a nonexclusive table lock to an
exclusive one before you could drop the old index.  If someone else
is waiting to get a conflicting lock, boom ...

            regards, tom lane

Re: Online index builds

From

Jeff Davis

Date:

07 December 2006, 21:09:08

On Thu, 2006-12-07 at 19:44 -0500, Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> > I don't understand. CREATE INDEX CONCURRENTLY can't be run in a
> > transaction block anyway, so a REINDEX CONCURRENTLY wouldn't either. So
> > how (or when) would you deadlock?
>
> The problem is you need to upgrade from a nonexclusive table lock to an
> exclusive one before you could drop the old index.  If someone else
> is waiting to get a conflicting lock, boom ...
>

I think what I'm confused about is how these non-transactional commands
work (like VACUUM, etc). Are they still transactions, and just can't be
run in a block?

My original thinking was that the shared lock could be unlocked before
the exclusive lock is taken to switch the relfilenodes and to drop the
index. However, if it is a real transaction, clearly you can't unlock in
the middle.

Is it safe to manually run the sequence I previously suggested? If so it
seems like there could be a command to do it properly. I tried it and it
appeared to work.

Regards,
    Jeff Davis

Re: Online index builds

From

Bruce Momjian

Date:

12 December 2006, 16:33:56

Jeff Davis wrote:
> On Thu, 2006-12-07 at 18:11 -0500, Tom Lane wrote:
> > Jeff Davis <pgsql@j-davis.com> writes:
> > > I think all you need to do what you want is something like:
> > > ALTER TABLE foo DROP CONSTRAINT foo_pkey KEEP INDEX;
> >
> > > Because then you could drop the primary key status on a column without
> > > affecting the column or the index, then use my suggested syntax to
> > > switch the primary key status to a different index like so:
> > > ALTER TABLE foo SET PRIMARY KEY INDEX foo_othercolumn_index;
> >
> > That seems like an awful lot of uglification simply to let the index be
> > marked as "primary key" rather than just "unique".
> >
>
> Agreed. It's just a thought.
>
> The reason it came to my mind is because some applications, like Slony,
> use the primary key by default.
>
> After reading through the archives, it looks like Gregory Stark
> suggested a REINDEX CONCURRENTLY, which would certainly solve the
> awkwardness of maintenance on a primary key. I didn't see much
> objection, maybe it's worth consideration for 8.3?

Added to TODO:

    * Allow REINDEX CONCURRENTLY

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Online index builds

From

Tom Lane

Date:

12 December 2006, 19:09:07

Gregory Stark <stark@enterprisedb.com> writes:
> You could create a whole new index concurrently, then in a completely new
> (third) transaction drop the old one. The problem there is that there could be
> other things (namely foreign key constraints) depending on the old index.
> Fixing them all to depend on the new one may not be a problem or it may, I
> haven't thought it through. Nor have I thought through whether it would be
> possible to keep the original name.

If the idea is to do REINDEX CONCURRENTLY then ISTM you could just swap
the relfilenodes of the two indexes and then zap the new catalog entries
(and old index contents).  The problem is exactly the same as before,
though: you need exclusive lock to do that.

            regards, tom lane

Re: Online index builds

From

Jeff Davis

Date:

12 December 2006, 19:33:59

On Tue, 2006-12-12 at 18:08 -0500, Tom Lane wrote:
> Gregory Stark <stark@enterprisedb.com> writes:
> > You could create a whole new index concurrently, then in a completely new
> > (third) transaction drop the old one. The problem there is that there could be
> > other things (namely foreign key constraints) depending on the old index.
> > Fixing them all to depend on the new one may not be a problem or it may, I
> > haven't thought it through. Nor have I thought through whether it would be
> > possible to keep the original name.
>
> If the idea is to do REINDEX CONCURRENTLY then ISTM you could just swap
> the relfilenodes of the two indexes and then zap the new catalog entries
> (and old index contents).  The problem is exactly the same as before,
> though: you need exclusive lock to do that.
>

My point was that, because we can run it in multiple transactions, can't
we drop the nonexclusive lock before acquiring the exclusive lock,
thereby eliminating the possibility of losing the index we just made to
a deadlock?

In other words, why would the following not work:

CREATE UNIQUE INDEX CONCURRENTLY foo_pkey_tmp ON foo (id);
BEGIN;
UPDATE pg_class SET relfilenode=<relfilenode_of_foo_pkey> WHERE
relname='foo_pkey_tmp';
UPDATE pg_class SET relfilenode=<relfilenode_of_foo_pkey_tmp> WHERE
relname='foo_pkey';
COMMIT;
DROP INDEX foo_pkey_tmp;

Or is there something more sophisticated we need to do to swap the
relfilenodes?

Regards,
    Jeff Davis

Re: Online index builds

From

Gregory Stark

Date:

12 December 2006, 19:35:44

Jeff Davis <pgsql@j-davis.com> writes:

> I think what I'm confused about is how these non-transactional commands
> work (like VACUUM, etc). Are they still transactions, and just can't be
> run in a block?

In the case of CREATE INDEX CONCURRENTLY it can't be run in a transaction
block because it itself consists of two transactions. First it builds an
index, then it has to commit that and start a second transaction that
completes the index.

> My original thinking was that the shared lock could be unlocked before
> the exclusive lock is taken to switch the relfilenodes and to drop the
> index. However, if it is a real transaction, clearly you can't unlock in
> the middle.

Well you can't play games with the relfilenode if it's concurrent or else
other transactions executing inserts and updates won't be updating your new
index.

You could create a whole new index concurrently, then in a completely new
(third) transaction drop the old one. The problem there is that there could be
other things (namely foreign key constraints) depending on the old index.
Fixing them all to depend on the new one may not be a problem or it may, I
haven't thought it through. Nor have I thought through whether it would be
possible to keep the original name.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: Online index builds

From

Tom Lane

Date:

12 December 2006, 19:40:53

Jeff Davis <pgsql@j-davis.com> writes:
> My point was that, because we can run it in multiple transactions, can't
> we drop the nonexclusive lock before acquiring the exclusive lock,

No.  What happens if someone renames the table out from under you, to
mention just one possibility?  If you've been holding nonexclusive lock
for a long time (as you would've been) there's a nontrivial chance that
someone is already queued up for an exclusive lock and will get in
before you do.

            regards, tom lane

Re: Online index builds

From

Jeff Davis

Date:

12 December 2006, 20:04:34

On Tue, 2006-12-12 at 18:40 -0500, Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> > My point was that, because we can run it in multiple transactions, can't
> > we drop the nonexclusive lock before acquiring the exclusive lock,
>
> No.  What happens if someone renames the table out from under you, to
> mention just one possibility?  If you've been holding nonexclusive lock
> for a long time (as you would've been) there's a nontrivial chance that
> someone is already queued up for an exclusive lock and will get in
> before you do.
>

I'm trying to understand what would actually happen. I assume you mean
change the name of the index, because after we create the index
concurrently, it doesn't matter what the table name is.

(1) We create the new index concurrently
(2) someone gets an exclusive lock before we do, and they rename the old
index (foo_pkey is now known as bar_pkey).
(3) We don't find the index, throw an error, and have an extra index
hanging around. Same for any other situation that makes us unable to
continue in a well-defined way.

Even if we deleted the extra index on step 3, we could consider that
reasonable behavior because the user went out of their way to rename an
index with a concurrent REINDEX. They could then try again, albeit with
some wasted effort.

Even thinking about strange edge cases, like if they decide to use their
exclusive lock to swap the names of two indexes in step 2, we could
probably detect whether it was the same old index or not; perhaps by
remembering the relfilenode of the index we're REINDEXing.

Regards,
    Jeff Davis

Re: Online index builds

From

Tom Lane

Date:

12 December 2006, 20:13:42

Jeff Davis <pgsql@j-davis.com> writes:
> On Tue, 2006-12-12 at 18:40 -0500, Tom Lane wrote:
>> No.  What happens if someone renames the table out from under you, to
>> mention just one possibility?

> I'm trying to understand what would actually happen. I assume you mean
> change the name of the index, because after we create the index
> concurrently, it doesn't matter what the table name is.

Well, if you don't like that one, consider ALTER OWNER revoking your
privilege to perform the REINDEX.  Without an explicit check for the
case, the code would proceed to do it anyway.  (And even if it did
check, what then?  You don't really have the right anymore to undo what
you did so far, either.)

Yeah, we could add defenses one by one for the cases we could think of,
but I'd never feel very secure that we'd covered them all.

Another point here is that I think you are assuming that an OID is a
unique-for-all-time identifier for a table or index.  It's not; as soon
as someone drops the table or index, the OID is up for grabs and could
be re-used for an unrelated table or index.  Admittedly one would have
to be quite unlucky to get burnt that way, but deliberately introducing
race conditions in the name of convenience is not my idea of the way to
design a database.

            regards, tom lane

Re: Online index builds

From

Jeff Davis

Date:

12 December 2006, 20:39:04

On Tue, 2006-12-12 at 19:13 -0500, Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> > On Tue, 2006-12-12 at 18:40 -0500, Tom Lane wrote:
> >> No.  What happens if someone renames the table out from under you, to
> >> mention just one possibility?
>
> > I'm trying to understand what would actually happen. I assume you mean
> > change the name of the index, because after we create the index
> > concurrently, it doesn't matter what the table name is.
>
> Well, if you don't like that one, consider ALTER OWNER revoking your
> privilege to perform the REINDEX.  Without an explicit check for the
> case, the code would proceed to do it anyway.  (And even if it did
> check, what then?  You don't really have the right anymore to undo what
> you did so far, either.)
>
> Yeah, we could add defenses one by one for the cases we could think of,
> but I'd never feel very secure that we'd covered them all.
>

Ok, fair enough. I just wanted to make sure I understood the reason why
we couldn't (shouldn't?) do it.

> Another point here is that I think you are assuming that an OID is a
> unique-for-all-time identifier for a table or index.  It's not; as soon
> as someone drops the table or index, the OID is up for grabs and could
> be re-used for an unrelated table or index.  Admittedly one would have
> to be quite unlucky to get burnt that way, but deliberately introducing
> race conditions in the name of convenience is not my idea of the way to
> design a database.
>

It essentially does boil down to just convenience. In general we don't
have much ability to change primary key status for columns without
creating/dropping indexes non-concurrently. Admittedly, that isn't
important, but would be convenient.

Regards,
    Jeff Davis

Re: Online index builds

From

Csaba Nagy

Date:

13 December 2006, 06:06:08

> Yeah, we could add defenses one by one for the cases we could think of,
> but I'd never feel very secure that we'd covered them all.

What you all forget in this discussion is that reindexing concurrently
would have to be a highly administrative task, controlled by the DB
admin... so whoever has a big index to be reindexed can schedule it so
that no other schema changes occur to the table until the reindex is
finished.

So an implementation which optimistically builds the new index
concurrently while holding no lock, and then hopes for the 3rd
transaction to be able to get the exclusive lock and be able to swap the
new index in the place of the old index, and error out if it can't - it
is perfectly acceptable. The waisted effort when dropping the newly
created index on error is easily avoidable by not doing anything which
would cause an error in that phase... and it is easily controlled by the
DBA. The only thing needed is documentation to point it out.

I didn't understand completely the discussion here, and if there are
some problems detecting the error conditions in the index swap phase,
that's a problem... but if it is possible to reliably detect cases where
the swap is not possible because something changed in between, erroring
out will be acceptable for the purpose of this command...

Cheers,
Csaba.

Re: Online index builds

From

Ragnar

Date:

13 December 2006, 06:38:47

On mið, 2006-12-13 at 11:05 +0100, Csaba Nagy wrote:
> > Yeah, we could add defenses one by one for the cases we could think of,
> > but I'd never feel very secure that we'd covered them all.
>
> What you all forget in this discussion is that reindexing concurrently
> would have to be a highly administrative task, controlled by the DB
> admin... so whoever has a big index to be reindexed can schedule it so
> that no other schema changes occur to the table until the reindex is
> finished.

well, if this is a command that would nly be made manually by
an administrator, why do we need a separate command for this.

the DBA can just create a new index concurrently, and then
perform the DROP and rename in a transaction whenever he
thinks it is safe to take the exclusive lock needed for a
short while.

the only functionality missing compared to a REINDEX
CONCURRENTLY, is the handling of a PRIMARY key constraint
linked to the index while this happens, but that seems a
much simpler problem to solve separately.

gnari

Re: Online index builds

From

Tom Lane

Date:

13 December 2006, 12:13:04

Csaba Nagy <nagy@ecircle-ag.com> writes:
> So an implementation which optimistically builds the new index
> concurrently while holding no lock, and then hopes for the 3rd
> transaction to be able to get the exclusive lock and be able to swap the
> new index in the place of the old index, and error out if it can't - it
> is perfectly acceptable.

It would maybe be acceptable if there were a way to clean up the mess
after a failure, but there wouldn't be ...

            regards, tom lane

Re: Online index builds

From

Csaba Nagy

Date:

13 December 2006, 12:38:46

On Wed, 2006-12-13 at 17:12, Tom Lane wrote:
> Csaba Nagy <nagy@ecircle-ag.com> writes:
> > So an implementation which optimistically builds the new index
> > concurrently while holding no lock, and then hopes for the 3rd
> > transaction to be able to get the exclusive lock and be able to swap the
> > new index in the place of the old index, and error out if it can't - it
> > is perfectly acceptable.
>
> It would maybe be acceptable if there were a way to clean up the mess
> after a failure, but there wouldn't be ...

With the "mess" you refer to the new index, and the fact it is
impossible to delete it if not possible to replace the old one ? I fail
to see why... you WILL get an exclusive lock, so you should be able to
delete the index. The deadlock is not an issue if you release first the
shared locks you hold...

If "mess" means that it's impossible to tell that you can or can't
safely replace the index, then that's a problem, but I think the
scenarios you thought out and would break things are detectable, right ?
Then you: take the exclusive lock, check if you can still safely replace
the index, do it if yes, delete the new index otherwise or on failure to
swap (to cover unexpected cases). If you can't delete the new index
cause somebody changed it in the meantime (that must be a really strange
corner case), then bad luck, nobody is supposed to do that...

While I'm not familiar enough with how postgres handles locking,
wouldn't be also possible for DDLs to first also acquire a lock which
would only lock other DDLs and not DMLs ? In that case you could get
that lock first and hold it through the second phase, and make the
second phase also swap the indexes after also acquiring the full
exclusive lock. That could potentially still deadlock, but the chance to
do so would be a lot smaller.

I think the above is not clear enough... what I mean is to make all DDLs
get 2 locks:

 - first an "DDL exclusive" lock which blocks other DDLs from getting
the same;
 - second a full exclusive lock which blocks any other locks;

Between the 2 there could go some operation which is not blocking normal
operation but needs protection from other concurrent DDL. If only DDLs
do this and always in this order, there's no deadlock potential.
Disadvantage is the very need to place one more lock...

Cheers,
Csaba.

Re: Online index builds

From

Tom Lane

Date:

14 December 2006, 11:30:18

Csaba Nagy <nagy@ecircle-ag.com> writes:
> With the "mess" you refer to the new index, and the fact it is
> impossible to delete it if not possible to replace the old one ? I fail
> to see why... you WILL get an exclusive lock, so you should be able to
> delete the index.

Consider the ALTER OWNER example ...

            regards, tom lane