Thread: CREATE INDEX CONCURRENTLY?

CREATE INDEX CONCURRENTLY?

From
Mark Woodward
Date:
I have not kept up with PostgreSQL changes and have just been using it. A co-worker recently told me that you need to word "CONCURRENTLY" in "CREATE INDEX" to avoid table locking. I called BS on this because to my knowledge PostgreSQL does not lock tables. I referenced this page in the documentation:

http://www.postgresql.org/docs/9.3/static/locking-indexes.html

However, I do see this sentence in the indexing page that was not in the docs prior to 8.0:

"Creating an index can interfere with regular operation of a database. Normally PostgreSQL locks the table to be indexed against writes and performs the entire index build with a single scan of the table."

Is this true? When/why the change?

When we use "concurrently," it seems to hang. I am looking into it.


Re: CREATE INDEX CONCURRENTLY?

From
Andrew Dunstan
Date:
On 10/31/2014 10:28 AM, Mark Woodward wrote:
> I have not kept up with PostgreSQL changes and have just been using 
> it. A co-worker recently told me that you need to word "CONCURRENTLY" 
> in "CREATE INDEX" to avoid table locking. I called BS on this because 
> to my knowledge PostgreSQL does not lock tables. I referenced this 
> page in the documentation:
>
> http://www.postgresql.org/docs/9.3/static/locking-indexes.html

That page refers to using the indexes, not creating them.

>
> However, I do see this sentence in the indexing page that was not in 
> the docs prior to 8.0:
>
> "Creating an index can interfere with regular operation of a database. 
> Normally PostgreSQL locks the table to be indexed against writes and 
> performs the entire index build with a single scan of the table."
>
> Is this true? When/why the change?
>
> When we use "concurrently," it seems to hang. I am looking into it.
>
>


Creating indexes always did lock tables. See for example 
http://www.postgresql.org/docs/7.4/static/explicit-locking.html#LOCKING-TABLES 
there CREATE INDEX is documented to take a SHARE lock on the table.

CONCURRENTLY was an additional feature to allow you to get around this, 
at the possible cost of some extra processing.

So we haven't made things harder, we've made them easier, and your 
understanding of old releases is incorrect.

cheers

andrew



Re: CREATE INDEX CONCURRENTLY?

From
Greg Stark
Date:
On Fri, Oct 31, 2014 at 2:28 PM, Mark Woodward
<mark.woodward@actifio.com> wrote:
> I have not kept up with PostgreSQL changes and have just been using it. A
> co-worker recently told me that you need to word "CONCURRENTLY" in "CREATE
> INDEX" to avoid table locking. I called BS on this because to my knowledge
> PostgreSQL does not lock tables. I referenced this page in the
> documentation:


You can read from tables while a normal index build is in progress but
you can't insert, update, or delete from them. CREATE INDEX
CONCURRENTLY allows you to insert, update, and delete data while the
index build is running at the expense of having the index build take
longer.

-- 
greg



Re: CREATE INDEX CONCURRENTLY?

From
Michael Banck
Date:
Am Freitag, den 31.10.2014, 14:43 +0000 schrieb Greg Stark:
> On Fri, Oct 31, 2014 at 2:28 PM, Mark Woodward
> <mark.woodward@actifio.com> wrote:
> > I have not kept up with PostgreSQL changes and have just been using it. A
> > co-worker recently told me that you need to word "CONCURRENTLY" in "CREATE
> > INDEX" to avoid table locking. I called BS on this because to my knowledge
> > PostgreSQL does not lock tables. I referenced this page in the
> > documentation:
> 
> You can read from tables while a normal index build is in progress but
> you can't insert, update, or delete from them. CREATE INDEX
> CONCURRENTLY allows you to insert, update, and delete data while the
> index build is running at the expense of having the index build take
> longer.

I believe there is one caveat: If there is an idle-in-transaction
backend from before the start of CREATE INDEX CONCURRENTLY, it can hold
up the index creation indefinitely as long as it doesn't commit.

src/backend/access/heap/README.HOT mentions this WRT CIC: "Then we wait
until every transaction that could have a snapshot older than the second
reference snapshot is finished.  This ensures that nobody is alive any
longer who could need to see any tuples that might be missing from the
index, as well as ensuring that no one can see any inconsistent rows in
a broken HOT chain (the first condition is stronger than the second)."

I have seen CIC stall at clients when there were (seemlingy) unrelated
idle-in-transactions open (their locks even touching only other
schemas). I believe it depends on the specific locks that the other
backend acquired, but at least with a DECLARE CURSOR I can trivially
reproduce it:

first session:

postgres=# CREATE SCHEMA foo1;
CREATE SCHEMA
postgres=# CREATE TABLE foo1.foo1 (id int);
CREATE TABLE
postgres=# CREATE SCHEMA foo2;
CREATE SCHEMA
postgres=# CREATE TABLE foo2.foo2 (id int);
CREATE TABLE

second session:

postgres=# BEGIN; DECLARE c1 CURSOR FOR SELECT * FROM foo1.foo1;
BEGIN
DECLARE CURSOR

first session:

postgres=# CREATE INDEX CONCURRENTLY ixfoo2 ON foo2.foo2(id);
(hangs)

I wonder whether that is pilot error (fair enough), or whether something
could be done about this?


Michael

-- 
Michael Banck
Projektleiter / Berater
Tel.: +49 (2161) 4643-171
Fax:  +49 (2161) 4643-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer




Re: CREATE INDEX CONCURRENTLY?

From
Simon Riggs
Date:
On 31 October 2014 17:46, Michael Banck <michael.banck@credativ.de> wrote:

> I wonder whether that is pilot error (fair enough), or whether something
> could be done about this?

When originally written the constraints were tighter, but have since
been relaxed.

Even so a CIC waits until all snapshots that can see it have gone. So
what you observe is correct and known.


Can it be changed? Maybe.

CREATE INDEX gets around the wait by using indcheckxmin to see whether
the row is usable. So the command completes, even if the index is not
usable by all current sessions.

We perform the wait in a completely different way for CIC, for this
reason (in comments)
 We also need not set indcheckxmin during a concurrent index build, because we won't set indisvalid true until all
transactionsthat care about the broken HOT chains are gone.
 

Reading that again, I can't see why we do it that way. If CREATE INDEX
can exit once the index is built, so could CONCURRENTLY.

ISTM that we could indcheckxmin into an Xid, not a boolean  For CREATE INDEX, set the indcheckxmin = xid of creating
transaction For CREATE INDEX CONCURRENTLY set the indcheckxmin = xid of the
 
completing transaction

The apparent reason it does this is that the Xmin value used currently
is the Xmin of the index row. The index row is inserted prior to the
index being valid so that technique cannot work. So I am suggesting
for CIC that we use the xid of the transaction that completes the
index, not the xid that originally created the index row. Plus handle
the difference between valid and not.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: CREATE INDEX CONCURRENTLY?

From
Tim Kane
Date:
This just hit us today... Admittedly on an old cluster still running 9.2, though I can't see any mention of it being
addressedsince.<br /><br />Any chance of getting this on to to-do list?<br /><div class="gmail_quote"><div dir="ltr">On
Sat,1 Nov 2014 at 07:45, Simon Riggs <<a href="mailto:simon@2ndquadrant.com">simon@2ndquadrant.com</a>> wrote:<br
/></div><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 31
October2014 17:46, Michael Banck <<a href="mailto:michael.banck@credativ.de"
target="_blank">michael.banck@credativ.de</a>>wrote:<br /><br /> > I wonder whether that is pilot error (fair
enough),or whether something<br /> > could be done about this?<br /><br /> When originally written the constraints
weretighter, but have since<br /> been relaxed.<br /><br /> Even so a CIC waits until all snapshots that can see it
havegone. So<br /> what you observe is correct and known.<br /><br /><br /> Can it be changed? Maybe.<br /><br />
CREATEINDEX gets around the wait by using indcheckxmin to see whether<br /> the row is usable. So the command
completes,even if the index is not<br /> usable by all current sessions.<br /><br /> We perform the wait in a
completelydifferent way for CIC, for this<br /> reason (in comments)<br /><br />   We also need not set indcheckxmin
duringa concurrent index build,<br />   because we won't set indisvalid true until all transactions that care<br />  
aboutthe broken HOT chains are gone.<br /><br /> Reading that again, I can't see why we do it that way. If CREATE
INDEX<br/> can exit once the index is built, so could CONCURRENTLY.<br /><br /> ISTM that we could indcheckxmin into an
Xid,not a boolean<br />    For CREATE INDEX, set the indcheckxmin = xid of creating transaction<br />    For CREATE
INDEXCONCURRENTLY set the indcheckxmin = xid of the<br /> completing transaction<br /><br /> The apparent reason it
doesthis is that the Xmin value used currently<br /> is the Xmin of the index row. The index row is inserted prior to
the<br/> index being valid so that technique cannot work. So I am suggesting<br /> for CIC that we use the xid of the
transactionthat completes the<br /> index, not the xid that originally created the index row. Plus handle<br /> the
differencebetween valid and not.<br /><br /> --<br />  Simon Riggs                   <a
href="http://www.2ndQuadrant.com/"rel="noreferrer" target="_blank">http://www.2ndQuadrant.com/</a><br />  PostgreSQL
Development,24x7 Support, Training & Services<br /><br /><br /> --<br /> Sent via pgsql-hackers mailing list (<a
href="mailto:pgsql-hackers@postgresql.org"target="_blank">pgsql-hackers@postgresql.org</a>)<br /> To make changes to
yoursubscription:<br /><a href="http://www.postgresql.org/mailpref/pgsql-hackers" rel="noreferrer"
target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/></blockquote></div>