Thread: Re: [GENERAL] Concurrency problem building indexes

Re: [GENERAL] Concurrency problem building indexes

From
"Jim C. Nasby"
Date:
moving to -hackers

On Sun, Apr 23, 2006 at 09:06:59AM -0500, Wes wrote:
> Postgres: 8.1.3
> OS: Mac OS X 10.4.6
> 
> I've run into another concurrency issue - parallel building of indexes.
> When I try to build multiple indexes at a time, I randomly get:
> 
>   ERROR:  tuple concurrently updated
> 
> The following thread talks about this, but there is no answer.
> 
>   <http://archives.postgresql.org/pgsql-hackers/2002-07/msg00969.php>
> 
> How can I safely build indexes in parallel?
> 
> At this point, I'm only trying to build two at a time.  I will be building
> indexes for tables with any where from a few rows to 100 million rows on a
> daily basis - I need to maximize performance.

Since this seems to only be an issue due to trying to update pg_class
for the table, perhaps CREATE INDEX can just ignore errors there?
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: [GENERAL] Concurrency problem building indexes

From
Tom Lane
Date:
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> Since this seems to only be an issue due to trying to update pg_class
> for the table, perhaps CREATE INDEX can just ignore errors there?

Lessee, where would ignoring an error potentially cause the greatest
damage?  I can hardly think of a less critical catalog than pg_class :-(
        regards, tom lane


Re: [GENERAL] Concurrency problem building indexes

From
"Jim C. Nasby"
Date:
On Mon, Apr 24, 2006 at 08:14:33PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > Since this seems to only be an issue due to trying to update pg_class
> > for the table, perhaps CREATE INDEX can just ignore errors there?
> 
> Lessee, where would ignoring an error potentially cause the greatest
> damage?  I can hardly think of a less critical catalog than pg_class :-(

Sorry, should have been more specific... as I understand it, the update
is just to set pg_class.relpages for the heap, which shouldn't be
critical.

Was the code ever changed so that it won't update relpages if the number
is the same?
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: [GENERAL] Concurrency problem building indexes

From
Tom Lane
Date:
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> Was the code ever changed so that it won't update relpages if the number
> is the same?

Long ago.  I just tested it again, and AFAICS you can create indexes
concurrently so long as the underlying table isn't changing (ie,
neither reltuples nor relpages changes).
        regards, tom lane


Re: [GENERAL] Concurrency problem building indexes

From
"Jim C. Nasby"
Date:
On Mon, Apr 24, 2006 at 08:42:41PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > Was the code ever changed so that it won't update relpages if the number
> > is the same?
> 
> Long ago.  I just tested it again, and AFAICS you can create indexes
> concurrently so long as the underlying table isn't changing (ie,
> neither reltuples nor relpages changes).

Hrm, the OP seemed to find a case that was having problems:
http://archives.postgresql.org/pgsql-general/2006-04/msg01009.php

Of course it's possible that he's getting that error from an entirely
different section of code, or that this is now only an issue if you're
doing a lot of indexing at once...
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: [GENERAL] Concurrency problem building indexes

From
Wes
Date:
On 4/24/06 7:54 PM, "Jim C. Nasby" <jnasby@pervasive.com> wrote:

>> Long ago.  I just tested it again, and AFAICS you can create indexes
>> concurrently so long as the underlying table isn't changing (ie,
>> neither reltuples nor relpages changes).
> 
> Hrm, the OP seemed to find a case that was having problems:
> http://archives.postgresql.org/pgsql-general/2006-04/msg01009.php
> 
> Of course it's possible that he's getting that error from an entirely
> different section of code, or that this is now only an issue if you're
> doing a lot of indexing at once...

I don't think there's much chance it's other code.  The index build is a
standalone operation in an external script that uses psql (so it is easy to
tailor).  This script is called as the last statement before the main
program exit.  It does:
 fork   create index1   create index2   exit
 fork   create index3   create index4   exit
 wait for termination exit

As my test tables are small and indexing happens almost instantly, I put a
'sleep' after one of the forks so that the two don't complete at the same
time.  When I do that, I don't get the errors.

Wes







Re: [GENERAL] Concurrency problem building indexes

From
Wes
Date:
>> Long ago.  I just tested it again, and AFAICS you can create indexes
>> concurrently so long as the underlying table isn't changing (ie,
>> neither reltuples nor relpages changes).
> 
> Hrm, the OP seemed to find a case that was having problems:
> http://archives.postgresql.org/pgsql-general/2006-04/msg01009.php
> 
> Of course it's possible that he's getting that error from an entirely
> different section of code, or that this is now only an issue if you're
> doing a lot of indexing at once...

I just verified using 'ps' that there are no other open connections when the
index builds are running.  I tried somewhat bigger test tables (a few rows
to a little over a hundred thousand).  I can duplicate the error at will.

Wes




Re: [GENERAL] Concurrency problem building indexes

From
Tom Lane
Date:
Wes <wespvp@syntegra.com> writes:
> I can duplicate the error at will.

[ shrug... ]  Worksforme.

There is a short interval at the end of the first CREATE INDEX on the
table where the problem would happen if another CREATE INDEX tries to
modify the pg_class row before the first one's committed.  That would be
hard to hit with any regularity though.  Subsequent CREATE INDEXes after
that should be completely reliable.  If you are modifying the table
while it's being indexed, then the same small window would apply for
each CREATE INDEX not just the first ... but you didn't admit to that.

Care to put together a self-contained test case?
        regards, tom lane


Re: [GENERAL] Concurrency problem building indexes

From
Wes
Date:
On 4/24/06 11:02 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> [ shrug... ]  Worksforme.
> 
> There is a short interval at the end of the first CREATE INDEX on the
> table where the problem would happen if another CREATE INDEX tries to
> modify the pg_class row before the first one's committed.  That would be
> hard to hit with any regularity though.  Subsequent CREATE INDEXes after
> that should be completely reliable.  If you are modifying the table
> while it's being indexed, then the same small window would apply for
> each CREATE INDEX not just the first ... but you didn't admit to that.
> 
> Care to put together a self-contained test case?

I think I've got a reasonably small test case I can send you in the morning.
I did a pg_dumpall and removed the index creation commands.  The first time
I run the index build, I usually get at least one occurrence.

Where do you want me to send it to?

Of course, since your hardware is different, it may not show up since it
appears to be a timing thing..  I'm on a PB G4 1Ghz.

Wes




Re: [GENERAL] Concurrency problem building indexes

From
Tom Lane
Date:
Wes <wespvp@syntegra.com> writes:
> Where do you want me to send it to?

pgsql-bugs would be appropriate.

> Of course, since your hardware is different, it may not show up since it
> appears to be a timing thing..  I'm on a PB G4 1Ghz.

My G4 is in the shop at the moment, but Apple promised it back by Friday.
I kinda doubt it's *that* platform specific though.
        regards, tom lane


Re: [GENERAL] Concurrency problem building indexes

From
Wes
Date:
On 4/25/06 1:01 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

>> Where do you want me to send it to?
> 
> pgsql-bugs would be appropriate.

It's not that small that I would want to post it to a list.  Also, I think
I've sanitized the data, but I wouldn't want to post it on a public list.

Or are you just looking for the index build script?

>> Of course, since your hardware is different, it may not show up since it
>> appears to be a timing thing..  I'm on a PB G4 1Ghz.
> 
> My G4 is in the shop at the moment, but Apple promised it back by Friday.
> I kinda doubt it's *that* platform specific though.

You never know...  CPU speed, hard drive speed (slow), etc.

Wes




Re: [GENERAL] Concurrency problem building indexes

From
Tom Lane
Date:
Wes <wespvp@syntegra.com> writes:
> On 4/25/06 1:01 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>>> Where do you want me to send it to?
>> 
>> pgsql-bugs would be appropriate.

> It's not that small that I would want to post it to a list.  Also, I think
> I've sanitized the data, but I wouldn't want to post it on a public list.

So invent some made-up data.  I'd be seriously surprised if this
behavior has anything to do with the precise data being indexed.
Experiment around till you've got something you don't mind posting
that exhibits the behavior you see.
        regards, tom lane