Thread: BUG #5626: Parallel pg_restore fails with "tuple concurrently updated"

BUG #5626: Parallel pg_restore fails with "tuple concurrently updated"

From
"Albert Ullrich"
Date:
The following bug has been logged online:

Bug reference:      5626
Logged by:          Albert Ullrich
Email address:      aullrich@blackducksoftware.com
PostgreSQL version: 8.4.4
Operating system:   Centos 5.5 64bit
Description:        Parallel pg_restore fails with "tuple concurrently
updated"
Details:

pg_restore -e -v -j 4 -Fc -L /tmp/fp_basic.toc -d fp_basic
/tmp/fp_basic.dump
pg_restore: connecting to database for restore
pg_restore: processing item 5 SCHEMA file_tables
pg_restore: creating SCHEMA file_tables
pg_restore: processing item 7 SCHEMA scratch
pg_restore: creating SCHEMA scratch
pg_restore: processing item 27 FUNCTION get_file_tab_id(text)
pg_restore: creating FUNCTION get_file_tab_id(text)
pg_restore: processing item 1578 TABLE file_table
pg_restore: creating TABLE file_table
pg_restore: processing item 1607 SEQUENCE scratch_table_id
pg_restore: creating SEQUENCE scratch_table_id
pg_restore: processing item 1953 SEQUENCE SET scratch_table_id
pg_restore: executing SEQUENCE SET scratch_table_id
pg_restore: processing item 1608 TABLE scratch_tablespace
pg_restore: creating TABLE scratch_tablespace
pg_restore: entering main parallel loop
pg_restore: launching item 1915 TABLE DATA file_table
pg_restore: launching item 1941 TABLE DATA scratch_tablespace
pg_restore: skipping item 1942 ENCODING ENCODING
pg_restore: skipping item 1943 STDSTRINGS STDSTRINGS
pg_restore: skipping item 1944 DATABASE fp_basic
pg_restore: skipping item 8 SCHEMA public
pg_restore: skipping item 1946 ACL public
pg_restore: launching item 35 FUNCTION removefilefromdb(integer)
pg_restore: launching item 1579 TABLE files
pg_restore: restoring data for table "file_table"
pg_restore: restoring data for table "scratch_tablespace"
pg_restore: creating FUNCTION removefilefromdb(integer)
pg_restore: creating TABLE files
pg_restore: finished item 1579 TABLE files
pg_restore: launching item 1580 TABLE files_0
pg_restore: creating TABLE files_0
pg_restore: finished item 35 FUNCTION removefilefromdb(integer)
pg_restore: launching item 1581 TABLE files_1
pg_restore: creating TABLE files_1
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 1581; 1259 153051 TABLE
files_1 blackduck
pg_restore: [archiver (db)] could not execute query: ERROR:  tuple
concurrently updated
    Command was:
CREATE TABLE files_1 (
)
INHERITS (public.files);
pg_restore: *** aborted because of error
pg_restore: finished item 1581 TABLE files_1
pg_restore: [archiver] worker process failed: exit code 1
pg_restore: *** aborted because of error
"Albert Ullrich" <aullrich@blackducksoftware.com> writes:
> Description:        Parallel pg_restore fails with "tuple concurrently
> updated"

> pg_restore -e -v -j 4 -Fc -L /tmp/fp_basic.toc -d fp_basic
> /tmp/fp_basic.dump

Apparently you've used the -L option to reorder the dump objects in a way
that won't work with parallel restore.  On the whole I don't recommend
trying to use -L with parallel restore at all, but if you must do it,
it's your responsibility to choose a safe order.  Basically, you had
better keep all the PRE_DATA objects ahead of the DATA objects, and
those ahead of POST_DATA objects.

Did you have a specific reason for not wanting to let parallel restore
choose the restore order for itself?

            regards, tom lane

Re: BUG #5626: Parallel pg_restore fails with "tuple concurrently updated"

From
Albert Ullrich
Date:
We run essentially the following commands to create the table of contents i=
n order to prevent pg_restore from failing:
pg_restore -l database.dump | \
eval fgrep -v -e "' SCHEMA - public '" \
      -e "' COMMENT - SCHEMA public '" \
      -e "' PROCEDURAL LANGUAGE - plpgsql'" database.toc

Where would the reordering happen?

Thanks,

A. Ullrich

On 8/19/10 3:59 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

"Albert Ullrich" <aullrich@blackducksoftware.com> writes:
> Description:        Parallel pg_restore fails with "tuple concurrently
> updated"

> pg_restore -e -v -j 4 -Fc -L /tmp/fp_basic.toc -d fp_basic
> /tmp/fp_basic.dump

Apparently you've used the -L option to reorder the dump objects in a way
that won't work with parallel restore.  On the whole I don't recommend
trying to use -L with parallel restore at all, but if you must do it,
it's your responsibility to choose a safe order.  Basically, you had
better keep all the PRE_DATA objects ahead of the DATA objects, and
those ahead of POST_DATA objects.

Did you have a specific reason for not wanting to let parallel restore
choose the restore order for itself?

                        regards, tom lane
Albert Ullrich <aullrich@blackducksoftware.com> writes:
> We run essentially the following commands to create the table of contents in order to prevent pg_restore from
failing:
> pg_restore -l database.dump | \
> eval fgrep -v -e "' SCHEMA - public '" \
>       -e "' COMMENT - SCHEMA public '" \
>       -e "' PROCEDURAL LANGUAGE - plpgsql'" database.toc

Mph ... removing the public schema from the restore list is problematic,
because you've got a lot of stuff *in* the public schema, and of course
all that stuff depends on the public schema entry.  Normally this
doesn't bother pg_restore because it just blindly restores in the order
you tell it, without paying much attention to the dependency entries.
However, in parallel restore mode it does believe the dependencies,
and the fact that you've got lots of entries that depend on something
not to be restored screws it up.

We should probably try to make pg_restore smarter about this case,
but for the moment my advice remains: don't use -L with parallel
restore.  It appears to me that you're trying to avoid running
pg_restore as superuser, which no doubt seems a bit safer, but
it's not a terribly well-tested path.

            regards, tom lane

Re: BUG #5626: Parallel pg_restore fails with "tuple concurrently updated"

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Mph ... removing the public schema from the restore list is problematic,
> because you've got a lot of stuff *in* the public schema, and of course
> all that stuff depends on the public schema entry.  Normally this
> doesn't bother pg_restore because it just blindly restores in the order
> you tell it, without paying much attention to the dependency entries.

The problem here, to some extent, is that 'public' is where everyone
dumps their favorite contrib functions (classic example here being
PostGIS).  I just ran into this during an 8.3->8.4 upgrade yesterday.  I
installed the new PostGIS on 8.4 and didn't need/want the old PostGIS to
be copied over from the 8.3 instance.  In that case I wasn't trying
parallel restore, but there are certainly cases where I'll want to..

> We should probably try to make pg_restore smarter about this case,

Yes, definitely.  I don't have an immediate solution though,
unfortunately.  Would be kind of neat if pg_restore could connect to the
NEW database and determine if certain things exist which are needed
dependencies...  That's a whole lot of rather complex work though.

    Thanks,

        Stephen
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> We should probably try to make pg_restore smarter about this case,

> Yes, definitely.  I don't have an immediate solution though,

I just posted some further analysis to pgsql-hackers.  Please follow up
there.

            regards, tom lane
I wrote:
> We should probably try to make pg_restore smarter about this case,

I've applied a patch for this:
http://archives.postgresql.org/pgsql-committers/2010-08/msg00271.php

            regards, tom lane

Re: BUG #5626: Parallel pg_restore fails with "tuple concurrently updated"

From
Albert Ullrich
Date:
I followed your advise, Tom and reworked the way we do dumps and restores t=
o remove the requirement for having TOCs.

So far the restores executed flawlessly and I am grateful for having this f=
eature available to us! Huge time savings!

Thanks!

Albert


On 8/20/10 1:24 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Albert Ullrich <aullrich@blackducksoftware.com> writes:
> We run essentially the following commands to create the table of contents=
 in order to prevent pg_restore from failing:
> pg_restore -l database.dump | \
> eval fgrep -v -e "' SCHEMA - public '" \
>       -e "' COMMENT - SCHEMA public '" \
>       -e "' PROCEDURAL LANGUAGE - plpgsql'" database.toc

Mph ... removing the public schema from the restore list is problematic,
because you've got a lot of stuff *in* the public schema, and of course
all that stuff depends on the public schema entry.  Normally this
doesn't bother pg_restore because it just blindly restores in the order
you tell it, without paying much attention to the dependency entries.
However, in parallel restore mode it does believe the dependencies,
and the fact that you've got lots of entries that depend on something
not to be restored screws it up.

We should probably try to make pg_restore smarter about this case,
but for the moment my advice remains: don't use -L with parallel
restore.  It appears to me that you're trying to avoid running
pg_restore as superuser, which no doubt seems a bit safer, but
it's not a terribly well-tested path.

                        regards, tom lane


A. Ullrich
Director, Quality Assurance and Infrastructure
Black Duck Software, Inc.

aullrich@blackducksoftware.com
T +1.781.810.2092
C +1.781.405.0780
F +1.781.891.5145
http://www.blackducksoftware.com