Thread: 9.2beta1 regression: pg_restore --data-only does not set sequence values any more
9.2beta1 regression: pg_restore --data-only does not set sequence values any more
From
Martin Pitt
Date:
Hello all, while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common test suite noticed a regression: It seems that pg_restore --data-only now skips the current value of sequences, so that in the upgraded database the sequence counter is back to the default. This can lead to rather serious data errors. Reproducer:=20 * Create a db and sequence: $ createdb test $ psql test -c "CREATE SEQUENCE odd10 INCREMENT BY 2 MINVALUE 1 MAXVALUE = 10 CYCLE" CREATE SEQUENCE =20 * Advance it two steps: $ psql --cluster 9.2/main -Atc "SELECT nextval('odd10')" test 1 $ psql --cluster 9.2/main -Atc "SELECT nextval('odd10')" test 3 * Dump schema and "test" data: $ pg_dumpall -s > /tmp/schema $ pg_dump -Fc test > /tmp/test.dump * Drop db: =20 $ dropdb test * Restore DB: $ psql template1 < /tmp/schema=20 $ pg_restore --data-only -d postgres /tmp/test.dump=20 * Check the counter: $ psql -Atc "SELECT nextval('odd10')" test 1 This is wrong, it should be "5", from the original database. This worked all the way up to 9.1. With "pg_restore /tmp/test.dump" (i. e. a full dump), you see SELECT pg_catalog.setval('odd10', 3, true); in the dump, but it is not present in the output of "pg_restore --data-only /tmp/test.dump". However, when I use the 9.1 version, it works: $ /usr/lib/postgresql/9.1/bin/pg_restore --data-only /tmp/test.dump | = grep setval SELECT pg_catalog.setval('odd10', 3, true); Thanks! Martin --=20 Martin Pitt | http://www.piware.de Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Re: 9.2beta1 regression: pg_restore --data-only does not set sequence values any more
From
Tom Lane
Date:
Martin Pitt <mpitt@debian.org> writes: > while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common > test suite noticed a regression: It seems that pg_restore --data-only > now skips the current value of sequences, so that in the upgraded > database the sequence counter is back to the default. I believe this is a consequence of commit a4cd6abcc901c1a8009c62a27f78696717bb8fe1, which introduced the entirely false assumption that --schema-only and --data-only have something to do with the order that entries appear in the archive ... regards, tom lane
Re: 9.2beta1 regression: pg_restore --data-only does not set sequence values any more
From
Andrew Dunstan
Date:
On Wed, May 16, 2012 at 9:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Martin Pitt <mpitt@debian.org> writes: > > while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common > > test suite noticed a regression: It seems that pg_restore --data-only > > now skips the current value of sequences, so that in the upgraded > > database the sequence counter is back to the default. > > I believe this is a consequence of commit > a4cd6abcc901c1a8009c62a27f78696717bb8fe1, which introduced the entirely > false assumption that --schema-only and --data-only have something to > do with the order that entries appear in the archive ... > > > Darn, will investigate. cheers andrew
Re: 9.2beta1 regression: pg_restore --data-only does not set sequence values any more
From
Andrew Dunstan
Date:
On 05/16/2012 10:23 AM, Andrew Dunstan wrote: > > > On Wed, May 16, 2012 at 9:08 AM, Tom Lane <tgl@sss.pgh.pa.us > <mailto:tgl@sss.pgh.pa.us>> wrote: > > Martin Pitt <mpitt@debian.org <mailto:mpitt@debian.org>> writes: > > while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common > > test suite noticed a regression: It seems that pg_restore > --data-only > > now skips the current value of sequences, so that in the upgraded > > database the sequence counter is back to the default. > > I believe this is a consequence of commit > a4cd6abcc901c1a8009c62a27f78696717bb8fe1, which introduced the > entirely > false assumption that --schema-only and --data-only have something to > do with the order that entries appear in the archive ... > > > > Darn, will investigate. > > [cc -hackers] Well, the trouble is that we have these pesky SECTION_NONE entries for things like comments, security labels and ACLs that need to be dumped in the right section, so we can't totally ignore the order. But we could (and probably should) ignore the order for making decisions about everything BUT those entries. So, here's a revised plan: --section=data will dump exactly TABLE DATA, SEQUENCE SET or BLOBS entries --section=pre-data will dump SECTION_PRE_DATA items (other than SEQUENCE SET) plus any immediately following SECTION_NONE items. --section=post-data will dump everything else. Comments? cheers andrew
Re: [HACKERS] Re: 9.2beta1 regression: pg_restore --data-only does not set sequence values any more
From
Andrew Dunstan
Date:
On 05/21/2012 02:59 PM, Andrew Dunstan wrote: > > > On 05/16/2012 10:23 AM, Andrew Dunstan wrote: >> >> >> On Wed, May 16, 2012 at 9:08 AM, Tom Lane <tgl@sss.pgh.pa.us >> <mailto:tgl@sss.pgh.pa.us>> wrote: >> >> Martin Pitt <mpitt@debian.org <mailto:mpitt@debian.org>> writes: >> > while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common >> > test suite noticed a regression: It seems that pg_restore >> --data-only >> > now skips the current value of sequences, so that in the upgraded >> > database the sequence counter is back to the default. >> >> I believe this is a consequence of commit >> a4cd6abcc901c1a8009c62a27f78696717bb8fe1, which introduced the >> entirely >> false assumption that --schema-only and --data-only have >> something to >> do with the order that entries appear in the archive ... >> >> >> >> Darn, will investigate. >> >> > > [cc -hackers] > > Well, the trouble is that we have these pesky SECTION_NONE entries for > things like comments, security labels and ACLs that need to be dumped > in the right section, so we can't totally ignore the order. But we > could (and probably should) ignore the order for making decisions > about everything BUT those entries. > > So, here's a revised plan: > > --section=data will dump exactly TABLE DATA, SEQUENCE SET or BLOBS > entries > --section=pre-data will dump SECTION_PRE_DATA items (other than > SEQUENCE SET) plus any immediately following SECTION_NONE items. > --section=post-data will dump everything else. > > It turns out there were some infelicities with pg_dump as well as with pg_restore. I think the attached patch does the right thing. I'll keep testing - I'll be happier if other people bang on it too. cheers andrew
Attachment
Re: [HACKERS] Re: 9.2beta1 regression: pg_restore --data-only does not set sequence values any more
From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes: >>> Martin Pitt <mpitt@debian.org <mailto:mpitt@debian.org>> writes: >>>> while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common >>>> test suite noticed a regression: It seems that pg_restore --data-only >>>> now skips the current value of sequences, so that in the upgraded >>>> database the sequence counter is back to the default. > It turns out there were some infelicities with pg_dump as well as with > pg_restore. > I think the attached patch does the right thing. I'll keep testing - > I'll be happier if other people bang on it too. After looking this over, I think the original patch was just fundamentally wrong and needs to be largely rewritten. The basic error was in saying that the existing options --schema-only and --data-only were equivalent to particular cases of --section, which they are not. The proposed new patch does not make this better; it just makes the logic even more spaghetti-ish. I think what we need is to rip all that out and treat --section as being a new option that's not tied to the old ones in any way, but is an entirely orthogonal filter. The right place to implement it (for either pg_dump or pg_restore) is in the TOC-scanning loops in pg_backup_archiver.c, which can track which section they are in fairly easily (probably define it as being the current item's section unless that's SECTION_NONE, in which case use the previous section value). BTW, I'm thinking we could make that code simpler and faster if the _tocEntryRequired logic were done only once in an initial pass, and then we stored the teReqs result into a work field in the TocEntry struct for use in later passes. Andrew told me off-list that he would be unavailable due to travel for awhile, so I will have a go at fixing this. regards, tom lane