Thread: Small bug in pg_dump
Hi Philip,

I have not updated from CVS in a few days, but I suspect you haven't
noticed this yet: given a mixed-case table name and a scenario that
requires emitting UPDATE pg_class commands, pg_dump puts out things like

UPDATE "pg_class" SET "reltriggers" = 0 WHERE "relname" ~* '"Table"';

BEGIN TRANSACTION;
CREATE TEMP TABLE "tr" ("tmp_relname" name, "tmp_reltriggers" smallint);
INSERT INTO "tr" SELECT C."relname", count(T."oid")
    FROM "pg_class" C, "pg_trigger" T
    WHERE C."oid" = T."tgrelid" AND C."relname" ~* '"Table"'
    GROUP BY 1;
UPDATE "pg_class" SET "reltriggers" = TMP."tmp_reltriggers"
    FROM "tr" TMP
    WHERE "pg_class"."relname" = TMP."tmp_relname";
DROP TABLE "tr";
COMMIT TRANSACTION;

Of course those ~* '"Table"' clauses aren't going to work too well; the
identifier should NOT be double-quoted inside the pattern.  Actually,
this should not be using ~* in the first place --- why isn't it just
using WHERE relname = 'Table' ???  Seems like it's not cool to
gratuitously reset the trigger counts on other tables that contain
Table as a substring of their names.

And while we're at it, the temp table hasn't been necessary for a
release or three.  That whole transaction should be replaced by

UPDATE pg_class SET reltriggers =
    (SELECT count(*) FROM pg_trigger where pg_class.oid = tgrelid)
WHERE relname = 'Table';

			regards, tom lane
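[Editor's note: the quoting mistake above can be illustrated with a small
sketch.  `fmt_id` below mimics the general behavior of pg_dump's fmtId()
(lower-case plain identifiers pass through, everything else gets double
quotes); it is a simplified stand-in, not the real C function, which also
handles keywords and other cases.  The bug is that the already-quoted form
was being pasted into a string literal compared against relname.]

```python
import re

def fmt_id(name):
    # Simplified sketch of pg_dump's fmtId(): names that are already
    # plain lower-case identifiers pass through unchanged; anything else
    # is wrapped in double quotes, with embedded quotes doubled.
    if re.fullmatch(r'[a-z_][a-z0-9_]*', name):
        return name
    return '"' + name.replace('"', '""') + '"'

# The bug: embedding the quoted identifier inside a string literal.
# pg_class.relname stores the bare name 'Table', never '"Table"'.
broken = "WHERE relname = '%s'" % fmt_id('Table')   # matches nothing
fixed  = "WHERE relname = '%s'" % 'Table'           # matches the catalog row

print(broken)  # WHERE relname = '"Table"'
print(fixed)   # WHERE relname = 'Table'
```

Quoting belongs only where the name is used as an SQL *identifier*
(e.g. COPY "Table" FROM stdin), never inside a string *literal*.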
At 18:41 12/03/01 -0500, Tom Lane wrote:
>UPDATE pg_class SET reltriggers =
>    (SELECT count(*) FROM pg_trigger where pg_class.oid = tgrelid)
>WHERE relname = 'Table';

Fixed & done...

----------------------------------------------------------------
Philip Warner
Albatross Consulting Pty. Ltd. (A.B.N. 75 008 659 498)
Tel: (+61) 0500 83 82 81
Fax: (+61) 0500 83 82 82
Http://www.rhyme.com.au
PGP key available upon request, and from pgp5.ai.mit.edu:11371
Philip Warner <pjw@rhyme.com.au> writes:
> Fixed & done...

Only part of the way there: pg_dump is still pretty seriously broken
for mixed-case table names.  Observe:

regression=# create table "Foo" (z int);
CREATE
regression=# \q

$ pg_dump -a -t '"Foo"' regression
--
-- Selected TOC Entries:
--
--
-- Data for TOC Entry ID 1 (OID 1845087) TABLE DATA "Foo"
--
\connect - postgres
-- Disable triggers
UPDATE "pg_class" SET "reltriggers" = 0 WHERE "relname" = '"Foo"';

COPY "Foo" FROM stdin;
\.
-- Enable triggers
UPDATE pg_class SET reltriggers =
    (SELECT count(*) FROM pg_trigger where pg_class.oid = tgrelid)
WHERE relname = '"Foo"';
$

These UPDATEs will certainly not work.

On digging into the code, the problem seems to be one of premature
optimization: fmtId() is applied to the table name when the
ArchiveEntry is created, rather than when printing out from an
ArchiveEntry.  So there is no way to get the undecorated table name
for use in these commands.

It seems to me that you should remove the fmtId() from calls to
ArchiveEntry, then add it back where an archive entry's name is being
printed (and quoting is appropriate).  It might even make sense for an
ArchiveEntry to store both forms of the name, and then using code could
just select the form wanted instead of calling fmtId repeatedly.  Not
sure.

BTW, making the -t switch compare to the unquoted name would probably
also fix the bizarre need for '"Foo"' exhibited above.

			regards, tom lane
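[Editor's note: the "store both forms of the name" suggestion can be
sketched as follows.  This is an illustrative Python model, not pg_dump's
actual C structures; the class and attribute names are invented.  The key
point is that the TOC entry keeps the raw name, and quoting is applied
only at output time, in the one place where an identifier is printed.]

```python
class ArchiveEntry:
    # Sketch: keep the undecorated name in the TOC entry; derive the
    # quoted identifier form on demand instead of baking it in at
    # creation time (the "premature optimization" described above).
    def __init__(self, raw_name):
        self.raw_name = raw_name  # used for -t matching and relname lookups

    @property
    def quoted_name(self):
        # Quote only when emitting the name as an SQL identifier.
        return '"' + self.raw_name.replace('"', '""') + '"'

entry = ArchiveEntry('Foo')

# String literal context: use the raw name.
disable = ("UPDATE pg_class SET reltriggers = 0 "
           "WHERE relname = '%s';" % entry.raw_name)

# Identifier context: use the quoted form.
copy_cmd = "COPY %s FROM stdin;" % entry.quoted_name

print(disable)   # ... WHERE relname = 'Foo';
print(copy_cmd)  # COPY "Foo" FROM stdin;
```

With both forms available, code that builds commands simply picks the
form appropriate to the context, and the -t switch can compare against
raw_name directly.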
At 19:10 14/03/01 -0500, Tom Lane wrote:
>It might even make sense for an ArchiveEntry to store both forms of the
>name, and then using code could just select the form wanted instead of
>calling fmtId repeatedly.  Not sure.
>
>BTW, making the -t switch compare to the unquoted name would probably
>also fix the bizarre need for '"Foo"' exhibited above.

I think these are both fixed now; the SQL in the ArchiveEntry call still
uses the formatted names, but the name in the TOC entry is now
unformatted in all cases except functions.  The TOC entry name is used
in the -t switch and in disabling triggers etc.

This does make me wonder (again) about some kind of pg_dump regression
test.  ISTM that a test should be doable by building a DB from data
files, dumping it, restoring it, then using COPY to extract the data
back to files (and probably doing a sort on the output).  We could also
store a BLOB or two.  Then we compare the initial data files with the
final ones.  This will test the integrity of the data & BLOB
dump/restore.  We then also need to test the metadata integrity somehow,
probably by dumping & restoring the regression DB, but we'd need to
modify the pg_dump output somewhat, I think.

----------------------------------------------------------------
Philip Warner
Albatross Consulting Pty. Ltd.
> This does make me wonder (again) about some kind of pg_dump regression
> test.  ISTM that a test should be doable by building a DB from data
> files, dumping it, restoring it, then using COPY to extract the data
> back to files (and probably doing a sort on the output).  We could also
> store a BLOB or two.  Then we compare the initial data files with the
> final ones.  This will test the integrity of the data & BLOB
> dump/restore.  We then also need to test the metadata integrity
> somehow, probably by dumping & restoring the regression DB, but we'd
> need to modify the pg_dump output somewhat, I think.

Yes, I have often caught dump bugs by doing a COPY/restore/COPY and
comparing the two COPY files.  Not sure if that is what you meant.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
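[Editor's note: the comparison step of the COPY/restore/COPY check
described above might look like the following sketch.  The function name
and sample rows are invented for illustration; a real test would also
have to account for COPY escaping and NULL representation, as well as
actually driving the database with createdb/pg_dump/pg_restore.]

```python
def same_copy_data(dump_a, dump_b):
    # Row-order-insensitive comparison of two COPY text dumps: restore
    # can change the physical order of rows, so sort before comparing
    # (the "doing a sort on the output" step mentioned above).
    return sorted(dump_a.splitlines()) == sorted(dump_b.splitlines())

# Rows in a different physical order after restore still compare equal.
before = "1\talpha\n2\tbeta\n"
after  = "2\tbeta\n1\talpha\n"
print(same_copy_data(before, after))  # True

# A dropped row is caught.
print(same_copy_data(before, "1\talpha\n"))  # False
```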