Thread: Re: pg_dump, ATTACH, and independently restorable child partitions

Re: pg_dump, ATTACH, and independently restorable child partitions

From

Tom Lane

Date:

25 November 2020, 23:35:19

Justin Pryzby <pryzby@telsasoft.com> writes:
> 1. Maybe pg_restore ExecuteSqlCommandBuf() should (always?) call
> ExecuteSimpleCommands() instead of ExecuteSqlCommand().  It doesn't seem to
> break anything (although that surprised me).

That certainly does break everything, which I imagine is the reason
why the cfbot shows that this patch is failing the pg_upgrade tests.
Note the comments for ExecuteSimpleCommands:

 * We have to lex the data to the extent of identifying literals and quoted
 * identifiers, so that we can recognize statement-terminating semicolons.
 * We assume that INSERT data will not contain SQL comments, E'' literals,
 * or dollar-quoted strings, so this is much simpler than a full SQL lexer.

IOW, where that says "Simple", it means *simple* --- in practice,
we only risk using it on commands that we know pg_dump itself built
earlier.  There is no reasonable prospect of getting pg_restore to
split arbitrary SQL at command boundaries.  We'd need something
comparable to psql's lexer, which is huge, and from a future-proofing
standpoint it would be just awful.  (The worst that happens if psql
misparses your string is that it won't send the command when you
expected.  If pg_restore misparses stuff, your best case is that the
restore fails cleanly; the worst case could easily result in
SQL-injection compromises.)  So I think we cannot follow this
approach.

What we'd need to do if we want this to work with direct-to-DB restore
is to split off the ATTACH PARTITION command as a separate TOC entry.
That doesn't seem amazingly difficult, and it would even offer the
possibility that you could extract the partition standalone without
having to ignore errors.  (You'd use -l/-L to select the CREATE TABLE,
the data, etc, but not the ATTACH object.)

That would possibly come out as a larger patch than you have here,
but maybe not by much.  I don't think there's too much more involved
than setting up the proper command strings and calling ArchiveEntry().
You'd need to do some testing to verify that cases like --clean
work sanely.

Also, I read the 0002 patch briefly and couldn't make heads or tails
of it, except that it seemed to be abusing the PQExpBuffer abstraction
well beyond anything I'd consider acceptable.  If you want separate
strings, make a PQExpBuffer for each one.

            regards, tom lane

Re: pg_dump, ATTACH, and independently restorable child partitions

From

Tom Lane

Date:

04 December 2020, 17:13:05

Justin Pryzby <pryzby@telsasoft.com> writes:
> [ v4-0001-pg_dump-output-separate-object-for-ALTER-TABLE.AT.patch ]

The cfbot is being picky about this:

3218pg_dump.c: In function ‘dumpTableAttach’:
3219pg_dump.c:15600:42: error: suggest parentheses around comparison in operand of ‘&’ [-Werror=parentheses]
3220  if (attachinfo->partitionTbl->dobj.dump & DUMP_COMPONENT_DEFINITION == 0)
3221                                          ^
3222cc1: all warnings being treated as errors

which if I've got the precedence straight is indeed a bug.

Personally I'd probably write

    if (!(attachinfo->partitionTbl->dobj.dump & DUMP_COMPONENT_DEFINITION))

as it seems like a boolean test to me.

            regards, tom lane

Re: pg_dump, ATTACH, and independently restorable child partitions

From

Tom Lane

Date:

12 January 2021, 02:28:18

Justin Pryzby <pryzby@telsasoft.com> writes:
> [ v5-0001-pg_dump-output-separate-object-for-ALTER-TABLE.AT.patch ]

Pushed with mostly cosmetic edits.

One thing I changed that isn't cosmetic is that I set the ArchiveEntry's
owner to be the owner of the child table.  Although we aren't going to
do any sort of ALTER OWNER on this, it's still important that the owner
be marked as someone who has the right permissions to issue the ALTER.
The default case is that the original user will issue the ATTACH, which
basically only works if you run the restore as superuser.  It looks to
me like you copied this decision from the INDEX ATTACH code, which is
just as broken --- I'm going to go fix/backpatch that momentarily.

Another thing that bothers me still is that it's not real clear that
this code plays nicely with selective dumps, because it's not doing
anything to set the dobj.dump field in a non-default way (which in
turn means that the dobj.dump test in dumpTableAttach can never fire).
It seems like it might be possible to emit a TABLE ATTACH object
even though one or both of the referenced tables didn't get dumped.
In some desultory testing I couldn't get that to actually happen, but
maybe I just didn't push on it in the right way.  I'd be happier about
this if we set the flags with something along the lines of

    attachinfo->dobj.dump = attachinfo->parentTbl->dobj.dump &
                attachinfo->partitionTbl->dobj.dump;

            regards, tom lane