Re: [GENERAL] pg_dump -s dumps data?! - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [GENERAL] pg_dump -s dumps data?!
Date
Msg-id 12635.1328050086@sss.pgh.pa.us
Whole thread Raw
In response to Re: [GENERAL] pg_dump -s dumps data?!  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: [GENERAL] pg_dump -s dumps data?!
List pgsql-hackers
Andrew Dunstan <andrew@dunslane.net> writes:
> On 01/30/2012 11:18 PM, Tom Lane wrote:
>> As I suspected, the behavioral change from 9.1 to HEAD is not
>> intentional.  It is an artifact of commit
>> 7b070e896ca835318c90b02c830a5c4844413b64, which is almost, but not
>> quite, entirely broken.  I won't enumerate its shortcomings here,

> I'm perplexed about what you thing the patch does wrong or how it affects this. If I've broken something I'd like to
knowhow, exactly, so I have a chance to fix it.
 

Well, it adds a new field to all instances of DumpableObject and then
leaves it uninitialized in most cases, which is bad style (and unlike in
the backend, there is no forced zeroing to ensure a consistent value);
but the proximate cause of the bug is that you put the filtering in the
wrong place.  The way this is supposed to work, or at least used to
work, is that dump-the-data-or-not is determined by whether
a TableDataInfo DumpableObject gets created --- see the callers of
makeTableDataInfo.  You didn't follow that convention but instead
inserted an extra filter test in dumpTableData.  The reason that
depesz's example is not dumping the unwanted config table is that the
code path in which getExtensionMembership calls makeTableDataInfo isn't
ever setting the dumpdata flag.  Unfortunately that means that config
table data won't get dumped when it *is* wanted, either.  Or worse,
it means that the data might or might not get dumped depending on
whether the pg_malloc in makeTableDataInfo is allocating new or recycled
memory and what happens to be in that memory in the latter case.

Now I'm not entirely sure that we ought to preserve the old code
structure as-is; it might be more future-proof if we always made
TableDataInfo objects and then used their dump flags to control whether
to dump them.  (The main potential advantage of that is that we'd deal
more sanely with dependency chains linking through TableDataInfo
objects; but I'm not sure there are any such at the moment.)  But what
we've got at the moment is a mess.  In any case I think we'd be better
off without the separate dumpdata field: if we need a flag we should use
the "dump" flag of the TableDataInfo object.

As far as depesz's actual complaint is concerned, I think the core of
the problem is that getExtensionMembership is unconditionally asking for
a config table's data to be dumped, without any consideration of whether
pg_dump switches ought to filter that.  I'm unsure whether we should
just add more logic there, or instead put the policy logic into
makeTableDataInfo.  The latter might result in less duplicative code,
but would require more rethinking of getTableData() --- which conditions
tested there ought to move into makeTableDataInfo?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: [GENERAL] pg_dump -s dumps data?!
Next
From: Tom Lane
Date:
Subject: Re: Index-only scan performance regression