Re: Fixing findDependentObjects()'s dependency on scan order(regressions in DROP diagnostic messages) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Fixing findDependentObjects()'s dependency on scan order(regressions in DROP diagnostic messages)
Msg-id CAH2-WznWGEaEBX=PqwdXOS_WwcRS-kB5zA5kdrmqXKmP7b6FuA@mail.gmail.com
In response to Re: Fixing findDependentObjects()'s dependency on scan order (regressions in DROP diagnostic messages)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Fixing findDependentObjects()'s dependency on scan order (regressions in DROP diagnostic messages)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Jan 17, 2019 at 12:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So I poked around for awhile with running the regression tests under
> ignore_system_indexes.  There seem to be a number of issues involved
> here.  To a significant extent, they aren't bugs, at least not according
> to the original conception of the dependency code: it was not a design
> goal that different dependencies of the same object-to-be-deleted would
> be processed in a fixed order.

I agree that it's the exceptional cases that are of concern here. The
vast majority of the changes you'll see with
"ignore_system_indexes=on" are noise.

> Now, perhaps we should make such stability a design goal, as it'd allow
> us to get rid of some "suppress the cascade outputs" hacks in the
> regression tests.  But it's a bit of a new feature.  If we wanted to
> do that, I'd be inclined to do it by absorbing all the pg_depend entries
> for a particular object into an ObjectAddress array and then sorting
> them before we process them.  The main stumbling block here is "what
> would the sort order be?".  The best idea I can come up with offhand
> is to sort by OID, which at least for regression test purposes would
> mean objects would be listed/processed more or less in creation order.

I think that we might as well have a stable order. Maybe an explicit
sort step is unnecessary -- we can actually rely on scan order, while
accepting you'll get a different order with "ignore_system_indexes=on"
(though without getting substantively different/incorrect messages).
I'm slightly concerned that an explicit sort step might present
difficulties in extreme cases. How much memory are we prepared to
allocate, just to get a stable order?
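
For concreteness, the explicit sort step under discussion could look roughly like the sketch below. It is only an illustration, not code from any patch: SketchObjectAddress is a hypothetical stand-in that mirrors the three fields of ObjectAddress, and the tie-break on classId and objectSubId after the OID is an assumption added here to make the order fully deterministic.

```c
#include <stdlib.h>

/* Hypothetical stand-in for ObjectAddress; not the real PostgreSQL type. */
typedef struct SketchObjectAddress
{
    unsigned int classId;      /* OID of the catalog the object lives in */
    unsigned int objectId;     /* OID of the object itself */
    int          objectSubId;  /* column number, or 0 for the whole object */
} SketchObjectAddress;

/*
 * Order by object OID first (roughly creation order), then by classId and
 * objectSubId as assumed tie-breakers.  Explicit comparisons are used
 * rather than subtraction so that large OIDs cannot overflow.
 */
static int
sketch_address_cmp(const void *a, const void *b)
{
    const SketchObjectAddress *x = (const SketchObjectAddress *) a;
    const SketchObjectAddress *y = (const SketchObjectAddress *) b;

    if (x->objectId != y->objectId)
        return (x->objectId < y->objectId) ? -1 : 1;
    if (x->classId != y->classId)
        return (x->classId < y->classId) ? -1 : 1;
    if (x->objectSubId != y->objectSubId)
        return (x->objectSubId < y->objectSubId) ? -1 : 1;
    return 0;
}

/* Sort the buffered dependency entries in place before processing them. */
void
sketch_sort_dependencies(SketchObjectAddress *addrs, size_t n)
{
    qsort(addrs, n, sizeof(SketchObjectAddress), sketch_address_cmp);
}
```

The memory question above comes down to the fact that every dependent entry has to be buffered before the sort can run. With a struct shaped like this one, that is about 12 bytes per pg_depend entry on common platforms.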

It probably won't really matter what the specific order is, once the
current problems (the DEPENDENCY_INTERNAL_AUTO issue and the issue
you'll fix with DEPFLAG_IS_SUBOBJECT) are handled in a direct manner.
As I've pointed out a couple of times already, we can add a 4-byte
tie-breaker column to both pg_depend indexes without increasing the
size of the on-disk representation, since the extra space is already
lost to alignment (we could even add a new 4-byte column to the table
without any storage overhead, if that happened to make sense).
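
The alignment arithmetic for the index case can be checked with a small sketch. It assumes a typical 64-bit build: an 8-byte index tuple header, 8-byte maximum alignment, and three 4-byte key columns in each pg_depend index. The function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Assumed values for a typical 64-bit build; not read from PostgreSQL headers. */
#define SKETCH_MAXIMUM_ALIGNOF   8   /* assumed maximum alignment */
#define SKETCH_INDEX_HEADER_SIZE 8   /* assumed index tuple header size */

/* Round len up to the next multiple of the assumed maximum alignment. */
static size_t
sketch_maxalign(size_t len)
{
    return (len + (SKETCH_MAXIMUM_ALIGNOF - 1)) &
        ~((size_t) (SKETCH_MAXIMUM_ALIGNOF - 1));
}

/* On-disk size of an index tuple made up of ncols 4-byte key columns. */
size_t
sketch_index_tuple_size(size_t ncols)
{
    return sketch_maxalign(SKETCH_INDEX_HEADER_SIZE + 4 * ncols);
}
```

Under those assumptions, three key columns give 8 + 12 = 20 bytes, which rounds up to 24. Four key columns give exactly 24 bytes, so the tie-breaker column fits in space that is currently padding.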

What is the likelihood that somebody will ever find a better use for
this alignment padding? These two indexes are typically the largest
system catalog indexes by far, so the opportunity cost matters. I
don't think that the direct cost (more cycles) is worth worrying
about, though. Nobody has added a pg_depend column since it was first
introduced back in 2002.

-- 
Peter Geoghegan

