Re: logical changeset generation v6.2 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: logical changeset generation v6.2
Date
Msg-id 20131021193602.GK2968@awork2.anarazel.de
Whole thread Raw
In response to Re: logical changeset generation v6.2  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: logical changeset generation v6.2
List pgsql-hackers
On 2013-10-21 14:50:54 -0400, Robert Haas wrote:
> On Mon, Oct 21, 2013 at 2:27 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> >> I think it's a complete no-go.  Consider, e.g., the comment for
> >> MaxTupleAttributeNumber, which you've blithely falsified.  Even if you
> >> update the comment and the value, I'm not inspired by the idea of
> >> subtracting 32 from that number; even if it weren't already too small,
> >> it would break pg_upgrade for any users who are on the edge.
> >
> > Well, we only need to support it for (user_)catalog tables. So
> > pg_upgrade isn't a problem. And I don't really see a problem restricting
> > the number of columns for such tables.
> 
> Inch by inch, this patch set is trying to make catalog tables more and
> more different from regular tables.  I think that's a direction we're
> going to regret.

I can understand that.

> Can't both have something be a catalog table AND replicate it?  Ick,
> but OK.

User catalog tables are replicated normally.

If we want we can replicate system catalog tables as well, ok, it's
about a days worth of work. I just don't see what the use case would
be and I am afraid it'd be used as a poor man's system table trigger.

> But changing the on disk format strikes me as crossing some sort of
> bright line.  That means that you're going to have two different code
> paths in a lot of important cases, one for catalog tables and one for
> non-catalog tables, and the one that's only taken for catalog tables
> will be rather lightly traveled.

But there really isn't that much difference in the code paths, is there?
If you look at it, minus the mechanical changes around bitmasks it
really mostly just a couple of added
HeapTupleHeaderSetCmin/HeapTupleHeaderSetCmax calls + wal logging the
current cid.

> And then you've got user catalog tables, and the possibility that
> something that wasn't formerly a user catalog table might become one,
> or visca versa.

That's not really a problem - we only need cmin/cmax for catalog tables
when decoding DML that happened in the same transaction as DDL. The wide
cids could easily be removed when we freeze tuples or such.
Even something like:
BEGIN;
CREATE TABLE catalog_t(...);
INSERT INTO catalog_t ...;
UPDATE catalog_t ...
ALTER TABLE catalog_t SET (user_catalog_table = true);
INSERT INTO decoded_table (..);
INSERT INTO decoded_table (..);
UPDATE catalog_t SET ...;
INSERT INTO decoded_table (..);
COMMIT;
will work correctly.

> Even if you can flush out every bug that exists today, this is a
> recipe for future bugs.

I don't really forsee many new codepaths that care about the visibility
bits in tuple headers themselves. When has the last one of those been
added/changed? I'd bet it was 9.0 with the vacuum full/cluster merge,
and three lines to be added are surely the smallest problem of that.

> >> Things
> >> aren't looking too good for anything that uses HeapTupleFields,
> >> either; consider rewrite_heap_tuple().
> >
> > Well, that currently works, by copying cmax. Since rewriting triggered
> > the change, I am pretty sure I've actually tested & hit that path...
> 
> No offense, but that's a laughable statement.  If that path works,
> it's mostly if not entirely by accident.  You've fundamentally changed
> the heap tuple format, and that code doesn't know about it, even
> though it's deeply in bed with assumptions about the old format.

Hm? rewrite_heap_tuple() grew code to handle that case, it's not an
accident that it works.
if (HeapTupleHeaderHasWideCid(old_tuple->t_data)           && HeapTupleHeaderHasWideCid(new_tuple->t_data)){
HeapTupleHeaderSetCmin(new_tuple->t_data,
HeapTupleHeaderGetCmin(old_tuple->t_data));   HeapTupleHeaderSetCmax(new_tuple->t_data,
    HeapTupleHeaderGetCmax(old_tuple->t_data), false);}
 


Note that code handling HeapTupleHeaders outside of heapam.c, tqual.c et
al. shouldn't need to care about the changes at all.
And all the code that needs to care *already* has special-cased code
around visibility. It's relatively easy to find by grepping for
HEAP_XACT_MASK. We've so far accepted that several places need to change
if we change the visibility rules. And I don't see how we easily could
get away from that.
Patches like e.g. lsn-ranges freezing will require *gobs* more
widespread changes. And have much higher chances of breaking stuff
imo. And it's still worthwile to do them.

> I
> think this is a pretty clear indication as to what's wrong with this
> approach: a lot of stuff will not care, but the stuff that does care
> will be hard to find, and future incremental modifications either to
> that code or to the hidden data before the t_hoff pointer could break
> stuff that formerly worked.  We rejected early drafts of sepgsql RLS
> cold because they changed the tuple format, and I don't see why we
> shouldn't do exactly the same thing here.

Ok, I have a hard time to argue against that. I either skipped or forgot
that discussion.
I quickly searched and looked at the patch at:
497D7055.9090806@ak.jp.nec.com From a quick look that would have grown
much more invasive - and lots more places would have had to care. For
the proposed patch it really is only
heap_(insert|multi_insert|update|delete|rewrite_tuple) that need to
care. We could remove the tupledesc changes if we dont care about some
added malloc/memcpy'ing during DDL, the code to handle tuples without
that space is already there.

> But just suppose for a minute that we'd accepted that proposal and
> then took this one, too.  And let's suppose we also accept the next
> proposal that, like that one and this one, jams something more into
> the heap tuple header.  At that point you'll have potentially as many
> as 8 different maximum-number-of-attributes values for tuples, though
> maybe not quite that many in practice if not all of those features can
> be used together.

Well, I can understand that argument. Unfortunately. But I still claim
that this is scaling back an optimization that we've previously applied
more widely.

Note also that we actually have quite some slop in
MaxTupleAttributeNumber and MaxHeapAttributeNumber (which is the one
that matters here). They e.g. haven't been adjusted when 8.3 decided
*not* to store both cmin and cmax anymore and they *already* had slop
before. So I am not convinced that we actually need to change our
current limits.

> The macros that are needed to extract the various
> values from the heap tuple will be nightmarishly complex, and we'll
> have eaten up all (or more than all) of our remaining bit-space in the
> infomask.

Why would the macros for extracting values from heap tuples need to
change? Those shouldn't be affected by what I proposed?


I think this approach would have lower maintenance overhead in
comparison to the previous solution of Handling CommandIds because it's
actually much simpler. But I am definitely ready to try other ideas.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: Cube extension point support // GSoC'13
Next
From: Tom Lane
Date:
Subject: Re: Cube extension point support // GSoC'13