Re: logical changeset generation v6.4 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: logical changeset generation v6.4
Date
Msg-id 20131018185057.GD16188@awork2.anarazel.de
In response to Re: logical changeset generation v6.4  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: logical changeset generation v6.4  (Robert Haas <robertmhaas@gmail.com>)
Re: logical changeset generation v6.4  (Hannu Krosing <hannu@2ndQuadrant.com>)
Re: logical changeset generation v6.4  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On 2013-10-18 08:11:29 -0400, Robert Haas wrote:
> On Mon, Oct 14, 2013 at 9:12 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > Attached you can find version 6.4 of the patchset:
> 
> So I'm still unhappy with the arbitrary logic in what's now patch 1
> for choosing the candidate key.  On another thread, someone mentioned
> that they might want the entire old tuple, and that got me thinking:
> there's no particular reason why the user has to want exactly the
> columns that exist in some unique, immediate, non-partial index (what
> a name).  So I have two proposals:

> 1. Instead of allowing the user to choose the index to be used, or
> picking it for them, how about if we let them choose the old-tuple
> columns they want logged?  This could be a per-column option.  If the
> primary key can be assumed known and unchanging, then the answer might
> be that the user wants *no* old-tuple columns logged.  Contrariwise
> someone might want everything logged, or anything in the middle.

I can definitely see the usecase for logging everything or nothing, but
arbitrary column selection seems too complicated for now.

> 2. If that seems too complicated, how about just logging the whole old
> tuple for version 1?

I think that'd make the patch much less useful because it bloats WAL
unnecessarily for the primary user (replication) of it. I'd rather go
for primary keys only if that proves to be the contentious point.

How about modifying the selection to go from:
* no columns or all columns, if ALTER TABLE ... REPLICA IDENTITY NOTHING|FULL;
* index chosen by ALTER TABLE ... REPLICA IDENTITY USING indexname
* [later, maybe] ALTER TABLE ... REPLICA IDENTITY (cola, colb)
* primary key
* candidate key with the smallest oid
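
The precedence in the list above could be sketched roughly as follows.
This is only an illustration in Python, not code from the patch: the
Table and Index classes, the replica_identity strings and the function
name are all hypothetical, the "[later, maybe]" column-list form is left
out, and returning nothing when no key exists at all is an assumption
the list does not cover.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Index:
    # Hypothetical stand-in for an index; not a PostgreSQL structure.
    oid: int
    name: str
    columns: List[str]
    is_primary: bool = False
    # True if unique, immediate, non-partial and all columns NOT NULL.
    is_candidate: bool = False


@dataclass
class Table:
    columns: List[str]
    indexes: List[Index] = field(default_factory=list)
    # Hypothetical setting: "DEFAULT", "NOTHING", "FULL" or "USING".
    replica_identity: str = "DEFAULT"
    identity_index: Optional[str] = None


def old_tuple_columns(table: Table) -> List[str]:
    """Return the old-tuple columns to log, following the proposed order."""
    # 1. Explicit NOTHING / FULL.
    if table.replica_identity == "NOTHING":
        return []
    if table.replica_identity == "FULL":
        return list(table.columns)
    # 2. Explicitly chosen index.
    if table.replica_identity == "USING":
        for idx in table.indexes:
            if idx.name == table.identity_index:
                return list(idx.columns)
        raise ValueError("unknown index: %r" % table.identity_index)
    # 3. Primary key.
    for idx in table.indexes:
        if idx.is_primary:
            return list(idx.columns)
    # 4. Candidate key with the smallest oid.
    candidates = [idx for idx in table.indexes if idx.is_candidate]
    if candidates:
        return list(min(candidates, key=lambda idx: idx.oid).columns)
    # Assumption: with no usable key, log no old-tuple columns.
    return []
```

Picking the smallest oid among candidate keys keeps the choice
deterministic when a table has several such indexes but no primary key.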

Including the candidate key will help people using changeset extraction
for auditing on tables that do not have a primary key. That really isn't
an infrequent usecase.

I've chosen REPLICA IDENTITY, NOTHING and FULL because those are all
existing keywords and, afaics, shouldn't generate any conflicts. On a
green field we would probably name them differently, but ...

Comments?

Greetings,

Andres Freund

PS: candidate key implies a key which is: immediate (aka not deferred),
unique, non-partial and only contains NOT NULL columns.
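
As a rough illustration of that definition, the four conditions can be
written as a single predicate. This is a Python sketch under assumed
names (IndexInfo, is_candidate_key and the not_null mapping are
hypothetical), not how the patch checks it:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class IndexInfo:
    # Hypothetical description of an index, for illustration only.
    columns: List[str]
    is_unique: bool
    is_immediate: bool  # uniqueness enforced immediately, i.e. not deferred
    is_partial: bool    # index has a WHERE predicate


def is_candidate_key(index: IndexInfo, not_null: Dict[str, bool]) -> bool:
    """True if the index is immediate, unique, non-partial and covers
    only NOT NULL columns. not_null maps column name -> NOT NULL flag;
    columns missing from the mapping are treated as nullable."""
    return (
        index.is_unique
        and index.is_immediate
        and not index.is_partial
        and all(not_null.get(col, False) for col in index.columns)
    )
```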

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


