Re: logical changeset generation v4 - Heikki's thoughts about the patch state - Mailing list pgsql-hackers

From Steve Singer
Subject Re: logical changeset generation v4 - Heikki's thoughts about the patch state
Date
Msg-id BLU0-SMTP33C9D847962E547879A4F8DC180@phx.gbl
Whole thread Raw
In response to Re: logical changeset generation v4 - Heikki's thoughts about the patch state  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: logical changeset generation v4 - Heikki's thoughts about the patch state
List pgsql-hackers
On 13-01-28 06:17 AM, Andres Freund wrote:
> Hi,
>
> 3.  Pass the delete (with no key values) onto the replication client and let
> it deal with it (see 1 and 2)
> Hm.
>
>
> While I agree that nicer behaviour would be good I think the real
> enforcement should happen on a higher level, e.g. with event triggers or
> somesuch. It seems way too late to do anything about it when we're
> already decoding. The transaction will already have committed...

Ideally the first line of enforcement would be with event triggers. The 
thing with user-level mechanisms for enforcing things is that they 
sometimes can be disabled or by-passed.  I don't have a lot of sympathy 
for people who do this but I like the idea of at least having the option 
coding defensively to detect the situation and whine to the user.

>> How do you plan on dealing with sequences?
>> I don't see my plugin being called on sequence changes and I don't see
>> XLOG_SEQ_LOG listed in DecodeRecordIntoReorderBuffer.  Is there a reason why
>> this can't be easily added?
> I basically was hoping for Simon's sequence-am to get in before doing
> anything real here. That didn't really happen yet.
> I am not sure whether there's a real usecase in decoding normal
> XLOG_SEQ_LOG records, their content isn't all that easy to interpet
> unless youre rather familiar with pg's innards.
>
> So, adding support wouldn't hard from a technical pov but it seems the
> semantics are a bit hard to nail down.
>
>> Also what do we want to do about TRUNCATE support.  I could always leave a
>> TRUNCATE trigger in place that logged the truncate to a sl_truncates and
>> have my replication daemon respond to the insert on a   sl_truncates table
>> by actually truncating the data on the replica.
> I have planned to add some generic "table_rewrite" handling, but I have
> to admit I haven't thought too much about it yet. Currently if somebody
> rewrites a table, e.g. with an ALTER ... ADD COLUMN .. DEFAULT .. or
> ALTER COLUMN ... USING ..., you will see INSERTs into a temporary
> table. That basically seems to be a good thing, but the user needs to be
> told about that ;)
>
>> I've spent some time this weekend updating my prototype plugin that
>> generates slony 2.2 style COPY output.  I have attached my progress here
>> (also https://github.com/ssinger/slony1-engine/tree/logical_repl).  I have
>> not gotten as far as modifying slon to act as a logical log receiver, or
>> made a version of the slony apply trigger that would process these
>> changes.
> I only gave it a quick look and have a couple of questions and
> remarks. The way you used the options it looks like youre thinking of
> specifying all the tables as options? I would have thought those would
> get stored & queried locally and only something like the 'replication
> set' name or such would be set as an option.

The way slony works today is that the list of tables to pull for a SYNC 
comes from the subscriber because the subscriber might be behind the 
provider, where a table has been removed from the set in the meantime.  
The subscriber still needs to receive data from that table until it is 
caught up to the point where that removal happens.

Having a time-travelled version of a user table (sl_table) might fix 
that problem but I haven't yet figured out how that needs to work with 
cascading (since that is a feature of slony today I can't ignore the 
problem). I'm also not sure how that will work with table renames.  
Today if the user renames a table inside of an EXECUTE SCRIPT slony will 
update the name of the table in sl_table.   This type of change wouldn't 
be visible (yet) in the time-travelled catalog.  There might be a 
solution to this yet but I haven't figured out it.  Sticking with what 
slony does today seemed easier as a first step.

> Iterating over a list with
>     for(i = 0; i < options->length; i= i + 2 )
>     {
>         DefElem * def_schema = (DefElem*) list_nth(options,i);
> is not a good idea btw, thats quadratic in complexity ;)

Thanks I'll rewrite this to walk a list of  ListCell objects with next.


> In the REORDER_BUFFER_CHANGE_UPDATE I suggest using
> relation->rd_primary, just as in the DELETE case, that should always
> give you a consistent candidate key in an efficient manner.
>
>> I haven't looked into the details of what is involved in setting up a
>> subscription with the snapshot exporting.
> That hopefully shouldn't be too hard... At least thats the idea :P
>
>> I couldn't get the options on the START REPLICATION command to parse so I
>> just hard coded some list building code in the init method. I do plan on
>> pasing the list of tables to replicate from the replica to the plugin
>> (because this list comes from the replica).   Passing what could be a few
>> thousand table names as a list of arguments is a bit ugly and I admit my
>> list processing code is rough.  Does this make us want to reconsider the
>> format of the option_list ?
> Yea, something's screwed up there, sorry. Will push a fix later today.
>
>> I guess should provide an opinion on if I think that the patch in this CF,
>> if committed could be used to act as a source for slony instead of the log
>> trigger.
>> The biggest missing piece I mentioned in my email yesterday, that we aren't
>> logging the old primary key on row UPDATEs.  I don't see building a credible
>> replication system where you don't allow users to update any column of a
>> row.
> Ok, I really thought this wouldn't be that much of an issue in a first
> version, but if you think its important, I'll add support for
> it. Shouldn't be too hard.

If your using non-surragate /natural primary keys this tends to come up 
occasionally due to data-entry errors or renames.  I'm  looking at this 
from the point of view of what do I need to use this as a source for a 
production replication system with fewer sharp-edges compared to trigger 
source slony.  My standard is a bit higher than 'first' version because 
I intent to use it in the version 3.0 of slony not 1.0.  If others feel 
I'm asking for too much they should speak up, maybe I am. Also the way 
things will fail if someone were to try and update a primary key value 
is pretty nasty (it will leave them with inconsistent databases).    We 
could  install UPDATE triggers to try and detect this type of thing but 
I'd rather see us just log the old values so we can use them during replay.




>> The other issues I've raised (DecodeDelete hiding bad deletes, replication
>> options not parsing for me) look like easy fixes
>>
>>   no wal decoding support for sequences or truncate are things that I could
>> work around by doing things much like slony does today.  The SYNC can still
>> capture the sequence changes in  a table (where the INSERT's would be
>> logged) and I can have a trigger capture truncates.
> Could you explan a bit what's being done there in slony?

Each time the slon connects to the local database to create a SYNC 
event, which is when slony captures snapshot visiblity information, it 
also gets also looks at all of the replicated sequences and finds any 
that have changed since the last sync  The values sequence values as of 
the last SYNC are stored in memory.  Any sequences that have changed get 
there new values written to the table sl_seqlog. When slon applies row 
updates for a SYNC it also updates (setval) on any sequences that have 
changed.


For truncates the truncate trigger just logs a single row into sl_log 
indicating that the table has been truncated.  When slon encounters a 
row of operation 'TRUNCATE' it executes a TRUNCATE ONLY on the table.




>> If this patch is going to get bumped to 9.4 I really hope that someone with
>> good knowledge of the internals (ie a committer) can give this patch a good
>> review sooner rather than later.  If there are issues Andres has overlooked
>> that are more serious or complicated to fix I would like to see them raised
>> before the next CF in June.
> Absolutely seconded. I *really* would love to see a more technical
> review, its hard to see issues after spending that much time in a
> certain worldview...
>
> Thanks!
>
> Andres
>




pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: enhanced error fields
Next
From: Peter Geoghegan
Date:
Subject: Re: enhanced error fields