
From: Steve Singer
Subject: Re: [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)
Date:
Msg-id: BLU0-SMTP71DC1A10859E0BEF09C9D6DC700@phx.gbl
In response to: Re: [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached) (Andres Freund <andres@2ndquadrant.com>)
List: pgsql-hackers
On 12-10-15 04:51 PM, Andres Freund wrote:
>
> Well, as a crosscheck, could you list your requirements?
>
> Do you need anything more than outputting data in a format compatible to whats
> stored in sl_log_*? You wouldn't have sl_actionseq, everything else should be
> there (Well, you would need to do lookups to get the tableid, but thats not
> really much of a problem). The results would be ordered in complete
> transactions, in commit order.
>
> I guess the other tables would stay as they are as they contain the "added
> value" of slony?
>
> Greetings,

I actually spent some time a few weeks ago looking over the documents 
and code.  I never did get around to writing a review as elegant as 
Peter's.  I have not seen any red flags that make me think that what 
you're proposing wouldn't be suitable for slony, but sometimes you 
don't see the details until you start implementing something.

My initial approach to modifying slony to work with this might be 
something like:

* Leave sl_event as is for non-SYNC events; slon would still generate 
SYNC events in sl_event.
* Modify the remote_worker thread in slon so that, instead of selecting 
from sl_event, it gets the next 'committed' transaction from your apply 
cache.  For each ApplyChange record we would check whether it is an 
insert into sl_event; if so, we would trigger our existing event 
processing logic based on the contents of the ev_type column (a rough 
sketch of this loop follows this group of points).
* If the change involves an insert/update/delete/truncate on a 
replicated table, we would translate that change into SQL and apply it 
on the replica.  We would not commit changes on the replica until we 
encounter a SYNC being added to sl_event for the current origin.
* SQL will be applied in a slightly different order than slony applies 
it today.  Today, if two concurrent transactions are inserting into the 
same replicated table and they commit one after the other, there is a 
good chance that the apply order on the replica will also be intermixed 
(assuming both commits were in between two SYNC events).  My thinking 
is that we would just replay them one after the other on the replica, 
in commit order.  (Slony doesn't use commit order because we don't have 
it, not because we don't like it.)  This would mean we do away with 
tracking the action id.
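
Purely as illustration (nothing like this exists yet; the apply-cache 
read interface, the Change shape and the helper functions below are all 
made up, and the real code would be C inside slon), that loop might 
look something like:

from collections import namedtuple

# Hypothetical shape of one decoded change; invented for this sketch.
Change = namedtuple("Change", "table kind values")

def apply_committed_transactions(replica_conn, origin_id, fetch_transactions):
    """Consume whole transactions from the apply cache in commit order."""
    cur = replica_conn.cursor()
    for txn in fetch_transactions():        # one committed transaction at a time
        for change in txn:
            if change.table == "sl_event":
                ev = change.values
                if ev.get("ev_type") == "SYNC" and ev.get("ev_origin") == origin_id:
                    # SYNC boundary: everything applied since the last SYNC
                    # becomes visible on the replica in one commit.
                    replica_conn.commit()
                else:
                    handle_event(cur, ev)   # existing slon event handling
            else:
                # Ordinary insert/update/delete/truncate on a replicated
                # table, applied strictly in commit order -- no action id.
                sql, params = change_to_sql(change)
                cur.execute(sql, params)

def handle_event(cur, ev):
    pass                     # stub: dispatch on ev["ev_type"] as slon does today

def change_to_sql(change):
    return "SELECT 1", ()    # stub: build the sl_log_*-style statement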

* If a node is configured as a 'forwarder' node, it would store the 
processed output of each ApplyChange record in a table on the replica. 
If a slon is pulling data from a non-origin (i.e. if remoteWorkerThread_1 
is pulling data from node 2), then it would need to query this table 
instead of calling the functions that process the ApplyCache contents. 
(A sketch of what that table and the pull query might look like 
follows.)
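
Again just to make the idea concrete (the table name, columns and query 
below are assumptions, not anything in the patch), the forwarder could 
keep something like this and a downstream slon could pull from it:

import psycopg2

# Hypothetical forwarder log; name and columns are made up for illustration.
FORWARD_LOG_DDL = """
CREATE TABLE IF NOT EXISTS sl_forward_log (
    fl_origin    int4      NOT NULL,  -- origin node of the change
    fl_seq       bigserial NOT NULL,  -- preserves the apply (commit) order
    fl_statement text      NOT NULL,  -- SQL produced from the ApplyChange record
    PRIMARY KEY (fl_origin, fl_seq)
)
"""

def pull_from_forwarder(provider_dsn, origin_id, last_applied_seq):
    """What remoteWorkerThread_1 might run when its provider is node 2."""
    with psycopg2.connect(provider_dsn) as conn, conn.cursor() as cur:
        cur.execute(
            "SELECT fl_seq, fl_statement FROM sl_forward_log"
            " WHERE fl_origin = %s AND fl_seq > %s ORDER BY fl_seq",
            (origin_id, last_applied_seq),
        )
        return cur.fetchall()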

* To subscribe a node we would generate a SYNC event on the provider and 
do the copy_set.  We would keep track of that SYNC event.  The remote 
worker would then ignore any data that comes before that SYNC event 
when it starts pulling data from the apply cache.
* DDL events in 2.2+ go into sl_ddl_script (or something like that); 
when we see INSERT commands on that table we would know to then apply 
the DDL on the node.  (Both of these points are sketched below.)
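
To make those two points a bit more concrete (same made-up Change shape 
as in the loop sketch above, and the sl_ddl_script column name is a 
guess as well):

def make_subscribe_filter(origin_id, copy_set_sync_seqno):
    """Skip decoded data until the SYNC generated just before copy_set is seen."""
    state = {"caught_up": False}
    def should_apply(change):
        if state["caught_up"]:
            return True
        ev = change.values
        if (change.table == "sl_event"
                and ev.get("ev_type") == "SYNC"
                and ev.get("ev_origin") == origin_id
                and int(ev["ev_seqno"]) >= copy_set_sync_seqno):
            state["caught_up"] = True  # everything from here on gets applied
        return False                   # the marker SYNC itself is already in the copy
    return should_apply

def maybe_apply_ddl(cur, change):
    """DDL shows up as a plain INSERT into sl_ddl_script; run the script text."""
    if change.table == "sl_ddl_script" and change.kind == "INSERT":
        cur.execute(change.values["ddl_text"])  # column name is a guess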

* We would need to continue to populate sl_confirm, because knowing what 
SYNC events have already been processed by a node is pretty important in 
a MOVE SET or FAILOVER.  It is possible that we might still need to 
track the xip lists of each SYNC for MOVE SET/FAILOVER, but I'm not sure 
why/why not.  (The confirm step itself is sketched below.)
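
The confirm itself is simple; assuming the existing sl_confirm layout, 
it would just run right after the commit at each SYNC boundary:

def confirm_sync(replica_conn, origin_id, local_node_id, ev_seqno):
    """Record locally that this node has finished the given SYNC from origin_id."""
    with replica_conn.cursor() as cur:
        cur.execute(
            "INSERT INTO sl_confirm (con_origin, con_received, con_seqno, con_timestamp)"
            " VALUES (%s, %s, %s, now())",
            (origin_id, local_node_id, ev_seqno),
        )
    replica_conn.commit()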

This is all easier said than implemented.


Steve





> Andres



