Re: logical changeset generation v3 - comparison to Postgres-R change set format - Mailing list pgsql-hackers

From Markus Wanner
Subject Re: logical changeset generation v3 - comparison to Postgres-R change set format
Date
Msg-id 50A643BF.3060100@bluegap.ch
Whole thread Raw
In response to logical changeset generation v3  (andres@anarazel.de (Andres Freund))
Responses Re: logical changeset generation v3 - comparison to Postgres-R change set format
Re: logical changeset generation v3 - comparison to Postgres-R change set format
Re: logical changeset generation v3 - comparison to Postgres-R change set format
List pgsql-hackers
Andres,

On 11/15/2012 01:27 AM, Andres Freund wrote:
> In response to this you will soon find the 14 patches that currently
> implement $subject.

Congratulations on that piece of work.


I'd like to provide a comparison of the proposed change set format to
the one used in Postgres-R. I hope for this comparison to shed some
light on the similarities and differences of the two projects. As the
author of Postgres-R, I'm obviously biased, but I try to be as neutral
as I can.


Let's start with the representation: I so far considered the Postgres-R
change set format to be an implementation detail and I don't intend it
to be readable by humans or third party tools. It's thus binary only and
doesn't offer a textual representation. The approach presented here
seems to target different formats for different audiences, including
binary representations. More general, less specific.


Next, contents: this proposal is more verbose. In the textual
representation shown, it provides (usually redundant) information about
attribute names and types. Postgres-R doesn't ever transmit attribute
name or type information for INSERT, UPDATE or DELETE operations.
Instead, it relies on attribute numbers and pg_attributes being at some
known consistent state.

Let's compare by example:

> table "replication_example": INSERT: id[int4]:1 somedata[int4]:1 text[varchar]:1
> table "replication_example": UPDATE: id[int4]:1 somedata[int4]:-1 text[varchar]:1
> table "replication_example": DELETE (pkey): id[int4]:1

In Postgres-R, the change sets for these same operations would carry the
following information, in a binary representation:

> table "replication_example": INSERT: VALUES (1, 1, '1')
> table "replication_example": UPDATE: PKEY(1) COID(77) MODS('nyn') VALUES(-1)
> table "replication_example": DELETE: PKEY(1) COID(78)

You may have noticed that there's an additional COID field. This is an
identifier for the transaction that last changed this tuple. Together
with the primary key, it effectively identifies the exact version of a
tuple (during its lifetime, for example before vs after an UPDATE). This
in turn is used by Postgres-R to detect conflicts.

It may be possible to add that to the proposed format as well, for it to
be able to implement a Postgres-R-like algorithm.


To finish off this comparison, let's take a look at how and where the
change sets are generated: in Postgres-R the change set stream is
constructed directly from the heap modification routines, i.e. in
heapam.c's heap_{insert,update,delete}() methods. Where as the patches
proposed here parse the WAL to reconstruct the modifications and add the
required meta information.

To me, going via the WAL first sounded like a step that unnecessarily
complicates matters. I recently talked to Andres and brought that up.
Here's my current view of things:

The Postgres-R approach is independent of WAL and its format, where as
the approach proposed here clearly is not. Either way, there is a
certain overhead - however minimal it is - which the former adds to the
transaction processing itself, while the later postpones it to a
separate XLogReader process. If there's any noticeable difference, it
might reduce latency in case of asynchronous replication, but can only
increase latency in the synchronous case. As far as I understood Andres,
it was easier to collect the additional meta data from within the
separate process.


In summary, I'd say that Postgres-R is an approach specifically
targeting and optimized for multi-master replication between Postgres
nodes, where as the proposed patches are kept more general.

I hope you found this to be an insightful and fair comparison.

Regards

Markus Wanner



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)
Next
From: Amit Kapila
Date:
Subject: Re: Switching timeline over streaming replication