Re: [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)
Date
Msg-id 50767223.7070207@vmware.com
In response to Re: [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)  (Andres Freund <andres@2ndquadrant.com>)
Re: [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 22.09.2012 20:00, Andres Freund wrote:
> [[basic-schema]]
> .Architecture Schema
> ["ditaa"]
> ------------------------------------------------------------------------------
>          Traditional Stuff
>
>   +---------+---------+---------+---------+----+
>   | Backend | Backend | Backend | Autovac | ...|
>   +----+----+---+-----+----+----+----+----+-+--+
>        |        |          |         |      |
>        +------+ | +--------+         |      |
>      +-+      | | | +----------------+      |
>      |        | | | |                       |
>      |        v v v v                       |
>      |     +------------+                   |
>      |     | WAL writer |<------------------+
>      |     +------------+
>      |       | | | | |
>      v       v v v v v       +-------------------+
> +--------+ +---------+   +->| Startup/Recovery  |
> |{s}     | |{s}      |   |  +-------------------+
> |Catalog | |   WAL   |---+->| SR/Hot Standby    |
> |        | |         |   |  +-------------------+
> +--------+ +---------+   +->| Point in Time     |
>      ^          |            +-------------------+
>   ---|----------|--------------------------------
>      |       New Stuff
> +---+          |
> |              v            Running separately
> | +----------------+  +=-------------------------+
> | | Walsender  |   |  |                          |
> | |            v   |  |    +-------------------+ |
> | +-------------+  |  | +->| Logical Rep.      | |
> | |     WAL     |  |  | |  +-------------------+ |
> +-|  decoding   |  |  | +->| Multimaster       | |
> | +------+------/  |  | |  +-------------------+ |
> | |            |   |  | +->| Slony             | |
> | |            v   |  | |  +-------------------+ |
> | +-------------+  |  | +->| Auditing          | |
> | |     TX      |  |  | |  +-------------------+ |
> +-| reassembly  |  |  | +->| Mysql/...         | |
> | +-------------/  |  | |  +-------------------+ |
> | |            |   |  | +->| Custom Solutions  | |
> | |            v   |  | |  +-------------------+ |
> | +-------------+  |  | +->| Debugging         | |
> | |   Output    |  |  | |  +-------------------+ |
> +-|   Plugin    |--|--|-+->| Data Recovery     | |
>    +-------------/  |  |    +-------------------+ |
>    |                |  |                          |
>    +----------------+  +--------------------------|
> ------------------------------------------------------------------------------

This diagram triggers a pet peeve of mine: what do all the boxes and 
lines mean? An architecture diagram should always include a key. I find 
that when I am drawing a diagram myself, adding the key clarifies my own 
thinking too.

This looks like a data-flow diagram, where the arrows indicate data 
flowing between components, and the boxes seem to represent processes. But 
in that case, I think the arrows pointing from the plugins in walsender 
to Catalog are backwards. The catalog information flows from the Catalog 
to walsender; walsender does not write to the catalogs.


Zooming out to look at the big picture, I think the elephant in the room 
with this whole effort is how it fares against trigger-based 
replication. You list a number of disadvantages that trigger-based 
solutions have, compared to the proposed logical replication. Let's take 
a closer look at them:

> * essentially duplicates the amount of writes (or even more!)

True.

> * synchronous replication hard or impossible to implement

I don't see any explanation of how it could be implemented in the 
proposed logical replication either.

> * noticeable CPU overhead
>   * trigger functions
>   * text conversion of data

Well, I'm pretty sure we could find some micro-optimizations for these 
if we put in the effort. And the proposed code isn't exactly free, either.

> * complex parts implemented in several solutions

Not sure what this means, but the proposed code is quite complex too.

> * not in core

IMHO that's a good thing, and I'd hope that this new logical replication 
will live outside core as well, as much as possible. But whether or not 
something is in core is just a political decision, not a reason to 
implement something new.

If the only meaningful advantage is reducing the amount of WAL written, 
I can't help thinking that we should just try to address that in the 
existing solutions, even if it seems "easy to solve at a first glance, 
but a solution not using a normal transactional table for its log/queue 
has to solve a lot of problems", as the document says. Sorry to be a 
naysayer, but I'm pretty scared of all the new code and complexity these 
patches bring into core.

PS. I'd love to see a basic Slony plugin for this, for example, to see 
how much extra code on top of the posted patches you need to write in a 
plugin like that to make it functional. I'm worried that it's a lot.

- Heikki


