Re: Sending notifications from the master to the standby - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Sending notifications from the master to the standby
Msg-id CA+U5nMJswZLZFOWzdAjROrFMww6yfv39aJbweMDLx35GLew96Q@mail.gmail.com
In response to Re: Sending notifications from the master to the standby  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Jan 10, 2012 at 4:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> On Tue, Jan 10, 2012 at 5:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> It might be a bit tricky to get walreceivers to inject
>>> the data into the slave-side ring buffer at the right time, ie, not
>>> until after the commit a given message describes has been replayed;
>>> but I don't immediately see a reason to think that's infeasible.
>
>> [ Simon sketches a design for that ]
>
> Seems a bit overcomplicated.  I was just thinking of having walreceiver
> note the WAL endpoint at the instant of receipt of a notify message,
> and not release the notify message to the slave ring buffer until WAL
> replay has advanced that far.  You'd need to lay down ground rules about
> how the walsender times the insertion of notify messages relative to
> WAL in its output.

You have to store the messages somewhere until they're needed. If that
somewhere isn't on the standby, very close to the Startup process, then
it's going to be very slow. Putting a marker in the WAL stream
guarantees arrival order. The hash table was just a place to store
them until they're needed; it could be a ring buffer as well.
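
To make the "store them until they're needed" idea concrete, here is a
rough sketch in Python rather than backend C. Every name in it
(PendingNotify, StandbyNotifyBuffer, receive, advance_replay) is
hypothetical and LSNs are modelled as plain integers. It is only an
illustration of holding messages on the standby and releasing them in
WAL order once replay has passed their position, not a proposed patch.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class PendingNotify:
    lsn: int       # WAL position the message must wait for (assumed integer)
    arrival: int   # tie-breaker so equal LSNs keep their arrival order
    channel: str = field(compare=False)
    payload: str = field(compare=False)


class StandbyNotifyBuffer:
    """Hypothetical holding area kept on the standby, near WAL replay."""

    def __init__(self):
        self._heap = []
        self._arrivals = 0
        self.released = []  # stands in for the standby-side ring buffer

    def receive(self, lsn, channel, payload):
        # Messages may arrive before replay has reached their LSN.
        heapq.heappush(
            self._heap, PendingNotify(lsn, self._arrivals, channel, payload)
        )
        self._arrivals += 1

    def advance_replay(self, replayed_lsn):
        # Release everything whose LSN has been replayed, lowest LSN first.
        while self._heap and self._heap[0].lsn <= replayed_lsn:
            n = heapq.heappop(self._heap)
            self.released.append((n.channel, n.payload))
```

A heap is used here only because it keeps the example short; a ring
buffer with the same release rule would serve equally well.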

Inserts into the slave ring buffer already have an xid on them, so the
test will probably already cope with messages inserted but for which
the parent xid has not committed. The only problem is coping with
possible out of sequence messages.
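
As a simplified model of that test (an assumption about its shape, not
the actual backend code), a reader of the ring buffer could stop at the
first entry whose xid is not yet known to be committed. The function
name and the use of a set of committed xids are made up for the example.

```python
def deliverable(queue, committed_xids):
    """Return (channel, payload) pairs up to the first uncommitted xid.

    queue is a list of (xid, channel, payload) tuples in insertion order;
    committed_xids is a set of xids known to have committed. Both are
    hypothetical stand-ins for the real shared-memory structures.
    """
    out = []
    for xid, channel, payload in queue:
        if xid not in committed_xids:
            # Stop here so later entries cannot overtake this one.
            break
        out.append((channel, payload))
    return out
```

Note that this only copes with entries that are early; it does nothing
about entries that were inserted in the wrong order.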

> But I don't see the need for either explicit markers
> in the WAL stream or a hash table.  Indeed, a hash table scares me
> because it doesn't clearly guarantee that notifies will be released in
> arrival order.

The hash table is clearly not the thing providing an arrival order
guarantee; it was just a cache.

You have a few choices: (1) you either send the message while holding
an exclusive lock, or (2) you send them as they come and buffer them,
then reorder them using the WAL log sequence since that matches the
original commit sequence. Or (3) add a sequence number to the messages
sent by WALSender, so that the WALReceiver can buffer them locally and
insert them in the correct order into the normal ring buffer - so in
(3) the message sequence and the WAL sequence match, but the mechanism
is different.

(1) is out because the purpose of offloading to the standby is to give
the master more capacity. If we slow it down in order to serve the
standby we're doing things the wrong way around.

I was choosing (2), maybe you prefer (3) or another design entirely.
They look very similar to me and about the same complexity; it's just
copying data and preserving sequence.
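
For comparison, option (3) might look something like the sketch below.
It assumes the sender stamps each message with a gap-free sequence
number starting at 1; the class and method names are invented for the
example and nothing like them exists in the tree today.

```python
class SequencedReceiver:
    """Hypothetical receiver-side reorder buffer for option (3)."""

    def __init__(self, first_seq=1):
        self._next = first_seq  # next sequence number we may release
        self._held = {}         # out-of-order messages, keyed by sequence
        self.ring = []          # stands in for the normal ring buffer

    def receive(self, seq, message):
        self._held[seq] = message
        # Drain every message that is now contiguous with what was released.
        while self._next in self._held:
            self.ring.append(self._held.pop(self._next))
            self._next += 1
```

The only real difference from (2) is what supplies the ordering key: a
counter maintained by the sender instead of the WAL position.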

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

