Postgres-R: internal messaging - Mailing list pgsql-hackers

From Markus Wanner
Subject Postgres-R: internal messaging
Date
Msg-id 4886DB0B.1090508@bluegap.ch
List pgsql-hackers
Hi,

As you certainly know by now, Postgres-R introduces an additional
manager process. It is forked from the postmaster, as are all
backends, no matter whether they process local or remote transactions.
That led to a communication problem, which was originally (i.e. around
Postgres-R for 6.4) solved by using Unix pipes. I didn't like that
approach for various reasons: first, AFAIK there are portability issues;
second, it eats file descriptors; and third, it involves copying the
messages around several times. As the replication manager needs to talk
to the backends, but both need to be forked from the postmaster, the
pipes would also have to go through the postmaster process.

Trying to be as portable as Postgres itself and still wanting an
efficient messaging system, I came up with that imessages stuff, which
I've already posted to -patches before [1]. It uses shared memory to
store and 'transfer' the messages and signals to notify other processes
(the so far unused SIGUSR2, IIRC). Of course this implies having a hard
limit on the total size of messages waiting to be delivered, due to the
fixed size of the shared memory area.

Besides the communication between the replication manager and the
backends, which is currently done by using these imessages, the
replication manager also needs to communicate with the postmaster: it
needs to be able to request new helper backends and it wants to be
notified upon termination (or crash) of such a helper backend (and other 
backends as well...). I'm currently doing this with imessages as well, 
which violates the rule that the postmaster may not touch shared
memory. I didn't look into ripping that out, yet. I'm not sure it can be 
done with the existing signaling of the postmaster.

Let's have a simple example: consider a local transaction which changes 
some tuples. Those are being collected into a change set, which gets 
written to the shared memory area as an imessage for the replication 
manager. The backend then also signals the manager, which then awakes 
from its select(), checks its imessages queue and processes the message, 
delivering it to the GCS. It then removes the imessage from the shared 
memory area again.
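
A minimal sketch of that round trip, under loud assumptions: it is not the Postgres-R code. The names shm_slot and roundtrip are hypothetical, the "message area" is a single slot in an anonymous shared mapping instead of a queue, and the "manager" sleeps in sigsuspend() where the real one sits in select(). Only the ordering is meant to match the description above: write the message, signal with SIGUSR2, wake up, process, remove.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MSG_MAX 64

/* Hypothetical one-message area; the real imessages area holds a queue. */
typedef struct {
    volatile sig_atomic_t ready;   /* set once the payload is complete */
    size_t len;
    char   payload[MSG_MAX];
} shm_slot;

static volatile sig_atomic_t got_signal = 0;

static void on_sigusr2(int signo) { (void) signo; got_signal = 1; }

/*
 * Fork a "backend" that writes text into shared memory and signals the
 * parent, which plays the "manager". Returns the number of bytes copied
 * into out (NUL-terminated), or -1 on failure.
 */
int roundtrip(const char *text, char *out, size_t outsize)
{
    assert(text != NULL && out != NULL);
    size_t len = strlen(text);
    if (len >= MSG_MAX || len >= outsize)
        return -1;

    shm_slot *slot = mmap(NULL, sizeof *slot, PROT_READ | PROT_WRITE,
                          MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (slot == MAP_FAILED)
        return -1;
    slot->ready = 0;

    /* Install the handler and block SIGUSR2 so it cannot arrive early. */
    struct sigaction sa, old_sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr2;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR2, &sa, &old_sa);

    sigset_t block, old_mask;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR2);
    sigprocmask(SIG_BLOCK, &block, &old_mask);
    got_signal = 0;

    int   result  = -1;
    pid_t manager = getpid();
    pid_t child   = fork();

    if (child == 0) {
        /* "Backend": store the message first, then notify the manager. */
        memcpy(slot->payload, text, len);
        slot->len = len;
        slot->ready = 1;
        kill(manager, SIGUSR2);
        _exit(0);
    }
    if (child > 0) {
        /* "Manager": sleep until signalled, then check the message area. */
        sigset_t wait_mask = old_mask;
        sigdelset(&wait_mask, SIGUSR2);
        while (!got_signal)
            sigsuspend(&wait_mask);

        if (slot->ready) {
            memcpy(out, slot->payload, slot->len);
            out[slot->len] = '\0';
            result = (int) slot->len;
            slot->ready = 0;       /* "remove" the message again */
        }
        waitpid(child, NULL, 0);
    }

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGUSR2, &old_sa, NULL);
    munmap(slot, sizeof *slot);
    return result;
}
```

Blocking SIGUSR2 before the fork and unblocking it only inside sigsuspend() closes the window in which the signal could arrive between the check and the sleep; a select()-based loop needs an equivalent guard.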

My initial design features only a single doubly linked list as the
message queue, holding all messages for all processes. An imessages lock
blocks concurrent write access. That's still what's in there, but I
realize that's not enough. Each process should have its own queue, and
the single lock needs to vanish to avoid contention on that lock.
However, that would require dynamically allocatable shared memory...
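
One way the per-process queue idea could look, as a rough sketch and not the existing imessages code: all names here (msg_area, msg_send, msg_receive and the constants) are hypothetical. It sidesteps dynamic allocation by carving messages out of a fixed pool of equally sized slots, links them by index instead of by pointer, uses singly linked FIFO queues where the current design has a doubly linked list, and leaves out locking entirely. A real version would need a lock per queue plus one for the free list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_SLOTS   8      /* fixed pool: hard limit on pending messages */
#define NUM_QUEUES  4      /* one queue per recipient process */
#define PAYLOAD_MAX 32
#define NIL         (-1)

typedef struct {
    int    next;           /* index of next slot in the same list, or NIL */
    int    sender;
    size_t len;
    char   payload[PAYLOAD_MAX];
} msg_slot;

typedef struct {
    int head;              /* oldest message, or NIL if empty */
    int tail;              /* newest message, or NIL if empty */
    /* a real version would keep a per-queue lock here */
} msg_queue;

typedef struct {
    int       free_head;   /* free list; still shared by all senders */
    msg_slot  slots[NUM_SLOTS];
    msg_queue queues[NUM_QUEUES];
} msg_area;

void area_init(msg_area *a)
{
    for (int i = 0; i < NUM_SLOTS; i++)
        a->slots[i].next = (i + 1 < NUM_SLOTS) ? i + 1 : NIL;
    a->free_head = 0;
    for (int q = 0; q < NUM_QUEUES; q++)
        a->queues[q].head = a->queues[q].tail = NIL;
}

/* Append a message to the recipient's queue; -1 if the pool is exhausted. */
int msg_send(msg_area *a, int sender, int recipient,
             const void *data, size_t len)
{
    assert(recipient >= 0 && recipient < NUM_QUEUES);
    if (len > PAYLOAD_MAX || a->free_head == NIL)
        return -1;

    int idx = a->free_head;            /* take a slot off the free list */
    msg_slot *s = &a->slots[idx];
    a->free_head = s->next;

    s->next = NIL;
    s->sender = sender;
    s->len = len;
    memcpy(s->payload, data, len);

    msg_queue *q = &a->queues[recipient];
    if (q->tail == NIL)
        q->head = idx;
    else
        a->slots[q->tail].next = idx;
    q->tail = idx;
    return 0;
}

/* Pop the oldest message for recipient; returns its length or -1. */
int msg_receive(msg_area *a, int recipient, void *out, size_t outsize,
                int *sender)
{
    assert(recipient >= 0 && recipient < NUM_QUEUES);
    msg_queue *q = &a->queues[recipient];
    int idx = q->head;
    if (idx == NIL)
        return -1;

    msg_slot *s = &a->slots[idx];
    if (s->len > outsize)
        return -1;

    q->head = s->next;
    if (q->head == NIL)
        q->tail = NIL;

    memcpy(out, s->payload, s->len);
    if (sender != NULL)
        *sender = s->sender;
    int len = (int) s->len;

    s->next = a->free_head;            /* return the slot to the pool */
    a->free_head = idx;
    return len;
}
```

The fixed slot size wastes space for small messages and caps large ones, which is exactly the limitation a dynamic shared memory allocator would remove. The shared free list also remains a point of contention, so this only moves part of the locking problem.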

As another side note: I've had to write methods similar to those in
libpq, which serialize and deserialize integers or strings. The libpq
functions were not appropriate because they cannot write to shared
memory; instead they are designed to flush to a socket, if I understand
correctly. Maybe these could be extended or modified to be usable there
as well? I hesitated and instead implemented separate methods in
src/backend/storage/ipc/buffer.c.
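
To illustrate the kind of helper meant here, a small sketch that serializes into a caller-supplied memory region instead of a socket. It is not the contents of buffer.c: the buffer type and the buf_* functions are hypothetical names, integers are stored in host byte order (assumed acceptable because the data never leaves the machine), and strings are stored with a 32-bit length prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Cursor over a fixed region, e.g. a chunk of shared memory. */
typedef struct {
    char   *data;   /* start of the region */
    size_t  size;   /* capacity in bytes */
    size_t  pos;    /* current read/write offset */
} buffer;

void buf_init(buffer *b, void *mem, size_t size)
{
    b->data = mem;
    b->size = size;
    b->pos = 0;
}

/* Write a 32-bit integer; returns 0 on success, -1 if it does not fit. */
int buf_put_int32(buffer *b, int32_t v)
{
    assert(b->pos <= b->size);
    if (b->size - b->pos < sizeof v)
        return -1;
    memcpy(b->data + b->pos, &v, sizeof v);
    b->pos += sizeof v;
    return 0;
}

/* Read a 32-bit integer; returns 0 on success, -1 on underrun. */
int buf_get_int32(buffer *b, int32_t *out)
{
    assert(b->pos <= b->size);
    if (b->size - b->pos < sizeof *out)
        return -1;
    memcpy(out, b->data + b->pos, sizeof *out);
    b->pos += sizeof *out;
    return 0;
}

/* Write a string as length prefix plus bytes (no terminating NUL). */
int buf_put_string(buffer *b, const char *s)
{
    size_t len = strlen(s);
    assert(b->pos <= b->size);
    if (len > INT32_MAX || b->size - b->pos < sizeof(int32_t) + len)
        return -1;
    buf_put_int32(b, (int32_t) len);   /* cannot fail after the check */
    memcpy(b->data + b->pos, s, len);
    b->pos += len;
    return 0;
}

/* Read a string into out and NUL-terminate it; -1 if it does not fit. */
int buf_get_string(buffer *b, char *out, size_t outsize)
{
    size_t  saved = b->pos;
    int32_t len;

    if (buf_get_int32(b, &len) != 0)
        return -1;
    if (len < 0 || (size_t) len > b->size - b->pos ||
        (size_t) len + 1 > outsize) {
        b->pos = saved;                /* leave the cursor untouched */
        return -1;
    }
    memcpy(out, b->data + b->pos, (size_t) len);
    out[len] = '\0';
    b->pos += (size_t) len;
    return 0;
}
```

A writer fills the region with buf_put_* calls; the reader initializes its own cursor over the same region and calls the matching buf_get_* functions in the same order.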

Comments?

Regards

Markus Wanner

[1]: last time I published IMessage stuff on -patches, WIP:
http://archives.postgresql.org/pgsql-patches/2007-01/msg00578.php


