Thread: Thread-safe queueing?

Thread-safe queueing?

From

Tim Holloway

Date:

13 November 1999, 00:17:11

I need to create a cross-process producer/consumer data queue (e.g. singly-linked list). 

That is - Processes A, B, and C add nodes to a controlled list and process D removes them.
Not sure if the creation of the nodes would be best done by the producers or consumers,
but destruction would have to be done by the consumer, as the producers don't wait for
processing. For optimal results, the consumer process should sleep until item(s) are added
to its queue.

Query: within the existing backend framework, what's the best way to accomplish this?
  Thanks,
   Tim Holloway

Re: [HACKERS] Thread-safe queueing?

From

Tom Lane

Date:

13 November 1999, 10:16:17

Tim Holloway <mtsinc@southeast.net> writes:
> I need to create a cross-process producer/consumer data queue
> (e.g. singly-linked list).  That is - Processes A, B, and C add nodes
> to a controlled list and process D removes them.  Not sure if the
> creation of the nodes would be best done by the producers or
> consumers, but destruction would have to be done by the consumer, as
> the producers don't wait for processing. For optimal results, the
> consumer process should sleep until item(s) are added to its queue.

> Query: within the existing backend framework, what's the best way to
> accomplish this?

More context, please.  What are you trying to accomplish?  Is this
really a communication path between backends (and if so, what backend
code needs it?), or are you trying to set up a queue between SQL
clients?  How much data might need to be in the queue at one time?
        regards, tom lane

Re: [HACKERS] Thread-safe queueing?

From

Tim Holloway

Date:

13 November 1999, 20:18:24

Tom Lane wrote:
> 
> Tim Holloway <mtsinc@southeast.net> writes:
> > I need to create a cross-process producer/consumer data queue
> > (e.g. singly-linked list).  That is - Processes A, B, and C add nodes
> > to a controlled list and process D removes them.  Not sure if the
> > creation of the nodes would be best done by the producers or
> > consumers, but destruction would have to be done by the consumer, as
> > the producers don't wait for processing. For optimal results, the
> > consumer process should sleep until item(s) are added to its queue.
> 
> > Query: within the existing backend framework, what's the best way to
> > accomplish this?
> 
> More context, please.  What are you trying to accomplish?  Is this
> really a communication path between backends (and if so, what backend
> code needs it?), or are you trying to set up a queue between SQL
> clients?  How much data might need to be in the queue at one time?
> 
>                         regards, tom lane
> 

This is for the logging subsystem I'm developing. The backends call pg_log(),
which is like elog(), except that the message is a resource ID + any parameters
in order to support locales and custom message formatting. These ID+parameter
packets are then pipelined down to the logging channels via the log engine to
be formatted and output according to rules in the configuration file.

I *think* that the log engine should be a distinct process. I'm not sure I can
trust the output not to come out sliced and diced if each backend can run the engine
directly -- and for that matter, I see problems if the engine is reconfigured on the
fly owing to the need for each backend to replicate the configuration process (among
other things). The basic singly-linked list component is all I need to handle the
FIFO, but obviously I need guards to preserve its integrity. As to the amount of data
involved, I sincerely hope the queue would stay pretty shallow!

I have the configuration parser and logging engine operational, so the last
significant hurdle is making sure that A) the data to be logged is
accessable/addressable by the engine, and B) that the process runs in the
proper sequence. A description of what it all will look like is now online at http://postgres.mousetech.com/index.html
(with apologies for the ugly formatting).
  Thanks,
     TIm Holloway

Re: [HACKERS] Thread-safe queueing?

From

Tom Lane

Date:

13 November 1999, 21:30:25

Tim Holloway <mtsinc@southeast.net> writes:
> Tom Lane wrote:
>> More context, please.

> This is for the logging subsystem I'm developing. The backends call
> pg_log(), which is like elog(), except that the message is a resource
> ID + any parameters in order to support locales and custom message
> formatting. These ID+parameter packets are then pipelined down to the
> logging channels via the log engine to be formatted and output
> according to rules in the configuration file.

OK.  You probably want something roughly comparable to the shared-inval
message queue --- see include/storage/sinvaladt.h and
backend/storage/ipc/sinvaladt.c.  That's more complex than your problem
in one way (the sinval queue must be read by all N backends, not just
one process) but simpler in another (we can shoehorn all SI messages
into a fixed-size structure; is that practical for log data?).  Anyway,
a queue structure in shared memory protected by spinlocks is what you
want, and sinval is about the closest thing we have to that at the
moment.

> I *think* that the log engine should be a distinct process.

Probably so, if you use a shared-memory queue.  Shared memory is a
finite resource; AFAIK it's not practical to enlarge it on-the-fly.
So you will have to set a maximum queue size --- either a compile-time
constant, or at best a value chosen at postmaster start time.  This
implies that there will be scenarios where backends are waiting for room
to be made in the log queue.  If the queue emptier is a separate process
then those waits can't turn into deadlocks.  (sinval works around the
memory-overflow problem with a special "reset" mechanism, but that
doesn't seem appropriate for logging.)

Alternatively, you could forget about a queue per se, and just allow
each backend to execute the sending of its own log messages, using
a spinlock in shared memory to prevent concurrent issuance of log
messages on channels where that's a problem.  That might be the
simplest and most robust approach.
        regards, tom lane

Re: [HACKERS] Thread-safe queueing?

From

Bruce Momjian

Date:

13 November 1999, 22:13:25

> Alternatively, you could forget about a queue per se, and just allow
> each backend to execute the sending of its own log messages, using
> a spinlock in shared memory to prevent concurrent issuance of log
> messages on channels where that's a problem.  That might be the
> simplest and most robust approach.

Hold on.  Unix guarantees all write() calls are atomic, so no one gets
in between that write.  Why not just collect the output into one buffer
in the backend, and blast the entire buffer in one write() to the log
file.

I don't think there is any way another backend could mess that up.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026

Re: [HACKERS] Thread-safe queueing?

From

Tom Lane

Date:

13 November 1999, 22:50:26

Bruce Momjian <maillist@candle.pha.pa.us> writes:
>> Alternatively, you could forget about a queue per se, and just allow
>> each backend to execute the sending of its own log messages, using
>> a spinlock in shared memory to prevent concurrent issuance of log
>> messages on channels where that's a problem.  That might be the
>> simplest and most robust approach.

> Hold on.  Unix guarantees all write() calls are atomic, so no one gets
> in between that write.

Actually, I didn't say that I *believed* there were any channel types
where such an interlock is essential ;-).  I just said that spinlocking
is a solution if the problem comes up.

Tim mentioned on-the-fly reconfiguration of logging as an area that
might need interlocking, and I'm more prepared to believe that.
Still, we seem to be getting on just fine with each backend responding
independently to reconfiguration of the pg_options values.  So I'd be
inclined to build it that way, and wait for evidence of a performance
problem before spending effort to make it smarter.

Which I guess is Bruce's point also...
        regards, tom lane