Re: [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility - Mailing list pgsql-hackers

From Christopher Browne
Subject Re: [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility
Date
Msg-id CAFNqd5VVA8n3m+6o6-CcmEkj=fgW+-oGqE2O3bKa96g2a6hA=w@mail.gmail.com
In response to Re: [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On Wed, Oct 17, 2012 at 4:25 PM, Josh Berkus <josh@agliodbs.com> wrote:
>
>> It is not meant to be a full implementation of an application-level
>> queuing system, though, but just the capture, persistence, and
>> distribution parts.
>>
>> Using this as an "application level queue" needs a set of interface
>> functions to extract the events and also to keep track of the processed
>> events. As there is no general consensus on what these should be (such as
>> whether processing the same event twice is allowed), this part is left
>> for specific queue consumer implementations.
>
> Well, but AFAICT, your design already prohibits features which are
> essential to application-level queues and which are implemented by,
> for example, pgQ.
>
> 1. your design only allows the queue to be read on replicas, not on the
> node where the item was inserted.

I commented separately on this; I'm pretty sure there needs to be a
way to read the queue on a replica, yes, indeed.

> 2. if you can't UPDATE or DELETE queue items -- or LOCK them -- how on
> earth would a client know which items they have executed and which they
> haven't?

If the items are actually stored in WAL, then it seems well and truly
impossible to do any of those three things directly.

What could be done, instead, would be to add "successor" items to
indicate that earlier items have been dealt with; in effect, back-references.

You don't get to UPDATE or DELETE; instead, you do something like:
  INSERT INTO queue (reference_to_xid, reference_to_id_in_xid, action)
  VALUES (old_xid_1, old_id_within_xid_1, 'COMPLETED'),
         (old_xid_2, old_id_within_xid_2, 'CANCELLED');
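
A consumer could then work out what is still pending by anti-joining the
queue against those successor items. A rough sketch only; the column names
(xid, id_within_xid, payload, and the reference_to_* ones above) are
hypothetical, nothing in the proposal defines them:

  SELECT q.xid, q.id_within_xid, q.payload
    FROM queue q
   WHERE q.action = 'NEW'
     AND NOT EXISTS
         (SELECT 1
            FROM queue s
           WHERE s.reference_to_xid = q.xid
             AND s.reference_to_id_in_xid = q.id_within_xid
             AND s.action IN ('COMPLETED', 'CANCELLED'));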

In a distributed context, it's possible that multiple nodes could be
reading from the same queue, so that while "process at least once" is
no trouble, "process at most once" is just plain troublesome.
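
If "at most once" is really wanted, one workaround could be for consumers
to race on a claim in an ordinary table with a unique constraint, so that
only one node wins each event. Again, just a sketch, outside anything in
the proposal, with made-up names and types:

  CREATE TABLE queue_claims (
      reference_to_xid        bigint NOT NULL,
      reference_to_id_in_xid  bigint NOT NULL,
      claimed_by              text   NOT NULL,
      UNIQUE (reference_to_xid, reference_to_id_in_xid)
  );

  -- each node attempts this; whichever INSERT succeeds does the processing
  INSERT INTO queue_claims VALUES (old_xid_1, old_id_within_xid_1, 'node_a');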
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


