Re: table as log (multiple writers and readers) - Mailing list pgsql-general

From brian
Subject Re: table as log (multiple writers and readers)
Date
Msg-id 4806253B.5040608@zijn-digital.com
Whole thread Raw
In response to table as log (multiple writers and readers)  ("Vance Maverick" <vmaverick@pgp.com>)
Responses Re: table as log (multiple writers and readers)  (Craig Ringer <craig@postnewspapers.com.au>)
List pgsql-general
Vance Maverick wrote:
> I want to create a table with the semantics of a log.  There may be
> multiple concurrent writers, because log entries will be generated by
> triggers.  And there will be multiple concurrent readers -- mostly
> remote processes watching the log over time.  I'd like to guarantee that
> each of those readers will see the same, complete sequence of changes.
>
> For a first attempt, I created a table with a serial column (actually I
> created the sequence ID separately, but that's a detail).  The readers
> connect remotely, scan through the table in sequence order, and remember
> the ID of the last row they read.  When they read again, they start from
> after that ID -- the query is roughly
>
>    SELECT * FROM logtable WHERE id > ? ORDER BY id
>
> But there's a problem with this -- the rows may not be inserted into the
> log in ID order.  For example, if two concurrent sessions get the
> sequence IDs 3 and 4, the one with ID 4 might commit first.  If a reader
> queries the table at that moment, it will see the sequence (1, 2, 4).
> Later, if the other session commits, a log entry with ID 3 will be
> added.  The IDs will now be (1, 2, 4, 3) -- but if the reader comes back
> to poll the log again, it will miss entry 3.  In order not to miss it,
> the reader would have to remember much more information than just a
> high-water mark.  (I think it would have to remember the complete set of
> IDs it had processed already, and then scan the whole table to find
> entries not in the set.)
>
> Is there an approach that will give the semantics I need?  In
> particular, I'd like a reader that has already read up to a certain
> point in the log to be able to restart at or near that same point, so it
> wouldn't have to reread everything.
>
> (I recognize that these log semantics are closer to those provided by
> files, not database tables.  A file is an option I'll consider, but
> obviously it lacks transactional semantics.)
>

Use a timestamp column also.

pgsql-general by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: "vacuum" and "cluster"
Next
From: Craig Ringer
Date:
Subject: Re: table as log (multiple writers and readers)