Re: Spec discussion: Generalized Data Queue / Modification Trigger - Mailing list pgsql-cluster-hackers

From Hannu Krosing
Subject Re: Spec discussion: Generalized Data Queue / Modification Trigger
Msg-id 1267650105.5157.36.camel@hvost
In response to Re: Spec discussion: Generalized Data Queue / Modification Trigger  (Josh Berkus <josh@agliodbs.com>)
Responses Re: Spec discussion: Generalized Data Queue / Modification Trigger
List pgsql-cluster-hackers
On Wed, 2010-03-03 at 11:52 -0800, Josh Berkus wrote:
> Greg,
>
> >> (1) The ability to send asynchronous (or synchronous?) notifications, on
> >> a per-row basis, whenever data is modified *only after commit*.  This
> >> has been generally described as "on-commit triggers", but could actually
> >> take a variety of forms.
> >
> > I'm not sure I like the idea of this. Could be potentially dangerous, as
> > listen/notify is not treated as a "reliable" process. What's wrong with
> > the current method, namely having a row trigger update an internal
> > table, and then a statement level trigger firing off a notify?
>
> Well, the main problem with that is that it doubles the number of writes
> you have to do ... or more.  So it's a major efficiency issue.
>
> This isn't as much of a concern for a system like Slony or Londiste
> where the replication queue is a table in the database.

Yes. For Londiste, the extra cost on top of the WAL writes (which write
bigger chunks of data but need the same number of seeks and syncs) is
only deferred writes to the heap and a single index. Even those may
never actually reach disk if replication is fast enough and the event
tables are rotated before the background writer and checkpoints get
around to writing them out.

> But if you
> were, say, replicating through ApacheMQ?  Or replicating cached data to
> Redis?  Then the whole queue-table, NOTIFY, poll structure is needless
> overhead.

It may seem easy to replace a database table with "something else" for
collecting the changes which have happened during the transaction, but
you have to answer the following questions:

1) Do I need persistence? What about 2PC?

2) Does the "something else" work well in all the situations where an
event table would work (say, for example, a load of 500GB of data in
one transaction)?

3) What would I gain in return for all the work needed to implement the
"something else"?
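
To make the questions above concrete, here is a minimal sketch of one
possible "something else": an in-memory collector that buffers changes
per transaction and publishes them only on commit. Everything in it
(ChangeCollector, max_buffered, the method names) is hypothetical and
made up for illustration; it is not how Londiste or any existing tool
works.

```python
from collections import defaultdict


class ChangeCollector:
    """Toy stand-in for the "something else" (hypothetical, illustration only).

    Buffers row changes per transaction in memory and makes them visible
    to consumers only when the transaction commits.
    """

    def __init__(self, max_buffered=1000):
        self.max_buffered = max_buffered   # assumed per-transaction memory limit
        self.pending = defaultdict(list)   # txid -> uncommitted changes
        self.published = []                # changes visible to consumers

    def record(self, txid, change):
        buf = self.pending[txid]
        if len(buf) >= self.max_buffered:
            # Question 2: a very large transaction does not fit in memory.
            raise MemoryError("transaction %r exceeds the buffer limit" % (txid,))
        buf.append(change)

    def commit(self, txid):
        # Changes become visible only after commit, in commit order.
        self.published.extend(self.pending.pop(txid, []))

    def rollback(self, txid):
        # Work from an aborted transaction must never reach consumers.
        self.pending.pop(txid, None)
```

Even this toy version runs straight into questions 1 and 2: nothing here
survives a crash or a prepared (2PC) transaction, and one big transaction
hits the memory limit, whereas an event table gets both behaviours from
the database itself.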

> >> (3) A method of marking DDL changes in the data modification stream.

Yes, DDL triggers or somesuch would be highly desirable.

> > Hmm..can you expand on what you have in mind here? Something more than
> > just treating the DDL as another item in the (txn ordered) queue?
>
> Yeah, that would be one way to handle it.  Alternately, you could have
> the ability to mark rows with a DDL "version".

But the actual DDL would still need to be transferred, no?

--
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
   Services, Consulting and Training


