Re: GDQ implementation - Mailing list pgsql-cluster-hackers

From Jan Wieck
Subject Re: GDQ implementation
Msg-id 4BF48797.6050500@Yahoo.com
In response to Re: GDQ implementation  (Josh Berkus <josh@agliodbs.com>)
List pgsql-cluster-hackers
On 5/17/2010 5:46 PM, Josh Berkus wrote:
> Jan, Marko, Simon,
>
> I'm concerned that doing anything about the write overhead issue was
> discarded almost immediately in this discussion.  This is not a trivial
> issue for performance; it means that each row which is being tracked by
> the GDQ needs to be written to disk a minimum of 4 times (once to WAL,
> once to table, once to WAL for queue, once to queue).  That's at least
> one time too many, and effectively doubles the load on the master server.
>
> This is particularly unacceptable overhead for systems where users are
> not that interested in retaining the queue after an unexpected shutdown.
>
> Surely there's some way around this?  Some kind of special
> fsync-on-write table, for example?  The access pattern to a queue is
> quite specialized.
>

I recall this slightly differently. What was discarded was the idea of
a PostgreSQL-managed queue that does NOT guarantee consistency with the
final commit status of the message-generating transactions. That is not
the same as ignoring the write overhead.

In all our existing use cases (Londiste/Slony/Bucardo) the information
in the queue cannot be entirely found in the WAL of the original
underlying row operation. There are old row key values and sequence
numbers or other meta information that isn't even known at the time the
original row's WAL entry is written.
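
To make the gap concrete, here is a minimal sketch of a simplified model, not of how Londiste, Slony or Bucardo actually store events. The class names, field names and the capture_event helper are all hypothetical; the only point is that the queue event carries fields the original row's WAL record never had.

```python
from dataclasses import dataclass, fields
from itertools import count


@dataclass(frozen=True)
class HeapWalRecord:
    """Simplified, hypothetical view of what the original row's WAL entry holds."""
    relation: str
    operation: str
    new_row: dict


@dataclass(frozen=True)
class QueueEvent:
    """Simplified, hypothetical view of what a replication queue event holds."""
    relation: str
    operation: str
    new_row: dict
    old_key: dict    # old row key values, needed to find the row on the subscriber
    event_seq: int   # assigned by the queue, after the heap change was logged
    meta: dict       # other meta information (origin, transaction info, ...)


_event_seq = count(1)  # stand-in for a database sequence


def capture_event(wal: HeapWalRecord, old_key: dict, meta: dict) -> QueueEvent:
    """Build the queue event; note the inputs that do not come from the WAL record."""
    return QueueEvent(wal.relation, wal.operation, wal.new_row,
                      old_key, next(_event_seq), meta)


def missing_from_wal() -> list:
    """Names of the queue fields that the WAL record model does not contain."""
    wal_names = {f.name for f in fields(HeapWalRecord)}
    return sorted(f.name for f in fields(QueueEvent) if f.name not in wal_names)
```

In this model, missing_from_wal() lists exactly the pieces named above: the old key values, the sequence number and the meta information.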

It may seem possible to implement the data capturing part of the queue
within the heap access methods, add the extra information to the WAL
record and thus get rid of one of the images. But that isn't as simple
as it sounds. Since queue tables have toast tables too, a queue entry
doesn't consist of simply one "log entry"; it actually consists of a
bunch of tuples: one in the queue table, 0-n in the queue's toast table,
and then the index tuples. In the case of compression, the binary data
in the toasted queue attribute will be entirely different from what you
may find in the WAL pieces that were written for the original data row's
toast segments. It is going to be a heck of a forensics job to
reconstruct all that.
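
A rough sketch of that fan-out, under loudly stated assumptions: zlib stands in for PostgreSQL's own compressor, the 2000-byte threshold and chunk size are approximations rather than the real constants, and physical_tuples is a hypothetical helper, not a PostgreSQL function. It only counts how many physical tuples one logical queue entry could turn into.

```python
import zlib

# Assumed round numbers; the real TOAST threshold and chunk size depend on
# the page size and are not exactly 2000 bytes.
DEFAULT_THRESHOLD = 2000
DEFAULT_CHUNK_SIZE = 2000


def physical_tuples(payload: bytes, queue_indexes: int = 1,
                    threshold: int = DEFAULT_THRESHOLD,
                    chunk_size: int = DEFAULT_CHUNK_SIZE) -> dict:
    """Count the tuples written for one queue entry in this simplified model."""
    # zlib is only a stand-in for the server's compression; the point is that
    # the stored bytes no longer match the bytes of the source row.
    stored = zlib.compress(payload)

    # Values that are still too large after compression are cut into chunks,
    # each chunk being its own tuple in the queue's toast table.
    if len(stored) > threshold:
        chunks = [stored[i:i + chunk_size]
                  for i in range(0, len(stored), chunk_size)]
    else:
        chunks = []

    return {
        "queue_tuples": 1,
        "toast_tuples": len(chunks),
        # One index tuple per index on the queue table, plus one per toast
        # chunk for the toast table's own index.
        "index_tuples": queue_indexes + len(chunks),
        "stored_differs_from_source": stored != payload,
    }
```

Even in this toy model a single event yields one queue tuple, zero or more toast tuples and several index tuples, and the stored bytes differ from the source bytes, which is why matching them back to the original row's WAL pieces is not a simple lookup.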


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin
