Re: Catalog/Metadata consistency during changeset extraction from wal - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Catalog/Metadata consistency during changeset extraction from wal
Date
Msg-id 201206211756.11996.andres@2ndquadrant.com
In response to Re: Catalog/Metadata consistency during changeset extraction from wal  (Florian Pflug <fgp@phlo.org>)
List pgsql-hackers
On Thursday, June 21, 2012 04:05:54 PM Florian Pflug wrote:
> On Jun21, 2012, at 13:41 , Andres Freund wrote:
> > 5.)
> > The actually good idea. Yours?
> 
> What about a mixture of (3b) and (4), which writes the data not to the WAL
> but to a separate logical replication log. More specifically:
> 
> There's a per-backend queue of change notifications.
> 
> Whenever a non-catalog tuple is modified, we queue a TUPLE_MODIFIED
> record containing (xid, databaseoid, tableoid, old xmin, old ctid, new
> ctid)
> 
> Whenever a table (or something that a table depends on) is modified we
> wait until all references to that table's oid have vanished from the queue,
> then queue a DDL record containing (xid, databaseoid, tableoid, text).
> Other backends cannot concurrently add further TUPLE_MODIFIED records since
> we already hold an exclusive lock on the table at that point.
> 
> A background process continually processes these queues. If the front of
> the queue is a TUPLE_MODIFIED record, it fetches the old and the new tuple
> based on their ctids and writes the old tuple's PK and the full new tuple
> to the logical replication log. Since table modifications always wait for
> all previously queued TUPLE_MODIFIED records referencing that table to be
> processed *before* altering the catalog, tuples can always be interpreted
> according to the current (SnapshotNow) catalog contents.
> 
> Upon transaction COMMIT and ROLLBACK, we queue COMMIT and ROLLBACK records,
> which are also written to the log by the background process. The background
> process may decide to wait until a backend commits before processing that
> backend's log. In that case, rolled-back transactions don't leave a trace in
> the logical replication log. Should a backend, however, issue a DDL
> statement, the background process *must* process that backend's queue
> immediately, since otherwise there's a deadlock.
> 
> The background process also maintains a value in shared memory which
> contains the oldest value in any of the queues' xid or "old xmin" fields.
> VACUUM and the like must not remove tuples whose xmin is >= that value.
> Hint bits *may* be set for newest tuples though, provided that the
> background process ignores hint bits when fetching the old and new tuples.
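
For reference, the bookkeeping in the quoted proposal can be sketched roughly as below. This is only an illustrative Python model, not PostgreSQL code: the names `TupleModified`, `DDL`, `BackendQueue`, `can_queue_ddl` and `oldest_needed_xid` are all made up for the sketch, ctids are modelled as (block, offset) pairs, and xids are compared as plain integers, which ignores wraparound.

```python
from collections import deque
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

Ctid = Tuple[int, int]  # (block number, line pointer offset); simplified


@dataclass(frozen=True)
class TupleModified:
    """Queued whenever a non-catalog tuple is modified."""
    xid: int
    database_oid: int
    table_oid: int
    old_xmin: int
    old_ctid: Optional[Ctid]
    new_ctid: Optional[Ctid]


@dataclass(frozen=True)
class DDL:
    """Queued once no queue references the table any more."""
    xid: int
    database_oid: int
    table_oid: int
    text: str


class BackendQueue:
    """Hypothetical per-backend queue of change notifications."""

    def __init__(self) -> None:
        self.records: deque = deque()

    def references_table(self, table_oid: int) -> bool:
        return any(
            isinstance(r, TupleModified) and r.table_oid == table_oid
            for r in self.records
        )


def can_queue_ddl(queues: Iterable[BackendQueue], table_oid: int) -> bool:
    """The DDL record may only be queued after every TUPLE_MODIFIED
    record for that table has been drained by the background process."""
    return not any(q.references_table(table_oid) for q in queues)


def oldest_needed_xid(queues: Iterable[BackendQueue]) -> Optional[int]:
    """Oldest xid or "old xmin" still present in any queue; VACUUM would
    have to keep tuples whose xmin is >= this value. Assumption: plain
    integer comparison, no xid wraparound handling."""
    values = []
    for q in queues:
        for r in q.records:
            values.append(r.xid)
            if isinstance(r, TupleModified):
                values.append(r.old_xmin)
    return min(values) if values else None
```

Note that everything here lives in memory only, so none of it says anything about what happens to the queues after a crash.
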
I think that's too complicated to fly. Getting that to recover cleanly in case
of a crash would mean you'd need another WAL.

I think if it comes to that, going for 1) is more realistic...

Andres
-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

