Re: Exposing the Xact commit order to the user - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Exposing the Xact commit order to the user
Date
Msg-id 1283159933.1800.1844.camel@ebony
In response to Exposing the Xact commit order to the user  (Jan Wieck <JanWieck@Yahoo.com>)
List pgsql-hackers
On Sun, 2010-05-23 at 16:21 -0400, Jan Wieck wrote:

> In some systems (data warehousing, replication), the order of commits is
> important, since that is the order in which changes have become visible.
> This information could theoretically be extracted from the WAL, but
> scanning the entire WAL just to extract this tidbit of information would
> be excruciatingly painful.

This idea had support from at least 6 hackers. I'm happy to add my own.

Can I suggest that it be added as a hook, rather than arguing about the
details too much? The main use case is in combination with external
systems, so a hook lets us maintain the relevant code alongside the
system that cares about it.
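
To illustrate the kind of hook meant here, a minimal standalone sketch
follows. It is not PostgreSQL code: the hook name
xact_commit_order_hook, its signature, and mock_commit_transaction()
are all hypothetical, and the two typedefs are simple stand-ins for the
server's own types. It only shows the usual pattern of a function
pointer that is NULL by default and is called when an external module
has installed something.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the server's types (assumption: not the real definitions). */
typedef uint32_t TransactionId;
typedef int64_t  TimestampTz;

/* Hypothetical hook type: called once per committed transaction. */
typedef void (*xact_commit_order_hook_type) (TransactionId xid,
                                             TimestampTz commit_time);

/* NULL means no external module is interested; the core does nothing extra. */
static xact_commit_order_hook_type xact_commit_order_hook = NULL;

/* Mock of the commit path; the real commit work is elided. */
static void
mock_commit_transaction(TransactionId xid, TimestampTz commit_time)
{
    /* ... the commit itself would be marked here ... */
    if (xact_commit_order_hook != NULL)
        xact_commit_order_hook(xid, commit_time);
}

/* --- What an external module might install (illustrative only) --- */
#define MAX_SEEN 16
static TransactionId seen_xids[MAX_SEEN];
static size_t nseen = 0;

static void
remember_commit(TransactionId xid, TimestampTz commit_time)
{
    (void) commit_time;         /* unused in this sketch */
    assert(nseen <= MAX_SEEN);
    if (nseen < MAX_SEEN)
        seen_xids[nseen++] = xid;   /* stored in commit order, not xid order */
}
```

An external module would set xact_commit_order_hook = remember_commit
when it loads. With no module loaded, the commit path pays only for a
NULL test.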

> CommitTransaction() inside of xact.c will call a function, that inserts
> a new record into this array. The operation will, most of the time, be
> nothing more than taking a spinlock and adding the record to shared memory.
> All the data for the record is readily available, does not require
> further locking and can be collected locally before taking the spinlock.
> The begin_timestamp is the transaction's idea of CURRENT_TIMESTAMP, the
> commit_timestamp is what CommitTransaction() just decided to write into
> the WAL commit record and the total_rowcount is the sum of inserted,
> updated and deleted heap tuples during the transaction, which should be
> easily available from the statistics collector, unless row stats are
> disabled, in which case the datum would be zero.

Does this need to be called while in a critical section? Or can we wait
until after the commit has actually been marked before calling it?

> Checkpoint handling will call a function to flush the shared buffers.
> Together with this, the information from WAL records will be sufficient
> to recover this data (except for row counts) during crash recovery.

So it would need to work identically in recovery also?

These two values are not currently stored in the commit WAL record.

timestamptz   xci_begin_timestamp
int64         xci_total_rowcount

Both of those seem optional, so I don't really want them added to WAL.

-- 
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services


