Re: Exposing the Xact commit order to the user - Mailing list pgsql-hackers
From | Dan Ports |
---|---|
Subject | Re: Exposing the Xact commit order to the user |
Date | |
Msg-id | 20100524191021.GA53044@csail.mit.edu Whole thread Raw |
In response to | Exposing the Xact commit order to the user (Jan Wieck <JanWieck@Yahoo.com>) |
Responses |
Re: Exposing the Xact commit order to the user
|
List | pgsql-hackers |
On Sun, May 23, 2010 at 04:21:58PM -0400, Jan Wieck wrote: > In some systems (data warehousing, replication), the order of commits is > important, since that is the order in which changes have become visible. > This information could theoretically be extracted from the WAL, but > scanning the entire WAL just to extract this tidbit of information would > be excruciatingly painful. This is very interesting to me as I've been doing some (research -- nowhere near production-level) work on building a transactional application-level (i.e. memcached-like) cache atop Postgres. One of the features I needed to support it was basically what you describe. Without getting too far into the details of what I'm doing, I needed to make it clear to a higher layer which commits were visible to a given query. That is, I wanted to know both the order of commits and where particular snapshots fit into this ordering. (A SnapshotData struct obviously contains the visibility information, but a representation in terms of the commit ordering is both more succinct and allows for easy ordering comparisons). Something you might want to consider, then, is adding an interface to find out the timestamp of the current transaction's snapshot, i.e. the timestamp of the most recent committed transaction visible to it. I wouldn't expect this to be difficult to implement as transaction completion/visibility is already synchronized via ProcArrayLock. > Each record of the Transaction Commit Info consists of > > txid xci_transaction_id > timestamptz xci_begin_timestamp > timestamptz xci_commit_timestamp > int64 xci_total_rowcount Another piece of information that seems useful to provide here would be the logical timestamp of the transaction, i.e. a counter that's incremented by one for each transaction. But maybe that's implicit in the log ordering? I'm not clear on why the total rowcount is useful, but perhaps I'm missing something obvious. I've actually implemented some semblance of this on Postgres 8.2, but it sounds like what you're interested in is more sophisticated. In particular, I wasn't at all concerned with durability or WAL stuff, and I had some specific requirements about when it was OK to purge the data. Because of this (and very limited development time), I just threw something together with a simple shared buffer. I don't think I have any useful code to offer, but let me know if there's some way I can help out. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
pgsql-hackers by date: