Re: Exposing the Xact commit order to the user - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Exposing the Xact commit order to the user
Msg-id AANLkTilZT1Z4YsWxjgxEptruaT3xWEaSJaJi6iO8s9MQ@mail.gmail.com
In response to Re: Exposing the Xact commit order to the user  (Jan Wieck <JanWieck@Yahoo.com>)
List pgsql-hackers
On Sun, May 23, 2010 at 9:44 PM, Jan Wieck <JanWieck@yahoo.com> wrote:
> I'm not sure the retention policies of the shared buffer cache, the WAL
> buffers, CLOG buffers and every other thing we try to cache are that easy to
> fold into one single set of logic. But I'm all ears.

I'm not sure either, although it seems like LRU ought to be good
enough for most things.  I'm more worried about things like whether
the BufferDesc abstraction is going to get in the way.

>>> CommitTransaction() inside of xact.c will call a function, that inserts
>>> a new record into this array. The operation will for most of the time be
>>> nothing more than taking a spinlock and adding the record to shared memory.
>>> All the data for the record is readily available, does not require
>>> further locking and can be collected locally before taking the spinlock.
>>
>> What happens when you need to switch pages?
>
> Then the code will have to grab another free buffer or evict one.

Hopefully not while holding a spin lock.  :-)
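
To make the concern concrete, here is a minimal sketch in Python rather than backend C. Everything in it is hypothetical: CommitOrderLog, PAGE_SIZE and _allocate_page are illustrative names, and threading.Lock only stands in for a spinlock. The point is the shape: the fast path appends under the lock, and when the current page is full the lock is dropped before the slow step (grabbing or evicting a buffer) and retaken afterwards.

```python
import threading
from typing import List, Tuple

PAGE_SIZE = 4  # records per page; kept tiny so page switches happen quickly

Record = Tuple[int, int, float]  # (sequence number, xid, begin timestamp)


class CommitOrderLog:
    """Hypothetical commit-order array; not PostgreSQL code."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # stand-in for a spinlock
        self._pages: List[List[Record]] = [[]]
        self._next_seq = 1

    def _allocate_page(self) -> List[Record]:
        # Stand-in for "grab another free buffer or evict one".
        # Potentially slow, so it must never run with the lock held.
        return []

    def page_count(self) -> int:
        with self._lock:
            return len(self._pages)

    def append(self, xid: int, begin_ts: float) -> int:
        spare = None
        while True:
            with self._lock:
                page = self._pages[-1]
                if len(page) >= PAGE_SIZE and spare is not None:
                    # Install the page that was prepared outside the lock.
                    self._pages.append(spare)
                    page = spare
                    spare = None
                if len(page) < PAGE_SIZE:
                    # Fast path: assign the sequence number and add the record.
                    seq = self._next_seq
                    self._next_seq += 1
                    page.append((seq, xid, begin_ts))
                    return seq
            # Current page is full and no spare is ready: do the slow
            # work with the lock released, then retry.
            spare = self._allocate_page()
```

If another backend installs a new page first, the retry simply appends to that page and the unused spare is dropped, which is wasteful but keeps the locked section short.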

>>> The function will return the "sequence" number which CommitTransaction()
>>> in turn will record in the WAL commit record together with the
>>> begin_timestamp. While both the begin and the commit timestamps
>>> are crucial to determine what data a particular transaction should have
>>> seen, the row count is not and will not be recorded in WAL.
>>
>> It would certainly be better if we didn't have to bloat the commit xlog
>> records to do this.  Is there any way to avoid that?
>
> If you can tell me how a crash recovering system can figure out what the
> exact "sequence" number of the WAL commit record at hand should be, let's
> rip it.

Hmm...  could we get away with WAL-logging the next sequence number
just once per checkpoint?  When you replay the checkpoint record, you
update the control file with the sequence number.  Then all the
commits up through the next checkpoint just use consecutive numbers
starting at that value.
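
A rough sketch of how replay might look under that scheme, in Python for brevity. The record types and the replay function are hypothetical, and it assumes that commit records appear in WAL in the same order in which sequence numbers were handed out, which the real system would have to guarantee. Commit records carry no sequence number; only the checkpoint does.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class Checkpoint:
    next_seq: int  # next commit sequence number, logged once per checkpoint


@dataclass(frozen=True)
class Commit:
    xid: int  # no sequence number is stored in the commit record


WalRecord = Union[Checkpoint, Commit]


def replay(wal: List[WalRecord]) -> Dict[int, int]:
    """Reassign commit sequence numbers during recovery (illustrative only)."""
    next_seq: Optional[int] = None
    assigned: Dict[int, int] = {}
    for rec in wal:
        if isinstance(rec, Checkpoint):
            # Replaying a checkpoint resets the counter to the logged value.
            next_seq = rec.next_seq
        elif next_seq is not None:
            # Commits up to the next checkpoint take consecutive numbers.
            assigned[rec.xid] = next_seq
            next_seq += 1
    return assigned


wal = [Checkpoint(100), Commit(7), Commit(9), Checkpoint(102), Commit(8)]
print(replay(wal))  # {7: 100, 9: 101, 8: 102}
```

Commits seen before any checkpoint record are skipped here, since recovery always starts from a checkpoint.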

> It is an option. "Keep it until I tell you" is a perfectly valid
> configuration option. One you probably don't want to forget about, but valid
> nonetheless.

As Tom is fond of saying, if it breaks, you get to keep both pieces.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

