Re: synchronous commit vs. hint bits - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: synchronous commit vs. hint bits
Date
Msg-id CAHyXU0xX8ZMd9Zh0p5Xk52-Q3t7L_rJY_Zw8V52tc72OBfagYA@mail.gmail.com
In response to Re: synchronous commit vs. hint bits  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: synchronous commit vs. hint bits
List pgsql-hackers
On Mon, Nov 7, 2011 at 9:25 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Mon, Nov 7, 2011 at 8:31 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I've long considered synchronous_commit=off to be one of our best
>> performance features.  Certainly, it's not applicable in every
>> situation, but there are many applications where losing a second or so
>> worth of transactions is an acceptable price to pay for not needing to
>> wait for the disk to spin around for every commit.  However, recent
>> experimentation has convinced me that it's got a serious downside:
>> SetHintBits() can't set HEAP_XMIN_COMMITTED or HEAP_XMAX_COMMITTED
>> hints until the commit record has been durably flushed to disk.  It
>> turns out that can cause a major performance regression on systems
>> with many CPU cores.  I fixed this for temporary and unlogged tables
>> in commit 53f1ca59b5875f1d3e95ee709ecaddcbdfdbd175, but the same issue
>> exists (without any clear fix) for permanent tables.
>
> What's the source of the regression? Is it coming from losing the hint
> bit and being forced out to clog?  How likely is it really going to
> happen in non synthetic real world cases?
>
> Thinking about the hint bit cache I was playing with a while back, I
> guess you could have put the hint bit in the cache but refrained from
> marking it in the page in the TransactionIdIsValid(xid)=false case --
> in the first implementation I had only put the bit in the cache when
> it was valid -- since TransactionIdIsValid(xid) is not necessarily
> cheap though, maybe it's worth reserving an extra bit for the
> transaction being valid in the cache if you went down that road.
>
> Another way to attack this problem is to re-check and set the hint bit
> if you can do it in the bgwriter -- there's a good chance you will
> catch it in oltp environments like pgbench although it's not clear if
> the cost to the general case would be too high.

Thinking about this more, the backend local cache approach is probably
going to be useless in terms of addressing this problem -- mostly due
to the fact that the cache is, well, local.  Even if backend A takes
the time to mark the bit in its own cache, backends B-Z haven't yet
and presumably by the time they do the page has been rolled out
anyways so you get no benefit.  The cache helps when a backend sees
the same transaction spread out over a number of tuples/pages --
that's simply not the case in OLTP.
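
To make the locality argument concrete, here is a toy model in C. It is
not PostgreSQL code: BackendCache, clog_is_committed(), simulate() and
the slot count are all made-up names and values for illustration only.
Each simulated backend gets its own private cache, so a lookup that one
backend has already paid for does nothing for the others.

```c
#include <stdbool.h>
#include <stdint.h>

#define CACHE_SLOTS 64          /* hypothetical cache size */

typedef uint32_t Xid;

/* Private, per-backend cache of xids known to be committed. */
typedef struct {
    Xid  xid[CACHE_SLOTS];
    bool valid[CACHE_SLOTS];
    long clog_lookups;          /* how often this backend went to "clog" */
} BackendCache;

/* Stand-in for the expensive clog lookup; in this toy model every
 * transaction is reported as committed. */
static bool clog_is_committed(BackendCache *b, Xid xid)
{
    (void) xid;
    b->clog_lookups++;
    return true;
}

/* Answer "is xid committed?" from the local cache when possible. */
static bool xid_committed(BackendCache *b, Xid xid)
{
    unsigned slot = xid % CACHE_SLOTS;

    if (b->valid[slot] && b->xid[slot] == xid)
        return true;            /* local cache hit, no clog access */

    bool committed = clog_is_committed(b, xid);
    if (committed) {
        b->xid[slot] = xid;
        b->valid[slot] = true;
    }
    return committed;
}

/* Every backend visits n_xids distinct transactions, tuples_per_xid
 * tuples each. Returns the total number of clog lookups performed. */
long simulate(int n_backends, int n_xids, int tuples_per_xid)
{
    long total = 0;

    for (int b = 0; b < n_backends; b++) {
        BackendCache cache = {0};   /* nothing is shared across backends */

        for (int x = 0; x < n_xids; x++)
            for (int t = 0; t < tuples_per_xid; t++)
                xid_committed(&cache, (Xid) (100 + x));
        total += cache.clog_lookups;
    }
    return total;
}
```

In this model simulate(1, 1, 1000) costs a single lookup (one backend
seeing the same transaction on many tuples), while simulate(26, 50, 1)
costs one lookup per tuple visit, which is the OLTP-like pattern where
the cache buys nothing.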

Doing the work in the bgwriter might do the trick, assuming the
bgwriter consistently loses the race against both transaction
resolution and the WAL, and the penalty of the extra clog lookup (when
it wins the race) doesn't sting too much... possibly do this in
conjunction with the clog striping Simon is thinking about.
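
A rough sketch of the bgwriter idea, again as a toy model in C rather
than actual PostgreSQL code. XactState, Tuple, SweepStats and
bgwriter_sweep() are hypothetical names, and the rule that a hint may
only be set once the commit record is at or before the flushed WAL
position is modeled with plain integers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* toy WAL position */

typedef struct {
    bool committed;             /* what a clog lookup would report */
    Lsn  commit_lsn;            /* WAL position of the commit record */
} XactState;

typedef struct {
    const XactState *xact;      /* transaction that wrote the tuple */
    bool hint_committed;        /* stand-in for HEAP_XMIN_COMMITTED */
} Tuple;

typedef struct {
    long clog_lookups;          /* lookups paid for during this sweep */
    long hints_set;             /* hints the sweep managed to set */
} SweepStats;

/* Re-check unhinted tuples on a page before it is written out. A hint
 * is set only when the transaction committed AND its commit record has
 * already been flushed; otherwise the lookup was wasted effort. */
SweepStats bgwriter_sweep(Tuple *tuples, size_t n, Lsn flushed_lsn)
{
    SweepStats s = {0, 0};

    for (size_t i = 0; i < n; i++) {
        if (tuples[i].hint_committed)
            continue;           /* already hinted, nothing to do */

        s.clog_lookups++;       /* the extra cost in the general case */
        if (tuples[i].xact->committed &&
            tuples[i].xact->commit_lsn <= flushed_lsn) {
            tuples[i].hint_committed = true;
            s.hints_set++;
        }
    }
    return s;
}
```

The gap between clog_lookups and hints_set is the penalty paid whenever
the sweep reaches a tuple before its transaction has resolved or before
its commit record has been flushed.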

merlin

