Re: Proposal for CSN based snapshots - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Proposal for CSN based snapshots
Date
Msg-id ccbbd4e3-2999-cf29-96d0-66935c4ca9aa@iki.fi
In response to Re: Proposal for CSN based snapshots  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Proposal for CSN based snapshots  (Andres Freund <andres@anarazel.de>)
Re: Proposal for CSN based snapshots  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 08/22/2016 07:49 PM, Robert Haas wrote:
> Nice to see you working on this again.
>
> On Mon, Aug 22, 2016 at 12:35 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> A sequential scan of a table like that with 10 million rows took about 700
>> ms on my laptop, when the hint bits are set, without this patch. With this
>> patch, if there's a snapshot holding back the xmin horizon, so that we need
>> to check the CSN log for every XID, it took about 30000 ms. So we have some
>> optimization work to do :-). I'm not overly worried about that right now, as
>> I think there's a lot of room for improvement in the SLRU code. But that's
>> the next thing I'm going to work.
>
> So the worst case for this patch is obviously bad right now and, as
> you say, that means that some optimization work is needed.
>
> But what about the best case?  If we create a scenario where there are
> no open read-write transactions at all and (somehow) lots and lots of
> ProcArrayLock contention, how much does this help?

I ran some quick pgbench tests on my laptop, but didn't see any
meaningful benefit. The best I saw was about a 5% speedup when
running "pgbench -S" with 900 idle connections sitting in the
background. On the positive side, I didn't see much slowdown either.
(Sorry, I didn't record the details of those tests, as I was testing
many different options and I didn't see a clear difference either way.)

It seems that Amit's PGPROC batch clearing patch was very effective. I
remember ProcArrayLock contention being very visible earlier, but I
can't hit that now. I suspect you'd still see contention on bigger
hardware, though; my laptop has only 4 cores. I'll have to find a real
server for the next round of testing.

> Because there's only a purpose to trying to minimize the losses if
> there are some gains to which we can look forward.

Aside from the potential performance gains, this slashes a lot of 
complicated code:
 70 files changed, 2429 insertions(+), 6066 deletions(-)

That removed code is quite mature at this point, and I'm sure we'll add 
some code back to this patch as it evolves, but still.

Also, I'm looking forward to a follow-up patch that tracks snapshots in
backends at a finer level, so that vacuum could remove tuples more
aggressively even if you have pg_dump running for days. CSN snapshots
aren't a strict requirement for that, but they make it simpler, because
you can represent a snapshot with a small fixed-size integer.
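
To illustrate why a fixed-size integer snapshot makes this simpler, here
is a minimal sketch. It is not code from the patch: the type and function
names (CommitSeqNo, CsnSnapshot, csn_visible, oldest_snapshot_csn), the
sentinel values, and the "strictly less than" convention are all
assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names and values, for illustration only. */
typedef uint64_t CommitSeqNo;

#define CSN_IN_PROGRESS   ((CommitSeqNo) 0)  /* assumed sentinel: not committed yet */
#define CSN_ABORTED       ((CommitSeqNo) 1)  /* assumed sentinel: rolled back */
#define CSN_FIRST_NORMAL  ((CommitSeqNo) 2)  /* first value usable as a real CSN */

/* The whole snapshot is one integer: the CSN at which it was taken. */
typedef struct
{
    CommitSeqNo snapshot_csn;
} CsnSnapshot;

/*
 * A transaction's effects are visible if it committed before the
 * snapshot was taken. Assumed convention: strictly smaller CSN.
 */
static bool
csn_visible(CommitSeqNo xact_csn, const CsnSnapshot *snap)
{
    if (xact_csn < CSN_FIRST_NORMAL)
        return false;           /* in progress or aborted */
    return xact_csn < snap->snapshot_csn;
}

/*
 * Finding the oldest snapshot still in use is a minimum over integers.
 * Returns CSN_IN_PROGRESS (0) if no snapshots are registered.
 */
static CommitSeqNo
oldest_snapshot_csn(const CsnSnapshot *snaps, size_t nsnaps)
{
    CommitSeqNo oldest = CSN_IN_PROGRESS;

    for (size_t i = 0; i < nsnaps; i++)
    {
        if (oldest == CSN_IN_PROGRESS || snaps[i].snapshot_csn < oldest)
            oldest = snaps[i].snapshot_csn;
    }
    return oldest;
}
```

With snapshots in this form, tracking each backend's snapshots
individually means keeping a short list of integers rather than a list
of XID arrays, which is the kind of simplification meant above.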

Yes, seeing some direct performance gains would be nice too.

- Heikki


