Re: patch: improve SLRU replacement algorithm - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: patch: improve SLRU replacement algorithm
Date	April 4, 2012 21:24:09
Msg-id	CA+TgmoZroEu9w9PSHZP6+Smz=or_X5jkZDNwGOqY1cLeGBDuTg@mail.gmail.com
In response to	Re: patch: improve SLRU replacement algorithm (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: patch: improve SLRU replacement algorithm
List	pgsql-hackers

On Wed, Apr 4, 2012 at 7:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Stark <stark@mit.edu> writes:
>> On Wed, Apr 4, 2012 at 9:34 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> Why is this pgbench run accessing so much unhinted data that is > 1
>>> million transactions old? Do you believe those numbers? Looks weird.
>
>> I think this is in the nature of the workload pgbench does. Because
>> the updates are uniformly distributed, not concentrated 90% in 10% of
>> the buffers like most real-world systems, (and I believe pgbench only
>> does index lookups) the second time a tuple is looked at is going to
>> average N/2 transactions later where N is the number of tuples.
>
> That's a good point, and it makes me wonder whether pgbench is the right
> test case to be micro-optimizing around.  It would be a good idea to at
> least compare the numbers for something with more locality of reference.

I agree that there are other benchmarks that are worth optimizing for,
but this particular change is more in the nature of a bug fix.  The
current code waits for an I/O on buffer A when there's no real
need, and afterwards we proceed to NOT select buffer A anyway
(or at least, we select it with no higher probability than any
other buffer).
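
To make the point about the needless wait concrete, here is a toy model of
victim selection in a small buffer pool. It is only a sketch and not the
PostgreSQL code: the Slot class, its fields, and both function names are
hypothetical, and what happens after a wait is not modeled. The first
function always picks the least recently used slot, even if that slot has
an I/O in progress; the second prefers a slot with no I/O in progress and
falls back to a busy one only when every slot is busy.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Slot:
    """Hypothetical buffer slot; not a PostgreSQL structure."""
    page: int
    last_used: int              # smaller value = less recently used
    io_in_progress: bool = False


def pick_victim_strict_lru(slots: List[Slot]) -> Tuple[Slot, bool]:
    """Pick the least recently used slot regardless of its I/O state.

    Returns (victim, must_wait). must_wait is True when the caller would
    have to wait for the victim's I/O to finish before reusing it.
    """
    victim = min(slots, key=lambda s: s.last_used)
    return victim, victim.io_in_progress


def pick_victim_prefer_idle(slots: List[Slot]) -> Tuple[Slot, bool]:
    """Prefer the least recently used slot that has no I/O in progress.

    Falls back to the least recently used busy slot only if all are busy.
    """
    idle = [s for s in slots if not s.io_in_progress]
    candidates = idle if idle else slots
    victim = min(candidates, key=lambda s: s.last_used)
    return victim, victim.io_in_progress


if __name__ == "__main__":
    pool = [Slot(page=10, last_used=1, io_in_progress=True),
            Slot(page=11, last_used=2),
            Slot(page=12, last_used=3)]
    print(pick_victim_strict_lru(pool))
    print(pick_victim_prefer_idle(pool))
```

In this model the strict variant reports a wait on the busy slot even though
an idle slot is available, which is the situation described above.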

I don't think we're micro-optimizing, either.  I don't consider
avoiding a 10-second cessation of all database activity to be a
micro-optimization even on a somewhat artificial benchmark.

One other thing to think about is that pgbench at scale factor 300 is
not exactly a large working set.  You could easily imagine a
real-world data set that is more the size of scale factor 3000, and
10% of it is hot, and you'd have pretty much the same problem.  The
indexes would be a little deeper and so on, but I see no reason why
you wouldn't be able to reproduce this effect with the right test
set-up.  I am sure there will come a point when we've learned as much
as we can from pgbench and must graduate to more complex benchmarks to
have any hope of finding problems worth fixing, but we are surely
still quite a long ways off from that happy day.
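
The scale-factor comparison above is simple arithmetic, sketched here. It
assumes pgbench's standard layout of 100,000 pgbench_accounts rows per unit
of scale factor; the function names are made up for illustration, and the
10% hot fraction is just the figure from the paragraph above.

```python
# Assumption: pgbench creates 100,000 pgbench_accounts rows per scale unit.
ROWS_PER_SCALE_UNIT = 100_000


def total_rows(scale_factor: int) -> int:
    """Rows in pgbench_accounts at the given scale factor."""
    return scale_factor * ROWS_PER_SCALE_UNIT


def hot_rows(scale_factor: int, hot_percent: int) -> int:
    """Rows in the hot subset, using integer arithmetic to avoid rounding."""
    return total_rows(scale_factor) * hot_percent // 100


if __name__ == "__main__":
    # Uniform access over everything at scale factor 300 ...
    print(total_rows(300))
    # ... versus a 10% hot subset at scale factor 3000.
    print(hot_rows(3000, 10))
```

Under that assumption both cases give 30 million actively touched rows,
which is why the two workloads would be expected to behave similarly here.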

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
