Re: Very high effective_cache_size == worse performance? - Mailing list pgsql-performance

From Robert Haas
Subject Re: Very high effective_cache_size == worse performance?
Date
Msg-id h2w603c8f071004201044tbc8ce8fenb846b7643a2b9376@mail.gmail.com
Whole thread Raw
In response to Very high effective_cache_size == worse performance?  (David Kerr <dmk@mr-paradox.net>)
Responses Re: Very high effective_cache_size == worse performance?  (David Kerr <dmk@mr-paradox.net>)
List pgsql-performance
On Tue, Apr 20, 2010 at 1:39 PM, David Kerr <dmk@mr-paradox.net> wrote:
> Howdy all,
>
> I've got a huge server running just postgres. It's got 48 cores and 256GB of ram. Redhat 5.4, Postgres 8.3.9.
> 64bit OS. No users currently.
>
> I've got a J2EE app that loads data into the DB, it's got logic behind it so it's not a simple bulk load, so
> i don't think we can use copy.
>
> Based on the tuning guides, it set my effective_cache_size to 128GB (1/2 the available memory) on the box.
>
> When I ran my load, it took aproximately 15 hours to do load 20 million records. I thought this was odd because
> on a much smaller machine I was able to do that same amount of records in 6 hours.
>
> My initial thought was hardware issues so we got sar, vmstat, etc all running on the box and they didn't give
> any indication that we had resource issues.
>
> So I decided to just make the 2 PG config files look the same. (the only change was dropping effective_cache_size
> from 128GB to 2GB).
>
> Now the large box performs the same as the smaller box. (which is fine).
>
> incidentally, both tests were starting from a blank database.
>
> Is this expected?

Lowering effective_cache_size tends to discourage the planner from
using a nested-loop-with-inner-indexscan plan - that's it.

What may be happening is that you may be loading data into some tables
and then running a query against those tables before the autovacuum
daemon has a chance to analyze them.  I suspect that if you enable
some logging you'll find that one of those queries is really, really
slow, and that (by happy coincidence) discouraging it from using the
index it thinks it should use happens to produce a better plan.  What
you should probably do is, for each table that you bulk load and then
query, insert a manual ANALYZE between the two.

...Robert

pgsql-performance by date:

Previous
From: "Joshua D. Drake"
Date:
Subject: Re: Very high effective_cache_size == worse performance?
Next
From: David Kerr
Date:
Subject: Re: Very high effective_cache_size == worse performance?