Re: Re: Increasing work_mem and shared_buffers on Postgres 9.2 significantly slows down queries - Mailing list pgsql-performance

From Gunnar \"Nick\" Bluth
Subject Re: Re: Increasing work_mem and shared_buffers on Postgres 9.2 significantly slows down queries
Date
Msg-id 50996746.5010508@pro-open.de
Whole thread Raw
In response to Increasing work_mem and shared_buffers on Postgres 9.2 significantly slows down queries  (Petr Praus <petr@praus.net>)
Responses Re: Re: Increasing work_mem and shared_buffers on Postgres 9.2 significantly slows down queries
List pgsql-performance
Am 06.11.2012 18:38, schrieb Petr Praus:

Yes, but note that this happens only in Linux. Increasing work_mem on my iMac increases performance (but the queries are slower under OSX than on virtualized Ubuntu on the same machine). Over the weekend, I tried the same test on my Ubuntu home machine with Ivy Bridge i5 3570K and it also slows down (from ~900ms with work_mem=1MB to ~1200ms with work_mem=96MB).

OS X is rather different from a memory access point of view, IIRC. So the direct comparison actually only shows how well the Linux FS cache works (for the temp files created with small work_mem ;-).

The i5 puzzles me a bit though...


I'm pretty sure you're hitting some subtle, memory-access-related cornercase here.

The L2 cache of your X7350 CPUs is 2MB, could you run the tests with, say, 1, 2, 4 and 8MB of work_mem and post the results?
I made a pgbench test with the same query and run it 25 times (5 clients, 5 transactions each):
work_mem   speed
1MB        1794ms
2MB        1877ms
4MB        2084ms
8MB        2141ms
10MB       2124ms
12MB       3018ms
16MB       3004ms
32MB       2999ms
64MB       3015ms

It seems that there is some sort of "plateau".
Two, afaics. The 1->2 change hints towards occasionally breaching your L2 cache, so it can probably be ignored. The actual plateaus thus seem to be 0-2, 2-12, >= 12.
It'd be interesting to see the EXPLAIN ANALYSE outputs for these levels, the buckets and batches in particular. I'd reckon we'll see significant changes at 2->4 and 10->12MB work_mem.

So, to sum this up (and make someone more competent bite on it maybe ;-), on your SMP, FSB, "fake-multicore" system all "hash"-related works that potentially switch to different implementations internally (but w/out telling us so) when given more work_mem are slower.
See other post... it actually does tell us (# of buckets/batches). However, the result is not good and could potentially be improved be twealing the statistic_targets of the joined tables/columns.

I wonder why noone actually understanding the implementation chipped in yet... Andres, Greg, Tom, whoever actually understands what's happening here, anyone reading this? ;-)

Cheers,
-- 
Gunnar "Nick" Bluth
RHCE/SCLA

Mobil   +49 172 8853339
Email: gunnar.bluth@pro-open.de
__________________________________________________________________________
In 1984 mainstream users were choosing VMS over UNIX.  Ten years later
they are choosing Windows over UNIX.  What part of that message aren't you
getting? - Tom Payne

pgsql-performance by date:

Previous
From: Rodrigo Rosenfeld Rosas
Date:
Subject: Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2
Next
From: Tom Lane
Date:
Subject: Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2