Re: postgresql scalability issue - Mailing list pgsql-general

From Ivan Voras
Subject Re: postgresql scalability issue
Date
Msg-id ibbc4d$7go$1@dough.gmane.org
Whole thread Raw
In response to postgresql scalability issue  (umut orhan <umut_angelfire@yahoo.com>)
List pgsql-general
On 11/08/10 16:33, umut orhan wrote:
> Hi all,
>
>
> I've collected some interesting results during my experiments which I couldn't
> figure out the reason behind them and need your assistance.
>
> I'm running PostgreSQL 9.0 on a quad-core machine having two level on-chip cache
> hierarchy.

Based on this information, you are most likely running on Intel Xeon
5000-5400 series CPU, right? It probably doesn't matter much since you
apparently have only a single socket populated but is a bit old
architecture known for its FSB bottleneck to the memory.

You should write some details about your hardware: at least CPU
model/speed and memory speed, and software (which OS? 32-bit or 64-bit?)

> PostgreSQL has a large and warmed-up  buffer
> cache thus, no disk I/O is observed during experiments (i.e. for each query
> buffer cache hit rate is 100%). I'm pinning each query/process to an individual
> core. Queries are simple read-only queries (only selects). Nested loop (without
> materialize) is used for the join operator.

> When I pin a single query to an individual core, its execution time is observed
> as 111 seconds. This result is my base case. Then, I fire two instances of the
> same query concurrently and pin them to two different cores separately. However,
> each execution time becomes 132 seconds in this case. In a similar trend,
> execution times are increasing for three instances (164 seconds)  and four
> instances (201 seconds) cases too. What I was expecting is a linear improvement
> in throughput (at least). I tried several different queries and got the same
> trend at each time.

Are you measuring wall-clock execution time for queries in parallel?
I.e. start measuring when the first query is started (asynchronously?)
and stop when the last one is finished?

Did you try the same measurement without pinning?

> I wonder why execution times of individual queries are increasing when I
> increase the number of their instances.

> Btw, I don't think on-chip cache hit/miss rates make a  difference since L2
> cache misses are decreasing as expected. I'm not an expert in PostgreSQL
> internals. Maybe there is a lock-contention (spinlocks?) occurring even if the
> queries are read-only. Anyways, all ideas are welcome.

As others said, memory bandwidth is the most likely suspect here. CPUs
are unfortunately so much faster than memory and memory buses that they
frequently have to wait. Unless PostgreSQL uses the exclusive lock model
instead of shared-exclusive, there shouldn't be much contention for the
shared buffers.



pgsql-general by date:

Previous
From: Matthieu Huin
Date:
Subject: Re: temporary table as a subset of an existing table and indexes
Next
From: Merlin Moncure
Date:
Subject: Re: temporary table as a subset of an existing table and indexes