Thread: Optimizing PostgreSQL for Windows
Hi list,

I have the following query:

select t.a1, t.a2 from table1 t inner join table2 s using(id)
where t.pid='xyz' and s.chromosome=9 order by s.pos;

With the following output from EXPLAIN ANALYZE:

"Sort  (cost=35075.03..35077.51 rows=991 width=14) (actual time=33313.718..33321.935 rows=22599 loops=1)"
"  Sort Key: s.pos"
"  ->  Hash Join  (cost=7851.48..35025.71 rows=991 width=14) (actual time=256.513..33249.701 rows=22599 loops=1)"
"        Hash Cond: ((t.id)::text = (s.id)::text)"
"        ->  Bitmap Heap Scan on table1 t  (cost=388.25..27357.57 rows=22286 width=23) (actual time=112.595..32989.663 rows=22864 loops=1)"
"              Recheck Cond: ((pid)::text = 'xyz'::text)"
"              ->  Bitmap Index Scan on idx_table1  (cost=0.00..382.67 rows=22286 width=0) (actual time=103.790..103.790 rows=22864 loops=1)"
"                    Index Cond: ((pid)::text = 'xyz'::text)"
"        ->  Hash  (cost=7180.62..7180.62 rows=22609 width=17) (actual time=143.867..143.867 rows=22864 loops=1)"
"              ->  Bitmap Heap Scan on table2 s  (cost=333.00..7180.62 rows=22609 width=17) (actual time=108.715..126.637 rows=22864 loops=1)"
"                    Recheck Cond: ((chromosome)::text = '9'::text)"
"                    ->  Bitmap Index Scan on idx_table2  (cost=0.00..327.35 rows=22609 width=0) (actual time=108.608..108.608 rows=22864 loops=1)"
"                          Index Cond: ((chromosome)::text = '9'::text)"

My OS is Windows 2003 with 4GB RAM and a 3.2 GHz Xeon Duo; shared_buffers is set to 32MB (as I read it should be fairly low on Windows) and work_mem is set to 2500MB, but nevertheless the query takes about 38 seconds to finish. The table "table1" contains approx. 3 million tuples and table2 approx. 500,000 tuples. If anyone could give advice on how to optimize the settings in postgresql.conf, or anything else to make this query run faster, I would really appreciate it.

Christian Rengstl M.A.
Klinik und Poliklinik für Innere Medizin II
Kardiologie - Forschung
Universitätsklinikum Regensburg
B3 1.388
Franz-Josef-Strauss-Allee 11
93053 Regensburg
Tel.: +49-941-944-7230
I'm not an expert on this stuff, but 32 MB of shared_buffers seems quite low to me, even for a Windows machine. I'm running Postgres 8.2 on my workstation with 2GB of RAM and an AMD x64 3500+, with shared_buffers set to 256MB, and it's running fine even on large datasets and with other applications running. In my experience, shared_buffers is more important than work_mem.

Have you tried increasing default_statistics_target (e.g. to 200 or more) and then running ANALYZE on your tables or the entire database?

Marc

Christian Rengstl wrote:
> My OS is Windows 2003 with 4GB Ram and Xeon Duo with 3.2 GHz;
> shared_buffers is set to 32MB (as I read it should be fairly low on
> Windows) and work_mem is set to 2500MB, but nevertheless the query takes
> about 38 seconds to finish. The table "table1" contains approx. 3
> million tuples and table2 approx. 500.000 tuples.
> [query and EXPLAIN ANALYZE output snipped]

--
Marc Schablewski
click:ware Informationstechnik GmbH
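To illustrate Marc's suggestion, a minimal sketch (untested; the table and column names are taken from the original query, and the per-column form is an alternative to changing the global default):

    -- raise the statistics target for this session, then refresh the stats
    SET default_statistics_target = 200;
    ANALYZE table1;
    ANALYZE table2;

    -- or raise it permanently for the filtered column only
    ALTER TABLE table1 ALTER COLUMN pid SET STATISTICS 200;
    ANALYZE table1;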
> From: Christian Rengstl
> Subject: [PERFORM] Optimizing PostgreSQL for Windows
>
> Hi list,
>
> I have the following query:
> select t.a1, t.a2 from table1 t inner join table2 s
> using(id) where t.pid='xyz' and s.chromosome=9 order by s.pos;
>
> "        ->  Bitmap Heap Scan on table1 t  (cost=388.25..27357.57
> rows=22286 width=23) (actual time=112.595..32989.663 rows=22864 loops=1)"
> "              Recheck Cond: ((pid)::text = 'xyz'::text)"
> "              ->  Bitmap Index Scan on idx_table1  (cost=0.00..382.67
> rows=22286 width=0) (actual time=103.790..103.790 rows=22864 loops=1)"
> "                    Index Cond: ((pid)::text = 'xyz'::text)"

The bitmap heap scan on table1 seems very slow.

What version of Postgres are you using? There were performance enhancements in 8.1 and 8.2.

What kind of hard drive are you using? I would guess a single SATA drive would give you better performance than that, but I don't know for sure.

Do you regularly vacuum the table? Not enough vacuuming can lead to tables filled with dead rows, which can increase the amount of data needing to be scanned considerably.

Dave
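Dave's checks can be run directly in psql; a minimal sketch (table1 is the table from the original query):

    -- report the server version
    SELECT version();

    -- reclaim dead rows and refresh planner statistics in one pass;
    -- VERBOSE reports how many dead row versions were found and removed
    VACUUM VERBOSE ANALYZE table1;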
Christian Rengstl wrote:
> My OS is Windows 2003 with 4GB Ram and Xeon Duo with 3.2 GHz;
> shared_buffers is set to 32MB (as I read it should be fairly low on
> Windows) and work_mem is set to 2500MB, but nevertheless the query takes
> about 38 seconds to finish. The table "table1" contains approx. 3
> million tuples and table2 approx. 500.000 tuples. If anyone could give
> an advice on either how to optimize the settings in postgresql.conf or
> anything else to make this query run faster, I really would appreciate.

32MB for shared_buffers seems really low to me, but 2500MB for work_mem seems awfully high. Keep in mind that work_mem is allocated per sort or hash operation, not per server, so a complex query or a few concurrent sessions can each use that much memory. The highest I've seen for work_mem was something like 128MB. I think the first thing you have to do is really lower work_mem. Something like 64MB seems a better bet at first.

Regards.

--
Guillaume.
http://www.postgresqlfr.org
http://dalibo.com
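A minimal way to test Guillaume's suggestion without editing postgresql.conf (the SET only affects the current session; the unit suffix needs 8.2 or later, older versions take the value in kilobytes):

    -- lower work_mem for this session only, then compare the plan timings
    SET work_mem = '64MB';
    EXPLAIN ANALYZE
    SELECT t.a1, t.a2
    FROM table1 t
    INNER JOIN table2 s USING (id)
    WHERE t.pid = 'xyz' AND s.chromosome = 9
    ORDER BY s.pos;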
Now the execution time for my query is down to ~10-13 seconds, which is already a big step ahead. Thanks!

Are there any other settings that might be worth tweaking on Windows to reduce the execution time a little more?

One thing I don't understand very well, though: when I execute the query on table1 with some conditions for the first time, it is still slow, but when I execute it repeatedly with changing conditions it gets faster. Even when I query table1, then query table3 (which has the same table definition), and then query table1 again, the query on table1 gets faster again.

Christian Rengstl M.A.
Klinik und Poliklinik für Innere Medizin II
Kardiologie - Forschung
Universitätsklinikum Regensburg
B3 1.388
Franz-Josef-Strauss-Allee 11
93053 Regensburg
Tel.: +49-941-944-7230

>>> On Tue, Oct 30, 2007 at 8:21 PM, in message
<47278421.6010906@lelarge.info>, Guillaume Lelarge <guillaume@lelarge.info>
wrote:
> 32MB for shared_buffers seems really low to me but 2500MB for work_mem
> seems awfully high. The highest I've seen for work_mem was something
> like 128MB. I think the first thing you have to do is to really lower
> work_mem. Something like 64MB seems a better bet at first.
> [original problem description snipped]