Thread: Optimizing PostgreSQL for Windows
Hi list,

I have the following query:

select t.a1, t.a2 from table1 t inner join table2 s using(id)
where t.pid='xyz' and s.chromosome=9 order by s.pos;

With the following output from EXPLAIN ANALYZE:

"Sort  (cost=35075.03..35077.51 rows=991 width=14) (actual time=33313.718..33321.935 rows=22599 loops=1)"
"  Sort Key: s.pos"
"  ->  Hash Join  (cost=7851.48..35025.71 rows=991 width=14) (actual time=256.513..33249.701 rows=22599 loops=1)"
"        Hash Cond: ((t.id)::text = (s.id)::text)"
"        ->  Bitmap Heap Scan on table1 t  (cost=388.25..27357.57 rows=22286 width=23) (actual time=112.595..32989.663 rows=22864 loops=1)"
"              Recheck Cond: ((pid)::text = 'xyz'::text)"
"              ->  Bitmap Index Scan on idx_table1  (cost=0.00..382.67 rows=22286 width=0) (actual time=103.790..103.790 rows=22864 loops=1)"
"                    Index Cond: ((pid)::text = 'xyz'::text)"
"        ->  Hash  (cost=7180.62..7180.62 rows=22609 width=17) (actual time=143.867..143.867 rows=22864 loops=1)"
"              ->  Bitmap Heap Scan on table2 s  (cost=333.00..7180.62 rows=22609 width=17) (actual time=108.715..126.637 rows=22864 loops=1)"
"                    Recheck Cond: ((chromosome)::text = '9'::text)"
"                    ->  Bitmap Index Scan on idx_table2  (cost=0.00..327.35 rows=22609 width=0) (actual time=108.608..108.608 rows=22864 loops=1)"
"                          Index Cond: ((chromosome)::text = '9'::text)"

My OS is Windows 2003 with 4GB RAM and a 3.2 GHz Xeon Duo; shared_buffers is set to 32MB (as I read it should be fairly low on Windows) and work_mem is set to 2500MB, but nevertheless the query takes about 38 seconds to finish. The table "table1" contains approx. 3 million tuples and table2 approx. 500,000 tuples. If anyone could give advice on how to optimize the settings in postgresql.conf, or anything else to make this query run faster, I would really appreciate it.

Christian Rengstl M.A.
Klinik und Poliklinik für Innere Medizin II
Kardiologie - Forschung
Universitätsklinikum Regensburg
B3 1.388
Franz-Josef-Strauss-Allee 11
93053 Regensburg
Tel.: +49-941-944-7230
I'm not an expert on this stuff, but 32 MB of shared_buffers seems quite low to me, even for a Windows machine. I'm running Postgres 8.2 on my workstation with 2GB of RAM and an AMD x64 3500+, with shared_buffers set to 256MB, and it's running fine even on large datasets and with other applications running. In my experience, shared_buffers is more important than work_mem.

Have you tried increasing default_statistics_target (e.g. to 200 or more) and then running ANALYZE on your tables or the entire database?

Marc

Christian Rengstl wrote:
> My OS is Windows 2003 with 4GB Ram and Xeon Duo with 3.2 GHz;
> shared_buffers is set to 32MB (as I read it should be fairly low on
> Windows) and work_mem is set to 2500MB, but nevertheless the query takes
> about 38 seconds to finish. The table "table1" contains approx. 3
> million tuples and table2 approx. 500.000 tuples.
> [query and EXPLAIN ANALYZE output snipped]

--
Marc Schablewski
click:ware Informationstechnik GmbH
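To illustrate Marc's suggestion, a minimal sketch (untested; the table and column names are taken from the original query, and the per-column form is an alternative to changing the global default):

    -- raise the statistics target for this session, then refresh the stats
    SET default_statistics_target = 200;
    ANALYZE table1;
    ANALYZE table2;

    -- or raise it permanently for the filtered column only
    ALTER TABLE table1 ALTER COLUMN pid SET STATISTICS 200;
    ANALYZE table1;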
> From: Christian Rengstl
> Subject: [PERFORM] Optimizing PostgreSQL for Windows
>
> Hi list,
>
> I have the following query:
> select t.a1, t.a2 from table1 t inner join table2 s
> using(id) where t.pid='xyz' and s.chromosome=9 order by s.pos;
>
> "        ->  Bitmap Heap Scan on table1 t  (cost=388.25..27357.57
> rows=22286 width=23) (actual time=112.595..32989.663 rows=22864 loops=1)"
> "              Recheck Cond: ((pid)::text = 'xyz'::text)"
> "              ->  Bitmap Index Scan on idx_table1  (cost=0.00..382.67
> rows=22286 width=0) (actual time=103.790..103.790 rows=22864 loops=1)"
> "                    Index Cond: ((pid)::text = 'xyz'::text)"

The bitmap heap scan on table1 seems very slow.

What version of Postgres are you using? There were performance enhancements in 8.1 and 8.2.

What kind of hard drive are you using? I would guess a single SATA drive would give you better performance than that, but I don't know for sure.

Do you regularly vacuum the table? Not enough vacuuming can lead to tables filled with dead rows, which can increase the amount of data needing to be scanned considerably.

Dave
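Dave's checks can be run directly in psql; a minimal sketch (table1 is the table from the original query):

    -- report the server version
    SELECT version();

    -- reclaim dead rows and refresh planner statistics in one pass;
    -- VERBOSE reports how many dead row versions were found and removed
    VACUUM VERBOSE ANALYZE table1;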
Christian Rengstl wrote:
> My OS is Windows 2003 with 4GB Ram and Xeon Duo with 3.2 GHz;
> shared_buffers is set to 32MB (as I read it should be fairly low on
> Windows) and work_mem is set to 2500MB, but nevertheless the query takes
> about 38 seconds to finish. The table "table1" contains approx. 3
> million tuples and table2 approx. 500.000 tuples. If anyone could give
> an advice on either how to optimize the settings in postgresql.conf or
> anything else to make this query run faster, I really would appreciate.

32MB for shared_buffers seems really low to me, but 2500MB for work_mem seems awfully high. Keep in mind that work_mem is allocated per sort or hash operation, not per server, so a complex query or a few concurrent sessions can each use that much memory. The highest I've seen for work_mem was something like 128MB. I think the first thing you have to do is really lower work_mem. Something like 64MB seems a better bet at first.

Regards.

--
Guillaume.
http://www.postgresqlfr.org
http://dalibo.com
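A minimal way to test Guillaume's suggestion without editing postgresql.conf (the SET only affects the current session; the unit suffix needs 8.2 or later, older versions take the value in kilobytes):

    -- lower work_mem for this session only, then compare the plan timings
    SET work_mem = '64MB';
    EXPLAIN ANALYZE
    SELECT t.a1, t.a2
    FROM table1 t
    INNER JOIN table2 s USING (id)
    WHERE t.pid = 'xyz' AND s.chromosome = 9
    ORDER BY s.pos;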
Now the execution time for my query is down to ~10-13 seconds, which is already a big step ahead. Thanks!

Are there any other settings that might be worth tweaking on Windows to reduce the execution time a little more?

One thing I don't understand very well, though: when I execute the query on table1 with some conditions for the first time, it is still slow, but when I execute it repeatedly with changing conditions it gets faster. Even when I query table1, then query table3 (which has the same table definition), and then query table1 again, the query on table1 gets faster again.

Christian Rengstl M.A.
Klinik und Poliklinik für Innere Medizin II
Kardiologie - Forschung
Universitätsklinikum Regensburg
B3 1.388
Franz-Josef-Strauss-Allee 11
93053 Regensburg
Tel.: +49-941-944-7230

>>> On Tue, Oct 30, 2007 at 8:21 PM, in message
<47278421.6010906@lelarge.info>, Guillaume Lelarge <guillaume@lelarge.info>
wrote:
> 32MB for shared_buffers seems really low to me but 2500MB for work_mem
> seems awfully high. The highest I've seen for work_mem was something
> like 128MB. I think the first thing you have to do is to really lower
> work_mem. Something like 64MB seems a better bet at first.
> [original problem description snipped]