Thread: simple join uses indexes, very slow
[PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.0.1] I have a simple join on two tables that takes way too long. Can you help me understand what's wrong? There are indexes defined on the relevant columns. I just did a fresh vacuum --full --analyze on the two tables. Is there something I'm not seeing? [CPU is 950Mhz AMD, 256MB RAM, 15k rpm scsi disk] -- George Young Table sizes: parameters has 2.1512e+07 tuples, run_opsets has 211745 tuples. explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM run_opsets ro, parameters p WHERE ro.run = p.run AND ro.opset_num= p.opset_num and ro.run='team9'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=2.16..7957.40 rows=6707 width=22) (actual time=14.986..70197.129 rows=43050 loops=1) -> Index Scan using run_opsets_pkey on run_opsets ro (cost=0.00..128.75 rows=71 width=18) (actual time=0.386..62.959rows=263 loops=1) Index Cond: (run = 'team9'::text) -> Bitmap Heap Scan on parameters p (cost=2.16..109.93 rows=27 width=22) (actual time=1.591..266.211 rows=164 loops=263) Recheck Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num)) -> Bitmap Index Scan on parameters_idx (cost=0.00..2.16 rows=27 width=0) (actual time=1.153..1.153 rows=164 loops=263) Index Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num)) Total runtime: 70237.727 ms (8 rows) Table "public.run_opsets" Column | Type | Modifiers --------------+-----------------------------+------------------------- run | text | not null opset | text | opset_ver | integer | opset_num | integer | not null status | opset_status | date_started | timestamp without time zone | date_done | timestamp without time zone | work_started | timestamp without time zone | lock_user | text | default 'NO-USER'::text lock_pid | integer | needs_review | text | Indexes: "run_opsets_pkey" PRIMARY KEY, btree (run, opset_num) CLUSTER -- Table "public.parameters" Column | Type | Modifiers -----------+---------+------------------------------- run | text | not null opset_num | integer | not null opset | text | not null opset_ver | integer | not null step_num | integer | not null step | text | not null step_ver | integer | not null name | text | not null value | text | split | boolean | not null default false wafers | text[] | not null default '{}'::text[] Indexes: "parameters_idx" btree (run, opset_num, step_num, opset, opset_ver, step, step_ver, name, split, wafers) "parameters_opset_idx" btree (opset, step, name) "parameters_step_idx" btree (step, name) "Are the gods not just?" "Oh no, child. What would become of us if they were?" (CSL)
On Mon, 2006-03-27 at 13:47 -0500, george young wrote: > Table sizes: parameters has 2.1512e+07 tuples, run_opsets has 211745 tuples. > > explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM run_opsets ro, parameters p WHERE ro.run = p.run AND ro.opset_num= p.opset_num and ro.run='team9'; > QUERY PLAN > -------------------------------------------------------------------------------------------------------------------------------------------- > Nested Loop (cost=2.16..7957.40 rows=6707 width=22) (actual time=14.986..70197.129 rows=43050 loops=1) > -> Index Scan using run_opsets_pkey on run_opsets ro (cost=0.00..128.75 rows=71 width=18) (actual time=0.386..62.959rows=263 loops=1) > Index Cond: (run = 'team9'::text) > -> Bitmap Heap Scan on parameters p (cost=2.16..109.93 rows=27 width=22) (actual time=1.591..266.211 rows=164 loops=263) > Recheck Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num)) > -> Bitmap Index Scan on parameters_idx (cost=0.00..2.16 rows=27 width=0) (actual time=1.153..1.153 rows=164loops=263) > Index Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num)) > Total runtime: 70237.727 ms > (8 rows) The planner appears to be underestimating the number of rows retrieved in both cases, then multiplying them together to make it worse. Multi-column indexes provide less accurate estimates (right now). Looks like a hash join might be faster. What is your work_mem set to? Can you SET enable_nestloop=off and rerun the EXPLAIN ANALYZE? Best Regards, Simon Riggs
On Tue, 28 Mar 2006 09:30:54 +0100 Simon Riggs <simon@2ndquadrant.com> threw this fish to the penguins: > On Mon, 2006-03-27 at 13:47 -0500, george young wrote: > > > Table sizes: parameters has 2.1512e+07 tuples, run_opsets has 211745 tuples. > > > > explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM run_opsets ro, parameters p WHERE ro.run = p.run AND ro.opset_num= p.opset_num and ro.run='team9'; > > QUERY PLAN > > -------------------------------------------------------------------------------------------------------------------------------------------- > > Nested Loop (cost=2.16..7957.40 rows=6707 width=22) (actual time=14.986..70197.129 rows=43050 loops=1) > > -> Index Scan using run_opsets_pkey on run_opsets ro (cost=0.00..128.75 rows=71 width=18) (actual time=0.386..62.959rows=263 loops=1) > > Index Cond: (run = 'team9'::text) > > -> Bitmap Heap Scan on parameters p (cost=2.16..109.93 rows=27 width=22) (actual time=1.591..266.211 rows=164 loops=263) > > Recheck Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num)) > > -> Bitmap Index Scan on parameters_idx (cost=0.00..2.16 rows=27 width=0) (actual time=1.153..1.153 rows=164loops=263) > > Index Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num)) > > Total runtime: 70237.727 ms > > (8 rows) > > The planner appears to be underestimating the number of rows retrieved > in both cases, then multiplying them together to make it worse. > Multi-column indexes provide less accurate estimates (right now). > > Looks like a hash join might be faster. What is your work_mem set to? work_mem= 1024 > Can you SET enable_nestloop=off and rerun the EXPLAIN ANALYZE? newschm3=> set enable_nestloop=off ; SET newschm3=> explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM run_opsets ro, parameters p WHERE ro.run = p.runAND ro.opset_num = p.opset_num and ro.run='team9'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=34177.87..34291.36 rows=6707 width=22) (actual time=68421.681..68547.686 rows=43050 loops=1) Merge Cond: ("outer".opset_num = "inner".opset_num) -> Sort (cost=130.93..131.11 rows=71 width=18) (actual time=107.744..107.901 rows=263 loops=1) Sort Key: ro.opset_num -> Index Scan using run_opsets_pkey on run_opsets ro (cost=0.00..128.75 rows=71 width=18) (actual time=57.641..106.096rows=263 loops=1) Index Cond: (run = 'team9'::text) -> Sort (cost=34046.94..34070.02 rows=9231 width=22) (actual time=68301.325..68358.087 rows=43050 loops=1) Sort Key: p.opset_num -> Bitmap Heap Scan on parameters p (cost=272.31..33438.97 rows=9231 width=22) (actual time=526.462..67363.577rows=43050 loops=1) Recheck Cond: ('team9'::text = run) -> Bitmap Index Scan on parameters_idx (cost=0.00..272.31 rows=9231 width=0) (actual time=483.500..483.500rows=43050 loops=1) Index Cond: ('team9'::text = run) Total runtime: 68595.868 ms (13 rows) -- George Young -- "Are the gods not just?" "Oh no, child. What would become of us if they were?" (CSL)
Hi, George, george young wrote: >>Looks like a hash join might be faster. What is your work_mem set to? > > work_mem= 1024 This is 1 Megabyte. By all means, increase it, if possible. HTH, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
> -----Original Message----- > From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance- > owner@postgresql.org] On Behalf Of george young > Sent: Monday, March 27, 2006 12:48 PM > To: pgsql-performance@postgresql.org > Subject: [PERFORM] simple join uses indexes, very slow > [Snip] > > Indexes: > "parameters_idx" btree (run, opset_num, step_num, opset, opset_ver, > step, step_ver, name, split, wafers) > "parameters_opset_idx" btree (opset, step, name) > "parameters_step_idx" btree (step, name) > Have you tried creating some different indexes on parameters? I don't know if it should matter or not, but I would try some indexes like: (run, opset_num) //Without all the other columns (opset_num, run) //Backwards (opset_num) I don't really know Postgres internals all that well. It just seems to me that parameters_idx has a lot of columns this query is not interested in. I'd just be curious to see what happens.
On Tue, Mar 28, 2006 at 10:18:25AM -0600, Dave Dutcher wrote: >> "parameters_idx" btree (run, opset_num, step_num, opset, > opset_ver, >> step, step_ver, name, split, wafers) >> "parameters_opset_idx" btree (opset, step, name) >> "parameters_step_idx" btree (step, name) > Have you tried creating some different indexes on parameters? I don't > know if it should matter or not, but I would try some indexes like: > > (run, opset_num) //Without all the other columns > (opset_num, run) //Backwards > (opset_num) An index on (A,B,C) can be used for a query on (A,B) or (A), so it doesn't really matter. It isn't usable for a query on (B), (C) or (B,C), though. (The index rows will get bigger, of course, so you'll need more I/O if you want to scan large parts of it, but I guess that's beside the point.) /* Steinar */ -- Homepage: http://www.sesse.net/
> -----Original Message----- > From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance- > owner@postgresql.org] On Behalf Of Steinar H. Gunderson > Sent: Tuesday, March 28, 2006 10:29 AM > > An index on (A,B,C) can be used for a query on (A,B) or (A), so it doesn't > really matter. It isn't usable for a query on (B), (C) or (B,C), though. > (The > index rows will get bigger, of course, so you'll need more I/O if you want > to > scan large parts of it, but I guess that's beside the point.) I guess what I am really curious about is why was the OP getting an expensive sort when the planner tried a merge join? Most of the time was spent sorting the parameters parameters table by opset_num even though opset_num is indexed. Isn't Postgres able to walk the index instead of sorting? I was wondering if maybe Postgres wasn't recognizing that it could just walk the index because the opset_num column isn't the first in the index.
On Tue, Mar 28, 2006 at 06:29:08PM +0200, Steinar H. Gunderson wrote: > On Tue, Mar 28, 2006 at 10:18:25AM -0600, Dave Dutcher wrote: > >> "parameters_idx" btree (run, opset_num, step_num, opset, > > opset_ver, > >> step, step_ver, name, split, wafers) > >> "parameters_opset_idx" btree (opset, step, name) > >> "parameters_step_idx" btree (step, name) > > Have you tried creating some different indexes on parameters? I don't > > know if it should matter or not, but I would try some indexes like: > > > > (run, opset_num) //Without all the other columns > > (opset_num, run) //Backwards > > (opset_num) > > An index on (A,B,C) can be used for a query on (A,B) or (A), so it doesn't > really matter. It isn't usable for a query on (B), (C) or (B,C), though. (The > index rows will get bigger, of course, so you'll need more I/O if you want to > scan large parts of it, but I guess that's beside the point.) Note that given how statistics currenly work, there are many situations where the planner will refuse to use a multi-column index. This probably won't change until there's some concept of multi-column statistics, at least for multi-column indexes. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Tue, Mar 28, 2006 at 11:20:19AM -0600, Dave Dutcher wrote: > I guess what I am really curious about is why was the OP getting an > expensive sort when the planner tried a merge join? A merge join requires sorted inputs. > Most of the time was spent sorting the parameters parameters table by > opset_num even though opset_num is indexed. Isn't Postgres able to walk the > index instead of sorting? The time of an index scan vs. a sequential scan + sort depends on several factors, so it's not just a matter of walking the index whenever there is one. /* Steinar */ -- Homepage: http://www.sesse.net/
On Tue, 2006-03-28 at 10:22 -0500, george young wrote: > work_mem= 1024 Set that higher. Try a couple of other plans using enable_* and let us have the EXPLAIN ANALYZE plans. Best Regards, Simon Riggs
On Tue, 28 Mar 2006 19:17:49 +0100 Simon Riggs <simon@2ndquadrant.com> threw this fish to the penguins: > On Tue, 2006-03-28 at 10:22 -0500, george young wrote: > > > work_mem= 1024 > > Set that higher. > > Try a couple of other plans using enable_* and let us have the EXPLAIN > ANALYZE plans. I tried this, but it doesn't seem to have made much difference that I can see: newschm3=> show work_mem; work_mem ---------- 8024 newschm3=> explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM run_opsets ro, parameters p WHERE ro.run = p.runAND ro.opset_num = p.opset_num and ro.run='team9'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=2.16..7957.40 rows=6707 width=22) (actual time=292.739..107672.525 rows=43050 loops=1) -> Index Scan using run_opsets_pkey on run_opsets ro (cost=0.00..128.75 rows=71 width=18) (actual time=115.134..197.818rows=263 loops=1) Index Cond: (run = 'team9'::text) -> Bitmap Heap Scan on parameters p (cost=2.16..109.93 rows=27 width=22) (actual time=2.559..408.125 rows=164 loops=263) Recheck Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num)) -> Bitmap Index Scan on parameters_idx (cost=0.00..2.16 rows=27 width=0) (actual time=2.099..2.099 rows=164 loops=263) Index Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num)) Total runtime: 107860.493 ms (8 rows) newschm3=> shoe enable_nestloop; ERROR: syntax error at or near "shoe" at character 1 LINE 1: shoe enable_nestloop; ^ newschm3=> show enable_nestloop; enable_nestloop ----------------- on (1 row) newschm3=> set enable_nestloop=off; SET newschm3=> explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM run_opsets ro, parameters p WHERE ro.run = p.runAND ro.opset_num = p.opset_num and ro.run='team9'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=34177.87..34291.36 rows=6707 width=22) (actual time=64654.744..64760.875 rows=43050 loops=1) Merge Cond: ("outer".opset_num = "inner".opset_num) -> Sort (cost=130.93..131.11 rows=71 width=18) (actual time=62.177..62.333 rows=263 loops=1) Sort Key: ro.opset_num -> Index Scan using run_opsets_pkey on run_opsets ro (cost=0.00..128.75 rows=71 width=18) (actual time=40.415..55.745rows=263 loops=1) Index Cond: (run = 'team9'::text) -> Sort (cost=34046.94..34070.02 rows=9231 width=22) (actual time=64592.526..64615.228 rows=43050 loops=1) Sort Key: p.opset_num -> Bitmap Heap Scan on parameters p (cost=272.31..33438.97 rows=9231 width=22) (actual time=333.975..64126.200rows=43050 loops=1) Recheck Cond: ('team9'::text = run) -> Bitmap Index Scan on parameters_idx (cost=0.00..272.31 rows=9231 width=0) (actual time=309.199..309.199rows=43050 loops=1) Index Cond: ('team9'::text = run) Total runtime: 64919.714 ms (13 rows) -- "Are the gods not just?" "Oh no, child. What would become of us if they were?" (CSL)
> -----Original Message----- > From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance- > owner@postgresql.org] On Behalf Of Steinar H. Gunderson > A merge join requires sorted inputs. > > > Most of the time was spent sorting the parameters parameters table by > > opset_num even though opset_num is indexed. Isn't Postgres able to walk > the > > index instead of sorting? > > The time of an index scan vs. a sequential scan + sort depends on several > factors, so it's not just a matter of walking the index whenever there is > one. I was just looking this over again and I realized I misread the query plan. The slowest step was the Bitmap Heap Scan not the sort. (The sort was relatively fast.)
george young wrote: > [PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.0.1] > I have a simple join on two tables that takes way too long. Can you help > me understand what's wrong? There are indexes defined on the relevant columns. > I just did a fresh vacuum --full --analyze on the two tables. > Is there something I'm not seeing? > [CPU is 950Mhz AMD, 256MB RAM, 15k rpm scsi disk] > -- George Young > > Table sizes: parameters has 2.1512e+07 tuples, run_opsets has 211745 tuples. > > explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM run_opsets ro, parameters p WHERE ro.run = p.run AND ro.opset_num= p.opset_num and ro.run='team9'; > QUERY PLAN > -------------------------------------------------------------------------------------------------------------------------------------------- > Nested Loop (cost=2.16..7957.40 rows=6707 width=22) (actual time=14.986..70197.129 rows=43050 loops=1) > -> Index Scan using run_opsets_pkey on run_opsets ro (cost=0.00..128.75 rows=71 width=18) (actual time=0.386..62.959rows=263 loops=1) > Index Cond: (run = 'team9'::text) > -> Bitmap Heap Scan on parameters p (cost=2.16..109.93 rows=27 width=22) (actual time=1.591..266.211 rows=164 loops=263) > Recheck Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num)) > -> Bitmap Index Scan on parameters_idx (cost=0.00..2.16 rows=27 width=0) (actual time=1.153..1.153 rows=164loops=263) > Index Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num)) > Total runtime: 70237.727 ms > (8 rows) > > Table "public.run_opsets" > Column | Type | Modifiers > --------------+-----------------------------+------------------------- > run | text | not null > opset | text | > opset_ver | integer | > opset_num | integer | not null > status | opset_status | > date_started | timestamp without time zone | > date_done | timestamp without time zone | > work_started | timestamp without time zone | > lock_user | text | default 'NO-USER'::text > lock_pid | integer | > needs_review | text | > Indexes: > "run_opsets_pkey" PRIMARY KEY, btree (run, opset_num) CLUSTER > > > -- Table "public.parameters" > Column | Type | Modifiers > -----------+---------+------------------------------- > run | text | not null > opset_num | integer | not null > opset | text | not null > opset_ver | integer | not null > step_num | integer | not null > step | text | not null > step_ver | integer | not null > name | text | not null > value | text | > split | boolean | not null default false > wafers | text[] | not null default '{}'::text[] > Indexes: > "parameters_idx" btree (run, opset_num, step_num, opset, opset_ver, step, step_ver, name, split, wafers) > "parameters_opset_idx" btree (opset, step, name) > "parameters_step_idx" btree (step, name) More for my own information (because nobody else has suggested it), would it make a difference if 'run' was a varchar field rather than text? -- Postgresql & php tutorials http://www.designmagick.com/
If your looking for suggestions, I would suggest updating the 8.1.x you have installed to the latest version, as of typing this is 8.1.3 ;) Most notable is some of the -bug- fixes that are in since 8.1.0, for example; * Fix incorrect optimizations of outer-join conditions (Tom) You know, minor point releases aren't adding new features or changing basic functionality, they are pure and simple bugfixes. If I was in -your- position, I would run (don't walk ;) and install upto 8.1.3 of course, thats jst my 2c, feel free to ignore :D Regards Stef Chris wrote: > george young wrote: > >> [PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.0.1] >> I have a simple join on two tables that takes way too long. Can you >> help >> me understand what's wrong? There are indexes defined on the >> relevant columns. >> I just did a fresh vacuum --full --analyze on the two tables. >> Is there something I'm not seeing? >> [CPU is 950Mhz AMD, 256MB RAM, 15k rpm scsi disk] >> -- George Young >> >> Table sizes: parameters has 2.1512e+07 tuples, run_opsets has 211745 >> tuples. >> >> explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM >> run_opsets ro, parameters p WHERE ro.run = p.run AND ro.opset_num = >> p.opset_num and ro.run='team9'; >> >> QUERY PLAN >> -------------------------------------------------------------------------------------------------------------------------------------------- >> >> Nested Loop (cost=2.16..7957.40 rows=6707 width=22) (actual >> time=14.986..70197.129 rows=43050 loops=1) >> -> Index Scan using run_opsets_pkey on run_opsets ro >> (cost=0.00..128.75 rows=71 width=18) (actual time=0.386..62.959 >> rows=263 loops=1) >> Index Cond: (run = 'team9'::text) >> -> Bitmap Heap Scan on parameters p (cost=2.16..109.93 rows=27 >> width=22) (actual time=1.591..266.211 rows=164 loops=263) >> Recheck Cond: (('team9'::text = p.run) AND >> ("outer".opset_num = p.opset_num)) >> -> Bitmap Index Scan on parameters_idx (cost=0.00..2.16 >> rows=27 width=0) (actual time=1.153..1.153 rows=164 loops=263) >> Index Cond: (('team9'::text = p.run) AND >> ("outer".opset_num = p.opset_num)) >> Total runtime: 70237.727 ms >> (8 rows) >> >> Table "public.run_opsets" >> Column | Type | Modifiers >> --------------+-----------------------------+------------------------- >> run | text | not null >> opset | text | >> opset_ver | integer | >> opset_num | integer | not null >> status | opset_status | >> date_started | timestamp without time zone | >> date_done | timestamp without time zone | >> work_started | timestamp without time zone | >> lock_user | text | default 'NO-USER'::text >> lock_pid | integer | >> needs_review | text | >> Indexes: >> "run_opsets_pkey" PRIMARY KEY, btree (run, opset_num) CLUSTER >> >> >> -- Table "public.parameters" >> Column | Type | Modifiers >> -----------+---------+------------------------------- >> run | text | not null >> opset_num | integer | not null >> opset | text | not null >> opset_ver | integer | not null >> step_num | integer | not null >> step | text | not null >> step_ver | integer | not null >> name | text | not null >> value | text | >> split | boolean | not null default false >> wafers | text[] | not null default '{}'::text[] >> Indexes: >> "parameters_idx" btree (run, opset_num, step_num, opset, >> opset_ver, step, step_ver, name, split, wafers) >> "parameters_opset_idx" btree (opset, step, name) >> "parameters_step_idx" btree (step, name) > > > More for my own information (because nobody else has suggested it), > would it make a difference if 'run' was a varchar field rather than text? > >
On Wed, 29 Mar 2006 01:08:15 -0500 stef <stef@ummon.com> threw this fish to the penguins: > > If your looking for suggestions, I would suggest updating the 8.1.x you > have installed to the latest version, as of typing this is 8.1.3 ;) Most > notable is some of the -bug- fixes that are in since 8.1.0, for example; > > * Fix incorrect optimizations of outer-join conditions (Tom) > > You know, minor point releases aren't adding new features or changing > basic functionality, they are pure and simple bugfixes. If I was in > -your- position, I would run (don't walk ;) and install upto 8.1.3 I just did this(8.1.3). I also moved the server to a host with more ram and faster cpu. And I did cluster on the main index of the large parameters table. The result is less than a second instead of 70 seconds. Sorry I didn't have time to isolate the individual effects of the above changes, but sometimes you just have to do "a bunch of good things" and move on. For your enjoyment here's the latest analyze: newschm3=> explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM run_opsets ro, parameters p WHERE ro.run = p.runAND ro.opset_num = p.opset_num and ro.run='team9'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..6194.18 rows=9186 width=22) (actual time=0.477..175.554 rows=43050 loops=1) -> Index Scan using run_opsets_pkey on run_opsets ro (cost=0.00..122.27 rows=68 width=18) (actual time=0.222..1.093rows=263 loops=1) Index Cond: (run = 'team9'::text) -> Index Scan using parameters_idx on parameters p (cost=0.00..88.72 rows=46 width=22) (actual time=0.023..0.498 rows=164loops=263) Index Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num)) Total runtime: 190.821 ms Thank you all very much for you help! -- George Young > > of course, thats jst my 2c, feel free to ignore :D > Regards > Stef > > Chris wrote: > > > george young wrote: > > > >> [PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.0.1] > >> I have a simple join on two tables that takes way too long. Can you > >> help > >> me understand what's wrong? There are indexes defined on the > >> relevant columns. > >> I just did a fresh vacuum --full --analyze on the two tables. > >> Is there something I'm not seeing? > >> [CPU is 950Mhz AMD, 256MB RAM, 15k rpm scsi disk] > >> -- George Young > >> > >> Table sizes: parameters has 2.1512e+07 tuples, run_opsets has 211745 > >> tuples. > >> > >> explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM > >> run_opsets ro, parameters p WHERE ro.run = p.run AND ro.opset_num = > >> p.opset_num and ro.run='team9'; > >> > >> QUERY PLAN > >> -------------------------------------------------------------------------------------------------------------------------------------------- > >> > >> Nested Loop (cost=2.16..7957.40 rows=6707 width=22) (actual > >> time=14.986..70197.129 rows=43050 loops=1) > >> -> Index Scan using run_opsets_pkey on run_opsets ro > >> (cost=0.00..128.75 rows=71 width=18) (actual time=0.386..62.959 > >> rows=263 loops=1) > >> Index Cond: (run = 'team9'::text) > >> -> Bitmap Heap Scan on parameters p (cost=2.16..109.93 rows=27 > >> width=22) (actual time=1.591..266.211 rows=164 loops=263) > >> Recheck Cond: (('team9'::text = p.run) AND > >> ("outer".opset_num = p.opset_num)) > >> -> Bitmap Index Scan on parameters_idx (cost=0.00..2.16 > >> rows=27 width=0) (actual time=1.153..1.153 rows=164 loops=263) > >> Index Cond: (('team9'::text = p.run) AND > >> ("outer".opset_num = p.opset_num)) > >> Total runtime: 70237.727 ms > >> (8 rows) > >> > >> Table "public.run_opsets" > >> Column | Type | Modifiers > >> --------------+-----------------------------+------------------------- > >> run | text | not null > >> opset | text | > >> opset_ver | integer | > >> opset_num | integer | not null > >> status | opset_status | > >> date_started | timestamp without time zone | > >> date_done | timestamp without time zone | > >> work_started | timestamp without time zone | > >> lock_user | text | default 'NO-USER'::text > >> lock_pid | integer | > >> needs_review | text | > >> Indexes: > >> "run_opsets_pkey" PRIMARY KEY, btree (run, opset_num) CLUSTER > >> > >> > >> -- Table "public.parameters" > >> Column | Type | Modifiers > >> -----------+---------+------------------------------- > >> run | text | not null > >> opset_num | integer | not null > >> opset | text | not null > >> opset_ver | integer | not null > >> step_num | integer | not null > >> step | text | not null > >> step_ver | integer | not null > >> name | text | not null > >> value | text | > >> split | boolean | not null default false > >> wafers | text[] | not null default '{}'::text[] > >> Indexes: > >> "parameters_idx" btree (run, opset_num, step_num, opset, > >> opset_ver, step, step_ver, name, split, wafers) > >> "parameters_opset_idx" btree (opset, step, name) > >> "parameters_step_idx" btree (step, name) > > > > > > More for my own information (because nobody else has suggested it), > > would it make a difference if 'run' was a varchar field rather than text? > > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org > -- "Are the gods not just?" "Oh no, child. What would become of us if they were?" (CSL)
On Wed, Mar 29, 2006 at 01:08:15AM -0500, stef wrote: > > If your looking for suggestions, I would suggest updating the 8.1.x you > have installed to the latest version, as of typing this is 8.1.3 ;) Most > notable is some of the -bug- fixes that are in since 8.1.0, for example; > > * Fix incorrect optimizations of outer-join conditions (Tom) > > You know, minor point releases aren't adding new features or changing > basic functionality, they are pure and simple bugfixes. If I was in > -your- position, I would run (don't walk ;) and install upto 8.1.3 More important, there are data loss bugfixes between 8.1.0 and 8.1.3. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461