Thread: faster search
Hi- Would someone please enlighten me as to why I'm not seeing a faster execution time on the simple scenario below? there are 412,485 rows in the table and the query matches on 132,528 rows, taking almost a minute to execute. vaccuum analyze was just run. Thanks! Clark test ------------------------- id | integer partnumber | character varying(32) productlistid | integer typeid | integer Indexes: "test_id" btree (id) "test_plid" btree (productlistid) "test_typeid" btree (typeid) "test_plidtypeid" btree (productlistid, typeid) explain analyze select * from test where productlistid=3 and typeid=9 order by partnumber limit 15; QUERY PLAN --------------------------------------------------------------------------------------------- Limit (cost=201073.76..201073.79 rows=15 width=722) (actual time=58092.477..58092.518 rows=15 loops=1) -> Sort (cost=201073.76..201451.76 rows=151200 width=722) (actual time=58092.470..58092.505 rows=15 loops=1) Sort Key: partnumber -> Seq Scan on test (cost=0.00..96458.27 rows=151200 width=722) (actual time=2.515..40201.275 rows=132528 loops=1) Filter: ((productlistid = 3) AND (typeid = 9)) Total runtime: 59664.765 ms (6 rows) System specs: PostgreSQL 7.4.2 on RedHat 9 dual AMD Athlon 2GHz processors 1 gig memory mirrored 7200 RPM IDE disks
On Fri, Jun 10, 2005 at 01:45:05PM -0400, Clark Slater wrote: > Indexes: > "test_id" btree (id) > "test_plid" btree (productlistid) > "test_typeid" btree (typeid) > "test_plidtypeid" btree (productlistid, typeid) > > > explain analyze select * from test where productlistid=3 and typeid=9 > order by partnumber limit 15; You do not have an index on partnumber. Try adding one. /* Steinar */ -- Homepage: http://www.sesse.net/
Clark Slater wrote: > Hi- > > Would someone please enlighten me as > to why I'm not seeing a faster execution > time on the simple scenario below? > > there are 412,485 rows in the table and the > query matches on 132,528 rows, taking > almost a minute to execute. vaccuum > analyze was just run. Well, if you are matching 130k out of 400k rows, then a sequential scan is certainly prefered to an index scan. And then you have to sort those 130k rows by partnumber. This *might* be spilling to disk depending on what your workmem/sortmem is set to. I would also say that what you would really want is some way to get the whole thing from an index. And I think the way to do that is: CREATE INDEX test_partnum_listid_typeid_idx ON test(partnumber, productlistid, typeid); VACUUM ANALYZE test; EXPLAIN ANALYZE SELECT * FROM test WHERE productlistid=3 AND typeid=9 ORDER BY partnumber, productlistid, typeid LIMIT 15 ; The trick is that you have to match the order by exactly with the index, so the planner realizes it can do an indexed lookup to get the information. You could also just create an index on partnumber, and see how that affects your original query. I think the planner could use an index lookup on partnumber to get the ordering correct. But it will have to do filtering after the fact based on productlistid and typeid. With my extended index, I think the planner can be smarter and lookup all 3 by the index. > > Thanks! > Clark Good luck, John =:->
Attachment
[Clark Slater - Fri at 01:45:05PM -0400] > Would someone please enlighten me as > to why I'm not seeing a faster execution > time on the simple scenario below? Just some thoughts from a novice PG-DBA .. :-) My general experience is that PG usually prefers sequal scans to indices if a large portion of the table is to be selected, because it is faster to do a seqscan than to follow an index and constantly seek between different positions on the hard disk. However, most of the time is spent sorting on partnumber, and you only want 15 rows, so of course you should have an index on partnumber! Picking up 15 rows will be ligtning fast with that index. If you may want to select significantly more than 15 rows, you can also try to make a partial index: create index test_pli3_ti9_by_part on test (partnumber) where productlistid=3 and typeid=9; If 3 and 9 are not constants in the query, try to make a three-key index (it's important with partnumber because a lot of time is spent sorting): create index test_pli_type_part on test (productslistid,typeid,partnumber); To get pg to recognize the index, you will probably have to help it a bit: select * from test where productlistid=3 and typeid=9 order by productlistid,typeid,partnumber limit 15; -- Tobias Brox, +47-91700050
On Fri, Jun 10, 2005 at 01:45:05PM -0400, Clark Slater wrote: > Hi- > > Would someone please enlighten me as > to why I'm not seeing a faster execution > time on the simple scenario below? Because you need to extract a huge number of rows via a seqscan, sort them and then throw them away, I think. > explain analyze select * from test where productlistid=3 and typeid=9 > order by partnumber limit 15; Create an index on (productlistid, typeid, partnumber) then select * from test where productlistid=3 and typeid=9 order by productlistid, typeid, partnumber LIMIT 15; ? Cheers, Steve
hmm, i'm baffled. i simplified the query and it is still taking forever... test ------------------------- id | integer partnumber | character varying(32) productlistid | integer typeid | integer Indexes: "test_productlistid" btree (productlistid) "test_typeid" btree (typeid) "test_productlistid_typeid" btree (productlistid, typeid) explain analyze select * from test where (productlistid=3 and typeid=9); QUERY PLAN ----------------------------------------------------------------------- Seq Scan on test (cost=0.00..96458.27 rows=156194 width=725) (actual time=516.459..41930.250 rows=132528 loops=1) Filter: ((productlistid = 3) AND (typeid = 9)) Total runtime: 41975.154 ms (3 rows) System specs: PostgreSQL 7.4.2 on RedHat 9 dual AMD Athlon 2GHz processors 1 gig memory mirrored 7200 RPM IDE disks On Fri, 10 Jun 2005, John A Meinel wrote: > Clark Slater wrote: >> Hi- >> >> Would someone please enlighten me as >> to why I'm not seeing a faster execution >> time on the simple scenario below? >> >> there are 412,485 rows in the table and the >> query matches on 132,528 rows, taking >> almost a minute to execute. vaccuum >> analyze was just run. > > Well, if you are matching 130k out of 400k rows, then a sequential scan > is certainly prefered to an index scan. And then you have to sort those > 130k rows by partnumber. This *might* be spilling to disk depending on > what your workmem/sortmem is set to. > > I would also say that what you would really want is some way to get the > whole thing from an index. And I think the way to do that is: > > CREATE INDEX test_partnum_listid_typeid_idx ON > test(partnumber, productlistid, typeid); > > VACUUM ANALYZE test; > > EXPLAIN ANALYZE SELECT * FROM test > WHERE productlistid=3 AND typeid=9 > ORDER BY partnumber, productlistid, typeid > LIMIT 15 > ; > > The trick is that you have to match the order by exactly with the index, > so the planner realizes it can do an indexed lookup to get the information. > > You could also just create an index on partnumber, and see how that > affects your original query. I think the planner could use an index > lookup on partnumber to get the ordering correct. But it will have to do > filtering after the fact based on productlistid and typeid. > With my extended index, I think the planner can be smarter and lookup > all 3 by the index. > >> >> Thanks! >> Clark > > Good luck, > John > =:-> >
Clark Slater wrote: > hmm, i'm baffled. i simplified the query > and it is still taking forever... What happens if you: alter table test alter column productlistid set statistics 150; alter table test alter column typeid set statistics 150; explain analyze select * from test where (productlistid=3 and typeid=9); Sincerely, Joshua D. Drake > > > test > ------------------------- > id | integer > partnumber | character varying(32) > productlistid | integer > typeid | integer > > > Indexes: > "test_productlistid" btree (productlistid) > "test_typeid" btree (typeid) > "test_productlistid_typeid" btree (productlistid, typeid) > > > explain analyze select * from test where (productlistid=3 and typeid=9); > > QUERY PLAN > ----------------------------------------------------------------------- > Seq Scan on test (cost=0.00..96458.27 rows=156194 width=725) (actual > time=516.459..41930.250 rows=132528 loops=1) > Filter: ((productlistid = 3) AND (typeid = 9)) > Total runtime: 41975.154 ms > (3 rows) > > > System specs: > PostgreSQL 7.4.2 on RedHat 9 > dual AMD Athlon 2GHz processors > 1 gig memory > mirrored 7200 RPM IDE disks > > > On Fri, 10 Jun 2005, John A Meinel wrote: > >> Clark Slater wrote: >> >>> Hi- >>> >>> Would someone please enlighten me as >>> to why I'm not seeing a faster execution >>> time on the simple scenario below? >>> >>> there are 412,485 rows in the table and the >>> query matches on 132,528 rows, taking >>> almost a minute to execute. vaccuum >>> analyze was just run. >> >> >> Well, if you are matching 130k out of 400k rows, then a sequential scan >> is certainly prefered to an index scan. And then you have to sort those >> 130k rows by partnumber. This *might* be spilling to disk depending on >> what your workmem/sortmem is set to. >> >> I would also say that what you would really want is some way to get the >> whole thing from an index. And I think the way to do that is: >> >> CREATE INDEX test_partnum_listid_typeid_idx ON >> test(partnumber, productlistid, typeid); >> >> VACUUM ANALYZE test; >> >> EXPLAIN ANALYZE SELECT * FROM test >> WHERE productlistid=3 AND typeid=9 >> ORDER BY partnumber, productlistid, typeid >> LIMIT 15 >> ; >> >> The trick is that you have to match the order by exactly with the index, >> so the planner realizes it can do an indexed lookup to get the >> information. >> >> You could also just create an index on partnumber, and see how that >> affects your original query. I think the planner could use an index >> lookup on partnumber to get the ordering correct. But it will have to do >> filtering after the fact based on productlistid and typeid. >> With my extended index, I think the planner can be smarter and lookup >> all 3 by the index. >> >>> >>> Thanks! >>> Clark >> >> >> Good luck, >> John >> =:-> >> > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) -- Your PostgreSQL solutions provider, Command Prompt, Inc. 24x7 support - 1.800.492.2240, programming, and consulting Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit http://www.commandprompt.com / http://www.postgresql.org
thanks for your suggestion. a small improvement. still pretty slow... vbp=# alter table test alter column productlistid set statistics 150; ALTER TABLE vbp=# alter table test alter column typeid set statistics 150; ALTER TABLE vbp=# explain analyze select * from test where (productlistid=3 and typeid=9); QUERY PLAN ------------------------------------------------------------------------------ Seq Scan on test (cost=0.00..96458.27 rows=156194 width=725) (actual time=525.617..36802.556 rows=132528 loops=1) Filter: ((productlistid = 3) AND (typeid = 9)) Total runtime: 36847.754 ms (3 rows) Time: 36850.719 ms On Fri, 10 Jun 2005, Joshua D. Drake wrote: > Clark Slater wrote: >> hmm, i'm baffled. i simplified the query >> and it is still taking forever... > > What happens if you: > > alter table test alter column productlistid set statistics 150; > alter table test alter column typeid set statistics 150; > explain analyze select * from test where (productlistid=3 and typeid=9); > > Sincerely, > > Joshua D. Drake > > >> >> >> test >> ------------------------- >> id | integer >> partnumber | character varying(32) >> productlistid | integer >> typeid | integer >> >> >> Indexes: >> "test_productlistid" btree (productlistid) >> "test_typeid" btree (typeid) >> "test_productlistid_typeid" btree (productlistid, typeid) >> >> >> explain analyze select * from test where (productlistid=3 and typeid=9); >> >> QUERY PLAN >> ----------------------------------------------------------------------- >> Seq Scan on test (cost=0.00..96458.27 rows=156194 width=725) (actual >> time=516.459..41930.250 rows=132528 loops=1) >> Filter: ((productlistid = 3) AND (typeid = 9)) >> Total runtime: 41975.154 ms >> (3 rows) >> >> >> System specs: >> PostgreSQL 7.4.2 on RedHat 9 >> dual AMD Athlon 2GHz processors >> 1 gig memory >> mirrored 7200 RPM IDE disks >> >> >> On Fri, 10 Jun 2005, John A Meinel wrote: >> >>> Clark Slater wrote: >>> >>>> Hi- >>>> >>>> Would someone please enlighten me as >>>> to why I'm not seeing a faster execution >>>> time on the simple scenario below? >>>> >>>> there are 412,485 rows in the table and the >>>> query matches on 132,528 rows, taking >>>> almost a minute to execute. vaccuum >>>> analyze was just run. >>> >>> >>> Well, if you are matching 130k out of 400k rows, then a sequential scan >>> is certainly prefered to an index scan. And then you have to sort those >>> 130k rows by partnumber. This *might* be spilling to disk depending on >>> what your workmem/sortmem is set to. >>> >>> I would also say that what you would really want is some way to get the >>> whole thing from an index. And I think the way to do that is: >>> >>> CREATE INDEX test_partnum_listid_typeid_idx ON >>> test(partnumber, productlistid, typeid); >>> >>> VACUUM ANALYZE test; >>> >>> EXPLAIN ANALYZE SELECT * FROM test >>> WHERE productlistid=3 AND typeid=9 >>> ORDER BY partnumber, productlistid, typeid >>> LIMIT 15 >>> ; >>> >>> The trick is that you have to match the order by exactly with the index, >>> so the planner realizes it can do an indexed lookup to get the >>> information. >>> >>> You could also just create an index on partnumber, and see how that >>> affects your original query. I think the planner could use an index >>> lookup on partnumber to get the ordering correct. But it will have to do >>> filtering after the fact based on productlistid and typeid. >>> With my extended index, I think the planner can be smarter and lookup >>> all 3 by the index. >>> >>>> >>>> Thanks! >>>> Clark >>> >>> >>> Good luck, >>> John >>> =:-> >>> >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 2: you can get off all lists at once with the unregister command >> (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > > > -- > Your PostgreSQL solutions provider, Command Prompt, Inc. > 24x7 support - 1.800.492.2240, programming, and consulting > Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit > http://www.commandprompt.com / http://www.postgresql.org >
Clark Slater wrote: > thanks for your suggestion. > a small improvement. still pretty slow... > > vbp=# alter table test alter column productlistid set statistics 150; > ALTER TABLE > vbp=# alter table test alter column typeid set statistics 150; > ALTER TABLE > vbp=# explain analyze select * from test where (productlistid=3 and > typeid=9); > QUERY PLAN > ------------------------------------------------------------------------------ > > Seq Scan on test (cost=0.00..96458.27 rows=156194 width=725) (actual > time=525.617..36802.556 rows=132528 loops=1) > Filter: ((productlistid = 3) AND (typeid = 9)) > Total runtime: 36847.754 ms > (3 rows) > > Time: 36850.719 ms > > > On Fri, 10 Jun 2005, Joshua D. Drake wrote: > >> Clark Slater wrote: >> >>> hmm, i'm baffled. i simplified the query >>> and it is still taking forever... >> >> >> What happens if you: >> >> alter table test alter column productlistid set statistics 150; >> alter table test alter column typeid set statistics 150; >> explain analyze select * from test where (productlistid=3 and typeid=9); How many rows should it return? >> >> Sincerely, >> >> Joshua D. Drake >> >> >>> >>> >>> test >>> ------------------------- >>> id | integer >>> partnumber | character varying(32) >>> productlistid | integer >>> typeid | integer >>> >>> >>> Indexes: >>> "test_productlistid" btree (productlistid) >>> "test_typeid" btree (typeid) >>> "test_productlistid_typeid" btree (productlistid, typeid) >>> >>> >>> explain analyze select * from test where (productlistid=3 and typeid=9); >>> >>> QUERY PLAN >>> ----------------------------------------------------------------------- >>> Seq Scan on test (cost=0.00..96458.27 rows=156194 width=725) (actual >>> time=516.459..41930.250 rows=132528 loops=1) >>> Filter: ((productlistid = 3) AND (typeid = 9)) >>> Total runtime: 41975.154 ms >>> (3 rows) >>> >>> >>> System specs: >>> PostgreSQL 7.4.2 on RedHat 9 >>> dual AMD Athlon 2GHz processors >>> 1 gig memory >>> mirrored 7200 RPM IDE disks >>> >>> >>> On Fri, 10 Jun 2005, John A Meinel wrote: >>> >>>> Clark Slater wrote: >>>> >>>>> Hi- >>>>> >>>>> Would someone please enlighten me as >>>>> to why I'm not seeing a faster execution >>>>> time on the simple scenario below? >>>>> >>>>> there are 412,485 rows in the table and the >>>>> query matches on 132,528 rows, taking >>>>> almost a minute to execute. vaccuum >>>>> analyze was just run. >>>> >>>> >>>> >>>> Well, if you are matching 130k out of 400k rows, then a sequential scan >>>> is certainly prefered to an index scan. And then you have to sort those >>>> 130k rows by partnumber. This *might* be spilling to disk depending on >>>> what your workmem/sortmem is set to. >>>> >>>> I would also say that what you would really want is some way to get the >>>> whole thing from an index. And I think the way to do that is: >>>> >>>> CREATE INDEX test_partnum_listid_typeid_idx ON >>>> test(partnumber, productlistid, typeid); >>>> >>>> VACUUM ANALYZE test; >>>> >>>> EXPLAIN ANALYZE SELECT * FROM test >>>> WHERE productlistid=3 AND typeid=9 >>>> ORDER BY partnumber, productlistid, typeid >>>> LIMIT 15 >>>> ; >>>> >>>> The trick is that you have to match the order by exactly with the >>>> index, >>>> so the planner realizes it can do an indexed lookup to get the >>>> information. >>>> >>>> You could also just create an index on partnumber, and see how that >>>> affects your original query. I think the planner could use an index >>>> lookup on partnumber to get the ordering correct. But it will have >>>> to do >>>> filtering after the fact based on productlistid and typeid. >>>> With my extended index, I think the planner can be smarter and lookup >>>> all 3 by the index. >>>> >>>>> >>>>> Thanks! >>>>> Clark >>>> >>>> >>>> >>>> Good luck, >>>> John >>>> =:-> >>>> >>> >>> ---------------------------(end of broadcast)--------------------------- >>> TIP 2: you can get off all lists at once with the unregister command >>> (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >> >> >> >> -- >> Your PostgreSQL solutions provider, Command Prompt, Inc. >> 24x7 support - 1.800.492.2240, programming, and consulting >> Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit >> http://www.commandprompt.com / http://www.postgresql.org >> > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org -- Your PostgreSQL solutions provider, Command Prompt, Inc. 24x7 support - 1.800.492.2240, programming, and consulting Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit http://www.commandprompt.com / http://www.postgresql.org
Clark Slater wrote: > thanks for your suggestion. > a small improvement. still pretty slow... > > vbp=# alter table test alter column productlistid set statistics 150; > ALTER TABLE > vbp=# alter table test alter column typeid set statistics 150; > ALTER TABLE > vbp=# explain analyze select * from test where (productlistid=3 and Hello, Also what happens if you: set enable_seqscan = false; explain analyze query.... Sincerely, Joshua D. Drake > typeid=9); > QUERY PLAN > ------------------------------------------------------------------------------ > > Seq Scan on test (cost=0.00..96458.27 rows=156194 width=725) (actual > time=525.617..36802.556 rows=132528 loops=1) > Filter: ((productlistid = 3) AND (typeid = 9)) > Total runtime: 36847.754 ms > (3 rows) > > Time: 36850.719 ms > > > On Fri, 10 Jun 2005, Joshua D. Drake wrote: > >> Clark Slater wrote: >> >>> hmm, i'm baffled. i simplified the query >>> and it is still taking forever... >> >> >> What happens if you: >> >> alter table test alter column productlistid set statistics 150; >> alter table test alter column typeid set statistics 150; >> explain analyze select * from test where (productlistid=3 and typeid=9); >> >> Sincerely, >> >> Joshua D. Drake >> >> >>> >>> >>> test >>> ------------------------- >>> id | integer >>> partnumber | character varying(32) >>> productlistid | integer >>> typeid | integer >>> >>> >>> Indexes: >>> "test_productlistid" btree (productlistid) >>> "test_typeid" btree (typeid) >>> "test_productlistid_typeid" btree (productlistid, typeid) >>> >>> >>> explain analyze select * from test where (productlistid=3 and typeid=9); >>> >>> QUERY PLAN >>> ----------------------------------------------------------------------- >>> Seq Scan on test (cost=0.00..96458.27 rows=156194 width=725) (actual >>> time=516.459..41930.250 rows=132528 loops=1) >>> Filter: ((productlistid = 3) AND (typeid = 9)) >>> Total runtime: 41975.154 ms >>> (3 rows) >>> >>> >>> System specs: >>> PostgreSQL 7.4.2 on RedHat 9 >>> dual AMD Athlon 2GHz processors >>> 1 gig memory >>> mirrored 7200 RPM IDE disks >>> >>> >>> On Fri, 10 Jun 2005, John A Meinel wrote: >>> >>>> Clark Slater wrote: >>>> >>>>> Hi- >>>>> >>>>> Would someone please enlighten me as >>>>> to why I'm not seeing a faster execution >>>>> time on the simple scenario below? >>>>> >>>>> there are 412,485 rows in the table and the >>>>> query matches on 132,528 rows, taking >>>>> almost a minute to execute. vaccuum >>>>> analyze was just run. >>>> >>>> >>>> >>>> Well, if you are matching 130k out of 400k rows, then a sequential scan >>>> is certainly prefered to an index scan. And then you have to sort those >>>> 130k rows by partnumber. This *might* be spilling to disk depending on >>>> what your workmem/sortmem is set to. >>>> >>>> I would also say that what you would really want is some way to get the >>>> whole thing from an index. And I think the way to do that is: >>>> >>>> CREATE INDEX test_partnum_listid_typeid_idx ON >>>> test(partnumber, productlistid, typeid); >>>> >>>> VACUUM ANALYZE test; >>>> >>>> EXPLAIN ANALYZE SELECT * FROM test >>>> WHERE productlistid=3 AND typeid=9 >>>> ORDER BY partnumber, productlistid, typeid >>>> LIMIT 15 >>>> ; >>>> >>>> The trick is that you have to match the order by exactly with the >>>> index, >>>> so the planner realizes it can do an indexed lookup to get the >>>> information. >>>> >>>> You could also just create an index on partnumber, and see how that >>>> affects your original query. I think the planner could use an index >>>> lookup on partnumber to get the ordering correct. But it will have >>>> to do >>>> filtering after the fact based on productlistid and typeid. >>>> With my extended index, I think the planner can be smarter and lookup >>>> all 3 by the index. >>>> >>>>> >>>>> Thanks! >>>>> Clark >>>> >>>> >>>> >>>> Good luck, >>>> John >>>> =:-> >>>> >>> >>> ---------------------------(end of broadcast)--------------------------- >>> TIP 2: you can get off all lists at once with the unregister command >>> (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >> >> >> >> -- >> Your PostgreSQL solutions provider, Command Prompt, Inc. >> 24x7 support - 1.800.492.2240, programming, and consulting >> Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit >> http://www.commandprompt.com / http://www.postgresql.org >> > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org -- Your PostgreSQL solutions provider, Command Prompt, Inc. 24x7 support - 1.800.492.2240, programming, and consulting Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit http://www.commandprompt.com / http://www.postgresql.org
Clark Slater wrote: > hmm, i'm baffled. i simplified the query > and it is still taking forever... > > > test > ------------------------- > id | integer > partnumber | character varying(32) > productlistid | integer > typeid | integer > > > Indexes: > "test_productlistid" btree (productlistid) > "test_typeid" btree (typeid) > "test_productlistid_typeid" btree (productlistid, typeid) > > > explain analyze select * from test where (productlistid=3 and typeid=9); > > QUERY PLAN > ----------------------------------------------------------------------- > Seq Scan on test (cost=0.00..96458.27 rows=156194 width=725) (actual > time=516.459..41930.250 rows=132528 loops=1) > Filter: ((productlistid = 3) AND (typeid = 9)) > Total runtime: 41975.154 ms > (3 rows) > > This query is still going to take a long time, because you have to scan the whole table. Your WHERE clause is not very specific (it takes 25% of the table). Convention says that any time you want > 5-10% of a table, a sequential scan is better, because it does it in order. Now if you did: explain analyze select * from test where (productlistid=3 and typeid=9) limit 15; I think that would be very fast. I am a little surprised that it is taking 40s to scan only 400k rows, though. On an older machine of mine (with only 256M ram and dual 450MHz Celerons), I have a table with 74k rows which takes about .5 sec. At those numbers it should take more like 4s not 40. John =:->
Attachment
Query should return 132,528 rows. vbp=# set enable_seqscan = false; SET vbp=# explain analyze select * from test where (productlistid=3 and typeid=9); QUERY PLAN ------------------------------------------------------------------------ Index Scan using test_typeid on test (cost=0.00..137223.89 rows=156194 width=725) (actual time=25.999..25708.478 rows=132528 loops=1) Index Cond: (typeid = 9) Filter: (productlistid = 3) Total runtime: 25757.679 ms (4 rows) On Fri, 10 Jun 2005, Joshua D. Drake wrote: > Clark Slater wrote: >> thanks for your suggestion. >> a small improvement. still pretty slow... >> >> vbp=# alter table test alter column productlistid set statistics 150; >> ALTER TABLE >> vbp=# alter table test alter column typeid set statistics 150; >> ALTER TABLE >> vbp=# explain analyze select * from test where (productlistid=3 and > > Hello, > > Also what happens if you: > > set enable_seqscan = false; > explain analyze query.... > > Sincerely, > > Joshua D. Drake > > > >> typeid=9); >> QUERY PLAN >> >> ------------------------------------------------------------------------------ >> Seq Scan on test (cost=0.00..96458.27 rows=156194 width=725) (actual >> time=525.617..36802.556 rows=132528 loops=1) >> Filter: ((productlistid = 3) AND (typeid = 9)) >> Total runtime: 36847.754 ms >> (3 rows) >> >> Time: 36850.719 ms >> >> >> On Fri, 10 Jun 2005, Joshua D. Drake wrote: >> >>> Clark Slater wrote: >>> >>>> hmm, i'm baffled. i simplified the query >>>> and it is still taking forever... >>> >>> >>> What happens if you: >>> >>> alter table test alter column productlistid set statistics 150; >>> alter table test alter column typeid set statistics 150; >>> explain analyze select * from test where (productlistid=3 and typeid=9); >>> >>> Sincerely, >>> >>> Joshua D. Drake >>> >>> >>>> >>>> >>>> test >>>> ------------------------- >>>> id | integer >>>> partnumber | character varying(32) >>>> productlistid | integer >>>> typeid | integer >>>> >>>> >>>> Indexes: >>>> "test_productlistid" btree (productlistid) >>>> "test_typeid" btree (typeid) >>>> "test_productlistid_typeid" btree (productlistid, typeid) >>>> >>>> >>>> explain analyze select * from test where (productlistid=3 and typeid=9); >>>> >>>> QUERY PLAN >>>> ----------------------------------------------------------------------- >>>> Seq Scan on test (cost=0.00..96458.27 rows=156194 width=725) (actual >>>> time=516.459..41930.250 rows=132528 loops=1) >>>> Filter: ((productlistid = 3) AND (typeid = 9)) >>>> Total runtime: 41975.154 ms >>>> (3 rows) >>>> >>>> >>>> System specs: >>>> PostgreSQL 7.4.2 on RedHat 9 >>>> dual AMD Athlon 2GHz processors >>>> 1 gig memory >>>> mirrored 7200 RPM IDE disks >>>> >>>> >>>> On Fri, 10 Jun 2005, John A Meinel wrote: >>>> >>>>> Clark Slater wrote: >>>>> >>>>>> Hi- >>>>>> >>>>>> Would someone please enlighten me as >>>>>> to why I'm not seeing a faster execution >>>>>> time on the simple scenario below? >>>>>> >>>>>> there are 412,485 rows in the table and the >>>>>> query matches on 132,528 rows, taking >>>>>> almost a minute to execute. vaccuum >>>>>> analyze was just run. >>>>> >>>>> >>>>> >>>>> Well, if you are matching 130k out of 400k rows, then a sequential scan >>>>> is certainly prefered to an index scan. And then you have to sort those >>>>> 130k rows by partnumber. This *might* be spilling to disk depending on >>>>> what your workmem/sortmem is set to. >>>>> >>>>> I would also say that what you would really want is some way to get the >>>>> whole thing from an index. And I think the way to do that is: >>>>> >>>>> CREATE INDEX test_partnum_listid_typeid_idx ON >>>>> test(partnumber, productlistid, typeid); >>>>> >>>>> VACUUM ANALYZE test; >>>>> >>>>> EXPLAIN ANALYZE SELECT * FROM test >>>>> WHERE productlistid=3 AND typeid=9 >>>>> ORDER BY partnumber, productlistid, typeid >>>>> LIMIT 15 >>>>> ; >>>>> >>>>> The trick is that you have to match the order by exactly with the index, >>>>> so the planner realizes it can do an indexed lookup to get the >>>>> information. >>>>> >>>>> You could also just create an index on partnumber, and see how that >>>>> affects your original query. I think the planner could use an index >>>>> lookup on partnumber to get the ordering correct. But it will have to do >>>>> filtering after the fact based on productlistid and typeid. >>>>> With my extended index, I think the planner can be smarter and lookup >>>>> all 3 by the index. >>>>> >>>>>> >>>>>> Thanks! >>>>>> Clark >>>>> >>>>> >>>>> >>>>> Good luck, >>>>> John >>>>> =:-> >>>>> >>>> >>>> ---------------------------(end of broadcast)--------------------------- >>>> TIP 2: you can get off all lists at once with the unregister command >>>> (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >>> >>> >>> >>> -- >>> Your PostgreSQL solutions provider, Command Prompt, Inc. >>> 24x7 support - 1.800.492.2240, programming, and consulting >>> Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit >>> http://www.commandprompt.com / http://www.postgresql.org >>> >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > > -- > Your PostgreSQL solutions provider, Command Prompt, Inc. > 24x7 support - 1.800.492.2240, programming, and consulting > Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit > http://www.commandprompt.com / http://www.postgresql.org >
Clark Slater wrote: > Query should return 132,528 rows. O.k. then the planner is doing fine it looks like. The problem is you are pulling 132,528 rows. I would suggest moving to a cursor which will allow you to fetch in smaller chunks much quicker. Sincerely, Joshua D. Drake > > vbp=# set enable_seqscan = false; > SET > vbp=# explain analyze select * from test where (productlistid=3 and > typeid=9); > > QUERY PLAN > ------------------------------------------------------------------------ > Index Scan using test_typeid on test (cost=0.00..137223.89 rows=156194 > width=725) (actual time=25.999..25708.478 rows=132528 > loops=1) > Index Cond: (typeid = 9) > Filter: (productlistid = 3) > Total runtime: 25757.679 ms > (4 rows) > > > On Fri, 10 Jun 2005, Joshua D. Drake wrote: > >> Clark Slater wrote: >> >>> thanks for your suggestion. >>> a small improvement. still pretty slow... >>> >>> vbp=# alter table test alter column productlistid set statistics 150; >>> ALTER TABLE >>> vbp=# alter table test alter column typeid set statistics 150; >>> ALTER TABLE >>> vbp=# explain analyze select * from test where (productlistid=3 and >> >> >> Hello, >> >> Also what happens if you: >> >> set enable_seqscan = false; >> explain analyze query.... >> >> Sincerely, >> >> Joshua D. Drake >> >> >> >>> typeid=9); >>> QUERY PLAN >>> >>> ------------------------------------------------------------------------------ >>> Seq Scan on test (cost=0.00..96458.27 rows=156194 width=725) >>> (actual time=525.617..36802.556 rows=132528 loops=1) >>> Filter: ((productlistid = 3) AND (typeid = 9)) >>> Total runtime: 36847.754 ms >>> (3 rows) >>> >>> Time: 36850.719 ms >>> >>> >>> On Fri, 10 Jun 2005, Joshua D. Drake wrote: >>> >>>> Clark Slater wrote: >>>> >>>>> hmm, i'm baffled. i simplified the query >>>>> and it is still taking forever... >>>> >>>> >>>> >>>> What happens if you: >>>> >>>> alter table test alter column productlistid set statistics 150; >>>> alter table test alter column typeid set statistics 150; >>>> explain analyze select * from test where (productlistid=3 and >>>> typeid=9); >>>> >>>> Sincerely, >>>> >>>> Joshua D. Drake >>>> >>>> >>>>> >>>>> >>>>> test >>>>> ------------------------- >>>>> id | integer >>>>> partnumber | character varying(32) >>>>> productlistid | integer >>>>> typeid | integer >>>>> >>>>> >>>>> Indexes: >>>>> "test_productlistid" btree (productlistid) >>>>> "test_typeid" btree (typeid) >>>>> "test_productlistid_typeid" btree (productlistid, typeid) >>>>> >>>>> >>>>> explain analyze select * from test where (productlistid=3 and >>>>> typeid=9); >>>>> >>>>> QUERY PLAN >>>>> ----------------------------------------------------------------------- >>>>> >>>>> Seq Scan on test (cost=0.00..96458.27 rows=156194 width=725) (actual >>>>> time=516.459..41930.250 rows=132528 loops=1) >>>>> Filter: ((productlistid = 3) AND (typeid = 9)) >>>>> Total runtime: 41975.154 ms >>>>> (3 rows) >>>>> >>>>> >>>>> System specs: >>>>> PostgreSQL 7.4.2 on RedHat 9 >>>>> dual AMD Athlon 2GHz processors >>>>> 1 gig memory >>>>> mirrored 7200 RPM IDE disks >>>>> >>>>> >>>>> On Fri, 10 Jun 2005, John A Meinel wrote: >>>>> >>>>>> Clark Slater wrote: >>>>>> >>>>>>> Hi- >>>>>>> >>>>>>> Would someone please enlighten me as >>>>>>> to why I'm not seeing a faster execution >>>>>>> time on the simple scenario below? >>>>>>> >>>>>>> there are 412,485 rows in the table and the >>>>>>> query matches on 132,528 rows, taking >>>>>>> almost a minute to execute. vaccuum >>>>>>> analyze was just run. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Well, if you are matching 130k out of 400k rows, then a sequential >>>>>> scan >>>>>> is certainly prefered to an index scan. And then you have to sort >>>>>> those >>>>>> 130k rows by partnumber. This *might* be spilling to disk >>>>>> depending on >>>>>> what your workmem/sortmem is set to. >>>>>> >>>>>> I would also say that what you would really want is some way to >>>>>> get the >>>>>> whole thing from an index. And I think the way to do that is: >>>>>> >>>>>> CREATE INDEX test_partnum_listid_typeid_idx ON >>>>>> test(partnumber, productlistid, typeid); >>>>>> >>>>>> VACUUM ANALYZE test; >>>>>> >>>>>> EXPLAIN ANALYZE SELECT * FROM test >>>>>> WHERE productlistid=3 AND typeid=9 >>>>>> ORDER BY partnumber, productlistid, typeid >>>>>> LIMIT 15 >>>>>> ; >>>>>> >>>>>> The trick is that you have to match the order by exactly with the >>>>>> index, >>>>>> so the planner realizes it can do an indexed lookup to get the >>>>>> information. >>>>>> >>>>>> You could also just create an index on partnumber, and see how that >>>>>> affects your original query. I think the planner could use an index >>>>>> lookup on partnumber to get the ordering correct. But it will have >>>>>> to do >>>>>> filtering after the fact based on productlistid and typeid. >>>>>> With my extended index, I think the planner can be smarter and lookup >>>>>> all 3 by the index. >>>>>> >>>>>>> >>>>>>> Thanks! >>>>>>> Clark >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Good luck, >>>>>> John >>>>>> =:-> >>>>>> >>>>> >>>>> ---------------------------(end of >>>>> broadcast)--------------------------- >>>>> TIP 2: you can get off all lists at once with the unregister command >>>>> (send "unregister YourEmailAddressHere" to >>>>> majordomo@postgresql.org) >>>> >>>> >>>> >>>> >>>> -- >>>> Your PostgreSQL solutions provider, Command Prompt, Inc. >>>> 24x7 support - 1.800.492.2240, programming, and consulting >>>> Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit >>>> http://www.commandprompt.com / http://www.postgresql.org >>>> >>> >>> ---------------------------(end of broadcast)--------------------------- >>> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >> >> >> >> -- >> Your PostgreSQL solutions provider, Command Prompt, Inc. >> 24x7 support - 1.800.492.2240, programming, and consulting >> Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit >> http://www.commandprompt.com / http://www.postgresql.org >> -- Your PostgreSQL solutions provider, Command Prompt, Inc. 24x7 support - 1.800.492.2240, programming, and consulting Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit http://www.commandprompt.com / http://www.postgresql.org
Steve Atkins wrote: > On Fri, Jun 10, 2005 at 01:45:05PM -0400, Clark Slater wrote: > >>Hi- >> >>Would someone please enlighten me as >>to why I'm not seeing a faster execution >>time on the simple scenario below? > > [...] > > Create an index on (productlistid, typeid, partnumber) then > > select * from test where productlistid=3 and typeid=9 > order by productlistid, typeid, partnumber LIMIT 15; > Clark, try also adding (just for testing) partnumber to your where clause, like this: select * from test where productlistid=3 and typeid=9 and partnumber='foo' order by productlistid, typeid, partnumber; and check output of explain analyze. I had experiences of planner "bad" use of indexes when attribute types were integer and cardinality was low (a single attribute value, like "typeid=9" selects one or few rows). However, this was on 7.1.3, and probably is not relevant to your case. -- Cosimo
John A Meinel <john@arbash-meinel.com> writes: > I am a little surprised that it is taking 40s to scan only 400k rows, > though. Yeah, that seemed high to me too. Table bloat maybe? It would be interesting to look at the output of "vacuum verbose test" to see how much dead space there is. regards, tom lane