Re: faster search - Mailing list pgsql-performance

From Clark Slater
Subject Re: faster search
Date
Msg-id 20050610205113.V40688@vbp2.vbp2.com
Whole thread Raw
In response to Re: faster search  ("Joshua D. Drake" <jd@commandprompt.com>)
Responses Re: faster search
List pgsql-performance
Query should return 132,528 rows.

vbp=# set enable_seqscan = false;
SET
vbp=# explain analyze select * from test where (productlistid=3 and typeid=9);

                            QUERY PLAN
------------------------------------------------------------------------
  Index Scan using test_typeid on test  (cost=0.00..137223.89 rows=156194
width=725) (actual time=25.999..25708.478 rows=132528
  loops=1)
    Index Cond: (typeid = 9)
    Filter: (productlistid = 3)
  Total runtime: 25757.679 ms
(4 rows)


On Fri, 10 Jun 2005, Joshua D. Drake wrote:

> Clark Slater wrote:
>> thanks for your suggestion.
>> a small improvement.  still pretty slow...
>>
>> vbp=# alter table test alter column productlistid set statistics 150;
>> ALTER TABLE
>> vbp=# alter table test alter column typeid set statistics 150;
>> ALTER TABLE
>> vbp=# explain analyze select * from test where (productlistid=3 and
>
> Hello,
>
> Also what happens if you:
>
> set enable_seqscan = false;
> explain analyze query....
>
> Sincerely,
>
> Joshua D. Drake
>
>
>
>> typeid=9);
>>                             QUERY PLAN
>>
>> ------------------------------------------------------------------------------
>>  Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual
>> time=525.617..36802.556 rows=132528 loops=1)
>>    Filter: ((productlistid = 3) AND (typeid = 9))
>>  Total runtime: 36847.754 ms
>> (3 rows)
>>
>> Time: 36850.719 ms
>>
>>
>> On Fri, 10 Jun 2005, Joshua D. Drake wrote:
>>
>>> Clark Slater wrote:
>>>
>>>> hmm, i'm baffled.  i simplified the query
>>>> and it is still taking forever...
>>>
>>>
>>> What happens if you:
>>>
>>> alter table test alter column productlistid set statistics 150;
>>> alter table test alter column typeid set statistics 150;
>>> explain analyze select * from test where (productlistid=3 and typeid=9);
>>>
>>> Sincerely,
>>>
>>> Joshua D. Drake
>>>
>>>
>>>>
>>>>
>>>>           test
>>>> -------------------------
>>>>  id            | integer
>>>>  partnumber    | character varying(32)
>>>>  productlistid | integer
>>>>  typeid        | integer
>>>>
>>>>
>>>> Indexes:
>>>> "test_productlistid" btree (productlistid)
>>>> "test_typeid" btree (typeid)
>>>> "test_productlistid_typeid" btree (productlistid, typeid)
>>>>
>>>>
>>>> explain analyze select * from test where (productlistid=3 and typeid=9);
>>>>
>>>>                               QUERY PLAN
>>>> -----------------------------------------------------------------------
>>>>  Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual
>>>> time=516.459..41930.250 rows=132528 loops=1)
>>>>    Filter: ((productlistid = 3) AND (typeid = 9))
>>>>  Total runtime: 41975.154 ms
>>>> (3 rows)
>>>>
>>>>
>>>> System specs:
>>>> PostgreSQL 7.4.2 on RedHat 9
>>>> dual AMD Athlon 2GHz processors
>>>> 1 gig memory
>>>> mirrored 7200 RPM IDE disks
>>>>
>>>>
>>>> On Fri, 10 Jun 2005, John A Meinel wrote:
>>>>
>>>>> Clark Slater wrote:
>>>>>
>>>>>> Hi-
>>>>>>
>>>>>> Would someone please enlighten me as
>>>>>> to why I'm not seeing a faster execution
>>>>>> time on the simple scenario below?
>>>>>>
>>>>>> there are 412,485 rows in the table and the
>>>>>> query matches on 132,528 rows, taking
>>>>>> almost a minute to execute.  vaccuum
>>>>>> analyze was just run.
>>>>>
>>>>>
>>>>>
>>>>> Well, if you are matching 130k out of 400k rows, then a sequential scan
>>>>> is certainly prefered to an index scan. And then you have to sort those
>>>>> 130k rows by partnumber. This *might* be spilling to disk depending on
>>>>> what your workmem/sortmem is set to.
>>>>>
>>>>> I would also say that what you would really want is some way to get the
>>>>> whole thing from an index. And I think the way to do that is:
>>>>>
>>>>> CREATE INDEX test_partnum_listid_typeid_idx ON
>>>>>     test(partnumber, productlistid, typeid);
>>>>>
>>>>> VACUUM ANALYZE test;
>>>>>
>>>>> EXPLAIN ANALYZE SELECT * FROM test
>>>>>     WHERE productlistid=3 AND typeid=9
>>>>>     ORDER BY partnumber, productlistid, typeid
>>>>>     LIMIT 15
>>>>> ;
>>>>>
>>>>> The trick is that you have to match the order by exactly with the index,
>>>>> so the planner realizes it can do an indexed lookup to get the
>>>>> information.
>>>>>
>>>>> You could also just create an index on partnumber, and see how that
>>>>> affects your original query. I think the planner could use an index
>>>>> lookup on partnumber to get the ordering correct. But it will have to do
>>>>> filtering after the fact based on productlistid and typeid.
>>>>> With my extended index, I think the planner can be smarter and lookup
>>>>> all 3 by the index.
>>>>>
>>>>>>
>>>>>> Thanks!
>>>>>> Clark
>>>>>
>>>>>
>>>>>
>>>>> Good luck,
>>>>> John
>>>>> =:->
>>>>>
>>>>
>>>> ---------------------------(end of broadcast)---------------------------
>>>> TIP 2: you can get off all lists at once with the unregister command
>>>>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>>>
>>>
>>>
>>> --
>>> Your PostgreSQL solutions provider, Command Prompt, Inc.
>>> 24x7 support - 1.800.492.2240, programming, and consulting
>>> Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
>>> http://www.commandprompt.com / http://www.postgresql.org
>>>
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
>
> --
> Your PostgreSQL solutions provider, Command Prompt, Inc.
> 24x7 support - 1.800.492.2240, programming, and consulting
> Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
> http://www.commandprompt.com / http://www.postgresql.org
>

pgsql-performance by date:

Previous
From: John A Meinel
Date:
Subject: Re: faster search
Next
From: "Joshua D. Drake"
Date:
Subject: Re: faster search