Re: faster search - Mailing list pgsql-performance

From Joshua D. Drake
Subject Re: faster search
Date
Msg-id 42AA3697.3040706@commandprompt.com
Whole thread Raw
In response to Re: faster search  (Clark Slater <list@slatech.com>)
List pgsql-performance
Clark Slater wrote:
> Query should return 132,528 rows.

O.k. then the planner is doing fine it looks like. The problem is you
are pulling 132,528 rows. I would suggest moving to a cursor which will
allow you to fetch in smaller chunks much quicker.

Sincerely,

Joshua D. Drake


>
> vbp=# set enable_seqscan = false;
> SET
> vbp=# explain analyze select * from test where (productlistid=3 and
> typeid=9);
>
>                            QUERY PLAN
> ------------------------------------------------------------------------
>  Index Scan using test_typeid on test  (cost=0.00..137223.89 rows=156194
> width=725) (actual time=25.999..25708.478 rows=132528
>  loops=1)
>    Index Cond: (typeid = 9)
>    Filter: (productlistid = 3)
>  Total runtime: 25757.679 ms
> (4 rows)
>
>
> On Fri, 10 Jun 2005, Joshua D. Drake wrote:
>
>> Clark Slater wrote:
>>
>>> thanks for your suggestion.
>>> a small improvement.  still pretty slow...
>>>
>>> vbp=# alter table test alter column productlistid set statistics 150;
>>> ALTER TABLE
>>> vbp=# alter table test alter column typeid set statistics 150;
>>> ALTER TABLE
>>> vbp=# explain analyze select * from test where (productlistid=3 and
>>
>>
>> Hello,
>>
>> Also what happens if you:
>>
>> set enable_seqscan = false;
>> explain analyze query....
>>
>> Sincerely,
>>
>> Joshua D. Drake
>>
>>
>>
>>> typeid=9);
>>>                             QUERY PLAN
>>>
>>> ------------------------------------------------------------------------------
>>>  Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725)
>>> (actual time=525.617..36802.556 rows=132528 loops=1)
>>>    Filter: ((productlistid = 3) AND (typeid = 9))
>>>  Total runtime: 36847.754 ms
>>> (3 rows)
>>>
>>> Time: 36850.719 ms
>>>
>>>
>>> On Fri, 10 Jun 2005, Joshua D. Drake wrote:
>>>
>>>> Clark Slater wrote:
>>>>
>>>>> hmm, i'm baffled.  i simplified the query
>>>>> and it is still taking forever...
>>>>
>>>>
>>>>
>>>> What happens if you:
>>>>
>>>> alter table test alter column productlistid set statistics 150;
>>>> alter table test alter column typeid set statistics 150;
>>>> explain analyze select * from test where (productlistid=3 and
>>>> typeid=9);
>>>>
>>>> Sincerely,
>>>>
>>>> Joshua D. Drake
>>>>
>>>>
>>>>>
>>>>>
>>>>>           test
>>>>> -------------------------
>>>>>  id            | integer
>>>>>  partnumber    | character varying(32)
>>>>>  productlistid | integer
>>>>>  typeid        | integer
>>>>>
>>>>>
>>>>> Indexes:
>>>>> "test_productlistid" btree (productlistid)
>>>>> "test_typeid" btree (typeid)
>>>>> "test_productlistid_typeid" btree (productlistid, typeid)
>>>>>
>>>>>
>>>>> explain analyze select * from test where (productlistid=3 and
>>>>> typeid=9);
>>>>>
>>>>>                               QUERY PLAN
>>>>> -----------------------------------------------------------------------
>>>>>
>>>>>  Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual
>>>>> time=516.459..41930.250 rows=132528 loops=1)
>>>>>    Filter: ((productlistid = 3) AND (typeid = 9))
>>>>>  Total runtime: 41975.154 ms
>>>>> (3 rows)
>>>>>
>>>>>
>>>>> System specs:
>>>>> PostgreSQL 7.4.2 on RedHat 9
>>>>> dual AMD Athlon 2GHz processors
>>>>> 1 gig memory
>>>>> mirrored 7200 RPM IDE disks
>>>>>
>>>>>
>>>>> On Fri, 10 Jun 2005, John A Meinel wrote:
>>>>>
>>>>>> Clark Slater wrote:
>>>>>>
>>>>>>> Hi-
>>>>>>>
>>>>>>> Would someone please enlighten me as
>>>>>>> to why I'm not seeing a faster execution
>>>>>>> time on the simple scenario below?
>>>>>>>
>>>>>>> there are 412,485 rows in the table and the
>>>>>>> query matches on 132,528 rows, taking
>>>>>>> almost a minute to execute.  vaccuum
>>>>>>> analyze was just run.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Well, if you are matching 130k out of 400k rows, then a sequential
>>>>>> scan
>>>>>> is certainly prefered to an index scan. And then you have to sort
>>>>>> those
>>>>>> 130k rows by partnumber. This *might* be spilling to disk
>>>>>> depending on
>>>>>> what your workmem/sortmem is set to.
>>>>>>
>>>>>> I would also say that what you would really want is some way to
>>>>>> get the
>>>>>> whole thing from an index. And I think the way to do that is:
>>>>>>
>>>>>> CREATE INDEX test_partnum_listid_typeid_idx ON
>>>>>>     test(partnumber, productlistid, typeid);
>>>>>>
>>>>>> VACUUM ANALYZE test;
>>>>>>
>>>>>> EXPLAIN ANALYZE SELECT * FROM test
>>>>>>     WHERE productlistid=3 AND typeid=9
>>>>>>     ORDER BY partnumber, productlistid, typeid
>>>>>>     LIMIT 15
>>>>>> ;
>>>>>>
>>>>>> The trick is that you have to match the order by exactly with the
>>>>>> index,
>>>>>> so the planner realizes it can do an indexed lookup to get the
>>>>>> information.
>>>>>>
>>>>>> You could also just create an index on partnumber, and see how that
>>>>>> affects your original query. I think the planner could use an index
>>>>>> lookup on partnumber to get the ordering correct. But it will have
>>>>>> to do
>>>>>> filtering after the fact based on productlistid and typeid.
>>>>>> With my extended index, I think the planner can be smarter and lookup
>>>>>> all 3 by the index.
>>>>>>
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Clark
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Good luck,
>>>>>> John
>>>>>> =:->
>>>>>>
>>>>>
>>>>> ---------------------------(end of
>>>>> broadcast)---------------------------
>>>>> TIP 2: you can get off all lists at once with the unregister command
>>>>>    (send "unregister YourEmailAddressHere" to
>>>>> majordomo@postgresql.org)
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Your PostgreSQL solutions provider, Command Prompt, Inc.
>>>> 24x7 support - 1.800.492.2240, programming, and consulting
>>>> Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
>>>> http://www.commandprompt.com / http://www.postgresql.org
>>>>
>>>
>>> ---------------------------(end of broadcast)---------------------------
>>> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>>
>>
>>
>> --
>> Your PostgreSQL solutions provider, Command Prompt, Inc.
>> 24x7 support - 1.800.492.2240, programming, and consulting
>> Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
>> http://www.commandprompt.com / http://www.postgresql.org
>>


--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org

pgsql-performance by date:

Previous
From: Clark Slater
Date:
Subject: Re: faster search
Next
From: Tobias Brox
Date:
Subject: Re: Help with rewriting query