Re: Help with rewriting query - Mailing list pgsql-performance
From | Bruno Wolff III |
---|---|
Subject | Re: Help with rewriting query |
Date | |
Msg-id | 20050610120638.GA1365@wolff.to Whole thread Raw |
In response to | Re: Help with rewriting query (Junaili Lie <junaili@gmail.com>) |
List | pgsql-performance |
On Thu, Jun 09, 2005 at 18:26:09 -0700, Junaili Lie <junaili@gmail.com> wrote: > Hi Bruno, > I followed your suggestion. > The query plan shows that it uses the index (id, person_id). However, > the execution time is still slow. I have to do ctl-C to stop it. > Maybe something is wrong with my postgresql config. > It's running Solaris on dual Opteron, 4GB. > I allocated around 128MB for sorting and more than 80% for > effective_cache_size and shared_buffers = 32768. > Any further ideas is much appreciated. It might be useful to see that plan and the actual query you used. There were only 569 entries in the people table, so I find it hard to believe that an index look up per person is taking so long that you need to cancel the query. > > > > > On 6/8/05, Bruno Wolff III <bruno@wolff.to> wrote: > > On Wed, Jun 08, 2005 at 15:48:27 -0700, > > Junaili Lie <junaili@gmail.com> wrote: > > > Hi, > > > The suggested query below took forever when I tried it. > > > In addition, as suggested by Tobias, I also tried to create index on > > > food(p_id, id), but still no goal (same query plan). > > > Here is the explain: > > > TEST1=# explain select f.p_id, max(f.id) from Food f, Person p where > > > (f.p_id = p.id) group by p.id; > > > > The above is going to require reading all the food table (assuming no > > orphaned records), so the plan below seems reasonable. > > > > > QUERY PLAN > > > ---------------------------------------------------------------------------------------------------------------- > > > GroupAggregate (cost=0.00..214585.51 rows=569 width=16) > > > -> Merge Join (cost=0.00..200163.50 rows=2884117 width=16) > > > Merge Cond: ("outer".id = "inner".p_id) > > > -> Index Scan using person_pkey on person p > > > (cost=0.00..25.17 rows=569 width=8) > > > -> Index Scan using person_id_food_index on food f > > > (cost=0.00..164085.54 rows=2884117 width=16) > > > (5 rows) > > > > > > > > > > > > > > > TEST1=# explain select p.id, (Select f.id from food f where > > > f.p_id=p.id order by f.id desc limit 1) from person p; > > > > Using a subselect seems to be the best hope of getting better performance. > > I think you almost got it right, but in order to use the index on > > (p_id, id) you need to order by f.p_id desc, f.id desc. Postgres won't > > deduce this index can be used because f.p_id is constant in the subselect, > > you need to give it some help. > > > > > QUERY PLAN > > > ----------------------------------------------------------------------------------------------------------- > > > Seq Scan on Person p (cost=100000000.00..100007015.24 rows=569 width=8) > > > SubPlan > > > -> Limit (cost=0.00..12.31 rows=1 width=8) > > > -> Index Scan Backward using food_pkey on food f > > > (cost=0.00..111261.90 rows=9042 width=8) > > > Filter: (p_id = $0) > > > (5 rows) > > > > > > any ideas or suggestions is appreciate. > > > > > > > > > On 6/8/05, Tobias Brox <tobias@nordicbet.com> wrote: > > > > [Junaili Lie - Wed at 12:34:32PM -0700] > > > > > select f.p_id, max(f.id) from person p, food f where p.id=f.p_id group > > > > > by f.p_id will work. > > > > > But I understand this is not the most efficient way. Is there another > > > > > way to rewrite this query? (maybe one that involves order by desc > > > > > limit 1) > > > > > > > > eventually, try something like > > > > > > > > select p.id,(select f.id from food f where f.p_id=p.id order by f.id desc limit 1) > > > > from person p > > > > > > > > not tested, no warranties. > > > > > > > > Since subqueries can be inefficient, use "explain analyze" to see which one > > > > is actually better. > > > > > > > > This issue will be solved in future versions of postgresql. > > > > > > > > -- > > > > Tobias Brox, +47-91700050 > > > > Tallinn > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 9: the planner will ignore your desire to choose an index scan if your > > > joining column's datatypes do not match > >
pgsql-performance by date: