Re: Performance of query - Mailing list pgsql-performance
From | Roman Konoval |
---|---|
Subject | Re: Performance of query |
Date | |
Msg-id | CABcZEEBDCaMmPWftiNZa1d6n4=Q85Ai3mNQ2VAQhBG2YQCQNoA@mail.gmail.com Whole thread Raw |
In response to | Re: Performance of query (Misa Simic <misa.simic@gmail.com>) |
List | pgsql-performance |
I assume there are reasons not to throw away join to state. May be it still can be done as the last thing. This should help further:
SELECT counts.* FROM (
SELECT busbase.state AS state, count(busbase.id) AS m0 FROM busbase
GROUP BY busbase.state ) AS counts
INNER JOIN state USING (state)
Regards,
Roman Konoval
On Sun, Mar 24, 2013 at 12:27 AM, Misa Simic <misa.simic@gmail.com> wrote:
Hi Jeff,It seems my previous mail has not showed up in the list... copied/pasted again belloewHowever, you said something important:"The join to the "state" table is not necessary. Between the foreign key and the primary key, you know that every state exists, and that every state exists only once. But, that will not solve your problem, as the join to the state table is not where the time goes."I think it is something what planner could/should be "aware off"... and discard the join" Merge Join (cost=0.00..2310285.02 rows=60057056 width=3) (actual time=38.424..41992.070 rows=60057057 loops=1)"" Merge Cond: (state.state = busbase.state)"this part from bellow plan would save significant time if planner didn't decide to take this step at all ....Kind regards,Misa"Hi CindyTBH - I don't know...I have added this to list so maybe someone else can help...To recap:from start situation (table structure and indexes are in the first mail in this thread)EXPLAIN ANALYZESELECT busbase.state AS c0, count(busbase.id) AS m0 FROM busbase INNER JOIN state USING (state)GROUP BY busbase.statesays:"HashAggregate (cost=7416975.58..7416976.09 rows=51 width=7) (actual time=285339.465..285339.473 rows=51 loops=1)"" -> Hash Join (cost=2.15..7139961.94 rows=55402728 width=7) (actual time=0.066..269527.934 rows=60057057 loops=1)"" Hash Cond: (busbase.state = state.state)"" -> Seq Scan on busbase (cost=0.00..6378172.28 rows=55402728 width=7) (actual time=0.022..251029.307 rows=60057057 loops=1)"" -> Hash (cost=1.51..1.51 rows=51 width=3) (actual time=0.028..0.028 rows=51 loops=1)"" Buckets: 1024 Batches: 1 Memory Usage: 2kB"" -> Seq Scan on state (cost=0.00..1.51 rows=51 width=3) (actual time=0.003..0.019 rows=51 loops=1)""Total runtime: 285339.516 ms"on created composite indexCREATE INDEX comp_statidx2ON busbaseUSING btree(state, id );we got:"GroupAggregate (cost=0.00..2610570.81 rows=51 width=3) (actual time=98.923..51033.888 rows=51 loops=1)"" -> Merge Join (cost=0.00..2310285.02 rows=60057056 width=3) (actual time=38.424..41992.070 rows=60057057 loops=1)"" Merge Cond: (state.state = busbase.state)"" -> Index Only Scan using state_pkey on state (cost=0.00..13.02 rows=51 width=3) (actual time=0.008..0.148 rows=51 loops=1)"" Heap Fetches: 51"" -> Index Only Scan using comp_statidx2 on busbase (cost=0.00..1559558.68 rows=60057056 width=3) (actual time=38.408..12883.575 rows=60057057 loops=1)"" Heap Fetches: 0""Total runtime: 51045.648 ms"Question is - is it possible to improve it more?"
pgsql-performance by date: