Re: Optimising a query - Mailing list pgsql-performance
From | Richard Huxton |
---|---|
Subject | Re: Optimising a query |
Date | |
Msg-id | 4768DAC0.4010304@archonet.com Whole thread Raw |
In response to | Re: Optimising a query (Paul Lambert <paul.lambert@reynolds.com.au>) |
Responses |
Re: Optimising a query
Re: Optimising a query |
List | pgsql-performance |
Paul Lambert wrote: > Paul Lambert wrote: >> <snip> > > > This part of the query alone takes a significant part of the time: > > SELECT DISTINCT ON (finbalance.dealer_id, finbalance.year_id, > finbalance.subledger_id, finbalance.account_id) > finbalance.year_id AS year, > finbalance.dealer_id AS dealer_id, > lpad(finbalance.subledger_id::text,4,'0') AS subledger, > lpad(finbalance.account_id::text,4,'0') AS account > FROM finbalance > > Runs with a query plan of : > > "Unique (cost=30197.98..32782.33 rows=20675 width=16) (actual > time=5949.695..7197.475 rows=17227 loops=1)" > " -> Sort (cost=30197.98..30714.85 rows=206748 width=16) (actual > time=5949.691..7018.931 rows=206748 loops=1)" > " Sort Key: dealer_id, year_id, subledger_id, account_id" > " Sort Method: external merge Disk: 8880kB" > " -> Seq Scan on finbalance (cost=0.00..8409.70 rows=206748 > width=16) (actual time=0.042..617.949 rows=206748 loops=1)" > "Total runtime: 7210.966 ms" > > > So basically selecting from the finbalance table (approx. 206,000 > records) takes 10 seconds, even longer without the distinct clause in > there - the distinct collapses the result-set down to around 17,000 rows. Well, if you need to summarise all the rows then that plan is as good as any. If you run this query very frequently, you'll probably want to look into keeping a summary table updated via triggers. Before that though, try issuing a "SET work_mem = '9MB'" before running your query. If that doesn't change the plan step up gradually. You should be able to get the sort stage to happen in RAM rather than on disk (see "Sort Method" above). Don't go overboard though, your big query will probably use multiples of that value. > Taking out the two lpad's in there knocks off about 1500ms, so I can > come up with something else for them - but I'd like to get the query as > a whole down to under a second. Stick the lpads in a query that wraps your DISTINCT query. > dealer_id, year_id, subledger_id and account_id are all part of the > primary key on the finbalance table, so I don't think I can index them > down any further. A CLUSTER <pkey-index> ON <table> might help, but it will degrade as you update the finbalance table. -- Richard Huxton Archonet Ltd
pgsql-performance by date: