Re: Changing ORDER BY column slows query dramatically - Mailing list pgsql-performance
From | brick pglists |
---|---|
Subject | Re: Changing ORDER BY column slows query dramatically |
Date | |
Msg-id | CALXym+K1QmKPnvBRs+c2M7J+VPTAdpobDoTrqibG6+9mpW8oaA@mail.gmail.com Whole thread Raw |
In response to | Re: Changing ORDER BY column slows query dramatically (Shaun Thomas <sthomas@optionshouse.com>) |
List | pgsql-performance |
Hi Shaun, On Fri, Apr 12, 2013 at 12:59 PM, Shaun Thomas <sthomas@optionshouse.com> wrote: > On 04/12/2013 11:51 AM, brick pglists wrote: > > Since it doesn't show up in your GUC list, you should probably increase your > default_statistics_target to 400 or more, analyze, and try again. The > heuristics for the dates aren't complete enough, so it thinks there are few > matches. If that doesn't work and you want a quick, but ugly fix for this, > you can create the following index: > > CREATE INDEX event_20130406_id_desc_tstamp_utc_idx > ON event_20130406 (id DESC, tstamp_utc); Thanks for your suggestions. Bumping up the default_statistics_target several times all the way to 4000 (ANALYZEd each time) did not help, however, adding the index you suggested helped with that query. It is still over a magnitude slower than the version that sorts by tstamp_utc, but it's a start. I created a similar index (CREATE INDEX event_20130406_id_desc_tstamp_utc_desc_idx ON event_20130406 (id DESC, tstamp_utc DESC)) where both columns were sorted DESCm and given the choice between those two, it chose the latter. Setting enable_mergejoin to false results in a plan much closer to the original fast one, and further changing cpu_tuple_cost up to 1 results in a query about 3x slower than the original fast one. The ORDER BY e.id query, with the new index, enable_mergejoin disabled, and cpu_tuple_cost bumped up to 1: Limit (cost=125386.16..126640.02 rows=100 width=42) (actual time=220.807..221.864 rows=100 loops=1) Buffers: shared hit=49171 read=6770 I/O Timings: read=44.980 -> Nested Loop (cost=0.00..7734858.92 rows=616883 width=42) (actual time=110.718..213.923 rows=10100 loops=1) Buffers: shared hit=49171 read=6770 I/O Timings: read=44.980 -> Index Scan using event_20130406_id_desc_tstamp_utc_desc_idx on event_20130406 e (cost=0.00..2503426.81 rows=1851068 width=34) (actual time=110.690..139.001 rows=10100 loops=1) Index Cond: ((tstamp_utc >= '2013-04-06 10:00:00'::timestamp without time zone) AND (tstamp_utc <= '2013-04-06 18:00:00'::timestamp without time zone)) Filter: ((date_utc = '2013-04-06'::date) AND (org_id = 216471)) Rows Removed by Filter: 1554 Buffers: shared hit=8647 read=6770 I/O Timings: read=44.980 -> Index Scan using notification_counts_by_status_20130406_event_id_org_id_pk on notification_counts_by_status_20130406 ncbs (cost=0.00..1.83 rows=1 width=16) (actual time=0.003..0.004 rows=1 loops=10100) Index Cond: (event_id = e.id) Filter: ((event_creation_tstamp_utc >= '2013-04-06 10:00:00'::timestamp without time zone) AND (event_creation_tstamp_utc <= '2013-04-06 18:00:00'::timestamp without time zone) AND (status = 'DELIVERED'::text)) Buffers: shared hit=40524 Total runtime: 222.127 ms (17 rows) Still not at the ~90ms from the "ORDER BY e.tstamp_utc DESC" version, but not too bad. Now I need to figure out how I can get the best plan choice without monkeying around with enable_mergejoin and changing cpu_tuple_cost too much. If any more suggestions are forthcoming, I am all ears!
pgsql-performance by date: