Re: Slow plan for MAX/MIN or LIMIT 1? - Mailing list pgsql-performance
From: Sam Wong
Subject: Re: Slow plan for MAX/MIN or LIMIT 1?
Date:
Msg-id: 015601ceba0d$04010540$0c030fc0$@hellosam.net
In response to: Re: Slow plan for MAX/MIN or LIMIT 1? (Merlin Moncure <mmoncure@gmail.com>)
Responses: Re: Slow plan for MAX/MIN or LIMIT 1?
List: pgsql-performance
> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Merlin Moncure
> Sent: Wednesday, September 25, 2013 23:55
> To: Claudio Freire
> Cc: Sam Wong; postgres performance list
> Subject: Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?
>
> On Wed, Sep 25, 2013 at 10:20 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> > On Wed, Sep 25, 2013 at 10:29 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> >> On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> >>> On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong <sam@hellosam.net> wrote:
> >>>> This event_log table has 4 million rows.
> >>>>
> >>>> "log_id" is the primary key (bigint),
> >>>>
> >>>> there is a composite index "event_data_search" over (event::text,
> >>>> insert_time::datetime).
> >>>
> >>> I think you need to add log_id to that composite index to get pg to use it.
> >>
> >> hurk: OP is two statistics misses (one of them massive) that are
> >> combining to gobsmack you.
> >>
> >> your solution unfortunately won't work: you can't combine two range
> >> searches in a single index scan. it would probably work if you did it
> >> like this. If insert_time is a timestamp, not a timestamptz, we can
> >> convert it to date to get what I think he wants (as long as his
> >> queries are along date boundaries).
> >
> > I was thinking an index over:
> >
> > (event, date_trunc('day', insert_time), log_id)
> >
> > And the query like
> >
> > SELECT min(log_id) FROM event_log
> > WHERE event='S-Create' AND
> >   date_trunc('day', insert_time) = '2013-09-15'
> >
> > That's a regular simple range scan over the index.
>
> *) date_trunc has the same problems as ::date: it is a stable expression only
> for timestamptz. also, the index will be bigger since you're still indexing
> the timestamp
>
> *) row-wise comparison search might be faster and is generalized to return N
> records, not just one.
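[The index shape Claudio proposes above can be sketched end to end. The snippet below is a minimal, made-up illustration using SQLite (via Python's standard library) as a stand-in for PostgreSQL: `substr(insert_time, 1, 10)` plays the role of `date_trunc('day', ...)`, and the table contents are invented. With equality on both leading index columns, `min(log_id)` is answerable from a plain range scan of the composite index.]

```python
# SQLite stand-in for the proposed (event, day, log_id) composite index.
# Table and index names follow the thread; the rows are made up.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event_log (
    log_id      INTEGER PRIMARY KEY,   -- bigint PK in the real table
    event       TEXT,
    insert_time TEXT                   -- timestamp without time zone
);
-- composite index with log_id appended, as suggested;
-- substr(...) stands in for date_trunc('day', insert_time)
CREATE INDEX event_day_id
    ON event_log(event, substr(insert_time, 1, 10), log_id);
""")

rows = [
    (1, 'S-Create', '2013-09-14 23:59:00'),
    (2, 'S-Create', '2013-09-15 08:00:00'),
    (3, 'S-Delete', '2013-09-15 09:00:00'),
    (4, 'S-Create', '2013-09-15 10:00:00'),
]
conn.executemany("INSERT INTO event_log VALUES (?, ?, ?)", rows)

# equality on both leading index columns -> a simple range scan of the index
(min_id,) = conn.execute("""
    SELECT min(log_id) FROM event_log
    WHERE event = 'S-Create'
      AND substr(insert_time, 1, 10) = '2013-09-15'
""").fetchone()
print(min_id)  # -> 2
```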
>
> merlin

I'm afraid it's not anything about the composite index. (Query B works fine and its plan is as expected.) BTW, the timestamp is without time zone.

I just thought of another way to demonstrate the issue. Both queries return the same one-row result. log_id is the bigint primary key, and event_data_search is the same index as before.

--- Fast query ---

with Q AS (
  select event
  from event_log
  WHERE log_id>10000 and log_id<50000
  order by event
)
SELECT * FROM Q LIMIT 1

Limit  (cost=2521.82..2521.83 rows=1 width=32) (actual time=88.342..88.342 rows=1 loops=1)
  Output: q.event
  Buffers: shared hit=93 read=622
  CTE q
    ->  Sort  (cost=2502.07..2521.82 rows=39502 width=6) (actual time=88.335..88.335 rows=1 loops=1)
          Output: event_log.event
          Sort Key: event_log.event
          Sort Method: quicksort  Memory: 3486kB
          Buffers: shared hit=93 read=622
          ->  Index Scan using event_log_pkey on uco.event_log  (cost=0.00..1898.89 rows=39502 width=6) (actual time=13.918..65.573 rows=39999 loops=1)
                Output: event_log.event
                Index Cond: ((event_log.log_id > 1010000) AND (event_log.log_id < 1050000))
                Buffers: shared hit=93 read=622
  ->  CTE Scan on q  (cost=0.00..237.01 rows=39502 width=32) (actual time=88.340..88.340 rows=1 loops=1)
        Output: q.event
        Buffers: shared hit=93 read=622
Total runtime: 89.039 ms

--- Slow query ---

Result  (cost=1241.05..1241.05 rows=1 width=0) (actual time=1099.532..1099.533 rows=1 loops=1)
  Output: $0
  Buffers: shared hit=49029 read=57866
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.00..1241.05 rows=1 width=6) (actual time=1099.527..1099.527 rows=1 loops=1)
          Output: uco.event_log.event
          Buffers: shared hit=49029 read=57866
          ->  Index Scan using event_data_search on uco.event_log  (cost=0.00..49024009.79 rows=39502 width=6) (actual time=1099.523..1099.523 rows=1 loops=1)
                Output: uco.event_log.event
                Index Cond: (uco.event_log.event IS NOT NULL)
                Filter: ((uco.event_log.log_id > 1010000) AND (uco.event_log.log_id < 1050000))
                Rows Removed by Filter: 303884
                Buffers: shared hit=49029 read=57866
Total runtime: 1099.568 ms

(Note: things got cached here, so the time is down to about 1 second, but comparing the buffer counts with the query above, this is still a few orders of magnitude slower.)

---

The CTE "fast query" works: it completes with an index scan over "event_log_pkey", which is what I expected, and that is good. But it is much more straightforward to write it as the "slow query", and I expected the planner to give the same optimal plan for both.

The "slow query" plan has an estimated total cost of 1241.05 and the "fast query" 2521.83, hence the planner prefers the former - the scan over the "event_data_search" index - but this choice doesn't make sense to me. This is the point I want to bring up: why would the planner do that, instead of scanning over "event_log_pkey"?

Looking into the estimated cost of the slow-query plan, the Index Scan node's lower-bound (startup) estimate is 0.00. IIRC, that says the planner estimates the first result row could be returned at cost 0.00, but actually it has to scan almost the whole index, and check the table, to see whether each log_id is within the condition range; there is no way the first row could ever be returned at 0.00. (event is a text column with a cardinality of around 20, and its values appear randomly all over the table - 0 correlation.)

Hence this questionable estimate bubbles up the tree: the Limit node thought it could finish within 1241.05 (once the first row is obtained), and so the whole plan was thought to be finishable within 1241.05 - which is not the case.

Sam
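[Sam's reading of the Limit node can be checked with a little arithmetic. PostgreSQL costs a LIMIT by interpolating between the child node's startup and total cost in proportion to the fraction of rows it expects to fetch; plugging in the numbers from the slow plan reproduces the 1241.05 figure. This is a sketch of the cost model, not the actual planner source:]

```python
# Hedged sketch of PostgreSQL's LIMIT costing: the Limit node assumes its
# child yields rows at a uniform rate, so fetching k of the estimated rows
# costs startup + (total - startup) * k / rows_estimated.
startup = 0.00           # Index Scan startup cost from the slow plan
total = 49024009.79      # Index Scan total cost from the slow plan
rows_estimated = 39502   # planner's row estimate for the scan
k = 1                    # LIMIT 1

limit_cost = startup + (total - startup) * k / rows_estimated
print(round(limit_cost, 2))  # -> 1241.05, the Limit node's cost in the plan
```

[Because the scan's startup cost is 0.00 and the model assumes matching rows are spread evenly through the index, it concludes the first match is nearly free - exactly the assumption Sam is questioning, since in reality 303884 rows were filtered out before the first match appeared.]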