Re: Slow plan for MAX/MIN or LIMIT 1? - Mailing list pgsql-performance

From Sam Wong
Subject Re: Slow plan for MAX/MIN or LIMIT 1?
Date
Msg-id 015601ceba0d$04010540$0c030fc0$@hellosam.net
Whole thread Raw
In response to Re: Slow plan for MAX/MIN or LIMIT 1?  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: Slow plan for MAX/MIN or LIMIT 1?
List pgsql-performance
> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Merlin
> Moncure
> Sent: Wednesday, September 25, 2013 23:55
> To: Claudio Freire
> Cc: Sam Wong; postgres performance list
> Subject: Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?
>
> On Wed, Sep 25, 2013 at 10:20 AM, Claudio Freire <klaussfreire@gmail.com>
> wrote:
> > On Wed, Sep 25, 2013 at 10:29 AM, Merlin Moncure <mmoncure@gmail.com>
> wrote:
> >> On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire
<klaussfreire@gmail.com>
> wrote:
> >>> On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong <sam@hellosam.net> wrote:
> >>>> This event_log table has 4 million rows.
> >>>>
> >>>> "log_id" is the primary key (bigint),
> >>>>
> >>>> there is a composite index "event_data_search" over (event::text,
> >>>> insert_time::datetime).
> >>>
> >>>
> >>> I think you need to add log_id to that composite index to get pg to
use it.
> >>
> >> hurk: OP is two statistics misses (one of them massive that are
> >> combing to gobsmack you).
> >>
> >> your solution unfortuantely wont work: you can't combine two range
> >> searches in a single index scan.  it would probably work if you it
> >> like this.  If insert_time is a timestamp, not a timestamptz, we can
> >> convert it to date to get what I think he wants (as long as his
> >> queries are along date boundaries).
> >
> >
> > I was thinking an index over:
> >
> > (event, date_trunc('day', insert_time), log_id)
> >
> > And the query like
> >
> > SELECT min(log_id) FROM event_log
> > WHERE event='S-Create' AND
> > date_trunc('day',insert_time) = '2013-09-15'
> >
> >
> > That's a regular simple range scan over the index.
>
> *) date_trunc has same problems as ::date: it is stable expression only
for
> timestamptz.  also, the index will be bigger since you're still indexing
> timestamp
>
> *) row wise comparison search might be faster and is generalized to return
N
> records, not jut one.
>
> merlin
I'm afraid it's not anything about composite index. (Because the query B
works fine and the plan is as expected)
BTW, the timestamp is without timezone.

I just thought of another way that demonstrate the issue.
Both query returns the same one row result. log_id is the bigint primary
key, event_data_search is still that indexed.
---
Fast query
---
with Q AS (
select event
from event_log
WHERE log_id>10000 and log_id<50000
order by event
)
SELECT * FROM Q LIMIT 1

Limit  (cost=2521.82..2521.83 rows=1 width=32) (actual time=88.342..88.342
rows=1 loops=1)
  Output: q.event
  Buffers: shared hit=93 read=622
  CTE q
    ->  Sort  (cost=2502.07..2521.82 rows=39502 width=6) (actual
time=88.335..88.335 rows=1 loops=1)
          Output: event_log.event
          Sort Key: event_log.event
          Sort Method: quicksort  Memory: 3486kB
          Buffers: shared hit=93 read=622
          ->  Index Scan using event_log_pkey on uco.event_log
(cost=0.00..1898.89 rows=39502 width=6) (actual time=13.918..65.573
rows=39999 loops=1)
                Output: event_log.event
                Index Cond: ((event_log.log_id > 1010000) AND
(event_log.log_id < 1050000))
                Buffers: shared hit=93 read=622
  ->  CTE Scan on q  (cost=0.00..237.01 rows=39502 width=32) (actual
time=88.340..88.340 rows=1 loops=1)
        Output: q.event
        Buffers: shared hit=93 read=622
Total runtime: 89.039 ms

---
Slow Query
---
Result  (cost=1241.05..1241.05 rows=1 width=0) (actual
time=1099.532..1099.533 rows=1 loops=1)
  Output: $0
  Buffers: shared hit=49029 read=57866
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.00..1241.05 rows=1 width=6) (actual
time=1099.527..1099.527 rows=1 loops=1)
          Output: uco.event_log.event
          Buffers: shared hit=49029 read=57866
          ->  Index Scan using event_data_search on uco.event_log
(cost=0.00..49024009.79 rows=39502 width=6) (actual time=1099.523..1099.523
rows=1 loops=1)
                Output: uco.event_log.event
                Index Cond: (uco.event_log.event IS NOT NULL)
                Filter: ((uco.event_log.log_id > 1010000) AND
(uco.event_log.log_id < 1050000))
                Rows Removed by Filter: 303884
                Buffers: shared hit=49029 read=57866
Total runtime: 1099.568 ms
(Note: Things got buffered so it goes down to 1 second, but comparing to the
buffer count with the query above this is a few orders slower than that)

---
The CTE "fast query" works, it is completed with index scan over
"event_log_pkey", which is what I am expecting, and it is good.
But it's much straight forward to write it as the "slow query", I am
expecting the planner would give the same optimum plan for both.

For the plan of "Slow query" has an estimated total cost of 1241.05, and the
"Fast query" has 2521.83,
hence the planner prefers that plan - the scanning over the
"event_data_search" index plan - but this choice doesn't make sense to me.

This is my point I want to bring up, why the planner would do that? Instead
of scanning over the "event_log_pkey"?

Looking into the estimated cost of the Slow Query Plan, the Index Scan node
lower bound estimation is 0.00. IIRC, it is saying the planner estimated
that the first result row could be returned at 0.00.
but actually it has to do a scan almost the whole index and table scan to
check if the log_id are within the condition range, there is no way that the
first row could ever be returned at 0.00.
(The event is a text column with cardinality at around 20, and the order is
appearing randomly all over the table - 0 correlation)
Hence this questionable estimation bubble up along the tree, the Limit node
thought that it could finish within 1241.05 (once the first row is
obtained), and the whole plan is thought to be finisable within 1241.05 -
which is not the case.

Sam



pgsql-performance by date:

Previous
From: bricklen
Date:
Subject: Re: Troubleshooting query performance issues
Next
From: "Sam Wong"
Date:
Subject: Re: Slow plan for MAX/MIN or LIMIT 1?