Thread: Slow plan for MAX/MIN or LIMIT 1?

Slow plan for MAX/MIN or LIMIT 1?

From
"Sam Wong"
Date:

Hi There,

 

I have hit a query plan issue that I believe is a bug or an under-estimation, and would like to know whether it is known and whether there is any workaround…

 

This event_log table has 4 million rows.

“log_id” is the primary key (bigint),

there is a composite index “event_data_search” over (event::text, insert_time::datetime).

 

Query A:

SELECT min(log_id) FROM event_log

WHERE event='S-Create' AND

insert_time>'2013-09-15' and insert_time<'2013-09-16'

 

Query B:

SELECT log_id FROM event_log

WHERE event='S-Create' AND

insert_time>'2013-09-15' and insert_time<'2013-09-16'

ORDER BY log_id

 

What I want to achieve is Query A – get the min log_id within a range. But it is very slow, taking 10 or 20 seconds.

If I don’t limit the output to LIMIT 1 – like Query B – then it is sub-second fast.

 

Explain of A – takes 10~20 seconds to run

Limit  (cost=0.00..132.54 rows=1 width=8)

  ->  Index Scan using event_log_pkey on event_log  (cost=0.00..1503484.33 rows=11344 width=8)

        Filter: ((insert_time > '2013-09-15 00:00:00'::timestamp without time zone) AND (insert_time < '2013-09-16 00:00:00'::timestamp without time zone) AND (event = 'S-Create'::text))

 

Explain of B – takes a few milliseconds to run

Sort  (cost=41015.85..41021.52 rows=11344 width=8)

  Sort Key: log_id

  ->  Bitmap Heap Scan on event_log  (cost=311.42..40863.05 rows=11344 width=8)

        Recheck Cond: ((event = 'S-Create'::text) AND (insert_time > '2013-09-15 00:00:00'::timestamp without time zone) AND (insert_time < '2013-09-16 00:00:00'::timestamp without time zone))

        ->  Bitmap Index Scan on event_data_search  (cost=0.00..310.86 rows=11344 width=0)

              Index Cond: ((event = 'S-Create'::text) AND (insert_time > '2013-09-15 00:00:00'::timestamp without time zone) AND (insert_time < '2013-09-16 00:00:00'::timestamp without time zone))

 

The plan for A assumed that the index scan node would return its first row right at cost 0.00, and hence that the limit node could get all the rows it needs within 132.54 (because event_log_pkey is sorted).

I would like to point out that this 0.00 estimate omits the fact that it actually takes much longer for the index scan node to return its first row, because the ~3.99M rows it comes across first won't satisfy the filter condition at all.
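One workaround sketch (untested here; the "+ 0" trick is a generic planner nudge, not something verified against this table): order by an expression, so the planner cannot use event_log_pkey to satisfy the ORDER BY and falls back to a plan like Query B's plus a cheap sort:

```sql
-- Hypothetical workaround: "log_id + 0" hides the ordering from the
-- planner, so it cannot pick the event_log_pkey scan for LIMIT 1.
SELECT log_id
FROM event_log
WHERE event = 'S-Create'
  AND insert_time > '2013-09-15'
  AND insert_time < '2013-09-16'
ORDER BY log_id + 0
LIMIT 1;
```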

 

Other background info:

The event_log table has been vacuumed and analyzed.

I'm running PostgreSQL 9.2.4 (x64) on Windows Server 2008 R2, with 8GB RAM and one Xeon E5606.

 

Thanks,

Sam

Re: Slow plan for MAX/MIN or LIMIT 1?

From
Merlin Moncure
Date:
On Tue, Sep 24, 2013 at 4:24 AM, Sam Wong <sam@hellosam.net> wrote:
> Hi There,
>
>
>
> I have hit a query plan issue that I believe is a bug or under-estimation,
> and would like to know if there it is known or if there is any workaround…
>
>
>
> This event_log table has 4 million rows.
>
> “log_id” is the primary key (bigint),
>
> there is a composite index “event_data_search” over (event::text,
> insert_time::datetime).
>
>
>
> Query A:
>
> SELECT min(log_id) FROM event_log
>
> WHERE event='S-Create' AND
>
> insert_time>'2013-09-15' and insert_time<'2013-09-16'
>
>
>
> Query B:
>
> SELECT log_id FROM event_log
>
> WHERE event='S-Create' AND
>
> insert_time>'2013-09-15' and insert_time<'2013-09-16'
>
> ORDER BY log_id
>
>
>
> What I want to achieve is Query A – get the min log_id within a range. But
> it is very slow, taking 10 or 20 seconds.
>
> If I don’t limit the output to LIMIT 1 – like Query B – then it is
> sub-second fast.
>
>
>
> Explain of A – take 10~20 seconds to run
>
> Limit  (cost=0.00..132.54 rows=1 width=8)
>
>   ->  Index Scan using event_log_pkey on event_log  (cost=0.00..1503484.33
> rows=11344 width=8)
>
>         Filter: ((insert_time > '2013-09-15 00:00:00'::timestamp without
> time zone) AND (insert_time < '2013-09-16 00:00:00'::timestamp without time
> zone) AND (event = 'S-Create'::text))
>
>
>
> Explain of B – take a few milliseconds to run
>
> Sort  (cost=41015.85..41021.52 rows=11344 width=8)
>
>   Sort Key: log_id
>
>   ->  Bitmap Heap Scan on event_log  (cost=311.42..40863.05 rows=11344
> width=8)
>
>         Recheck Cond: ((event = 'S-Create'::text) AND (insert_time >
> '2013-09-15 00:00:00'::timestamp without time zone) AND (insert_time <
> '2013-09-16 00:00:00'::timestamp without time zone))
>
>         ->  Bitmap Index Scan on event_data_search  (cost=0.00..310.86
> rows=11344 width=0)
>
>               Index Cond: ((event = 'S-Create'::text) AND (insert_time >
> '2013-09-15 00:00:00'::timestamp without time zone) AND (insert_time <
> '2013-09-16 00:00:00'::timestamp without time zone))
>
>
>
> Plan of A thought that the index scan node will get the first row right at
> 0.00, and hence the limit node will get all the rows needed within 132.54
> (because event_log_pkey are sorted)
>
> I would like to point out that – this 0.00 estimation omits the fact that it
> actually takes a much longer time for the index scan node to get the first
> row, because 3.99M rows it comes across won’t meet the condition filter at
> all.

I think you got A and B mixed up there.  Can you post EXPLAIN ANALYZE
(not just EXPLAIN) of the slow plan?

merlin


Re: Slow plan for MAX/MIN or LIMIT 1?

From
Claudio Freire
Date:
On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong <sam@hellosam.net> wrote:
> This event_log table has 4 million rows.
>
> “log_id” is the primary key (bigint),
>
> there is a composite index “event_data_search” over (event::text,
> insert_time::datetime).


I think you need to add log_id to that composite index to get pg to use it.
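A sketch of that suggestion (the index name here is invented; the column list is taken from the table description above):

```sql
-- Append log_id as a trailing column to the existing composite index,
-- hoping the planner can return log_id from the index within the range.
CREATE INDEX event_data_search_log_id
    ON event_log (event, insert_time, log_id);
```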


Re: Slow plan for MAX/MIN or LIMIT 1?

From
"Sam Wong"
Date:
Hi All and Merlin,

So here is the explain analyze output.

------
Query A -- single row output, but very slow query
------
SELECT min(log_id) FROM event_log
WHERE event='S-Create' AND
insert_time>'2013-09-15' and insert_time<'2013-09-16'

http://explain.depesz.com/s/3H5
Result  (cost=134.48..134.49 rows=1 width=0) (actual
time=348370.719..348370.720 rows=1 loops=1)
  Output: $0
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.00..134.48 rows=1 width=8) (actual
time=348370.712..348370.713 rows=1 loops=1)
          Output: uco.event_log.log_id
          ->  Index Scan using event_log_pkey on uco.event_log
(cost=0.00..1525564.02 rows=11344 width=8) (actual
time=348370.709..348370.709 rows=1 loops=1)
                Output: uco.event_log.log_id
                Index Cond: (uco.event_log.log_id IS NOT NULL)
                Filter: ((uco.event_log.insert_time > '2013-09-15
00:00:00'::timestamp without time zone) AND (uco.event_log.insert_time <
'2013-09-16 00:00:00'::timestamp without time zone) AND (uco.event_log.event
= 'S-Create'::text))
                Rows Removed by Filter: 43249789
Total runtime: 348370.762 ms

------
Query B -- multiple row output, fast query, but I could get what I want from
the first output row
------
SELECT log_id FROM event_log
WHERE event='S-Create' AND
insert_time>'2013-09-15' and insert_time<'2013-09-16'
ORDER BY log_id

http://explain.depesz.com/s/s6P
Sort  (cost=41015.85..41021.52 rows=11344 width=8) (actual
time=3651.695..3652.160 rows=6948 loops=1)
  Output: log_id
  Sort Key: event_log.log_id
  Sort Method: quicksort  Memory: 518kB
  ->  Bitmap Heap Scan on uco.event_log  (cost=311.42..40863.05 rows=11344
width=8) (actual time=448.349..3645.465 rows=6948 loops=1)
        Output: log_id
        Recheck Cond: ((event_log.event = 'S-Create'::text) AND
(event_log.insert_time > '2013-09-15 00:00:00'::timestamp without time zone)
AND (event_log.insert_time < '2013-09-16 00:00:00'::timestamp without time
zone))
        ->  Bitmap Index Scan on event_data_search  (cost=0.00..310.86
rows=11344 width=0) (actual time=447.670..447.670 rows=6948 loops=1)
              Index Cond: ((event_log.event = 'S-Create'::text) AND
(event_log.insert_time > '2013-09-15 00:00:00'::timestamp without time zone)
AND (event_log.insert_time < '2013-09-16 00:00:00'::timestamp without time
zone))
Total runtime: 3652.535 ms

P.S. If I put a LIMIT 1 at the end of this query, it gets a plan identical
to Query A's.

------
My observation:
In Query A, the lower-bound (startup) estimate of the Index Scan node is way
off. It won't return its first row right at 0.00, because the filter has to
be applied first.

Thanks,
Sam



Re: Slow plan for MAX/MIN or LIMIT 1?

From
Merlin Moncure
Date:
On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong <sam@hellosam.net> wrote:
>> This event_log table has 4 million rows.
>>
>> “log_id” is the primary key (bigint),
>>
>> there is a composite index “event_data_search” over (event::text,
>> insert_time::datetime).
>
>
> I think you need to add log_id to that composite index to get pg to use it.

hurk: the OP is hitting two statistics misses (one of them massive) that
are combining to gobsmack you.

your solution unfortunately won't work: you can't combine two range
searches in a single index scan.  It would probably work if you write it
like this.  If insert_time is a timestamp, not a timestamptz, we can
convert it to date to get what I think he wants (as long as his
queries are along date boundaries).

how about:
CREATE INDEX ON event_log (event, (insert_time::date), log_id);

EXPLAIN ANALYZE
  SELECT * FROM event_log
  WHERE
    (event, insert_time::date, log_id) >= ('S-Create', '2013-09-15'::date, 0)
    AND event = 'S-Create' AND insert_time::date < '2013-09-16'::date
  ORDER BY
    event, insert_time::date, log_id
  LIMIT 1;

if insert_time is a timestamptz, we can materialize the date into the
table to get around that (timestamptz->date is only a stable expression,
so it can't be used in an index directly).  If date boundary handling is
awkward, our best bet is probably to hack the planner with a CTE.  Note
the above query will smoke the CTE-based one.

WITH data AS
(
  SELECT log_id FROM event_log
  WHERE event='S-Create' AND
  insert_time>'2013-09-15' and insert_time<'2013-09-16'
)
SELECT * from data ORDER BY log_id LIMIT 1;

merlin


Re: Slow plan for MAX/MIN or LIMIT 1?

From
Claudio Freire
Date:
On Wed, Sep 25, 2013 at 10:29 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
>> On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong <sam@hellosam.net> wrote:
>>> This event_log table has 4 million rows.
>>>
>>> “log_id” is the primary key (bigint),
>>>
>>> there is a composite index “event_data_search” over (event::text,
>>> insert_time::datetime).
>>
>>
>> I think you need to add log_id to that composite index to get pg to use it.
>
> hurk: OP is two statistics misses (one of them massive that are
> combing to gobsmack you).
>
> your solution unfortuantely wont work: you can't combine two range
> searches in a single index scan.  it would probably work if you it
> like this.  If insert_time is a timestamp, not a timestamptz, we can
> convert it to date to get what I think he wants (as long as his
> queries are along date boundaries).


I was thinking an index over:

(event, date_trunc('day', insert_time), log_id)

And the query like

SELECT min(log_id) FROM event_log
WHERE event='S-Create' AND
date_trunc('day',insert_time) = '2013-09-15'


That's a regular simple range scan over the index.
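Spelled out (a sketch; the index name is invented, and this assumes insert_time is a plain timestamp without time zone, for which date_trunc is immutable and therefore indexable):

```sql
CREATE INDEX event_day_log_id
    ON event_log (event, date_trunc('day', insert_time), log_id);

-- Equality on the second column turns the search into a plain range
-- scan over the index, with log_id available in index order.
SELECT min(log_id)
FROM event_log
WHERE event = 'S-Create'
  AND date_trunc('day', insert_time) = '2013-09-15';
```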


Re: Slow plan for MAX/MIN or LIMIT 1?

From
Merlin Moncure
Date:
On Wed, Sep 25, 2013 at 10:20 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Wed, Sep 25, 2013 at 10:29 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
>>> On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong <sam@hellosam.net> wrote:
>>>> This event_log table has 4 million rows.
>>>>
>>>> “log_id” is the primary key (bigint),
>>>>
>>>> there is a composite index “event_data_search” over (event::text,
>>>> insert_time::datetime).
>>>
>>>
>>> I think you need to add log_id to that composite index to get pg to use it.
>>
>> hurk: OP is two statistics misses (one of them massive that are
>> combing to gobsmack you).
>>
>> your solution unfortuantely wont work: you can't combine two range
>> searches in a single index scan.  it would probably work if you it
>> like this.  If insert_time is a timestamp, not a timestamptz, we can
>> convert it to date to get what I think he wants (as long as his
>> queries are along date boundaries).
>
>
> I was thinking an index over:
>
> (event, date_trunc('day', insert_time), log_id)
>
> And the query like
>
> SELECT min(log_id) FROM event_log
> WHERE event='S-Create' AND
> date_trunc('day',insert_time) = '2013-09-15'
>
>
> That's a regular simple range scan over the index.

*) date_trunc has the same problem as ::date: it is only a stable (not
immutable) expression for timestamptz.  Also, the index will be bigger
since you're still indexing the full timestamp.

*) a row-wise comparison search might be faster, and it generalizes to
returning N records, not just one.

merlin


Re: Slow plan for MAX/MIN or LIMIT 1?

From
"Sam Wong"
Date:
> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Merlin
> Moncure
> Sent: Wednesday, September 25, 2013 23:55
> To: Claudio Freire
> Cc: Sam Wong; postgres performance list
> Subject: Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?
>
> On Wed, Sep 25, 2013 at 10:20 AM, Claudio Freire <klaussfreire@gmail.com>
> wrote:
> > On Wed, Sep 25, 2013 at 10:29 AM, Merlin Moncure <mmoncure@gmail.com>
> wrote:
> >> On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> >>> On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong <sam@hellosam.net> wrote:
> >>>> This event_log table has 4 million rows.
> >>>>
> >>>> "log_id" is the primary key (bigint),
> >>>>
> >>>> there is a composite index "event_data_search" over (event::text,
> >>>> insert_time::datetime).
> >>>
> >>>
> >>> I think you need to add log_id to that composite index to get pg to use it.
> >>
> >> hurk: OP is two statistics misses (one of them massive that are
> >> combing to gobsmack you).
> >>
> >> your solution unfortuantely wont work: you can't combine two range
> >> searches in a single index scan.  it would probably work if you it
> >> like this.  If insert_time is a timestamp, not a timestamptz, we can
> >> convert it to date to get what I think he wants (as long as his
> >> queries are along date boundaries).
> >
> >
> > I was thinking an index over:
> >
> > (event, date_trunc('day', insert_time), log_id)
> >
> > And the query like
> >
> > SELECT min(log_id) FROM event_log
> > WHERE event='S-Create' AND
> > date_trunc('day',insert_time) = '2013-09-15'
> >
> >
> > That's a regular simple range scan over the index.
>
> *) date_trunc has same problems as ::date: it is stable expression only for
> timestamptz.  also, the index will be bigger since you're still indexing
> timestamp
>
> *) row wise comparison search might be faster and is generalized to return N
> records, not jut one.
>
> merlin
I'm afraid it's not about the composite index. (Query B works fine and its
plan is as expected.)
BTW, the timestamp is without time zone.

I just thought of another way to demonstrate the issue.
Both queries return the same one-row result. log_id is the bigint primary
key, and event_data_search is still indexed as described.
---
Fast query
---
with Q AS (
select event
from event_log
WHERE log_id>10000 and log_id<50000
order by event
)
SELECT * FROM Q LIMIT 1

Limit  (cost=2521.82..2521.83 rows=1 width=32) (actual time=88.342..88.342
rows=1 loops=1)
  Output: q.event
  Buffers: shared hit=93 read=622
  CTE q
    ->  Sort  (cost=2502.07..2521.82 rows=39502 width=6) (actual
time=88.335..88.335 rows=1 loops=1)
          Output: event_log.event
          Sort Key: event_log.event
          Sort Method: quicksort  Memory: 3486kB
          Buffers: shared hit=93 read=622
          ->  Index Scan using event_log_pkey on uco.event_log
(cost=0.00..1898.89 rows=39502 width=6) (actual time=13.918..65.573
rows=39999 loops=1)
                Output: event_log.event
                Index Cond: ((event_log.log_id > 1010000) AND
(event_log.log_id < 1050000))
                Buffers: shared hit=93 read=622
  ->  CTE Scan on q  (cost=0.00..237.01 rows=39502 width=32) (actual
time=88.340..88.340 rows=1 loops=1)
        Output: q.event
        Buffers: shared hit=93 read=622
Total runtime: 89.039 ms

---
Slow Query
---
Result  (cost=1241.05..1241.05 rows=1 width=0) (actual
time=1099.532..1099.533 rows=1 loops=1)
  Output: $0
  Buffers: shared hit=49029 read=57866
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.00..1241.05 rows=1 width=6) (actual
time=1099.527..1099.527 rows=1 loops=1)
          Output: uco.event_log.event
          Buffers: shared hit=49029 read=57866
          ->  Index Scan using event_data_search on uco.event_log
(cost=0.00..49024009.79 rows=39502 width=6) (actual time=1099.523..1099.523
rows=1 loops=1)
                Output: uco.event_log.event
                Index Cond: (uco.event_log.event IS NOT NULL)
                Filter: ((uco.event_log.log_id > 1010000) AND
(uco.event_log.log_id < 1050000))
                Rows Removed by Filter: 303884
                Buffers: shared hit=49029 read=57866
Total runtime: 1099.568 ms
(Note: things got buffered, so it goes down to ~1 second; but comparing the
buffer counts with the query above, this is a few orders of magnitude slower.)

---
The CTE "fast query" works: it completes with an index scan over
"event_log_pkey", which is what I am expecting, and it is good.
But it is much more straightforward to write it as the "slow query", and I
would expect the planner to give the same optimal plan for both.

The plan for the "slow query" has an estimated total cost of 1241.05 and the
"fast query" has 2521.83,
hence the planner prefers the former - the scan over the
"event_data_search" index - but this choice doesn't make sense to me.

This is the point I want to bring up: why would the planner do that instead
of scanning over "event_log_pkey"?

Looking at the estimated cost of the slow query plan, the Index Scan node's
startup estimate is 0.00. IIRC, that says the planner estimated the first
result row could be returned at cost 0.00.
But it actually has to scan almost the whole index, with heap checks to see
whether each log_id is within the condition range; there is no way the
first row could ever be returned at 0.00.
(event is a text column with a cardinality of around 20, and its values
appear randomly all over the table - zero correlation.)
Hence this questionable estimate bubbles up the tree: the Limit node thought
it could finish within 1241.05 (once the first row is obtained), and so the
whole plan was thought to be finishable within 1241.05 - which is not the
case.
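The arithmetic behind the Limit node's number can be reproduced with a simplified sketch of the costing (a back-of-the-envelope model, not PostgreSQL's actual code): the planner assumes the matching rows are spread evenly through the child scan, so LIMIT 1 is priced at a 1/rows fraction of the child's total cost.

```python
def limit_cost(startup, total, est_rows, k=1):
    """Simplified Limit-node costing: if the child scan is expected to
    yield est_rows matching rows evenly spread over its run, fetching k
    rows costs a k/est_rows fraction of the child's run cost."""
    return startup + (total - startup) * k / est_rows

# Index Scan in the slow plan: cost=0.00..49024009.79 rows=39502
print(round(limit_cost(0.00, 49024009.79, 39502), 2))  # 1241.05

# Index Scan in the earlier min(log_id) plan: cost=0.00..1525564.02 rows=11344
print(round(limit_cost(0.00, 1525564.02, 11344), 2))   # 134.48
```

Both match the Limit costs in the plans above; the model breaks down precisely when the wanted rows are not evenly spread, which is this thread's complaint.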

Sam



Re: Slow plan for MAX/MIN or LIMIT 1?

From
"Sam Wong"
Date:
> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Sam Wong
> Sent: Thursday, September 26, 2013 0:34
> To: 'postgres performance list'
> Subject: Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?
>
> > -----Original Message-----
> > From: pgsql-performance-owner@postgresql.org
> > [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Merlin
> > Moncure
> > Sent: Wednesday, September 25, 2013 23:55
> > To: Claudio Freire
> > Cc: Sam Wong; postgres performance list
> > Subject: Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?
> >
> > On Wed, Sep 25, 2013 at 10:20 AM, Claudio Freire
> > <klaussfreire@gmail.com>
> > wrote:
> > > On Wed, Sep 25, 2013 at 10:29 AM, Merlin Moncure
> > > <mmoncure@gmail.com>
> > wrote:
> > >> On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> > >>> On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong <sam@hellosam.net> wrote:
> > >>>> This event_log table has 4 million rows.
> > >>>>
> > >>>> "log_id" is the primary key (bigint),
> > >>>>
> > >>>> there is a composite index "event_data_search" over (event::text,
> > >>>> insert_time::datetime).
> > >>>
> > >>>
> > >>> I think you need to add log_id to that composite index to get pg to use it.
> > >>
> > >> hurk: OP is two statistics misses (one of them massive that are
> > >> combing to gobsmack you).
> > >>
> > >> your solution unfortuantely wont work: you can't combine two range
> > >> searches in a single index scan.  it would probably work if you it
> > >> like this.  If insert_time is a timestamp, not a timestamptz, we
> > >> can convert it to date to get what I think he wants (as long as his
> > >> queries are along date boundaries).
> > >
> > >
> > > I was thinking an index over:
> > >
> > > (event, date_trunc('day', insert_time), log_id)
> > >
> > > And the query like
> > >
> > > SELECT min(log_id) FROM event_log
> > > WHERE event='S-Create' AND
> > > date_trunc('day',insert_time) = '2013-09-15'
> > >
> > >
> > > That's a regular simple range scan over the index.
> >
> > *) date_trunc has same problems as ::date: it is stable expression only for
> > timestamptz.  also, the index will be bigger since you're still
> > indexing timestamp
> >
> > *) row wise comparison search might be faster and is generalized to return N
> > records, not jut one.
> >
> > merlin
> I'm afraid it's not anything about composite index. (Because the query B works
> fine and the plan is as expected) BTW, the timestamp is without timezone.
>
> I just thought of another way that demonstrate the issue.
> Both query returns the same one row result. log_id is the bigint primary key,
> event_data_search is still that indexed.
> ---
> Fast query
> ---
Excuse me, this is the actual query.

with Q AS (
select event
from event_log
WHERE log_id>1010000 and log_id<1050000
order by event
)
SELECT * FROM Q LIMIT 1
>
> Limit  (cost=2521.82..2521.83 rows=1 width=32) (actual time=88.342..88.342
> rows=1 loops=1)
>   Output: q.event
>   Buffers: shared hit=93 read=622
>   CTE q
>     ->  Sort  (cost=2502.07..2521.82 rows=39502 width=6) (actual
> time=88.335..88.335 rows=1 loops=1)
>           Output: event_log.event
>           Sort Key: event_log.event
>           Sort Method: quicksort  Memory: 3486kB
>           Buffers: shared hit=93 read=622
>           ->  Index Scan using event_log_pkey on uco.event_log
> (cost=0.00..1898.89 rows=39502 width=6) (actual time=13.918..65.573
> rows=39999 loops=1)
>                 Output: event_log.event
>                 Index Cond: ((event_log.log_id > 1010000) AND
> (event_log.log_id < 1050000))
>                 Buffers: shared hit=93 read=622
>   ->  CTE Scan on q  (cost=0.00..237.01 rows=39502 width=32) (actual
> time=88.340..88.340 rows=1 loops=1)
>         Output: q.event
>         Buffers: shared hit=93 read=622
> Total runtime: 89.039 ms
>
> ---
> Slow Query
> ---
And the query for this... I must have forgotten to paste it.

select min(event)
from event_log
WHERE log_id>1010000 and log_id<1050000
> Result  (cost=1241.05..1241.05 rows=1 width=0) (actual
> time=1099.532..1099.533 rows=1 loops=1)
>   Output: $0
>   Buffers: shared hit=49029 read=57866
>   InitPlan 1 (returns $0)
>     ->  Limit  (cost=0.00..1241.05 rows=1 width=6) (actual
> time=1099.527..1099.527 rows=1 loops=1)
>           Output: uco.event_log.event
>           Buffers: shared hit=49029 read=57866
>           ->  Index Scan using event_data_search on uco.event_log
> (cost=0.00..49024009.79 rows=39502 width=6) (actual
> time=1099.523..1099.523
> rows=1 loops=1)
>                 Output: uco.event_log.event
>                 Index Cond: (uco.event_log.event IS NOT NULL)
>                 Filter: ((uco.event_log.log_id > 1010000) AND
> (uco.event_log.log_id < 1050000))
>                 Rows Removed by Filter: 303884
>                 Buffers: shared hit=49029 read=57866
> Total runtime: 1099.568 ms
> (Note: Things got buffered so it goes down to 1 second, but comparing to the
> buffer count with the query above this is a few orders slower than that)
>
> ---
> The CTE "fast query" works, it is completed with index scan over
> "event_log_pkey", which is what I am expecting, and it is good.
> But it's much straight forward to write it as the "slow query", I am expecting
> the planner would give the same optimum plan for both.
>
> For the plan of "Slow query" has an estimated total cost of 1241.05, and the
> "Fast query" has 2521.83, hence the planner prefers that plan - the scanning
> over the "event_data_search" index plan - but this choice doesn't make sense
> to me.
>
> This is my point I want to bring up, why the planner would do that? Instead of
> scanning over the "event_log_pkey"?
>
> Looking into the estimated cost of the Slow Query Plan, the Index Scan node
> lower bound estimation is 0.00. IIRC, it is saying the planner estimated that
> the first result row could be returned at 0.00.
> but actually it has to do a scan almost the whole index and table scan to check
> if the log_id are within the condition range, there is no way that the first row
> could ever be returned at 0.00.
> (The event is a text column with cardinality at around 20, and the order is
> appearing randomly all over the table - 0 correlation) Hence this questionable
> estimation bubble up along the tree, the Limit node thought that it could finish
> within 1241.05 (once the first row is obtained), and the whole plan is thought
> to be finisable within 1241.05 - which is not the case.
>
> Sam
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance



Re: Slow plan for MAX/MIN or LIMIT 1?

From
Claudio Freire
Date:
On Wed, Sep 25, 2013 at 12:54 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> I was thinking an index over:
>>
>> (event, date_trunc('day', insert_time), log_id)
>>
>> And the query like
>>
>> SELECT min(log_id) FROM event_log
>> WHERE event='S-Create' AND
>> date_trunc('day',insert_time) = '2013-09-15'
>>
>>
>> That's a regular simple range scan over the index.
>
> *) date_trunc has same problems as ::date: it is stable expression
> only for timestamptz.  also, the index will be bigger since you're
> still indexing timestamp


Ah, yes, good point.


Re: Slow plan for MAX/MIN or LIMIT 1?

From
didier
Date:
Hi


On Wed, Sep 25, 2013 at 6:42 PM, Sam Wong <sam@hellosam.net> wrote:

> ---
> Excuse me, this is the actual query.
>
> with Q AS (
> select event
> from event_log
> WHERE log_id>1010000 and log_id<1050000
> order by event
> )
> SELECT * FROM Q LIMIT 1
>
> Limit  (cost=2521.82..2521.83 rows=1 width=32) (actual time=88.342..88.342
> rows=1 loops=1)
>   Output: q.event
>   Buffers: shared hit=93 read=622
>   CTE q
>     ->  Sort  (cost=2502.07..2521.82 rows=39502 width=6) (actual
> time=88.335..88.335 rows=1 loops=1)
>           Output: event_log.event
>           Sort Key: event_log.event
>           Sort Method: quicksort  Memory: 3486kB
>           Buffers: shared hit=93 read=622
>           ->  Index Scan using event_log_pkey on uco.event_log
> (cost=0.00..1898.89 rows=39502 width=6) (actual time=13.918..65.573
> rows=39999 loops=1)
>                 Output: event_log.event
>                 Index Cond: ((event_log.log_id > 1010000) AND
> (event_log.log_id < 1050000))
>                 Buffers: shared hit=93 read=622
>   ->  CTE Scan on q  (cost=0.00..237.01 rows=39502 width=32) (actual
> time=88.340..88.340 rows=1 loops=1)
>         Output: q.event
>         Buffers: shared hit=93 read=622
> Total runtime: 89.039 ms
>
> ---
> Slow Query
> select min(event)
> from event_log
> WHERE log_id>1010000 and log_id<1050000
> Result  (cost=1241.05..1241.05 rows=1 width=0) (actual
> time=1099.532..1099.533 rows=1 loops=1)
>   Output: $0
>   Buffers: shared hit=49029 read=57866
>   InitPlan 1 (returns $0)
>     ->  Limit  (cost=0.00..1241.05 rows=1 width=6) (actual
> time=1099.527..1099.527 rows=1 loops=1)
>           Output: uco.event_log.event
>           Buffers: shared hit=49029 read=57866
>           ->  Index Scan using event_data_search on uco.event_log
> (cost=0.00..49024009.79 rows=39502 width=6) (actual
> time=1099.523..1099.523
> rows=1 loops=1)
>                 Output: uco.event_log.event
>                 Index Cond: (uco.event_log.event IS NOT NULL)
>                 Filter: ((uco.event_log.log_id > 1010000) AND
> (uco.event_log.log_id < 1050000))
>                 Rows Removed by Filter: 303884
>                 Buffers: shared hit=49029 read=57866
> Total runtime: 1099.568 ms
> (Note: Things got buffered so it goes down to 1 second, but comparing to the
> buffer count with the query above this is a few orders slower than that)
>
> ---
> The CTE "fast query" works, it is completed with index scan over
> "event_log_pkey", which is what I am expecting, and it is good.
> But it's much straight forward to write it as the "slow query", I am expecting
> the planner would give the same optimum plan for both.
>
> For the plan of "Slow query" has an estimated total cost of 1241.05, and the
> "Fast query" has 2521.83, hence the planner prefers that plan - the scanning
> over the "event_data_search" index plan - but this choice doesn't make sense
> to me.
>
> This is my point I want to bring up, why the planner would do that? Instead of
> scanning over the "event_log_pkey"?
>
Maybe there's a bug, but I don't think so. The PostgreSQL optimizer is strongly biased toward uncorrelated data, but in your case log_id and event are highly correlated, right?

With your example it has to choose between:
1- play it safe: use event_log_pkey, scan 39502 rows, and select the smallest event.

2- use event_data_search: 4,000,000 rows, 40,000 of them with a log_id in the interval; thus either win big and find one within the first 100 scanned rows, or lose big and scan all 4,000,000 rows.

With uncorrelated data, 2- is 400 times faster than 1-: 100 rows vs 40,000.
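The "first 100 scanned rows" figure follows directly from the planner's uniformity assumption; a quick sketch of the two bets (numbers taken from this thread, simplified):

```python
# The planner's uniformity (zero-correlation) assumption, with the
# thread's numbers: ~4M rows total, ~40k matching the log_id interval.
total_rows = 4_000_000
matching = 40_000

# Bet 2 (scan event_data_search in event order): if matches are scattered
# evenly, one turns up roughly every total_rows/matching index entries.
rows_until_first_hit = total_rows / matching
print(rows_until_first_hit)             # 100.0

# Bet 1 (scan the pkey range): read all ~40k rows in the interval.
# Under the uniformity assumption, bet 2 looks this many times cheaper:
print(matching / rows_until_first_hit)  # 400.0
```

When the data is correlated, as here, the "100 rows" expectation is wildly wrong and bet 2 degenerates into scanning nearly the whole index.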

PostgreSQL is a high-stakes gambler :)

Didier