Thread: Slow plan for MAX/MIN or LIMIT 1?
Hi There,
I have hit a query plan issue that I believe is a bug or an under-estimation, and would like to know whether it is a known issue and whether there is any workaround…
This event_log table has 4 million rows.
“log_id” is the primary key (bigint),
there is a composite index “event_data_search” over (event::text, insert_time::datetime).
Query A:
SELECT min(log_id) FROM event_log
WHERE event='S-Create' AND
insert_time>'2013-09-15' and insert_time<'2013-09-16'
Query B:
SELECT log_id FROM event_log
WHERE event='S-Create' AND
insert_time>'2013-09-15' and insert_time<'2013-09-16'
ORDER BY log_id
What I want to achieve is Query A: get the min log_id within a range. But it is very slow, taking 10 to 20 seconds.
If I don't limit the output to one row, as in Query B, then it is sub-second fast.
Explain of A (takes 10 to 20 seconds to run):
Limit (cost=0.00..132.54 rows=1 width=8)
-> Index Scan using event_log_pkey on event_log (cost=0.00..1503484.33 rows=11344 width=8)
Filter: ((insert_time > '2013-09-15 00:00:00'::timestamp without time zone) AND (insert_time < '2013-09-16 00:00:00'::timestamp without time zone) AND (event = 'S-Create'::text))
Explain of B (takes a few milliseconds to run):
Sort (cost=41015.85..41021.52 rows=11344 width=8)
Sort Key: log_id
-> Bitmap Heap Scan on event_log (cost=311.42..40863.05 rows=11344 width=8)
Recheck Cond: ((event = 'S-Create'::text) AND (insert_time > '2013-09-15 00:00:00'::timestamp without time zone) AND (insert_time < '2013-09-16 00:00:00'::timestamp without time zone))
-> Bitmap Index Scan on event_data_search (cost=0.00..310.86 rows=11344 width=0)
Index Cond: ((event = 'S-Create'::text) AND (insert_time > '2013-09-15 00:00:00'::timestamp without time zone) AND (insert_time < '2013-09-16 00:00:00'::timestamp without time zone))
The plan for A assumes that the index scan node will return its first row at cost 0.00, and hence that the Limit node will get the one row it needs within 132.54 (because event_log_pkey is sorted).
I would like to point out that this 0.00 estimate ignores the fact that it actually takes the index scan node much longer to produce its first row, because the ~3.99M rows it passes over don't satisfy the filter at all.
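For reference, the 132.54 on the Limit node is the child scan's total cost scaled by the fraction of rows the Limit needs. The sketch below reproduces that arithmetic; it is a back-of-envelope illustration of the planner's fractional costing, not code from this thread:

```python
# The planner costs "fetch k rows" from a scan estimated to yield est_rows
# matching rows by charging a pro-rata fraction of the scan's total cost,
# i.e. it assumes matches are spread uniformly through the scan order.
startup_cost = 0.00        # index scan: estimated cost before the first tuple
total_cost = 1503484.33    # index scan: estimated cost to return all tuples
est_rows = 11344           # estimated rows surviving the filter
limit_rows = 1

limit_cost = startup_cost + (total_cost - startup_cost) * (limit_rows / est_rows)
print(f"{limit_cost:.2f}")  # ~132.54, the Limit node's upper cost in plan A
```

The uniform-spread assumption is exactly what breaks here: the qualifying rows are not spread evenly through event_log_pkey order.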
Other background info:
The event_log table has been vacuumed and analyzed.
I am on PostgreSQL 9.2.4 (x64) on Windows Server 2008 R2, with 8 GB RAM and one Xeon E5606.
Thanks,
Sam
On Tue, Sep 24, 2013 at 4:24 AM, Sam Wong <sam@hellosam.net> wrote:
> I have hit a query plan issue that I believe is a bug or under-estimation,
> and would like to know if it is known or if there is any workaround…
> [snip]

I think you got A and B mixed up there. Can you post explain analyze
(not just 'explain') of the slow plan?

merlin
On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong <sam@hellosam.net> wrote:
> This event_log table has 4 million rows.
>
> “log_id” is the primary key (bigint),
>
> there is a composite index “event_data_search” over (event::text,
> insert_time::datetime).

I think you need to add log_id to that composite index to get pg to use it.
Hi All and Merlin,

So here is the explain analyze output.

------ Query A -- single row output, but very slow query ------
SELECT min(log_id) FROM event_log
WHERE event='S-Create' AND
insert_time>'2013-09-15' and insert_time<'2013-09-16'

http://explain.depesz.com/s/3H5

Result (cost=134.48..134.49 rows=1 width=0) (actual time=348370.719..348370.720 rows=1 loops=1)
  Output: $0
  InitPlan 1 (returns $0)
    -> Limit (cost=0.00..134.48 rows=1 width=8) (actual time=348370.712..348370.713 rows=1 loops=1)
          Output: uco.event_log.log_id
          -> Index Scan using event_log_pkey on uco.event_log (cost=0.00..1525564.02 rows=11344 width=8) (actual time=348370.709..348370.709 rows=1 loops=1)
                Output: uco.event_log.log_id
                Index Cond: (uco.event_log.log_id IS NOT NULL)
                Filter: ((uco.event_log.insert_time > '2013-09-15 00:00:00'::timestamp without time zone) AND (uco.event_log.insert_time < '2013-09-16 00:00:00'::timestamp without time zone) AND (uco.event_log.event = 'S-Create'::text))
                Rows Removed by Filter: 43249789
Total runtime: 348370.762 ms

------ Query B -- multiple row output, fast query, but I could get what I want from the first output row ------
SELECT log_id FROM event_log
WHERE event='S-Create' AND
insert_time>'2013-09-15' and insert_time<'2013-09-16'
ORDER BY log_id

http://explain.depesz.com/s/s6P

Sort (cost=41015.85..41021.52 rows=11344 width=8) (actual time=3651.695..3652.160 rows=6948 loops=1)
  Output: log_id
  Sort Key: event_log.log_id
  Sort Method: quicksort Memory: 518kB
  -> Bitmap Heap Scan on uco.event_log (cost=311.42..40863.05 rows=11344 width=8) (actual time=448.349..3645.465 rows=6948 loops=1)
        Output: log_id
        Recheck Cond: ((event_log.event = 'S-Create'::text) AND (event_log.insert_time > '2013-09-15 00:00:00'::timestamp without time zone) AND (event_log.insert_time < '2013-09-16 00:00:00'::timestamp without time zone))
        -> Bitmap Index Scan on event_data_search (cost=0.00..310.86 rows=11344 width=0) (actual time=447.670..447.670 rows=6948 loops=1)
              Index Cond: ((event_log.event = 'S-Create'::text) AND (event_log.insert_time > '2013-09-15 00:00:00'::timestamp without time zone) AND (event_log.insert_time < '2013-09-16 00:00:00'::timestamp without time zone))
Total runtime: 3652.535 ms

P.S. If I put a LIMIT 1 at the end of this query, it gets a plan identical to Query A's.

------
My observation: in Query A, the lower bound of the Index Scan node's estimate is way off. It will not produce its first output row right at 0.00, because the filter has to be applied first.

Thanks,
Sam
On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong <sam@hellosam.net> wrote:
>> This event_log table has 4 million rows.
>>
>> “log_id” is the primary key (bigint),
>>
>> there is a composite index “event_data_search” over (event::text,
>> insert_time::datetime).
>
> I think you need to add log_id to that composite index to get pg to use it.

hurk: the OP is hitting two statistics misses (one of them massive) that are combining to gobsmack you.

your solution unfortunately won't work: you can't combine two range searches in a single index scan. it would probably work if you did it like this: if insert_time is a timestamp, not a timestamptz, we can convert it to date to get what I think he wants (as long as his queries are along date boundaries). how about:

CREATE INDEX ON event_log(event, (insert_time::date), log_id);

EXPLAIN ANALYZE
SELECT * FROM event_log
WHERE (event, insert_time::date, log_id) >= ('S-Create', '2013-09-15'::date, 0)
  AND event = 'S-Create'
  AND insert_time::date < '2013-09-16'::date
ORDER BY event, insert_time::date, log_id
LIMIT 1;

if insert_time is a timestamptz, we can materialize the date into the table to get around that (timestamptz->date is only a stable expression). If date boundary handling is awkward, our best bet is probably to hack the planner with a CTE. Note the above query will smoke the CTE-based one:

WITH data AS (
  SELECT log_id FROM event_log
  WHERE event='S-Create' AND
  insert_time>'2013-09-15' and insert_time<'2013-09-16'
)
SELECT * from data ORDER BY log_id LIMIT 1;

merlin
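An aside on why a single index scan can serve that row-wise predicate: SQL row constructors compare lexicographically, which Python tuples happen to mirror, so a btree ordered on (event, insert_time::date, log_id) can seek straight to the composite lower bound. A toy sketch of the semantics (my illustration, not from the thread; the sample rows are made up):

```python
from datetime import date

# SQL "(a, b, c) >= (x, y, z)" is a lexicographic comparison, exactly
# like Python tuple comparison.
lower = ('S-Create', date(2013, 9, 15), 0)

rows = [
    ('R-Delete', date(2013, 9, 15), 7),  # sorts before the bound (event < 'S-Create')
    ('S-Create', date(2013, 9, 14), 9),  # right event, but the day is too early
    ('S-Create', date(2013, 9, 15), 8),
    ('S-Create', date(2013, 9, 15), 3),
    ('T-Update', date(2013, 9, 10), 2),  # after the bound; this is why the
]                                        # query keeps the event= equality too

# Mirror of the WHERE clause: row-wise >= bound, plus equality and upper bound.
hits = [r for r in sorted(rows)
        if r >= lower and r[0] == 'S-Create' and r[1] < date(2013, 9, 16)]
first = hits[0]  # index-order LIMIT 1: the minimum log_id for that event/day
print(first)
```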
On Wed, Sep 25, 2013 at 10:29 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> your solution unfortunately won't work: you can't combine two range
> searches in a single index scan. [snip] If insert_time is a timestamp,
> not a timestamptz, we can convert it to date to get what I think he
> wants (as long as his queries are along date boundaries).

I was thinking of an index over:

(event, date_trunc('day', insert_time), log_id)

And a query like:

SELECT min(log_id) FROM event_log
WHERE event='S-Create' AND
date_trunc('day',insert_time) = '2013-09-15'

That's a regular simple range scan over the index.
On Wed, Sep 25, 2013 at 10:20 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> I was thinking of an index over:
>
> (event, date_trunc('day', insert_time), log_id)
>
> And a query like:
>
> SELECT min(log_id) FROM event_log
> WHERE event='S-Create' AND
> date_trunc('day',insert_time) = '2013-09-15'
>
> That's a regular simple range scan over the index.

*) date_trunc has the same problem as ::date: it is only a stable expression for timestamptz. also, the index will be bigger, since you're still indexing a timestamp.

*) a row-wise comparison search might be faster, and it generalizes to returning N records, not just one.

merlin
> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org On Behalf Of Merlin Moncure
> Sent: Wednesday, September 25, 2013 23:55
> To: Claudio Freire
> Cc: Sam Wong; postgres performance list
> Subject: Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?
>
> [snip]
>
> *) date_trunc has the same problem as ::date: it is only a stable
> expression for timestamptz. also, the index will be bigger, since
> you're still indexing a timestamp.
>
> *) a row-wise comparison search might be faster, and it generalizes
> to returning N records, not just one.
>
> merlin

I'm afraid it's not anything about the composite index. (Query B works fine and its plan is as expected.) BTW, the timestamp is without time zone.

I just thought of another way to demonstrate the issue. Both queries return the same one-row result. log_id is the bigint primary key; event_data_search is indexed as before.

--- Fast query ---
with Q AS (
  select event from event_log
  WHERE log_id>10000 and log_id<50000
  order by event
)
SELECT * FROM Q LIMIT 1

Limit (cost=2521.82..2521.83 rows=1 width=32) (actual time=88.342..88.342 rows=1 loops=1)
  Output: q.event
  Buffers: shared hit=93 read=622
  CTE q
    -> Sort (cost=2502.07..2521.82 rows=39502 width=6) (actual time=88.335..88.335 rows=1 loops=1)
          Output: event_log.event
          Sort Key: event_log.event
          Sort Method: quicksort Memory: 3486kB
          Buffers: shared hit=93 read=622
          -> Index Scan using event_log_pkey on uco.event_log (cost=0.00..1898.89 rows=39502 width=6) (actual time=13.918..65.573 rows=39999 loops=1)
                Output: event_log.event
                Index Cond: ((event_log.log_id > 1010000) AND (event_log.log_id < 1050000))
                Buffers: shared hit=93 read=622
  -> CTE Scan on q (cost=0.00..237.01 rows=39502 width=32) (actual time=88.340..88.340 rows=1 loops=1)
        Output: q.event
        Buffers: shared hit=93 read=622
Total runtime: 89.039 ms

--- Slow Query ---
Result (cost=1241.05..1241.05 rows=1 width=0) (actual time=1099.532..1099.533 rows=1 loops=1)
  Output: $0
  Buffers: shared hit=49029 read=57866
  InitPlan 1 (returns $0)
    -> Limit (cost=0.00..1241.05 rows=1 width=6) (actual time=1099.527..1099.527 rows=1 loops=1)
          Output: uco.event_log.event
          Buffers: shared hit=49029 read=57866
          -> Index Scan using event_data_search on uco.event_log (cost=0.00..49024009.79 rows=39502 width=6) (actual time=1099.523..1099.523 rows=1 loops=1)
                Output: uco.event_log.event
                Index Cond: (uco.event_log.event IS NOT NULL)
                Filter: ((uco.event_log.log_id > 1010000) AND (uco.event_log.log_id < 1050000))
                Rows Removed by Filter: 303884
                Buffers: shared hit=49029 read=57866
Total runtime: 1099.568 ms

(Note: things got buffered, so it goes down to one second here, but judging by the buffer counts against the query above, this is a few orders of magnitude slower.)

---
The CTE "fast query" works: it completes with an index scan over "event_log_pkey", which is what I expect, and it is good. But it is much more straightforward to write it as the "slow query", and I would expect the planner to give the same optimal plan for both.

The "slow query" plan has an estimated total cost of 1241.05 and the "fast query" 2521.83, hence the planner prefers the former (the scan over the "event_data_search" index), but that choice doesn't make sense to me. This is the point I want to bring up: why would the planner do that, instead of scanning over "event_log_pkey"?

Looking at the estimated cost of the slow-query plan, the Index Scan node's lower-bound estimate is 0.00. IIRC, that means the planner estimated the first result row could be returned at cost 0.00, but in reality it has to scan almost the whole index, and check the table, to see whether log_id falls within the condition range; there is no way the first row could ever be returned at 0.00. (event is a text column with cardinality around 20, and its values appear randomly all over the table: 0 correlation.) This questionable estimate bubbles up the tree: the Limit node thought it could finish within 1241.05 once the first row was obtained, so the whole plan was thought to be finishable within 1241.05, which is not the case.

Sam
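The 1241.05 the planner trusted here comes from the same fractional LIMIT costing, scaled off the event_data_search scan's huge total cost. Again a back-of-envelope illustration of that arithmetic, not code from the thread:

```python
# Same pro-rata LIMIT costing as plan A: the Limit's cost is the child's
# total cost divided by the estimated number of qualifying rows.
total_cost = 49024009.79   # full scan of event_data_search, per the plan
est_rows = 39502           # rows the planner expects to survive the log_id filter

limit_cost = total_cost / est_rows   # startup cost is 0.00 in the plan
print(f"{limit_cost:.2f}")  # ~1241.05, which beats the fast plan's 2521.83
```

Since 1241.05 < 2521.83, the planner picks the event_data_search scan even though, with zero correlation, the first qualifying row is nowhere near the start of that scan.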
> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org On Behalf Of Sam Wong
> Sent: Thursday, September 26, 2013 0:34
> To: 'postgres performance list'
> Subject: Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?
>
> [snip]
>
> ---
> Fast query
> ---

Excuse me, this is the actual query:

with Q AS (
  select event from event_log
  WHERE log_id>1010000 and log_id<1050000
  order by event
)
SELECT * FROM Q LIMIT 1

> ---
> Slow Query
> ---

And the query for this... I must have forgotten to paste it:

select min(event) from event_log
WHERE log_id>1010000 and log_id<1050000
On Wed, Sep 25, 2013 at 12:54 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> I was thinking of an index over:
>>
>> (event, date_trunc('day', insert_time), log_id)
>> [snip]
>
> *) date_trunc has the same problem as ::date: it is only a stable
> expression for timestamptz. also, the index will be bigger, since
> you're still indexing a timestamp.

Ah, yes, good point.
2- use event_data_search: 4,000,000 rows, of which 40,000 have a log_id in the interval; thus it either wins big and finds one in the first ~100 scanned rows, or loses big and scans all 4,000,000 rows.

With uncorrelated data, 2- is 400 times faster than 1-: 100 rows vs 40,000.
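The 100-row figure above follows from the uniform-placement assumption: with k matching rows scattered uniformly among N, the expected scan distance to the first match is roughly (N + 1) / (k + 1). A quick check of that arithmetic with the thread's round numbers (illustrative only):

```python
# Expected number of rows an ordered scan passes before hitting the first
# of k matches uniformly scattered among N rows: (N + 1) / (k + 1).
def expected_rows_to_first_match(n_total: int, n_matching: int) -> float:
    return (n_total + 1) / (n_matching + 1)

N, K = 4_000_000, 40_000
best_case = expected_rows_to_first_match(N, K)  # uncorrelated data: ~100 rows
worst_case = N                                  # estimate miss: scan everything
print(round(best_case), worst_case)
```

The planner bets on the ~100-row case; when the matching rows are clustered away from the start of the scan, it pays the worst case instead.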