Thread: Odd planner choice?

Odd planner choice?

From
"Gary Doades"
Date:
I'm looking at one of my standard queries and have encountered some strange performance problems.

The query below is to search for vacant staff member date/time slots given a series of target date/times. The data contained in the booking_plan/staff_booking tables contain the existing bookings, so I'm looking for "clashing" bookings to eliminate them from a candidate list.

The query is:

select distinct b.staff_id  from staff_booking b, booking_plan bp, t_search_reqt_dates rd
where b.booking_id = bp.booking_id
and rd.datetime_from <= bp.datetime_to and rd.datetime_to >= bp.datetime_from
AND bp.booking_date between rd.reqt_date-1 and rd.reqt_date+1
and rd.search_id = 13
and rd.reqt_date between '2004-09-30' AND '2005-12-31'

There are 197877 rows in staff_booking, 573416 rows in booking_plan and 26 rows in t_search_reqt_dates.

The t_search reqt_dates is a temp table created and populated with the target date/times. The temp table is *not* analyzed, all the other are.

The "good" query plan comes with the criteria on search_id and reqt_date given in the last two lines in the query. Note all the rows in the temp table are search_id = 13 and all the rows are between the two dates, so the whole 26 rows is always pulled out.

In this case it is doing exactly what I expect. It is pulling all rows from the t_search_reqt_dates table, then pulling the relevant records from the booking_plan and then hashing with staff_booking. Excellent performance.

The problem is I don't need the clauses for search_id and reqt_dates as the whole table is always read anyway. The good plan is because the planner thinks just one row will be read from t_search_reqt_dates.

If I remove the redundant clauses, the planner now estimates 1000 rows returned from the table, not unreasonable since it has no statistics. But *why* in that case, with *more* estimated rows does it choose to materialize that table (26 rows) 573416 times!!!

whenever it estimates more than one row it chooses the bad plan.

I really want to remove the redundant clauses, but I can't. If I analyse the table, then it knows there are 26 rows and chooses the "bad" plan whatever I do.

Any ideas???

Cheers,
Gary.

-------------------- Plans for above query ------------------------

Good QUERY PLAN
Unique  (cost=15440.83..15447.91 rows=462 width=4) (actual time=1342.000..1342.000 rows=110 loops=1)
  ->  Sort  (cost=15440.83..15444.37 rows=7081 width=4) (actual time=1342.000..1342.000 rows=2173 loops=1)
        Sort Key: b.staff_id
        ->  Hash Join  (cost=10784.66..15350.26 rows=7081 width=4) (actual time=601.000..1331.000 rows=2173 loops=1)
              Hash Cond: ("outer".booking_id = "inner".booking_id)
              ->  Seq Scan on staff_booking b  (cost=0.00..4233.39 rows=197877 width=8) (actual time=0.000..400.000 rows=197877 loops=1)
              ->  Hash  (cost=10781.12..10781.12 rows=7080 width=4) (actual time=591.000..591.000 rows=0 loops=1)
                    ->  Nested Loop  (cost=0.00..10781.12 rows=7080 width=4) (actual time=10.000..581.000 rows=2173 loops=1)
                          Join Filter: (("outer".datetime_from <= "inner".datetime_to) AND ("outer".datetime_to >= "inner".datetime_from))
                          ->  Seq Scan on t_search_reqt_dates rd  (cost=0.00..16.50 rows=1 width=20) (actual time=0.000..0.000 rows=26 loops=1)
                                Filter: ((search_id = 13) AND (reqt_date >= '2004-09-30'::date) AND (reqt_date <= '2005-12-31'::date))
                          ->  Index Scan using booking_plan_idx2 on booking_plan bp (cost=0.00..10254.91 rows=63713 width=24) (actual time=0.000..11.538 rows=5871 loops=26)
                                Index Cond: ((bp.booking_date >= ("outer".reqt_date - 1)) AND (bp.booking_date <= ("outer".reqt_date + 1)))
Total runtime: 1342.000 ms


Bad QUERY PLAN
Unique  (cost=7878387.29..7885466.50 rows=462 width=4) (actual time=41980.000..41980.000 rows=110 loops=1)
  ->  Sort  (cost=7878387.29..7881926.90 rows=7079211 width=4) (actual time=41980.000..41980.000 rows=2173 loops=1)
        Sort Key: b.staff_id
        ->  Nested Loop  (cost=5314.32..7480762.73 rows=7079211 width=4) (actual time=6579.000..41980.000 rows=2173 loops=1)
              Join Filter: (("inner".datetime_from <= "outer".datetime_to) AND ("inner".datetime_to >= "outer".datetime_from) AND ("outer".booking_date >= ("inner".reqt_date - 1)) AND ("outer".booking_date <= ("inner".reqt_date + 1)))
              ->  Hash Join  (cost=5299.32..26339.73 rows=573416 width=24) (actual time=2413.000..7832.000 rows=573416 loops=1)
                    Hash Cond: ("outer".booking_id = "inner".booking_id)
                    ->  Seq Scan on booking_plan bp  (cost=0.00..7646.08 rows=573416 width=24) (actual time=0.000..1201.000 rows=573416 loops=1)
                    ->  Hash  (cost=4233.39..4233.39 rows=197877 width=8) (actual time=811.000..811.000 rows=0 loops=1)
                          ->  Seq Scan on staff_booking b  (cost=0.00..4233.39 rows=197877 width=8) (actual time=0.000..430.000 rows=197877 loops=1)
              ->  Materialize  (cost=15.00..20.00 rows=1000 width=20) (actual time=0.001..0.019 rows=26 loops=573416)
                    ->  Seq Scan on t_search_reqt_dates rd  (cost=0.00..15.00 rows=1000 width=20) (actual time=0.000..0.000 rows=26 loops=1)
Total runtime: 41980.000 ms

Re: Odd planner choice?

From
"Gary Doades"
Date:
Oops, forgot to mention:

PostgreSQL 8.0 beta 2 Windows.

Thanks,
Gary.

On 8 Oct 2004 at 20:32, Gary Doades wrote:

>
> I'm looking at one of my standard queries and have encountered some strange performance
> problems.
>
> The query below is to search for vacant staff member date/time slots given a series of target
> date/times. The data contained in the booking_plan/staff_booking tables contain the existing
> bookings, so I'm looking for "clashing" bookings to eliminate them from a candidate list.
>
> The query is:
>
> select distinct b.staff_id from staff_booking b, booking_plan bp, t_search_reqt_dates rd
> where b.booking_id = bp.booking_id
> and rd.datetime_from <= bp.datetime_to and rd.datetime_to >= bp.datetime_from
> AND bp.booking_date between rd.reqt_date-1 and rd.reqt_date+1
> and rd.search_id = 13
> and rd.reqt_date between '2004-09-30' AND '2005-12-31'
>
> There are 197877 rows in staff_booking, 573416 rows in booking_plan and 26 rows in
> t_search_reqt_dates.
>
> The t_search reqt_dates is a temp table created and populated with the target date/times. The
> temp table is *not* analyzed, all the other are.
>
> The "good" query plan comes with the criteria on search_id and reqt_date given in the last two
> lines in the query. Note all the rows in the temp table are search_id = 13 and all the rows are
> between the two dates, so the whole 26 rows is always pulled out.
>
> In this case it is doing exactly what I expect. It is pulling all rows from the t_search_reqt_dates
> table, then pulling the relevant records from the booking_plan and then hashing with
> staff_booking. Excellent performance.
>
> The problem is I don't need the clauses for search_id and reqt_dates as the whole table is
> always read anyway. The good plan is because the planner thinks just one row will be read from
> t_search_reqt_dates.
>
> If I remove the redundant clauses, the planner now estimates 1000 rows returned from the table,
> not unreasonable since it has no statistics. But *why* in that case, with *more* estimated rows
> does it choose to materialize that table (26 rows) 573416 times!!!
>
> whenever it estimates more than one row it chooses the bad plan.
>
> I really want to remove the redundant clauses, but I can't. If I analyse the table, then it knows
> there are 26 rows and chooses the "bad" plan whatever I do.
>
> Any ideas???
>
> Cheers,
> Gary.
>
> -------------------- Plans for above query ------------------------
>
> Good QUERY PLAN
> Unique (cost=15440.83..15447.91 rows=462 width=4) (actual time=1342.000..1342.000
> rows=110 loops=1)
>  -> Sort (cost=15440.83..15444.37 rows=7081 width=4) (actual time=1342.000..1342.000
> rows=2173 loops=1)
>  Sort Key: b.staff_id
>  -> Hash Join (cost=10784.66..15350.26 rows=7081 width=4) (actual
> time=601.000..1331.000 rows=2173 loops=1)
>  Hash Cond: ("outer".booking_id = "inner".booking_id)
>  -> Seq Scan on staff_booking b (cost=0.00..4233.39 rows=197877 width=8) (actual
> time=0.000..400.000 rows=197877 loops=1)
>  -> Hash (cost=10781.12..10781.12 rows=7080 width=4) (actual
> time=591.000..591.000 rows=0 loops=1)
>  -> Nested Loop (cost=0.00..10781.12 rows=7080 width=4) (actual
> time=10.000..581.000 rows=2173 loops=1)
>  Join Filter: (("outer".datetime_from <= "inner".datetime_to) AND
> ("outer".datetime_to >= "inner".datetime_from))
>  -> Seq Scan on t_search_reqt_dates rd (cost=0.00..16.50 rows=1 width=20)
> (actual time=0.000..0.000 rows=26 loops=1)
>  Filter: ((search_id = 13) AND (reqt_date >= '2004-09-30'::date) AND
> (reqt_date <= '2005-12-31'::date))
>  -> Index Scan using booking_plan_idx2 on booking_plan bp
> (cost=0.00..10254.91 rows=63713 width=24) (actual time=0.000..11.538 rows=5871 loops=26)
>  Index Cond: ((bp.booking_date >= ("outer".reqt_date - 1)) AND
> (bp.booking_date <= ("outer".reqt_date + 1)))
> Total runtime: 1342.000 ms
>
>
> Bad QUERY PLAN
> Unique (cost=7878387.29..7885466.50 rows=462 width=4) (actual time=41980.000..41980.000
> rows=110 loops=1)
>  -> Sort (cost=7878387.29..7881926.90 rows=7079211 width=4) (actual
> time=41980.000..41980.000 rows=2173 loops=1)
>  Sort Key: b.staff_id
>  -> Nested Loop (cost=5314.32..7480762.73 rows=7079211 width=4) (actual
> time=6579.000..41980.000 rows=2173 loops=1)
>  Join Filter: (("inner".datetime_from <= "outer".datetime_to) AND ("inner".datetime_to >=
> "outer".datetime_from) AND ("outer".booking_date >= ("inner".reqt_date - 1)) AND
> ("outer".booking_date <= ("inner".reqt_date + 1)))
>  -> Hash Join (cost=5299.32..26339.73 rows=573416 width=24) (actual
> time=2413.000..7832.000 rows=573416 loops=1)
>  Hash Cond: ("outer".booking_id = "inner".booking_id)
>  -> Seq Scan on booking_plan bp (cost=0.00..7646.08 rows=573416 width=24)
> (actual time=0.000..1201.000 rows=573416 loops=1)
>  -> Hash (cost=4233.39..4233.39 rows=197877 width=8) (actual
> time=811.000..811.000 rows=0 loops=1)
>  -> Seq Scan on staff_booking b (cost=0.00..4233.39 rows=197877 width=8)
> (actual time=0.000..430.000 rows=197877 loops=1)
>  -> Materialize (cost=15.00..20.00 rows=1000 width=20) (actual time=0.001..0.019
> rows=26 loops=573416)
>  -> Seq Scan on t_search_reqt_dates rd (cost=0.00..15.00 rows=1000 width=20)
> (actual time=0.000..0.000 rows=26 loops=1)
> Total runtime: 41980.000 ms
>



Re: Odd planner choice?

From
Tom Lane
Date:
"Gary Doades" <gpd@gpdnet.co.uk> writes:
> If I remove the redundant clauses, the planner now estimates 1000 rows returned from
> the table, not unreasonable since it has no statistics. But *why* in that case, with *more*
> estimated rows does it choose to materialize that table (26 rows) 573416 times!!!

It isn't.  It's materializing that once and scanning it 573416 times,
once for each row in the outer relation.  And this is not a bad plan
given the estimates.  If it had stuck to what you call the good plan,
and there *had* been 1000 rows in the temp table, that plan would have
run 1000 times longer than it did.

As a general rule, if your complaint is that you get a bad plan for an
unanalyzed table, the response is going to be "so analyze the table".

            regards, tom lane

Re: Odd planner choice?

From
"Gary Doades"
Date:
On 8 Oct 2004 at 16:04, Tom Lane wrote:

> "Gary Doades" <gpd@gpdnet.co.uk> writes:
> > If I remove the redundant clauses, the planner now estimates 1000 rows returned from
> > the table, not unreasonable since it has no statistics. But *why* in that case, with *more*
> > estimated rows does it choose to materialize that table (26 rows) 573416 times!!!
>
> It isn't.  It's materializing that once and scanning it 573416 times,
> once for each row in the outer relation.  And this is not a bad plan
> given the estimates.  If it had stuck to what you call the good plan,
> and there *had* been 1000 rows in the temp table, that plan would have
> run 1000 times longer than it did.
>
> As a general rule, if your complaint is that you get a bad plan for an
> unanalyzed table, the response is going to be "so analyze the table".
>

The problem is in this case is that if I *do* analyse the table I *always* get the bad plan.
Bad in this case meaning the query takes a lot longer. I'm still not sure why it can't
choose the better plan by just reading the 26 rows once and index scan the
booking_plan table 26 times (as in the "good" plan).

OK, with 1000 row estimate I can see that index scanning 1000 times into the
booking_plan table would take some time, but the even if planner estimates 5 rows it still
produces the same slow query.

If I analyze the table it then knows there are 26 rows and therefore always goes slow.

This is why I am not analyzing this table, to fool the planner into thinking there is only
one row and produce a much faster access plan. Not ideal I know.

Just using one redundant clause I now get:

select distinct b.staff_id  from staff_booking b, booking_plan bp, t_search_reqt_dates rd
where b.booking_id = bp.booking_id
and rd.datetime_from <= bp.datetime_to and rd.datetime_to >= bp.datetime_from
AND bp.booking_date between rd.reqt_date-1 and rd.reqt_date+1
and rd.search_id = 13

QUERY PLAN
Unique  (cost=50885.97..50921.37 rows=462 width=4) (actual
time=35231.000..35241.000 rows=110 loops=1)
  ->  Sort  (cost=50885.97..50903.67 rows=35397 width=4) (actual
time=35231.000..35241.000 rows=2173 loops=1)
        Sort Key: b.staff_id
        ->  Hash Join  (cost=44951.32..50351.07 rows=35397 width=4) (actual
time=34530.000..35231.000 rows=2173 loops=1)
              Hash Cond: ("outer".booking_id = "inner".booking_id)
              ->  Seq Scan on staff_booking b  (cost=0.00..4233.39 rows=197877 width=8)
(actual time=0.000..351.000 rows=197877 loops=1)
              ->  Hash  (cost=44933.62..44933.62 rows=35397 width=4) (actual
time=34530.000..34530.000 rows=0 loops=1)
                    ->  Nested Loop  (cost=15.50..44933.62 rows=35397 width=4) (actual
time=8342.000..34520.000 rows=2173 loops=1)
                          Join Filter: (("inner".datetime_from <= "outer".datetime_to) AND
("inner".datetime_to >= "outer".datetime_from) AND ("outer".booking_date >=
("inner".reqt_date - 1)) AND ("outer".booking_date <= ("inner".reqt_date + 1)))
                          ->  Seq Scan on booking_plan bp  (cost=0.00..7646.08 rows=573416
width=24) (actual time=0.000..1053.000 rows=573416 loops=1)
                          ->  Materialize  (cost=15.50..15.53 rows=5 width=20) (actual
time=0.001..0.019 rows=26 loops=573416)
                                ->  Seq Scan on t_search_reqt_dates rd  (cost=0.00..15.50 rows=5
width=20) (actual time=0.000..0.000 rows=26 loops=1)
                                      Filter: (search_id = 13)
Total runtime: 35241.000 ms

If this is the only answer for now, then fair enough I will just have to do more testing.

Regards,
Gary.