Thread: Planner picks the wrong plan?

Planner picks the wrong plan?

From
Nichlas Löfdahl
Date:
Hello!

I'm using Postgres 7.4.5, sort_mem is 8192. Tables analyzed / vacuumed.

Here's a function I'm using to get an age from the user's birthday:

agey(date) -> SELECT date_part('year', age($1::timestamp))


The problem is, why do the plans differ so much between Q1 & Q3 below? Something with age() being a non-IMMUTABLE
function?


Q1: explain analyze SELECT al.pid, al.owner, al.title, al.front, al.created_at, al.n_images, u.username as owner_str,
u.imageas owner_image, u.puid as owner_puid FROM albums al  , users u WHERE  u.uid = al.owner AND  al.security='a' AND
al.n_images> 0 AND date_part('year', age(u.born)) > 17 AND date_part('year', age(u.born)) < 20 AND city = 1 ORDER BY
al.idDESC LIMIT 9; 

                                    QUERY PLAN
         

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=5700.61..5700.63 rows=9 width=183) (actual time=564.291..564.299 rows=9 loops=1)
   ->  Sort  (cost=5700.61..5700.82 rows=83 width=183) (actual time=564.289..564.291 rows=9 loops=1)
         Sort Key: al.id
         ->  Nested Loop  (cost=0.00..5697.97 rows=83 width=183) (actual time=30.029..526.211 rows=4510 loops=1)
               ->  Seq Scan on users u  (cost=0.00..5311.05 rows=86 width=86) (actual time=5.416..421.264 rows=3021
loops=1)
                     Filter: ((date_part('year'::text, age((('now'::text)::date)::timestamp with time zone,
(born)::timestampwith time zone)) > 17::double precision) AND (date_part('year'::text,
age((('now'::text)::date)::timestampwith time zone, (born)::timestamp with time zone)) < 20::double precision) AND
(city= 1)) 
               ->  Index Scan using albums_owner_key on albums al  (cost=0.00..4.47 rows=2 width=101) (actual
time=0.014..0.025rows=1 loops=3021) 
                     Index Cond: ("outer".uid = al."owner")
                     Filter: (("security" = 'a'::bpchar) AND (n_images > 0))
 Total runtime: 565.120 ms
(10 rows)


Result when removing the second age-check (AND date_part('year', age(u.born)) < 20):

Q2: explain analyze SELECT al.pid, al.owner, al.title, al.front, al.created_at, al.n_images, u.username as owner_str,
u.imageas owner_image, u.puid as owner_puid FROM albums al, users u WHERE  u.uid = al.owner AND  al.security='a' AND
al.n_images> 0 AND date_part('year', age(u.born)) > 17 AND city = 1 ORDER BY al.id DESC LIMIT 9; 

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..140.95 rows=9 width=183) (actual time=0.217..2.474 rows=9 loops=1)
   ->  Nested Loop  (cost=0.00..86200.99 rows=5504 width=183) (actual time=0.216..2.464 rows=9 loops=1)
         ->  Index Scan Backward using albums_id_key on albums al  (cost=0.00..2173.32 rows=27610 width=101) (actual
time=0.086..1.080rows=40 loops=1) 
               Filter: (("security" = 'a'::bpchar) AND (n_images > 0))
         ->  Index Scan using users_pkey on users u  (cost=0.00..3.03 rows=1 width=86) (actual time=0.031..0.031 rows=0
loops=40)
               Index Cond: (u.uid = "outer"."owner")
               Filter: ((date_part('year'::text, age((('now'::text)::date)::timestamp with time zone, (born)::timestamp
withtime zone)) > 17::double precision) AND (city = 1)) 
 Total runtime: 2.611 ms
(8 rows)

Trying another approach: adding a separate "stale" age-column to the users-table:

alter table users add column age smallint;
update users set age=date_part('year'::text, age((('now'::text)::date)::timestamp with time zone, (born)::timestamp
withtime zone)); 
analyze users;

Result with separate column:
Q3: explain analyze SELECT al.pid, al.owner, al.title, al.front, al.created_at, al.n_images, u.username as owner_str,
u.imageas owner_image, u.puid as owner_puid FROM albums al  , users u WHERE  u.uid = al.owner AND  al.security='a' AND
al.n_images> 0 AND age > 17 AND age < 20 AND city = 1 ORDER BY al.id DESC LIMIT 9; 

--------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..263.40 rows=9 width=183) (actual time=0.165..2.832 rows=9 loops=1)
   ->  Nested Loop  (cost=0.00..85925.69 rows=2936 width=183) (actual time=0.163..2.825 rows=9 loops=1)
         ->  Index Scan Backward using albums_id_key on albums al  (cost=0.00..2173.32 rows=27610 width=101) (actual
time=0.043..1.528rows=56 loops=1) 
               Filter: (("security" = 'a'::bpchar) AND (n_images > 0))
         ->  Index Scan using users_pkey on users u  (cost=0.00..3.02 rows=1 width=86) (actual time=0.020..0.020 rows=0
loops=56)
               Index Cond: (u.uid = "outer"."owner")
               Filter: ((age > 17) AND (age < 20) AND (city = 1))
 Total runtime: 2.973 ms
(8 rows)

My question is, why doesn't the planner pick the same plan for Q1 & Q3?

/Nichlas

Re: Planner picks the wrong plan?

From
Tom Lane
Date:
Nichlas =?iso-8859-1?Q?L=F6fdahl?= <crotalus@acc.umu.se> writes:
> My question is, why doesn't the planner pick the same plan for Q1 & Q3?

I think it's mostly that after you've added and ANALYZEd the "age"
column, the planner has a pretty good idea of how many rows will pass
the "age > 17 AND age < 20" condition.  It can't do very much with the
equivalent condition in the original form, though, and in fact ends up
drastically underestimating the number of matching rows (86 vs reality
of 3021).  That leads directly to a bad plan choice :-(

            regards, tom lane