inheritance, and plans - Mailing list pgsql-performance

From Grzegorz Jaśkiewicz
Subject inheritance, and plans
Date
Msg-id 2f4958ff0902061000h6551de8an83c39f2de1fffb7a@mail.gmail.com
Whole thread Raw
Responses Re: inheritance, and plans
Re: inheritance, and plans
List pgsql-performance
Hey folks,

I have few tables, that inherit from table X.
The query I perform, tries to obtain information about changes in all
tables that inherit from X,
aside from that, I have table Y that keeps another information related
to changes, but in bit different schema.
Anyway, there is one unique id field, shared amongst them.

When I want to obtain all that information, I do:


select updateid from (
  select updateid from r.skel
union all
  select updateid from r.history
) as foo
  where  updateid > 1232634919168805;

And what amazes me, is that no matter what value I choose in where X >
, postgres will always think this is the best plan:

                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------
 Subquery Scan foo  (cost=0.00..167736.75 rows=978726 width=8)
   Filter: (foo.updateid > 1232634919168805::bigint)
   ->  Append  (cost=0.00..131034.54 rows=2936177 width=8)
         ->  Subquery Scan "*SELECT* 1"  (cost=0.00..130999.94
rows=2934947 width=8)
               ->  Result  (cost=0.00..101650.47 rows=2934947 width=8)
                     ->  Append  (cost=0.00..101650.47 rows=2934947 width=8)
                           ->  Seq Scan on skel  (cost=0.00..24.80
rows=1480 width=8)
                           ->  Seq Scan on a skel
(cost=0.00..22028.96 rows=923596 width=8)
                           ->  Seq Scan on b skel  (cost=0.00..8.01
rows=201 width=8)
                           ->  Seq Scan on c skel  (cost=0.00..1.81
rows=81 width=8)
                           ->  Seq Scan on d skel
(cost=0.00..22117.94 rows=923594 width=8)
                           ->  Seq Scan on e skel  (cost=0.00..6.03
rows=303 width=8)
                           ->  Seq Scan on f skel  (cost=0.00..6.02
rows=202 width=8)
                           ->  Seq Scan on g skel  (cost=0.00..1987.40
rows=85140 width=8)
                           ->  Seq Scan on h skel  (cost=0.00..1.01
rows=1 width=8)
                           ->  Seq Scan on i skel
(cost=0.00..55454.99 rows=999999 width=8)
                           ->  Seq Scan on j skel  (cost=0.00..13.50
rows=350 width=8)
         ->  Seq Scan on history  (cost=0.00..22.30 rows=1230 width=8)
(18 rows)

so my question is, why isn't postgres use index on some tables , and
search for the X > N individually ?

Because, yet - I tried to recreate problem, but I wasn't able. I have
this test db:

create schema r;

create sequence fooseq;
create domain r.fooint AS bigint NOT NULL default  nextval('fooseq');

create table skel(aid r.fooint, cd timestamp default now() not null);

create table one( a bigserial, aid r.fooint, cd timestamp not null);
create table two( a bigserial, aid r.fooint, cd timestamp not null);
create table three( a bigserial, aid r.fooint, cd timestamp not null);
create table four( a bigserial, aid r.fooint, cd timestamp not null);
create table five( a bigserial, aid r.fooint, cd timestamp not null);

create unique index one_aid on one(aid);
create unique index two_aid on two(aid);
create unique index three_aid on three(aid);
create unique index four_aid on four(aid);
create unique index five_aid on five(aid);


create table numbers( something int default random()*666, aid_foo r.fooint);

create unique index numbers_aid on numbers(aid_foo);

insert into one(a, cd) select generate_series(1,2000000), now();
insert into two(a, cd) select generate_series(1,200000), now();
insert into three(a, cd) select generate_series(1,2200000), now();
insert into four(a, cd) select generate_series(1,2200000), now();
insert into five(a, cd) select generate_series(1,2200000), now();

insert into numbers(something) select generate_series(1,870000);

alter table one inherit skel;
alter table two inherit skel;
alter table three inherit skel;
alter table four inherit skel;
alter table five inherit skel;


But no matter how many tables I throw in ( and I got to 20 ) - it will
always do it right:
gjaskie=# explain  select aid from (select aid from skel union all
select aid_foo as aid from numbers) AS foo where  aid > 999000;
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Result  (cost=0.00..178034.88 rows=8661268 width=8)
   ->  Append  (cost=0.00..178034.88 rows=8661268 width=8)
         ->  Seq Scan on skel  (cost=0.00..32.12 rows=590 width=8)
               Filter: ((aid)::bigint > 999000)
         ->  Index Scan using one_aid on one skel
(cost=0.00..34549.76 rows=991445 width=8)
               Index Cond: ((aid)::bigint > 999000)
         ->  Seq Scan on two skel  (cost=0.00..3774.00 rows=199980 width=8)
               Filter: ((aid)::bigint > 999000)
         ->  Seq Scan on three skel  (cost=0.00..41513.00 rows=2199780 width=8)
               Filter: ((aid)::bigint > 999000)
         ->  Seq Scan on four skel  (cost=0.00..41513.00 rows=2199780 width=8)
               Filter: ((aid)::bigint > 999000)
         ->  Seq Scan on five skel  (cost=0.00..41513.00 rows=2199780 width=8)
               Filter: ((aid)::bigint > 999000)
         ->  Seq Scan on numbers  (cost=0.00..15140.00 rows=869913 width=8)
               Filter: ((aid_foo)::bigint > 999000)
(16 rows)

Time: 36.326 ms



But, if I add another union, it screws it up:

gjaskie=# explain  select aid from (select aid from skel union all
select aid_foo as aid from numbers union all select 1 aid) AS foo
where  aid > 999000;
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Subquery Scan foo  (cost=0.00..374659.56 rows=3223924 width=8)
   Filter: (foo.aid > 999000)
   ->  Append  (cost=0.00..253762.42 rows=9671771 width=8)
         ->  Result  (cost=0.00..253762.40 rows=9671770 width=8)
               ->  Append  (cost=0.00..253762.40 rows=9671770 width=8)
                     ->  Result  (cost=0.00..144079.70 rows=8801770 width=8)
                           ->  Append  (cost=0.00..144079.70
rows=8801770 width=8)
                                 ->  Seq Scan on skel
(cost=0.00..27.70 rows=1770 width=8)
                                 ->  Seq Scan on one skel
(cost=0.00..32739.00 rows=2000000 width=8)
                                 ->  Seq Scan on two skel
(cost=0.00..3274.00 rows=200000 width=8)
                                 ->  Seq Scan on three skel
(cost=0.00..36013.00 rows=2200000 width=8)
                                 ->  Seq Scan on four skel
(cost=0.00..36013.00 rows=2200000 width=8)
                                 ->  Seq Scan on five skel
(cost=0.00..36013.00 rows=2200000 width=8)
                     ->  Seq Scan on numbers  (cost=0.00..12965.00
rows=870000 width=8)
         ->  Subquery Scan "*SELECT* 3"  (cost=0.00..0.02 rows=1 width=0)
               ->  Result  (cost=0.00..0.01 rows=1 width=0)
(16 rows)

Time: 1.502 ms

now the question is, how my test db's query:
  select aid from (select aid from skel union all select aid_foo as
aid from numbers union all select 1 aid) AS foo where  aid > 999000;

differ from original:
  select updateid from ( select updateid from r.skel union all select
updateid from r.history ) as foo where  updateid > 1232634919168805;


Oh, and the value N doesn't change the plan here either :/

tested on both 8.3 and 8.4, same results..
ideas welcomed

--
GJ

pgsql-performance by date:

Previous
From: Scott Carey
Date:
Subject: Re: suggestions for postgresql setup on Dell 2950 , PERC6i controller
Next
From: Arjen van der Meijden
Date:
Subject: Re: suggestions for postgresql setup on Dell 2950 , PERC6i controller