Thread: query from partitions
Hi. create table device(id int); insert into device values(1); insert into device values(2); ..... insert into device values(250); create table base ( id int, data float, datatime timestamp, mode int, status int); create table base_1 ( check ( id = 1 and datatime >= DATE '2005-01-01' and datatime < DATE '2006-01-01' ) ) INHERITS (base); create table base_2 ( check ( id = 2 and datatime >= DATE '2005-01-01' and datatime < DATE '2006-01-01' ) ) INHERITS (base); .... create table base_250 And select * from base where id in (1,2) and datatime between '2005-05-15' and '2005-05-17'; 10 seconds select * from base where id in (select id from device where id = 1 or id = 2) and datatime between '2005-05-15' and '2005-05-17'; 10 minits Why? -- mailto: alexs@analytic.mv.ru
Ключников А.С. wrote: > And > select * from base > where id in (1,2) and datatime between '2005-05-15' and '2005-05-17'; > 10 seconds > > select * from base > where id in (select id from device where id = 1 or id = 2) and > datatime between '2005-05-15' and '2005-05-17'; > 10 minits > > Why? Run EXPLAIN ANALYSE on both queries to see how the plan has changed. My guess for why the plans are different is that in the first case your query ends up as ...where (id=1 or id=2)... In the second case, the planner doesn't know what it's going to get back from the subquery until it's executed it, so can't tell it just needs to scan base_1,base_2. Result: you'll scan all child tables of base. I think the planner will occasionally evaluate constants before planning, but I don't think it will ever execute a subquery and then re-plan the outer query based on those results. Of course, someone might pop up and tell me I'm wrong now... -- Richard Huxton Archonet Ltd
On Tue, Dec 13, 2005 at 06:18:19PM +0300, Ключников А.С. wrote: > select * from base > where id in (select id from device where id = 1 or id = 2) and > datatime between '2005-05-15' and '2005-05-17'; > 10 minits That's a really odd way of saying "1 or 2". It probably has to go through all the records in device, not realizing it can just scan for two of them (using two index scans). I'd guess an EXPLAIN ANALYZE would confirm something like this happening (you'd want to run that and post the results here anyhow). /* Steinar */ -- Homepage: http://www.sesse.net/
* Richard Huxton <dev@archonet.com> [2005-12-13 15:59:11 +0000]: > Ключников А.С. wrote: > >And > >select * from base > > where id in (1,2) and datatime between '2005-05-15' and '2005-05-17'; > >10 seconds > > > >select * from base > > where id in (select id from device where id = 1 or id = 2) and > > datatime between '2005-05-15' and '2005-05-17'; > >10 minits > > > >Why? > > Run EXPLAIN ANALYSE on both queries to see how the plan has changed. explain select distinct on(id) * from base where id in (1,2) and data_type=2 and datatime < '2005-11-02' order by id, datatime desc; Unique (cost=10461.14..10527.30 rows=2342 width=38) -> Sort (cost=10461.14..10494.22 rows=13232 width=38) Sort Key: public.base.id, public.base.datatime -> Result (cost=0.00..9555.29 rows=13232 width=38) -> Append (cost=0.00..9555.29 rows=13232 width=38) -> Seq Scan on base (cost=0.00..32.60 rows=1 width=38) Filter: (((id = 1) OR (id = 2)) AND (data_type = 2) AND (datatime < '2005-11-02 00:00:00'::timestamp without time zone)) -> Seq Scan on base_batch base (cost=0.00..32.60 rows=1 width=38) ....................... -> Seq Scan on base_1_2004 base (cost=0.00..32.60 rows=1 width=38) Filter: (((id = 1) OR (id = 2)) AND (data_type = 2) AND (datatime < '2005-11-02 00:00:00'::timestamp without time zone)) (записей: 34) and explain select distinct on(id) * from base where id in (select id from device where id = 1 or id = 2) and data_type=2 and datatime < '2005-11-02' order by id, datatime desc; Unique (cost=369861.89..369872.52 rows=2126 width=38) -> Sort (cost=369861.89..369867.21 rows=2126 width=38) Sort Key: public.base.id, public.base.datatime -> Hash IN Join (cost=5.88..369744.39 rows=2126 width=38) Hash Cond: ("outer".id = "inner".id) -> Append (cost=0.00..368654.47 rows=212554 width=38) -> Seq Scan on base (cost=0.00..26.95 rows=2 width=38) Filter: ((data_type = 2) AND (datatime < '2005-11-02 00:00:00'::timestamp without time zone)) -> Seq Scan on base_batch base (cost=0.00..26.95 rows=2 width=38) Filter: ((data_type = 2) AND (datatime < '2005-11-02 00:00:00'::timestamp without time zone)) -> Seq Scan on base_lines_05_12 base (cost=0.00..26.95 rows=2 width=38) ............................ -> Hash (cost=5.88..5.88 rows=2 width=4) -> Seq Scan on device (cost=0.00..5.88 rows=2 width=4) Filter: ((id = 1) OR (id = 2)) (записей: 851) > > My guess for why the plans are different is that in the first case your > query ends up as ...where (id=1 or id=2)... > > In the second case, the planner doesn't know what it's going to get back > from the subquery until it's executed it, so can't tell it just needs to > scan base_1,base_2. Result: you'll scan all child tables of base. > > I think the planner will occasionally evaluate constants before > planning, but I don't think it will ever execute a subquery and then > re-plan the outer query based on those results. Of course, someone might > pop up and tell me I'm wrong now... > > -- > Richard Huxton > Archonet Ltd > > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match -- С уважением, Ключников А.С. Ведущий инженер ПРП "Аналитприбор" 432030 г.Ульяновск, а/я 3117 тел./факс +7 (8422) 43-44-78 mailto: alexs@analytic.mv.ru
On Tue, 2005-12-13 at 15:59 +0000, Richard Huxton wrote: > Ключников А.С. wrote: > > And > > select * from base > > where id in (1,2) and datatime between '2005-05-15' and '2005-05-17'; > > 10 seconds > > > > select * from base > > where id in (select id from device where id = 1 or id = 2) and > > datatime between '2005-05-15' and '2005-05-17'; > > 10 minits > > > > Why? > > Run EXPLAIN ANALYSE on both queries to see how the plan has changed. > > My guess for why the plans are different is that in the first case your > query ends up as ...where (id=1 or id=2)... > > In the second case, the planner doesn't know what it's going to get back > from the subquery until it's executed it, so can't tell it just needs to > scan base_1,base_2. Result: you'll scan all child tables of base. > > I think the planner will occasionally evaluate constants before > planning, but I don't think it will ever execute a subquery and then > re-plan the outer query based on those results. Of course, someone might > pop up and tell me I'm wrong now... Thats right. Partitioning doesn't work for joins in 8.1. Best Regards, Simon Riggs