Re: Data Warehouse Reevaluation - MySQL vs Postgres -- - Mailing list pgsql-performance

From Joe Conway
Subject Re: Data Warehouse Reevaluation - MySQL vs Postgres --
Date
Msg-id 4149B2FF.6030808@joeconway.com
Whole thread Raw
In response to Re: Data Warehouse Reevaluation - MySQL vs Postgres --  ("Iain" <iain@mst.co.jp>)
List pgsql-performance
Iain wrote:
> Joe's example wasn't excluding partions, as he didn't use a predicated UNION
> ALL view to select from. His queries use an indexed column that allow the
> various partitions to be probed at low cost, and he was satisfied wth that.

Right.

> My point in my previous post was that you could still do all that that if
> you wanted to, by building the predicated view with UNION ALL of each of the
> child tables.

Right. It doesn't look that much different:

create or replace view foo_vw as
select * from foo_2004_01 where f2 >= '2004-jan-01' and f2 <= '2004-jan-31'
union all
select * from foo_2004_02 where f2 >= '2004-feb-01' and f2 <= '2004-feb-29'
union all
select * from foo_2004_03 where f2 >= '2004-mar-01' and f2 <= '2004-mar-31'
;

  -- needed just for illustration since these are toy tables
set enable_seqscan to false;

explain analyze select * from foo_vw where f2 = '2004-feb-15';
                                      QUERY PLAN
----------------------------------------------------------------------------------
  Subquery Scan foo_vw  (cost=0.00..14.54 rows=3 width=16) (actual
time=0.022..0.027 rows=1 loops=1)
    ->  Append  (cost=0.00..14.51 rows=3 width=16) (actual
time=0.019..0.022 rows=1 loops=1)
          ->  Subquery Scan "*SELECT* 1"  (cost=0.00..4.84 rows=1
width=16) (actual time=0.004..0.004 rows=0 loops=1)
                ->  Index Scan using foo_2004_01_idx2 on foo_2004_01
(cost=0.00..4.83 rows=1 width=16) (actual time=0.003..0.003 rows=0 loops=1)
                      Index Cond: ((f2 >= '2004-01-01'::date) AND (f2 <=
'2004-01-31'::date) AND (f2 = '2004-02-15'::date))
          ->  Subquery Scan "*SELECT* 2"  (cost=0.00..4.84 rows=1
width=16) (actual time=0.013..0.015 rows=1 loops=1)
                ->  Index Scan using foo_2004_02_idx2 on foo_2004_02
(cost=0.00..4.83 rows=1 width=16) (actual time=0.009..0.010 rows=1 loops=1)
                      Index Cond: ((f2 >= '2004-02-01'::date) AND (f2 <=
'2004-02-29'::date) AND (f2 = '2004-02-15'::date))
          ->  Subquery Scan "*SELECT* 3"  (cost=0.00..4.84 rows=1
width=16) (actual time=0.001..0.001 rows=0 loops=1)
                ->  Index Scan using foo_2004_03_idx2 on foo_2004_03
(cost=0.00..4.83 rows=1 width=16) (actual time=0.001..0.001 rows=0 loops=1)
                      Index Cond: ((f2 >= '2004-03-01'::date) AND (f2 <=
'2004-03-31'::date) AND (f2 = '2004-02-15'::date))
  Total runtime: 0.188 ms
(12 rows)

regression=# explain analyze select * from foo where f2 = '2004-feb-15';
                                      QUERY PLAN
----------------------------------------------------------------------------------
  Result  (cost=100000000.00..100000073.70 rows=20 width=16) (actual
time=0.059..0.091 rows=1 loops=1)
    ->  Append  (cost=100000000.00..100000073.70 rows=20 width=16)
(actual time=0.055..0.086 rows=1 loops=1)
          ->  Seq Scan on foo  (cost=100000000.00..100000022.50 rows=5
width=16) (actual time=0.001..0.001 rows=0 loops=1)
                Filter: (f2 = '2004-02-15'::date)
          ->  Index Scan using foo_2004_01_idx2 on foo_2004_01 foo
(cost=0.00..17.07 rows=5 width=16) (actual time=0.045..0.045 rows=0 loops=1)
                Index Cond: (f2 = '2004-02-15'::date)
          ->  Index Scan using foo_2004_02_idx2 on foo_2004_02 foo
(cost=0.00..17.07 rows=5 width=16) (actual time=0.008..0.009 rows=1 loops=1)
                Index Cond: (f2 = '2004-02-15'::date)
          ->  Index Scan using foo_2004_03_idx2 on foo_2004_03 foo
(cost=0.00..17.07 rows=5 width=16) (actual time=0.029..0.029 rows=0 loops=1)
                Index Cond: (f2 = '2004-02-15'::date)
  Total runtime: 0.191 ms
(11 rows)


The main difference being that the view needs to be recreated every time
a table is added or dropped, whereas with the inherited tables method
that isn't needed.

Joe

pgsql-performance by date:

Previous
From: Markus Schaber
Date:
Subject: Re: Data Warehouse Reevaluation - MySQL vs Postgres --
Next
From: Daniel Ceregatti
Date:
Subject: Re: Comparing user attributes with bitwise operators