Re: Query performance problems with partitioned tables

From: Andreas Haumer
Subject: Re: Query performance problems with partitioned tables
Date: ,
Msg-id: 4635EF3A.1070204@xss.co.at
(view: Whole thread, Raw)
In response to: Re: Query performance problems with partitioned tables  (Guillaume Cottenceau)
Responses: Re: Query performance problems with partitioned tables  (Richard Huxton)
Re: Query performance problems with partitioned tables  (Guillaume Cottenceau)
Re: Query performance problems with partitioned tables  ("Steinar H. Gunderson")
List: pgsql-performance

Tree view

Query performance problems with partitioned tables  (Andreas Haumer, )
 Re: Query performance problems with partitioned tables  (Guillaume Cottenceau, )
  Re: Query performance problems with partitioned tables  (Andreas Haumer, )
   Re: Query performance problems with partitioned tables  (Richard Huxton, )
    Re: Query performance problems with partitioned tables  ("Neil Peter Braggio", )
     Re: Query performance problems with partitioned tables  (Andreas Haumer, )
   Re: Query performance problems with partitioned tables  (Guillaume Cottenceau, )
    Re: Query performance problems with partitioned tables  (Gregory Stark, )
   Re: Query performance problems with partitioned tables  ("Steinar H. Gunderson", )
 Re: Query performance problems with partitioned tables  (Tom Lane, )
  Re: Query performance problems with partitioned tables  (Andreas Haumer, )
   Re: Query performance problems with partitioned tables  (Gregory Stark, )
 Re: Query performance problems with partitioned tables  (Fei Liu, )
  Re: Query performance problems with partitioned tables  ("Merlin Moncure", )
   Re: Query performance problems with partitioned tables  (Scott Marlowe, )
    Re: Query performance problems with partitioned tables  ("Merlin Moncure", )
    Re: Query performance problems with partitioned tables  (Fei Liu, )
     Re: Query performance problems with partitioned tables  (Scott Marlowe, )

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi!

Guillaume Cottenceau schrieb:
> Andreas Haumer <andreas 'at' xss.co.at> writes:
[...]
>
>> Now my question is: Does the query planner in the case of partitioned tables
>> really have to scan all indexes in order to get the next timestamp smaller
>> (or larger) than a given one?
>
> Well, how can the planner know inside which partition the wanted
> row is? There might be no data, say, inside a couple of
> partitions in the past before finding the wanted row, in which
> case 3 partitions in the past must be scanned.
>

I think the planner could do the following:

a) It could make a better decision in which direction to scan
   the partitions (depending on sort order involved in the query)

b) It could stop scanning as soon as there can not be any further
   resulting row according to the CHECK constraints given on the tables.


Currently it doesn't do this.

Look at this example:

testdb_part=> select ts from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00' order by ts asc limit 1;
           ts
- ------------------------
 2006-01-01 01:00:00+01
(1 row)

testdb_part=> explain analyze select ts from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00' order by ts asc
limit1; 
                                                                          QUERY PLAN
-
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=15843.41..15843.41 rows=1 width=8) (actual time=152.476..152.478 rows=1 loops=1)
   ->  Sort  (cost=15843.41..15865.39 rows=8795 width=8) (actual time=152.472..152.472 rows=1 loops=1)
         Sort Key: mwdb.t_mv.ts
         ->  Result  (cost=0.00..15267.23 rows=8795 width=8) (actual time=0.102..122.540 rows=11629 loops=1)
               ->  Append  (cost=0.00..15267.23 rows=8795 width=8) (actual time=0.098..76.140 rows=11629 loops=1)
                     ->  Index Scan using pk_mv_zr_ts on t_mv  (cost=0.00..8.27 rows=1 width=8) (actual
time=0.022..0.022rows=0 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200601 on t_mv_200601 t_mv  (cost=0.00..986.73 rows=582 width=8)
(actualtime=0.070..2.136 rows=743 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200602 on t_mv_200602 t_mv  (cost=0.00..847.40 rows=497 width=8)
(actualtime=0.066..2.063 rows=672 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200603 on t_mv_200603 t_mv  (cost=0.00..961.33 rows=565 width=8)
(actualtime=0.063..2.115 rows=743 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200604 on t_mv_200604 t_mv  (cost=0.00..901.09 rows=528 width=8)
(actualtime=0.156..2.200 rows=720 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200605 on t_mv_200605 t_mv  (cost=0.00..945.38 rows=555 width=8)
(actualtime=0.052..2.088 rows=744 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200606 on t_mv_200606 t_mv  (cost=0.00..995.58 rows=587 width=8)
(actualtime=0.054..1.869 rows=720 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200607 on t_mv_200607 t_mv  (cost=0.00..983.15 rows=578 width=8)
(actualtime=0.045..1.989 rows=744 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200608 on t_mv_200608 t_mv  (cost=0.00..976.05 rows=573 width=8)
(actualtime=0.048..1.877 rows=744 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200609 on t_mv_200609 t_mv  (cost=0.00..902.86 rows=529 width=8)
(actualtime=0.054..2.225 rows=720 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200610 on t_mv_200610 t_mv  (cost=0.00..934.74 rows=548 width=8)
(actualtime=0.034..2.671 rows=745 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200611 on t_mv_200611 t_mv  (cost=0.00..913.50 rows=536 width=8)
(actualtime=0.053..2.302 rows=720 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200612 on t_mv_200612 t_mv  (cost=0.00..983.15 rows=578 width=8)
(actualtime=0.059..2.449 rows=744 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200701 on t_mv_200701 t_mv  (cost=0.00..929.43 rows=545 width=8)
(actualtime=0.034..2.035 rows=744 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200702 on t_mv_200702 t_mv  (cost=0.00..863.33 rows=506 width=8)
(actualtime=0.034..1.675 rows=672 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200703 on t_mv_200703 t_mv  (cost=0.00..925.87 rows=542 width=8)
(actualtime=0.055..2.036 rows=743 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
                     ->  Index Scan using pk_mv_200704 on t_mv_200704 t_mv  (cost=0.00..1209.39 rows=545 width=8)
(actualtime=0.061..2.296 rows=711 loops=1) 
                           Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
 Total runtime: 153.195 ms
(40 rows)


Table t_mv_200601 gets scanned first, which is fine.

This already gives a row matching the given WHERE clause.
It makes no sense to scan the other tables, as the query
asks for one row only and all the other tables have timestamps
larger than all the timestamps in table t_mv_200601 (according
to the CHECK constraints for the partion tables)

The same would be true with the following query using an aggregate function
(perhaps this is a better example for my reasoning):

testdb_part=> select min(ts) from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00';
          min
- ------------------------
 2006-01-01 01:00:00+01
(1 row)

testdb_part=> explain analyze select min(ts) from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00';
                                                                    QUERY PLAN
-
--------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=15289.22..15289.23 rows=1 width=8) (actual time=106.735..106.737 rows=1 loops=1)
   ->  Append  (cost=0.00..15267.23 rows=8795 width=8) (actual time=0.184..78.174 rows=11629 loops=1)
         ->  Index Scan using pk_mv_zr_ts on t_mv  (cost=0.00..8.27 rows=1 width=8) (actual time=0.035..0.035 rows=0
loops=1)
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200601 on t_mv_200601 t_mv  (cost=0.00..986.73 rows=582 width=8) (actual
time=0.143..2.207rows=743 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200602 on t_mv_200602 t_mv  (cost=0.00..847.40 rows=497 width=8) (actual
time=0.020..1.709rows=672 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200603 on t_mv_200603 t_mv  (cost=0.00..961.33 rows=565 width=8) (actual
time=0.033..2.076rows=743 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200604 on t_mv_200604 t_mv  (cost=0.00..901.09 rows=528 width=8) (actual
time=0.027..2.039rows=720 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200605 on t_mv_200605 t_mv  (cost=0.00..945.38 rows=555 width=8) (actual
time=0.031..2.109rows=744 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200606 on t_mv_200606 t_mv  (cost=0.00..995.58 rows=587 width=8) (actual
time=0.023..2.001rows=720 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200607 on t_mv_200607 t_mv  (cost=0.00..983.15 rows=578 width=8) (actual
time=0.027..2.064rows=744 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200608 on t_mv_200608 t_mv  (cost=0.00..976.05 rows=573 width=8) (actual
time=0.030..1.932rows=744 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200609 on t_mv_200609 t_mv  (cost=0.00..902.86 rows=529 width=8) (actual
time=0.021..2.408rows=720 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200610 on t_mv_200610 t_mv  (cost=0.00..934.74 rows=548 width=8) (actual
time=0.014..2.046rows=745 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200611 on t_mv_200611 t_mv  (cost=0.00..913.50 rows=536 width=8) (actual
time=0.024..1.846rows=720 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200612 on t_mv_200612 t_mv  (cost=0.00..983.15 rows=578 width=8) (actual
time=0.019..2.556rows=744 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200701 on t_mv_200701 t_mv  (cost=0.00..929.43 rows=545 width=8) (actual
time=0.022..2.188rows=744 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200702 on t_mv_200702 t_mv  (cost=0.00..863.33 rows=506 width=8) (actual
time=0.023..2.311rows=672 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200703 on t_mv_200703 t_mv  (cost=0.00..925.87 rows=542 width=8) (actual
time=0.027..1.977rows=743 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
         ->  Index Scan using pk_mv_200704 on t_mv_200704 t_mv  (cost=0.00..1209.39 rows=545 width=8) (actual
time=0.022..2.084rows=711 loops=1) 
               Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01
00:00:00+01'::timestampwith time zone)) 
 Total runtime: 107.152 ms
(37 rows)


As soon as the query found a "min(ts)" which is larger than
timestamp "2006-01-01 00:00:00" (the WHERE clause) in table
t_mv_200601, it can stop scanning, as there *can not* be any
timestamp smaller than the one already found in the other
tables (again, according to the CHECK constraints)!

Perhaps the logic to implement this is complex, but IMHO
it _should_ be doable (and proofable), shouldn't it?

In fact, the query planner already does partly select the tables
to scan in an intelligent way, because it does not scan the tables
with timestamps smaller than "2006-01-01 00:00:00", but IMHO it
still scans too much tables.

Comments?

- - andreas

- --
Andreas Haumer                     | mailto:
*x Software + Systeme              | http://www.xss.co.at/
Karmarschgasse 51/2/20             | Tel: +43-1-6060114-0
A-1100 Vienna, Austria             | Fax: +43-1-6060114-71
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGNe83xJmyeGcXPhERAk2fAJ98aqfKl7pQtac4HvSRr9GYbktadgCfU76J
ZmMj1A3UFejvS+2JrstrTaA=
=Myeo
-----END PGP SIGNATURE-----


pgsql-performance by date:

From: Gregory Stark
Date:
Subject: Re: Query performance problems with partitioned tables
From: Josh Berkus
Date:
Subject: Re: Feature Request --- was: PostgreSQL Performance Tuning