Query performance problems with partitioned tables - Mailing list pgsql-performance
From | Andreas Haumer |
---|---|
Subject | Query performance problems with partitioned tables |
Date | |
Msg-id | 4635D678.3050205@xss.co.at Whole thread Raw |
Responses |
Re: Query performance problems with partitioned tables
Re: Query performance problems with partitioned tables Re: Query performance problems with partitioned tables |
List | pgsql-performance |
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi! I'm currently experimenting with PostgreSQL 8.2.4 and table partitioning in order to improve the performance of an application I'm working on. My application is about managing measurement values (lots of!) I have one table "t_mv" which stores all the measurement values. A single measurement value has a timestamp and belongs to a single time series, so table "t_mv" looks like this: CREATE TABLE t_mv ( zr integer NOT NULL, -- the time series id ts timestamp with time zone NOT NULL, -- the timestamp ... -- other attributes of a mv ) WITHOUT OIDS; ALTER TABLE t_mv ADD CONSTRAINT pk_mv_zr_ts PRIMARY KEY (zr, ts); Each time series defines several other attributes which are common to all measurement values of this time series (like sampling location, physical parameter, aggregation, cardinality, type, visibility, etc.) The application should be able to handle several thousand different time series and hundreds of millions of measurement values, so table t_mv can get quite large. I have tested installations with up to 70 millions rows in t_mv and PostgreSQL can handle that with a quite good performance even on non high-end machines (operating system is Linux, btw) But as I expect installations witch much more rows in t_mv, I tried to implement a "partitioned tables" concept using inheritance and CHECK constraints, just like it is described in the docs (e.g. chapter 5.9 in the current PostgreSQL 8.2.4 documentation) I split the t_mv table on the timestamp attribute to build child tables which hold all measurement values for a single month. That way I have several tables called "t_mv_YYYYMM" which all inherit from "t_mv". The number of child tables depends on the time period the application has to store the measurement values (which can be several years so I'm expecting up to 100 child tables or even more). For the application everything looks the same: inserts, updates and queries all are against the "t_mv" parent table, the application is not aware of the fact that this table is actually "split" into several child tables. This is working fine and for some standard queries it actually gives some performance improvement compared to the standard "everything in one big table" concept. The performance improvement increases with the number of rows in t_mv, for a small table (less than 10 million rows or so) IMHO it is not really worth the effort or even counter-productive. But I have some special queries where the performance with partitioned tables actually get much worse: those are queries where I'm working with "open" time intervals, i.e. where I want to get the previous and/or next timestamp from a given interval. A simple example: Get the timestamp of a measurement value for time series 3622 which is right before the measurement value with time stamp '2007-04-22 00:00:00': testdb_std=> select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit 1; ts - ------------------------ 2007-04-21 23:00:00+02 (1 row) Im my application there are many queries like this. Such queries also come in several variations, including quite sophisticated joins with lots of other tables "above" the time series table. Note: as I'm working with (potentially) non-equidistant time series I can not just calculate the timestamps, I have to retrieve them from the database! In the standard case, the query plan for the example query looks like this: testdb_std=> explain analyze select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit1; QUERY PLAN - ----------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..1.70 rows=1 width=8) (actual time=0.233..0.235 rows=1 loops=1) -> Index Scan Backward using pk_mv_zr_ts on t_mv (cost=0.00..21068.91 rows=12399 width=8) (actual time=0.221..0.221rows=1 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestamp withtime zone)) Total runtime: 0.266 ms (4 rows) If I switch to partitioned tables, the query retrieves the same result (of course): testdb_std=> \c testdb_part You are now connected to database "testdb_part". testdb_part=> select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit 1; ts - ------------------------ 2007-04-21 23:00:00+02 (1 row) But the query plan becomes: testdb_part=> explain analyze select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit1; QUERY PLAN - --------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=23985.83..23985.83 rows=1 width=8) (actual time=230.100..230.102 rows=1 loops=1) -> Sort (cost=23985.83..24019.84 rows=13605 width=8) (actual time=230.095..230.095 rows=1 loops=1) Sort Key: mwdb.t_mv.ts -> Result (cost=0.00..23051.72 rows=13605 width=8) (actual time=0.154..177.519 rows=15810 loops=1) -> Append (cost=0.00..23051.72 rows=13605 width=8) (actual time=0.149..114.186 rows=15810 loops=1) -> Index Scan using pk_mv_zr_ts on t_mv (cost=0.00..8.27 rows=1 width=8) (actual time=0.047..0.047rows=0 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200507 on t_mv_200507 t_mv (cost=0.00..2417.53 rows=1519 width=8) (actualtime=0.095..2.419 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200508 on t_mv_200508 t_mv (cost=0.00..918.81 rows=539 width=8) (actualtime=0.081..2.134 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200509 on t_mv_200509 t_mv (cost=0.00..941.88 rows=555 width=8) (actualtime=0.061..2.051 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200510 on t_mv_200510 t_mv (cost=0.00..915.29 rows=538 width=8) (actualtime=0.064..2.113 rows=715 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200511 on t_mv_200511 t_mv (cost=0.00..925.93 rows=545 width=8) (actualtime=0.048..2.986 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200512 on t_mv_200512 t_mv (cost=0.00..936.53 rows=550 width=8) (actualtime=0.049..2.212 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200601 on t_mv_200601 t_mv (cost=0.00..981.42 rows=579 width=8) (actualtime=0.065..3.029 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200602 on t_mv_200602 t_mv (cost=0.00..856.25 rows=502 width=8) (actualtime=0.045..2.866 rows=672 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200603 on t_mv_200603 t_mv (cost=0.00..977.84 rows=575 width=8) (actualtime=0.052..3.044 rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200604 on t_mv_200604 t_mv (cost=0.00..906.40 rows=531 width=8) (actualtime=0.053..1.976 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200605 on t_mv_200605 t_mv (cost=0.00..938.28 rows=550 width=8) (actualtime=0.050..2.357 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200606 on t_mv_200606 t_mv (cost=0.00..922.35 rows=541 width=8) (actualtime=0.054..2.063 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200607 on t_mv_200607 t_mv (cost=0.00..2112.64 rows=1315 width=8) (actualtime=0.047..2.226 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200608 on t_mv_200608 t_mv (cost=0.00..990.23 rows=582 width=8) (actualtime=0.048..2.094 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200609 on t_mv_200609 t_mv (cost=0.00..902.84 rows=528 width=8) (actualtime=0.039..2.252 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200610 on t_mv_200610 t_mv (cost=0.00..964.87 rows=567 width=8) (actualtime=0.033..2.118 rows=745 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200611 on t_mv_200611 t_mv (cost=0.00..947.17 rows=557 width=8) (actualtime=0.060..2.160 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200612 on t_mv_200612 t_mv (cost=0.00..929.43 rows=545 width=8) (actualtime=0.039..2.051 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200701 on t_mv_200701 t_mv (cost=0.00..940.05 rows=551 width=8) (actualtime=0.036..2.217 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200702 on t_mv_200702 t_mv (cost=0.00..847.38 rows=496 width=8) (actualtime=0.035..1.830 rows=672 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200703 on t_mv_200703 t_mv (cost=0.00..956.00 rows=561 width=8) (actualtime=0.062..2.326 rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200704 on t_mv_200704 t_mv (cost=0.00..814.38 rows=378 width=8) (actualtime=0.050..1.406 rows=504 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) Total runtime: 231.730 ms (52 rows) Oops! Compare the costs or the actual query time between those queries! (Note: I set "constraint_exclusion = on", of course!) As such queries are used all over the application, this nullifies any performance improvements for standard queries and in fact makes the overall application performance as "feeled" by the user _much_ worse. I also tried it with "min()" and "max()" aggregate functions instead of the "limit 1" query, but this does not change much: Standard "big" table: testdb_std=> select max(ts) from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' ; max - ------------------------ 2007-04-21 23:00:00+02 (1 row) testdb_std=> explain analyze select max(ts) from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' ; QUERY PLAN - ------------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=1.70..1.71 rows=1 width=0) (actual time=0.071..0.073 rows=1 loops=1) InitPlan -> Limit (cost=0.00..1.70 rows=1 width=8) (actual time=0.060..0.062 rows=1 loops=1) -> Index Scan Backward using pk_mv_zr_ts on t_mv (cost=0.00..21068.91 rows=12399 width=8) (actual time=0.056..0.056rows=1 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) Filter: ((ts)::timestamp with time zone IS NOT NULL) Total runtime: 0.221 ms (7 rows) "Partitioned table": testdb_part=> select max(ts) from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' ; max - ------------------------ 2007-04-21 23:00:00+02 (1 row) testdb_part=> explain analyze select max(ts) from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' ; QUERY PLAN - ---------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=23085.73..23085.74 rows=1 width=8) (actual time=390.094..390.096 rows=1 loops=1) -> Append (cost=0.00..23051.72 rows=13605 width=8) (actual time=0.241..290.934 rows=15810 loops=1) -> Index Scan using pk_mv_zr_ts on t_mv (cost=0.00..8.27 rows=1 width=8) (actual time=0.038..0.038 rows=0 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200507 on t_mv_200507 t_mv (cost=0.00..2417.53 rows=1519 width=8) (actual time=0.197..12.598rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200508 on t_mv_200508 t_mv (cost=0.00..918.81 rows=539 width=8) (actual time=0.095..5.947rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200509 on t_mv_200509 t_mv (cost=0.00..941.88 rows=555 width=8) (actual time=0.118..2.247rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200510 on t_mv_200510 t_mv (cost=0.00..915.29 rows=538 width=8) (actual time=0.121..6.219rows=715 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200511 on t_mv_200511 t_mv (cost=0.00..925.93 rows=545 width=8) (actual time=2.287..9.991rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200512 on t_mv_200512 t_mv (cost=0.00..936.53 rows=550 width=8) (actual time=0.110..2.285rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200601 on t_mv_200601 t_mv (cost=0.00..981.42 rows=579 width=8) (actual time=0.209..4.682rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200602 on t_mv_200602 t_mv (cost=0.00..856.25 rows=502 width=8) (actual time=0.079..6.079rows=672 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200603 on t_mv_200603 t_mv (cost=0.00..977.84 rows=575 width=8) (actual time=0.091..4.793rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200604 on t_mv_200604 t_mv (cost=0.00..906.40 rows=531 width=8) (actual time=0.108..7.637rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200605 on t_mv_200605 t_mv (cost=0.00..938.28 rows=550 width=8) (actual time=0.116..4.772rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200606 on t_mv_200606 t_mv (cost=0.00..922.35 rows=541 width=8) (actual time=0.074..6.071rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200607 on t_mv_200607 t_mv (cost=0.00..2112.64 rows=1315 width=8) (actual time=0.082..4.807rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200608 on t_mv_200608 t_mv (cost=0.00..990.23 rows=582 width=8) (actual time=2.283..8.671rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200609 on t_mv_200609 t_mv (cost=0.00..902.84 rows=528 width=8) (actual time=0.107..6.067rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200610 on t_mv_200610 t_mv (cost=0.00..964.87 rows=567 width=8) (actual time=0.074..3.933rows=745 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200611 on t_mv_200611 t_mv (cost=0.00..947.17 rows=557 width=8) (actual time=0.091..6.291rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200612 on t_mv_200612 t_mv (cost=0.00..929.43 rows=545 width=8) (actual time=0.077..4.101rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200701 on t_mv_200701 t_mv (cost=0.00..940.05 rows=551 width=8) (actual time=0.077..2.558rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200702 on t_mv_200702 t_mv (cost=0.00..847.38 rows=496 width=8) (actual time=0.073..4.346rows=672 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200703 on t_mv_200703 t_mv (cost=0.00..956.00 rows=561 width=8) (actual time=2.532..7.206rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200704 on t_mv_200704 t_mv (cost=0.00..814.38 rows=378 width=8) (actual time=0.120..4.163rows=504 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) Total runtime: 394.384 ms (49 rows) Now my question is: Does the query planner in the case of partitioned tables really have to scan all indexes in order to get the next timestamp smaller (or larger) than a given one? There are check conditions on all table partitions like this: For table t_mv_200704: CHECK (ts::timestamp with time zone >= '2007-04-01 00:00:00+02'::timestamp with time zone AND ts::timestamp with time zone < '2007-05-01 00:00:00+02'::timestamp with time zone) For table t_mv_200703: CHECK (ts::timestamp with time zone >= '2007-03-01 00:00:00+01'::timestamp with time zone AND ts::timestamp with time zone < '2007-04-01 00:00:00+02'::timestamp with time zone) and so on... So the tables are in a well defined, monotonic sort order regarding the timestamp. This means that if there is a max(ts) for ts < '2007-04-22 00:00:00' already in table t_mv_200704, it makes no sense to look further in other tables where the timestamps can only be smaller than the timestamp already found. Am I correct? Is there room for improvements of the query planner for queries like this or is this a special case which will never get handled anyway? Or would you suggest a completely different table structure or perhaps some other query? I'm open for any suggestion! - - andreas - -- Andreas Haumer | mailto:andreas@xss.co.at *x Software + Systeme | http://www.xss.co.at/ Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0 A-1100 Vienna, Austria | Fax: +43-1-6060114-71 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGNdZ2xJmyeGcXPhERAsbfAJ9nA+z50uXiV4SHntt1Y9IuZ/rzWwCff8ar xKSMfzwgjx9kQipeDoEnXWE= =57aJ -----END PGP SIGNATURE-----
pgsql-performance by date: