Thread: Query performance problems with partitioned tables
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi! I'm currently experimenting with PostgreSQL 8.2.4 and table partitioning in order to improve the performance of an application I'm working on. My application is about managing measurement values (lots of!) I have one table "t_mv" which stores all the measurement values. A single measurement value has a timestamp and belongs to a single time series, so table "t_mv" looks like this: CREATE TABLE t_mv ( zr integer NOT NULL, -- the time series id ts timestamp with time zone NOT NULL, -- the timestamp ... -- other attributes of a mv ) WITHOUT OIDS; ALTER TABLE t_mv ADD CONSTRAINT pk_mv_zr_ts PRIMARY KEY (zr, ts); Each time series defines several other attributes which are common to all measurement values of this time series (like sampling location, physical parameter, aggregation, cardinality, type, visibility, etc.) The application should be able to handle several thousand different time series and hundreds of millions of measurement values, so table t_mv can get quite large. I have tested installations with up to 70 millions rows in t_mv and PostgreSQL can handle that with a quite good performance even on non high-end machines (operating system is Linux, btw) But as I expect installations witch much more rows in t_mv, I tried to implement a "partitioned tables" concept using inheritance and CHECK constraints, just like it is described in the docs (e.g. chapter 5.9 in the current PostgreSQL 8.2.4 documentation) I split the t_mv table on the timestamp attribute to build child tables which hold all measurement values for a single month. That way I have several tables called "t_mv_YYYYMM" which all inherit from "t_mv". The number of child tables depends on the time period the application has to store the measurement values (which can be several years so I'm expecting up to 100 child tables or even more). For the application everything looks the same: inserts, updates and queries all are against the "t_mv" parent table, the application is not aware of the fact that this table is actually "split" into several child tables. This is working fine and for some standard queries it actually gives some performance improvement compared to the standard "everything in one big table" concept. The performance improvement increases with the number of rows in t_mv, for a small table (less than 10 million rows or so) IMHO it is not really worth the effort or even counter-productive. But I have some special queries where the performance with partitioned tables actually get much worse: those are queries where I'm working with "open" time intervals, i.e. where I want to get the previous and/or next timestamp from a given interval. A simple example: Get the timestamp of a measurement value for time series 3622 which is right before the measurement value with time stamp '2007-04-22 00:00:00': testdb_std=> select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit 1; ts - ------------------------ 2007-04-21 23:00:00+02 (1 row) Im my application there are many queries like this. Such queries also come in several variations, including quite sophisticated joins with lots of other tables "above" the time series table. Note: as I'm working with (potentially) non-equidistant time series I can not just calculate the timestamps, I have to retrieve them from the database! In the standard case, the query plan for the example query looks like this: testdb_std=> explain analyze select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit1; QUERY PLAN - ----------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..1.70 rows=1 width=8) (actual time=0.233..0.235 rows=1 loops=1) -> Index Scan Backward using pk_mv_zr_ts on t_mv (cost=0.00..21068.91 rows=12399 width=8) (actual time=0.221..0.221rows=1 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestamp withtime zone)) Total runtime: 0.266 ms (4 rows) If I switch to partitioned tables, the query retrieves the same result (of course): testdb_std=> \c testdb_part You are now connected to database "testdb_part". testdb_part=> select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit 1; ts - ------------------------ 2007-04-21 23:00:00+02 (1 row) But the query plan becomes: testdb_part=> explain analyze select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit1; QUERY PLAN - --------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=23985.83..23985.83 rows=1 width=8) (actual time=230.100..230.102 rows=1 loops=1) -> Sort (cost=23985.83..24019.84 rows=13605 width=8) (actual time=230.095..230.095 rows=1 loops=1) Sort Key: mwdb.t_mv.ts -> Result (cost=0.00..23051.72 rows=13605 width=8) (actual time=0.154..177.519 rows=15810 loops=1) -> Append (cost=0.00..23051.72 rows=13605 width=8) (actual time=0.149..114.186 rows=15810 loops=1) -> Index Scan using pk_mv_zr_ts on t_mv (cost=0.00..8.27 rows=1 width=8) (actual time=0.047..0.047rows=0 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200507 on t_mv_200507 t_mv (cost=0.00..2417.53 rows=1519 width=8) (actualtime=0.095..2.419 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200508 on t_mv_200508 t_mv (cost=0.00..918.81 rows=539 width=8) (actualtime=0.081..2.134 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200509 on t_mv_200509 t_mv (cost=0.00..941.88 rows=555 width=8) (actualtime=0.061..2.051 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200510 on t_mv_200510 t_mv (cost=0.00..915.29 rows=538 width=8) (actualtime=0.064..2.113 rows=715 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200511 on t_mv_200511 t_mv (cost=0.00..925.93 rows=545 width=8) (actualtime=0.048..2.986 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200512 on t_mv_200512 t_mv (cost=0.00..936.53 rows=550 width=8) (actualtime=0.049..2.212 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200601 on t_mv_200601 t_mv (cost=0.00..981.42 rows=579 width=8) (actualtime=0.065..3.029 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200602 on t_mv_200602 t_mv (cost=0.00..856.25 rows=502 width=8) (actualtime=0.045..2.866 rows=672 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200603 on t_mv_200603 t_mv (cost=0.00..977.84 rows=575 width=8) (actualtime=0.052..3.044 rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200604 on t_mv_200604 t_mv (cost=0.00..906.40 rows=531 width=8) (actualtime=0.053..1.976 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200605 on t_mv_200605 t_mv (cost=0.00..938.28 rows=550 width=8) (actualtime=0.050..2.357 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200606 on t_mv_200606 t_mv (cost=0.00..922.35 rows=541 width=8) (actualtime=0.054..2.063 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200607 on t_mv_200607 t_mv (cost=0.00..2112.64 rows=1315 width=8) (actualtime=0.047..2.226 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200608 on t_mv_200608 t_mv (cost=0.00..990.23 rows=582 width=8) (actualtime=0.048..2.094 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200609 on t_mv_200609 t_mv (cost=0.00..902.84 rows=528 width=8) (actualtime=0.039..2.252 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200610 on t_mv_200610 t_mv (cost=0.00..964.87 rows=567 width=8) (actualtime=0.033..2.118 rows=745 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200611 on t_mv_200611 t_mv (cost=0.00..947.17 rows=557 width=8) (actualtime=0.060..2.160 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200612 on t_mv_200612 t_mv (cost=0.00..929.43 rows=545 width=8) (actualtime=0.039..2.051 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200701 on t_mv_200701 t_mv (cost=0.00..940.05 rows=551 width=8) (actualtime=0.036..2.217 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200702 on t_mv_200702 t_mv (cost=0.00..847.38 rows=496 width=8) (actualtime=0.035..1.830 rows=672 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200703 on t_mv_200703 t_mv (cost=0.00..956.00 rows=561 width=8) (actualtime=0.062..2.326 rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200704 on t_mv_200704 t_mv (cost=0.00..814.38 rows=378 width=8) (actualtime=0.050..1.406 rows=504 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) Total runtime: 231.730 ms (52 rows) Oops! Compare the costs or the actual query time between those queries! (Note: I set "constraint_exclusion = on", of course!) As such queries are used all over the application, this nullifies any performance improvements for standard queries and in fact makes the overall application performance as "feeled" by the user _much_ worse. I also tried it with "min()" and "max()" aggregate functions instead of the "limit 1" query, but this does not change much: Standard "big" table: testdb_std=> select max(ts) from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' ; max - ------------------------ 2007-04-21 23:00:00+02 (1 row) testdb_std=> explain analyze select max(ts) from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' ; QUERY PLAN - ------------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=1.70..1.71 rows=1 width=0) (actual time=0.071..0.073 rows=1 loops=1) InitPlan -> Limit (cost=0.00..1.70 rows=1 width=8) (actual time=0.060..0.062 rows=1 loops=1) -> Index Scan Backward using pk_mv_zr_ts on t_mv (cost=0.00..21068.91 rows=12399 width=8) (actual time=0.056..0.056rows=1 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) Filter: ((ts)::timestamp with time zone IS NOT NULL) Total runtime: 0.221 ms (7 rows) "Partitioned table": testdb_part=> select max(ts) from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' ; max - ------------------------ 2007-04-21 23:00:00+02 (1 row) testdb_part=> explain analyze select max(ts) from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' ; QUERY PLAN - ---------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=23085.73..23085.74 rows=1 width=8) (actual time=390.094..390.096 rows=1 loops=1) -> Append (cost=0.00..23051.72 rows=13605 width=8) (actual time=0.241..290.934 rows=15810 loops=1) -> Index Scan using pk_mv_zr_ts on t_mv (cost=0.00..8.27 rows=1 width=8) (actual time=0.038..0.038 rows=0 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200507 on t_mv_200507 t_mv (cost=0.00..2417.53 rows=1519 width=8) (actual time=0.197..12.598rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200508 on t_mv_200508 t_mv (cost=0.00..918.81 rows=539 width=8) (actual time=0.095..5.947rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200509 on t_mv_200509 t_mv (cost=0.00..941.88 rows=555 width=8) (actual time=0.118..2.247rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200510 on t_mv_200510 t_mv (cost=0.00..915.29 rows=538 width=8) (actual time=0.121..6.219rows=715 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200511 on t_mv_200511 t_mv (cost=0.00..925.93 rows=545 width=8) (actual time=2.287..9.991rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200512 on t_mv_200512 t_mv (cost=0.00..936.53 rows=550 width=8) (actual time=0.110..2.285rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200601 on t_mv_200601 t_mv (cost=0.00..981.42 rows=579 width=8) (actual time=0.209..4.682rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200602 on t_mv_200602 t_mv (cost=0.00..856.25 rows=502 width=8) (actual time=0.079..6.079rows=672 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200603 on t_mv_200603 t_mv (cost=0.00..977.84 rows=575 width=8) (actual time=0.091..4.793rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200604 on t_mv_200604 t_mv (cost=0.00..906.40 rows=531 width=8) (actual time=0.108..7.637rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200605 on t_mv_200605 t_mv (cost=0.00..938.28 rows=550 width=8) (actual time=0.116..4.772rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200606 on t_mv_200606 t_mv (cost=0.00..922.35 rows=541 width=8) (actual time=0.074..6.071rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200607 on t_mv_200607 t_mv (cost=0.00..2112.64 rows=1315 width=8) (actual time=0.082..4.807rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200608 on t_mv_200608 t_mv (cost=0.00..990.23 rows=582 width=8) (actual time=2.283..8.671rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200609 on t_mv_200609 t_mv (cost=0.00..902.84 rows=528 width=8) (actual time=0.107..6.067rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200610 on t_mv_200610 t_mv (cost=0.00..964.87 rows=567 width=8) (actual time=0.074..3.933rows=745 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200611 on t_mv_200611 t_mv (cost=0.00..947.17 rows=557 width=8) (actual time=0.091..6.291rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200612 on t_mv_200612 t_mv (cost=0.00..929.43 rows=545 width=8) (actual time=0.077..4.101rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200701 on t_mv_200701 t_mv (cost=0.00..940.05 rows=551 width=8) (actual time=0.077..2.558rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200702 on t_mv_200702 t_mv (cost=0.00..847.38 rows=496 width=8) (actual time=0.073..4.346rows=672 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200703 on t_mv_200703 t_mv (cost=0.00..956.00 rows=561 width=8) (actual time=2.532..7.206rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) -> Index Scan using pk_mv_200704 on t_mv_200704 t_mv (cost=0.00..814.38 rows=378 width=8) (actual time=0.120..4.163rows=504 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) Total runtime: 394.384 ms (49 rows) Now my question is: Does the query planner in the case of partitioned tables really have to scan all indexes in order to get the next timestamp smaller (or larger) than a given one? There are check conditions on all table partitions like this: For table t_mv_200704: CHECK (ts::timestamp with time zone >= '2007-04-01 00:00:00+02'::timestamp with time zone AND ts::timestamp with time zone < '2007-05-01 00:00:00+02'::timestamp with time zone) For table t_mv_200703: CHECK (ts::timestamp with time zone >= '2007-03-01 00:00:00+01'::timestamp with time zone AND ts::timestamp with time zone < '2007-04-01 00:00:00+02'::timestamp with time zone) and so on... So the tables are in a well defined, monotonic sort order regarding the timestamp. This means that if there is a max(ts) for ts < '2007-04-22 00:00:00' already in table t_mv_200704, it makes no sense to look further in other tables where the timestamps can only be smaller than the timestamp already found. Am I correct? Is there room for improvements of the query planner for queries like this or is this a special case which will never get handled anyway? Or would you suggest a completely different table structure or perhaps some other query? I'm open for any suggestion! - - andreas - -- Andreas Haumer | mailto:andreas@xss.co.at *x Software + Systeme | http://www.xss.co.at/ Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0 A-1100 Vienna, Austria | Fax: +43-1-6060114-71 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGNdZ2xJmyeGcXPhERAsbfAJ9nA+z50uXiV4SHntt1Y9IuZ/rzWwCff8ar xKSMfzwgjx9kQipeDoEnXWE= =57aJ -----END PGP SIGNATURE-----
Andreas Haumer <andreas 'at' xss.co.at> writes: [...] > testdb_part=> explain analyze select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit1; > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=23985.83..23985.83 rows=1 width=8) (actual time=230.100..230.102 rows=1 loops=1) > -> Sort (cost=23985.83..24019.84 rows=13605 width=8) (actual time=230.095..230.095 rows=1 loops=1) > Sort Key: mwdb.t_mv.ts > -> Result (cost=0.00..23051.72 rows=13605 width=8) (actual time=0.154..177.519 rows=15810 loops=1) > -> Append (cost=0.00..23051.72 rows=13605 width=8) (actual time=0.149..114.186 rows=15810 loops=1) > -> Index Scan using pk_mv_zr_ts on t_mv (cost=0.00..8.27 rows=1 width=8) (actual time=0.047..0.047rows=0 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200507 on t_mv_200507 t_mv (cost=0.00..2417.53 rows=1519 width=8) (actualtime=0.095..2.419 rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) [...] > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200704 on t_mv_200704 t_mv (cost=0.00..814.38 rows=378 width=8) (actualtime=0.050..1.406 rows=504 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > Total runtime: 231.730 ms > (52 rows) > > Oops! > Compare the costs or the actual query time between those queries! Well, I'd say that scanning all partitions until the partition containing april 2007, when one of the query parameter is having timestamp before april 2007 but without an initial timestamp limit, looks normal :) [...] > Now my question is: Does the query planner in the case of partitioned tables > really have to scan all indexes in order to get the next timestamp smaller > (or larger) than a given one? Well, how can the planner know inside which partition the wanted row is? There might be no data, say, inside a couple of partitions in the past before finding the wanted row, in which case 3 partitions in the past must be scanned. -- Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi! Guillaume Cottenceau schrieb: > Andreas Haumer <andreas 'at' xss.co.at> writes: [...] > >> Now my question is: Does the query planner in the case of partitioned tables >> really have to scan all indexes in order to get the next timestamp smaller >> (or larger) than a given one? > > Well, how can the planner know inside which partition the wanted > row is? There might be no data, say, inside a couple of > partitions in the past before finding the wanted row, in which > case 3 partitions in the past must be scanned. > I think the planner could do the following: a) It could make a better decision in which direction to scan the partitions (depending on sort order involved in the query) b) It could stop scanning as soon as there can not be any further resulting row according to the CHECK constraints given on the tables. Currently it doesn't do this. Look at this example: testdb_part=> select ts from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00' order by ts asc limit 1; ts - ------------------------ 2006-01-01 01:00:00+01 (1 row) testdb_part=> explain analyze select ts from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00' order by ts asc limit1; QUERY PLAN - -------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=15843.41..15843.41 rows=1 width=8) (actual time=152.476..152.478 rows=1 loops=1) -> Sort (cost=15843.41..15865.39 rows=8795 width=8) (actual time=152.472..152.472 rows=1 loops=1) Sort Key: mwdb.t_mv.ts -> Result (cost=0.00..15267.23 rows=8795 width=8) (actual time=0.102..122.540 rows=11629 loops=1) -> Append (cost=0.00..15267.23 rows=8795 width=8) (actual time=0.098..76.140 rows=11629 loops=1) -> Index Scan using pk_mv_zr_ts on t_mv (cost=0.00..8.27 rows=1 width=8) (actual time=0.022..0.022rows=0 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200601 on t_mv_200601 t_mv (cost=0.00..986.73 rows=582 width=8) (actualtime=0.070..2.136 rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200602 on t_mv_200602 t_mv (cost=0.00..847.40 rows=497 width=8) (actualtime=0.066..2.063 rows=672 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200603 on t_mv_200603 t_mv (cost=0.00..961.33 rows=565 width=8) (actualtime=0.063..2.115 rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200604 on t_mv_200604 t_mv (cost=0.00..901.09 rows=528 width=8) (actualtime=0.156..2.200 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200605 on t_mv_200605 t_mv (cost=0.00..945.38 rows=555 width=8) (actualtime=0.052..2.088 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200606 on t_mv_200606 t_mv (cost=0.00..995.58 rows=587 width=8) (actualtime=0.054..1.869 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200607 on t_mv_200607 t_mv (cost=0.00..983.15 rows=578 width=8) (actualtime=0.045..1.989 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200608 on t_mv_200608 t_mv (cost=0.00..976.05 rows=573 width=8) (actualtime=0.048..1.877 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200609 on t_mv_200609 t_mv (cost=0.00..902.86 rows=529 width=8) (actualtime=0.054..2.225 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200610 on t_mv_200610 t_mv (cost=0.00..934.74 rows=548 width=8) (actualtime=0.034..2.671 rows=745 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200611 on t_mv_200611 t_mv (cost=0.00..913.50 rows=536 width=8) (actualtime=0.053..2.302 rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200612 on t_mv_200612 t_mv (cost=0.00..983.15 rows=578 width=8) (actualtime=0.059..2.449 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200701 on t_mv_200701 t_mv (cost=0.00..929.43 rows=545 width=8) (actualtime=0.034..2.035 rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200702 on t_mv_200702 t_mv (cost=0.00..863.33 rows=506 width=8) (actualtime=0.034..1.675 rows=672 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200703 on t_mv_200703 t_mv (cost=0.00..925.87 rows=542 width=8) (actualtime=0.055..2.036 rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200704 on t_mv_200704 t_mv (cost=0.00..1209.39 rows=545 width=8) (actualtime=0.061..2.296 rows=711 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) Total runtime: 153.195 ms (40 rows) Table t_mv_200601 gets scanned first, which is fine. This already gives a row matching the given WHERE clause. It makes no sense to scan the other tables, as the query asks for one row only and all the other tables have timestamps larger than all the timestamps in table t_mv_200601 (according to the CHECK constraints for the partion tables) The same would be true with the following query using an aggregate function (perhaps this is a better example for my reasoning): testdb_part=> select min(ts) from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00'; min - ------------------------ 2006-01-01 01:00:00+01 (1 row) testdb_part=> explain analyze select min(ts) from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00'; QUERY PLAN - -------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=15289.22..15289.23 rows=1 width=8) (actual time=106.735..106.737 rows=1 loops=1) -> Append (cost=0.00..15267.23 rows=8795 width=8) (actual time=0.184..78.174 rows=11629 loops=1) -> Index Scan using pk_mv_zr_ts on t_mv (cost=0.00..8.27 rows=1 width=8) (actual time=0.035..0.035 rows=0 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200601 on t_mv_200601 t_mv (cost=0.00..986.73 rows=582 width=8) (actual time=0.143..2.207rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200602 on t_mv_200602 t_mv (cost=0.00..847.40 rows=497 width=8) (actual time=0.020..1.709rows=672 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200603 on t_mv_200603 t_mv (cost=0.00..961.33 rows=565 width=8) (actual time=0.033..2.076rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200604 on t_mv_200604 t_mv (cost=0.00..901.09 rows=528 width=8) (actual time=0.027..2.039rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200605 on t_mv_200605 t_mv (cost=0.00..945.38 rows=555 width=8) (actual time=0.031..2.109rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200606 on t_mv_200606 t_mv (cost=0.00..995.58 rows=587 width=8) (actual time=0.023..2.001rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200607 on t_mv_200607 t_mv (cost=0.00..983.15 rows=578 width=8) (actual time=0.027..2.064rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200608 on t_mv_200608 t_mv (cost=0.00..976.05 rows=573 width=8) (actual time=0.030..1.932rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200609 on t_mv_200609 t_mv (cost=0.00..902.86 rows=529 width=8) (actual time=0.021..2.408rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200610 on t_mv_200610 t_mv (cost=0.00..934.74 rows=548 width=8) (actual time=0.014..2.046rows=745 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200611 on t_mv_200611 t_mv (cost=0.00..913.50 rows=536 width=8) (actual time=0.024..1.846rows=720 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200612 on t_mv_200612 t_mv (cost=0.00..983.15 rows=578 width=8) (actual time=0.019..2.556rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200701 on t_mv_200701 t_mv (cost=0.00..929.43 rows=545 width=8) (actual time=0.022..2.188rows=744 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200702 on t_mv_200702 t_mv (cost=0.00..863.33 rows=506 width=8) (actual time=0.023..2.311rows=672 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200703 on t_mv_200703 t_mv (cost=0.00..925.87 rows=542 width=8) (actual time=0.027..1.977rows=743 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) -> Index Scan using pk_mv_200704 on t_mv_200704 t_mv (cost=0.00..1209.39 rows=545 width=8) (actual time=0.022..2.084rows=711 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) Total runtime: 107.152 ms (37 rows) As soon as the query found a "min(ts)" which is larger than timestamp "2006-01-01 00:00:00" (the WHERE clause) in table t_mv_200601, it can stop scanning, as there *can not* be any timestamp smaller than the one already found in the other tables (again, according to the CHECK constraints)! Perhaps the logic to implement this is complex, but IMHO it _should_ be doable (and proofable), shouldn't it? In fact, the query planner already does partly select the tables to scan in an intelligent way, because it does not scan the tables with timestamps smaller than "2006-01-01 00:00:00", but IMHO it still scans too much tables. Comments? - - andreas - -- Andreas Haumer | mailto:andreas@xss.co.at *x Software + Systeme | http://www.xss.co.at/ Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0 A-1100 Vienna, Austria | Fax: +43-1-6060114-71 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGNe83xJmyeGcXPhERAk2fAJ98aqfKl7pQtac4HvSRr9GYbktadgCfU76J ZmMj1A3UFejvS+2JrstrTaA= =Myeo -----END PGP SIGNATURE-----
Andreas Haumer wrote: > > I think the planner could do the following: > > a) It could make a better decision in which direction to scan > the partitions (depending on sort order involved in the query) > > b) It could stop scanning as soon as there can not be any further > resulting row according to the CHECK constraints given on the tables. [snip] > Perhaps the logic to implement this is complex, but IMHO > it _should_ be doable (and proofable), shouldn't it? Ah, it might be do-able for some subset of cases, but is it cost-effective to check for in *all* cases? Don't forget the constraints and where clauses can be arbitrarily complex. -- Richard Huxton Archonet Ltd
Andreas Haumer <andreas 'at' xss.co.at> writes: > > Well, how can the planner know inside which partition the wanted > > row is? There might be no data, say, inside a couple of > > partitions in the past before finding the wanted row, in which > > case 3 partitions in the past must be scanned. > > > > I think the planner could do the following: > > a) It could make a better decision in which direction to scan > the partitions (depending on sort order involved in the query) > > b) It could stop scanning as soon as there can not be any further > resulting row according to the CHECK constraints given on the tables. About these precise points, I'll let a pg guru give an answer. > Look at this example: > > testdb_part=> select ts from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00' order by ts asc limit 1; > ts > ------------------------ > 2006-01-01 01:00:00+01 > (1 row) > > testdb_part=> explain analyze select ts from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00' order by ts asc limit1; > QUERY PLAN > -------------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=15843.41..15843.41 rows=1 width=8) (actual time=152.476..152.478 rows=1 loops=1) > -> Sort (cost=15843.41..15865.39 rows=8795 width=8) (actual time=152.472..152.472 rows=1 loops=1) > Sort Key: mwdb.t_mv.ts > -> Result (cost=0.00..15267.23 rows=8795 width=8) (actual time=0.102..122.540 rows=11629 loops=1) > -> Append (cost=0.00..15267.23 rows=8795 width=8) (actual time=0.098..76.140 rows=11629 loops=1) > -> Index Scan using pk_mv_zr_ts on t_mv (cost=0.00..8.27 rows=1 width=8) (actual time=0.022..0.022rows=0 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) [...] > -> Index Scan using pk_mv_200704 on t_mv_200704 t_mv (cost=0.00..1209.39 rows=545 width=8) (actualtime=0.061..2.296 rows=711 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestampwith time zone)) > Total runtime: 153.195 ms > (40 rows) > > > Table t_mv_200601 gets scanned first, which is fine. > > This already gives a row matching the given WHERE clause. > It makes no sense to scan the other tables, as the query > asks for one row only and all the other tables have timestamps > larger than all the timestamps in table t_mv_200601 (according > to the CHECK constraints for the partion tables) I think this is the last claimed point which is incorrect. Pg has no general guarantee the partitions actually create a disjoint set, even with the CHECK constraints. Pg can only optimize by avoiding scanning the partitions inside which no satisfactory data could be found by the CHECK constraint, but I think it's not possible (too complicated) to infer that any found row in your other partitions would not be in the final resultset because of 1. the query's resultset order 2. the limit 3. the actual conditions in the CHECK constraints (there is no direct way to see that timestamps in your 200704 partition are greater than timsteamp in your 200601 partition). I guess some sort of pg guru would be needed here to clarify things in a smart way, unlike me :) -- Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36
Just cast the value in the WHERE clause:
select ts from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00' ::TIMESTAMP order by ts asc limit 1;
This search only into the right partitioned tables if you build the rules based in the ts field.
----
Neil Peter Braggio
pbraggio@gmail.com
select ts from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00' ::TIMESTAMP order by ts asc limit 1;
This search only into the right partitioned tables if you build the rules based in the ts field.
----
Neil Peter Braggio
pbraggio@gmail.com
On 4/30/07, Richard Huxton <dev@archonet.com> wrote:
Andreas Haumer wrote:
>
> I think the planner could do the following:
>
> a) It could make a better decision in which direction to scan
> the partitions (depending on sort order involved in the query)
>
> b) It could stop scanning as soon as there can not be any further
> resulting row according to the CHECK constraints given on the tables.
[snip]
> Perhaps the logic to implement this is complex, but IMHO
> it _should_ be doable (and proofable), shouldn't it?
Ah, it might be do-able for some subset of cases, but is it
cost-effective to check for in *all* cases? Don't forget the constraints
and where clauses can be arbitrarily complex.
--
Richard Huxton
Archonet Ltd
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
Andreas Haumer <andreas@xss.co.at> writes: > A simple example: Get the timestamp of a measurement value for time > series 3622 which is right before the measurement value with time > stamp '2007-04-22 00:00:00': > testdb_std=> select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit 1; As already pointed out, this is only going to be able to exclude partitions that are strictly after the limit-time, since you have no WHERE clause that excludes anything before. Can you set a reasonable upper bound on the maximum inter-measurement time? If so, you could query something like this: select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' and ts > '2007-04-21 00:00:00' order by ts desc limit 1; If you don't have a hard limit, but do have some smarts on the client side, you could try successive queries like this with larger and larger windows until you get an answer. regards, tom lane
"Guillaume Cottenceau" <gc@mnc.ch> writes: > I think this is the last claimed point which is incorrect. Pg has > no general guarantee the partitions actually create a disjoint > set, even with the CHECK constraints. Pg can only optimize by > avoiding scanning the partitions inside which no satisfactory > data could be found by the CHECK constraint, but I think it's not > possible (too complicated) to infer that any found row in your > other partitions would not be in the final resultset because of > 1. the query's resultset order 2. the limit 3. the actual > conditions in the CHECK constraints (there is no direct way to > see that timestamps in your 200704 partition are greater than > timsteamp in your 200601 partition). I think the answer is that yes there are a number of query transformations that could be done for partitioned tables that we're not doing currently. Generally speaking we need to look at each type of plan node and figure out whether it can usefully be pushed down below the Append node and how we can determine when that can be done. So if each arm of the Append was already in order we could easily push the LIMIT inside the Append (and duplicating the work above the Append). But that wouldn't save us any work because we only generate rows from the partitions as they're needed anyways. In your example we could save work because the Sort needs all the rows before it starts. So we could sort each arm and apply the limit before passing it up to the outer Sort. That might save work in this case. But figuring out when that's less work than just sorting all of them is tricky. If the LIMIT is 1,000 and you have ten arms of 1,000 tuples each then you aren't going to save any work doing this. But if the LIMIT is 1 and you have few arms with many tuples then you could save a lot of work. Actually I happen to have been reading up on algorithms related to this this weekend. It's possible to implement a LimitUnsorted in linear-time which would fetch the first n records according to some sort key without actually sorting the records. That might make it more worthwhile. In short. Yes, there are a lot of optimizations possible around partitioned tables that we don't do either because it's not clear how to tell when they're worthwhile or because the code just isn't there yet. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi! Neil Peter Braggio schrieb: > Just cast the value in the WHERE clause: > > select ts from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00' > ::TIMESTAMP order by ts asc limit 1; > > This search only into the right partitioned tables if you build the > rules based in the ts field. > This doesn't help. A cast is not needed in this case, as the following query shows, where the query planner already is able to reduce the scan to the right tables: testdb_part=> select ts from mwdb.t_mv where zr=3622 and ts > '2005-12-31 22:00:00' and ts < '2006-01-01 02:00:00'; ts - ------------------------ 2005-12-31 23:00:00+01 2006-01-01 00:00:00+01 2006-01-01 01:00:00+01 (3 rows) testdb_part=> explain analyze select ts from mwdb.t_mv where zr=3622 and ts > '2005-12-31 22:00:00' and ts < '2006-01-0102:00:00'; QUERY PLAN - ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=0.00..26.64 rows=4 width=8) (actual time=0.040..0.088 rows=3 loops=1) -> Append (cost=0.00..26.64 rows=4 width=8) (actual time=0.035..0.071 rows=3 loops=1) -> Index Scan using i_mv_ts on t_mv (cost=0.00..8.27 rows=1 width=8) (actual time=0.010..0.010 rows=0 loops=1) Index Cond: (((ts)::timestamp with time zone > '2005-12-31 22:00:00+01'::timestamp with time zone) AND ((ts)::timestampwith time zone < '2006-01-01 02:00:00+01'::timestamp with time zone)) Filter: ((zr)::integer = 3622) -> Index Scan using pk_mv_200512 on t_mv_200512 t_mv (cost=0.00..8.30 rows=1 width=8) (actual time=0.019..0.022rows=1 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2005-12-31 22:00:00+01'::timestampwith time zone) AND ((ts)::timestamp with time zone < '2006-01-01 02:00:00+01'::timestamp with timezone)) -> Index Scan using pk_mv_200601 on t_mv_200601 t_mv (cost=0.00..10.07 rows=2 width=8) (actual time=0.014..0.019rows=2 loops=1) Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2005-12-31 22:00:00+01'::timestampwith time zone) AND ((ts)::timestamp with time zone < '2006-01-01 02:00:00+01'::timestamp with timezone)) Total runtime: 0.176 ms (10 rows) Here, two child tables are involved (t_mv_200512 and t_mv_200601) and the query only uses those two, even without cast of the constants in the where clause. - - andreas - -- Andreas Haumer | mailto:andreas@xss.co.at *x Software + Systeme | http://www.xss.co.at/ Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0 A-1100 Vienna, Austria | Fax: +43-1-6060114-71 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGNiATxJmyeGcXPhERAo23AJwPCBwvWQT/m3QRXRWqK0aECeMQ2gCbBDjA E5iZNnU41vrFBNtXzdCSmWY= =0+pC -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi! Tom Lane schrieb: [...] > As already pointed out, this is only going to be able to exclude > partitions that are strictly after the limit-time, since you have no > WHERE clause that excludes anything before. Can you set a reasonable > upper bound on the maximum inter-measurement time? If so, you could > query something like this: > > select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' > and ts > '2007-04-21 00:00:00' > order by ts desc limit 1; > That might be possible, though I'll have to check with our business logic. Those "open interval" queries usually are needed to catch the immediate neighbors for navigation purposes (e.g. the "next" and "previous" values in a list) or for drawing diagrams where the line starts somewhere left or right "outside" the diagram. > If you don't have a hard limit, but do have some smarts on the client > side, you could try successive queries like this with larger and larger > windows until you get an answer. > Well, the beauty of the "inheritance method" of course is to keep such rules out of the application... ;-) I have a DAO layer on top of Hibernate and I'd rather not touch this to put special database access logic in (especially as I plan to use partitioned tables as an option for really large installations. For small ones it looks like we don't need or want partitioned tables anyway) Perhaps I can hide this logic in some stored procedures (I already have several stored procedures to handle automatic and transparent creation of child tables on INSERTs anyway...) - - andreas - -- Andreas Haumer | mailto:andreas@xss.co.at *x Software + Systeme | http://www.xss.co.at/ Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0 A-1100 Vienna, Austria | Fax: +43-1-6060114-71 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD4DBQFGNiVvxJmyeGcXPhERAkbzAJj7HBK6tMZpb0RPD7iN6vpyc1tiAKC2heFx 7pnq02iqW2QosLd93Y03PA== =pJ7q -----END PGP SIGNATURE-----
On Mon, Apr 30, 2007 at 03:29:30PM +0200, Andreas Haumer wrote: > This already gives a row matching the given WHERE clause. > It makes no sense to scan the other tables, as the query > asks for one row only and all the other tables have timestamps > larger than all the timestamps in table t_mv_200601 (according > to the CHECK constraints for the partion tables) So for each row, it has to check all CHECK constraints to see if it has enough rows? That sounds fairly inefficient. I wonder if the planner could copy the limit down through the Append, though -- it certainly doesn't need more than one row from each partition. It sounds slightly cumbersome to try to plan such a thing, though... /* Steinar */ -- Homepage: http://www.sesse.net/
Wait, rereading the original queries I seem to have misunderstood something. The individual parts of the partitioned tables are being accessed in timestamp order. So what's missing is some way for the optimizer to know that the resulting append results will still be in order. If it knew that all the constraints were mutually exclusive and covered ascending ranges then it could avoid doing the extra sort. Hm... -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Andreas Haumer wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi! > > I'm currently experimenting with PostgreSQL 8.2.4 and table > partitioning in order to improve the performance of an > application I'm working on. > > My application is about managing measurement values (lots of!) > I have one table "t_mv" which stores all the measurement values. > A single measurement value has a timestamp and belongs > to a single time series, so table "t_mv" looks like this: > > CREATE TABLE t_mv > ( > zr integer NOT NULL, -- the time series id > ts timestamp with time zone NOT NULL, -- the timestamp > ... -- other attributes of a mv > ) > WITHOUT OIDS; > > ALTER TABLE t_mv > ADD CONSTRAINT pk_mv_zr_ts PRIMARY KEY (zr, ts); > > Each time series defines several other attributes which are common > to all measurement values of this time series (like sampling location, > physical parameter, aggregation, cardinality, type, visibility, etc.) > > The application should be able to handle several thousand > different time series and hundreds of millions of measurement > values, so table t_mv can get quite large. > > I have tested installations with up to 70 millions rows in t_mv > and PostgreSQL can handle that with a quite good performance > even on non high-end machines (operating system is Linux, btw) > > But as I expect installations witch much more rows in t_mv, I > tried to implement a "partitioned tables" concept using inheritance > and CHECK constraints, just like it is described in the docs > (e.g. chapter 5.9 in the current PostgreSQL 8.2.4 documentation) > > I split the t_mv table on the timestamp attribute to build > child tables which hold all measurement values for a single month. > That way I have several tables called "t_mv_YYYYMM" which all > inherit from "t_mv". The number of child tables depends on the > time period the application has to store the measurement values > (which can be several years so I'm expecting up to 100 child > tables or even more). > For the application everything looks the same: inserts, updates > and queries all are against the "t_mv" parent table, the application > is not aware of the fact that this table is actually "split" into > several child tables. > > This is working fine and for some standard queries it actually > gives some performance improvement compared to the standard > "everything in one big table" concept. The performance improvement > increases with the number of rows in t_mv, for a small table (less > than 10 million rows or so) IMHO it is not really worth the effort > or even counter-productive. > > But I have some special queries where the performance with > partitioned tables actually get much worse: those are queries where > I'm working with "open" time intervals, i.e. where I want to > get the previous and/or next timestamp from a given interval. > > A simple example: Get the timestamp of a measurement value for time > series 3622 which is right before the measurement value with time > stamp '2007-04-22 00:00:00': > > testdb_std=> select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit 1; > ts > - ------------------------ > 2007-04-21 23:00:00+02 > (1 row) > > > Im my application there are many queries like this. Such > queries also come in several variations, including quite > sophisticated joins with lots of other tables "above" the > time series table. > > Note: as I'm working with (potentially) non-equidistant > time series I can not just calculate the timestamps, I > have to retrieve them from the database! > > In the standard case, the query plan for the example query looks like this: > > testdb_std=> explain analyze select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit1; > QUERY PLAN > - ----------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.00..1.70 rows=1 width=8) (actual time=0.233..0.235 rows=1 loops=1) > -> Index Scan Backward using pk_mv_zr_ts on t_mv (cost=0.00..21068.91 rows=12399 width=8) (actual time=0.221..0.221rows=1 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > Total runtime: 0.266 ms > (4 rows) > > > If I switch to partitioned tables, the query retrieves the same result (of course): > > testdb_std=> \c testdb_part > You are now connected to database "testdb_part". > testdb_part=> select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit 1; > ts > - ------------------------ > 2007-04-21 23:00:00+02 > (1 row) > > > But the query plan becomes: > > testdb_part=> explain analyze select ts from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' order by ts desc limit1; > QUERY PLAN > - --------------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=23985.83..23985.83 rows=1 width=8) (actual time=230.100..230.102 rows=1 loops=1) > -> Sort (cost=23985.83..24019.84 rows=13605 width=8) (actual time=230.095..230.095 rows=1 loops=1) > Sort Key: mwdb.t_mv.ts > -> Result (cost=0.00..23051.72 rows=13605 width=8) (actual time=0.154..177.519 rows=15810 loops=1) > -> Append (cost=0.00..23051.72 rows=13605 width=8) (actual time=0.149..114.186 rows=15810 loops=1) > -> Index Scan using pk_mv_zr_ts on t_mv (cost=0.00..8.27 rows=1 width=8) (actual time=0.047..0.047rows=0 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200507 on t_mv_200507 t_mv (cost=0.00..2417.53 rows=1519 width=8) (actualtime=0.095..2.419 rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200508 on t_mv_200508 t_mv (cost=0.00..918.81 rows=539 width=8) (actualtime=0.081..2.134 rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200509 on t_mv_200509 t_mv (cost=0.00..941.88 rows=555 width=8) (actualtime=0.061..2.051 rows=720 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200510 on t_mv_200510 t_mv (cost=0.00..915.29 rows=538 width=8) (actualtime=0.064..2.113 rows=715 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200511 on t_mv_200511 t_mv (cost=0.00..925.93 rows=545 width=8) (actualtime=0.048..2.986 rows=720 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200512 on t_mv_200512 t_mv (cost=0.00..936.53 rows=550 width=8) (actualtime=0.049..2.212 rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200601 on t_mv_200601 t_mv (cost=0.00..981.42 rows=579 width=8) (actualtime=0.065..3.029 rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200602 on t_mv_200602 t_mv (cost=0.00..856.25 rows=502 width=8) (actualtime=0.045..2.866 rows=672 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200603 on t_mv_200603 t_mv (cost=0.00..977.84 rows=575 width=8) (actualtime=0.052..3.044 rows=743 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200604 on t_mv_200604 t_mv (cost=0.00..906.40 rows=531 width=8) (actualtime=0.053..1.976 rows=720 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200605 on t_mv_200605 t_mv (cost=0.00..938.28 rows=550 width=8) (actualtime=0.050..2.357 rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200606 on t_mv_200606 t_mv (cost=0.00..922.35 rows=541 width=8) (actualtime=0.054..2.063 rows=720 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200607 on t_mv_200607 t_mv (cost=0.00..2112.64 rows=1315 width=8) (actualtime=0.047..2.226 rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200608 on t_mv_200608 t_mv (cost=0.00..990.23 rows=582 width=8) (actualtime=0.048..2.094 rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200609 on t_mv_200609 t_mv (cost=0.00..902.84 rows=528 width=8) (actualtime=0.039..2.252 rows=720 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200610 on t_mv_200610 t_mv (cost=0.00..964.87 rows=567 width=8) (actualtime=0.033..2.118 rows=745 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200611 on t_mv_200611 t_mv (cost=0.00..947.17 rows=557 width=8) (actualtime=0.060..2.160 rows=720 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200612 on t_mv_200612 t_mv (cost=0.00..929.43 rows=545 width=8) (actualtime=0.039..2.051 rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200701 on t_mv_200701 t_mv (cost=0.00..940.05 rows=551 width=8) (actualtime=0.036..2.217 rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200702 on t_mv_200702 t_mv (cost=0.00..847.38 rows=496 width=8) (actualtime=0.035..1.830 rows=672 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200703 on t_mv_200703 t_mv (cost=0.00..956.00 rows=561 width=8) (actualtime=0.062..2.326 rows=743 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200704 on t_mv_200704 t_mv (cost=0.00..814.38 rows=378 width=8) (actualtime=0.050..1.406 rows=504 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > Total runtime: 231.730 ms > (52 rows) > > Oops! > Compare the costs or the actual query time between those queries! > (Note: I set "constraint_exclusion = on", of course!) > > As such queries are used all over the application, this nullifies > any performance improvements for standard queries and in fact makes > the overall application performance as "feeled" by the user _much_ > worse. > > I also tried it with "min()" and "max()" aggregate functions > instead of the "limit 1" query, but this does not change much: > > > Standard "big" table: > > testdb_std=> select max(ts) from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' ; > max > - ------------------------ > 2007-04-21 23:00:00+02 > (1 row) > > > testdb_std=> explain analyze select max(ts) from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' ; > QUERY PLAN > - ------------------------------------------------------------------------------------------------------------------------------------------------- > Result (cost=1.70..1.71 rows=1 width=0) (actual time=0.071..0.073 rows=1 loops=1) > InitPlan > -> Limit (cost=0.00..1.70 rows=1 width=8) (actual time=0.060..0.062 rows=1 loops=1) > -> Index Scan Backward using pk_mv_zr_ts on t_mv (cost=0.00..21068.91 rows=12399 width=8) (actual time=0.056..0.056rows=1 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > Filter: ((ts)::timestamp with time zone IS NOT NULL) > Total runtime: 0.221 ms > (7 rows) > > > "Partitioned table": > > testdb_part=> select max(ts) from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' ; > max > - ------------------------ > 2007-04-21 23:00:00+02 > (1 row) > > testdb_part=> explain analyze select max(ts) from mwdb.t_mv where zr=3622 and ts < '2007-04-22 00:00:00' ; > QUERY PLAN > - ---------------------------------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=23085.73..23085.74 rows=1 width=8) (actual time=390.094..390.096 rows=1 loops=1) > -> Append (cost=0.00..23051.72 rows=13605 width=8) (actual time=0.241..290.934 rows=15810 loops=1) > -> Index Scan using pk_mv_zr_ts on t_mv (cost=0.00..8.27 rows=1 width=8) (actual time=0.038..0.038 rows=0 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200507 on t_mv_200507 t_mv (cost=0.00..2417.53 rows=1519 width=8) (actual time=0.197..12.598rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200508 on t_mv_200508 t_mv (cost=0.00..918.81 rows=539 width=8) (actual time=0.095..5.947rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200509 on t_mv_200509 t_mv (cost=0.00..941.88 rows=555 width=8) (actual time=0.118..2.247rows=720 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200510 on t_mv_200510 t_mv (cost=0.00..915.29 rows=538 width=8) (actual time=0.121..6.219rows=715 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200511 on t_mv_200511 t_mv (cost=0.00..925.93 rows=545 width=8) (actual time=2.287..9.991rows=720 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200512 on t_mv_200512 t_mv (cost=0.00..936.53 rows=550 width=8) (actual time=0.110..2.285rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200601 on t_mv_200601 t_mv (cost=0.00..981.42 rows=579 width=8) (actual time=0.209..4.682rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200602 on t_mv_200602 t_mv (cost=0.00..856.25 rows=502 width=8) (actual time=0.079..6.079rows=672 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200603 on t_mv_200603 t_mv (cost=0.00..977.84 rows=575 width=8) (actual time=0.091..4.793rows=743 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200604 on t_mv_200604 t_mv (cost=0.00..906.40 rows=531 width=8) (actual time=0.108..7.637rows=720 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200605 on t_mv_200605 t_mv (cost=0.00..938.28 rows=550 width=8) (actual time=0.116..4.772rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200606 on t_mv_200606 t_mv (cost=0.00..922.35 rows=541 width=8) (actual time=0.074..6.071rows=720 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200607 on t_mv_200607 t_mv (cost=0.00..2112.64 rows=1315 width=8) (actual time=0.082..4.807rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200608 on t_mv_200608 t_mv (cost=0.00..990.23 rows=582 width=8) (actual time=2.283..8.671rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200609 on t_mv_200609 t_mv (cost=0.00..902.84 rows=528 width=8) (actual time=0.107..6.067rows=720 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200610 on t_mv_200610 t_mv (cost=0.00..964.87 rows=567 width=8) (actual time=0.074..3.933rows=745 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200611 on t_mv_200611 t_mv (cost=0.00..947.17 rows=557 width=8) (actual time=0.091..6.291rows=720 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200612 on t_mv_200612 t_mv (cost=0.00..929.43 rows=545 width=8) (actual time=0.077..4.101rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200701 on t_mv_200701 t_mv (cost=0.00..940.05 rows=551 width=8) (actual time=0.077..2.558rows=744 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200702 on t_mv_200702 t_mv (cost=0.00..847.38 rows=496 width=8) (actual time=0.073..4.346rows=672 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200703 on t_mv_200703 t_mv (cost=0.00..956.00 rows=561 width=8) (actual time=2.532..7.206rows=743 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > -> Index Scan using pk_mv_200704 on t_mv_200704 t_mv (cost=0.00..814.38 rows=378 width=8) (actual time=0.120..4.163rows=504 loops=1) > Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone < '2007-04-22 00:00:00+02'::timestampwith time zone)) > Total runtime: 394.384 ms > (49 rows) > > > Now my question is: Does the query planner in the case of partitioned tables > really have to scan all indexes in order to get the next timestamp smaller > (or larger) than a given one? > > There are check conditions on all table partitions like this: > > For table t_mv_200704: > CHECK (ts::timestamp with time zone >= '2007-04-01 00:00:00+02'::timestamp with time zone > AND ts::timestamp with time zone < '2007-05-01 00:00:00+02'::timestamp with time zone) > > For table t_mv_200703: > CHECK (ts::timestamp with time zone >= '2007-03-01 00:00:00+01'::timestamp with time zone > AND ts::timestamp with time zone < '2007-04-01 00:00:00+02'::timestamp with time zone) > > and so on... > > So the tables are in a well defined, monotonic sort order regarding the timestamp. > > This means that if there is a max(ts) for ts < '2007-04-22 00:00:00' > already in table t_mv_200704, it makes no sense to look further in > other tables where the timestamps can only be smaller than the > timestamp already found. Am I correct? > > Is there room for improvements of the query planner for queries > like this or is this a special case which will never get handled > anyway? > > Or would you suggest a completely different table structure > or perhaps some other query? > > I'm open for any suggestion! > > - - andreas > > - -- > Andreas Haumer | mailto:andreas@xss.co.at > *x Software + Systeme | http://www.xss.co.at/ > Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0 > A-1100 Vienna, Austria | Fax: +43-1-6060114-71 > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.6 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org > > iD8DBQFGNdZ2xJmyeGcXPhERAsbfAJ9nA+z50uXiV4SHntt1Y9IuZ/rzWwCff8ar > xKSMfzwgjx9kQipeDoEnXWE= > =57aJ > -----END PGP SIGNATURE----- > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match > Hello, Andreas, I too am having exactly the same issue as you do. Comparing my partitioned and plain table performance, I've found that the plain tables perform about 25% faster than partitioned table. Using 'explain select ...', I see that constraints are being used so in partitioned tables fewer rows are examined. But still partitioned tables are 25% slower, what a let down. Fei
On 5/3/07, Fei Liu <fei.liu@aepnetworks.com> wrote: > Hello, Andreas, I too am having exactly the same issue as you do. > Comparing my partitioned and plain table performance, I've found that > the plain tables perform about 25% faster than partitioned table. Using > 'explain select ...', I see that constraints are being used so in > partitioned tables fewer rows are examined. But still partitioned tables > are 25% slower, what a let down. That's a little bit harsh. The main use of partitioning is not to make the table faster but to make the maintenance easier. When constraint exclusion works well for a particular query you can get a small boost but many queries will break down in a really negative way. So, you are sacrificing flexibility for easier maintenance. You have to really be careful how you use it. The best case for partitioning is when you can logically divide up your data so that you really only have to deal with one sliver of it at a time...for joins and such. If the OP could force the constraint exclusion (maybe by hashing the timestamp down to a period and using that for where clause), his query would be fine. The problem is it's not always easy to do that. merlin
On Thu, 2007-05-03 at 21:37, Merlin Moncure wrote: > On 5/3/07, Fei Liu <fei.liu@aepnetworks.com> wrote: > > Hello, Andreas, I too am having exactly the same issue as you do. > > Comparing my partitioned and plain table performance, I've found that > > the plain tables perform about 25% faster than partitioned table. Using > > 'explain select ...', I see that constraints are being used so in > > partitioned tables fewer rows are examined. But still partitioned tables > > are 25% slower, what a let down. > > That's a little bit harsh. The main use of partitioning is not to > make the table faster but to make the maintenance easier. When > constraint exclusion works well for a particular query you can get a > small boost but many queries will break down in a really negative way. > So, you are sacrificing flexibility for easier maintenance. You have > to really be careful how you use it. > > The best case for partitioning is when you can logically divide up > your data so that you really only have to deal with one sliver of it > at a time...for joins and such. If the OP could force the constraint > exclusion (maybe by hashing the timestamp down to a period and using > that for where clause), his query would be fine. The problem is it's > not always easy to do that. Agree++ I've been testing partitioning for a zip code lookup thing that was posted here earlier, and I partitioned a 10,000,000 row set into about 400 partitions. I found that selecting a range of areas defined by x/y coordinates was faster without any indexes. The same selection with one big table and one big (x,y) index took 3 to 10 seconds typically, same select against the partitions with no indexes took 0.2 to 0.5 seconds. For that particular application, the only way to scale it was with partitioning.
On 5/4/07, Scott Marlowe <smarlowe@g2switchworks.com> wrote: > On Thu, 2007-05-03 at 21:37, Merlin Moncure wrote: > > On 5/3/07, Fei Liu <fei.liu@aepnetworks.com> wrote: > > > Hello, Andreas, I too am having exactly the same issue as you do. > > > Comparing my partitioned and plain table performance, I've found that > > > the plain tables perform about 25% faster than partitioned table. Using > > > 'explain select ...', I see that constraints are being used so in > > > partitioned tables fewer rows are examined. But still partitioned tables > > > are 25% slower, what a let down. > > > > That's a little bit harsh. The main use of partitioning is not to > > make the table faster but to make the maintenance easier. When > > constraint exclusion works well for a particular query you can get a > > small boost but many queries will break down in a really negative way. > > So, you are sacrificing flexibility for easier maintenance. You have > > to really be careful how you use it. > > > > The best case for partitioning is when you can logically divide up > > your data so that you really only have to deal with one sliver of it > > at a time...for joins and such. If the OP could force the constraint > > exclusion (maybe by hashing the timestamp down to a period and using > > that for where clause), his query would be fine. The problem is it's > > not always easy to do that. > > Agree++ > > I've been testing partitioning for a zip code lookup thing that was > posted here earlier, and I partitioned a 10,000,000 row set into about > 400 partitions. I found that selecting a range of areas defined by x/y > coordinates was faster without any indexes. The same selection with one > big table and one big (x,y) index took 3 to 10 seconds typically, same > select against the partitions with no indexes took 0.2 to 0.5 seconds. I was thinking about that problem....one approach I was playing with was to normalize the 10mm table to zipcode (chopping off + 4) and then doing bounding box ops on the zipcode (using earthdistance/gist) table and also the detail table using tradictional tactics or gist. I think this would give reasonable performance without partitioning (10mm records doesn't scare me anymore!). If the records are frequently updated you may want to TP anways though do to (pre-hot) vacuum issues. merlin
Scott Marlowe wrote: > On Thu, 2007-05-03 at 21:37, Merlin Moncure wrote: > >> On 5/3/07, Fei Liu <fei.liu@aepnetworks.com> wrote: >> >>> Hello, Andreas, I too am having exactly the same issue as you do. >>> Comparing my partitioned and plain table performance, I've found that >>> the plain tables perform about 25% faster than partitioned table. Using >>> 'explain select ...', I see that constraints are being used so in >>> partitioned tables fewer rows are examined. But still partitioned tables >>> are 25% slower, what a let down. >>> >> That's a little bit harsh. The main use of partitioning is not to >> make the table faster but to make the maintenance easier. When >> constraint exclusion works well for a particular query you can get a >> small boost but many queries will break down in a really negative way. >> So, you are sacrificing flexibility for easier maintenance. You have >> to really be careful how you use it. >> >> The best case for partitioning is when you can logically divide up >> your data so that you really only have to deal with one sliver of it >> at a time...for joins and such. If the OP could force the constraint >> exclusion (maybe by hashing the timestamp down to a period and using >> that for where clause), his query would be fine. The problem is it's >> not always easy to do that. >> > > Agree++ > > I've been testing partitioning for a zip code lookup thing that was > posted here earlier, and I partitioned a 10,000,000 row set into about > 400 partitions. I found that selecting a range of areas defined by x/y > coordinates was faster without any indexes. The same selection with one > big table and one big (x,y) index took 3 to 10 seconds typically, same > select against the partitions with no indexes took 0.2 to 0.5 seconds. > > For that particular application, the only way to scale it was with > partitioning. > In my particular case, I have 2 million records uniformly split up in 40 partitions. It's ranged data varying with time, each partition has one month of data. Do you think this is a good candidate to seek performance boost with partitioned tables?
On Tue, 2007-05-08 at 13:41, Fei Liu wrote: > Scott Marlowe wrote: > > On Thu, 2007-05-03 at 21:37, Merlin Moncure wrote: > > > >> On 5/3/07, Fei Liu <fei.liu@aepnetworks.com> wrote: > >> > >>> Hello, Andreas, I too am having exactly the same issue as you do. > >>> Comparing my partitioned and plain table performance, I've found that > >>> the plain tables perform about 25% faster than partitioned table. Using > >>> 'explain select ...', I see that constraints are being used so in > >>> partitioned tables fewer rows are examined. But still partitioned tables > >>> are 25% slower, what a let down. > >>> > >> That's a little bit harsh. The main use of partitioning is not to > >> make the table faster but to make the maintenance easier. When > >> constraint exclusion works well for a particular query you can get a > >> small boost but many queries will break down in a really negative way. > >> So, you are sacrificing flexibility for easier maintenance. You have > >> to really be careful how you use it. > >> > >> The best case for partitioning is when you can logically divide up > >> your data so that you really only have to deal with one sliver of it > >> at a time...for joins and such. If the OP could force the constraint > >> exclusion (maybe by hashing the timestamp down to a period and using > >> that for where clause), his query would be fine. The problem is it's > >> not always easy to do that. > >> > > > > Agree++ > > > > I've been testing partitioning for a zip code lookup thing that was > > posted here earlier, and I partitioned a 10,000,000 row set into about > > 400 partitions. I found that selecting a range of areas defined by x/y > > coordinates was faster without any indexes. The same selection with one > > big table and one big (x,y) index took 3 to 10 seconds typically, same > > select against the partitions with no indexes took 0.2 to 0.5 seconds. > > > > For that particular application, the only way to scale it was with > > partitioning. > > > In my particular case, I have 2 million records uniformly split up in 40 > partitions. It's ranged data varying with time, each partition has one > month of data. Do you think this is a good candidate to seek performance > boost with partitioned tables? That really really really depends on your access patterns. IF you typically access them by certain date ranges, then partitioning is almost always a win. If you have enough requests that don't select a range of dates, it might wind up being slow. There are other advantages to partitioning though, such as ease of maintenance, being able to do partial backups easily, archiving old partitions, placing the more active partitions on faster storage.