Query causing explosion of temp space with join involving partitioning - Mailing list pgsql-bugs
From | Krzysztof Nienartowicz |
---|---|
Subject | Query causing explosion of temp space with join involving partitioning |
Date | |
Msg-id | AANLkTine_e2T4lpFfPMvneOtDyhA4u2eQCvtnMdcQb-y@mail.gmail.com Whole thread Raw |
Responses |
Re: Query causing explosion of temp space with join involving partitioning
|
List | pgsql-bugs |
Hello, Sorry for the re-post - not sure list is the relevant one, I included slightly changed query in the previous message, sent to bugs list. I have an ORM-generated queries where parent table keys are used to fetch the records from the child table (with relevant FK indicies), where child table is partitioned. My understanding is that Postgres is unable to properly use constraint exclusion to query only a relevant table? Half of the join condition is propagated down, while the other is not. table sources has pk (sureyid,srcid), ts has fk(survey_pk,source_pk) on source (sureyid,srcid) and another index with survey_pk,source_pk,tstype (not used in the query). This is very unfortunate as the queries are auto-generated and I cannot move predicate to apply it directly to partitioned table. The plan includes all the partitions, next snippet shows exclusion works for the table when condition is used directly on the partitioned table. surveys-> SELECT t1.SURVEY_PK, t1.SOURCE_PK, t1.TSTYPE, t1.VALS surveys-> FROM sources t0 ,TS t1 where surveys-> (t0.SURVEYID = 16 AND t0.SRCID >= 203510110032281 AND t0.SRCID <= 203520107001677 and t0.SURVEYID = t1.SURVEY_PK AND t0.SRCID = t1.SOURCE_PK ) ORDER BY t0.SURVEYID ASC, t0.SRCID ASC surveys-> surveys-> ; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=11575858.83..11730569.40 rows=3448336 width=60) Merge Cond: (t0.srcid = t1.source_pk) -> Index Scan using sources_pkey on sources t0 (cost=0.00..68407.63 rows=37817 width=12) Index Cond: ((surveyid = 16) AND (srcid >= 203510110032281::bigint) AND (srcid <= 203520107001677::bigint)) -> Materialize (cost=11575858.83..11618963.03 rows=3448336 width=48) -> Sort (cost=11575858.83..11584479.67 rows=3448336 width=48) Sort Key: t1.source_pk -> Append (cost=0.00..11049873.18 rows=3448336 width=48) -> Index Scan using ts_pkey on ts t1 (cost=0.00..8.27 rows=1 width=853) Index Cond: (survey_pk = 16) -> Index Scan using ts_part_bs3000l00000_ts_pkey on ts_part_bs3000l00000 t1 (cost=0.00..8.27 rows=1 width=48) Index Cond: (survey_pk = 16) -> Bitmap Heap Scan on ts_part_bs3000l00001_cg0346l00000 t1 (cost=5760.36..1481735.21 rows=462422 width=48) Recheck Cond: (survey_pk = 16) -> Bitmap Index Scan on ts_part_bs3000l00001_cg0346l00000_ts_pkey (cost=0.00..5644.75 rows=462422 width=0) Index Cond: (survey_pk = 16) -> Bitmap Heap Scan on ts_part_cg0346l00001_cg0816k00000 t1 (cost=5951.07..1565423.79 rows=488582 width=48) Recheck Cond: (survey_pk = 16) -> Bitmap Index Scan on ts_part_cg0346l00001_cg0816k00000_ts_pkey (cost=0.00..5828.93 rows=488582 width=0) Index Cond: (survey_pk = 16) -> Bitmap Heap Scan on ts_part_cg0816k00001_cg1180k00000 t1 (cost=5513.54..1432657.90 rows=447123 width=48) Recheck Cond: (survey_pk = 16) -> Bitmap Index Scan on ts_part_cg0816k00001_cg1180k00000_ts_pkey (cost=0.00..5401.75 rows=447123 width=0) Index Cond: (survey_pk = 16) -> Bitmap Heap Scan on ts_part_cg1180k00001_cg6204k00000 t1 (cost=5212.63..1329884.46 rows=415019 width=48) Recheck Cond: (survey_pk = 16) -> Bitmap Index Scan on ts_part_cg1180k00001_cg6204k00000_ts_pkey (cost=0.00..5108.87 rows=415019 width=0) Index Cond: (survey_pk = 16) -> Bitmap Heap Scan on ts_part_cg6204k00001_lm0022n00000 t1 (cost=5450.37..1371917.76 rows=428113 width=48) Recheck Cond: (survey_pk = 16) -> Bitmap Index Scan on ts_part_cg6204k00001_lm0022n00000_ts_pkey (cost=0.00..5343.35 rows=428113 width=0) Index Cond: (survey_pk = 16) -> Bitmap Heap Scan on ts_part_lm0022n00001_lm0276m00000 t1 (cost=5136.71..1298542.32 rows=405223 width=48) Recheck Cond: (survey_pk = 16) -> Bitmap Index Scan on ts_part_lm0022n00001_lm0276m00000_ts_pkey (cost=0.00..5035.40 rows=405223 width=0) Index Cond: (survey_pk = 16) -> Bitmap Heap Scan on ts_part_lm0276m00001_lm0584k00000 t1 (cost=5770.98..1525737.42 rows=476204 width=48) Recheck Cond: (survey_pk = 16) -> Bitmap Index Scan on ts_part_lm0276m00001_lm0584k00000_ts_pkey (cost=0.00..5651.93 rows=476204 width=0) Index Cond: (survey_pk = 16) -> Bitmap Heap Scan on ts_part_lm0584k00001_sm0073k00000 t1 (cost=4536.03..1043949.51 rows=325647 width=48) Recheck Cond: (survey_pk = 16) -> Bitmap Index Scan on ts_part_lm0584k00001_sm0073k00000_ts_pkey (cost=0.00..4454.62 rows=325647 width=0) Index Cond: (survey_pk = 16) -> Index Scan using ts_part_sm0073k00001_ts_pkey on ts_part_sm0073k00001 t1 (cost=0.00..8.27 rows=1 width=48) Index Cond: (survey_pk = 16) (46 rows) Check to see if the exclusion works and yes, it does. surveys=> explain SELECT t1.SURVEY_PK, t1.SOURCE_PK, t1.TSTYPE, t1.VALS surveys-> FROM TS t1 WHERE t1.SURVEY_PK =16 AND t1.SOURCE_PK>= 202970108014045 AND t1.SOURCE_PK <= 202970108014909 surveys-> ORDER BY t1.SURVEY_PK ASC, t1.SOURCE_PK ASC surveys-> ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=9454.13..9459.91 rows=2313 width=48) Sort Key: t1.source_pk -> Result (cost=0.00..9324.88 rows=2313 width=48) -> Append (cost=0.00..9324.88 rows=2313 width=48) -> Index Scan using ts_pkey on ts t1 (cost=0.00..8.27 rows=1 width=853) Index Cond: ((survey_pk = 16) AND (source_pk >= 202970108014045::bigint) AND (source_pk <= 202970108014909::bigint)) -> Index Scan using ts_part_bs3000l00001_cg0346l00000_ts_pkey on ts_part_bs3000l00001_cg0346l00000 t1 (cost=0.00..9316.61 rows=2312 width=48) Index Cond: ((survey_pk = 16) AND (source_pk >= 202970108014045::bigint) AND (source_pk <= 202970108014909::bigint)) (8 rows) Is there any workaround for that, except changing the query? Any plans to implement it? The second issue is connected to this one it looks like a bug to me: I have around 30 clients running the same query with different parameters, but the query always returns 1000 rows (boundary values are pre-calculated,so it's like traversal of the equiwidth histogram if it comes to srcid/source_pk) and the rows from parallel queries cannot be overlapping. Usually query returns within around a second. I noticed however there are some queries that hang for many hours and what is most curious some of them created several GB of temp files. The partition size is around 10M entries, there are around 10 partitions as I mentioned there is no way this query should fetch more then 1000 entries. Some queries may span multiple, adjacent partitions (but not the one above, as we can see). I tried to run this exploding query without ordering, but it did not change anything, behaviour is repeatable from the command line, if the query is divided by hand with same parameters values that are returned are ok, within a second. There are many AccessShareLock locks, all of them granted. Killing the clients that issues the queries does not change much - these are still running, DB immediate restart helps. Environment: Rocks 5.2, kernel 2.6.18, PostgreSQL 8.4.2 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42), 64-bit, client is JDBC. The queries start to fail eventually due to the lack of space (over 500GB used by temp files), i.e. one of queries hangs for hours with temp allocation like: < ls -hs1 data/base/pgsql_tmp/ 1.1G pgsql_tmp27571.0 1.1G pgsql_tmp27571.1 1.1G pgsql_tmp27571.10 1.1G pgsql_tmp27571.11 1.1G pgsql_tmp27571.12 1.1G pgsql_tmp27571.13 1.1G pgsql_tmp27571.14 1.1G pgsql_tmp27571.15 1.1G pgsql_tmp27571.16 1.1G pgsql_tmp27571.17 1.1G pgsql_tmp27571.18 1.1G pgsql_tmp27571.19 1.1G pgsql_tmp27571.2 1.0G pgsql_tmp27571.20 1.0G pgsql_tmp27571.21 1.1G pgsql_tmp27571.22 1.1G pgsql_tmp27571.23 1.1G pgsql_tmp27571.24 1.1G pgsql_tmp27571.25 1.1G pgsql_tmp27571.26 801M pgsql_tmp27571.27 1.1G pgsql_tmp27571.3 1.1G pgsql_tmp27571.4 1.1G pgsql_tmp27571.5 1.1G pgsql_tmp27571.6 1.1G pgsql_tmp27571.7 1.1G pgsql_tmp27571.8 1.1G pgsql_tmp27571.9 > The transaction does not generate any updates/deletes. Working queries (subsecond) have plan like this, condition pushed down will use the index this time. surveys=> explain SELECT t0.SURVEYID, t0.SRCID, t1.SURVEY_PK, t1.SOURCE_PK, t1.TSTYPE, t1.VALS, t1.CCDIDS, t1.FLAGS, t1.OBSTIME, t1.LEN, t1.VALUEERRORS FROM sources t0 ,TS t1 where (t0.SURVEYID = 16 AND t0.SRCID >= 202970108014045 AND t0.SRCID <= 202970108014909 and t0.SURVEYID = t1.SURVEY_PK AND t0.SRCID = t1.SOURCE_PK ) ORDER BY t0.SURVEYID ASC, t0.SRCID ASC ; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..68958.49 rows=17242 width=192) Join Filter: (t0.srcid = t1.source_pk) -> Index Scan using sources_pkey on sources t0 (cost=0.00..35.48 rows=1 width=12) Index Cond: ((surveyid = 16) AND (srcid >= 202970108014045::bigint) AND (srcid <= 202970108014909::bigint)) -> Append (cost=0.00..68707.45 rows=17245 width=180) -> Index Scan using ts_pkey on ts t1 (cost=0.00..8.27 rows=1 width=1518) Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Index Scan using ts_part_bs3000l00000_ts_pkey on ts_part_bs3000l00000 t1 (cost=0.00..8.27 rows=1 width=180) Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Heap Scan on ts_part_bs3000l00001_cg0346l00000 t1 (cost=40.29..9208.75 rows=2312 width=180) Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Index Scan on ts_part_bs3000l00001_cg0346l00000_ts_pkey (cost=0.00..39.71 rows=2312 width=0) Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Heap Scan on ts_part_cg0346l00001_cg0816k00000 t1 (cost=41.60..9729.55 rows=2443 width=180) Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Index Scan on ts_part_cg0346l00001_cg0816k00000_ts_pkey (cost=0.00..40.99 rows=2443 width=0) Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Heap Scan on ts_part_cg0816k00001_cg1180k00000 t1 (cost=39.25..8906.31 rows=2236 width=180) Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Index Scan on ts_part_cg0816k00001_cg1180k00000_ts_pkey (cost=0.00..38.69 rows=2236 width=0) Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Heap Scan on ts_part_cg1180k00001_cg6204k00000 t1 (cost=37.50..8266.11 rows=2075 width=180) Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Index Scan on ts_part_cg1180k00001_cg6204k00000_ts_pkey (cost=0.00..36.98 rows=2075 width=0) Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Heap Scan on ts_part_cg6204k00001_lm0022n00000 t1 (cost=38.44..8528.77 rows=2141 width=180) Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Index Scan on ts_part_cg6204k00001_lm0022n00000_ts_pkey (cost=0.00..37.91 rows=2141 width=0) Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Heap Scan on ts_part_lm0022n00001_lm0276m00000 t1 (cost=36.99..8071.29 rows=2026 width=180) Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Index Scan on ts_part_lm0022n00001_lm0276m00000_ts_pkey (cost=0.00..36.49 rows=2026 width=0) Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Heap Scan on ts_part_lm0276m00001_lm0584k00000 t1 (cost=40.80..9482.89 rows=2381 width=180) Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Index Scan on ts_part_lm0276m00001_lm0584k00000_ts_pkey (cost=0.00..40.21 rows=2381 width=0) Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Heap Scan on ts_part_lm0584k00001_sm0073k00000 t1 (cost=32.95..6488.95 rows=1628 width=180) Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Bitmap Index Scan on ts_part_lm0584k00001_sm0073k00000_ts_pkey (cost=0.00..32.54 rows=1628 width=0) Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) -> Index Scan using ts_part_sm0073k00001_ts_pkey on ts_part_sm0073k00001 t1 (cost=0.00..8.27 rows=1 width=180) Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid)) (43 rows) And to check if partition exclusion is used for the working query: surveys=> explain select t0.SURVEY_PK, t0.SOURCE_PK, t0.TSTYPE, t0.VALS from ts t0 where t0.SURVEY_pk = 16 AND t0.SOURCE_PK >= 202970108014045 AND t0.Source_pk <= 202970108014909 ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=0.00..9324.88 rows=2313 width=48) -> Append (cost=0.00..9324.88 rows=2313 width=48) -> Index Scan using ts_pkey on ts t0 (cost=0.00..8.27 rows=1 width=853) Index Cond: ((survey_pk = 16) AND (source_pk >= 202970108014045::bigint) AND (source_pk <= 202970108014909::bigint)) -> Index Scan using ts_part_bs3000l00001_cg0346l00000_ts_pkey on ts_part_bs3000l00001_cg0346l00000 t0 (cost=0.00..9316.61 rows=2312 width=48) Index Cond: ((survey_pk = 16) AND (source_pk >= 202970108014045::bigint) AND (source_pk <= 202970108014909::bigint)) (6 rows) Could you please advise how to cope with this? Should I file the bug? Does any workaround exist? Best Regards, Krzysztof
pgsql-bugs by date: