Thread: Query causing explosion of temp space with join involving partitioning

Query causing explosion of temp space with join involving partitioning

From
Krzysztof Nienartowicz
Date:
Hello,

Sorry for the re-post  - not sure list is the relevant one, I included
slightly changed query in the previous message, sent to bugs list.

I have an ORM-generated queries where parent table keys are used to
fetch the records from the child table (with relevant FK indicies),
where child table is partitioned. My understanding is that Postgres is
unable to properly use constraint exclusion to query only a relevant
table? Half of the join condition is propagated down, while the other
is not.

table sources has pk (sureyid,srcid), ts has fk(survey_pk,source_pk)
on source (sureyid,srcid) and another index with
survey_pk,source_pk,tstype (not used in the query).

This is very unfortunate as the queries are auto-generated and I
cannot move predicate to apply it directly to partitioned table.

The plan includes all the partitions, next snippet shows exclusion
works for the table when condition is used directly on the partitioned
table.

surveys-> SELECT  t1.SURVEY_PK, t1.SOURCE_PK, t1.TSTYPE,  t1.VALS
surveys->   FROM sources t0 ,TS t1 where
surveys->   (t0.SURVEYID = 16 AND t0.SRCID >= 203510110032281 AND
t0.SRCID <= 203520107001677 and t0.SURVEYID = t1.SURVEY_PK AND t0.SRCID =
t1.SOURCE_PK ) ORDER BY t0.SURVEYID ASC, t0.SRCID ASC
surveys->
surveys-> ;
                                                             QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------
Merge Join  (cost=11575858.83..11730569.40 rows=3448336 width=60)
 Merge Cond: (t0.srcid = t1.source_pk)
 ->  Index Scan using sources_pkey on sources t0
(cost=0.00..68407.63 rows=37817 width=12)
       Index Cond: ((surveyid = 16) AND (srcid >=
203510110032281::bigint) AND (srcid <= 203520107001677::bigint))
 ->  Materialize  (cost=11575858.83..11618963.03 rows=3448336 width=48)
       ->  Sort  (cost=11575858.83..11584479.67 rows=3448336 width=48)
             Sort Key: t1.source_pk
             ->  Append  (cost=0.00..11049873.18 rows=3448336 width=48)
                   ->  Index Scan using ts_pkey on ts t1
(cost=0.00..8.27 rows=1 width=853)
                         Index Cond: (survey_pk = 16)
                   ->  Index Scan using ts_part_bs3000l00000_ts_pkey
on ts_part_bs3000l00000 t1  (cost=0.00..8.27 rows=1 width=48)
                         Index Cond: (survey_pk = 16)
                   ->  Bitmap Heap Scan on
ts_part_bs3000l00001_cg0346l00000 t1  (cost=5760.36..1481735.21
rows=462422 width=48)
                         Recheck Cond: (survey_pk = 16)
                         ->  Bitmap Index Scan on
ts_part_bs3000l00001_cg0346l00000_ts_pkey  (cost=0.00..5644.75
rows=462422 width=0)
                               Index Cond: (survey_pk = 16)
                   ->  Bitmap Heap Scan on
ts_part_cg0346l00001_cg0816k00000 t1  (cost=5951.07..1565423.79
rows=488582 width=48)
                         Recheck Cond: (survey_pk = 16)
                         ->  Bitmap Index Scan on
ts_part_cg0346l00001_cg0816k00000_ts_pkey  (cost=0.00..5828.93
rows=488582 width=0)
                               Index Cond: (survey_pk = 16)
                   ->  Bitmap Heap Scan on
ts_part_cg0816k00001_cg1180k00000 t1  (cost=5513.54..1432657.90
rows=447123 width=48)
                         Recheck Cond: (survey_pk = 16)
                         ->  Bitmap Index Scan on
ts_part_cg0816k00001_cg1180k00000_ts_pkey  (cost=0.00..5401.75
rows=447123 width=0)
                               Index Cond: (survey_pk = 16)
                   ->  Bitmap Heap Scan on
ts_part_cg1180k00001_cg6204k00000 t1  (cost=5212.63..1329884.46
rows=415019 width=48)
                         Recheck Cond: (survey_pk = 16)
                         ->  Bitmap Index Scan on
ts_part_cg1180k00001_cg6204k00000_ts_pkey  (cost=0.00..5108.87
rows=415019 width=0)
                               Index Cond: (survey_pk = 16)
                   ->  Bitmap Heap Scan on
ts_part_cg6204k00001_lm0022n00000 t1  (cost=5450.37..1371917.76
rows=428113 width=48)
                         Recheck Cond: (survey_pk = 16)
                         ->  Bitmap Index Scan on
ts_part_cg6204k00001_lm0022n00000_ts_pkey  (cost=0.00..5343.35
rows=428113 width=0)
                               Index Cond: (survey_pk = 16)
                   ->  Bitmap Heap Scan on
ts_part_lm0022n00001_lm0276m00000 t1  (cost=5136.71..1298542.32
rows=405223 width=48)
                         Recheck Cond: (survey_pk = 16)
                         ->  Bitmap Index Scan on
ts_part_lm0022n00001_lm0276m00000_ts_pkey  (cost=0.00..5035.40
rows=405223 width=0)
                               Index Cond: (survey_pk = 16)
                   ->  Bitmap Heap Scan on
ts_part_lm0276m00001_lm0584k00000 t1  (cost=5770.98..1525737.42
rows=476204 width=48)
                         Recheck Cond: (survey_pk = 16)
                         ->  Bitmap Index Scan on
ts_part_lm0276m00001_lm0584k00000_ts_pkey  (cost=0.00..5651.93
rows=476204 width=0)
                               Index Cond: (survey_pk = 16)
                   ->  Bitmap Heap Scan on
ts_part_lm0584k00001_sm0073k00000 t1  (cost=4536.03..1043949.51
rows=325647 width=48)
                         Recheck Cond: (survey_pk = 16)
                         ->  Bitmap Index Scan on
ts_part_lm0584k00001_sm0073k00000_ts_pkey  (cost=0.00..4454.62
rows=325647 width=0)
                               Index Cond: (survey_pk = 16)
                   ->  Index Scan using ts_part_sm0073k00001_ts_pkey
on ts_part_sm0073k00001 t1  (cost=0.00..8.27 rows=1 width=48)
                         Index Cond: (survey_pk = 16)
(46 rows)


Check to see if the exclusion works and yes, it does.


surveys=> explain SELECT  t1.SURVEY_PK, t1.SOURCE_PK, t1.TSTYPE,  t1.VALS
surveys->   FROM  TS t1 WHERE t1.SURVEY_PK =16  AND t1.SOURCE_PK>=
202970108014045 AND t1.SOURCE_PK <= 202970108014909
surveys->  ORDER BY t1.SURVEY_PK ASC, t1.SOURCE_PK ASC
surveys->   ;

 QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort  (cost=9454.13..9459.91 rows=2313 width=48)
 Sort Key: t1.source_pk
 ->  Result  (cost=0.00..9324.88 rows=2313 width=48)
       ->  Append  (cost=0.00..9324.88 rows=2313 width=48)
             ->  Index Scan using ts_pkey on ts t1  (cost=0.00..8.27
rows=1 width=853)
                   Index Cond: ((survey_pk = 16) AND (source_pk >=
202970108014045::bigint) AND (source_pk <= 202970108014909::bigint))
             ->  Index Scan using
ts_part_bs3000l00001_cg0346l00000_ts_pkey on
ts_part_bs3000l00001_cg0346l00000 t1  (cost=0.00..9316.61 rows=2312
width=48)
                   Index Cond: ((survey_pk = 16) AND (source_pk >=
202970108014045::bigint) AND (source_pk <= 202970108014909::bigint))
(8 rows)



Is there any workaround for that, except changing the query? Any plans
to implement it?

The second issue is connected to this one it looks like a bug to me:

I have around 30 clients running the same query with different
parameters, but the query always returns 1000 rows (boundary values
are pre-calculated,so it's like traversal of the equiwidth histogram
if it comes to srcid/source_pk) and the rows from parallel queries
cannot be overlapping. Usually query returns within around a second.
I noticed however there are some queries that hang for many hours and
what is most curious some of them created several GB of temp files.
The partition size is around 10M entries, there are around 10
partitions as I mentioned there is no way this query should fetch more
then 1000 entries. Some queries may span multiple, adjacent partitions
(but not the one above, as we can see). I tried to run this exploding
query without ordering, but it did not change anything, behaviour is
repeatable from the command line, if the query is divided by hand with
same parameters values that are returned are ok, within a second.
There are many AccessShareLock locks, all of them granted.
Killing the clients that issues the queries does not change much -
these are still running, DB immediate restart helps.


Environment: Rocks 5.2, kernel 2.6.18,  PostgreSQL 8.4.2 on
x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20071124 (Red
Hat 4.1.2-42), 64-bit, client is JDBC.

The queries start to fail eventually due to the lack of space (over
500GB used by temp files), i.e. one of queries hangs for hours with
temp allocation like:

< ls -hs1 data/base/pgsql_tmp/
1.1G pgsql_tmp27571.0
1.1G pgsql_tmp27571.1
1.1G pgsql_tmp27571.10
1.1G pgsql_tmp27571.11
1.1G pgsql_tmp27571.12
1.1G pgsql_tmp27571.13
1.1G pgsql_tmp27571.14
1.1G pgsql_tmp27571.15
1.1G pgsql_tmp27571.16
1.1G pgsql_tmp27571.17
1.1G pgsql_tmp27571.18
1.1G pgsql_tmp27571.19
1.1G pgsql_tmp27571.2
1.0G pgsql_tmp27571.20
1.0G pgsql_tmp27571.21
1.1G pgsql_tmp27571.22
1.1G pgsql_tmp27571.23
1.1G pgsql_tmp27571.24
1.1G pgsql_tmp27571.25
1.1G pgsql_tmp27571.26
801M pgsql_tmp27571.27
1.1G pgsql_tmp27571.3
1.1G pgsql_tmp27571.4
1.1G pgsql_tmp27571.5
1.1G pgsql_tmp27571.6
1.1G pgsql_tmp27571.7
1.1G pgsql_tmp27571.8
1.1G pgsql_tmp27571.9
>


The transaction does not generate any updates/deletes.

Working queries (subsecond) have plan like this, condition pushed down
will use the index this time.

surveys=> explain SELECT t0.SURVEYID, t0.SRCID, t1.SURVEY_PK,
t1.SOURCE_PK, t1.TSTYPE, t1.VALS,
t1.CCDIDS, t1.FLAGS, t1.OBSTIME, t1.LEN, t1.VALUEERRORS
FROM sources t0 ,TS t1 where
(t0.SURVEYID = 16 AND t0.SRCID >= 202970108014045 AND t0.SRCID <=
202970108014909 and t0.SURVEYID = t1.SURVEY_PK AND t0.SRCID = t1.SOURCE_PK )
ORDER BY t0.SURVEYID ASC, t0.SRCID ASC
;

  QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=0.00..68958.49 rows=17242 width=192)
  Join Filter: (t0.srcid = t1.source_pk)
  ->  Index Scan using sources_pkey on sources t0  (cost=0.00..35.48
rows=1 width=12)
        Index Cond: ((surveyid = 16) AND (srcid >=
202970108014045::bigint) AND (srcid <= 202970108014909::bigint))
  ->  Append  (cost=0.00..68707.45 rows=17245 width=180)
        ->  Index Scan using ts_pkey on ts t1  (cost=0.00..8.27 rows=1
width=1518)
              Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
        ->  Index Scan using ts_part_bs3000l00000_ts_pkey on
ts_part_bs3000l00000 t1  (cost=0.00..8.27 rows=1 width=180)
              Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
        ->  Bitmap Heap Scan on ts_part_bs3000l00001_cg0346l00000 t1
(cost=40.29..9208.75 rows=2312 width=180)
              Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
              ->  Bitmap Index Scan on
ts_part_bs3000l00001_cg0346l00000_ts_pkey  (cost=0.00..39.71 rows=2312
width=0)
                    Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk
= t0.srcid))
        ->  Bitmap Heap Scan on ts_part_cg0346l00001_cg0816k00000 t1
(cost=41.60..9729.55 rows=2443 width=180)
              Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
              ->  Bitmap Index Scan on
ts_part_cg0346l00001_cg0816k00000_ts_pkey  (cost=0.00..40.99 rows=2443
width=0)
                    Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk
= t0.srcid))
        ->  Bitmap Heap Scan on ts_part_cg0816k00001_cg1180k00000 t1
(cost=39.25..8906.31 rows=2236 width=180)
              Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
              ->  Bitmap Index Scan on
ts_part_cg0816k00001_cg1180k00000_ts_pkey  (cost=0.00..38.69 rows=2236
width=0)
                    Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk
= t0.srcid))
        ->  Bitmap Heap Scan on ts_part_cg1180k00001_cg6204k00000 t1
(cost=37.50..8266.11 rows=2075 width=180)
              Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
              ->  Bitmap Index Scan on
ts_part_cg1180k00001_cg6204k00000_ts_pkey  (cost=0.00..36.98 rows=2075
width=0)
                    Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk
= t0.srcid))
        ->  Bitmap Heap Scan on ts_part_cg6204k00001_lm0022n00000 t1
(cost=38.44..8528.77 rows=2141 width=180)
              Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
              ->  Bitmap Index Scan on
ts_part_cg6204k00001_lm0022n00000_ts_pkey  (cost=0.00..37.91 rows=2141
width=0)
                    Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk
= t0.srcid))
        ->  Bitmap Heap Scan on ts_part_lm0022n00001_lm0276m00000 t1
(cost=36.99..8071.29 rows=2026 width=180)
              Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
              ->  Bitmap Index Scan on
ts_part_lm0022n00001_lm0276m00000_ts_pkey  (cost=0.00..36.49 rows=2026
width=0)
                    Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk
= t0.srcid))
        ->  Bitmap Heap Scan on ts_part_lm0276m00001_lm0584k00000 t1
(cost=40.80..9482.89 rows=2381 width=180)
              Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
              ->  Bitmap Index Scan on
ts_part_lm0276m00001_lm0584k00000_ts_pkey  (cost=0.00..40.21 rows=2381
width=0)
                    Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk
= t0.srcid))
        ->  Bitmap Heap Scan on ts_part_lm0584k00001_sm0073k00000 t1
(cost=32.95..6488.95 rows=1628 width=180)
              Recheck Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
              ->  Bitmap Index Scan on
ts_part_lm0584k00001_sm0073k00000_ts_pkey  (cost=0.00..32.54 rows=1628
width=0)
                    Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk
= t0.srcid))
        ->  Index Scan using ts_part_sm0073k00001_ts_pkey on
ts_part_sm0073k00001 t1  (cost=0.00..8.27 rows=1 width=180)
              Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
(43 rows)




And to check if partition exclusion is used for the working query:
surveys=> explain select t0.SURVEY_PK, t0.SOURCE_PK, t0.TSTYPE,
t0.VALS from ts t0 where t0.SURVEY_pk = 16 AND t0.SOURCE_PK >=
202970108014045 AND t0.Source_pk <= 202970108014909 ;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------
Result  (cost=0.00..9324.88 rows=2313 width=48)
 ->  Append  (cost=0.00..9324.88 rows=2313 width=48)
       ->  Index Scan using ts_pkey on ts t0  (cost=0.00..8.27
rows=1 width=853)
             Index Cond: ((survey_pk = 16) AND (source_pk >=
202970108014045::bigint) AND (source_pk <= 202970108014909::bigint))
       ->  Index Scan using
ts_part_bs3000l00001_cg0346l00000_ts_pkey on
ts_part_bs3000l00001_cg0346l00000 t0  (cost=0.00..9316.61 rows=2312
width=48)
             Index Cond: ((survey_pk = 16) AND (source_pk >=
202970108014045::bigint) AND (source_pk <= 202970108014909::bigint))
(6 rows)




Could you please advise how to cope with this? Should I file the bug?
Does any workaround exist?

Best Regards,
Krzysztof

Krzysztof Nienartowicz <krzysztof.nienartowicz.cern@gmail.com> writes:
> surveys-> SELECT  t1.SURVEY_PK, t1.SOURCE_PK, t1.TSTYPE,  t1.VALS
> surveys->   FROM sources t0 ,TS t1 where
> surveys->   (t0.SURVEYID = 16 AND t0.SRCID >= 203510110032281 AND
> t0.SRCID <= 203520107001677 and t0.SURVEYID = t1.SURVEY_PK AND t0.SRCID =
> t1.SOURCE_PK ) ORDER BY t0.SURVEYID ASC, t0.SRCID ASC

We don't make any attempt to infer derived inequality conditions,
so no, those constraints on t0.srcid won't be propagated over to
t1.source_pk.  Sorry.  It's been suggested before, but it would be
a lot of new mechanism and expense in the planner, and for most
queries it'd just slow things down to try to do that.

> I have around 30 clients running the same query with different
> parameters, but the query always returns 1000 rows (boundary values
> are pre-calculated,so it's like traversal of the equiwidth histogram
> if it comes to srcid/source_pk) and the rows from parallel queries
> cannot be overlapping. Usually query returns within around a second.
> I noticed however there are some queries that hang for many hours and
> what is most curious some of them created several GB of temp files.

Can you show us the query plan for the slow cases?

            regards, tom lane

Re: Query causing explosion of temp space with join involving partitioning

From
Krzysztof Nienartowicz
Date:
Hello,
Thank you for the clarifications. The plan as run from the psql looks ok, also did not notice any specific locks for this particular query.

Logs of the system running queries are not utterly clear, so chasing the parameters for the explosive query is not that simple (shared logs between multiple threads), but from what I see there is no difference between them and the plan looks like (without removal of irrelevant parameters this time, most of them are float8, but also bytea)

explain SELECT t0.surveyid, t0.srcid, t1.survey_pk, t1.source_pk, t1.tstype, t1.homoscedasticitytest, t1.ljungboxrandomnesstest, t1.maxvalue, t1.meanobstime, 
t1.meanvalue, t1.median, t1.minvalue, t1.range, t1.robustweightedstddev, t1.symmetrytest, t1.trimmedweightedmean, t1.trimmedweightedrange, 
t1.variabilityflag, t1.weightedkurtosis, t1.weightedmean, t1.weightedmeanconfidenceinterval, t1.weightedmeanobstime, t1.weightednormalizedp2pscatter, 
t1.weightedskewness, t1.weightedstddevdf, 
t1.weightedstddevwdf, t1.vals, t1.ccdids, t1.flags, t1.obstime, t1.len, t1.valueerrors FROM  sources t0 INNER JOIN ts t1 ON 
t0.surveyid = t1.survey_pk AND t0.srcid = t1.source_pk WHERE (t0.surveyid = 16 AND t0.srcid >= 200210107009116  AND t0.srcid <= 200210107009991)
 ORDER BY t0.surveyid ASC, t0.srcid ASC ;
                                                                       QUERY PLAN                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..2363.21 rows=835683 width=1527)
   Join Filter: (t0.srcid = t1.source_pk)
   ->  Index Scan using sources_pkey on sources t0  (cost=0.00..17.88 rows=1 width=12)
         Index Cond: ((surveyid = 16) AND (srcid >= 200210107009116::bigint) AND (srcid <= 200210107009991::bigint))
   ->  Append  (cost=0.00..2325.93 rows=1552 width=1053)
         ->  Index Scan using ts_pkey on ts t1  (cost=0.00..4.27 rows=1 width=1665)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_bs3000l00000_ts_pkey on ts_part_bs3000l00000 t1  (cost=0.00..6.30 rows=2 width=327)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_sm0073k00001_ts_pkey on ts_part_sm0073k00001 t1  (cost=0.00..1232.63 rows=608 width=327)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_bs3000l00001_cg0346l00000_ts_pkey on ts_part_bs3000l00001_cg0346l00000 t1  (cost=0.00..145.41 rows=127 width=1556)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg0346l00001_cg0816k00000_ts_pkey on ts_part_cg0346l00001_cg0816k00000 t1  (cost=0.00..147.64 rows=127 width=1669)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg0816k00001_cg1180k00000_ts_pkey on ts_part_cg0816k00001_cg1180k00000 t1  (cost=0.00..138.09 rows=119 width=1615)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg1180k00001_cg6204k00000_ts_pkey on ts_part_cg1180k00001_cg6204k00000 t1  (cost=0.00..125.69 rows=109 width=1552)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg6204k00001_lm0022n00000_ts_pkey on ts_part_cg6204k00001_lm0022n00000 t1  (cost=0.00..133.23 rows=116 width=1509)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_lm0022n00001_lm0276m00000_ts_pkey on ts_part_lm0022n00001_lm0276m00000 t1  (cost=0.00..131.08 rows=115 width=1500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_lm0276m00001_lm0584k00000_ts_pkey on ts_part_lm0276m00001_lm0584k00000 t1  (cost=0.00..158.11 rows=135 width=1471)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_lm0584k00001_sm0073k00000_ts_pkey on ts_part_lm0584k00001_sm0073k00000 t1  (cost=0.00..103.47 rows=93 width=1242)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))


I could increase debug level on the server, but not sure if the plan printed there is of any help. Could this be caused by some race where there is too much activity? - DB box is at around 10% CPU load, small io wait, before the query starts to overload the machine.

For sake of clarity this is the plan for the non-joined parameters to show which partition would be used (i.e. a single one)

explain select * from ts t0 where t0.survey_pk = 16 AND t0.source_pk >= 200210107009116  AND t0.source_pk <= 200210107009991;
                                                                     QUERY PLAN                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.00..10.72 rows=2 width=1614)
   ->  Append  (cost=0.00..10.72 rows=2 width=1614)
         ->  Index Scan using ts_pkey on ts t0  (cost=0.00..4.27 rows=1 width=1669)
               Index Cond: ((survey_pk = 16) AND (source_pk >= 200210107009116::bigint) AND (source_pk <= 200210107009991::bigint))
         ->  Index Scan using ts_part_bs3000l00001_cg0346l00000_ts_pkey on ts_part_bs3000l00001_cg0346l00000 t0  (cost=0.00..6.45 rows=1 width=1560)
               Index Cond: ((survey_pk = 16) AND (source_pk >= 200210107009116::bigint) AND (source_pk <= 200210107009991::bigint))
(6 rows)

Time: 1.559 ms

and to check the bin size:
>  select count(*) from ts t0 where t0.survey_pk = 16 AND t0.source_pk >= 200210107009116  AND t0.source_pk <= 200210107009991;
 count 
-------
  1000
(1 row)


and analyzed plan:
> explain analyze SELECT t0.surveyid, t0.srcid, t1.survey_pk, t1.source_pk, t1.tstype, t1.homoscedasticitytest, t1.ljungboxrandomnesstest, t1.maxvalue, t1.meanobstime, 
t1.meanvalue, t1.median, t1.minvalue, t1.range, t1.robustweightedstddev, t1.symmetrytest, t1.trimmedweightedmean, t1.trimmedweightedrange, 
t1.variabilityflag, t1.weightedkurtosis, t1.weightedmean, t1.weightedmeanconfidenceinterval, t1.weightedmeanobstime, t1.weightednormalizedp2pscatter, 
t1.weightedskewness, t1.weightedstddevdf, 
t1.weightedstddevwdf, t1.vals, t1.ccdids, t1.flags, t1.obstime, t1.len, t1.valueerrors FROM oglehip.sources t0 INNER JOIN oglehip.ts t1 ON 
t0.surveyid = t1.survey_pk AND t0.srcid = t1.source_pk WHERE (t0.surveyid = 16 AND t0.srcid >= 200210107009116  AND t0.srcid <= 200210107009991)
 ORDER BY t0.surveyid ASC, t0.srcid ASC ;
                                                                                             QUERY PLAN                                                                                              
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..2363.21 rows=835692 width=1527) (actual time=73.629..585.003 rows=1000 loops=1)
   Join Filter: (t0.srcid = t1.source_pk)
   ->  Index Scan using sources_pkey on sources t0  (cost=0.00..17.88 rows=1 width=12) (actual time=73.507..560.589 rows=500 loops=1)
         Index Cond: ((surveyid = 16) AND (srcid >= 200210107009116::bigint) AND (srcid <= 200210107009991::bigint))
   ->  Append  (cost=0.00..2325.93 rows=1552 width=1053) (actual time=0.014..0.045 rows=2 loops=500)
         ->  Index Scan using ts_pkey on ts t1  (cost=0.00..4.27 rows=1 width=1665) (actual time=0.001..0.001 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_bs3000l00000_ts_pkey on ts_part_bs3000l00000 t1  (cost=0.00..6.30 rows=2 width=327) (actual time=0.002..0.002 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_sm0073k00001_ts_pkey on ts_part_sm0073k00001 t1  (cost=0.00..1232.63 rows=608 width=327) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_bs3000l00001_cg0346l00000_ts_pkey on ts_part_bs3000l00001_cg0346l00000 t1  (cost=0.00..145.41 rows=127 width=1556) (actual time=0.006..0.007 rows=2 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg0346l00001_cg0816k00000_ts_pkey on ts_part_cg0346l00001_cg0816k00000 t1  (cost=0.00..147.64 rows=127 width=1669) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg0816k00001_cg1180k00000_ts_pkey on ts_part_cg0816k00001_cg1180k00000 t1  (cost=0.00..138.09 rows=119 width=1615) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg1180k00001_cg6204k00000_ts_pkey on ts_part_cg1180k00001_cg6204k00000 t1  (cost=0.00..125.69 rows=109 width=1552) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg6204k00001_lm0022n00000_ts_pkey on ts_part_cg6204k00001_lm0022n00000 t1  (cost=0.00..133.23 rows=116 width=1509) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_lm0022n00001_lm0276m00000_ts_pkey on ts_part_lm0022n00001_lm0276m00000 t1  (cost=0.00..131.08 rows=115 width=1500) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_lm0276m00001_lm0584k00000_ts_pkey on ts_part_lm0276m00001_lm0584k00000 t1  (cost=0.00..158.11 rows=135 width=1471) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_lm0584k00001_sm0073k00000_ts_pkey on ts_part_lm0584k00001_sm0073k00000 t1  (cost=0.00..103.47 rows=93 width=1242) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
 Total runtime: 585.566 ms
(28 rows)

Time: 588.102 ms


Would be grateful for any pointers as the server restart is the only option now once such a query starts trashing the disk.

Best Regards,
Krzysztof


Krzysztof Nienartowicz <krzysztof.nienartowicz.cern@gmail.com> writes:
> surveys-> SELECT  t1.SURVEY_PK, t1.SOURCE_PK, t1.TSTYPE,  t1.VALS
> surveys->   FROM sources t0 ,TS t1 where
> surveys->   (t0.SURVEYID = 16 AND t0.SRCID >= 203510110032281 AND
> t0.SRCID <= 203520107001677 and t0.SURVEYID = t1.SURVEY_PK AND t0.SRCID =
> t1.SOURCE_PK ) ORDER BY t0.SURVEYID ASC, t0.SRCID ASC

We don't make any attempt to infer derived inequality conditions,
so no, those constraints on t0.srcid won't be propagated over to
t1.source_pk.  Sorry.  It's been suggested before, but it would be
a lot of new mechanism and expense in the planner, and for most
queries it'd just slow things down to try to do that.

> I have around 30 clients running the same query with different
> parameters, but the query always returns 1000 rows (boundary values
> are pre-calculated,so it's like traversal of the equiwidth histogram
> if it comes to srcid/source_pk) and the rows from parallel queries
> cannot be overlapping. Usually query returns within around a second.
> I noticed however there are some queries that hang for many hours and
> what is most curious some of them created several GB of temp files.

Can you show us the query plan for the slow cases?

regards, tom lane
Krzysztof Nienartowicz <Krzysztof.Nienartowicz@unige.ch> writes:
> Logs of the system running queries are not utterly clear, so chasing the
> parameters for the explosive query is not that simple (shared logs between
> multiple threads), but from what I see there is no difference between them
> and the plan looks like (without removal of irrelevant parameters this time,
> most of them are float8, but also bytea)
> [ nestloop with inner index scans over the inherited table ]

Well, that type of plan isn't going to consume much memory or disk
space.  What I suspect is happening is that sometimes, depending on the
specific parameter values called out in the query, the planner is
switching to another plan type that does consume lots of space (probably
via sort or hash temp files).  The most obvious guess is that that will
happen when the range limits on srcid get far enough apart to make a
nestloop not look cheap.  You could try experimenting with EXPLAIN and
different constant values to see what you get.

            regards, tom lane

Re: Query causing explosion of temp space with join involving partitioning

From
Krzysztof Nienartowicz
Date:
I made a brute force check and indeed, for one of the parameters the query was switching to sequential scans (or
bitmapsscans with condition on survey_pk=16 only if sequential scans were off). After closer look at the plan
cardinalitiesI thought it would be worthy to increase histogram size and I set statistics on sources(srcid) to 1000
fromdefault 10.  It fixed the plan! Sources table was around 100M so skewness in this range must have been looking odd
forthe planner.. 
Thank you for the hints!
Best Regards,
Krzysztof
On May 27, 2010, at 6:41 PM, Tom Lane wrote:

> Krzysztof Nienartowicz <Krzysztof.Nienartowicz@unige.ch> writes:
>> Logs of the system running queries are not utterly clear, so chasing the
>> parameters for the explosive query is not that simple (shared logs between
>> multiple threads), but from what I see there is no difference between them
>> and the plan looks like (without removal of irrelevant parameters this time,
>> most of them are float8, but also bytea)
>> [ nestloop with inner index scans over the inherited table ]
>
> Well, that type of plan isn't going to consume much memory or disk
> space.  What I suspect is happening is that sometimes, depending on the
> specific parameter values called out in the query, the planner is
> switching to another plan type that does consume lots of space (probably
> via sort or hash temp files).  The most obvious guess is that that will
> happen when the range limits on srcid get far enough apart to make a
> nestloop not look cheap.  You could try experimenting with EXPLAIN and
> different constant values to see what you get.
>
>             regards, tom lane