Re: Query causing explosion of temp space with join involving partitioning - Mailing list pgsql-bugs

From Krzysztof Nienartowicz
Subject Re: Query causing explosion of temp space with join involving partitioning
Date
Msg-id AANLkTiljwIjxXQU-TFrt1qYky3dkxDLUAAWUss35dnFy@mail.gmail.com
Whole thread Raw
In response to Query causing explosion of temp space with join involving partitioning  (Krzysztof Nienartowicz <krzysztof.nienartowicz.cern@gmail.com>)
Responses Re: Query causing explosion of temp space with join involving partitioning
List pgsql-bugs
Hello,
Thank you for the clarifications. The plan as run from the psql looks ok, also did not notice any specific locks for this particular query.

Logs of the system running queries are not utterly clear, so chasing the parameters for the explosive query is not that simple (shared logs between multiple threads), but from what I see there is no difference between them and the plan looks like (without removal of irrelevant parameters this time, most of them are float8, but also bytea)

explain SELECT t0.surveyid, t0.srcid, t1.survey_pk, t1.source_pk, t1.tstype, t1.homoscedasticitytest, t1.ljungboxrandomnesstest, t1.maxvalue, t1.meanobstime, 
t1.meanvalue, t1.median, t1.minvalue, t1.range, t1.robustweightedstddev, t1.symmetrytest, t1.trimmedweightedmean, t1.trimmedweightedrange, 
t1.variabilityflag, t1.weightedkurtosis, t1.weightedmean, t1.weightedmeanconfidenceinterval, t1.weightedmeanobstime, t1.weightednormalizedp2pscatter, 
t1.weightedskewness, t1.weightedstddevdf, 
t1.weightedstddevwdf, t1.vals, t1.ccdids, t1.flags, t1.obstime, t1.len, t1.valueerrors FROM  sources t0 INNER JOIN ts t1 ON 
t0.surveyid = t1.survey_pk AND t0.srcid = t1.source_pk WHERE (t0.surveyid = 16 AND t0.srcid >= 200210107009116  AND t0.srcid <= 200210107009991)
 ORDER BY t0.surveyid ASC, t0.srcid ASC ;
                                                                       QUERY PLAN                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..2363.21 rows=835683 width=1527)
   Join Filter: (t0.srcid = t1.source_pk)
   ->  Index Scan using sources_pkey on sources t0  (cost=0.00..17.88 rows=1 width=12)
         Index Cond: ((surveyid = 16) AND (srcid >= 200210107009116::bigint) AND (srcid <= 200210107009991::bigint))
   ->  Append  (cost=0.00..2325.93 rows=1552 width=1053)
         ->  Index Scan using ts_pkey on ts t1  (cost=0.00..4.27 rows=1 width=1665)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_bs3000l00000_ts_pkey on ts_part_bs3000l00000 t1  (cost=0.00..6.30 rows=2 width=327)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_sm0073k00001_ts_pkey on ts_part_sm0073k00001 t1  (cost=0.00..1232.63 rows=608 width=327)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_bs3000l00001_cg0346l00000_ts_pkey on ts_part_bs3000l00001_cg0346l00000 t1  (cost=0.00..145.41 rows=127 width=1556)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg0346l00001_cg0816k00000_ts_pkey on ts_part_cg0346l00001_cg0816k00000 t1  (cost=0.00..147.64 rows=127 width=1669)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg0816k00001_cg1180k00000_ts_pkey on ts_part_cg0816k00001_cg1180k00000 t1  (cost=0.00..138.09 rows=119 width=1615)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg1180k00001_cg6204k00000_ts_pkey on ts_part_cg1180k00001_cg6204k00000 t1  (cost=0.00..125.69 rows=109 width=1552)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg6204k00001_lm0022n00000_ts_pkey on ts_part_cg6204k00001_lm0022n00000 t1  (cost=0.00..133.23 rows=116 width=1509)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_lm0022n00001_lm0276m00000_ts_pkey on ts_part_lm0022n00001_lm0276m00000 t1  (cost=0.00..131.08 rows=115 width=1500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_lm0276m00001_lm0584k00000_ts_pkey on ts_part_lm0276m00001_lm0584k00000 t1  (cost=0.00..158.11 rows=135 width=1471)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_lm0584k00001_sm0073k00000_ts_pkey on ts_part_lm0584k00001_sm0073k00000 t1  (cost=0.00..103.47 rows=93 width=1242)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))


I could increase debug level on the server, but not sure if the plan printed there is of any help. Could this be caused by some race where there is too much activity? - DB box is at around 10% CPU load, small io wait, before the query starts to overload the machine.

For sake of clarity this is the plan for the non-joined parameters to show which partition would be used (i.e. a single one)

explain select * from ts t0 where t0.survey_pk = 16 AND t0.source_pk >= 200210107009116  AND t0.source_pk <= 200210107009991;
                                                                     QUERY PLAN                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.00..10.72 rows=2 width=1614)
   ->  Append  (cost=0.00..10.72 rows=2 width=1614)
         ->  Index Scan using ts_pkey on ts t0  (cost=0.00..4.27 rows=1 width=1669)
               Index Cond: ((survey_pk = 16) AND (source_pk >= 200210107009116::bigint) AND (source_pk <= 200210107009991::bigint))
         ->  Index Scan using ts_part_bs3000l00001_cg0346l00000_ts_pkey on ts_part_bs3000l00001_cg0346l00000 t0  (cost=0.00..6.45 rows=1 width=1560)
               Index Cond: ((survey_pk = 16) AND (source_pk >= 200210107009116::bigint) AND (source_pk <= 200210107009991::bigint))
(6 rows)

Time: 1.559 ms

and to check the bin size:
>  select count(*) from ts t0 where t0.survey_pk = 16 AND t0.source_pk >= 200210107009116  AND t0.source_pk <= 200210107009991;
 count 
-------
  1000
(1 row)


and analyzed plan:
> explain analyze SELECT t0.surveyid, t0.srcid, t1.survey_pk, t1.source_pk, t1.tstype, t1.homoscedasticitytest, t1.ljungboxrandomnesstest, t1.maxvalue, t1.meanobstime, 
t1.meanvalue, t1.median, t1.minvalue, t1.range, t1.robustweightedstddev, t1.symmetrytest, t1.trimmedweightedmean, t1.trimmedweightedrange, 
t1.variabilityflag, t1.weightedkurtosis, t1.weightedmean, t1.weightedmeanconfidenceinterval, t1.weightedmeanobstime, t1.weightednormalizedp2pscatter, 
t1.weightedskewness, t1.weightedstddevdf, 
t1.weightedstddevwdf, t1.vals, t1.ccdids, t1.flags, t1.obstime, t1.len, t1.valueerrors FROM oglehip.sources t0 INNER JOIN oglehip.ts t1 ON 
t0.surveyid = t1.survey_pk AND t0.srcid = t1.source_pk WHERE (t0.surveyid = 16 AND t0.srcid >= 200210107009116  AND t0.srcid <= 200210107009991)
 ORDER BY t0.surveyid ASC, t0.srcid ASC ;
                                                                                             QUERY PLAN                                                                                              
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..2363.21 rows=835692 width=1527) (actual time=73.629..585.003 rows=1000 loops=1)
   Join Filter: (t0.srcid = t1.source_pk)
   ->  Index Scan using sources_pkey on sources t0  (cost=0.00..17.88 rows=1 width=12) (actual time=73.507..560.589 rows=500 loops=1)
         Index Cond: ((surveyid = 16) AND (srcid >= 200210107009116::bigint) AND (srcid <= 200210107009991::bigint))
   ->  Append  (cost=0.00..2325.93 rows=1552 width=1053) (actual time=0.014..0.045 rows=2 loops=500)
         ->  Index Scan using ts_pkey on ts t1  (cost=0.00..4.27 rows=1 width=1665) (actual time=0.001..0.001 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_bs3000l00000_ts_pkey on ts_part_bs3000l00000 t1  (cost=0.00..6.30 rows=2 width=327) (actual time=0.002..0.002 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_sm0073k00001_ts_pkey on ts_part_sm0073k00001 t1  (cost=0.00..1232.63 rows=608 width=327) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_bs3000l00001_cg0346l00000_ts_pkey on ts_part_bs3000l00001_cg0346l00000 t1  (cost=0.00..145.41 rows=127 width=1556) (actual time=0.006..0.007 rows=2 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg0346l00001_cg0816k00000_ts_pkey on ts_part_cg0346l00001_cg0816k00000 t1  (cost=0.00..147.64 rows=127 width=1669) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg0816k00001_cg1180k00000_ts_pkey on ts_part_cg0816k00001_cg1180k00000 t1  (cost=0.00..138.09 rows=119 width=1615) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg1180k00001_cg6204k00000_ts_pkey on ts_part_cg1180k00001_cg6204k00000 t1  (cost=0.00..125.69 rows=109 width=1552) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_cg6204k00001_lm0022n00000_ts_pkey on ts_part_cg6204k00001_lm0022n00000 t1  (cost=0.00..133.23 rows=116 width=1509) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_lm0022n00001_lm0276m00000_ts_pkey on ts_part_lm0022n00001_lm0276m00000 t1  (cost=0.00..131.08 rows=115 width=1500) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_lm0276m00001_lm0584k00000_ts_pkey on ts_part_lm0276m00001_lm0584k00000 t1  (cost=0.00..158.11 rows=135 width=1471) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
         ->  Index Scan using ts_part_lm0584k00001_sm0073k00000_ts_pkey on ts_part_lm0584k00001_sm0073k00000 t1  (cost=0.00..103.47 rows=93 width=1242) (actual time=0.004..0.004 rows=0 loops=500)
               Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk = t0.srcid))
 Total runtime: 585.566 ms
(28 rows)

Time: 588.102 ms


Would be grateful for any pointers as the server restart is the only option now once such a query starts trashing the disk.

Best Regards,
Krzysztof


Krzysztof Nienartowicz <krzysztof.nienartowicz.cern@gmail.com> writes:
> surveys-> SELECT  t1.SURVEY_PK, t1.SOURCE_PK, t1.TSTYPE,  t1.VALS
> surveys->   FROM sources t0 ,TS t1 where
> surveys->   (t0.SURVEYID = 16 AND t0.SRCID >= 203510110032281 AND
> t0.SRCID <= 203520107001677 and t0.SURVEYID = t1.SURVEY_PK AND t0.SRCID =
> t1.SOURCE_PK ) ORDER BY t0.SURVEYID ASC, t0.SRCID ASC

We don't make any attempt to infer derived inequality conditions,
so no, those constraints on t0.srcid won't be propagated over to
t1.source_pk.  Sorry.  It's been suggested before, but it would be
a lot of new mechanism and expense in the planner, and for most
queries it'd just slow things down to try to do that.

> I have around 30 clients running the same query with different
> parameters, but the query always returns 1000 rows (boundary values
> are pre-calculated,so it's like traversal of the equiwidth histogram
> if it comes to srcid/source_pk) and the rows from parallel queries
> cannot be overlapping. Usually query returns within around a second.
> I noticed however there are some queries that hang for many hours and
> what is most curious some of them created several GB of temp files.

Can you show us the query plan for the slow cases?

regards, tom lane

pgsql-bugs by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Re: BUG #5474: Installation
Next
From: Robert Haas
Date:
Subject: Re: BUG #5469: regexp_matches() has poor behaviour and more poor documentation