Re: Scanning all partition when more than 100 items in "where id in()" clause - Mailing list pgsql-bugs

From Amit Langote
Subject Re: Scanning all partition when more than 100 items in "where id in()" clause
Date
Msg-id 5c46f111-81a8-3156-016e-2b70d6985b4d@lab.ntt.co.jp
Whole thread Raw
In response to Re: Scanning all partition when more than 100 items in "where id in()" clause  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Scanning all partition when more than 100 items in "where id in()" clause
List pgsql-bugs
On 2018/07/27 11:23, Stephen Frost wrote:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> /*
>>  * Proof attempts involving large arrays in ScalarArrayOpExpr nodes are
>>  * likely to require O(N^2) time, and more often than not fail anyway.
>>  * So we set an arbitrary limit on the number of array elements that
>>  * we will allow to be treated as an AND or OR clause.
>>  * XXX is it worth exposing this as a GUC knob?
>>  */
>> #define MAX_SAOP_ARRAY_SIZE        100
> 
> Which certainly makes me think that comment in there might be worth
> something- perhaps we should make this a GUC and let users see just what
> would end up happening with a different choice.  There could certainly
> be cases where it'd be better to work it out.
> 
>> Not a bug, but a tradeoff.  You'd be unhappy if the planner spent longer
>> devising the plan than it saved to eliminate the extra partitions.
> 
> While I agree in concept, I'm awful curious how the "simpler" approach
> used when we hit the limit resulted in a 5x increase in planning time.
> Looks a bit like the extra time required to perform that elimination for
> at least a few more items would be saving us cycles somewhere else that
> are apparently pretty expensive.

The "simpler" approach in this case is predtest.c not pruning partitions
at all, which results in the planner creating a scan plan for every
partition, instead of just one in the case where the IN(..) list was
within the limit for the pruning to occur.

Fwiw, on the tiny machine I work on, attempting pruning with 100-item list
takes way longer than it takes the planner to just forget about pruning
and create a plan for all partitions.  But that's just about the planning
time.

\d+ lt
Partition key: LIST (b)
Partitions: lt_1 FOR VALUES IN (1),
            lt_10 FOR VALUES IN (10),
            lt_2 FOR VALUES IN (2),
            lt_3 FOR VALUES IN (3),
            lt_4 FOR VALUES IN (4),
            lt_5 FOR VALUES IN (5),
            lt_6 FOR VALUES IN (6),
            lt_7 FOR VALUES IN (7),
            lt_8 FOR VALUES IN (8),
            lt_9 FOR VALUES IN (9)

-- 100-item list allowing pruning to occur
explain analyze select * from lt where b in

(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1);

                                            QUERY PLAN




─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
───────────────────────────────────────────────────────────────────────────────────
 Append  (cost=0.29..437.60 rows=5085 width=8) (actual time=0.096..138.388
rows=10000 loops=1)
   ->  Index Scan using lt_1_b_idx on lt_1  (cost=0.29..437.60 rows=5085
width=8) (actual time=0.087..56.177 rows=10000 loops=1)
         Index Cond: (b = ANY

('{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}'::integer[]))
 Planning time: 24.349 ms
 Execution time: 179.944 ms
(5 rows)

-- 101-item list disabling pruning
explain analyze select * from lt where b in

(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1);

                                             QUERY PLAN




─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
─────────────────────────────────────────────────────────────────────────────────────
 Append  (cost=0.29..4463.78 rows=51360 width=8) (actual
time=0.172..141.214 rows=10000 loops=1)
   ->  Index Scan using lt_1_b_idx on lt_1  (cost=0.29..438.78 rows=5136
width=8) (actual time=0.153..55.516 rows=10000 loops=1)
         Index Cond: (b = ANY

('{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}'::integer[]))
   ->  Index Scan using lt_2_b_idx on lt_2  (cost=0.29..442.78 rows=5136
width=8) (actual time=0.091..0.091 rows=0 loops=1)
         Index Cond: (b = ANY

('{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}'::integer[]))
   ->  Index Scan using lt_3_b_idx on lt_3  (cost=0.29..446.78 rows=5136
width=8) (actual time=0.090..0.090 rows=0 loops=1)
         Index Cond: (b = ANY

('{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}'::integer[]))

[ Index Scan nodes of remaining patitions ]

 Planning time: 7.146 ms
 Execution time: 184.115 ms
(23 rows)

Oddly, that seems exactly the opposite of what OP is seeing.

Thanks,
Amit



pgsql-bugs by date:

Previous
From: Haribabu Kommi
Date:
Subject: Re: pg_restore: All GRANTs on table fail when any one role is missing
Next
From: Amit Langote
Date:
Subject: Re: Scanning all partition when more than 100 items in "where id in()" clause