On 7/19/23 16:44, Jacob Champion wrote:
> This patch pushes down any
> forced-null and not-null Vars as ScanKeys. It doesn't remove the
> redundant quals after turning them into ScanKeys, so it's needlessly
> inefficient, but there's still a decent speedup for some of the basic
> benchmarks in 0003.
>
> Plans look something like this:
>
> # EXPLAIN SELECT * FROM t WHERE i IS NULL;
> QUERY PLAN
> ------------------------------------------------------------
> Seq Scan on t (cost=0.00..1393.00 rows=49530 width=4)
> Scan Cond: (i IS NULL)
> Filter: (i IS NULL)
> (3 rows)
Redundant clauses are now filtered out in v3, so the new plans look more
like what you'd expect:
=# EXPLAIN SELECT * FROM table1 WHERE a IS NOT NULL AND b = 2;
QUERY PLAN
---------------------------------------------------------
Seq Scan on table1 (cost=0.00..3344.00 rows=1 width=4)
Scan Cond: (a IS NOT NULL)
Filter: (b = 2)
(3 rows)
> The non-nullable case worries me a bit because so many things imply IS
> NOT NULL. I think I need to do some sort of cost analysis using the
> null_frac statistics -- it probably only makes sense to push an
> implicit SK_SEARCHNOTNULL down to the AM layer if some fraction of
> rows would actually be filtered out -- but I'm not really sure how to
> choose a threshold.
v3 also uses the nullfrac and the expected cost of the filter clauses to
decide whether to promote a derived IS NOT NULL condition to a ScanKey.
(Explicit IS [NOT] NULL clauses are always promoted.) I'm still not sure
how to fine-tune the expected cost of the ScanKey, but the number I've
chosen for now (`0.1 * cpu_operator_cost`) doesn't seem to regress my
benchmarks, for whatever that's worth.
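In sketch form, the decision is something like the toy program below. The
names and the exact comparison are illustrative rather than code from the
patch; the idea is just to promote when the filter work we expect to skip
on NULL rows beats the per-tuple ScanKey test:

#include <stdbool.h>
#include <stdio.h>

#define CPU_OPERATOR_COST 0.0025    /* default cpu_operator_cost */

/*
 * Promote a derived IS NOT NULL qual to a ScanKey only if the filter work
 * we expect to skip on NULL rows outweighs the per-tuple ScanKey test.
 */
static bool
promote_derived_notnull(double nullfrac, double per_tuple_filter_cost)
{
    double per_tuple_scankey_cost = 0.1 * CPU_OPERATOR_COST;

    return nullfrac * per_tuple_filter_cost > per_tuple_scankey_cost;
}

int
main(void)
{
    /* mostly non-NULL column, cheap filter: not worth a ScanKey */
    printf("%d\n", promote_derived_notnull(0.001, 1 * CPU_OPERATOR_COST));

    /* plenty of NULLs in front of an expensive filter: promote */
    printf("%d\n", promote_derived_notnull(0.3, 100 * CPU_OPERATOR_COST));

    return 0;
}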
I recorded several of the changes to the regression EXPLAIN output, but
I've left a few broken, because I'm not sure whether the new conditions
are useful or whether I should just disable the optimization for those
cases. For example:
explain (analyze, costs off, summary off, timing off)
select * from list_part where a = list_part_fn(1) + a;
QUERY PLAN
------------------------------------------------------------------
Append (actual rows=0 loops=1)
-> Seq Scan on list_part1 list_part_1 (actual rows=0 loops=1)
+ Scan Cond: (a IS NOT NULL)
Filter: (a = (list_part_fn(1) + a))
Rows Removed by Filter: 1
-> Seq Scan on list_part2 list_part_2 (actual rows=0 loops=1)
+ Scan Cond: (a IS NOT NULL)
Filter: (a = (list_part_fn(1) + a))
Rows Removed by Filter: 1
-> Seq Scan on list_part3 list_part_3 (actual rows=0 loops=1)
+ Scan Cond: (a IS NOT NULL)
Filter: (a = (list_part_fn(1) + a))
Rows Removed by Filter: 1
-> Seq Scan on list_part4 list_part_4 (actual rows=0 loops=1)
+ Scan Cond: (a IS NOT NULL)
Filter: (a = (list_part_fn(1) + a))
Rows Removed by Filter: 1
These new conditions are due to the lack of statistics for the tiny
partitions: the filters are considered expensive enough that the savings
against a DEFAULT_UNK_SEL proportion of NULLs would hypothetically be
worth it (rough numbers below). As best I can tell, this is a non-issue,
since autovacuum will fix the problem by the time the table grows to the
point where the pointless ScanKey would cause a slowdown. But it sure
_looks_ like a mistake, especially since these particular partitions
can't even contain NULL. Do I need to do something about this short-lived
case?
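(Rough numbers, assuming the comparison sketched above and the planner
defaults: with no statistics, nullfrac falls back to DEFAULT_UNK_SEL =
0.005; list_part_fn() is plpgsql, so its default procost of 100 puts the
filter at roughly 100 * cpu_operator_cost = 0.25 per tuple; and 0.005 *
0.25 comes to about 0.00125, which easily clears the 0.1 *
cpu_operator_cost = 0.00025 ScanKey threshold.)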
There's still other work to do -- for instance, I suspect my
modifications to the filter clauses have broken some isolation levels
until I fix up SeqRecheck(), and better benchmarks would be good -- but I
think this is ready for early CF feedback.
Thanks,
--Jacob