Re: Partition pruning with array-contains check and current_setting function - Mailing list pgsql-performance

From Marcelo Zabani
Subject Re: Partition pruning with array-contains check and current_setting function
Date
Msg-id CACgY3QY6bO46zhPb9fa_c=qi3_OLQ7t4eyZcAd8Mkd9yyPpxXA@mail.gmail.com
Whole thread Raw
In response to Partition pruning with array-contains check and current_setting function  (Marcelo Zabani <mzabani@gmail.com>)
List pgsql-performance
I managed to get a plan I was hoping for, but it still doesn't prune partitions. I created a new operator #|<(integer[], integer) that is defined in SQL and is basically equivalent to value=ANY(array), and a non-stable tenants() function defined that returns an array from the setting, and with that I could use a scalar subquery without running into type-checking errors. This gives me an InitPlan node:

=> SET my.tenant_id='{1}';EXPLAIN (COSTS OFF) SELECT COUNT(*) FROM tbl WHERE tenant_id #|< (select tenants());
SET
                        QUERY PLAN
----------------------------------------------------------
 Finalize Aggregate
   InitPlan 1 (returns $0)
     ->  Result
   ->  Gather
         Workers Planned: 2
         Params Evaluated: $0
         ->  Partial Aggregate
               ->  Parallel Append
                     ->  Parallel Seq Scan on tbl2 tbl_2
                           Filter: (tenant_id = ANY ($0))
                     ->  Parallel Seq Scan on tbl1 tbl_1
                           Filter: (tenant_id = ANY ($0))




It still doesn't prune even if I EXPLAIN ANALYZE it. I thought maybe I did something wrong with the operator definition, so I tried making tenants() immutable and removing the scalar subquery, and then it does prune:
=> SET my.tenant_id='{1}';EXPLAIN (COSTS OFF) SELECT COUNT(*) FROM tbl WHERE tenant_id #|< tenants();
SET
                      QUERY PLAN
------------------------------------------------------
 Aggregate
   ->  Seq Scan on tbl1 tbl
         Filter: (tenant_id = ANY ('{1}'::integer[]))



Sadly I can't make tenants() immutable because it's a runtime setting, and making tenants() STABLE does not lead to partition pruning with or without the scalar subquery around it.

I'm a bit lost. It seems like postgres is fully capable of pruning partitions for =ANY checks, and some strange detail is confusing it in this case. I'm not sure what else to try.

On Wed, Aug 7, 2024 at 6:10 PM Marcelo Zabani <mzabani@gmail.com> wrote:
Hello all. I am trying to make postgres 16 prune partition for queries with `WHERE tenant_id=ANY(current_setting('my.tenant_id')::integer[])`, but I haven't been able to make it work, and naturally it impacts performance so I thought this list would be appropriate.

Here's the SQL I tried (but feel free to skip to the end as I'm sure all this stuff is obvious to you!):

CREATE TABLE tbl (id SERIAL NOT NULL, tenant_id INT NOT NULL, some_col INT, PRIMARY KEY (tenant_id, id))
PARTITION BY HASH (tenant_id);

CREATE TABLE tbl1 PARTITION OF tbl FOR VALUES WITH (MODULUS 2, REMAINDER 0);
CREATE TABLE tbl2 PARTITION OF tbl FOR VALUES WITH (MODULUS 2, REMAINDER 1);

INSERT INTO tbl (tenant_id, some_col) SELECT 1, * FROM generate_series(1,10000);
INSERT INTO tbl (tenant_id, some_col) SELECT 3, * FROM generate_series(1,10000);


Partition pruning works as expected for this query (still not an array-contains check):
EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE tenant_id=1;

When reading from a setting it also prunes partitions correctly:
SET my.tenant_id=1;
EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE tenant_id=current_setting('my.tenant_id')::integer;


It still does partition pruning if we use a scalar subquery. I can see the (never executed) scans in the plan.
EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE tenant_id=(SELECT current_setting('my.tenant_id')::integer);

But how about an array-contains check? Still prunes, which is nice.
EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE tenant_id=ANY('{1}'::integer[]);

However, it doesn't prune if the array is in a setting:
SET my.tenant_id='{1}';
EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE tenant_id=ANY(current_setting('my.tenant_id')::integer[]);


I actually expected that when in a setting, none of the previous queries would've done partition pruning because I thought `current_setting` is not a stable function. But some of them did, which surprised me.

So I thought maybe if I put it in a scalar query it will give me an InitPlan node, but it looks like method resolution for =ANY won't let me try this:
EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE tenant_id=ANY((SELECT current_setting('my.tenant_id')::integer[]));
ERROR:  operator does not exist: integer = integer[]

I tried using UNNEST, but that adds a Hash Semi Join to the plan which also doesn't do partition pruning.
EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE tenant_id=ANY((SELECT UNNEST(current_setting('my.tenant_id')::integer[])));

My question is if there's a way to do partition pruning based on array-contains operator if the array is in a setting. The use-case is to make Row Level Security policies do partition pruning "automatically" in a setting where users can be in more than one tenant.
It feels like this would work if there were a non-overloaded operator that takes in an array and a single element and tests for array-contains, because then I could use that operator with a scalar subquery and get an InitPlan node. But I'm new to all of this, so apologies if I'm getting it all wrong!

Thanks in advance,
Marcelo.

pgsql-performance by date:

Previous
From: David Mullineux
Date:
Subject: Re: Has gen_random_uuid() gotten much slower in v17?
Next
From: Andrey Stikheev
Date:
Subject: Performance degradation in Index searches with special characters