BUG #18941: PostgreSQL planner does not select a multicolumn btree_gin index under RLS - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18941: PostgreSQL planner does not select a multicolumn btree_gin index under RLS
Date
Msg-id 18941-90236c38c6101bf0@postgresql.org
Whole thread Raw
Responses Re: BUG #18941: PostgreSQL planner does not select a multicolumn btree_gin index under RLS
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18941
Logged by:          Pepe Fagoaga
Email address:      pepe@prowler.com
PostgreSQL version: 16.3
Operating system:   Alpine Linux
Description:

We have the following `findings` table:
prowler_db=> \d findings
Partitioned table "public.findings"
     Column      |           Type           | Collation | Nullable |
Default

-----------------+--------------------------+-----------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 id              | uuid                     |           | not null |
 inserted_at     | timestamp with time zone |           | not null |
 updated_at      | timestamp with time zone |           | not null |
 uid             | character varying(300)   |           | not null |
 delta           | finding_delta            |           |          |
 status          | status                   |           | not null |
 status_extended | text                     |           |          |
 severity        | severity                 |           | not null |
 impact          | severity                 |           | not null |
 impact_extended | text                     |           |          |
 raw_result      | jsonb                    |           | not null |
 check_id        | character varying(100)   |           | not null |
 check_metadata  | jsonb                    |           | not null |
 tags            | jsonb                    |           |          |
 scan_id         | uuid                     |           | not null |
 tenant_id       | uuid                     |           | not null |
 text_search     | tsvector                 |           |          |
generated always as (((setweight(to_tsvector('english'::regconfig,
COALESCE(impact_extended, ''::text)), 'A'::"char") ||
setweight(to_tsvector('english'::regconfig, COALESCE(status_extended,
''::text)), 'B'::"char")) ||
setweight(jsonb_to_tsvector('simple'::regconfig, check_metadata, '["string",
"numeric"]'::jsonb), 'D'::"char")) ||
setweight(jsonb_to_tsvector('simple'::regconfig, tags, '["string",
"numeric"]'::jsonb), 'D'::"char")) stored
 first_seen_at   | timestamp with time zone |           |          |
 muted           | boolean                  |           | not null |
 compliance      | jsonb                    |           |          |
Partition key: RANGE (id)
Indexes:
    "findings_pkey" PRIMARY KEY, btree (id)
    "find_delta_new_idx" btree (tenant_id, id) WHERE delta =
'new'::finding_delta
    "find_tenant_scan_id_idx" btree (tenant_id, scan_id, id)
    "findings_filter_idx" btree (scan_id, impact, severity, status,
check_id, delta)
    "findings_scan_id_4df6a7a0" btree (scan_id)
    "findings_search_tenant" gin (text_search, tenant_id)
    "findings_tenant_and_id_idx" btree (tenant_id, id)
    "findings_tenant_id_924c8b16" btree (tenant_id)
    "findings_tenant_search" gin (tenant_id, text_search)
    "findings_uid_idx" btree (uid)
    "uid_tenant_inserted_at" btree (uid, tenant_id, inserted_at DESC)
Foreign-key constraints:
    "findings_scan_id_4df6a7a0_fk_scans_id" FOREIGN KEY (scan_id) REFERENCES
scans(id) DEFERRABLE INITIALLY DEFERRED
    "findings_tenant_id_924c8b16_fk_tenants_id" FOREIGN KEY (tenant_id)
REFERENCES tenants(id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
    TABLE "resource_finding_mappings" CONSTRAINT
"resource_finding_mappings_finding_id_d5a5c48e_fk_findings_id" FOREIGN KEY
(finding_id) REFERENCES findings(id) DEFERRABLE INITIALLY DEFERRED
Policies (forced row security enabled):
    POLICY "prowler_findings_default_select" FOR SELECT
      TO prowler
      USING ((current_tenant_id() = tenant_id))
    POLICY "prowler_findings_delete" FOR DELETE
      TO prowler
      USING (
CASE
    WHEN (current_setting('api.tenant_id'::text, true) IS NULL) THEN false
    ELSE (tenant_id = (current_setting('api.tenant_id'::text))::uuid)
END)
    POLICY "prowler_findings_insert" FOR INSERT
      TO prowler
      WITH CHECK (
CASE
    WHEN (current_setting('api.tenant_id'::text, true) IS NULL) THEN false
    ELSE (tenant_id = (current_setting('api.tenant_id'::text))::uuid)
END)
    POLICY "prowler_findings_select" FOR SELECT
      TO prowler
      USING ((current_tenant_id() = tenant_id))
    POLICY "prowler_findings_update" FOR UPDATE
      TO prowler
      USING (
CASE
    WHEN (current_setting('api.tenant_id'::text, true) IS NULL) THEN false
    ELSE (tenant_id = (current_setting('api.tenant_id'::text))::uuid)
END)
Number of partitions: 8 (Use \d+ to list them.)
When a non superuser tries to run the following query "select id from
findings_default where tenant_id = 'a3b89764-af41-4a9c-a735-38b03d3473ac'
and text_search @@ plainto_tsquery('analyzer');" the planner decides not to
pick the btree_gin index with tenant_id and text_search in favor of just
picking the tenant_id index. The following is the query's explain:
prowler_db=> explain (analyze, buffers) select id from findings_default
where tenant_id = 'a3b89764-af41-4a9c-a735-38b03d3473ac' and text_search @@
plainto_tsquery('analyzer');
                                                                      QUERY
PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=246.87..47877.83 rows=10 width=16) (actual
time=36.039..36.041 rows=0 loops=1)
   One-Time Filter: (current_tenant_id() =
'a3b89764-af41-4a9c-a735-38b03d3473ac'::uuid)
   Buffers: shared hit=1185
   ->  Bitmap Heap Scan on findings_default  (cost=246.87..47877.83 rows=10
width=16) (actual time=35.858..35.859 rows=0 loops=1)
         Recheck Cond: (tenant_id =
'a3b89764-af41-4a9c-a735-38b03d3473ac'::uuid)
         Filter: (text_search @@ plainto_tsquery('analyzer'::text))
         Rows Removed by Filter: 20999
         Heap Blocks: exact=1166
         Buffers: shared hit=1185
         ->  Bitmap Index Scan on findings_default_tenant_id_idx
(cost=0.00..246.62 rows=21092 width=0) (actual time=0.854..0.855 rows=20999
loops=1)
               Index Cond: (tenant_id =
'a3b89764-af41-4a9c-a735-38b03d3473ac'::uuid)
               Buffers: shared hit=19
 Planning:
   Buffers: shared hit=2
 Planning Time: 9.587 ms
 Execution Time: 36.190 ms
(16 rows)
After a thorough investigation I get to the point that RLS is getting into
conflict with how the btree_gin index works because it is never picked
regardless of how the RLS policy is configured -- I tried using security
definer functions, wrapping the condition as an InitPlan, and some other
workarounds that did not work.
I'm convinced that the issue comes from how RLS affects during the plan
because using a role with BYPASSRLS, like the superuser, makes the planner
to pick the index I expect. See the following explain:
prowler_db=# explain (analyze, buffers) select id from findings_default
where tenant_id = 'a3b89764-af41-4a9c-a735-38b03d3473ac' and text_search @@
plainto_tsquery('analyzer');
                                                                      QUERY
PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on findings_default  (cost=786.90..829.16 rows=10
width=16) (actual time=4.035..4.039 rows=0 loops=1)
   Recheck Cond: ((text_search @@ plainto_tsquery('analyzer'::text)) AND
(tenant_id = 'a3b89764-af41-4a9c-a735-38b03d3473ac'::uuid))
   Buffers: shared hit=185
   ->  Bitmap Index Scan on findings_default_text_search_tenant_id_idx
(cost=0.00..786.90 rows=10 width=0) (actual time=4.017..4.019 rows=0
loops=1)
         Index Cond: ((text_search @@ plainto_tsquery('analyzer'::text)) AND
(tenant_id = 'a3b89764-af41-4a9c-a735-38b03d3473ac'::uuid))
         Buffers: shared hit=185
 Planning:
   Buffers: shared hit=2
 Planning Time: 6.783 ms
 Execution Time: 4.251 ms
(10 rows)
Thanks in advance.
Best,
Pepe.


pgsql-bugs by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5
Next
From: PG Bug reporting form
Date:
Subject: BUG #18942: walsender memory allocation failure adding snapshot and invalidations to logical replica w/PG 16.9