Re: [RFC] [PATCH] Flexible "partition pruning" hook - Mailing list pgsql-hackers

From Mike Palmiotto
Subject Re: [RFC] [PATCH] Flexible "partition pruning" hook
Date
Msg-id CAMN686Ec3RM3HQVGz8Sk=KW4m29+mGRu81svrtCPB20hsHRzPg@mail.gmail.com
Whole thread Raw
In response to RE: [RFC] [PATCH] Flexible "partition pruning" hook  ("Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com>)
Responses Re: [RFC] [PATCH] Flexible "partition pruning" hook
Re: [RFC] [PATCH] Flexible "partition pruning" hook
List pgsql-hackers
On Tue, Feb 26, 2019 at 1:55 AM Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
>
> From: Mike Palmiotto [mailto:mike.palmiotto@crunchydata.com]
> > Attached is a patch which attempts to solve a few problems:
> >
> > 1) Filtering out partitions flexibly based on the results of an external
> > function call (supplied by an extension).
> > 2) Filtering out partitions from pg_inherits based on the same function
> > call.
> > 3) Filtering out partitions from a partitioned table BEFORE the partition
> > is actually opened on-disk.
>
> What concrete problems would you expect this patch to solve?  What kind of extensions do you imagine?  I'd like to
hearabout the examples.  For example, "PostgreSQL 12 will not be able to filter out enough partitions when
planning/executingSELECT ... WHERE ... statement.  But an extension like this can extract just one partition early." 

My only application of the patch thus far has been to apply an RLS
policy driven by the extension's results. For example:

CREATE TABLE test.partpar
(
  a int,
  b text DEFAULT (extension_default_bfield('test.partpar'::regclass::oid))
)  PARTITION BY LIST (extension_translate_bfield(b));

CREATE POLICY filter_select on test.partpar for SELECT
USING (extension_filter_by_bfield(b));

CREATE POLICY filter_select on test.partpar for INSERT
WITH CHECK (extension_generate_insert_bfield('test.partpar'::regclass::oid)
= b);

CREATE POLICY filter_update on test.partpar for UPDATE
USING (extension_filter_by_bfield(b))
WITH CHECK (extension_filter_by_bfield(b));

CREATE POLICY filter_delete on test.partpar for DELETE
USING (extension_filter_by_bfield(b));

The function would filter based on some external criteria relating to
the username and the contents of the b column.

The desired effect would be to have `SELECT * from test.partpar;`
return check only the partitions where username can see any row in the
table based on column b. This is applicable, for instance, when a
partition of test.partpar (say test.partpar_b2) is given a label with
`SECURITY LABEL on TABLE test.partpar_b2 IS 'foo';` which is exactly
the same as the b column for every row in said partition. Using this
hook, we can simply check the table label and kick the entire
partition out early on. This should greatly improve performance for
the case where you can enforce that the partition SECURITY LABEL and
the b column are the same.

>
> Would this help the following issues with PostgreSQL 12?
>
> * UPDATE/DELETE planning takes time in proportion to the number of partitions, even when the actually accessed
partitionduring query execution is only one. 
>
> * Making a generic plan takes prohibitably long time (IIRC, about 12 seconds when the number of partitoons is 1,000
or8,000.) 

In theory, we'd be checking fewer items (the labels of the partitions,
instead of the b column for every row), so it may indeed help with
performance in these cases.

Admittedly, I haven't looked at either of these very closely. Do you
have any specific test cases I can try out on my end to verify?
--
Mike Palmiotto
Software Engineer
Crunchy Data Solutions
https://crunchydata.com


pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: pgbench MAX_ARGS
Next
From: Tom Lane
Date:
Subject: A note about recent ecpg buildfarm failures