Thread: [RFC] [PATCH] Flexible "partition pruning" hook
Greetings, Attached is a patch which attempts to solve a few problems: 1) Filtering out partitions flexibly based on the results of an external function call (supplied by an extension). 2) Filtering out partitions from pg_inherits based on the same function call. 3) Filtering out partitions from a partitioned table BEFORE the partition is actually opened on-disk. The name "partitionChildAccess_hook" comes from the fact that the backend may not have access to a particular partition within the partitioned table. The idea would be to silently filter out the partition from queries to the parent table, which means also adjusting the returned contents of find_inheritance_children based on the external function. I am curious how the community feels about these patches and if there is an alternative approach to solve the above issues (perhaps another existing hook). Thanks for your time. -- Mike Palmiotto Software Engineer Crunchy Data Solutions https://crunchydata.com
Attachment
From: Mike Palmiotto [mailto:mike.palmiotto@crunchydata.com] > Attached is a patch which attempts to solve a few problems: > > 1) Filtering out partitions flexibly based on the results of an external > function call (supplied by an extension). > 2) Filtering out partitions from pg_inherits based on the same function > call. > 3) Filtering out partitions from a partitioned table BEFORE the partition > is actually opened on-disk. What concrete problems would you expect this patch to solve? What kind of extensions do you imagine? I'd like to hear aboutthe examples. For example, "PostgreSQL 12 will not be able to filter out enough partitions when planning/executingSELECT ... WHERE ... statement. But an extension like this can extract just one partition early." Would this help the following issues with PostgreSQL 12? * UPDATE/DELETE planning takes time in proportion to the number of partitions, even when the actually accessed partitionduring query execution is only one. * Making a generic plan takes prohibitably long time (IIRC, about 12 seconds when the number of partitoons is 1,000 or 8,000.) Regards Takayuki Tsunakawa
On Tue, Feb 26, 2019 at 06:55:30AM +0000, Tsunakawa, Takayuki wrote: > What concrete problems would you expect this patch to solve? What > kind of extensions do you imagine? I'd like to hear about the > examples. For example, "PostgreSQL 12 will not be able to filter > out enough partitions when planning/executing SELECT ... WHERE > ... statement. But an extension like this can extract just one > partition early." Indeed. Hooks should be defined so as their use is as generic and possible depending on their context, particularly since there is a planner hook.. It is also important to consider first if the existing core code can be made better depending on the requirements, removing the need for a hook at the end. -- Michael
Attachment
On Tue, Feb 26, 2019 at 1:55 AM Tsunakawa, Takayuki <tsunakawa.takay@jp.fujitsu.com> wrote: > > From: Mike Palmiotto [mailto:mike.palmiotto@crunchydata.com] > > Attached is a patch which attempts to solve a few problems: > > > > 1) Filtering out partitions flexibly based on the results of an external > > function call (supplied by an extension). > > 2) Filtering out partitions from pg_inherits based on the same function > > call. > > 3) Filtering out partitions from a partitioned table BEFORE the partition > > is actually opened on-disk. > > What concrete problems would you expect this patch to solve? What kind of extensions do you imagine? I'd like to hearabout the examples. For example, "PostgreSQL 12 will not be able to filter out enough partitions when planning/executingSELECT ... WHERE ... statement. But an extension like this can extract just one partition early." My only application of the patch thus far has been to apply an RLS policy driven by the extension's results. For example: CREATE TABLE test.partpar ( a int, b text DEFAULT (extension_default_bfield('test.partpar'::regclass::oid)) ) PARTITION BY LIST (extension_translate_bfield(b)); CREATE POLICY filter_select on test.partpar for SELECT USING (extension_filter_by_bfield(b)); CREATE POLICY filter_select on test.partpar for INSERT WITH CHECK (extension_generate_insert_bfield('test.partpar'::regclass::oid) = b); CREATE POLICY filter_update on test.partpar for UPDATE USING (extension_filter_by_bfield(b)) WITH CHECK (extension_filter_by_bfield(b)); CREATE POLICY filter_delete on test.partpar for DELETE USING (extension_filter_by_bfield(b)); The function would filter based on some external criteria relating to the username and the contents of the b column. The desired effect would be to have `SELECT * from test.partpar;` return check only the partitions where username can see any row in the table based on column b. This is applicable, for instance, when a partition of test.partpar (say test.partpar_b2) is given a label with `SECURITY LABEL on TABLE test.partpar_b2 IS 'foo';` which is exactly the same as the b column for every row in said partition. Using this hook, we can simply check the table label and kick the entire partition out early on. This should greatly improve performance for the case where you can enforce that the partition SECURITY LABEL and the b column are the same. > > Would this help the following issues with PostgreSQL 12? > > * UPDATE/DELETE planning takes time in proportion to the number of partitions, even when the actually accessed partitionduring query execution is only one. > > * Making a generic plan takes prohibitably long time (IIRC, about 12 seconds when the number of partitoons is 1,000 or8,000.) In theory, we'd be checking fewer items (the labels of the partitions, instead of the b column for every row), so it may indeed help with performance in these cases. Admittedly, I haven't looked at either of these very closely. Do you have any specific test cases I can try out on my end to verify? -- Mike Palmiotto Software Engineer Crunchy Data Solutions https://crunchydata.com
On Tue, Feb 26, 2019 at 1:06 PM Mike Palmiotto <mike.palmiotto@crunchydata.com> wrote: > > On Tue, Feb 26, 2019 at 1:55 AM Tsunakawa, Takayuki > <tsunakawa.takay@jp.fujitsu.com> wrote: > > > > From: Mike Palmiotto [mailto:mike.palmiotto@crunchydata.com] > > > Attached is a patch which attempts to solve a few problems: Updated patch attached. > > > <snip> > > What concrete problems would you expect this patch to solve? What kind of extensions do you imagine? I'd like to hearabout the examples. For example, "PostgreSQL 12 will not be able to filter out enough partitions when planning/executingSELECT ... WHERE ... statement. But an extension like this can extract just one partition early." > > My only application of the patch thus far has been to apply an RLS > policy driven by the extension's results. For example: > > CREATE TABLE test.partpar > ( > a int, > b text DEFAULT (extension_default_bfield('test.partpar'::regclass::oid)) > ) PARTITION BY LIST (extension_translate_bfield(b)); > > CREATE POLICY filter_select on test.partpar for SELECT > USING (extension_filter_by_bfield(b)); > > CREATE POLICY filter_select on test.partpar for INSERT > WITH CHECK (extension_generate_insert_bfield('test.partpar'::regclass::oid) > = b); > > CREATE POLICY filter_update on test.partpar for UPDATE > USING (extension_filter_by_bfield(b)) > WITH CHECK (extension_filter_by_bfield(b)); > > CREATE POLICY filter_delete on test.partpar for DELETE > USING (extension_filter_by_bfield(b)); > > The function would filter based on some external criteria relating to > the username and the contents of the b column. > > The desired effect would be to have `SELECT * from test.partpar;` > return check only the partitions where username can see any row in the > table based on column b. This is applicable, for instance, when a > partition of test.partpar (say test.partpar_b2) is given a label with > `SECURITY LABEL on TABLE test.partpar_b2 IS 'foo';` which is exactly > the same as the b column for every row in said partition. Using this > hook, we can simply check the table label and kick the entire > partition out early on. This should greatly improve performance for > the case where you can enforce that the partition SECURITY LABEL and > the b column are the same. Is this explanation suitable, or is more information required? Thanks, -- Mike Palmiotto Software Engineer Crunchy Data Solutions https://crunchydata.com
Attachment
On 2019-02-26 19:06, Mike Palmiotto wrote: > The desired effect would be to have `SELECT * from test.partpar;` > return check only the partitions where username can see any row in the > table based on column b. This is applicable, for instance, when a > partition of test.partpar (say test.partpar_b2) is given a label with > `SECURITY LABEL on TABLE test.partpar_b2 IS 'foo';` which is exactly > the same as the b column for every row in said partition. Using this > hook, we can simply check the table label and kick the entire > partition out early on. This should greatly improve performance for > the case where you can enforce that the partition SECURITY LABEL and > the b column are the same. To rephrase this: You have a partitioned table, and you have a RLS policy that hides certain rows, and you know based on your business logic that under certain circumstances entire partitions will be hidden, so they don't need to be scanned. So you want a planner hook that would allow you to prune those partitions manually. That sounds pretty hackish to me. We should give the planner and executor the appropriate information to make these decisions, like an additional partition constraint. If this information is hidden in user-defined functions in a way that cannot be reasoned about, who is enforcing these constraints and how do we know they are actually correct? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Feb 27, 2019 at 12:36 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > <snip> > To rephrase this: You have a partitioned table, and you have a RLS > policy that hides certain rows, and you know based on your business > logic that under certain circumstances entire partitions will be hidden, > so they don't need to be scanned. So you want a planner hook that would > allow you to prune those partitions manually. > > That sounds pretty hackish to me. We should give the planner and > executor the appropriate information to make these decisions, like an > additional partition constraint. Are you thinking of a partition pruning step for FuncExpr's or something else? I was considering an implementation where FuncExpr's were marked for execution-time pruning, but wanted to see if this patch got any traction first. > If this information is hidden in > user-defined functions in a way that cannot be reasoned about, who is > enforcing these constraints and how do we know they are actually correct? The author of the extension which utilizes the hook would have to be sure they use the hook correctly. This is not a new or different concept to any other existing hook. This hook in particular would be used by security extensions that have some understanding of the underlying security model being implemented by RLS. Thanks, -- Mike Palmiotto Software Engineer Crunchy Data Solutions https://crunchydata.com
Hi Peter, On 2/28/19 10:36 PM, Mike Palmiotto wrote: > On Wed, Feb 27, 2019 at 12:36 PM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> <snip> >> To rephrase this: You have a partitioned table, and you have a RLS >> policy that hides certain rows, and you know based on your business >> logic that under certain circumstances entire partitions will be hidden, >> so they don't need to be scanned. So you want a planner hook that would >> allow you to prune those partitions manually. >> >> That sounds pretty hackish to me. We should give the planner and >> executor the appropriate information to make these decisions, like an >> additional partition constraint. > > Are you thinking of a partition pruning step for FuncExpr's or > something else? I was considering an implementation where FuncExpr's > were marked for execution-time pruning, but wanted to see if this > patch got any traction first. > >> If this information is hidden in >> user-defined functions in a way that cannot be reasoned about, who is >> enforcing these constraints and how do we know they are actually correct? > > The author of the extension which utilizes the hook would have to be > sure they use the hook correctly. This is not a new or different > concept to any other existing hook. This hook in particular would be > used by security extensions that have some understanding of the > underlying security model being implemented by RLS. Thoughts on Mike's response? Regards, -- -David david@pgmasters.net
Hi, On 2019-02-28 13:36:32 -0500, Mike Palmiotto wrote: > On Wed, Feb 27, 2019 at 12:36 PM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: > > <snip> > > To rephrase this: You have a partitioned table, and you have a RLS > > policy that hides certain rows, and you know based on your business > > logic that under certain circumstances entire partitions will be hidden, > > so they don't need to be scanned. So you want a planner hook that would > > allow you to prune those partitions manually. > > > > That sounds pretty hackish to me. We should give the planner and > > executor the appropriate information to make these decisions, like an > > additional partition constraint. > > Are you thinking of a partition pruning step for FuncExpr's or > something else? I was considering an implementation where FuncExpr's > were marked for execution-time pruning, but wanted to see if this > patch got any traction first. > > > If this information is hidden in > > user-defined functions in a way that cannot be reasoned about, who is > > enforcing these constraints and how do we know they are actually correct? > > The author of the extension which utilizes the hook would have to be > sure they use the hook correctly. This is not a new or different > concept to any other existing hook. This hook in particular would be > used by security extensions that have some understanding of the > underlying security model being implemented by RLS. I noticed that the CF entry for this patch is marked as targeting v12. Even if we had agreement on the design - which we pretty clearly don't - I don't see that being realistic for a patch submitted a few days before the final v12 CF. That really only should contain simple new patches, or, obviously, patches that have been longer in development. I've moved this to the next CF, and marked it as targeting v13. Greetings, Andres Freund
On Sat, Apr 6, 2019 at 3:06 PM Andres Freund <andres@anarazel.de> wrote: > I've moved this to the next CF, and marked it as targeting v13. Hi Mike, Commifest 1 for PostgreSQL 13 is here. I was just going through the automated build results for the 'fest and noticed that your patch causes a segfault in the regression tests (possibly due to other changes that have been made in master since February). You can see the complete backtrace on the second link below, but it looks like this is happening every time, so hopefully not hard to track down locally. https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.46412 https://travis-ci.org/postgresql-cfbot/postgresql/builds/555380617 Thanks, -- Thomas Munro https://enterprisedb.com
On Sun, Jul 7, 2019 at 9:46 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Sat, Apr 6, 2019 at 3:06 PM Andres Freund <andres@anarazel.de> wrote:
> I've moved this to the next CF, and marked it as targeting v13.
Hi Mike,
Commifest 1 for PostgreSQL 13 is here. I was just going through the
automated build results for the 'fest and noticed that your patch
causes a segfault in the regression tests (possibly due to other
changes that have been made in master since February). You can see
the complete backtrace on the second link below, but it looks like
this is happening every time, so hopefully not hard to track down
locally.
https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.46412
https://travis-ci.org/postgresql-cfbot/postgresql/builds/555380617
Thanks, Thomas, I'll look into this today.
--
On Mon, Jul 8, 2019 at 10:59 AM Mike Palmiotto <mike.palmiotto@crunchydata.com> wrote: > > On Sun, Jul 7, 2019 at 9:46 PM Thomas Munro <thomas.munro@gmail.com> wrote: >> >> On Sat, Apr 6, 2019 at 3:06 PM Andres Freund <andres@anarazel.de> wrote: >> > I've moved this to the next CF, and marked it as targeting v13. >> >> Hi Mike, >> >> Commifest 1 for PostgreSQL 13 is here. I was just going through the >> automated build results for the 'fest and noticed that your patch >> causes a segfault in the regression tests (possibly due to other >> changes that have been made in master since February). You can see >> the complete backtrace on the second link below, but it looks like >> this is happening every time, so hopefully not hard to track down >> locally. >> >> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.46412 >> https://travis-ci.org/postgresql-cfbot/postgresql/builds/555380617 Attached you will find a patch (rebased on master) that passes all tests on my local CentOS 7 box. Thanks again for the catch! -- Mike Palmiotto Software Engineer Crunchy Data Solutions https://crunchydata.com
Attachment
On Tue, Jul 9, 2019 at 3:31 PM Mike Palmiotto <mike.palmiotto@crunchydata.com> wrote: > Attached you will find a patch (rebased on master) that passes all > tests on my local CentOS 7 box. Thanks again for the catch! Hi Mike, Here are some comments on superficial aspects of the patch: +/* Custom partition child access hook. Provides further partition pruning given + * child OID. + */ Should be like: /* * Multi-line comment... */ Why "child"? Don't you really mean "Partition pruning hook. Provides custom pruning given partition OID." or something? +typedef bool (*partitionChildAccess_hook_type) (Oid childOID); +PGDLLIMPORT partitionChildAccess_hook_type partitionChildAccess_hook; Hmm, I wonder if this could better evoke the job that it's doing... partition_filter_hook? partition_access_filter_hook? partition_prune_hook? +/* Macro to use partitionChildAccess_hook. Handles NULL-checking. */ It's not a macro, it's a function. +static inline bool InvokePartitionChildAccessHook (Oid childOID) +{ + if (partitionChildAccess_hook && enable_partition_pruning && childOID) + { Normally we write OidIsValid(childOID) rather than comparing with 0. I wonder if you should call the variable relId? Single line if branches don't usually get curly braces. + return (*partitionChildAccess_hook) (childOID); The syntax we usually use for calling function pointers is just partitionChildAccess_hook(childOID). -- Thomas Munro https://enterprisedb.com
On Thu, Jul 11, 2019 at 8:49 AM Thomas Munro <thomas.munro@gmail.com> wrote: > <snip> > > Here are some comments on superficial aspects of the patch: > > +/* Custom partition child access hook. Provides further partition pruning given > + * child OID. > + */ > > Should be like: > > /* > * Multi-line comment... > */ Fixed in attached patch. > > Why "child"? Don't you really mean "Partition pruning hook. Provides > custom pruning given partition OID." or something? > > +typedef bool (*partitionChildAccess_hook_type) (Oid childOID); > +PGDLLIMPORT partitionChildAccess_hook_type partitionChildAccess_hook; > > Hmm, I wonder if this could better evoke the job that it's doing... > partition_filter_hook? > partition_access_filter_hook? partition_prune_hook? Ended up going with partition_prune_hook. Good call. > > +/* Macro to use partitionChildAccess_hook. Handles NULL-checking. */ > > It's not a macro, it's a function. Copy-pasta. Fixed. > > +static inline bool InvokePartitionChildAccessHook (Oid childOID) > +{ > + if (partitionChildAccess_hook && enable_partition_pruning && childOID) > + { > > Normally we write OidIsValid(childOID) rather than comparing with 0. > I wonder if you should call the variable relId? Single line if > branches don't usually get curly braces. Fixed. > > + return (*partitionChildAccess_hook) (childOID); > > The syntax we usually use for calling function pointers is just > partitionChildAccess_hook(childOID). Fixed. -- Mike Palmiotto Software Engineer Crunchy Data Solutions https://crunchydata.com
Attachment
Hello. I'm on Peter's side. Unlike other similar hooks, this hook is provided just to introduce arbitrary inconsistency in partition pruning. # By the way, InvokePartitionPruningHook seems useless if the # reason for it is to avoid duplicate if()'s . Adding check constraint on children works as far as the RLSish function is immutable. Do you have any concrete example or picture of what you want to achieve? By the way, while considering this, I noticed the following table definition passes. > create table t (id serial, a text check (a = '' or a = CURRENT_USER::text)); I don't think it is the right behavior. > grant all on t to public; > grant all on t_id_seq to public; > \c postgres u1 > insert into t(a) values ('u1'); > \c postgres u2 > insert into t(a) values ('u2'); > \c postgres horiguti > insert into t(a) values ('horiguti'); > select * from t; > id | a > ----+---------- > 1 | u1 > 2 | u2 > 3 | horiguti Broken... The attached patch make parser reject that but I'm not sure how much it affects existing users. > =# create table t (id serial, a text check (a = '' or a = CURRENT_USER::text)); > ERROR: mutable functions are not allowed in constraint expression regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c index 97f535a2f0..2fd233bf18 100644 --- a/src/backend/parser/parse_expr.c +++ b/src/backend/parser/parse_expr.c @@ -2376,6 +2376,15 @@ transformMinMaxExpr(ParseState *pstate, MinMaxExpr *m) static Node * transformSQLValueFunction(ParseState *pstate, SQLValueFunction *svf) { + /* + * All SQL value functions are stable so we reject them in check + * constraint expressions. + */ + if (pstate->p_expr_kind == EXPR_KIND_CHECK_CONSTRAINT) + ereport(ERROR, + (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), + errmsg("mutable functions are not allowed in check constraints"))); + /* * All we need to do is insert the correct result type and (where needed) * validate the typmod, so we just modify the node in-place. diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c index 2a44b434a5..6ea2f0326d 100644 --- a/src/backend/parser/parse_func.c +++ b/src/backend/parser/parse_func.c @@ -271,6 +271,13 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, &nvargs, &vatype, &declared_arg_types, &argdefaults); + /* mutable functions are not allowed in constraint expressions */ + if (pstate->p_expr_kind == EXPR_KIND_CHECK_CONSTRAINT && + func_volatile(funcid) != PROVOLATILE_IMMUTABLE) + ereport(ERROR, + (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), + errmsg("mutable functions are not allowed in constraint expression"))); + cancel_parser_errposition_callback(&pcbstate); /*
Sorry for jumping into this thread pretty late. I have read the messages on this thread a couple of times and... On Fri, Jul 12, 2019 at 3:05 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > I'm on Peter's side. Unlike other similar hooks, this hook is > provided just to introduce arbitrary inconsistency in partition > pruning. ...I have questions about the patch as proposed, such as what Peter and Horiguchi-san might. Why does this hook need to be specific to partitioning, that is, why is this a "partition pruning" hook? IOW, why does the functionality of this hook apply only to partitions as opposed to *any* table? I may be missing something, but the two things -- a table's partition constraint and security properties (or any properties that the proposed hook's suppliers will provide) -- seem unrelated, so making this specific to partitioning seems odd. If this was proposed as applying to any table, then it might be easier to choose the point from which hook will be called and there will be fewer such points to consider. Needless to say, since the planner sees partitions as tables, the partitioning case will be covered too. Looking at the patch, it seems that the hook will be called *after* opening the partition (provided it survived partition pruning); I'm looking at this hunk: @@ -1038,6 +1038,13 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel, continue; } + if (!InvokePartitionPruneHook(childRTE->relid)) + { + /* Implement custom partition pruning filter*/ + set_dummy_rel_pathlist(childrel); + continue; + } set_append_rel_size() is called *after* partitions are opened and their RelOptInfos formed. So it's not following the design you intended whereby the hook will avoid uselessly opening partition files. Also, I suspect you don't want to call the hook from here: @@ -92,6 +93,10 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode) while ((inheritsTuple = systable_getnext(scan)) != NULL) { inhrelid = ((Form_pg_inherits) GETSTRUCT(inheritsTuple))->inhrelid; + + if (!InvokePartitionPruneHook(inhrelid)) + continue; + ...as you might see unintended consequences, because find_inheritance_children() is called from many places that wouldn't want arbitrary extension code to drop some children from the list. For example, it is called when adding a column to a parent table to find the children that will need to get the same column added. You wouldn't want some children getting the column added while others not. Thanks, Amit
On Fri, Jul 12, 2019 at 4:25 AM Amit Langote <amitlangote09@gmail.com> wrote: > > Sorry for jumping into this thread pretty late. I have read the > messages on this thread a couple of times and... > > On Fri, Jul 12, 2019 at 3:05 PM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > > I'm on Peter's side. Unlike other similar hooks, this hook is > > provided just to introduce arbitrary inconsistency in partition > > pruning. > > ...I have questions about the patch as proposed, such as what Peter > and Horiguchi-san might. Thank you for taking a look at this. I need a bit of time to research and affirm my previous assumptions, but will address all of these points as soon as I can. -- Mike Palmiotto Software Engineer Crunchy Data Solutions https://crunchydata.com