Thread: MAINTAIN privilege -- what do we need to un-revert it?
The MAINTAIN privilege was reverted during the 16 cycle because of the potential for someone to play tricks with search_path. For instance, if user foo does: CREATE FUNCTION mod7(INT) RETURNS INT IMMUTABLE LANGUAGE plpgsql AS $$ BEGIN RETURN mod($1, 7); END; $$; CREATE TABLE x(i INT); CREATE UNIQUE INDEX x_mod7_idx ON x (mod7(i)); GRANT MAINTAIN ON x TO bar; Then user bar can create their own function named "bar.mod(int, int)", and "SET search_path = bar, pg_catalog", and then issue a "REINDEX x" and cause problems. There are several factors required for that to be a problem: 1. foo hasn't used a "SET search_path" clause on their function 2. bar must have the privileges to create a function somewhere 3. bar must have privileges on table x There's an argument that we should blame factor #1. Robert stated[1] that users should use SET search_path clauses on their functions, even SECURITY INVOKER functions. And I've added a search_path cache which improves the performance enough to make that more reasonable to do generally. There's also an argument that #2 is to blame. Given the realities of our system, best practice is that users shouldn't have the privileges to create objects, even in their own schema, unless required. (Joe made this suggestion in an offline discussion.) There's also an arugment that #3 is not specific to the MAINTAIN privilege. Clearly similar risks exist for other privileges, like TRIGGER. And even the INSERT privilege, in the above example, would allow bar to violate the unique constraint and corrupt the index[2]. If those arguments are still unconvincing, then the next idea is to fix the search_path for all maintenance commands[3]. I tried this during the 16 cycle, but due to timing issues it was also reverted. I can proceed with this approach again, but I'd like a clear endorsement, in case there were other reasons to doubt the approach. Regards, Jeff Davis [1] https://www.postgresql.org/message-id/CA+TgmoYEP40iBW-A9nPfDp8AhGoekPp3aPDFzTgBUrqmfCwZzQ@mail.gmail.com [2] https://www.postgresql.org/message-id/fff566293c9165c69bb4c555da1ac02c63660664.camel@j-davis.com [3] https://www.postgresql.org/message-id/e44327179e5c9015c8dda67351c04da552066017.camel@j-davis.com
On Wed, Feb 14, 2024 at 10:20:28AM -0800, Jeff Davis wrote: > If those arguments are still unconvincing, then the next idea is to fix > the search_path for all maintenance commands[3]. I tried this during > the 16 cycle, but due to timing issues it was also reverted. I can > proceed with this approach again, but I'd like a clear endorsement, in > case there were other reasons to doubt the approach. This seemed like the approach folks were most in favor of at the developer meeting a couple weeks ago [0]. At least, that was my interpretation of the discussion. BTW I have been testing reverting commit 151c22d (i.e., un-reverting MAINTAIN) every month or two, and last I checked, it still applies pretty cleanly. The only changes I've needed to make are to the catversion and to a hard-coded version in a test (16 -> 17). [0] https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2024_Developer_Meeting#The_Path_to_un-reverting_the_MAINTAIN_privilege -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Feb 14, 2024 at 01:02:26PM -0600, Nathan Bossart wrote: > BTW I have been testing reverting commit 151c22d (i.e., un-reverting > MAINTAIN) every month or two, and last I checked, it still applies pretty > cleanly. The only changes I've needed to make are to the catversion and to > a hard-coded version in a test (16 -> 17). Posting to get some cfbot coverage. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Wed, 2024-02-14 at 13:02 -0600, Nathan Bossart wrote: > This seemed like the approach folks were most in favor of at the > developer > meeting a couple weeks ago [0]. At least, that was my interpretation > of > the discussion. Attached rebased version. Note the changes in amcheck. It's creating functions and calling those functions from the comparators, and so the comparators need to set the search_path. I don't think that's terribly common, but does represent a behavior change and could break something. Regards, Jef Davis
Attachment
(Apologies in advance for anything I'm bringing up that we've already covered somewhere else.) On Fri, Feb 16, 2024 at 04:03:55PM -0800, Jeff Davis wrote: > Note the changes in amcheck. It's creating functions and calling those > functions from the comparators, and so the comparators need to set the > search_path. I don't think that's terribly common, but does represent a > behavior change and could break something. Why is this change needed? Is the idea to make amcheck follow the same rules as maintenance commands to encourage folks to set up index functions correctly? Or is amcheck similarly affected by search_path tricks? > void > InitializeSearchPath(void) > { > + /* Make the context we'll keep search path cache hashtable in */ > + SearchPathCacheContext = AllocSetContextCreate(TopMemoryContext, > + "search_path processing cache", > + ALLOCSET_DEFAULT_SIZES); > + > if (IsBootstrapProcessingMode()) > { > /* > @@ -4739,11 +4744,6 @@ InitializeSearchPath(void) > } > else > { > - /* Make the context we'll keep search path cache hashtable in */ > - SearchPathCacheContext = AllocSetContextCreate(TopMemoryContext, > - "search_path processing cache", > - ALLOCSET_DEFAULT_SIZES); > - What is the purpose of this change? > + SetConfigOption("search_path", GUC_SAFE_SEARCH_PATH, PGC_USERSET, > + PGC_S_SESSION); I wonder if it's worth using PGC_S_INTERACTIVE or introducing a new value for these. > +/* > + * Safe search path when executing code as the table owner, such as during > + * maintenance operations. > + */ > +#define GUC_SAFE_SEARCH_PATH "pg_catalog, pg_temp" Is including pg_temp actually safe? I worry that a user could use their temporary schema to inject objects that would take the place of non-schema-qualified stuff in functions. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
New version attached. Do we need a documentation update here? If so, where would be a good place? On Fri, 2024-02-23 at 15:30 -0600, Nathan Bossart wrote: > Why is this change needed? Is the idea to make amcheck follow the > same > rules as maintenance commands to encourage folks to set up index > functions > correctly? amcheck is calling functions it defined, so in order to find those functions it needs to set the right search path. > > What is the purpose of this [bootstrap-related] change? DefineIndex() is called during bootstrap, and it's also a maintenance command. I tried to handle the bootstrapping case, but I think it's best to just guard it with a conditional. Done. I also added Assert(!IsBootstrapProcessingMode()) in assign_search_path(). > > + SetConfigOption("search_path", GUC_SAFE_SEARCH_PATH, > > PGC_USERSET, > > + PGC_S_SESSION); > > I wonder if it's worth using PGC_S_INTERACTIVE or introducing a new > value > for these. > > Did you have a particular concern about PGC_S_SESSION? If it's less than PGC_S_SESSION, it won't work, because the caller's SET command will override it, and the same manipulation is possible. And I don't think we want it higher than PGC_S_SESSION, otherwise the function can't set its own search_path, if needed. > > +#define GUC_SAFE_SEARCH_PATH "pg_catalog, pg_temp" > > Is including pg_temp actually safe? I worry that a user could use > their > temporary schema to inject objects that would take the place of > non-schema-qualified stuff in functions. pg_temp cannot (currently) be excluded. If it is omitted from the string, it will be placed *first* in the search_path, which is more dangerous. pg_temp does not take part in function or operator resolution, which makes it safer than it first appears. There are potentially some risks around tables, but it's not typical to access a table in a function called as part of an index expression. If we determine that pg_temp is actually unsafe to include, we need to do something like what I proposed here: https://www.postgresql.org/message-id/a6865db287596c9c6ea12bdd9de87216cb5e7902.camel@j-davis.com before this change. Regards, Jeff Davis
Attachment
On Tue, Feb 27, 2024 at 04:22:34PM -0800, Jeff Davis wrote: > Do we need a documentation update here? If so, where would be a good > place? I'm afraid I don't have a better idea than adding a short note in each affected commands's page. > On Fri, 2024-02-23 at 15:30 -0600, Nathan Bossart wrote: >> I wonder if it's worth using PGC_S_INTERACTIVE or introducing a new >> value >> for these. > > Did you have a particular concern about PGC_S_SESSION? My only concern is that it could obscure the source of the search_path change, which in turn might cause confusion when things fail. > If it's less than PGC_S_SESSION, it won't work, because the caller's > SET command will override it, and the same manipulation is possible. > > And I don't think we want it higher than PGC_S_SESSION, otherwise the > function can't set its own search_path, if needed. Yeah, we would have to make it equivalent in priority to PGC_S_SESSION, which would likely require a bunch of special logic. I don't know if this is worth it, and this seems like something that could pretty easily be added in the future if it became necessary. >> > +#define GUC_SAFE_SEARCH_PATH "pg_catalog, pg_temp" >> >> Is including pg_temp actually safe? I worry that a user could use >> their >> temporary schema to inject objects that would take the place of >> non-schema-qualified stuff in functions. > > pg_temp cannot (currently) be excluded. If it is omitted from the > string, it will be placed *first* in the search_path, which is more > dangerous. > > pg_temp does not take part in function or operator resolution, which > makes it safer than it first appears. There are potentially some risks > around tables, but it's not typical to access a table in a function > called as part of an index expression. > > If we determine that pg_temp is actually unsafe to include, we need to > do something like what I proposed here: > > https://www.postgresql.org/message-id/a6865db287596c9c6ea12bdd9de87216cb5e7902.camel@j-davis.com I don't doubt anything you've said, but I can't help but think that we might as well handle the pg_temp risk, too. Furthermore, I see that we use "" as a safe search_path for autovacuum and fe_utils/connect.h. Is there any way to unite these? IIUC it might be possible to combine the autovacuum and maintenance command cases (i.e., "!pg_temp"), but we might need to keep pg_temp for the frontend case. I think it's worth trying to add comments about why this setting is safe for some cases but not others, too. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, 2024-02-28 at 10:55 -0600, Nathan Bossart wrote: > I'm afraid I don't have a better idea than adding a short note in > each > affected commands's page. OK, that works for now. Later we should also document that the functions are run as the table owner. > > On Fri, 2024-02-23 at 15:30 -0600, Nathan Bossart wrote: > > > I wonder if it's worth using PGC_S_INTERACTIVE or introducing a > > > new > > > value > > > for these. > > > > Did you have a particular concern about PGC_S_SESSION? > > My only concern is that it could obscure the source of the > search_path > change, which in turn might cause confusion when things fail. That's a good point. AutoVacWorkerMain uses PGC_S_OVERRIDE, but it doesn't have to worry about SET, because there's no real session. The function SET clause uses PGC_S_SESSION. It's arguable whether that's really the same source as a SET command, but it's definitely closer. > > Yeah, we would have to make it equivalent in priority to > PGC_S_SESSION, > which would likely require a bunch of special logic. I'm not clear on what problem that would solve. > I don't doubt anything you've said, but I can't help but think that > we > might as well handle the pg_temp risk, too. That sounds good to me, but I didn't get many replies in that last thread. And although it solves the problem, it is a bit awkward. Can we get some closure on whether that !pg_temp patch is the right approach? That was just my first idea, and it would be good to hear what others think. > Furthermore, I see that we use "" as a safe search_path for > autovacuum and > fe_utils/connect.h. Is there any way to unite these? We could have a single function like RestrictSearchPath(), which I believe I had in some previous iteration. That would use the safest search path (either excluding pg_temp or putting it at the end) and PGC_S_SESSION, and then use it everywhere. Regards, Jeff Davis
On Wed, 2024-02-28 at 09:29 -0800, Jeff Davis wrote: > On Wed, 2024-02-28 at 10:55 -0600, Nathan Bossart wrote: > > I'm afraid I don't have a better idea than adding a short note in > > each > > affected commands's page. > > OK, that works for now. Committed. The only changes are documentation and test updates. This is a behavior change, so it still carries some risk, though we've had a lot of discussion and generally it seems to be worth it. If it turns out worse than expected during beta, of course we can re-revert it. I will restate the risks here, which come basically from two places: (1) Functions called from index expressions which rely on search_path (and don't have a SET clause). Such a function would have already been fairly broken before my commit, because anyone accessing the table without the right search_path would have seen an error or wrong results. And there is no means to set the "right" search_path for autoanalyze or logical replication, so those would not have worked with such a broken function before my commit, no matter what. That being said, surely some users did have such broken functions, and with this commit, they will have to remedy them with a SET clause. Fortunately, the performance impact of doing so has been greatly reduced. (2) Matierialized views which call functions that rely on search_path (and don't have a SET clause). This is arguably a worse kind of breakage because materialized views are often refreshed only by the table owner, and it's easier to control search_path when running REFRESH. Additionally, functions called from materialized views are more likely to be "interesting" than functions called from an index expression. However, the remedy is straightforward: use a SET clause. Regards, Jeff Davis
On Mon, Mar 04, 2024 at 07:52:05PM -0800, Jeff Davis wrote: > Committed. Commit 2af07e2 wrote: > --- a/src/backend/access/brin/brin.c > +++ b/src/backend/access/brin/brin.c > @@ -1412,6 +1412,8 @@ brin_summarize_range(PG_FUNCTION_ARGS) > SetUserIdAndSecContext(heapRel->rd_rel->relowner, > save_sec_context | SECURITY_RESTRICTED_OPERATION); > save_nestlevel = NewGUCNestLevel(); > + SetConfigOption("search_path", GUC_SAFE_SEARCH_PATH, PGC_USERSET, > + PGC_S_SESSION); I've audited NewGUCNestLevel() calls that didn't get this addition. Among those, these need the addition: - Each in ComputeIndexAttrs() -- they arise when the caller is DefineIndex() - In DefineIndex(), after comment "changed a behavior-affecting GUC" While "not necessary for security", ExecCreateTableAs() should do it for the same reason it calls NewGUCNestLevel().
On Sun, Jun 30, 2024 at 03:23:44PM -0700, Noah Misch wrote: > I've audited NewGUCNestLevel() calls that didn't get this addition. Among > those, these need the addition: > > - Each in ComputeIndexAttrs() -- they arise when the caller is DefineIndex() > - In DefineIndex(), after comment "changed a behavior-affecting GUC" Hmm. Is RestrictSearchPath() something that we should advertise more strongly, thinking here about extensions that call NewGUCNestLevel()? That would be really easy to miss, and it could have bad consequences. I know that this is not something that's published in the release notes, but it looks like something sensible to have, though. > While "not necessary for security", ExecCreateTableAs() should do it for the > same reason it calls NewGUCNestLevel(). +1. -- Michael
Attachment
On Tue, 2024-07-09 at 15:20 +0900, Michael Paquier wrote: > On Sun, Jun 30, 2024 at 03:23:44PM -0700, Noah Misch wrote: > > I've audited NewGUCNestLevel() calls that didn't get this > > addition. Among > > those, these need the addition: > > > > - Each in ComputeIndexAttrs() -- they arise when the caller is > > DefineIndex() > > - In DefineIndex(), after comment "changed a behavior-affecting > > GUC" Thank you for the report. Patch attached to address these missing call sites. > Hmm. Is RestrictSearchPath() something that we should advertise more > strongly, thinking here about extensions that call NewGUCNestLevel()? > That would be really easy to miss, and it could have bad > consequences. > I know that this is not something that's published in the release > notes, but it looks like something sensible to have, though. The pattern also involves SetUserIdAndSecContext(). Perhaps we could come up with a wrapper function to better encapsulate the general pattern? > > While "not necessary for security", ExecCreateTableAs() should do > > it for the > > same reason it calls NewGUCNestLevel(). > > +1. Do you have a suggestion about how that should be done? It's not trivial, because the both creates the table and populates it in ExecutorRun. For table creation, we need to use the original search_path, but we need to use the restricted search_path when populating it. I could try to refactor it into two statements and execute them separately, or I could try to rewrite the statement to use a fully- qualified destination table before execution. Thoughts? Regards, Jeff Davis
Attachment
On Tue, Jul 09, 2024 at 05:47:36PM -0700, Jeff Davis wrote: > On Tue, 2024-07-09 at 15:20 +0900, Michael Paquier wrote: > > On Sun, Jun 30, 2024 at 03:23:44PM -0700, Noah Misch wrote: > > Hmm. Is RestrictSearchPath() something that we should advertise more > > strongly, thinking here about extensions that call NewGUCNestLevel()? > > That would be really easy to miss, and it could have bad > > consequences. > > I know that this is not something that's published in the release > > notes, but it looks like something sensible to have, though. > > The pattern also involves SetUserIdAndSecContext(). Perhaps we could > come up with a wrapper function to better encapsulate the general > pattern? Worth a look. usercontext.c has an existing wrapper for a superuser process switching to an untrusted user. It could become the home for another wrapper targeting MAINTAIN-relevant callers. > > > While "not necessary for security", ExecCreateTableAs() should do > > > it for the > > > same reason it calls NewGUCNestLevel(). > > > > +1. > > Do you have a suggestion about how that should be done? > > It's not trivial, because the both creates the table and populates it > in ExecutorRun. For table creation, we need to use the original > search_path, but we need to use the restricted search_path when > populating it. > > I could try to refactor it into two statements and execute them > separately, or I could try to rewrite the statement to use a fully- > qualified destination table before execution. Thoughts? Those sound fine. Also fine: just adding a comment on why creation namespace considerations led to not doing it there.
On Thu, 2024-07-11 at 05:52 -0700, Noah Misch wrote: > > I could try to refactor it into two statements and execute them > > separately, or I could try to rewrite the statement to use a fully- > > qualified destination table before execution. Thoughts? > > Those sound fine. Also fine: just adding a comment on why creation > namespace > considerations led to not doing it there. Attached. 0002 separates the CREATE MATERIALIZED VIEW ... WITH DATA into (effectively): CREATE MATERIALIZED VIEW ... WITH NO DATA; REFRESH MATERIALIZED VIEW ...; Using refresh also achieves the stated goal more directly: to (mostly) ensure that a subsequent REFRESH will succeed. Note: the creation itself no longer executes in a security-restricted context, but I don't think that's a problem. The only reason it's using the security restricted context is so the following REFRESH will succeed, right? Regards, Jeff Davis
Attachment
On Fri, Jul 12, 2024 at 02:50:52PM -0700, Jeff Davis wrote: > On Thu, 2024-07-11 at 05:52 -0700, Noah Misch wrote: > > > I could try to refactor it into two statements and execute them > > > separately, or I could try to rewrite the statement to use a fully- > > > qualified destination table before execution. Thoughts? > > > > Those sound fine. Also fine: just adding a comment on why creation > > namespace > > considerations led to not doing it there. > > Attached. 0002 separates the CREATE MATERIALIZED VIEW ... WITH DATA > into (effectively): > > CREATE MATERIALIZED VIEW ... WITH NO DATA; > REFRESH MATERIALIZED VIEW ...; > > Using refresh also achieves the stated goal more directly: to (mostly) > ensure that a subsequent REFRESH will succeed. > > Note: the creation itself no longer executes in a security-restricted > context, but I don't think that's a problem. The only reason it's using > the security restricted context is so the following REFRESH will > succeed, right? Right, that's the only reason. > @@ -346,13 +339,21 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt, > PopActiveSnapshot(); > } > > - if (is_matview) > + /* > + * For materialized views, use REFRESH, which locks down > + * security-restricted operations and restricts the search_path. > + * Otherwise, one could create a materialized view not possible to > + * refresh. > + */ > + if (do_refresh) > { > - /* Roll back any GUC changes */ > - AtEOXact_GUC(false, save_nestlevel); > + RefreshMatViewStmt *refresh = makeNode(RefreshMatViewStmt); > > - /* Restore userid and security context */ > - SetUserIdAndSecContext(save_userid, save_sec_context); > + refresh->relation = into->rel; > + ExecRefreshMatView(refresh, pstate->p_sourcetext, NULL, qc); > + > + if (qc) > + qc->commandTag = CMDTAG_SELECT; > } Since refresh->relation is a RangeVar, this departs from the standard against repeated name lookups, from CVE-2014-0062 (commit 5f17304).
On Fri, 2024-07-12 at 16:11 -0700, Noah Misch wrote: > Since refresh->relation is a RangeVar, this departs from the standard > against > repeated name lookups, from CVE-2014-0062 (commit 5f17304). Interesting, thank you. I did a rough refactor and attached v3. Aside from cleanup issues, is this what you had in mind? Regards, Jeff Davis
Attachment
On Fri, Jul 12, 2024 at 04:50:17PM -0700, Jeff Davis wrote: > On Fri, 2024-07-12 at 16:11 -0700, Noah Misch wrote: > > Since refresh->relation is a RangeVar, this departs from the standard > > against > > repeated name lookups, from CVE-2014-0062 (commit 5f17304). > > Interesting, thank you. > > I did a rough refactor and attached v3. Aside from cleanup issues, is > this what you had in mind? > +extern ObjectAddress RefreshMatViewByOid(Oid matviewOid, bool skipData, bool concurrent, > + const char *queryString, ParamListInfo params, > + QueryCompletion *qc); > Yes, that's an API design that avoids repeated name lookups.
Hi, On Sat, 13 Jul 2024 14:47:48 -0700 Noah Misch <noah@leadboat.com> wrote: > On Fri, Jul 12, 2024 at 04:50:17PM -0700, Jeff Davis wrote: > > On Fri, 2024-07-12 at 16:11 -0700, Noah Misch wrote: > > > Since refresh->relation is a RangeVar, this departs from the standard > > > against > > > repeated name lookups, from CVE-2014-0062 (commit 5f17304). > > > > Interesting, thank you. > > > > I did a rough refactor and attached v3. Aside from cleanup issues, is > > this what you had in mind? > > > +extern ObjectAddress RefreshMatViewByOid(Oid matviewOid, bool skipData, bool concurrent, > > + const char *queryString, ParamListInfo params, > > + QueryCompletion *qc); > > > > Yes, that's an API design that avoids repeated name lookups. Since this commit, matviews are no longer handled in ExecCreateTableAs, so the following error message has not to consider materialized view cases, and can be made simple. /* SELECT should never rewrite to more or less than one SELECT query */ if (list_length(rewritten) != 1) elog(ERROR, "unexpected rewrite result for %s", is_matview ? "CREATE MATERIALIZED VIEW" : "CREATE TABLE AS SELECT"); RefreshMatViewByOid has REFRESH specific error messages in spite of its use in CREATE MATERIALIZED VIEW, but these errors seem not to occur in CREATE MATERIALIZED VIEW case, so I don't think it would be a problem. Another my question is why RefreshMatViewByOid has a ParamListInfo parameter. I don't understand why ExecRefreshMatView has one, either, because currently materialized views may not be defined using bound parameters, which is checked in transformCreateTableAsStmt, and the param argument is not used at all. It might be unsafe to change the interface of ExecRefreshMatView since this is public for a long time, but I don't think the new interface RefreshMatViewByOid has to have this unused argument. I attaehd patches for fixing them respectedly. What do you think about it? Regards, Yugo Nagata -- Yugo Nagata <nagata@sraoss.co.jp>
Attachment
Hello, Thank you for looking. On Fri, 2024-07-26 at 12:26 +0900, Yugo Nagata wrote: > Since this commit, matviews are no longer handled in > ExecCreateTableAs, so the > following error message has not to consider materialized view cases, > and can be made simple. > > /* SELECT should never rewrite to more or less than one > SELECT query */ > if (list_length(rewritten) != 1) > elog(ERROR, "unexpected rewrite result for %s", > is_matview ? "CREATE MATERIALIZED VIEW" : > "CREATE TABLE AS SELECT"); There's a similar error in refresh_matview_datafill(), and I suppose that should account for the CREATE MATERIALIZED VIEW case. We could pass an additional flag to RefreshMatViewByOid to indicate whether it's a CREATE or REFRESH, but it's an internal error, so perhaps it's not important. > Another my question is why RefreshMatViewByOid has a ParamListInfo > parameter. I just passed the params through, but you're right, they aren't referenced at all. I looked at the history, and it appears to go all the way back to the function's introduction in commit 3bf3ab8c56. > I don't understand why ExecRefreshMatView has one, either, because > currently > materialized views may not be defined using bound parameters, which > is checked > in transformCreateTableAsStmt, and the param argument is not used at > all. It might > be unsafe to change the interface of ExecRefreshMatView since this is > public for a > long time, but I don't think the new interface RefreshMatViewByOid > has to have this > unused argument. Extensions should be prepared for reasonable changes in these kinds of functions between releases. Even if the signatures remain the same, the parse structures may change, which creates similar incompatibilities. So let's just get rid of the 'params' argument from both functions. Regards, Jeff Davis
Hi, On Fri, 26 Jul 2024 16:47:23 -0700 Jeff Davis <pgsql@j-davis.com> wrote: > Hello, > > Thank you for looking. > > On Fri, 2024-07-26 at 12:26 +0900, Yugo Nagata wrote: > > Since this commit, matviews are no longer handled in > > ExecCreateTableAs, so the > > following error message has not to consider materialized view cases, > > and can be made simple. > > > > /* SELECT should never rewrite to more or less than one > > SELECT query */ > > if (list_length(rewritten) != 1) > > elog(ERROR, "unexpected rewrite result for %s", > > is_matview ? "CREATE MATERIALIZED VIEW" : > > "CREATE TABLE AS SELECT"); > > There's a similar error in refresh_matview_datafill(), and I suppose > that should account for the CREATE MATERIALIZED VIEW case. We could > pass an additional flag to RefreshMatViewByOid to indicate whether it's > a CREATE or REFRESH, but it's an internal error, so perhaps it's not > important. Thank you for looking into the pach. I agree that it might not be important, but I think adding the flag would be also helpful for improving code-readability because it clarify the function is used in the two cases. I attached patch for this fix (patch 0003). > > Another my question is why RefreshMatViewByOid has a ParamListInfo > > parameter. > > I just passed the params through, but you're right, they aren't > referenced at all. > > I looked at the history, and it appears to go all the way back to the > function's introduction in commit 3bf3ab8c56. > > > I don't understand why ExecRefreshMatView has one, either, because > > currently > > materialized views may not be defined using bound parameters, which > > is checked > > in transformCreateTableAsStmt, and the param argument is not used at > > all. It might > > be unsafe to change the interface of ExecRefreshMatView since this is > > public for a > > long time, but I don't think the new interface RefreshMatViewByOid > > has to have this > > unused argument. > > Extensions should be prepared for reasonable changes in these kinds of > functions between releases. Even if the signatures remain the same, the > parse structures may change, which creates similar incompatibilities. > So let's just get rid of the 'params' argument from both functions. Sure. I fixed the patch to remove 'param' from both functions. (patch 0002) I also add the small refactoring around ExecCreateTableAs(). (patch 0001) - Remove matview-related codes from intorel_startup. Materialized views are no longer handled in this function. - RefreshMatViewByOid is moved to just after create_ctas_nodata call to improve code readability. Regards, Yugo Nagata -- Yugo NAGATA <nagata@sraoss.co.jp>
Attachment
On Wed, 2024-07-31 at 18:20 +0900, Yugo NAGATA wrote: > I agree that it might not be important, but I think adding the flag > would be > also helpful for improving code-readability because it clarify the > function > is used in the two cases. I attached patch for this fix (patch 0003). Committed with one minor modification: I moved the boolean flag to be near the other booleans rather than at the end. Thank you. > Sure. I fixed the patch to remove 'param' from both functions. (patch > 0002) Committed, thank you. > I also add the small refactoring around ExecCreateTableAs(). (patch > 0001) > > - Remove matview-related codes from intorel_startup. > Materialized views are no longer handled in this function. > > - RefreshMatViewByOid is moved to just after create_ctas_nodata > call to improve code readability. > I'm not sure the changes in intorel_startup() are correct. I tried adding an Assert(into->viewQuery == NULL), and it fails because there's another path I did not consider: "EXPLAIN ANALYZE CREATE MATERIALIZED VIEW ...", which does not go through ExecCreateTableAs() but does go through CreateIntoRelDestReceiver(). See: https://postgr.es/m/20444c382e6cb5e21e93c94d679d0198b0dba4dd.camel@j-davis.com Should we refactor a bit and try to make EXPLAIN use the same code paths? Regards, Jeff Davis
On Thu, 01 Aug 2024 11:31:53 -0700 Jeff Davis <pgsql@j-davis.com> wrote: > On Wed, 2024-07-31 at 18:20 +0900, Yugo NAGATA wrote: > > I agree that it might not be important, but I think adding the flag > > would be > > also helpful for improving code-readability because it clarify the > > function > > is used in the two cases. I attached patch for this fix (patch 0003). > > Committed with one minor modification: I moved the boolean flag to be > near the other booleans rather than at the end. Thank you. > > > Sure. I fixed the patch to remove 'param' from both functions. (patch > > 0002) > > Committed, thank you. Thank you for committing them. Should not they be backported to REL_17_STABLE? > > > I also add the small refactoring around ExecCreateTableAs(). (patch > > 0001) > > > > - Remove matview-related codes from intorel_startup. > > Materialized views are no longer handled in this function. > > > > - RefreshMatViewByOid is moved to just after create_ctas_nodata > > call to improve code readability. > > > > I'm not sure the changes in intorel_startup() are correct. I tried > adding an Assert(into->viewQuery == NULL), and it fails because there's > another path I did not consider: "EXPLAIN ANALYZE CREATE MATERIALIZED > VIEW ...", which does not go through ExecCreateTableAs() but does go > through CreateIntoRelDestReceiver(). > > See: > > https://postgr.es/m/20444c382e6cb5e21e93c94d679d0198b0dba4dd.camel@j-davis.com > > Should we refactor a bit and try to make EXPLAIN use the same code > paths? I overlooked that CreateIntoRelDestReceiver() is used from EXPLAIN. I saw the thread above and I agree that we should refactor it to make EXPLAIN consistent CREATE MATERIALIZED VIEW, but I suppose this should be discussed the other thread. I attached a updated patch removed the intorel_startup() part from. Regards, Yugo Nagata -- Yugo NAGATA <nagata@sraoss.co.jp>
Attachment
On Fri, 2 Aug 2024 16:13:01 +0900 Yugo NAGATA <nagata@sraoss.co.jp> wrote: > On Thu, 01 Aug 2024 11:31:53 -0700 > Jeff Davis <pgsql@j-davis.com> wrote: > > > On Wed, 2024-07-31 at 18:20 +0900, Yugo NAGATA wrote: > > > I agree that it might not be important, but I think adding the flag > > > would be > > > also helpful for improving code-readability because it clarify the > > > function > > > is used in the two cases. I attached patch for this fix (patch 0003). > > > > Committed with one minor modification: I moved the boolean flag to be > > near the other booleans rather than at the end. Thank you. > > > > > Sure. I fixed the patch to remove 'param' from both functions. (patch > > > 0002) > > > > Committed, thank you. > > Thank you for committing them. > Should not they be backported to REL_17_STABLE? > > > > > > I also add the small refactoring around ExecCreateTableAs(). (patch > > > 0001) > > > > > > - Remove matview-related codes from intorel_startup. > > > Materialized views are no longer handled in this function. > > > > > > - RefreshMatViewByOid is moved to just after create_ctas_nodata > > > call to improve code readability. > > > > > > > I'm not sure the changes in intorel_startup() are correct. I tried > > adding an Assert(into->viewQuery == NULL), and it fails because there's > > another path I did not consider: "EXPLAIN ANALYZE CREATE MATERIALIZED > > VIEW ...", which does not go through ExecCreateTableAs() but does go > > through CreateIntoRelDestReceiver(). > > > > See: > > > > https://postgr.es/m/20444c382e6cb5e21e93c94d679d0198b0dba4dd.camel@j-davis.com > > > > Should we refactor a bit and try to make EXPLAIN use the same code > > paths? > > I overlooked that CreateIntoRelDestReceiver() is used from EXPLAIN. I saw the > thread above and I agree that we should refactor it to make EXPLAIN consistent > CREATE MATERIALIZED VIEW, but I suppose this should be discussed the other thread. > > I attached a updated patch removed the intorel_startup() part from. I confirmed that this has been committed to the master branch. Thank you! I also noticed that the documentation of CREATE MATERIALIZED VIEW doesn't mention search_path while it also changes search_path since it uses the REFRESH logic. I attached a trivial patch to fix this. Regards, Yugo Nagata -- Yugo Nagata <nagata@sraoss.co.jp>
Attachment
On Mon, Aug 05, 2024 at 04:05:02PM +0900, Yugo Nagata wrote: > + <para> > + While <command>CREATE MATERIALIZED VIEW</command> is running, the <xref > + linkend="guc-search-path"/> is temporarily changed to <literal>pg_catalog, > + pg_temp</literal>. > + </para> I think we should mention that this is not true when WITH NO DATA is used. Maybe something like: Unless WITH NO DATA is used, the search_path is temporarily changed to pg_catalog, pg_temp while CREATE MATERIALIZED VIEW is running. -- nathan
On Fri, Sep 27, 2024 at 12:42:34PM +0900, Yugo NAGATA wrote: > I agree with you. I overlooked WITH NO DATA. > I attached a updated patch. Thanks. Unless someone objects, I plan to commit this shortly. -- nathan
On Fri, 2024-09-27 at 10:34 -0500, Nathan Bossart wrote: > On Fri, Sep 27, 2024 at 12:42:34PM +0900, Yugo NAGATA wrote: > > I agree with you. I overlooked WITH NO DATA. > > I attached a updated patch. > > Thanks. Unless someone objects, I plan to commit this shortly. The command is run effectively in two parts: the CREATE part and the REFRESH part. The former just uses the session search path, while the latter uses the safe search path. I suggest that we add the wording to the <replaceable>query</replaceable> portion of the doc, near "security- restricted operation". Regards, Jeff Davis
On Fri, Sep 27, 2024 at 09:22:48AM -0700, Jeff Davis wrote: > I suggest that we add the wording to the > <replaceable>query</replaceable> portion of the doc, near "security- > restricted operation". How does this look? diff --git a/doc/src/sgml/ref/create_materialized_view.sgml b/doc/src/sgml/ref/create_materialized_view.sgml index 0d2fea2b97..62d897931c 100644 --- a/doc/src/sgml/ref/create_materialized_view.sgml +++ b/doc/src/sgml/ref/create_materialized_view.sgml @@ -143,7 +143,9 @@ CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] <replaceable>table_name</replaceable> A <link linkend="sql-select"><command>SELECT</command></link>, <link linkend="sql-table"><command>TABLE</command></link>, or <link linkend="sql-values"><command>VALUES</command></link> command. This query will run within a security-restricted operation; in particular, calls to functions that - themselves create temporary tables will fail. + themselves create temporary tables will fail. Also, while the query is + running, the <xref linkend="guc-search-path"/> is temporarily changed to + <literal>pg_catalog, pg_temp</literal>. </para> </listitem> </varlistentry> -- nathan
On Fri, 2024-09-27 at 15:04 -0500, Nathan Bossart wrote: > On Fri, Sep 27, 2024 at 09:22:48AM -0700, Jeff Davis wrote: > > I suggest that we add the wording to the > > <replaceable>query</replaceable> portion of the doc, near > > "security- > > restricted operation". > > How does this look? Looks good to me. Regards, Jeff Davis
On Fri, Sep 27, 2024 at 01:27:38PM -0700, Jeff Davis wrote: > Looks good to me. Thanks, committed. -- nathan