Thread: Alias collision in `refresh materialized view concurrently`

Alias collision in `refresh materialized view concurrently`

From
Mathis Rudolf
Date:
Hello,

we had a Customer-Report in which `refresh materialized view
CONCURRENTLY` failed with: `ERROR: column reference "mv" is ambiguous`

They're using `mv` as an alias for one column and this is causing a
collision with an internal alias. They also made it reproducible like this:
```
create materialized view testmv as select 'asdas' mv; --ok
create unique index on testmv (mv); --ok
refresh materialized view testmv; --ok
refresh materialized view CONCURRENTLY testmv; ---BAM!
```

```
ERROR: column reference "mv" is ambiguous
LINE 1: ...alog.=) mv.mv AND newdata OPERATOR(pg_catalog.*=) mv) WHERE ...
                                                               ^
QUERY:  CREATE TEMP TABLE pg_temp_4.pg_temp_218322_2 AS SELECT mv.ctid
AS tid, newdata FROM public.testmv mv FULL JOIN pg_temp_4.pg_temp_218322
newdata ON (newdata.mv OPERATOR(pg_catalog.=) mv.mv AND newdata
OPERATOR(pg_catalog.*=) mv) WHERE newdata IS NULL OR mv IS NULL ORDER BY tid
```

The corresponding Code is in `matview.c` in function
`refresh_by_match_merge`. With adding a prefix like `_pg_internal_` we
could make collisions pretty unlikely, without intrusive changes.

The appended patch does this change for the aliases `mv`, `newdata` and
`newdata2`.

Kind regards,
Mathis


Attachment

Re: Alias collision in `refresh materialized view concurrently`

From
Bharath Rupireddy
Date:
On Wed, May 19, 2021 at 5:33 PM Mathis Rudolf <mathis.rudolf@credativ.de> wrote:
>
> Hello,
>
> we had a Customer-Report in which `refresh materialized view
> CONCURRENTLY` failed with: `ERROR: column reference "mv" is ambiguous`
>
> They're using `mv` as an alias for one column and this is causing a
> collision with an internal alias. They also made it reproducible like this:
> ```
> create materialized view testmv as select 'asdas' mv; --ok
> create unique index on testmv (mv); --ok
> refresh materialized view testmv; --ok
> refresh materialized view CONCURRENTLY testmv; ---BAM!
> ```
>
> ```
> ERROR: column reference "mv" is ambiguous
> LINE 1: ...alog.=) mv.mv AND newdata OPERATOR(pg_catalog.*=) mv) WHERE ...
>                                                                ^
> QUERY:  CREATE TEMP TABLE pg_temp_4.pg_temp_218322_2 AS SELECT mv.ctid
> AS tid, newdata FROM public.testmv mv FULL JOIN pg_temp_4.pg_temp_218322
> newdata ON (newdata.mv OPERATOR(pg_catalog.=) mv.mv AND newdata
> OPERATOR(pg_catalog.*=) mv) WHERE newdata IS NULL OR mv IS NULL ORDER BY tid
> ```
>
> The corresponding Code is in `matview.c` in function
> `refresh_by_match_merge`. With adding a prefix like `_pg_internal_` we
> could make collisions pretty unlikely, without intrusive changes.
>
> The appended patch does this change for the aliases `mv`, `newdata` and
> `newdata2`.

I think it's better to have some random name, see below. We could
either use "OIDNewHeap" or "MyBackendId" to make those column names
unique and almost unguessable. So, something like "pg_temp1_XXXX",
"pg_temp2_XXXX" or "pg_temp3_XXXX" and so on would be better IMO.

    snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", OIDOldHeap);
    snprintf(namespaceName, sizeof(namespaceName), "pg_temp_%d", MyBackendId);

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Alias collision in `refresh materialized view concurrently`

From
Bernd Helmle
Date:
Am Mittwoch, dem 19.05.2021 um 18:06 +0530 schrieb Bharath Rupireddy:
> > The corresponding Code is in `matview.c` in function
> > `refresh_by_match_merge`. With adding a prefix like `_pg_internal_`
> > we
> > could make collisions pretty unlikely, without intrusive changes.
> > 
> > The appended patch does this change for the aliases `mv`, `newdata`
> > and
> > `newdata2`.
> 
> I think it's better to have some random name, see below. We could
> either use "OIDNewHeap" or "MyBackendId" to make those column names
> unique and almost unguessable. So, something like "pg_temp1_XXXX",
> "pg_temp2_XXXX" or "pg_temp3_XXXX" and so on would be better IMO.
> 
>     snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u",
> OIDOldHeap);
>     snprintf(namespaceName, sizeof(namespaceName), "pg_temp_%d",
> MyBackendId);

Hmm, it's an idea, but this can also lead to pretty random failures if
you have an unlucky user who had the same idea in its generating query
tool than the backend, no? Not sure if that's really better.

With the current implementation of REFRESH MATERIALIZED VIEW
CONCURRENTLY we always have the problem of possible collisions here,
you'd never get out of this area without analyzing the whole query for
such collisions. 

"mv" looks like a very common alias (i use it all over the time when
testing or playing around with materialized views, so i'm wondering why
i didn't see this issue already myself). So the risk here for such a
collision looks very high. We can try to lower this risk by choosing an
alias name, which is not so common. With a static alias however you get
a static error condition, not something that fails here and then.


-- 
Thanks,
    Bernd





Re: Alias collision in `refresh materialized view concurrently`

From
Bharath Rupireddy
Date:
On Thu, May 20, 2021 at 7:52 PM Bernd Helmle <mailings@oopsware.de> wrote:
>
> Am Mittwoch, dem 19.05.2021 um 18:06 +0530 schrieb Bharath Rupireddy:
> > > The corresponding Code is in `matview.c` in function
> > > `refresh_by_match_merge`. With adding a prefix like `_pg_internal_`
> > > we
> > > could make collisions pretty unlikely, without intrusive changes.
> > >
> > > The appended patch does this change for the aliases `mv`, `newdata`
> > > and
> > > `newdata2`.
> >
> > I think it's better to have some random name, see below. We could
> > either use "OIDNewHeap" or "MyBackendId" to make those column names
> > unique and almost unguessable. So, something like "pg_temp1_XXXX",
> > "pg_temp2_XXXX" or "pg_temp3_XXXX" and so on would be better IMO.
> >
> >     snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u",
> > OIDOldHeap);
> >     snprintf(namespaceName, sizeof(namespaceName), "pg_temp_%d",
> > MyBackendId);
>
> Hmm, it's an idea, but this can also lead to pretty random failures if
> you have an unlucky user who had the same idea in its generating query
> tool than the backend, no? Not sure if that's really better.
>
> With the current implementation of REFRESH MATERIALIZED VIEW
> CONCURRENTLY we always have the problem of possible collisions here,
> you'd never get out of this area without analyzing the whole query for
> such collisions.
>
> "mv" looks like a very common alias (i use it all over the time when
> testing or playing around with materialized views, so i'm wondering why
> i didn't see this issue already myself). So the risk here for such a
> collision looks very high. We can try to lower this risk by choosing an
> alias name, which is not so common. With a static alias however you get
> a static error condition, not something that fails here and then.

Another idea is to use random() function to generate required number
of uint32 random values(refresh_by_match_merge might need 3 values to
replace newdata, newdata2 and mv) and use the names like
pg_temp_rmv_<<rand_no1>>,  pg_temp_rmv_<<rand_no2>> and so on. This
would make the name unguessable. Note that we use this in
choose_dsm_implementation, dsm_impl_posix.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Alias collision in `refresh materialized view concurrently`

From
Michael Paquier
Date:
On Thu, May 20, 2021 at 09:14:45PM +0530, Bharath Rupireddy wrote:
> On Thu, May 20, 2021 at 7:52 PM Bernd Helmle <mailings@oopsware.de> wrote:
>> "mv" looks like a very common alias (i use it all over the time when
>> testing or playing around with materialized views, so i'm wondering why
>>  i didn't see this issue already myself). So the risk here for such a
>>  collision looks very high. We can try to lower this risk by choosing an
>>  alias name, which is not so common. With a static alias however you get
>>  a static error condition, not something that fails here and then.
>
> Another idea is to use random() function to generate required number
> of uint32 random values(refresh_by_match_merge might need 3 values to
> replace newdata, newdata2 and mv) and use the names like
> pg_temp_rmv_<<rand_no1>>,  pg_temp_rmv_<<rand_no2>> and so on. This
> would make the name unguessable. Note that we use this in
> choose_dsm_implementation, dsm_impl_posix.

I am not sure that I see the point of using a random() number here
while the backend ID, or just the PID, would easily provide enough
entropy for this internal alias.  I agree that "mv" is a bad choice
for this alias name.  One thing that comes in mind here is to use an
alias similar to what we do for dropped attributes, say
........pg.matview.%d........ where %d is the PID.  This will very
unlikely cause conflicts.
--
Michael

Attachment

Re: Alias collision in `refresh materialized view concurrently`

From
Bharath Rupireddy
Date:
On Fri, May 21, 2021 at 6:08 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, May 20, 2021 at 09:14:45PM +0530, Bharath Rupireddy wrote:
> > On Thu, May 20, 2021 at 7:52 PM Bernd Helmle <mailings@oopsware.de> wrote:
> >> "mv" looks like a very common alias (i use it all over the time when
> >> testing or playing around with materialized views, so i'm wondering why
> >>  i didn't see this issue already myself). So the risk here for such a
> >>  collision looks very high. We can try to lower this risk by choosing an
> >>  alias name, which is not so common. With a static alias however you get
> >>  a static error condition, not something that fails here and then.
> >
> > Another idea is to use random() function to generate required number
> > of uint32 random values(refresh_by_match_merge might need 3 values to
> > replace newdata, newdata2 and mv) and use the names like
> > pg_temp_rmv_<<rand_no1>>,  pg_temp_rmv_<<rand_no2>> and so on. This
> > would make the name unguessable. Note that we use this in
> > choose_dsm_implementation, dsm_impl_posix.
>
> I am not sure that I see the point of using a random() number here
> while the backend ID, or just the PID, would easily provide enough
> entropy for this internal alias.  I agree that "mv" is a bad choice
> for this alias name.  One thing that comes in mind here is to use an
> alias similar to what we do for dropped attributes, say
> ........pg.matview.%d........ where %d is the PID.  This will very
> unlikely cause conflicts.

I agree that backend ID and/or PID is enough. I'm not fully convinced
with using random(). To make it more concrete, how about something
like pg.matview.%d.%d (MyBackendId, MyProcPid)? If the user still sees
some collisions, then IMHO, it's better to ensure that this kind of
table/alias names are not generated outside of the server.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Alias collision in `refresh materialized view concurrently`

From
Michael Paquier
Date:
On Fri, May 21, 2021 at 03:56:31PM +0530, Bharath Rupireddy wrote:
> On Fri, May 21, 2021 at 6:08 AM Michael Paquier <michael@paquier.xyz> wrote:
>> I am not sure that I see the point of using a random() number here
>> while the backend ID, or just the PID, would easily provide enough
>> entropy for this internal alias.  I agree that "mv" is a bad choice
>> for this alias name.  One thing that comes in mind here is to use an
>> alias similar to what we do for dropped attributes, say
>> ........pg.matview.%d........ where %d is the PID.  This will very
>> unlikely cause conflicts.
>
> I agree that backend ID and/or PID is enough. I'm not fully convinced
> with using random(). To make it more concrete, how about something
> like pg.matview.%d.%d (MyBackendId, MyProcPid)? If the user still sees
> some collisions, then IMHO, it's better to ensure that this kind of
> table/alias names are not generated outside of the server.

There is no need to have the PID if MyBackendId is enough, so after
considering it I would just choose something like what I quoted above.
Don't we need also to worry about the queries using newdata, newdata2
and diff as aliases?  Would you like to implement a patch doing
something like that?
--
Michael

Attachment

Re: Alias collision in `refresh materialized view concurrently`

From
Bharath Rupireddy
Date:
On Tue, Jun 1, 2021 at 7:11 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, May 21, 2021 at 03:56:31PM +0530, Bharath Rupireddy wrote:
> > On Fri, May 21, 2021 at 6:08 AM Michael Paquier <michael@paquier.xyz> wrote:
> >> I am not sure that I see the point of using a random() number here
> >> while the backend ID, or just the PID, would easily provide enough
> >> entropy for this internal alias.  I agree that "mv" is a bad choice
> >> for this alias name.  One thing that comes in mind here is to use an
> >> alias similar to what we do for dropped attributes, say
> >> ........pg.matview.%d........ where %d is the PID.  This will very
> >> unlikely cause conflicts.
> >
> > I agree that backend ID and/or PID is enough. I'm not fully convinced
> > with using random(). To make it more concrete, how about something
> > like pg.matview.%d.%d (MyBackendId, MyProcPid)? If the user still sees
> > some collisions, then IMHO, it's better to ensure that this kind of
> > table/alias names are not generated outside of the server.
>
> There is no need to have the PID if MyBackendId is enough, so after
> considering it I would just choose something like what I quoted above.
> Don't we need also to worry about the queries using newdata, newdata2
> and diff as aliases?  Would you like to implement a patch doing
> something like that?

Sure. PSA v2 patch. We can't have "." as separator in the alias names,
so I used "_" instead.

I used MyProcPid which seems more random than MyBackendId (which is
just a number like 1,2,3...). Even with this, someone could argue that
they can look at the backend PID, use it in the materialized view
names just to trick the server. I'm not sure if anyone would want to
do this.

I used the existing function make_temptable_name_n to prepare the
alias names. The advantage of this is that the code looks cleaner, but
it leaks memory, 1KB string for each call of the function. This is
also true with the existing usage of the function. Now, we will have 5
make_temptable_name_n function calls leaking 5KB memory. And we also
have quote_qualified_identifier leaking memory, 2 function calls, 2KB.
So, in total, these two functions will leak 7KB of memory (with the
patch).

Shall I pfree the memory for all the strings returned by the functions
make_temptable_name_n and quote_qualified_identifier? The problem is
that pfree isn't cheaper.
Or shall we leave it as is so that the memory will be freed up by the context?

Note I have not added tests for this, as the code is covered by the
existing tests.

With Regards,
Bharath Rupireddy.

Attachment

Re: Alias collision in `refresh materialized view concurrently`

From
Bernd Helmle
Date:
Am Dienstag, dem 01.06.2021 um 13:13 +0530 schrieb Bharath Rupireddy:
> I used MyProcPid which seems more random than MyBackendId (which is
> just a number like 1,2,3...). Even with this, someone could argue
> that
> they can look at the backend PID, use it in the materialized view
> names just to trick the server. I'm not sure if anyone would want to
> do this.
> 
> 

A generated query likely uses just an incremented value derived from
somewhere and in my opinion 1,2,3 makes it more likely that you get a
chance for collisions if you managed to get the same alias prefix
somehow. So +1 with the MyProcPid...

> I used the existing function make_temptable_name_n to prepare the
> alias names. The advantage of this is that the code looks cleaner,
> but
> it leaks memory, 1KB string for each call of the function. This is
> also true with the existing usage of the function. Now, we will have
> 5
> make_temptable_name_n function calls leaking 5KB memory. And we also
> have quote_qualified_identifier leaking memory, 2 function calls,
> 2KB.
> So, in total, these two functions will leak 7KB of memory (with the
> patch).
> 
> Shall I pfree the memory for all the strings returned by the
> functions
> make_temptable_name_n and quote_qualified_identifier? The problem is
> that pfree isn't cheaper.
> Or shall we leave it as is so that the memory will be freed up by the
> context?
> 

afaics the memory context is deleted after execution immediately, so
i'd assume it's okay.



-- 
Thanks,
    Bernd





Re: Alias collision in `refresh materialized view concurrently`

From
Bharath Rupireddy
Date:
On Tue, Jun 1, 2021 at 5:24 PM Bernd Helmle <mailings@oopsware.de> wrote:
>
> Am Dienstag, dem 01.06.2021 um 13:13 +0530 schrieb Bharath Rupireddy:
> > I used MyProcPid which seems more random than MyBackendId (which is
> > just a number like 1,2,3...). Even with this, someone could argue
> > that
> > they can look at the backend PID, use it in the materialized view
> > names just to trick the server. I'm not sure if anyone would want to
> > do this.
> >
>
> A generated query likely uses just an incremented value derived from
> somewhere and in my opinion 1,2,3 makes it more likely that you get a
> chance for collisions if you managed to get the same alias prefix
> somehow. So +1 with the MyProcPid...

Thanks.

> > I used the existing function make_temptable_name_n to prepare the
> > alias names. The advantage of this is that the code looks cleaner,
> > but
> > it leaks memory, 1KB string for each call of the function. This is
> > also true with the existing usage of the function. Now, we will have
> > 5
> > make_temptable_name_n function calls leaking 5KB memory. And we also
> > have quote_qualified_identifier leaking memory, 2 function calls,
> > 2KB.
> > So, in total, these two functions will leak 7KB of memory (with the
> > patch).
> >
> > Shall I pfree the memory for all the strings returned by the
> > functions
> > make_temptable_name_n and quote_qualified_identifier? The problem is
> > that pfree isn't cheaper.
> > Or shall we leave it as is so that the memory will be freed up by the
> > context?
> >
>
> afaics the memory context is deleted after execution immediately, so
> i'd assume it's okay.

Yes, the refresh operation happens in the "PortalContext", which gets
destroyed at the end of the query in PortalDrop.

PSA v3 patch. I added a commit message and made some cosmetic adjustments.

With Regards,
Bharath Rupireddy.

Attachment

Re: Alias collision in `refresh materialized view concurrently`

From
Thomas Munro
Date:
On Wed, Jun 2, 2021 at 2:02 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> PSA v3 patch. I added a commit message and made some cosmetic adjustments.

Reminds me of this fun topic in Lisp:

https://en.wikipedia.org/wiki/Hygienic_macro#Strategies_used_in_languages_that_lack_hygienic_macros

I wondered if we could find a way to make identifiers that regular
queries are prohibited from using, at least by documentation.  You
could take advantage of the various constraints on unquoted
identifiers in the standard (for example, something involving $), but
it does seem a shame to remove the ability for users to put absolutely
anything except NUL in quoted identifiers.  I do wonder if at least
using something like _$mv would be slightly more principled than
pg_mv_1234, since nothing says pg_XXX is reserved (except in some very
specific places like schema names), and the number on the end seems a
bit cargo-cultish.



Re: Alias collision in `refresh materialized view concurrently`

From
Michael Paquier
Date:
On Wed, Jun 02, 2021 at 12:30:55PM +1200, Thomas Munro wrote:
> I wondered if we could find a way to make identifiers that regular
> queries are prohibited from using, at least by documentation.  You
> could take advantage of the various constraints on unquoted
> identifiers in the standard (for example, something involving $), but
> it does seem a shame to remove the ability for users to put absolutely
> anything except NUL in quoted identifiers.  I do wonder if at least
> using something like _$mv would be slightly more principled than
> pg_mv_1234, since nothing says pg_XXX is reserved (except in some very
> specific places like schema names), and the number on the end seems a
> bit cargo-cultish.

Yeah, using an underscore at the beginning of the name would have the
advantage to mark the relation as an internal thing.

+                    "(SELECT %s.tid FROM %s %s "
+                    "WHERE %s.tid IS NOT NULL "
+                    "AND %s.%s IS NULL)",
Anyway, I have another problem with the patch: readability.  It
becomes really hard for one to guess to which object or alias portions
of the internal queries refer to, especially with a total of five
temporary names lying around.  I think that you should drop the
business with make_temptable_name_n(), and just append those extra
underscores and uses of MyProcPid directly in the query string.  The
surroundings of quote_qualified_identifier() require two extra printf
calls, but that does not sound bad to me compared to the long-term
maintenance of those queries.
--
Michael

Attachment

Re: Alias collision in `refresh materialized view concurrently`

From
Bharath Rupireddy
Date:
On Wed, Jun 2, 2021 at 6:33 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Jun 02, 2021 at 12:30:55PM +1200, Thomas Munro wrote:
> > I wondered if we could find a way to make identifiers that regular
> > queries are prohibited from using, at least by documentation.  You
> > could take advantage of the various constraints on unquoted
> > identifiers in the standard (for example, something involving $), but
> > it does seem a shame to remove the ability for users to put absolutely
> > anything except NUL in quoted identifiers.  I do wonder if at least
> > using something like _$mv would be slightly more principled than
> > pg_mv_1234, since nothing says pg_XXX is reserved (except in some very
> > specific places like schema names), and the number on the end seems a
> > bit cargo-cultish.
>
> Yeah, using an underscore at the beginning of the name would have the
> advantage to mark the relation as an internal thing.
>
> +                    "(SELECT %s.tid FROM %s %s "
> +                    "WHERE %s.tid IS NOT NULL "
> +                    "AND %s.%s IS NULL)",
> Anyway, I have another problem with the patch: readability.  It
> becomes really hard for one to guess to which object or alias portions
> of the internal queries refer to, especially with a total of five
> temporary names lying around.  I think that you should drop the
> business with make_temptable_name_n(), and just append those extra
> underscores and uses of MyProcPid directly in the query string.  The
> surroundings of quote_qualified_identifier() require two extra printf
> calls, but that does not sound bad to me compared to the long-term
> maintenance of those queries.

Thanks. PSA v4.

With Regards,
Bharath Rupireddy.

Attachment

Re: Alias collision in `refresh materialized view concurrently`

From
Michael Paquier
Date:
On Wed, Jun 02, 2021 at 10:53:22AM +0530, Bharath Rupireddy wrote:
> Thanks. PSA v4.

Thanks for the new version.

+                    MyProcPid, tempname, MyProcPid, MyProcPid,
+                    tempname, MyProcPid, MyProcPid, MyProcPid,
+                    MyProcPid, MyProcPid, MyProcPid);
This style is still a bit heavy-ish.  Perhaps we should just come back
to Thomas's suggestion and just use a prefix with _$ for all that.
--
Michael

Attachment

Re: Alias collision in `refresh materialized view concurrently`

From
Bharath Rupireddy
Date:
On Wed, Jun 2, 2021 at 1:27 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Jun 02, 2021 at 10:53:22AM +0530, Bharath Rupireddy wrote:
> > Thanks. PSA v4.
>
> Thanks for the new version.
>
> +                    MyProcPid, tempname, MyProcPid, MyProcPid,
> +                    tempname, MyProcPid, MyProcPid, MyProcPid,
> +                    MyProcPid, MyProcPid, MyProcPid);
> This style is still a bit heavy-ish.  Perhaps we should just come back
> to Thomas's suggestion and just use a prefix with _$ for all that.

Thanks.The changes with that approach are very minimal. PSA v5 and let
me know if any more changes are needed.

With Regards,
Bharath Rupireddy.

Attachment

Re: Alias collision in `refresh materialized view concurrently`

From
Michael Paquier
Date:
On Wed, Jun 02, 2021 at 03:44:45PM +0530, Bharath Rupireddy wrote:
> Thanks.The changes with that approach are very minimal. PSA v5 and let
> me know if any more changes are needed.

Simple enough, so applied and back-patched.  It took 8 years for
somebody to complain about the current aliases, so that should be
enough to get us close to zero conflicts now.  I have looked a bit to
see if anybody would use this naming convention, but could not find a
trace, FWIW.
--
Michael

Attachment

Re: Alias collision in `refresh materialized view concurrently`

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> On Wed, Jun 02, 2021 at 03:44:45PM +0530, Bharath Rupireddy wrote:
>> Thanks.The changes with that approach are very minimal. PSA v5 and let
>> me know if any more changes are needed.

> Simple enough, so applied and back-patched.

I just came across this issue while preparing the release notes.
ISTM that people have expended a great deal of effort on a fundamentally
unreliable solution, when a reliable one is easily available.
The originally complained-of issue was that a user-chosen column name
could collide with the query-chosen table name:

ERROR: column reference "mv" is ambiguous
LINE 1: ...alog.=) mv.mv AND newdata OPERATOR(pg_catalog.*=) mv) WHERE ...

This is true, but it's self-inflicted damage, because all you have
to do is write the query so that mv is clearly a table name:

... mv.mv AND newdata.* OPERATOR(pg_catalog.*=) mv.*) WHERE ...

AFAICT that works and generates the identical parse tree to the original
coding.  The only place touched by the patch where it's a bit difficult to
make the syntax unambiguous this way is

                    "CREATE TEMP TABLE %s AS "
                    "SELECT _$mv.ctid AS tid, _$newdata "
                    "FROM %s _$mv FULL JOIN %s _$newdata ON (",

because newdata.* would ordinarily get expanded to multiple columns
if it's at the top level of a SELECT list, and that's not what we want.
However, that's easily fixed using the same hack as in ruleutils.c's
get_variable: add a no-op cast to the table's rowtype.  So this
would look like

    appendStringInfo(&querybuf,
                     "CREATE TEMP TABLE %s AS "
                     "SELECT mv.ctid AS tid, newdata.*::%s "
                     "FROM %s mv FULL JOIN %s newdata ON (",
                     diffname, matviewname, matviewname, tempname);

Given that it took this long to notice the problem at all, maybe
this is not a fix to cram in on the weekend before the release wrap.
But I don't see why we need to settle for "mostly works" when
"always works" is barely any harder.

            regards, tom lane



Re: Alias collision in `refresh materialized view concurrently`

From
Tom Lane
Date:
I wrote:
> I just came across this issue while preparing the release notes.
> ISTM that people have expended a great deal of effort on a fundamentally
> unreliable solution, when a reliable one is easily available.

Concretely, I propose the attached.

            regards, tom lane

diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9493b227b4..512b00bc65 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -537,9 +537,12 @@ transientrel_destroy(DestReceiver *self)
 /*
  * Given a qualified temporary table name, append an underscore followed by
  * the given integer, to make a new table name based on the old one.
+ * The result is a palloc'd string.
  *
- * This leaks memory through palloc(), which won't be cleaned up until the
- * current memory context is freed.
+ * As coded, this would fail to make a valid SQL name if the given name were,
+ * say, "FOO"."BAR".  Currently, the table name portion of the input will
+ * never be double-quoted because it's of the form "pg_temp_NNN", cf
+ * make_new_heap().  But we might have to work harder someday.
  */
 static char *
 make_temptable_name_n(char *tempname, int n)
@@ -627,16 +630,20 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
      * that in a way that allows showing the first duplicated row found.  Even
      * after we pass this test, a unique index on the materialized view may
      * find a duplicate key problem.
+     *
+     * Note: here and below, we use "tablename.*::tablerowtype" as a hack to
+     * keep ".*" from being expanded into multiple columns in a SELECT list.
+     * Compare ruleutils.c's get_variable().
      */
     resetStringInfo(&querybuf);
     appendStringInfo(&querybuf,
-                     "SELECT _$newdata FROM %s _$newdata "
-                     "WHERE _$newdata IS NOT NULL AND EXISTS "
-                     "(SELECT 1 FROM %s _$newdata2 WHERE _$newdata2 IS NOT NULL "
-                     "AND _$newdata2 OPERATOR(pg_catalog.*=) _$newdata "
-                     "AND _$newdata2.ctid OPERATOR(pg_catalog.<>) "
-                     "_$newdata.ctid)",
-                     tempname, tempname);
+                     "SELECT newdata.*::%s FROM %s newdata "
+                     "WHERE newdata.* IS NOT NULL AND EXISTS "
+                     "(SELECT 1 FROM %s newdata2 WHERE newdata2.* IS NOT NULL "
+                     "AND newdata2.* OPERATOR(pg_catalog.*=) newdata.* "
+                     "AND newdata2.ctid OPERATOR(pg_catalog.<>) "
+                     "newdata.ctid)",
+                     tempname, tempname, tempname);
     if (SPI_execute(querybuf.data, false, 1) != SPI_OK_SELECT)
         elog(ERROR, "SPI_exec failed: %s", querybuf.data);
     if (SPI_processed > 0)
@@ -663,9 +670,9 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
     resetStringInfo(&querybuf);
     appendStringInfo(&querybuf,
                      "CREATE TEMP TABLE %s AS "
-                     "SELECT _$mv.ctid AS tid, _$newdata "
-                     "FROM %s _$mv FULL JOIN %s _$newdata ON (",
-                     diffname, matviewname, tempname);
+                     "SELECT mv.ctid AS tid, newdata.*::%s AS newdata "
+                     "FROM %s mv FULL JOIN %s newdata ON (",
+                     diffname, tempname, matviewname, tempname);

     /*
      * Get the list of index OIDs for the table from the relcache, and look up
@@ -757,9 +764,9 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
                 if (foundUniqueIndex)
                     appendStringInfoString(&querybuf, " AND ");

-                leftop = quote_qualified_identifier("_$newdata",
+                leftop = quote_qualified_identifier("newdata",
                                                     NameStr(attr->attname));
-                rightop = quote_qualified_identifier("_$mv",
+                rightop = quote_qualified_identifier("mv",
                                                      NameStr(attr->attname));

                 generate_operator_clause(&querybuf,
@@ -787,8 +794,8 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
     Assert(foundUniqueIndex);

     appendStringInfoString(&querybuf,
-                           " AND _$newdata OPERATOR(pg_catalog.*=) _$mv) "
-                           "WHERE _$newdata IS NULL OR _$mv IS NULL "
+                           " AND newdata.* OPERATOR(pg_catalog.*=) mv.*) "
+                           "WHERE newdata.* IS NULL OR mv.* IS NULL "
                            "ORDER BY tid");

     /* Create the temporary "diff" table. */
@@ -814,10 +821,10 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
     /* Deletes must come before inserts; do them first. */
     resetStringInfo(&querybuf);
     appendStringInfo(&querybuf,
-                     "DELETE FROM %s _$mv WHERE ctid OPERATOR(pg_catalog.=) ANY "
-                     "(SELECT _$diff.tid FROM %s _$diff "
-                     "WHERE _$diff.tid IS NOT NULL "
-                     "AND _$diff._$newdata IS NULL)",
+                     "DELETE FROM %s mv WHERE ctid OPERATOR(pg_catalog.=) ANY "
+                     "(SELECT diff.tid FROM %s diff "
+                     "WHERE diff.tid IS NOT NULL "
+                     "AND diff.newdata IS NULL)",
                      matviewname, diffname);
     if (SPI_exec(querybuf.data, 0) != SPI_OK_DELETE)
         elog(ERROR, "SPI_exec failed: %s", querybuf.data);
@@ -825,8 +832,8 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
     /* Inserts go last. */
     resetStringInfo(&querybuf);
     appendStringInfo(&querybuf,
-                     "INSERT INTO %s SELECT (_$diff._$newdata).* "
-                     "FROM %s _$diff WHERE tid IS NULL",
+                     "INSERT INTO %s SELECT (diff.newdata).* "
+                     "FROM %s diff WHERE tid IS NULL",
                      matviewname, diffname);
     if (SPI_exec(querybuf.data, 0) != SPI_OK_INSERT)
         elog(ERROR, "SPI_exec failed: %s", querybuf.data);
diff --git a/src/test/regress/expected/matview.out b/src/test/regress/expected/matview.out
index 4b3a2e0cb7..313c72a268 100644
--- a/src/test/regress/expected/matview.out
+++ b/src/test/regress/expected/matview.out
@@ -551,7 +551,15 @@ NOTICE:  drop cascades to materialized view mvtest_mv_v
 -- make sure running as superuser works when MV owned by another role (bug #11208)
 CREATE ROLE regress_user_mvtest;
 SET ROLE regress_user_mvtest;
-CREATE TABLE mvtest_foo_data AS SELECT i, md5(random()::text)
+-- this test case also checks for ambiguity in the queries issued by
+-- refresh_by_match_merge(), by choosing column names that intentionally
+-- duplicate all the aliases used in those queries
+CREATE TABLE mvtest_foo_data AS SELECT i,
+  i+1 AS tid,
+  md5(random()::text) AS mv,
+  md5(random()::text) AS newdata,
+  md5(random()::text) AS newdata2,
+  md5(random()::text) AS diff
   FROM generate_series(1, 10) i;
 CREATE MATERIALIZED VIEW mvtest_mv_foo AS SELECT * FROM mvtest_foo_data;
 CREATE MATERIALIZED VIEW mvtest_mv_foo AS SELECT * FROM mvtest_foo_data;
diff --git a/src/test/regress/sql/matview.sql b/src/test/regress/sql/matview.sql
index 4a4bd0d6b6..68b9ccfd45 100644
--- a/src/test/regress/sql/matview.sql
+++ b/src/test/regress/sql/matview.sql
@@ -211,7 +211,15 @@ DROP TABLE mvtest_v CASCADE;
 -- make sure running as superuser works when MV owned by another role (bug #11208)
 CREATE ROLE regress_user_mvtest;
 SET ROLE regress_user_mvtest;
-CREATE TABLE mvtest_foo_data AS SELECT i, md5(random()::text)
+-- this test case also checks for ambiguity in the queries issued by
+-- refresh_by_match_merge(), by choosing column names that intentionally
+-- duplicate all the aliases used in those queries
+CREATE TABLE mvtest_foo_data AS SELECT i,
+  i+1 AS tid,
+  md5(random()::text) AS mv,
+  md5(random()::text) AS newdata,
+  md5(random()::text) AS newdata2,
+  md5(random()::text) AS diff
   FROM generate_series(1, 10) i;
 CREATE MATERIALIZED VIEW mvtest_mv_foo AS SELECT * FROM mvtest_foo_data;
 CREATE MATERIALIZED VIEW mvtest_mv_foo AS SELECT * FROM mvtest_foo_data;

Re: Alias collision in `refresh materialized view concurrently`

From
Michael Paquier
Date:
On Fri, Aug 06, 2021 at 10:48:40AM -0400, Tom Lane wrote:
> AFAICT that works and generates the identical parse tree to the original
> coding.  The only place touched by the patch where it's a bit difficult to
> make the syntax unambiguous this way is
>
>                     "CREATE TEMP TABLE %s AS "
>                     "SELECT _$mv.ctid AS tid, _$newdata "
>                     "FROM %s _$mv FULL JOIN %s _$newdata ON (",
>
> because newdata.* would ordinarily get expanded to multiple columns
> if it's at the top level of a SELECT list, and that's not what we want.
> However, that's easily fixed using the same hack as in ruleutils.c's
> get_variable: add a no-op cast to the table's rowtype.  So this
> would look like
>
>     appendStringInfo(&querybuf,
>                      "CREATE TEMP TABLE %s AS "
>                      "SELECT mv.ctid AS tid, newdata.*::%s "
>                      "FROM %s mv FULL JOIN %s newdata ON (",
>                      diffname, matviewname, matviewname, tempname);

Smart piece.  I haven't thought of that.

> Given that it took this long to notice the problem at all, maybe
> this is not a fix to cram in on the weekend before the release wrap.
> But I don't see why we need to settle for "mostly works" when
> "always works" is barely any harder.

Yes, I would vote to delay that for a couple of days.  That's not
worth taking a risk for.
--
Michael

Attachment

Re: Alias collision in `refresh materialized view concurrently`

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> On Fri, Aug 06, 2021 at 10:48:40AM -0400, Tom Lane wrote:
>> Given that it took this long to notice the problem at all, maybe
>> this is not a fix to cram in on the weekend before the release wrap.
>> But I don't see why we need to settle for "mostly works" when
>> "always works" is barely any harder.

> Yes, I would vote to delay that for a couple of days.  That's not
> worth taking a risk for.

I went ahead and created the patch, including test case, and it
seems fine.  So I'm leaning towards pushing that tomorrow.  Mainly
because I don't want to have to document "we partially fixed this"
in this release set and then "we really fixed it" three months from
now.

            regards, tom lane