Re: Alias collision in `refresh materialized view concurrently` - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Alias collision in `refresh materialized view concurrently`
Date
Msg-id 2488325.1628261320@sss.pgh.pa.us
Whole thread Raw
In response to Re: Alias collision in `refresh materialized view concurrently`  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Alias collision in `refresh materialized view concurrently`  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Alias collision in `refresh materialized view concurrently`  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
Michael Paquier <michael@paquier.xyz> writes:
> On Wed, Jun 02, 2021 at 03:44:45PM +0530, Bharath Rupireddy wrote:
>> Thanks.The changes with that approach are very minimal. PSA v5 and let
>> me know if any more changes are needed.

> Simple enough, so applied and back-patched.

I just came across this issue while preparing the release notes.
ISTM that people have expended a great deal of effort on a fundamentally
unreliable solution, when a reliable one is easily available.
The originally complained-of issue was that a user-chosen column name
could collide with the query-chosen table name:

ERROR: column reference "mv" is ambiguous
LINE 1: ...alog.=) mv.mv AND newdata OPERATOR(pg_catalog.*=) mv) WHERE ...

This is true, but it's self-inflicted damage, because all you have
to do is write the query so that mv is clearly a table name:

... mv.mv AND newdata.* OPERATOR(pg_catalog.*=) mv.*) WHERE ...

AFAICT that works and generates the identical parse tree to the original
coding.  The only place touched by the patch where it's a bit difficult to
make the syntax unambiguous this way is

                    "CREATE TEMP TABLE %s AS "
                    "SELECT _$mv.ctid AS tid, _$newdata "
                    "FROM %s _$mv FULL JOIN %s _$newdata ON (",

because newdata.* would ordinarily get expanded to multiple columns
if it's at the top level of a SELECT list, and that's not what we want.
However, that's easily fixed using the same hack as in ruleutils.c's
get_variable: add a no-op cast to the table's rowtype.  So this
would look like

    appendStringInfo(&querybuf,
                     "CREATE TEMP TABLE %s AS "
                     "SELECT mv.ctid AS tid, newdata.*::%s "
                     "FROM %s mv FULL JOIN %s newdata ON (",
                     diffname, matviewname, matviewname, tempname);

Given that it took this long to notice the problem at all, maybe
this is not a fix to cram in on the weekend before the release wrap.
But I don't see why we need to settle for "mostly works" when
"always works" is barely any harder.

            regards, tom lane



pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: RFC: Improve CPU cache locality of syscache searches
Next
From: Himanshu Upadhyaya
Date:
Subject: Re: Support reset of Shared objects statistics in "pg_stat_reset" function