Re: [bug+patch] Inserting DEFAULT into generated columns from VALUES RTE - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: [bug+patch] Inserting DEFAULT into generated columns from VALUES RTE
Date
Msg-id CAEZATCVaZNONSW1yEZsAe6U_-O0kqyPTv_N4_60ydd=ot7y79Q@mail.gmail.com
Whole thread Raw
In response to Re: [bug+patch] Inserting DEFAULT into generated columns from VALUES RTE  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [bug+patch] Inserting DEFAULT into generated columns from VALUES RTE  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, 6 Sept 2020 at 22:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I think you'd be better off to make transformInsertStmt(), specifically
> its multi-VALUES-rows code path, check for all-DEFAULT columns and adjust
> the tlist itself.  Doing it there might be a good bit less inefficient
> for very long VALUES lists, too, which is a case that we do worry about.
> Since that's already iterating through the sub-lists, you could track
> whether all rows seen so far contain SetToDefault in each column position,
> and avoid extra scans of the sublists.  (A BitmapSet might be a convenient
> representation of that, though you could also use a bool array I suppose.)
>
> I do not care for what you did in rewriteValuesRTE() either: just removing
> a sanity check isn't OK, unless you've done something to make the sanity
> check unnecessary which you surely have not.  Perhaps you could extend
> the initial scan of the tlist (lines 1294-1310) to notice SetToDefault
> nodes as well as Var nodes and keep track of which columns have those.
> Then you could cross-check that one or the other case applies whenever
> you see a SetToDefault in the VALUES lists.

That's not quite right because by the time rewriteValuesRTE() sees the
tlist, it will contain already-rewritten generated column expressions,
not SetToDefault nodes. If we're going to keep that sanity check (and
I think that we should), I think that the way to do it is to have
rewriteTargetListIU() record which columns it has expanded defaults
for, and pass that information to rewriteValuesRTE(). Those columns of
the VALUES RTE are no longer used in the query, so the sanity check
can be amended to ignore them while continuing to check the other
columns.

Incidentally, there is another way of causing that sanity check to
fail -- an INSERT ... OVERRIDING USER VALUE query with some DEFAULTS
in the VALUES RTE (but not necessarily all DEFAULTs) will trigger it.
So even if the parser completely removed any all-DEFAULT columns from
the VALUES RTE, some work in the rewriter would still be necessary.


> BTW, another thing that needs checking is whether a rule containing
> an INSERT like this will reverse-list sanely.  The whole idea of
> replacing some of the Vars might not work so far as ruleutils.c is
> concerned.  In that case I think we might have to implement this
> by having transformInsertStmt restructure the VALUES lists to
> physically remove the all-DEFAULT column, and adjust the target column
> list accordingly --- that is, make a parse-time transformation of
>         INSERT INTO gtest0 VALUES (1, DEFAULT), (2, DEFAULT);
> into
>         INSERT INTO gtest0(a) VALUES (1), (2);
> That'd have the advantage that you'd not have to hack up the
> rewriter at all.

I think it's actually easier to just do it all in the rewriter -- at
the point where we see that we're about to insert potentially illegal
values from a VALUES RTE into a generated column, scan it to see if
all the values in that column are DEFAULTs, and if so trigger the
existing code to replace the Var in the tlist with the generated
column expression. That way we only do extra work in the case for
which we're currently throwing an error, rather than for every query.
Also, I think that scanning the VALUES lists in this way is likely to
be cheaper than rebuilding them to remove a column.

Attached is a patch doing it that way, along with additional
regression tests that trigger both the original error and the
sanity-check error triggered by INSERT ... OVERRIDING USER VALUES. I
also added a few additional comments where I found the existing code a
little non-obvious.

I haven't touched the existing error messages. I think that's a
subject for a separate patch.

Regards,
Dean

Attachment

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Keep elog(ERROR) and ereport(ERROR) calls in the cold path
Next
From: Laurenz Albe
Date:
Subject: Re: Add session statistics to pg_stat_database