Re: [PATCH] Implement INSERT SET syntax - Mailing list pgsql-hackers

From Gareth Palmer
Subject Re: [PATCH] Implement INSERT SET syntax
Date
Msg-id 281BF7E9-7EF0-49C7-82FA-58E6BF736EFC@internetnz.net.nz
Whole thread Raw
In response to Re: [PATCH] Implement INSERT SET syntax  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

> On 15/11/2019, at 10:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Gareth Palmer <gareth@internetnz.net.nz> writes:
>>> On 19/08/2019, at 3:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Perhaps the way to resolve Peter's objection is to make the syntax
>>> more fully like UPDATE:
>>>    INSERT INTO target SET c1 = x, c2 = y+z, ... FROM tables-providing-x-y-z
>>> (with the patch as-submitted corresponding to the case with an empty
>>> FROM clause, hence no variables in the expressions-to-be-assigned).
>
>> Thanks for the feedback. Attached is version 3 of the patch that makes
>> the syntax work more like an UPDATE statement when a FROM clause is used.
>
> Since nobody has objected to this, I'm supposing that there's general
> consensus that that design sketch is OK, and we can move on to critiquing
> implementation details.  I took a look, and didn't like much of what I saw.
>
> * In the grammar, there's no real need to have separate productions
> for the cases with FROM and without.  The way you have it is awkward,
> and it arbitrarily rejects combinations that work fine in plain
> SELECT, such as WHERE without FROM.  You should just do
>
> insert_set_clause:
>         SET set_clause_list from_clause where_clause
>           group_clause having_clause window_clause opt_sort_clause
>           opt_select_limit
>
> relying on the ability of all those symbols (except set_clause_list) to
> reduce to empty.

There are two separate productions to match the two different types
of inserts: INSERT with VALUES and INSERT with SELECT.

The former has to store the the values in valuesLists so that DEFAULT
can still be used.

Allowing a WHERE without a FROM also mean that while this would
work:

INSERT INTO t SET c = DEFAULT;

But this would fail with 'DEFAULT is not allowed in this context':

INSERT INTO t SET c = DEFAULT WHERE true;

I should have put a comment explaining why there are two rules.

It could be combined into one production but there would have to be
a check that $4 .. $9 are NULL to determine what type of INSERT to
use.

transformInsertStmt() also has an optimisation for the case of a
single valueLists entry.

> * This is randomly inconsistent with select_no_parens, and not in a
> good way, because you've omitted the option that's actually most likely
> to be useful, namely for_locking_clause.  I wonder whether it's practical
> to refactor select_no_parens so that the stuff involving optional trailing
> clauses can be separated out into a production that insert_set_clause
> could also use.  Might not be worth the trouble, but I'm concerned
> about select_no_parens growing additional clauses that we then forget
> to also add to insert_set_clause.
>
> * I'm not sure if it's worth also refactoring simple_select so that
> the "into_clause ... window_clause" business could be shared.  But
> it'd likely be a good idea to at least have a comment there noting
> that any changes in that production might need to be applied to
> insert_set_clause as well.

I can add opt_for_locking_clause and a comment to simple_select to
start with while the format of insert_set_clause is still being
worked out.

> * In kind of the same vein, it feels like the syntax documentation
> is awkwardly failing to share commonality that it ought to be
> able to share with the SELECT man page.

I could collapse the from clause to just '[ FROM from_clause ]'
and have it refer to the from clause and everything after it in
SELECT.

> * I dislike the random hacking you did in transformMultiAssignRef.
> That weakens a useful check for error cases, and it's far from clear
> why the new assertion is OK.  It also raises the question of whether
> this is really the only place you need to touch in parse analysis.
> Perhaps it'd be better to consider inventing new EXPR_KIND_ values
> for this situation; you'd then have to run around and look at all the
> existing EXPR_KIND uses, but that seems like a useful cross-check
> activity anyway.  Or maybe we need to take two steps back and
> understand why that change is needed at all.  I'd imagined that this
> patch would be only syntactic sugar for something you can do already,
> so it's not quite clear to me why we need additional changes.
>
> (If it's *not* just syntactic sugar, then the scope of potential
> problems becomes far greater, eg does ruleutils.c need to know
> how to reconstruct a valid SQL command from a querytree like this.
> If we're not touching ruleutils.c, we need to be sure that every
> command that can be written this way can be written old-style.)

It was intended to just be syntatic sugar, however because
set_clause_list is being re-used the ability to do multi-assignment
in an INSERT's targetList 'came along for the ride' which has
no equivalent in the current INSERT syntax.

That would be why those EXPR_KIND's are now appearing in
transformMultiAssignRef().

There are 3 things that could be done here:

1. Update ruletutils.c to emit INSERT SET in get_insert_query_def()
   if query->hasSubLinks is true.

2. Add a new production similar to set_clause_list which doesn't
   allow multi-assignment.

3. Re-use set_clause_list but reject targetLists that contain
   multi-assignment.

Keeping that feature is probably desirable at least for consistency
with other SET clauses.

I will work on getting get_insert_query_def() to correctly
reconstruct the new syntax.

> * Other documentation gripes: the lone example seems insufficient,
> and there needs to be an entry under COMPATIBILITY pointing out
> that this is not per SQL spec.

I will add something to in the compatibility section.

> * Some of the test cases seem to be expensively repeating
> construction/destruction of tables that they could have shared with
> existing test cases.  I do not consider it a virtue for new tests
> added to an existing test script to be resolutely independent of
> what's already in that script.

Those test cases will be changed to share the those tables.

> I'm setting this back to Waiting on Author.
>
>             regards, tom lane




pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: progress report for ANALYZE
Next
From: Thomas Munro
Date:
Subject: Re: Invisible PROMPT2