Thread: Patch to add insert of multiple tuples per INSERT statement

Patch to add insert of multiple tuples per INSERT statement

From

Liam Stewart

Date:

31 July 2001, 14:14:13

The attached patch adds the ability to explicitly insert multiple tuples
per insert statement. ie:

INSERT INTO tab [(col1, col2, ...)] VALUES (x1, y1, ...), (x2, y2, ...), ... ;

as per the todo list entry.

Liam

--
Liam Stewart :: Red Hat Canada, Ltd. :: liams@redhat.com

Attachment

insert.diff

Re: Patch to add insert of multiple tuples per INSERT statement

From

Tom Lane

Date:

31 July 2001, 15:36:48

Liam Stewart <liams@redhat.com> writes:
> The attached patch adds the ability to explicitly insert multiple tuples
> per insert statement. ie:
> INSERT INTO tab [(col1, col2, ...)] VALUES (x1, y1, ...), (x2, y2, ...), ... ;
> as per the todo list entry.

There are several things that need to be cleaned up here.

1. The static TupleCount variable is pretty ugly, and unnecessary because
there are already mechanisms in the executor to keep track of the number
of tuples processed --- here's an example:

regression=# insert into foo select * from int8_tbl;
INSERT 0 5
regression=# insert into foo select * from int8_tbl limit 1;
INSERT 146302 1

Please do this the way the rest of the code does it.

2. You missed the equal-func for InsertStmt.  In general, when changing
fields of a Node structure, there are copy, equality, input, and output
functions to be looked at.  The input and output functions may be
omitted for node types that never appear in stored rules, however, and
InsertStmt doesn't.  But copy and equality are always there.

3. Please also fix the other copy of the grammar in
src/interfaces/ecpg/preproc/preproc.y.  (If anyone has an idea how to
not duplicate most of the grammar for ecpg, I'd love to hear it.)

4. I don't like the approach of converting an InsertStmt into an
extras_after list --- the extras_after list is a hack, and shouldn't
be depended on unnecessarily.  Seems like a cleaner approach would be to
do all the looping in transformInsertStmt.  In any case, the use of
both targetList and tupleList in the InsertStmt node is peculiar and
very inadequately documented --- it took me some time to see that what
you had done was not outright broken.  Please improve the comments in
parsenodes.h, if nothing else.

A more general point, though this may be more than you wanted to tackle
now, is that in the long run complexifying INSERT processing is exactly
*not* the way to go about this.  If you read the SQL92 grammar you'll
see that "VALUES (row), (row), ..." is actually a kind of <query
expression> and should be accepted anywhere that a sub-SELECT would be.
So eventually we want to allow that and simplify INSERT to have only one
case covering both INSERT ... VALUES and INSERT ... SELECT.  This, as
well as allowing the SQL-mandated capability of writing DEFAULT for any
one column of an INSERT, was what I had taken the TODO item to mean.

            regards, tom lane

Re: Patch to add insert of multiple tuples per INSERT statement

From

Liam Stewart

Date:

01 August 2001, 16:17:12

On Tue, Jul 31, 2001 at 12:16:58PM -0400, Tom Lane wrote:
> 2. You missed the equal-func for InsertStmt.  In general, when changing
> fields of a Node structure, there are copy, equality, input, and output
> functions to be looked at.  The input and output functions may be
> omitted for node types that never appear in stored rules, however, and
> InsertStmt doesn't.  But copy and equality are always there.
>
> 3. Please also fix the other copy of the grammar in
> src/interfaces/ecpg/preproc/preproc.y.  (If anyone has an idea how to
> not duplicate most of the grammar for ecpg, I'd love to hear it.)

Ok; these are good to know.

> 4. I don't like the approach of converting an InsertStmt into an
> extras_after list --- the extras_after list is a hack, and shouldn't
> be depended on unnecessarily.  Seems like a cleaner approach would be to
> do all the looping in transformInsertStmt.

Could you elaborate on your last statement (all looping in
transformInsertStmt)? That's currently where I'm breaking large inserts
into multiple single inserts. Yes, the conversion approach is not
terribly nice, but it works. Are there any plans for replacing the
extras_(before|after) lists method with something something else?

> A more general point, though this may be more than you wanted to tackle
> now, is that in the long run complexifying INSERT processing is exactly
> *not* the way to go about this.  If you read the SQL92 grammar you'll
> see that "VALUES (row), (row), ..." is actually a kind of <query
> expression> and should be accepted anywhere that a sub-SELECT would be.
> So eventually we want to allow that and simplify INSERT to have only one
> case covering both INSERT ... VALUES and INSERT ... SELECT.  This, as
> well as allowing the SQL-mandated capability of writing DEFAULT for any
> one column of an INSERT, was what I had taken the TODO item to mean.

Yup, I agree. I'm willing to take a look at that. How about this: create
a new node type for <table value constructor> (VALUES (...), (...), ...)
and work things so that in the executor, tuples are retrieved from a
list rather than going down through all the heap code. So in effect,
VALUES ... will work like a SELECT when viewed from the outside. I still
need to examine the details, but it looks do-able. There obviously needs
to be a distinction between VALUES ... and SELECT ... and that would be
reflected in gram.y (as well as in the other backend code), perhaps
underneath a rule for <query expression>. I think this is what you were
getting at.

Liam

--
Liam Stewart :: Red Hat Canada, Ltd. :: liams@redhat.com