Thread: Patch to add insert of multiple tuples per INSERT statement
The attached patch adds the ability to explicitly insert multiple tuples per insert statement. ie: INSERT INTO tab [(col1, col2, ...)] VALUES (x1, y1, ...), (x2, y2, ...), ... ; as per the todo list entry. Liam -- Liam Stewart :: Red Hat Canada, Ltd. :: liams@redhat.com
Attachment
Liam Stewart <liams@redhat.com> writes: > The attached patch adds the ability to explicitly insert multiple tuples > per insert statement. ie: > INSERT INTO tab [(col1, col2, ...)] VALUES (x1, y1, ...), (x2, y2, ...), ... ; > as per the todo list entry. There are several things that need to be cleaned up here. 1. The static TupleCount variable is pretty ugly, and unnecessary because there are already mechanisms in the executor to keep track of the number of tuples processed --- here's an example: regression=# insert into foo select * from int8_tbl; INSERT 0 5 regression=# insert into foo select * from int8_tbl limit 1; INSERT 146302 1 Please do this the way the rest of the code does it. 2. You missed the equal-func for InsertStmt. In general, when changing fields of a Node structure, there are copy, equality, input, and output functions to be looked at. The input and output functions may be omitted for node types that never appear in stored rules, however, and InsertStmt doesn't. But copy and equality are always there. 3. Please also fix the other copy of the grammar in src/interfaces/ecpg/preproc/preproc.y. (If anyone has an idea how to not duplicate most of the grammar for ecpg, I'd love to hear it.) 4. I don't like the approach of converting an InsertStmt into an extras_after list --- the extras_after list is a hack, and shouldn't be depended on unnecessarily. Seems like a cleaner approach would be to do all the looping in transformInsertStmt. In any case, the use of both targetList and tupleList in the InsertStmt node is peculiar and very inadequately documented --- it took me some time to see that what you had done was not outright broken. Please improve the comments in parsenodes.h, if nothing else. A more general point, though this may be more than you wanted to tackle now, is that in the long run complexifying INSERT processing is exactly *not* the way to go about this. If you read the SQL92 grammar you'll see that "VALUES (row), (row), ..." is actually a kind of <query expression> and should be accepted anywhere that a sub-SELECT would be. So eventually we want to allow that and simplify INSERT to have only one case covering both INSERT ... VALUES and INSERT ... SELECT. This, as well as allowing the SQL-mandated capability of writing DEFAULT for any one column of an INSERT, was what I had taken the TODO item to mean. regards, tom lane
On Tue, Jul 31, 2001 at 12:16:58PM -0400, Tom Lane wrote: > 2. You missed the equal-func for InsertStmt. In general, when changing > fields of a Node structure, there are copy, equality, input, and output > functions to be looked at. The input and output functions may be > omitted for node types that never appear in stored rules, however, and > InsertStmt doesn't. But copy and equality are always there. > > 3. Please also fix the other copy of the grammar in > src/interfaces/ecpg/preproc/preproc.y. (If anyone has an idea how to > not duplicate most of the grammar for ecpg, I'd love to hear it.) Ok; these are good to know. > 4. I don't like the approach of converting an InsertStmt into an > extras_after list --- the extras_after list is a hack, and shouldn't > be depended on unnecessarily. Seems like a cleaner approach would be to > do all the looping in transformInsertStmt. Could you elaborate on your last statement (all looping in transformInsertStmt)? That's currently where I'm breaking large inserts into multiple single inserts. Yes, the conversion approach is not terribly nice, but it works. Are there any plans for replacing the extras_(before|after) lists method with something something else? > A more general point, though this may be more than you wanted to tackle > now, is that in the long run complexifying INSERT processing is exactly > *not* the way to go about this. If you read the SQL92 grammar you'll > see that "VALUES (row), (row), ..." is actually a kind of <query > expression> and should be accepted anywhere that a sub-SELECT would be. > So eventually we want to allow that and simplify INSERT to have only one > case covering both INSERT ... VALUES and INSERT ... SELECT. This, as > well as allowing the SQL-mandated capability of writing DEFAULT for any > one column of an INSERT, was what I had taken the TODO item to mean. Yup, I agree. I'm willing to take a look at that. How about this: create a new node type for <table value constructor> (VALUES (...), (...), ...) and work things so that in the executor, tuples are retrieved from a list rather than going down through all the heap code. So in effect, VALUES ... will work like a SELECT when viewed from the outside. I still need to examine the details, but it looks do-able. There obviously needs to be a distinction between VALUES ... and SELECT ... and that would be reflected in gram.y (as well as in the other backend code), perhaps underneath a rule for <query expression>. I think this is what you were getting at. Liam -- Liam Stewart :: Red Hat Canada, Ltd. :: liams@redhat.com