Thread: multiple inserts

multiple inserts

From
Liam Stewart
Date:
Here is another version of the multiple insert patch that contains
modifications as per Tom's post on the 24th.

I had though about the integration of targetList and valuesList into
something like targetLists, but I didn't do it because I knew there
would need to be modifications in a lot of places and because the
difference between targetList and valuesList seemed clear. I gave Tom's
suggestion a shot, though, and this is the first result. There are quite
a number of functions that take a single targetList as a parameter. I
didn't change those because I think it's more trouble that it's worth
right now. If those modifications are desired, they should wait until
the redesign of the the querytree structure, IMHO. But targetList ->
targetLists is done.

There are a couple of places where I made an assertion that the length
of a Query node's targetLists member must be one that I'm not sure about
(there are comments by those assertions). Suggestions on those would be
appreciated.

There are also a couple of places where I need to do a check to see if a
query's targetLists is NIL and do assignments based on the result of
that check. Would it be better to for all Query nodes that currently
have a NIL targetLists coming out of transfromXYZ to have a targetLists
member with one NIL targetList. I don't really think so (it's could be
confusing), but I'm just wondering what the thoughts of others are.

Comments?

Liam

--
Liam Stewart :: Red Hat Canada, Ltd. :: liams@redhat.com

Attachment

Re: multiple inserts

From
Bruce Momjian
Date:
Your patch has been added to the PostgreSQL unapplied patches list at:

    http://candle.pha.pa.us/cgi-bin/pgpatches

I will try to apply it within the next 48 hours.

>
> Here is another version of the multiple insert patch that contains
> modifications as per Tom's post on the 24th.
>
> I had though about the integration of targetList and valuesList into
> something like targetLists, but I didn't do it because I knew there
> would need to be modifications in a lot of places and because the
> difference between targetList and valuesList seemed clear. I gave Tom's
> suggestion a shot, though, and this is the first result. There are quite
> a number of functions that take a single targetList as a parameter. I
> didn't change those because I think it's more trouble that it's worth
> right now. If those modifications are desired, they should wait until
> the redesign of the the querytree structure, IMHO. But targetList ->
> targetLists is done.
>
> There are a couple of places where I made an assertion that the length
> of a Query node's targetLists member must be one that I'm not sure about
> (there are comments by those assertions). Suggestions on those would be
> appreciated.
>
> There are also a couple of places where I need to do a check to see if a
> query's targetLists is NIL and do assignments based on the result of
> that check. Would it be better to for all Query nodes that currently
> have a NIL targetLists coming out of transfromXYZ to have a targetLists
> member with one NIL targetList. I don't really think so (it's could be
> confusing), but I'm just wondering what the thoughts of others are.
>
> Comments?
>
> Liam
>
> --
> Liam Stewart :: Red Hat Canada, Ltd. :: liams@redhat.com

[ Attachment, skipping... ]

[ Attachment, skipping... ]

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: multiple inserts

From
Tom Lane
Date:
Liam Stewart <liams@redhat.com> writes:
> Here is another version of the multiple insert patch that contains
> modifications as per Tom's post on the 24th.

I poked at this a little bit and find that it still has problems,
for example:

regression=# create table foo (f1 int, f2 int);
CREATE

regression=# insert into foo values (1,2), (3,4);
INSERT 0 2

regression=# insert into foo values (0,4), (3,1+(select max(f2) from foo));
ERROR:  fmgr_info: function 0: cache lookup failed

regression=# insert into foo values (0,4),
regression-# (3,case when 1 = any (select max(f2) from foo) then 1 else 0 end);
server closed the connection unexpectedly
(postmaster log shows backend died with SIGSEGV)

While these particular problems could doubtless be fixed (they are due
to failure to consider Result's additional targetlists in setrefs.c),
I'm beginning to have serious doubts about the approach of adding
multiple targetlists to Query and Result nodes.  The patch is much
larger and uglier than I'd hoped it would turn out, and yet there are
still many places that arbitrarily assume they are dealing with
single-targetlist cases.  This does not look like an approach that will
scale up easily to the place we'd hoped to get to (allowing VALUES
constructs as sub-selects).

I am very sorry for leading you down the garden path, but at this point
I wonder whether you shouldn't scrap the multi-targetlist approach and
try again.  The desired feature could be added without any impact beyond
the parser, if analyze.c were to turn a multi-VALUES-list INSERT into
something that looked like it had been produced by

    INSERT INTO targettable
        SELECT first-values-list
        UNION ALL
        SELECT second-values-list
        UNION ALL
        ...

You could do this quite easily in transformInsertStmt: if there's more
than one sub-list in stmt->values, transform each sublist as a
targetlist and then build a Query node just for that targetlist, finally
assembling the Query nodes into a SetOp tree.  Note you'd want to apply
prepareInsertTargetList to each of the sub-targetlists separately (to
coerce the result datatypes to the target column types) rather than
running the type resolution code that UNION normally uses to resolve
disparate datatypes in the arms of the UNION.  Otherwise it seems a
fairly straightforward, localized exercise.

I think we had talked about this idea earlier and decided to shoot for
propagating the notion of multiple targetlists into Query and Result
nodes instead.  But now it's looking like that was the wrong decision to
make.  I think we'd better add the general capability to our list of
things-to-think-about-when-we-redesign-querytrees, and just go for
fixing INSERT ... VALUES for the moment.

The good news is your time was not wasted, since you now know a lot more
about pieces of the system beyond the parser than you would've without
making this venture ;-).

            regards, tom lane

Re: multiple inserts

From
Liam Stewart
Date:
I'm in agreement about postponing the targetList changes; I don't like
the third attempt and knew before I started that the patch would be ugly
and would leave many things dangling.

But following up on Tom's latest suggestion, I've implemented it in the
yacc grammer rather than in the parser. I prefer this method as there is
less to change in analyze.c and it seems safer - having the parser
construct something more or less automatically and then re-using
existing functionality rather than having to build up Queries and SetOp
trees by hand (can't really use existing methods). The patch is short
actually simplifies things a bit as the targetlist field of InsertStmt
is no longer needed.

The only problem that I've encountered is that the rules regression test
is failing on the last select statement - the rule definitions are
displayed using SELECTs rather than VALUES. They are equivalent, but is
that re-writing alright? Looking at get_insert_query_def() in
ruleutils.c, it seems that this change is unavoidable if INSERT ...
VALUES are rewritten (any way) as INSERT ... SELECT, in which case
either the implementation on the patch needs to be reconsidered or the
regression test expected output needs to be modified.

Liam

--
Liam Stewart :: Red Hat Canada, Ltd. :: liams@redhat.com

Attachment

Re: multiple inserts

From
Tom Lane
Date:
Liam Stewart <liams@redhat.com> writes:
> But following up on Tom's latest suggestion, I've implemented it in the
> yacc grammer rather than in the parser.

Oh, that's a cool idea ... I think.  Did you take care that coercion of
values entries still happens correctly?  Given the way that UNION type
resolution is done, UNIONing first and coercing afterwards will not
produce the behavior we want.  An example: suppose f1 is numeric.

INSERT INTO foo(f1) VALUES ('11'), ('22.3');

This should coerce both unknown-type literals to numeric without complaint.
However,

INSERT INTO foo(f1) SELECT '11' UNION ALL SELECT '22.3';

will fail, since the UNION outputs will be resolved as type "text" for
lack of any context to decide otherwise, and then it's too late to go
back and make them numeric instead.  The knowledge of the intended
destination column types needs to get in there before the UNION type
resolution code is run.

Currently, there's a special-case hack in transformInsertStmt that fixes
this problem for unknown-type literals in the simple INSERT...SELECT
case.  That won't work as-is for a UNION, but maybe you could resolve
both issues if you reached in and transformed the targetlist column
types before allowing normal transformation of the SELECT part to
proceed.

This is all pretty ugly in any case :-( ... we're really starting to
push the limits of what we can accomplish without redoing the Querytree
datastructure.


> The only problem that I've encountered is that the rules regression test
> is failing on the last select statement - the rule definitions are
> displayed using SELECTs rather than VALUES. They are equivalent, but is
> that re-writing alright?

I think this is fine; changing the expected output of that regression
test is perfectly okay (and happens regularly).  I'm more concerned
about the datatype coercion issue.

            regards, tom lane

Re: multiple inserts

From
Liam Stewart
Date:
On Fri, Sep 07, 2001 at 11:28:21AM -0400, Tom Lane wrote:
> Liam Stewart <liams@redhat.com> writes:
> > But following up on Tom's latest suggestion, I've implemented it in the
> > yacc grammer rather than in the parser.
>
> Oh, that's a cool idea ... I think.  Did you take care that coercion of
> values entries still happens correctly?  Given the way that UNION type
> resolution is done, UNIONing first and coercing afterwards will not
> produce the behavior we want.  An example: suppose f1 is numeric.

Erk.. hmmm..yes, it's busted :( I'll take a look at that.

> This is all pretty ugly in any case :-( ... we're really starting to
> push the limits of what we can accomplish without redoing the Querytree
> datastructure.

Agreed.

Liam

--
Liam Stewart :: Red Hat Canada, Ltd. :: liams@redhat.com