Re: [HACKERS] Some progress on INSERT/SELECT/GROUP BY bugs - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [HACKERS] Some progress on INSERT/SELECT/GROUP BY bugs |
Date | |
Msg-id | 199909182136.RAA19883@candle.pha.pa.us Whole thread Raw |
In response to | Some progress on INSERT/SELECT/GROUP BY bugs (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] Some progress on INSERT/SELECT/GROUP BY bugs
|
List | pgsql-hackers |
Tom, is this fixed? > I believe I've identified the main cause of the peculiar behavior we > are seeing with INSERT ... SELECT ... GROUP/ORDER BY: it's a subtle > parser bug. > > Here is the test case I'm looking at: > > CREATE TABLE si_tmpverifyaccountbalances ( > type int4 NOT NULL, > memberid int4 NOT NULL, > categoriesid int4 NOT NULL, > amount numeric); > > CREATE TABLE invoicelinedetails ( > invoiceid int4, > memberid int4, > totshippinghandling numeric, > invoicelinesid int4); > > INSERT INTO si_tmpverifyaccountbalances SELECT invoiceid+3, > memberid, 1, totshippinghandling FROM invoicelinedetails > GROUP BY invoiceid+3, memberid, totshippinghandling; > > ERROR: INSERT has more expressions than target columns > > The reason this is coming out is that the matching of GROUP BY (also > ORDER BY) items to targetlist entries is fundamentally broken in this > context. The GROUP BY items "memberid" and "totshippinghandling" are > simply unvarnished Ident nodes when they arrive at findTargetlistEntry() > in parse_clause.c; what findTargetlistEntry() does with them is to try > to match them against the resdom names of the existing targetlist items. > I think that's correct behavior in the plain SELECT case (but note it > means "SELECT a AS b, b AS c GROUP BY b" will really group by a not b > --- is that per spec??). But it fails miserably in the INSERT/SELECT > case, because by the time control gets here, the targetlist items have > been given resdom names *corresponding to the column names of the target > table*. > > So, in the example at hand, "memberid" is matched to the correct column > by pure luck (because it has the same name in the destination table), > and then "totshippinghandling" is not recognized as one of the existing > TLEs because it does not match any destination column name. > > Now, call me silly, but it seems to me that SELECT ... GROUP BY ought > to mean the same thing no matter whether there is an INSERT in front of > it or not, and thus that letting target column names affect the meaning > of GROUP BY items is dead wrong. (Don't have a spec to check this with, > however.) > > I believe the most reasonable fix for this is to postpone relabeling > of the targetlist entries with destination column names until after > analysis of the SELECT's subsidiary clauses is complete. In particular, > it should *not* be done instantly when each TLE is made, which is what > MakeTargetEntryIdent currently does. The TLEs should have the same > resnames as in the SELECT case until after subsidiary clause processing > is done. > > (MakeTargetEntryIdent is broken anyway because it tries to associate > a destination column with every TLE, even the resjunk ones. The reason > we see the quoted error message in this situation is that after > findTargetlistEntry fails to detect that totshippinghandling is already > a TLE, it calls MakeTargetEntryIdent to make a junk TLE for > totshippinghandling, and then MakeTargetEntryIdent tries to find a > target column to go with the junk TLE. So the revised code should only > assign dest column names to non-junk TLEs.) > > I'm not really familiar enough with the parser to want to tackle this > size of change by myself --- Thomas, do you want to do it? I think it's > largely a matter of moving code around, but I'm not sure where is the > right place for it... > > regards, tom lane > > -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
pgsql-hackers by date: