Thread: GROUP BY fixes committed
I just committed a rewrite of union_planner, make_groupPlan, and related routines that corrects several of the bugs we've been seeing lately. In particular, cases involving nontrivial GROUP BY expressions work again. The core of the problem was that the EXCEPT/HAVING patch broke some extremely delicate (and quite undocumented) interactions between these routines. I decided rewrite was better than (another layer of) patch, especially since I could document along the way. There are closely associated bugs in the rewriter and parser that I have not gone after. Jan's example still fails: CREATE TABLE t1 (a int4, b int4); CREATE VIEW v1 AS SELECT b, count(b) FROM t1 GROUP BY b; SELECT count FROM v1; because the rewriter is mislabeling both the target column 'count' and the group-by column 'b' with resno 1. More interestingly, given the same view SELECT b FROM v1; also fails, even though there is no resno conflict. The problem in this case is that the query is marked hasAggs, even though all the aggregates have been optimized out. By the time the planner realizes that there are not in fact any aggregates, it's too late to recover easily, so for now I just made it report an error. Jan, how hard would it be to make the rewriter tell the truth in this case? Also, the problem Michael Davis reported on 4/29 seems to be in the parser: insert into si_tmpVerifyAccountBalances select invoiceid+3, memberid, 1, TotShippingHandling from InvoiceLineDetails where TotShippingHandling <> 0 and InvoiceLinesID <= 100 group by invoiceid+3, memberid, TotShippingHandling; ERROR: INSERT has more expressions than target columns since that error message appears only in the parser. Thomas, did you change anything recently in parsing of INSERT ... SELECT? regards, tom lane
> Also, the problem Michael Davis reported on 4/29 seems to be in the > parser: > insert into si_tmpVerifyAccountBalances select invoiceid+3, memberid, 1, > TotShippingHandling from InvoiceLineDetails where TotShippingHandling <> 0 > and InvoiceLinesID <= 100 group by invoiceid+3, memberid, > TotShippingHandling; > ERROR: INSERT has more expressions than target columns > since that error message appears only in the parser. Thomas, did you > change anything recently in parsing of INSERT ... SELECT? Not that I know of. And I'm not sure what you mean by "recently". I've looked at the CVS logs but those don't help much because, especially for patches submitted by others, there is a generic description entered into the log which can't possibly describe the fixes in the particular file. *sigh* Anyway, I'd lost the thread. Did Michael say that this is a recently-introduced problem? - Tom -- Thomas Lockhart lockhart@alumni.caltech.edu South Pasadena, California
Nice summary. Where are we on the last item. > I just committed a rewrite of union_planner, make_groupPlan, and > related routines that corrects several of the bugs we've been > seeing lately. In particular, cases involving nontrivial GROUP BY > expressions work again. The core of the problem was that the > EXCEPT/HAVING patch broke some extremely delicate (and quite > undocumented) interactions between these routines. I decided > rewrite was better than (another layer of) patch, especially > since I could document along the way. > > There are closely associated bugs in the rewriter and parser that > I have not gone after. Jan's example still fails: > > CREATE TABLE t1 (a int4, b int4); > CREATE VIEW v1 AS SELECT b, count(b) FROM t1 GROUP BY b; > > SELECT count FROM v1; > > because the rewriter is mislabeling both the target column 'count' > and the group-by column 'b' with resno 1. More interestingly, > given the same view > > SELECT b FROM v1; > > also fails, even though there is no resno conflict. The problem in > this case is that the query is marked hasAggs, even though all the > aggregates have been optimized out. By the time the planner realizes > that there are not in fact any aggregates, it's too late to recover > easily, so for now I just made it report an error. Jan, how hard would > it be to make the rewriter tell the truth in this case? > > Also, the problem Michael Davis reported on 4/29 seems to be in the > parser: > > insert into si_tmpVerifyAccountBalances select invoiceid+3, memberid, 1, > TotShippingHandling from InvoiceLineDetails where TotShippingHandling <> 0 > and InvoiceLinesID <= 100 group by invoiceid+3, memberid, > TotShippingHandling; > ERROR: INSERT has more expressions than target columns > > since that error message appears only in the parser. Thomas, did you > change anything recently in parsing of INSERT ... SELECT? > > regards, tom lane > > -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <maillist@candle.pha.pa.us> writes: > Nice summary. Where are we on the last item. It was still broken as of a day or two ago. I poked at it some more, and concluded that INSERT ... SELECT is pretty broken when the SELECT includes GROUP BY. I didn't try to delve into the code though, just experimented with different commands. Here are my notes: TEST CONDITION: CREATE TABLE "si_tmpverifyaccountbalances" ( "type" int4 NOT NULL, "memberid" int4 NOT NULL, "categoriesid"int4 NOT NULL, "amount" numeric); CREATE TABLE "invoicelinedetails" ( "invoiceid" int4, "memberid" int4, "totshippinghandling" numeric, "invoicelinesid" int4); ACCEPTED: insert into si_tmpVerifyAccountBalances select invoiceid+3, memberid, 1, TotShippingHandling from InvoiceLineDetails group by invoiceid+3, memberid; NOT ACCEPTED: insert into si_tmpVerifyAccountBalances select invoiceid+3, memberid, 1, TotShippingHandling from InvoiceLineDetails group by invoiceid+3, memberid, TotShippingHandling; Probably error check is including GROUP BY targets in its count of things-to-be-inserted :-(. The behavior is quite inconsistent though. Also, why doesn't the first example get rejected, since TotShippingHandling is neither GROUP BY nor an aggregate?? regards, tom lane
> ACCEPTED: > insert into si_tmpVerifyAccountBalances select invoiceid+3, > memberid, 1, TotShippingHandling from InvoiceLineDetails > group by invoiceid+3, memberid; > > NOT ACCEPTED: > insert into si_tmpVerifyAccountBalances select invoiceid+3, > memberid, 1, TotShippingHandling from InvoiceLineDetails > group by invoiceid+3, memberid, TotShippingHandling; > > Probably error check is including GROUP BY targets in its count of > things-to-be-inserted :-(. The behavior is quite inconsistent though. > Also, why doesn't the first example get rejected, since > TotShippingHandling is neither GROUP BY nor an aggregate?? Yikes. We check to make sure all non-agg columns are referenced in GROUP BY, but not that all GROUP BY's are in target list, perhaps? -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026