Thread: GROUP BY fixes committed

GROUP BY fixes committed

From
Tom Lane
Date:
I just committed a rewrite of union_planner, make_groupPlan, and
related routines that corrects several of the bugs we've been
seeing lately.  In particular, cases involving nontrivial GROUP BY
expressions work again.  The core of the problem was that the
EXCEPT/HAVING patch broke some extremely delicate (and quite
undocumented) interactions between these routines.  I decided
rewrite was better than (another layer of) patch, especially
since I could document along the way.

There are closely associated bugs in the rewriter and parser that
I have not gone after.  Jan's example still fails:

CREATE TABLE t1 (a int4, b int4);
CREATE VIEW v1 AS SELECT b, count(b) FROM t1 GROUP BY b;

SELECT count FROM v1;

because the rewriter is mislabeling both the target column 'count'
and the group-by column 'b' with resno 1.  More interestingly,
given the same view

SELECT b FROM v1;

also fails, even though there is no resno conflict.  The problem in
this case is that the query is marked hasAggs, even though all the
aggregates have been optimized out.  By the time the planner realizes
that there are not in fact any aggregates, it's too late to recover
easily, so for now I just made it report an error.  Jan, how hard would
it be to make the rewriter tell the truth in this case?

Also, the problem Michael Davis reported on 4/29 seems to be in the
parser:

insert into si_tmpVerifyAccountBalances select invoiceid+3, memberid, 1,
TotShippingHandling from InvoiceLineDetails where TotShippingHandling <> 0
and InvoiceLinesID <= 100 group by invoiceid+3, memberid,
TotShippingHandling;
ERROR:  INSERT has more expressions than target columns

since that error message appears only in the parser.  Thomas, did you
change anything recently in parsing of INSERT ... SELECT?
        regards, tom lane


Re: [HACKERS] GROUP BY fixes committed

From
Thomas Lockhart
Date:
> Also, the problem Michael Davis reported on 4/29 seems to be in the
> parser:
> insert into si_tmpVerifyAccountBalances select invoiceid+3, memberid, 1,
> TotShippingHandling from InvoiceLineDetails where TotShippingHandling <> 0
> and InvoiceLinesID <= 100 group by invoiceid+3, memberid,
> TotShippingHandling;
> ERROR:  INSERT has more expressions than target columns
> since that error message appears only in the parser.  Thomas, did you
> change anything recently in parsing of INSERT ... SELECT?

Not that I know of. And I'm not sure what you mean by "recently". I've
looked at the CVS logs but those don't help much because, especially
for patches submitted by others, there is a generic description
entered into the log which can't possibly describe the fixes in the
particular file. *sigh*

Anyway, I'd lost the thread. Did Michael say that this is a
recently-introduced problem?
                    - Tom

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] GROUP BY fixes committed

From
Bruce Momjian
Date:

Nice summary.  Where are we on the last item.

> I just committed a rewrite of union_planner, make_groupPlan, and
> related routines that corrects several of the bugs we've been
> seeing lately.  In particular, cases involving nontrivial GROUP BY
> expressions work again.  The core of the problem was that the
> EXCEPT/HAVING patch broke some extremely delicate (and quite
> undocumented) interactions between these routines.  I decided
> rewrite was better than (another layer of) patch, especially
> since I could document along the way.
> 
> There are closely associated bugs in the rewriter and parser that
> I have not gone after.  Jan's example still fails:
> 
> CREATE TABLE t1 (a int4, b int4);
> CREATE VIEW v1 AS SELECT b, count(b) FROM t1 GROUP BY b;
> 
> SELECT count FROM v1;
> 
> because the rewriter is mislabeling both the target column 'count'
> and the group-by column 'b' with resno 1.  More interestingly,
> given the same view
> 
> SELECT b FROM v1;
> 
> also fails, even though there is no resno conflict.  The problem in
> this case is that the query is marked hasAggs, even though all the
> aggregates have been optimized out.  By the time the planner realizes
> that there are not in fact any aggregates, it's too late to recover
> easily, so for now I just made it report an error.  Jan, how hard would
> it be to make the rewriter tell the truth in this case?
> 
> Also, the problem Michael Davis reported on 4/29 seems to be in the
> parser:
> 
> insert into si_tmpVerifyAccountBalances select invoiceid+3, memberid, 1,
> TotShippingHandling from InvoiceLineDetails where TotShippingHandling <> 0
> and InvoiceLinesID <= 100 group by invoiceid+3, memberid,
> TotShippingHandling;
> ERROR:  INSERT has more expressions than target columns
> 
> since that error message appears only in the parser.  Thomas, did you
> change anything recently in parsing of INSERT ... SELECT?
> 
>             regards, tom lane
> 
> 


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] GROUP BY fixes committed

From
Tom Lane
Date:
Bruce Momjian <maillist@candle.pha.pa.us> writes:
> Nice summary.  Where are we on the last item.

It was still broken as of a day or two ago.

I poked at it some more, and concluded that INSERT ... SELECT is pretty
broken when the SELECT includes GROUP BY.  I didn't try to delve into
the code though, just experimented with different commands.  Here are my
notes:

TEST CONDITION:
CREATE TABLE "si_tmpverifyaccountbalances" (       "type" int4 NOT NULL,       "memberid" int4 NOT NULL,
"categoriesid"int4 NOT NULL,       "amount" numeric);
 
CREATE TABLE "invoicelinedetails" (       "invoiceid" int4,       "memberid" int4,       "totshippinghandling" numeric,
     "invoicelinesid" int4);
 

ACCEPTED:
insert into si_tmpVerifyAccountBalances select invoiceid+3,
memberid, 1, TotShippingHandling from InvoiceLineDetails
group by invoiceid+3, memberid;

NOT ACCEPTED:
insert into si_tmpVerifyAccountBalances select invoiceid+3,
memberid, 1, TotShippingHandling from InvoiceLineDetails
group by invoiceid+3, memberid, TotShippingHandling;

Probably error check is including GROUP BY targets in its count of
things-to-be-inserted :-(.  The behavior is quite inconsistent though.
Also, why doesn't the first example get rejected, since
TotShippingHandling is neither GROUP BY nor an aggregate??
        regards, tom lane


Re: [HACKERS] GROUP BY fixes committed

From
Bruce Momjian
Date:
> ACCEPTED:
> insert into si_tmpVerifyAccountBalances select invoiceid+3,
> memberid, 1, TotShippingHandling from InvoiceLineDetails
> group by invoiceid+3, memberid;
> 
> NOT ACCEPTED:
> insert into si_tmpVerifyAccountBalances select invoiceid+3,
> memberid, 1, TotShippingHandling from InvoiceLineDetails
> group by invoiceid+3, memberid, TotShippingHandling;
> 
> Probably error check is including GROUP BY targets in its count of
> things-to-be-inserted :-(.  The behavior is quite inconsistent though.
> Also, why doesn't the first example get rejected, since
> TotShippingHandling is neither GROUP BY nor an aggregate??

Yikes.  We check to make sure all non-agg columns are referenced in
GROUP BY, but not that all GROUP BY's are in target list, perhaps?

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026