Thread: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.

BUG #5025: Aggregate function with subquery in 8.3 and 8.4.

From
"Sheng Y. Cheng"
Date:
The following bug has been logged online:

Bug reference:      5025
Logged by:          Sheng Y. Cheng
Email address:      scheng@adconion.com
PostgreSQL version: 8.4.0 / 8.3.1
Operating system:   Red Hat 4.1.1-52
Description:        Aggregate function with subquery in 8.3 and 8.4.
Details:

Here are some facts and questions about the aggregate function with
subquery
in 8.3 and 8.4.

================= Question 1. ==================
I though the following query would give me the same results in 8.4.0 and
8.3.1.

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv

BEGIN;
SELECT version();
CREATE TEMPORARY TABLE t1 (f1 text ) on commit drop ;
CREATE TEMPORARY TABLE t2 (f1 text ) on commit drop ;
INSERT INTO t1 (f1) VALUES ('aaa');
INSERT INTO t1 (f1) VALUES ('bbb');
INSERT INTO t1 (f1) VALUES ('ccc');
INSERT INTO t2 (f1) VALUES ('bbb');

SELECT t1.f1, COUNT(ts.*) FROM
  t1
LEFT JOIN
  (SELECT
    CASE WHEN f1 = '111'
      THEN '111'
    ELSE
      f1
    END
   FROM t2) AS ts
ON
 t1.f1 = ts.f1
GROUP BY
 t1.f1;

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

However, In 8.3.1 I got the following.


vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv

BEGIN
                                                  version


----------------------------------------------------------------------------

-------------------------------
 PostgreSQL 8.3.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.1 20070105 (Red Hat 4.1.1-52)
(1 row)

CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
 f1  | count
-----+-------
 aaa |     0
 bbb |     1
 ccc |     0
(3 rows)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Whereas, in 8.4.0 I got the following.

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv

BEGIN
                                                      version


----------------------------------------------------------------------------

---------------------------------------
 PostgreSQL 8.4.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.1 20070105 (Red Hat 4.1.1-52), 64-bit
(1 row)

CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
 f1  | count
-----+-------
 aaa |     1
 bbb |     1
 ccc |     1
(3 rows)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The Session 4.2.7. Aggregate Expressions in 8.3 document at
http://www.postgresql.org/docs/8.3/static/sql-expressions.html states "The
last form invokes the aggregate once for each input row regardless of null
or non-null values." I am wondering if the result I saw from 8.4.0 is a bug
fix for 8.3.1?


================= Question 2. ==================
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv

BEGIN;
SELECT version();
CREATE TEMPORARY TABLE t1 (f1 text ) on commit drop ;
CREATE TEMPORARY TABLE t2 (f1 text ) on commit drop ;
INSERT INTO t1 (f1) VALUES ('aaa');
INSERT INTO t1 (f1) VALUES ('bbb');
INSERT INTO t1 (f1) VALUES ('ccc');
INSERT INTO t2 (f1) VALUES ('bbb');

SELECT t1.f1, COUNT(ts.*) FROM
  t1
LEFT JOIN
  (SELECT
     f1
   FROM t2) AS ts
ON
  t1.f1 = ts.f1
GROUP BY
  t1.f1;

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I though the result of the above query would be the following.

 f1  | count
-----+-------
 aaa |     0
 bbb |     1
 ccc |     0

however, I got the following in both 8.4.0 and 8.3.1.

Result from 8.3.1.
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv

BEGIN
                                                  version


----------------------------------------------------------------------------

-------------------------------
 PostgreSQL 8.3.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.1 20070105 (Red Hat 4.1.1-52)
(1 row)

CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
 f1  | count
-----+-------
 aaa |     1
 bbb |     1
 ccc |     1
(3 rows)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Result from 8.4.0.

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv

BEGIN
                                                      version


----------------------------------------------------------------------------

---------------------------------------
 PostgreSQL 8.4.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.1 20070105 (Red Hat 4.1.1-52), 64-bit
(1 row)

CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
 f1  | count
-----+-------
 aaa |     1
 bbb |     1
 ccc |     1
(3 rows)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Is this how Postgres works for aggregate function?


Thank you,

Sheng

Re: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.

From
Heikki Linnakangas
Date:
Sheng Y. Cheng wrote:
> The Session 4.2.7. Aggregate Expressions in 8.3 document at
> http://www.postgresql.org/docs/8.3/static/sql-expressions.html states "The
> last form invokes the aggregate once for each input row regardless of null
> or non-null values." I am wondering if the result I saw from 8.4.0 is a bug
> fix for 8.3.1?

Well, a COUNT(ts.*) is in fact not of the last form, but the first.
"ts.*" is a whole-row reference to t2, like just "ts" would be (as in
"COUNT(t2)").

But there indeed seems to be something wrong. The query can be reduced into:

SELECT t1.f1, COUNT(ts) FROM t1
LEFT JOIN
  (SELECT f1 As f1 FROM t2 OFFSET 0) AS ts
ON t1.f1 = ts.f1
GROUP BY t1.f1;

With this you can reproduce the discrepancy in CVS HEAD alone - the
query produces a different result if you remove the "OFFSET 0":

postgres=# SELECT t1.f1, COUNT(ts) FROM t1
postgres-# LEFT JOIN
postgres-#   (SELECT f1 As f1 FROM t2 OFFSET 0) AS ts
postgres-# ON t1.f1 = ts.f1
postgres-# GROUP BY t1.f1;
 f1  | count
-----+-------
 aaa |     0
 bbb |     1
 ccc |     0
(3 rows)

postgres=# SELECT t1.f1, COUNT(ts) FROM t1
LEFT JOIN
  (SELECT f1 As f1 FROM t2 /* OFFSET 0 */) AS ts
ON t1.f1 = ts.f1
GROUP BY t1.f1;
 f1  | count
-----+-------
 aaa |     1
 bbb |     1
 ccc |     1
(3 rows)

Without the OFFSET, the subquery is "pulled up" into the top query, and
that optimization makes the difference. PostgreSQL 8.4 is more
aggressive at that optimization, which is why you saw different results
on 8.3 and 8.4.

The "pullup" code transforms the "ts" reference into a ROW constructor:

postgres=# explain verbose SELECT t1.f1, COUNT(ts) FROM t1
LEFT JOIN
  (SELECT f1 As f1 FROM t2 /* OFFSET 0 */) AS ts
ON t1.f1 = ts.f1
GROUP BY t1.f1;
                                   QUERY PLAN

--------------------------------------------------------------------------------
 GroupAggregate  (cost=181.86..362.51 rows=200 width=64)
   Output: t1.f1, count(ROW(t2.f1))
...

That transformation isn't 100% accurate. A ROW expression with all NULL
columns is not the same thing as a NULL whole-row expression when it
comes to STRICT functions - a strict function is invoked with the
former, but not the latter, even though both return true for an IS NULL
test.

That let's us write the test case as:

CREATE FUNCTION stricttest (a anyelement) RETURNS boolean AS $$ SELECT
true; $$ LANGUAGE SQL STRICT;

postgres=#  SELECT t1.f1, stricttest(ts) FROM t1
LEFT JOIN
  (SELECT f1 As f1 FROM t2 OFFSET 0) AS ts
ON t1.f1 = ts.f1;
 f1  | stricttest
-----+------------
 aaa |
 bbb | t
 ccc |
(3 rows)

postgres=#  SELECT t1.f1, stricttest(ts) FROM t1
LEFT JOIN
  (SELECT f1 As f1 FROM t2 /* OFFSET 0 */) AS ts
ON t1.f1 = ts.f1;
 f1  | stricttest
-----+------------
 aaa | t
 bbb | t
 ccc | t
(3 rows)

I can see two possible interpretations for this:

1. The subquery pull-up code is broken, the transformation of a
whole-row reference to ROW(...) is not valid.

2. The semantics of STRICT with row arguments is broken. It should be
made consistent with IS NULL. Strict function should not be called if
the argument is a row value with all NULL columns.

I'm not sure which interpretation is correct. Thoughts? The SQL spec
probably has something to say about this.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.

From
Hitoshi Harada
Date:
2009/9/1 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>:
> I can see two possible interpretations for this:
>
> 1. The subquery pull-up code is broken, the transformation of a
> whole-row reference to ROW(...) is not valid.
>
> 2. The semantics of STRICT with row arguments is broken. It should be
> made consistent with IS NULL. Strict function should not be called if
> the argument is a row value with all NULL columns.
>
> I'm not sure which interpretation is correct. Thoughts? The SQL spec
> probably has something to say about this.

I suppose ts.* that wasn't joined is NULL. Not "a row value with all
NULL columns" but "a NULL row value".

contrib_regression=# SELECT t1.f1, ts.* IS NULL, ts.* FROM
 t1
LEFT JOIN
 (SELECT
   f1
 FROM t2
--  offset 0
 ) AS ts
ON
 t1.f1 = ts.f1
;
 f1  | ?column? | f1
-----+----------+-----
 aaa | t        |
 bbb | f        | bbb
 ccc | t        |
(3 rows)

So the 1. ROW(...) construction seems not valid.

>
> --
>  Heikki Linnakangas
>  EnterpriseDB   http://www.enterprisedb.com
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

Regards,

--
Hitoshi Harada

Re: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.

From
Tom Lane
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> I can see two possible interpretations for this:

> 1. The subquery pull-up code is broken, the transformation of a
> whole-row reference to ROW(...) is not valid.

I think the problem is probably that we need a PlaceHolderVar wrapper
around the ROW() constructor.

> 2. The semantics of STRICT with row arguments is broken. It should be
> made consistent with IS NULL.

Well, that's a different argument.  The immediate problem is that this
case doesn't behave consistently with pre-8.4 behavior, which was not
an intended change --- so I think we'd better make it work like before.

            regards, tom lane

Re: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.

From
Merlin Moncure
Date:
On Tue, Sep 1, 2009 at 4:11 AM, Heikki
Linnakangas<heikki.linnakangas@enterprisedb.com> wrote:
> 2. The semantics of STRICT with row arguments is broken. It should be
> made consistent with IS NULL. Strict function should not be called if
> the argument is a row value with all NULL columns.

not just STRICT, but coalesce(), libpq 'is null' bit, plpgsql row
variables, type input/output, joins, etc.  see recent threads on
-hackers and -bugs note that fixing this would break code in the field
(like mine for example) :-).

merlin

Re: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.

From
Tom Lane
Date:
I wrote:
> I think the problem is probably that we need a PlaceHolderVar wrapper
> around the ROW() constructor.

I looked into this a bit more.  The issue is that flattening of
subqueries that are inside an outer join first puts PlaceHolderVars
around non-nullable output expressions of the subquery, and then applies
ResolveNew() to substitute those expressions for upper references to the
subquery outputs.  However, a whole-row Var referencing the subquery is
expanded by ResolveNew into a ROW() expression over the subquery
outputs.  To preserve compatibility with pre-8.4 behavior, what we
really need here is to have a PlaceHolderVar around the ROW(), not on
the individual expressions inside it.

While it's not that hard to put in the PlaceHolderVar, the easy way to
do it would require passing the PlannerInfo "root" struct to ResolveNew,
which goes well beyond my threshold of pain from a modularity
standpoint --- ResolveNew isn't part of the planner and has no business
using that struct.  ResolveNew's API is a study in ugliness already,
so I'm thinking it's time to bite the bullet and refactor it.  The
idea that comes to mind is to provide a callback function and "void *"
context argument, which ResolveNew would call upon finding a Var that
needs substitution.  The existing guts of ResolveNew would move into a
callback that is specific to rewriteHandler.c's uses, and we'd write a
different one for the planner.  The PlannerInfo link would be hidden
within the "void *" argument so we'd avoid exposing it to rewriter code.

Comments?

I believe BTW that there are related issues in other places where we
expand composites into RowExprs.  But the other places have been doing
that for awhile.  I think that for 8.4 our goals should be limited to
not changing the behavior compared to prior releases.  If any consensus
is reached on the general issue of how we want "row() is null" to
behave, then it'll be time to reconsider the other stuff.

            regards, tom lane

Re: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.

From
Tom Lane
Date:
I blithely opined:
> I believe BTW that there are related issues in other places where we
> expand composites into RowExprs.  But the other places have been doing
> that for awhile.  I think that for 8.4 our goals should be limited to
> not changing the behavior compared to prior releases.

So while I was testing my fix for this, I found out that that's more
complicated than I thought.  Consider these examples in the regression
database:

select t1.q2, count(t2.*)
from int8_tbl t1 left join int8_tbl t2 on (t1.q2 = t2.q1)
group by t1.q2 order by 1;

select t1.q2, count(t2.*)
from int8_tbl t1 left join (select * from int8_tbl) t2 on (t1.q2 = t2.q1)
group by t1.q2 order by 1;

select t1.q2, count(t2.*)
from int8_tbl t1 left join (select * from int8_tbl offset 0) t2 on (t1.q2 = t2.q1)
group by t1.q2 order by 1;

select t1.q2, count(t2.*)
from int8_tbl t1 left join
  (select q1, case when q2=1 then 1 else q2 end as q2 from int8_tbl) t2
  on (t1.q2 = t2.q1)
group by t1.q2 order by 1;

If you believe that "t2.*" should go to NULL in a join-extended row,
then the correct answer for all four of these is

        q2         | count
-------------------+-------
 -4567890123456789 |     0
               123 |     2
               456 |     0
  4567890123456789 |     6
(4 rows)

However, the actual behavior of every release since 8.0 has been that
the second case gives

        q2         | count
-------------------+-------
 -4567890123456789 |     1
               123 |     2
               456 |     1
  4567890123456789 |     6
(4 rows)

ie, t2.* fails to go to NULL because it's expanded as ROW(t2.q1,t2.q2).
The OFFSET 0 in the third case restores expected behavior by preventing
flattening of the subquery, and up till 8.4 the CASE expression in the
fourth case did too.

With the fix I was just about to apply, all four cases give the first
set of results.  This clearly satisfies the principle of least
astonishment, at least more nearly than what we have; but it equally
clearly is *not* going to restore 8.4 to work just like 8.3.

I'm inclined to apply the patch to 8.4 anyway, because it seems like a
bug fix.  I would consider patching further back except there's no
chance of making it work in older branches, at least not without
destabilizing them quite a bit (the PlaceHolderVar mechanism would have
to be back-ported).

It might be possible to fix the older branches by not flattening
subqueries that have whole-row references; but even that would take
nontrivial work, and it'd be sacrificing performance to fix a corner
case no one has previously complained about.  So I'm leaning to patching
8.4 and leaving the older branches alone.

Thoughts?

            regards, tom lane

Re: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> With the fix I was just about to apply, all four cases give the first
> set of results.  This clearly satisfies the principle of least
> astonishment, at least more nearly than what we have; but it equally
> clearly is *not* going to restore 8.4 to work just like 8.3.

Right, 8.3 had the same underlying problem, 8.4 just makes it more
visible as it's better at flattening subqueries.

> I'm inclined to apply the patch to 8.4 anyway, because it seems like a
> bug fix.  I would consider patching further back except there's no
> chance of making it work in older branches, at least not without
> destabilizing them quite a bit (the PlaceHolderVar mechanism would have
> to be back-ported).
>
> It might be possible to fix the older branches by not flattening
> subqueries that have whole-row references; but even that would take
> nontrivial work, and it'd be sacrificing performance to fix a corner
> case no one has previously complained about.  So I'm leaning to patching
> 8.4 and leaving the older branches alone.
>
> Thoughts?

Seems reasonable.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.

From
Tom Lane
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Tom Lane wrote:
>> With the fix I was just about to apply, all four cases give the first
>> set of results.  This clearly satisfies the principle of least
>> astonishment, at least more nearly than what we have; but it equally
>> clearly is *not* going to restore 8.4 to work just like 8.3.

> Right, 8.3 had the same underlying problem, 8.4 just makes it more
> visible as it's better at flattening subqueries.

What is interesting is that the CASE in the OP's original submission
is apparently only there to dodge the visible-since-8.0 version of
the problem; at least I can't see that it does anything else useful.
The complaint apparently is not so much that 8.3 was right, as that
the workaround for its bug stopped working ...

>> ...  So I'm leaning to patching
>> 8.4 and leaving the older branches alone.
>>
>> Thoughts?

> Seems reasonable.

Will apply fix shortly --- I thought of one minor improvement to
make (the code as it stands is generating redundant PlaceHolderVars).

            regards, tom lane

Re: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> > Tom Lane wrote:
> >> With the fix I was just about to apply, all four cases give the first
> >> set of results.  This clearly satisfies the principle of least
> >> astonishment, at least more nearly than what we have; but it equally
> >> clearly is *not* going to restore 8.4 to work just like 8.3.
>
> > Right, 8.3 had the same underlying problem, 8.4 just makes it more
> > visible as it's better at flattening subqueries.
>
> What is interesting is that the CASE in the OP's original submission
> is apparently only there to dodge the visible-since-8.0 version of
> the problem; at least I can't see that it does anything else useful.
> The complaint apparently is not so much that 8.3 was right, as that
> the workaround for its bug stopped working ...

In that light, it probably doesn't make much sense to backport the fix
further back, given that the people (person?) bitten by the bug surely
must already be working around it.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> What is interesting is that the CASE in the OP's original submission
>> is apparently only there to dodge the visible-since-8.0 version of
>> the problem; at least I can't see that it does anything else useful.
>> The complaint apparently is not so much that 8.3 was right, as that
>> the workaround for its bug stopped working ...

> In that light, it probably doesn't make much sense to backport the fix
> further back, given that the people (person?) bitten by the bug surely
> must already be working around it.

Yeah, and there might even be people relying on the incorrect behavior.
I'm not too worried about changing it in 8.4 since that's such a young
release, but it's a bit harder to make a case for that in older
branches.

            regards, tom lane