Re: BUG #13863: Select from views gives wrong results - Mailing list pgsql-bugs

From wrb
Subject Re: BUG #13863: Select from views gives wrong results
Date
Msg-id 82C006C0-07A7-401B-9BDA-0BABD9025099@autistici.org
Whole thread Raw
In response to Re: BUG #13863: Select from views gives wrong results  (Haribabu Kommi <kommi.haribabu@gmail.com>)
Responses Re: BUG #13863: Select from views gives wrong results
List pgsql-bugs
> On 14. 1. 2016, at 0:50, Haribabu Kommi <kommi.haribabu@gmail.com> =
wrote:
>=20
> View's don't have storage, so any query on the view is rewritten to
> the target relation.
> In the above scenario, the same happened and the where clause is =
pushed into
> the view select query. There is no row that is presented in the emp
> table satisfies
> the where clause that is provided. Because of this reason no data is =
selected.
>=20
> But where as with materialized view, it has the storage. The =
materialized view
> is populated with the query result during creation. So adding an where =
clause
> on materialized view satisfies with it and thus it returned the
> results. The number
> of records in the materialized view are 27 compared to the number of =
records in
> table are 7.
>=20
> The same with the CTE also, first the inner query is executed and on =
top of that
> result the outer query is executed. Because of this reason, the where =
clause is
> satisfied and the results are returned.

I understand why it happens - with normal group by, the predicate *can* =
be pushed down
unless it is on generated column. But with cube, every column is a =
geenrated column because
of the way cube works.

It works correctly unless cube is involved:

create table test (a int, b int);
create view v_test as select a, count(*) from test group by a;
create view v_test2 as select a, count(*) from test group by cube (a);
insert into test values (1, 1), (2, 2);

this doesn't get pushed down either as CTE or from view:

explain with t as (select a, count(*) from test group by a) select * =
from t where count =3D 1;
                              QUERY PLAN                             =20
----------------------------------------------------------------------
 CTE Scan on t  (cost=3D45.90..50.40 rows=3D1 width=3D12)
   Filter: (count =3D 1)
   CTE t
     ->  HashAggregate  (cost=3D43.90..45.90 rows=3D200 width=3D4)
           Group Key: test.a
           ->  Seq Scan on test  (cost=3D0.00..32.60 rows=3D2260 =
width=3D4)
(6 rows)

explain select * from v_test where count =3D 1;
                          QUERY PLAN                         =20
--------------------------------------------------------------
 HashAggregate  (cost=3D49.55..52.05 rows=3D200 width=3D4)
   Group Key: test.a
   Filter: (count(*) =3D 1)
   ->  Seq Scan on test  (cost=3D0.00..32.60 rows=3D2260 width=3D4)
(4 rows)


This does get pushed down correctly:

explain select * from v_test where a =3D 1;
                         QUERY PLAN                        =20
------------------------------------------------------------
 GroupAggregate  (cost=3D0.00..38.32 rows=3D1 width=3D4)
   Group Key: test.a
   ->  Seq Scan on test  (cost=3D0.00..38.25 rows=3D11 width=3D4)
         Filter: (a =3D 1)
(4 rows)


This doesn't get pushed down, but it's a known issue with CTEs being =
optimization boundary in postgres:

explain with t as (select a, count(*) from test group by a) select * =
from t where a =3D 1;
                              QUERY PLAN                             =20
----------------------------------------------------------------------
 CTE Scan on t  (cost=3D45.90..50.40 rows=3D1 width=3D12)
   Filter: (a =3D 1)
   CTE t
     ->  HashAggregate  (cost=3D43.90..45.90 rows=3D200 width=3D4)
           Group Key: test.a
           ->  Seq Scan on test  (cost=3D0.00..32.60 rows=3D2260 =
width=3D4)
(6 rows)

But when cube gets involved:

with t as (select a, count(*) from test group by cube (a)) select * from =
t where a is null;
 a | count=20
---+-------
   |     2

CTE still works, because it doesn't get pushed down:

explain with t as (select a, count(*) from test group by cube (a)) =
select * from t where a is null;
                                 QUERY PLAN                              =
  =20
=
--------------------------------------------------------------------------=
--
 CTE Scan on t  (cost=3D177.47..181.49 rows=3D1 width=3D12)
   Filter: (a IS NULL)
   CTE t
     ->  GroupAggregate  (cost=3D158.51..177.47 rows=3D201 width=3D4)
           Group Key: test.a
           Group Key: ()
           ->  Sort  (cost=3D158.51..164.16 rows=3D2260 width=3D4)
                 Sort Key: test.a
                 ->  Seq Scan on test  (cost=3D0.00..32.60 rows=3D2260 =
width=3D4)
(9 rows)

but with a view, it gets pushed down to the table, even though it =
doesn't make any sense for cube,=20
and gives incorrect results:

select * from v_test2 where a is null;
 a | count=20
---+-------
   |     0

explain select * from v_test2 where a is null;
                            QUERY PLAN                           =20
------------------------------------------------------------------
 GroupAggregate  (cost=3D32.79..32.89 rows=3D2 width=3D4)
   Group Key: test.a
   Group Key: ()
   Filter: (test.a IS NULL)
   ->  Sort  (cost=3D32.79..32.82 rows=3D11 width=3D4)
         Sort Key: test.a
         ->  Seq Scan on test  (cost=3D0.00..32.60 rows=3D11 width=3D4)
               Filter: (a IS NULL)



No, I'm pretty sure this is incorrect result, because views are supposed =
to provide logical data independence and running the same query through =
nested select, CTE and view should give the same result, right? If not, =
I'm pretty sure this is the first case where it doesn't work like this.

W

>=20
> This is just an usage problem.
>=20
> Regards,
> Hari Babu
> Fujitsu Australia

pgsql-bugs by date:

Previous
From: y.chaitanya@tcs.com
Date:
Subject: Re: BUG #13862: Duplicated rows for a table with primary key
Next
From: y.chaitanya@tcs.com
Date:
Subject: Re: BUG #13862: Duplicated rows for a table with primary key