Re: [PATCH] Erase the distinctClause if the result is unique bydefinition - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: [PATCH] Erase the distinctClause if the result is unique bydefinition
Date
Msg-id 20200211124311.GA14732@nol
Whole thread Raw
In response to Re: [PATCH] Erase the distinctClause if the result is unique by definition  (Andy Fan <zhihui.fan1213@gmail.com>)
List pgsql-hackers
On Tue, Feb 11, 2020 at 08:14:14PM +0800, Andy Fan wrote:
> On Tue, Feb 11, 2020 at 3:56 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> 
> > >
> > > and if we prepare sql outside a transaction, and execute it in the
> > > transaction, the other session can't drop the constraint until the
> > > transaction is ended.
> >
> > And what if you create a view on top of a query containing a distinct
> > clause
> > rather than using prepared statements?  FWIW your patch doesn't handle such
> > case at all, without even needing to drop constraints:
> >
> > CREATE TABLE t (a int primary key, b int not null,  c int);
> > INSERT INTO t VALUEs(1, 1, 1), (2, 2, 2);
> > CREATE UNIQUE INDEX t_idx1 on t(b);
> > CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
> > EXPLAIN SELECT * FROM v1;
> > server closed the connection unexpectedly
> >         This probably means the server terminated abnormally
> >         before or while processing the request.
> >
> >
> This error can be fixed with
> 
> -       num_of_rtables = bms_num_members(non_semi_anti_relids);
> +       num_of_rtables = list_length(query->rtable);
> 
> This test case also be added into the patch.
> 
> 
> > I also think this is not the right way to handle this optimization.
> >
> 
> do you have any other concerns?

Yes, it seems to be broken as soon as you alter the view's underlying table:

=# CREATE TABLE t (a int primary key, b int not null,  c int);
CREATE TABLE

=# INSERT INTO t VALUEs(1, 1, 1), (2, 2, 2);
INSERT 0 2

=# CREATE UNIQUE INDEX t_idx1 on t(b);
CREATE INDEX

=# CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
CREATE VIEW

=# EXPLAIN SELECT * FROM v1;
                   QUERY PLAN
-------------------------------------------------
 Seq Scan on t  (cost=0.00..1.02 rows=2 width=4)
(1 row)

=# EXPLAIN SELECT DISTINCT b FROM t;
                   QUERY PLAN
-------------------------------------------------
 Seq Scan on t  (cost=0.00..1.02 rows=2 width=4)
(1 row)

=# ALTER TABLE t ALTER COLUMN b DROP NOT NULL;
ALTER TABLE

=# EXPLAIN SELECT * FROM v1;
                   QUERY PLAN
-------------------------------------------------
 Seq Scan on t  (cost=0.00..1.02 rows=2 width=4)
(1 row)

=# EXPLAIN SELECT DISTINCT b FROM t;
                         QUERY PLAN
-------------------------------------------------------------
 Unique  (cost=1.03..1.04 rows=2 width=4)
   ->  Sort  (cost=1.03..1.03 rows=2 width=4)
         Sort Key: b
         ->  Seq Scan on t  (cost=0.00..1.02 rows=2 width=4)
(4 rows)




pgsql-hackers by date:

Previous
From: Andy Fan
Date:
Subject: Re: [PATCH] Erase the distinctClause if the result is unique by definition
Next
From: Victor Yegorov
Date:
Subject: Re: open-source equivalent of golden-gate