Re: [PATCH] Erase the distinctClause if the result is unique by definition - Mailing list pgsql-hackers

From Andy Fan
Subject Re: [PATCH] Erase the distinctClause if the result is unique by definition
Date
Msg-id CAKU4AWqrsDv_o+eUpt4ywJnGtjR4Ned3_T9Q0KBA3B7x080gAA@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Erase the distinctClause if the result is unique by definition  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: [PATCH] Erase the distinctClause if the result is unique by definition
List pgsql-hackers


On Wed, Feb 12, 2020 at 12:36 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:


On Tue, Feb 11, 2020 at 8:27 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:


On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:



[PATCH] Erase the distinctClause if the result is unique by
 definition

I forgot to mention this in the last round of comments. Your patch was actually removing distictClause from the Query structure. Please avoid doing that. If you remove it, you are also removing the evidence that this Query had a DISTINCT clause in it.

Yes, I removed it because it is the easiest way to do it.  what is the purpose of keeping the evidence?

Julien's example provides an explanation for this. The Query structure is serialised into a view definition. Removing distinctClause from there means that the view will never try to produce unique results.
 
 
Actually it is not true.  If a view is used in the query,  the definition will be *copied*
into the query tree. so if we modify the query tree,  the definition of the view never
touched.  The issue of Julien reported is because of a typo error. 

-- session 2
postgres=# alter table t alter column b drop not null;
ALTER TABLE

-- session 1:
postgres=# explain execute st(1);
                         QUERY PLAN
-------------------------------------------------------------
 Unique  (cost=1.03..1.04 rows=1 width=4)
   ->  Sort  (cost=1.03..1.04 rows=1 width=4)
         Sort Key: b
         ->  Seq Scan on t  (cost=0.00..1.02 rows=1 width=4)
               Filter: (c = $1)
(5 rows)

Since this prepared statement is parameterised PostgreSQL is replanning it every time it gets executed. It's not using a stored prepared plan. Try without parameters. Also make sure that a prepared plan is used for execution and not a new plan.

Even for  parameterised prepared statement, it is still possible to generate an generic
plan. so it will not replanning every time.  But no matter generic plan or not,  after a DDL like
changing the NOT NULL constraints.   pg will generated a plan based on the stored query
tree.   However, the query tree will be *copied* again to generate a new plan. so even I 
modified the query tree,  everything will be ok as well. 

At last,  I am agreed with that modifying the query tree is not a good idea. 
so my updated patch doesn't use it any more.  

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Error on failed COMMIT
Next
From: Hubert Zhang
Date:
Subject: Re: Yet another vectorized engine