Re: [PATCH] Keeps tracking the uniqueness with UniqueKey - Mailing list pgsql-hackers

From Andy Fan
Subject Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Date
Msg-id CAKU4AWpFmE8DLiFxGrE4SD=5+Semj7o_-pWuHxVq8PuxRdMWxw@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Andy Fan <zhihui.fan1213@gmail.com>)
List pgsql-hackers
Hi Ashutosh:

All your suggestions are followed except the UNION ALL one. I replied the reason
below.  For the suggestions about partitioned table,  looks lot of cases to handle, so
I summarize/enrich your idea in README and email thread,  we can continue
to talk about that.  



+
+    foreach(lc,  unionrel->reltarget->exprs)
+    {
+        exprs = lappend(exprs, lfirst(lc));
+        colnos = lappend_int(colnos, i);
+        i++;
+    }

This should be only possible when it's not UNION ALL. We should add some assert
or protection for that.

OK, actually I called this function in generate_union_paths. which handle
UNION case only.  I will add the Assert anyway. 
 
 
Finally I found I have to add one more parameter to populate_unionrel_uniquekeys, and
the only usage of that parameter is used to Assert, so I didn't do that at last. 
 

+
+    /* Fast path */
+    if (innerrel->uniquekeys == NIL || outerrel->uniquekeys == NIL)
+        return;
+
+    outer_is_onerow = relation_is_onerow(outerrel);
+    inner_is_onerow = relation_is_onerow(innerrel);
+
+    outerrel_ukey_ctx = initililze_uniquecontext_for_joinrel(joinrel,
outerrel);
+    innerrel_ukey_ctx = initililze_uniquecontext_for_joinrel(joinrel,
innerrel);
+
+    clause_list = gather_mergeable_joinclauses(joinrel, outerrel, innerrel,
+                                               restrictlist, jointype);

Something similar happens in select_mergejoin_clauses().

I didn't realized this before.  I will refactor this code accordingly if necessary
after reading that. 
 

 I reused select_mergejoin_clauses and removed the duplicated code in uniquekeys.c
in v8. 

At least from the
first reading of this patch, I get an impression that all these functions which
are going through clause lists and index lists should be merged into other
functions which go through these lists hunting for some information useful to
the optimizer. 
+
+
+    if (innerrel_keeps_unique(root, outerrel, innerrel, clause_list, false))
+    {
+        foreach(lc, innerrel_ukey_ctx)
+        {
+            UniqueKeyContext ctx = (UniqueKeyContext)lfirst(lc);
+            if (!list_is_subset(ctx->uniquekey->exprs,
joinrel->reltarget->exprs))
+            {
+                /* The UniqueKey on baserel is not useful on the joinrel */

A joining relation need not be a base rel always, it could be a join rel as
well.

good catch. 

Fixed. 
 
 

+                ctx->useful = false;
+                continue;
+            }
+            if ((jointype == JOIN_LEFT || jointype == JOIN_FULL) &&
!ctx->uniquekey->multi_nullvals)
+            {
+                /* Change the multi_nullvals to true at this case */

Need a comment explaining this. Generally, I feel, this and other functions in
this file need good comments explaining the logic esp. "why" instead of "what".
 
Exactly. 

Done in v8. 


Will continue reviewing your new set of patches as time permits.

Thank you!  Actually there is no big difference between v6 and v7 regarding the
 UniqueKey part except 2 bug fix.  However v7 has some more documents, 
comments improvement and code refactor/split, which may be helpful
for review. You may try v7 next time if v8 has not come yet:) 


v8 has come :) 

Best Regards
Andy Fan
 

pgsql-hackers by date:

Previous
From: Andy Fan
Date:
Subject: Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Next
From: Alvaro Herrera
Date:
Subject: Re: Add "-Wimplicit-fallthrough" to default flags