Re: Getting different number of results when using hashjoin on/off - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Getting different number of results when using hashjoin on/off
Date
Msg-id 12475.1133197733@sss.pgh.pa.us
In response to Re: Getting different number of results when using hashjoin on/off  ("Mario Weilguni" <mario.weilguni@icomedias.com>)
List pgsql-hackers
"Mario Weilguni" <mario.weilguni@icomedias.com> writes:
> No, I'm using 8.1.0, and tried it on different machines, always the same results.

I see it, I think: the recent changes to avoid work when one or the
other side of the hash join is empty would exit the hash join leaving
a state that confused ExecReScanHashJoin() into thinking it didn't
have to do anything.  Try the attached patch.
        regards, tom lane
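
For readers following the explanation above, here is a toy model of that failure mode. It is a sketch under stated assumptions, not the PostgreSQL code: `ToyJoin`, `toy_scan`, `toy_rescan_buggy` and `toy_rescan_fixed` are hypothetical names, and the `outer_consumed` flag only stands in for the per-scan state that a rescan has to clear. The idea is that a rescan routine which returns early when no hash table exists also skips the resets that follow it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a join node; NOT PostgreSQL's HashJoinState. */
typedef struct ToyJoin
{
    bool has_hash_table;   /* plays the role of hj_HashTable != NULL */
    bool outer_consumed;   /* per-scan state that every rescan must clear */
    int  outer_rows;       /* rows the outer side yields on each scan */
} ToyJoin;

/* Run one scan and return the number of rows produced. */
static int
toy_scan(ToyJoin *j)
{
    assert(j != NULL);
    if (j->outer_consumed)
        return 0;               /* stale state: looks like an exhausted scan */
    j->outer_consumed = true;
    if (j->outer_rows == 0)
        return 0;               /* empty-input shortcut: table never built */
    j->has_hash_table = true;
    return j->outer_rows;
}

/* Pre-patch shape: bail out early when there is no hash table. */
static void
toy_rescan_buggy(ToyJoin *j)
{
    if (!j->has_hash_table)
        return;                 /* also skips the reset below */
    j->outer_consumed = false;
}

/* Post-patch shape: table handling is conditional, the reset is not. */
static void
toy_rescan_fixed(ToyJoin *j)
{
    if (j->has_hash_table)
    {
        /* reuse or rebuild the table here */
    }
    j->outer_consumed = false;  /* always reset per-scan state */
}
```

In this model, a first scan with an empty outer side ends without a hash table. If the outer side later has rows, a scan after `toy_rescan_buggy` still returns nothing, while a scan after `toy_rescan_fixed` returns them.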


Index: src/backend/executor/nodeHashjoin.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/executor/nodeHashjoin.c,v
retrieving revision 1.75.2.1
diff -c -r1.75.2.1 nodeHashjoin.c
*** src/backend/executor/nodeHashjoin.c    22 Nov 2005 18:23:09 -0000    1.75.2.1
--- src/backend/executor/nodeHashjoin.c    28 Nov 2005 17:04:43 -0000
***************
*** 152,163 ****
           * outer join, we can quit without scanning the outer relation.
           */
          if (hashtable->totalTuples == 0 && node->js.jointype != JOIN_LEFT)
-         {
-             ExecHashTableDestroy(hashtable);
-             node->hj_HashTable = NULL;
-             node->hj_FirstOuterTupleSlot = NULL;
              return NULL;
-         }

          /*
           * need to remember whether nbatch has increased since we began
--- 152,158 ----
***************
*** 487,493 ****
      {
          ExecHashTableDestroy(node->hj_HashTable);
          node->hj_HashTable = NULL;
-         node->hj_FirstOuterTupleSlot = NULL;
      }

      /*
--- 482,487 ----
***************
*** 805,841 ****
  ExecReScanHashJoin(HashJoinState *node, ExprContext *exprCtxt)
  {
      /*
-      * If we haven't yet built the hash table then we can just return; nothing
-      * done yet, so nothing to undo.
-      */
-     if (node->hj_HashTable == NULL)
-         return;
- 
-     /*
       * In a multi-batch join, we currently have to do rescans the hard way,
       * primarily because batch temp files may have already been released. But
       * if it's a single-batch join, and there is no parameter change for the
       * inner subnode, then we can just re-use the existing hash table without
       * rebuilding it.
       */
!     if (node->hj_HashTable->nbatch == 1 &&
!         ((PlanState *) node)->righttree->chgParam == NULL)
!     {
!         /* okay to reuse the hash table; needn't rescan inner, either */
!     }
!     else
      {
!         /* must destroy and rebuild hash table */
!         ExecHashTableDestroy(node->hj_HashTable);
!         node->hj_HashTable = NULL;
!         node->hj_FirstOuterTupleSlot = NULL;

!         /*
!          * if chgParam of subnode is not null then plan will be re-scanned by
!          * first ExecProcNode.
!          */
!         if (((PlanState *) node)->righttree->chgParam == NULL)
!             ExecReScan(((PlanState *) node)->righttree, exprCtxt);
      }

      /* Always reset intra-tuple state */
--- 799,830 ----
  ExecReScanHashJoin(HashJoinState *node, ExprContext *exprCtxt)
  {
      /*
       * In a multi-batch join, we currently have to do rescans the hard way,
       * primarily because batch temp files may have already been released. But
       * if it's a single-batch join, and there is no parameter change for the
       * inner subnode, then we can just re-use the existing hash table without
       * rebuilding it.
       */
!     if (node->hj_HashTable != NULL)
      {
!         if (node->hj_HashTable->nbatch == 1 &&
!             ((PlanState *) node)->righttree->chgParam == NULL)
!         {
!             /* okay to reuse the hash table; needn't rescan inner, either */
!         }
!         else
!         {
!             /* must destroy and rebuild hash table */
!             ExecHashTableDestroy(node->hj_HashTable);
!             node->hj_HashTable = NULL;

!             /*
!              * if chgParam of subnode is not null then plan will be re-scanned
!              * by first ExecProcNode.
!              */
!             if (((PlanState *) node)->righttree->chgParam == NULL)
!                 ExecReScan(((PlanState *) node)->righttree, exprCtxt);
!         }
      }

      /* Always reset intra-tuple state */
***************
*** 847,852 ****
--- 836,842 ----
      node->js.ps.ps_TupFromTlist = false;
      node->hj_NeedNewOuter = true;
      node->hj_MatchedOuter = false;
+     node->hj_FirstOuterTupleSlot = NULL;

      /*
       * if chgParam of subnode is not null then plan will be re-scanned by
 

