Re: Getting different number of results when using hashjoin on/off - Mailing list pgsql-hackers
| From | Tom Lane |
|---|---|
| Subject | Re: Getting different number of results when using hashjoin on/off |
| Date | |
| Msg-id | 12475.1133197733@sss.pgh.pa.us Whole thread Raw |
| In response to | Re: Getting different number of results when using hashjoin on/off ("Mario Weilguni" <mario.weilguni@icomedias.com>) |
| List | pgsql-hackers |
"Mario Weilguni" <mario.weilguni@icomedias.com> writes:
> No, I'm using 8.1.0, and tried it on different machines, always the same results.
I see it, I think: the recent changes to avoid work when one or the
other side of the hash join is empty would exit the hash join leaving
a state that confused ExecReScanHashJoin() into thinking it didn't
have to do anything. Try the attached patch.
regards, tom lane
Index: src/backend/executor/nodeHashjoin.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/executor/nodeHashjoin.c,v
retrieving revision 1.75.2.1
diff -c -r1.75.2.1 nodeHashjoin.c
*** src/backend/executor/nodeHashjoin.c 22 Nov 2005 18:23:09 -0000 1.75.2.1
--- src/backend/executor/nodeHashjoin.c 28 Nov 2005 17:04:43 -0000
***************
*** 152,163 **** * outer join, we can quit without scanning the outer relation. */ if
(hashtable->totalTuples== 0 && node->js.jointype != JOIN_LEFT)
- {
- ExecHashTableDestroy(hashtable);
- node->hj_HashTable = NULL;
- node->hj_FirstOuterTupleSlot = NULL; return NULL;
- } /* * need to remember whether nbatch has increased since we began
--- 152,158 ----
***************
*** 487,493 **** { ExecHashTableDestroy(node->hj_HashTable); node->hj_HashTable = NULL;
- node->hj_FirstOuterTupleSlot = NULL; } /*
--- 482,487 ----
***************
*** 805,841 **** ExecReScanHashJoin(HashJoinState *node, ExprContext *exprCtxt) { /*
- * If we haven't yet built the hash table then we can just return; nothing
- * done yet, so nothing to undo.
- */
- if (node->hj_HashTable == NULL)
- return;
-
- /* * In a multi-batch join, we currently have to do rescans the hard way, * primarily because batch
tempfiles may have already been released. But * if it's a single-batch join, and there is no parameter change for
the * inner subnode, then we can just re-use the existing hash table without * rebuilding it. */
! if (node->hj_HashTable->nbatch == 1 &&
! ((PlanState *) node)->righttree->chgParam == NULL)
! {
! /* okay to reuse the hash table; needn't rescan inner, either */
! }
! else {
! /* must destroy and rebuild hash table */
! ExecHashTableDestroy(node->hj_HashTable);
! node->hj_HashTable = NULL;
! node->hj_FirstOuterTupleSlot = NULL;
! /*
! * if chgParam of subnode is not null then plan will be re-scanned by
! * first ExecProcNode.
! */
! if (((PlanState *) node)->righttree->chgParam == NULL)
! ExecReScan(((PlanState *) node)->righttree, exprCtxt); } /* Always reset intra-tuple state */
--- 799,830 ---- ExecReScanHashJoin(HashJoinState *node, ExprContext *exprCtxt) { /* * In a multi-batch join,
wecurrently have to do rescans the hard way, * primarily because batch temp files may have already been released.
But * if it's a single-batch join, and there is no parameter change for the * inner subnode, then we can just
re-usethe existing hash table without * rebuilding it. */
! if (node->hj_HashTable != NULL) {
! if (node->hj_HashTable->nbatch == 1 &&
! ((PlanState *) node)->righttree->chgParam == NULL)
! {
! /* okay to reuse the hash table; needn't rescan inner, either */
! }
! else
! {
! /* must destroy and rebuild hash table */
! ExecHashTableDestroy(node->hj_HashTable);
! node->hj_HashTable = NULL;
! /*
! * if chgParam of subnode is not null then plan will be re-scanned
! * by first ExecProcNode.
! */
! if (((PlanState *) node)->righttree->chgParam == NULL)
! ExecReScan(((PlanState *) node)->righttree, exprCtxt);
! } } /* Always reset intra-tuple state */
***************
*** 847,852 ****
--- 836,842 ---- node->js.ps.ps_TupFromTlist = false; node->hj_NeedNewOuter = true; node->hj_MatchedOuter =
false;
+ node->hj_FirstOuterTupleSlot = NULL; /* * if chgParam of subnode is not null then plan will be
re-scannedby
pgsql-hackers by date: