Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Daniel Bausch
Subject Re: Parallel Seq Scan
Date
Msg-id 8761bfdtjq.fsf@gelnhausen.dvs.informatik.tu-darmstadt.de
In response to Re: Parallel Seq Scan  (David Fetter <david@fetter.org>)
List pgsql-hackers
Hi David and others!

David Fetter <david@fetter.org> writes:

> On Tue, Jan 27, 2015 at 08:02:37AM +0100, Daniel Bausch wrote:
>>
>> Tom Lane <tgl@sss.pgh.pa.us> writes:
>>
>> >> Wait for first IO, issue second IO request
>> >> Compute
>> >> Already have second IO request, issue third
>> >> ...
>> >
>> >> We'd be a lot less sensitive to IO latency.
>> >
>> > It would take about five minutes of coding to prove or disprove this:
>> > stick a PrefetchBuffer call into heapgetpage() to launch a request for the
>> > next page as soon as we've read the current one, and then see if that
>> > makes any obvious performance difference.  I'm not convinced that it will,
>> > but if it did then we could think about how to make it work for real.
>>
>> Sorry for dropping in so late...
>>
>> I have done all this two years ago.  For TPC-H Q8, Q9, Q17, Q20, and Q21
>> I see a speedup of ~100% when using IndexScan prefetching + Nested-Loops
>> Look-Ahead (the outer loop!).
>> (On SSD with 32 Pages Prefetch/Look-Ahead + Cold Page Cache / Small RAM)
>
> Would you be so kind as to pass along any patches (ideally applicable
> to git master), tests, and specific measurements you made?

Attached find my patches based on the old revision
36f4c7843cf3d201279855ed9a6ebc1deb3c9463
(Adjust cube.out expected output for new test queries.)

I have not yet tested whether they apply to current HEAD.

Disclaimer: This was just a proof of concept, so the implementation
quality is poor.  Nevertheless, performance looked promising, although
it still needs a lot of extra rules for special cases, such as
detecting accidental sequential scans.  The general assumption is: no
concurrency - a single query owning the machine.

Here is a comparison using dbt3.  Q8, Q9, Q17, Q20, and Q21 are
significantly improved.

|     |   baseline |  indexscan | indexscan+nestloop |
|     |            | patch 1+2  | patch 3            |
|-----+------------+------------+--------------------|
| Q1  |  76.124261 |  73.165161 |          76.323119 |
| Q2  |   9.676956 |  11.211073 |          10.480668 |
| Q3  |  36.836417 |  36.268022 |          36.837226 |
| Q4  |  48.707501 |    64.2255 |          30.872218 |
| Q5  |  59.371467 |  59.205048 |          58.646096 |
| Q6  |  70.514214 |  73.021006 |           72.64643 |
| Q7  |  63.667594 |  63.258499 |          62.758288 |
| Q8  |  70.640973 |  33.144454 |          32.530732 |
| Q9  | 446.630473 | 379.063773 |         219.926094 |
| Q10 |  49.616125 |  49.244744 |          48.411664 |
| Q11 |   6.122317 |   6.158616 |           6.160189 |
| Q12 |  74.294292 |  87.780442 |          87.533936 |
| Q13 |   32.37932 |  32.771938 |          33.483444 |
| Q14 |  47.836053 |  48.093996 |           47.72221 |
| Q15 | 139.350038 | 138.880208 |         138.681336 |
| Q16 |  12.092429 |  12.120661 |          11.668971 |
| Q17 |   9.346636 |   4.106042 |           4.018951 |
| Q18 |  66.106875 | 123.754111 |         122.623193 |
| Q19 |  22.750504 |  23.191532 |           22.34084 |
| Q20 |  80.481986 |  29.906274 |           28.58106 |
| Q21 | 396.897269 |  355.45988 |          214.44184 |
| Q22 |   6.834841 |   6.600922 |           6.524032 |

Regards,
Daniel
--
MSc. Daniel Bausch
Research Assistant (Computer Science)
Technische Universität Darmstadt
http://www.dvs.tu-darmstadt.de/staff/dbausch
From 569398929d899100b769abfd919bc3383626ac9f Mon Sep 17 00:00:00 2001
From: Daniel Bausch <bausch@dvs.tu-darmstadt.de>
Date: Tue, 22 Oct 2013 15:22:25 +0200
Subject: [PATCH 1/4] Quick proof-of-concept for indexscan prefetching

This implements a prefetching queue of tuples whose tid is read ahead.
Their block number is quickly checked for random properties (not current
block and not the block prefetched last).  Random reads are prefetched.
Up to 32 tuples are considered by default.  The tids are queued in a
fixed ring buffer.

The prefetching is implemented in the generic part of the index scan, so
it applies to all access methods.
---
 src/backend/access/index/indexam.c | 96 ++++++++++++++++++++++++++++++++++++++
 src/include/access/relscan.h       | 12 +++++
 2 files changed, 108 insertions(+)

diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index b878155..1c54ef5 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -251,6 +251,12 @@ index_beginscan(Relation heapRelation,
     scan->heapRelation = heapRelation;
     scan->xs_snapshot = snapshot;

+#ifdef USE_PREFETCH
+    scan->xs_prefetch_head = scan->xs_prefetch_tail = -1;
+    scan->xs_last_prefetch = -1;
+    scan->xs_done = false;
+#endif
+
     return scan;
 }

@@ -432,6 +438,55 @@ index_restrpos(IndexScanDesc scan)
     FunctionCall1(procedure, PointerGetDatum(scan));
 }

+static int
+index_prefetch_queue_space(IndexScanDesc scan)
+{
+    if (scan->xs_prefetch_tail < 0)
+        return INDEXSCAN_PREFETCH_COUNT;
+
+    Assert(scan->xs_prefetch_head >= 0);
+
+    return (INDEXSCAN_PREFETCH_COUNT
+            - (scan->xs_prefetch_tail - scan->xs_prefetch_head + 1))
+        % INDEXSCAN_PREFETCH_COUNT;
+}
+
+/* makes copy of ItemPointerData */
+static bool
+index_prefetch_queue_push(IndexScanDesc scan, ItemPointer tid)
+{
+    Assert(index_prefetch_queue_space(scan) > 0);
+
+    if (scan->xs_prefetch_tail == -1)
+        scan->xs_prefetch_head = scan->xs_prefetch_tail = 0;
+    else
+        scan->xs_prefetch_tail =
+            (scan->xs_prefetch_tail + 1) % INDEXSCAN_PREFETCH_COUNT;
+
+    scan->xs_prefetch_queue[scan->xs_prefetch_tail] = *tid;
+
+    return true;
+}
+
+static ItemPointer
+index_prefetch_queue_pop(IndexScanDesc scan)
+{
+    ItemPointer res;
+
+    if (scan->xs_prefetch_head < 0)
+        return NULL;
+
+    res = &scan->xs_prefetch_queue[scan->xs_prefetch_head];
+
+    if (scan->xs_prefetch_head == scan->xs_prefetch_tail)
+        scan->xs_prefetch_head = scan->xs_prefetch_tail = -1;
+    else
+        scan->xs_prefetch_head =
+            (scan->xs_prefetch_head + 1) % INDEXSCAN_PREFETCH_COUNT;
+
+    return res;
+}
+
 /* ----------------
  * index_getnext_tid - get the next TID from a scan
  *
@@ -444,12 +499,52 @@ index_getnext_tid(IndexScanDesc scan, ScanDirection direction)
 {
     FmgrInfo   *procedure;
     bool        found;
+    ItemPointer    from_queue;
+    BlockNumber    pf_block;

     SCAN_CHECKS;
     GET_SCAN_PROCEDURE(amgettuple);

     Assert(TransactionIdIsValid(RecentGlobalXmin));

+#ifdef USE_PREFETCH
+    while (!scan->xs_done && index_prefetch_queue_space(scan) > 0) {
+        /*
+         * The AM's amgettuple proc finds the next index entry matching the
+         * scan keys, and puts the TID into scan->xs_ctup.t_self.  It should
+         * also set scan->xs_recheck and possibly scan->xs_itup, though we pay
+         * no attention to those fields here.
+         */
+        found = DatumGetBool(FunctionCall2(procedure,
+                                           PointerGetDatum(scan),
+                                           Int32GetDatum(direction)));
+        if (found)
+        {
+            index_prefetch_queue_push(scan, &scan->xs_ctup.t_self);
+            pf_block = ItemPointerGetBlockNumber(&scan->xs_ctup.t_self);
+            /* prefetch only if not the current buffer and not exactly the
+             * previously prefetched buffer (heuristic random detection)
+             * because sequential read-ahead would be redundant */
+            if ((!BufferIsValid(scan->xs_cbuf) ||
+                 pf_block != BufferGetBlockNumber(scan->xs_cbuf)) &&
+                pf_block != scan->xs_last_prefetch)
+            {
+                PrefetchBuffer(scan->heapRelation, MAIN_FORKNUM, pf_block);
+                scan->xs_last_prefetch = pf_block;
+            }
+        }
+        else
+            scan->xs_done = true;
+    }
+    from_queue = index_prefetch_queue_pop(scan);
+    if (from_queue)
+    {
+        scan->xs_ctup.t_self = *from_queue;
+        found = true;
+    }
+    else
+        found = false;
+#else
     /*
      * The AM's amgettuple proc finds the next index entry matching the scan
      * keys, and puts the TID into scan->xs_ctup.t_self.  It should also set
@@ -459,6 +554,7 @@ index_getnext_tid(IndexScanDesc scan, ScanDirection direction)
     found = DatumGetBool(FunctionCall2(procedure,
                                        PointerGetDatum(scan),
                                        Int32GetDatum(direction)));
+#endif

     /* Reset kill flag immediately for safety */
     scan->kill_prior_tuple = false;
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index 3a86ca4..bccc1a4 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -93,6 +93,18 @@ typedef struct IndexScanDescData

     /* state data for traversing HOT chains in index_getnext */
     bool        xs_continue_hot;    /* T if must keep walking HOT chain */
+
+#ifdef USE_PREFETCH
+# ifndef INDEXSCAN_PREFETCH_COUNT
+#  define INDEXSCAN_PREFETCH_COUNT 32
+# endif
+    /* prefetch queue - ringbuffer */
+    ItemPointerData xs_prefetch_queue[INDEXSCAN_PREFETCH_COUNT];
+    int            xs_prefetch_head;
+    int            xs_prefetch_tail;
+    BlockNumber    xs_last_prefetch;
+    bool        xs_done;
+#endif
 }    IndexScanDescData;

 /* Struct for heap-or-index scans of system tables */
--
2.0.5

From 7cb5839dd7751bcdcae6e4cbf69cfd24af10a694 Mon Sep 17 00:00:00 2001
From: Daniel Bausch <bausch@dvs.tu-darmstadt.de>
Date: Wed, 23 Oct 2013 09:45:11 +0200
Subject: [PATCH 2/4] Fix index-only scan and rescan

Prefetching heap data for index-only scans does not make any sense, and
they use a different field (itup) anyway.  Deactivate the prefetch
logic for index-only scans.

Reset xs_done and the queue on rescan, so we find tuples again.
Remember last prefetch to detect correlation.
---
 src/backend/access/index/indexam.c | 85 +++++++++++++++++++++-----------------
 1 file changed, 47 insertions(+), 38 deletions(-)

diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 1c54ef5..d8a4622 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -353,6 +353,12 @@ index_rescan(IndexScanDesc scan,

     scan->kill_prior_tuple = false;        /* for safety */

+#ifdef USE_PREFETCH
+    /* I think it does not hurt to remember xs_last_prefetch */
+    scan->xs_prefetch_head = scan->xs_prefetch_tail = -1;
+    scan->xs_done = false;
+#endif
+
     FunctionCall5(procedure,
                   PointerGetDatum(scan),
                   PointerGetDatum(keys),
@@ -508,7 +514,47 @@ index_getnext_tid(IndexScanDesc scan, ScanDirection direction)
     Assert(TransactionIdIsValid(RecentGlobalXmin));

 #ifdef USE_PREFETCH
-    while (!scan->xs_done && index_prefetch_queue_space(scan) > 0) {
+    if (!scan->xs_want_itup)
+    {
+        while (!scan->xs_done && index_prefetch_queue_space(scan) > 0) {
+            /*
+             * The AM's amgettuple proc finds the next index entry matching
+             * the scan keys, and puts the TID into scan->xs_ctup.t_self.  It
+             * should also set scan->xs_recheck and possibly scan->xs_itup,
+             * though we pay no attention to those fields here.
+             */
+            found = DatumGetBool(FunctionCall2(procedure,
+                                               PointerGetDatum(scan),
+                                               Int32GetDatum(direction)));
+            if (found)
+            {
+                index_prefetch_queue_push(scan, &scan->xs_ctup.t_self);
+                pf_block = ItemPointerGetBlockNumber(&scan->xs_ctup.t_self);
+                /* prefetch only if not the current buffer and not exactly the
+                 * previously prefetched buffer (heuristic random detection)
+                 * because sequential read-ahead would be redundant */
+                if ((!BufferIsValid(scan->xs_cbuf) ||
+                     pf_block != BufferGetBlockNumber(scan->xs_cbuf)) &&
+                    pf_block != scan->xs_last_prefetch)
+                {
+                    PrefetchBuffer(scan->heapRelation, MAIN_FORKNUM, pf_block);
+                    scan->xs_last_prefetch = pf_block;
+                }
+            }
+            else
+                scan->xs_done = true;
+        }
+        from_queue = index_prefetch_queue_pop(scan);
+        if (from_queue)
+        {
+            scan->xs_ctup.t_self = *from_queue;
+            found = true;
+        }
+        else
+            found = false;
+    }
+    else
+#endif
         /*
          * The AM's amgettuple proc finds the next index entry matching the
          * scan keys, and puts the TID into scan->xs_ctup.t_self.  It should
@@ -518,43 +564,6 @@ index_getnext_tid(IndexScanDesc scan, ScanDirection direction)
         found = DatumGetBool(FunctionCall2(procedure,
                                            PointerGetDatum(scan),
                                            Int32GetDatum(direction)));
-        if (found)
-        {
-            index_prefetch_queue_push(scan, &scan->xs_ctup.t_self);
-            pf_block = ItemPointerGetBlockNumber(&scan->xs_ctup.t_self);
-            /* prefetch only if not the current buffer and not exactly the
-             * previously prefetched buffer (heuristic random detection)
-             * because sequential read-ahead would be redundant */
-            if ((!BufferIsValid(scan->xs_cbuf) ||
-                 pf_block != BufferGetBlockNumber(scan->xs_cbuf)) &&
-                pf_block != scan->xs_last_prefetch)
-            {
-                PrefetchBuffer(scan->heapRelation, MAIN_FORKNUM, pf_block);
-                scan->xs_last_prefetch = pf_block;
-            }
-        }
-        else
-            scan->xs_done = true;
-    }
-    from_queue = index_prefetch_queue_pop(scan);
-    if (from_queue)
-    {
-        scan->xs_ctup.t_self = *from_queue;
-        found = true;
-    }
-    else
-        found = false;
-#else
-    /*
-     * The AM's amgettuple proc finds the next index entry matching the scan
-     * keys, and puts the TID into scan->xs_ctup.t_self.  It should also set
-     * scan->xs_recheck and possibly scan->xs_itup, though we pay no attention
-     * to those fields here.
-     */
-    found = DatumGetBool(FunctionCall2(procedure,
-                                       PointerGetDatum(scan),
-                                       Int32GetDatum(direction)));
-#endif

     /* Reset kill flag immediately for safety */
     scan->kill_prior_tuple = false;
--
2.0.5

From d8b1533955e3471fb2eb6a030619dcbc258955a8 Mon Sep 17 00:00:00 2001
From: Daniel Bausch <bausch@dvs.tu-darmstadt.de>
Date: Mon, 28 Oct 2013 10:43:16 +0100
Subject: [PATCH 3/4] First try on tuple look-ahead in nestloop

Similarly to the prefetching logic just added to the index scan, look
ahead tuples in the outer loop of a nested loop.  For every tuple
looked ahead, issue an explicit prefetch request to the inner
plan.  Modify the index scan to react to this request.
---
 src/backend/access/index/indexam.c   |  81 +++++++++-----
 src/backend/executor/execProcnode.c  |  36 +++++++
 src/backend/executor/nodeIndexscan.c |  16 +++
 src/backend/executor/nodeNestloop.c  | 200 ++++++++++++++++++++++++++++++++++-
 src/include/access/genam.h           |   4 +
 src/include/executor/executor.h      |   3 +
 src/include/executor/nodeIndexscan.h |   1 +
 src/include/nodes/execnodes.h        |  12 +++
 8 files changed, 323 insertions(+), 30 deletions(-)

diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index d8a4622..5f44dec 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -493,6 +493,57 @@ index_prefetch_queue_pop(IndexScanDesc scan)
     return res;
 }

+#ifdef USE_PREFETCH
+int
+index_prefetch(IndexScanDesc scan, int maxPrefetch, ScanDirection direction)
+{
+    FmgrInfo   *procedure;
+    int            numPrefetched = 0;
+    bool        found;
+    BlockNumber    pf_block;
+    FILE       *logfile;
+
+    GET_SCAN_PROCEDURE(amgettuple);
+
+    while (numPrefetched < maxPrefetch && !scan->xs_done &&
+           index_prefetch_queue_space(scan) > 0)
+    {
+        /*
+         * The AM's amgettuple proc finds the next index entry matching the
+         * scan keys, and puts the TID into scan->xs_ctup.t_self.  It should
+         * also set scan->xs_recheck and possibly scan->xs_itup, though we pay
+         * no attention to those fields here.
+         */
+        found = DatumGetBool(FunctionCall2(procedure,
+                                           PointerGetDatum(scan),
+                                           Int32GetDatum(direction)));
+        if (found)
+        {
+            index_prefetch_queue_push(scan, &scan->xs_ctup.t_self);
+            pf_block = ItemPointerGetBlockNumber(&scan->xs_ctup.t_self);
+
+            /*
+             * Prefetch only if not the current buffer and not exactly the
+             * previously prefetched buffer (heuristic random detection)
+             * because sequential read-ahead would be redundant
+             */
+            if ((!BufferIsValid(scan->xs_cbuf) ||
+                 pf_block != BufferGetBlockNumber(scan->xs_cbuf)) &&
+                pf_block != scan->xs_last_prefetch)
+            {
+                PrefetchBuffer(scan->heapRelation, MAIN_FORKNUM, pf_block);
+                scan->xs_last_prefetch = pf_block;
+                numPrefetched++;
+            }
+        }
+        else
+            scan->xs_done = true;
+    }
+
+    return numPrefetched;
+}
+#endif
+
 /* ----------------
  * index_getnext_tid - get the next TID from a scan
  *
@@ -506,7 +557,6 @@ index_getnext_tid(IndexScanDesc scan, ScanDirection direction)
     FmgrInfo   *procedure;
     bool        found;
     ItemPointer    from_queue;
-    BlockNumber    pf_block;

     SCAN_CHECKS;
     GET_SCAN_PROCEDURE(amgettuple);
@@ -516,34 +566,7 @@ index_getnext_tid(IndexScanDesc scan, ScanDirection direction)
 #ifdef USE_PREFETCH
     if (!scan->xs_want_itup)
     {
-        while (!scan->xs_done && index_prefetch_queue_space(scan) > 0) {
-            /*
-             * The AM's amgettuple proc finds the next index entry matching
-             * the scan keys, and puts the TID into scan->xs_ctup.t_self.  It
-             * should also set scan->xs_recheck and possibly scan->xs_itup,
-             * though we pay no attention to those fields here.
-             */
-            found = DatumGetBool(FunctionCall2(procedure,
-                                               PointerGetDatum(scan),
-                                               Int32GetDatum(direction)));
-            if (found)
-            {
-                index_prefetch_queue_push(scan, &scan->xs_ctup.t_self);
-                pf_block = ItemPointerGetBlockNumber(&scan->xs_ctup.t_self);
-                /* prefetch only if not the current buffer and not exactly the
-                 * previously prefetched buffer (heuristic random detection)
-                 * because sequential read-ahead would be redundant */
-                if ((!BufferIsValid(scan->xs_cbuf) ||
-                     pf_block != BufferGetBlockNumber(scan->xs_cbuf)) &&
-                    pf_block != scan->xs_last_prefetch)
-                {
-                    PrefetchBuffer(scan->heapRelation, MAIN_FORKNUM, pf_block);
-                    scan->xs_last_prefetch = pf_block;
-                }
-            }
-            else
-                scan->xs_done = true;
-        }
+        index_prefetch(scan, INDEXSCAN_PREFETCH_COUNT, direction);
         from_queue = index_prefetch_queue_pop(scan);
         if (from_queue)
         {
diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index 76dd62f..a8f2c90 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -741,3 +741,39 @@ ExecEndNode(PlanState *node)
             break;
     }
 }
+
+
+#ifdef USE_PREFETCH
+/* ----------------------------------------------------------------
+ *        ExecPrefetchNode
+ *
+ *        Request explicit prefetching from a subtree/node without
+ *        actually forming a tuple.
+ *
+ *        The node shall request at most 'maxPrefetch' pages being
+ *        prefetched.
+ *
+ *        The function returns how many pages have been requested.
+ *
+ *        Calling this function for a type that does not support
+ *        prefetching is not an error.  It just returns 0 as if no
+ *        prefetching was possible.
+ * ----------------------------------------------------------------
+ */
+int
+ExecPrefetchNode(PlanState *node, int maxPrefetch)
+{
+    if (node == NULL)
+        return 0;
+
+    switch (nodeTag(node))
+    {
+        case T_IndexScanState:
+            return ExecPrefetchIndexScan((IndexScanState *) node,
+                                         maxPrefetch);
+
+        default:
+            return 0;
+    }
+}
+#endif
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index f1062f1..bab0e7a 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -192,6 +192,22 @@ ExecReScanIndexScan(IndexScanState *node)
     ExecScanReScan(&node->ss);
 }

+#ifdef USE_PREFETCH
+/* ----------------------------------------------------------------
+ *        ExecPrefetchIndexScan(node, maxPrefetch)
+ *
+ *        Trigger prefetching of index scan without actually fetching
+ *        a tuple.
+ * ----------------------------------------------------------------
+ */
+int
+ExecPrefetchIndexScan(IndexScanState *node, int maxPrefetch)
+{
+    return index_prefetch(node->iss_ScanDesc, maxPrefetch,
+                          node->ss.ps.state->es_direction);
+}
+#endif
+

 /*
  * ExecIndexEvalRuntimeKeys
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index c7a08ed..21ad5f8 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -25,6 +25,90 @@
 #include "executor/nodeNestloop.h"
 #include "utils/memutils.h"

+#ifdef USE_PREFETCH
+static int
+NestLoopLookAheadQueueSpace(NestLoopState *node)
+{
+    if (node->nl_lookAheadQueueTail < 0)
+        return NESTLOOP_PREFETCH_COUNT;
+
+    Assert(node->nl_lookAheadQueueHead >= 0);
+
+    return (NESTLOOP_PREFETCH_COUNT
+            - (node->nl_lookAheadQueueTail - node->nl_lookAheadQueueHead + 1))
+        % NESTLOOP_PREFETCH_COUNT;
+}
+
+/* makes materialized copy of tuple table slot */
+static bool
+NestLoopLookAheadQueuePush(NestLoopState *node, TupleTableSlot *tuple)
+{
+    TupleTableSlot **queueEntry;
+
+    Assert(NestLoopLookAheadQueueSpace(node) > 0);
+
+    if (node->nl_lookAheadQueueTail == -1)
+        node->nl_lookAheadQueueHead = node->nl_lookAheadQueueTail = 0;
+    else
+        node->nl_lookAheadQueueTail =
+            (node->nl_lookAheadQueueTail + 1) % NESTLOOP_PREFETCH_COUNT;
+
+    queueEntry = &node->nl_lookAheadQueue[node->nl_lookAheadQueueTail];
+
+    if (!(*queueEntry))
+    {
+        *queueEntry = ExecInitExtraTupleSlot(node->js.ps.state);
+        ExecSetSlotDescriptor(*queueEntry,
+                              ExecGetResultType(outerPlanState(node)));
+    }
+
+    ExecCopySlot(*queueEntry, tuple);
+
+    return true;
+}
+
+static TupleTableSlot *
+NestLoopLookAheadQueuePop(NestLoopState *node)
+{
+    TupleTableSlot *res;
+
+    if (node->nl_lookAheadQueueHead < 0)
+        return NULL;
+
+    res = node->nl_lookAheadQueue[node->nl_lookAheadQueueHead];
+
+    if (node->nl_lookAheadQueueHead == node->nl_lookAheadQueueTail)
+        node->nl_lookAheadQueueHead = node->nl_lookAheadQueueTail = -1;
+    else
+        node->nl_lookAheadQueueHead =
+            (node->nl_lookAheadQueueHead + 1) % NESTLOOP_PREFETCH_COUNT;
+
+    return res;
+}
+
+static void
+NestLoopLookAheadQueueClear(NestLoopState *node)
+{
+    TupleTableSlot *lookAheadTuple;
+    int        i;
+
+    /*
+     * As we do not clear the tuple table slots on pop, we need to scan the
+     * whole array, regardless of the current queue fill.
+     *
+     * We cannot really free the slot, as there is no well defined interface
+     * for that, but the emptied slots will be freed when the query ends.
+     */
+    for (i = 0; i < NESTLOOP_PREFETCH_COUNT; i++)
+    {
+        lookAheadTuple = node->nl_lookAheadQueue[i];
+        /* look only at the pointer - all non-NULL entries are non-empty */
+        if (lookAheadTuple)
+            ExecClearTuple(lookAheadTuple);
+    }
+
+}
+#endif /* USE_PREFETCH */

 /* ----------------------------------------------------------------
  *        ExecNestLoop(node)
@@ -120,7 +204,87 @@ ExecNestLoop(NestLoopState *node)
         if (node->nl_NeedNewOuter)
         {
             ENL1_printf("getting new outer tuple");
-            outerTupleSlot = ExecProcNode(outerPlan);
+
+#ifdef USE_PREFETCH
+            /*
+             * While we have outer tuples and were not able to request enough
+             * prefetching from the inner plan to properly load the system,
+             * request more outer tuples and inner prefetching for them.
+             *
+             * Unfortunately we can do outer look-ahead directed prefetching
+             * only when we are rescanning the inner plan anyway; otherwise we
+             * would break the inner scan.  Only an independent copy of the
+             * inner plan state would allow us to prefetch across inner loops
+             * regardless of inner scan position.
+             */
+            while (!node->nl_lookAheadDone &&
+                   node->nl_numInnerPrefetched < NESTLOOP_PREFETCH_COUNT &&
+                   NestLoopLookAheadQueueSpace(node) > 0)
+            {
+                TupleTableSlot *lookAheadTupleSlot = ExecProcNode(outerPlan);
+
+                if (!TupIsNull(lookAheadTupleSlot))
+                {
+                    NestLoopLookAheadQueuePush(node, lookAheadTupleSlot);
+
+                    /*
+                     * Set inner params according to look-ahead tuple.
+                     *
+                     * Fetch the values of any outer Vars that must be passed
+                     * to the inner scan, and store them in the appropriate
+                     * PARAM_EXEC slots.
+                     */
+                    foreach(lc, nl->nestParams)
+                    {
+                        NestLoopParam *nlp = (NestLoopParam *) lfirst(lc);
+                        int            paramno = nlp->paramno;
+                        ParamExecData *prm;
+
+                        prm = &(econtext->ecxt_param_exec_vals[paramno]);
+                        /* Param value should be an OUTER_VAR var */
+                        Assert(IsA(nlp->paramval, Var));
+                        Assert(nlp->paramval->varno == OUTER_VAR);
+                        Assert(nlp->paramval->varattno > 0);
+                        prm->value = slot_getattr(lookAheadTupleSlot,
+                                                  nlp->paramval->varattno,
+                                                  &(prm->isnull));
+                        /* Flag parameter value as changed */
+                        innerPlan->chgParam =
+                            bms_add_member(innerPlan->chgParam, paramno);
+                    }
+
+                    /*
+                     * Rescan inner plan with changed parameters and request
+                     * explicit prefetch.  Limit the inner prefetch amount
+                     * according to our own bookkeeping.
+                     *
+                     * When the outer tuple processed this way finally becomes
+                     * active in the inner loop, the inner plan will autonomously
+                     * prefetch the same tuples again.  This is redundant but
+                     * avoiding that seems too complicated for now.  It should
+                     * not hurt too much and may even help in case the
+                     * prefetched blocks have been evicted again in the
+                     * meantime.
+                     */
+                    ExecReScan(innerPlan);
+                    node->nl_numInnerPrefetched +=
+                        ExecPrefetchNode(innerPlan,
+                                         NESTLOOP_PREFETCH_COUNT -
+                                         node->nl_numInnerPrefetched);
+                }
+                else
+                    node->nl_lookAheadDone = true; /* outer plan exhausted */
+            }
+
+            /*
+             * If there is already the next outerPlan in our look-ahead queue,
+             * get the next outer tuple from there, otherwise execute the
+             * outer plan.
+             */
+            outerTupleSlot = NestLoopLookAheadQueuePop(node);
+            if (TupIsNull(outerTupleSlot) && !node->nl_lookAheadDone)
+#endif /* USE_PREFETCH */
+                outerTupleSlot = ExecProcNode(outerPlan);

             /*
              * if there are no more outer tuples, then the join is complete..
@@ -174,6 +338,18 @@ ExecNestLoop(NestLoopState *node)
         innerTupleSlot = ExecProcNode(innerPlan);
         econtext->ecxt_innertuple = innerTupleSlot;

+#ifdef USE_PREFETCH
+        /*
+         * Decrement prefetch counter as we consume inner tuples.  We need to
+         * check for >0 because prefetching might not have happened for the
+         * consumed tuple, maybe because explicit prefetching is not supported
+         * by the inner plan or because the explicit prefetching requested by
+         * us is exhausted and the inner plan is doing it on its own now.
+         */
+        if (node->nl_numInnerPrefetched > 0)
+            node->nl_numInnerPrefetched--;
+#endif
+
         if (TupIsNull(innerTupleSlot))
         {
             ENL1_printf("no inner tuple, need new outer tuple");
@@ -296,6 +472,9 @@ NestLoopState *
 ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
 {
     NestLoopState *nlstate;
+#ifdef USE_PREFETCH
+    int i;
+#endif

     /* check for unsupported flags */
     Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
@@ -381,6 +560,15 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
     nlstate->nl_NeedNewOuter = true;
     nlstate->nl_MatchedOuter = false;

+#ifdef USE_PREFETCH
+    nlstate->nl_lookAheadQueueHead = nlstate->nl_lookAheadQueueTail = -1;
+    nlstate->nl_lookAheadDone = false;
+    nlstate->nl_numInnerPrefetched = 0;
+
+    for (i = 0; i < NESTLOOP_PREFETCH_COUNT; i++)
+        nlstate->nl_lookAheadQueue[i] = NULL;
+#endif
+
     NL1_printf("ExecInitNestLoop: %s\n",
                "node initialized");

@@ -409,6 +597,10 @@ ExecEndNestLoop(NestLoopState *node)
      */
     ExecClearTuple(node->js.ps.ps_ResultTupleSlot);

+#ifdef USE_PREFETCH
+    NestLoopLookAheadQueueClear(node);
+#endif
+
     /*
      * close down subplans
      */
@@ -444,4 +636,10 @@ ExecReScanNestLoop(NestLoopState *node)
     node->js.ps.ps_TupFromTlist = false;
     node->nl_NeedNewOuter = true;
     node->nl_MatchedOuter = false;
+
+#ifdef USE_PREFETCH
+    NestLoopLookAheadQueueClear(node);
+    node->nl_lookAheadDone = false;
+    node->nl_numInnerPrefetched = 0;
+#endif
 }
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
index a800041..7733b3c 100644
--- a/src/include/access/genam.h
+++ b/src/include/access/genam.h
@@ -146,6 +146,10 @@ extern void index_markpos(IndexScanDesc scan);
 extern void index_restrpos(IndexScanDesc scan);
 extern ItemPointer index_getnext_tid(IndexScanDesc scan,
                   ScanDirection direction);
+#ifdef USE_PREFETCH
+extern int index_prefetch(IndexScanDesc scan, int maxPrefetch,
+                          ScanDirection direction);
+#endif
 extern HeapTuple index_fetch_heap(IndexScanDesc scan);
 extern HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction);
 extern int64 index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap);
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 75841c8..88d0522 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -221,6 +221,9 @@ extern PlanState *ExecInitNode(Plan *node, EState *estate, int eflags);
 extern TupleTableSlot *ExecProcNode(PlanState *node);
 extern Node *MultiExecProcNode(PlanState *node);
 extern void ExecEndNode(PlanState *node);
+#ifdef USE_PREFETCH
+extern int ExecPrefetchNode(PlanState *node, int maxPrefetch);
+#endif

 /*
  * prototypes from functions in execQual.c
diff --git a/src/include/executor/nodeIndexscan.h b/src/include/executor/nodeIndexscan.h
index 71dbd9c..f93632c 100644
--- a/src/include/executor/nodeIndexscan.h
+++ b/src/include/executor/nodeIndexscan.h
@@ -18,6 +18,7 @@

 extern IndexScanState *ExecInitIndexScan(IndexScan *node, EState *estate, int eflags);
 extern TupleTableSlot *ExecIndexScan(IndexScanState *node);
+extern int ExecPrefetchIndexScan(IndexScanState *node, int maxPrefetch);
 extern void ExecEndIndexScan(IndexScanState *node);
 extern void ExecIndexMarkPos(IndexScanState *node);
 extern void ExecIndexRestrPos(IndexScanState *node);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 3b430e0..27fe65d 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1526,6 +1526,18 @@ typedef struct NestLoopState
     bool        nl_NeedNewOuter;
     bool        nl_MatchedOuter;
     TupleTableSlot *nl_NullInnerTupleSlot;
+
+#ifdef USE_PREFETCH
+# ifndef NESTLOOP_PREFETCH_COUNT
+#  define NESTLOOP_PREFETCH_COUNT 32
+# endif
+    /* look-ahead queue (for prefetching) - ringbuffer */
+    TupleTableSlot *nl_lookAheadQueue[NESTLOOP_PREFETCH_COUNT];
+    int            nl_lookAheadQueueHead;
+    int            nl_lookAheadQueueTail;
+    bool        nl_lookAheadDone;
+    int            nl_numInnerPrefetched;
+#endif
 } NestLoopState;

 /* ----------------
--
2.0.5

From a1fcab2d9d001505a5fc25accdca71e88148e4ff Mon Sep 17 00:00:00 2001
From: Daniel Bausch <bausch@dvs.tu-darmstadt.de>
Date: Tue, 29 Oct 2013 16:41:09 +0100
Subject: [PATCH 4/4] Limit recursive prefetching for merge join

Add switch facility to limit the prefetching of a subtree recursively.
In a first try add support for some variants of merge join.  Distribute
the prefetch allowance evenly between outer and inner subplan.
---
 src/backend/access/index/indexam.c   |  5 +++-
 src/backend/executor/execProcnode.c  | 47 +++++++++++++++++++++++++++++++++++-
 src/backend/executor/nodeAgg.c       | 10 ++++++++
 src/backend/executor/nodeIndexscan.c | 18 ++++++++++++++
 src/backend/executor/nodeMaterial.c  | 14 +++++++++++
 src/backend/executor/nodeMergejoin.c | 22 +++++++++++++++++
 src/include/access/relscan.h         |  1 +
 src/include/executor/executor.h      |  1 +
 src/include/executor/nodeAgg.h       |  3 +++
 src/include/executor/nodeIndexscan.h |  3 +++
 src/include/executor/nodeMaterial.h  |  3 +++
 src/include/executor/nodeMergejoin.h |  3 +++
 src/include/nodes/execnodes.h        |  6 +++++
 13 files changed, 134 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 5f44dec..354bde6 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -255,6 +255,7 @@ index_beginscan(Relation heapRelation,
     scan->xs_prefetch_head = scan->xs_prefetch_tail = -1;
     scan->xs_last_prefetch = -1;
     scan->xs_done = false;
+    scan->xs_prefetch_limit = INDEXSCAN_PREFETCH_COUNT;
 #endif

     return scan;
@@ -506,7 +507,9 @@ index_prefetch(IndexScanDesc scan, int maxPrefetch, ScanDirection direction)
     GET_SCAN_PROCEDURE(amgettuple);

     while (numPrefetched < maxPrefetch && !scan->xs_done &&
-           index_prefetch_queue_space(scan) > 0)
+           index_prefetch_queue_space(scan) > 0 &&
+           index_prefetch_queue_space(scan) >
+           (INDEXSCAN_PREFETCH_COUNT - scan->xs_prefetch_limit))
     {
         /*
          * The AM's amgettuple proc finds the next index entry matching the
diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index a8f2c90..a14a0d0 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -745,6 +745,51 @@ ExecEndNode(PlanState *node)

 #ifdef USE_PREFETCH
 /* ----------------------------------------------------------------
+ *        ExecLimitPrefetchNode
+ *
+ *        Limit the amount of prefetching that may be requested by
+ *        a subplan.
+ *
+ *        Most of the handlers just pass-through the received value
+ *        to their subplans.  That is the case, when they have just
+ *        one subplan that might prefetch.  If they have two subplans
+ *        intelligent heuristics need to be applied to distribute the
+ *        prefetch allowance in a way delivering overall advantage.
+ * ----------------------------------------------------------------
+ */
+void
+ExecLimitPrefetchNode(PlanState *node, int limit)
+{
+    if (node == NULL)
+        return;
+
+    switch (nodeTag(node))
+    {
+        case T_IndexScanState:
+            ExecLimitPrefetchIndexScan((IndexScanState *) node, limit);
+            break;
+
+        case T_MergeJoinState:
+            ExecLimitPrefetchMergeJoin((MergeJoinState *) node, limit);
+            break;
+
+        case T_MaterialState:
+            ExecLimitPrefetchMaterial((MaterialState *) node, limit);
+            break;
+
+        case T_AggState:
+            ExecLimitPrefetchAgg((AggState *) node, limit);
+            break;
+
+        default:
+            elog(INFO,
+                 "missing ExecLimitPrefetchNode handler for node type: %d",
+                 (int) nodeTag(node));
+            break;
+    }
+}
+
+/* ----------------------------------------------------------------
  *        ExecPrefetchNode
  *
  *        Request explicit prefetching from a subtree/node without
@@ -776,4 +821,4 @@ ExecPrefetchNode(PlanState *node, int maxPrefetch)
             return 0;
     }
 }
-#endif
+#endif /* USE_PREFETCH */
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index e02a6ff..94f6d77 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -1877,6 +1877,16 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
     return aggstate;
 }

+#ifdef USE_PREFETCH
+void
+ExecLimitPrefetchAgg(AggState *node, int limit)
+{
+    Assert(node != NULL);
+
+    ExecLimitPrefetchNode(outerPlanState(node), limit);
+}
+#endif
+
 static Datum
 GetAggInitVal(Datum textInitVal, Oid transtype)
 {
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index bab0e7a..6ea236e 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -640,6 +640,24 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
     return indexstate;
 }

+#ifdef USE_PREFETCH
+/* ----------------------------------------------------------------
+ *        ExecLimitPrefetchIndexScan
+ *
+ *        Sets/changes the number of tuples whose pages to request in
+ *        advance.
+ * ----------------------------------------------------------------
+ */
+void
+ExecLimitPrefetchIndexScan(IndexScanState *node, int limit)
+{
+    Assert(node != NULL);
+    Assert(node->iss_ScanDesc != NULL);
+
+    node->iss_ScanDesc->xs_prefetch_limit = limit;
+}
+#endif
+

 /*
  * ExecIndexBuildScanKeys
diff --git a/src/backend/executor/nodeMaterial.c b/src/backend/executor/nodeMaterial.c
index 7a82f56..3370362 100644
--- a/src/backend/executor/nodeMaterial.c
+++ b/src/backend/executor/nodeMaterial.c
@@ -232,6 +232,20 @@ ExecInitMaterial(Material *node, EState *estate, int eflags)
     return matstate;
 }

+#ifdef USE_PREFETCH
+/* ----------------------------------------------------------------
+ *        ExecLimitPrefetchMaterial
+ * ----------------------------------------------------------------
+ */
+void
+ExecLimitPrefetchMaterial(MaterialState *node, int limit)
+{
+    Assert(node != NULL);
+
+    ExecLimitPrefetchNode(outerPlanState(node), limit);
+}
+#endif
+
 /* ----------------------------------------------------------------
  *        ExecEndMaterial
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index e69bc64..f25e074 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -1627,6 +1627,10 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
     mergestate->mj_OuterTupleSlot = NULL;
     mergestate->mj_InnerTupleSlot = NULL;

+#ifdef USE_PREFETCH
+    ExecLimitPrefetchMergeJoin(mergestate, MERGEJOIN_PREFETCH_COUNT);
+#endif
+
     /*
      * initialization successful
      */
@@ -1636,6 +1640,24 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
     return mergestate;
 }

+#ifdef USE_PREFETCH
+/* ----------------------------------------------------------------
+ *        ExecLimitPrefetchMergeJoin
+ * ----------------------------------------------------------------
+ */
+void
+ExecLimitPrefetchMergeJoin(MergeJoinState *node, int limit)
+{
+    int outerLimit = limit/2;
+    int innerLimit = limit/2;
+
+    Assert(node != NULL);
+
+    ExecLimitPrefetchNode(outerPlanState(node), outerLimit);
+    ExecLimitPrefetchNode(innerPlanState(node), innerLimit);
+}
+#endif
+
 /* ----------------------------------------------------------------
  *        ExecEndMergeJoin
  *
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index bccc1a4..3297900 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -104,6 +104,7 @@ typedef struct IndexScanDescData
     int            xs_prefetch_tail;
     BlockNumber    xs_last_prefetch;
     bool        xs_done;
+    int            xs_prefetch_limit;
 #endif
 }    IndexScanDescData;

diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 88d0522..09b94e0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -222,6 +222,7 @@ extern TupleTableSlot *ExecProcNode(PlanState *node);
 extern Node *MultiExecProcNode(PlanState *node);
 extern void ExecEndNode(PlanState *node);
 #ifdef USE_PREFETCH
+extern void ExecLimitPrefetchNode(PlanState *node, int limit);
 extern int ExecPrefetchNode(PlanState *node, int maxPrefetch);
 #endif

diff --git a/src/include/executor/nodeAgg.h b/src/include/executor/nodeAgg.h
index 38823d6..f775ec8 100644
--- a/src/include/executor/nodeAgg.h
+++ b/src/include/executor/nodeAgg.h
@@ -17,6 +17,9 @@
 #include "nodes/execnodes.h"

 extern AggState *ExecInitAgg(Agg *node, EState *estate, int eflags);
+#ifdef USE_PREFETCH
+extern void ExecLimitPrefetchAgg(AggState *node, int limit);
+#endif
 extern TupleTableSlot *ExecAgg(AggState *node);
 extern void ExecEndAgg(AggState *node);
 extern void ExecReScanAgg(AggState *node);
diff --git a/src/include/executor/nodeIndexscan.h b/src/include/executor/nodeIndexscan.h
index f93632c..ccf3121 100644
--- a/src/include/executor/nodeIndexscan.h
+++ b/src/include/executor/nodeIndexscan.h
@@ -17,6 +17,9 @@
 #include "nodes/execnodes.h"

 extern IndexScanState *ExecInitIndexScan(IndexScan *node, EState *estate, int eflags);
+#ifdef USE_PREFETCH
+extern void ExecLimitPrefetchIndexScan(IndexScanState *node, int limit);
+#endif
 extern TupleTableSlot *ExecIndexScan(IndexScanState *node);
 extern int ExecPrefetchIndexScan(IndexScanState *node, int maxPrefetch);
 extern void ExecEndIndexScan(IndexScanState *node);
diff --git a/src/include/executor/nodeMaterial.h b/src/include/executor/nodeMaterial.h
index cfca0a5..5c81fe8 100644
--- a/src/include/executor/nodeMaterial.h
+++ b/src/include/executor/nodeMaterial.h
@@ -17,6 +17,9 @@
 #include "nodes/execnodes.h"

 extern MaterialState *ExecInitMaterial(Material *node, EState *estate, int eflags);
+#ifdef USE_PREFETCH
+extern void ExecLimitPrefetchMaterial(MaterialState *node, int limit);
+#endif
 extern TupleTableSlot *ExecMaterial(MaterialState *node);
 extern void ExecEndMaterial(MaterialState *node);
 extern void ExecMaterialMarkPos(MaterialState *node);
diff --git a/src/include/executor/nodeMergejoin.h b/src/include/executor/nodeMergejoin.h
index fa6b5e0..e402b42 100644
--- a/src/include/executor/nodeMergejoin.h
+++ b/src/include/executor/nodeMergejoin.h
@@ -17,6 +17,9 @@
 #include "nodes/execnodes.h"

 extern MergeJoinState *ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags);
+#ifdef USE_PREFETCH
+extern void ExecLimitPrefetchMergeJoin(MergeJoinState *node, int limit);
+#endif
 extern TupleTableSlot *ExecMergeJoin(MergeJoinState *node);
 extern void ExecEndMergeJoin(MergeJoinState *node);
 extern void ExecReScanMergeJoin(MergeJoinState *node);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 27fe65d..64ed6fb 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1585,6 +1585,12 @@ typedef struct MergeJoinState
     ExprContext *mj_InnerEContext;
 } MergeJoinState;

+#ifdef USE_PREFETCH
+# ifndef MERGEJOIN_PREFETCH_COUNT
+#  define MERGEJOIN_PREFETCH_COUNT 32
+# endif
+#endif
+
 /* ----------------
  *     HashJoinState information
  *
--
2.0.5

