Re: Batching in executor - Mailing list pgsql-hackers

From cca5507
Subject Re: Batching in executor
Date
Msg-id tencent_164C2A5EDF8189DD70287661406D688A8506@qq.com
Whole thread Raw
In response to Re: Batching in executor  (Junwang Zhao <zhjwpku@gmail.com>)
Responses Re: Batching in executor
List pgsql-hackers
Hi,

Some comments for v5:

0001
====

1) heap_begin_batch()

```
    /* Single allocation for HeapBatch header + tupdata array */
    alloc_size = sizeof(HeapBatch) + sizeof(HeapTupleData) * maxitems;
    hb = palloc(alloc_size);
    hb->tupdata = (HeapTupleData *) ((char *) hb + sizeof(HeapBatch));
```

Do we need a MAXALIGN() here to avoid unaligned access? Something like this:

```
    /* Single allocation for HeapBatch header + tupdata array */
    alloc_size = MAXALIGN(sizeof(HeapBatch)) + sizeof(HeapTupleData) * maxitems;
    hb = palloc(alloc_size);
    hb->tupdata = (HeapTupleData *) ((char *) hb + MAXALIGN(sizeof(HeapBatch)));
```

Or how about just using zero-length array:

```
typedef struct HeapBatch
{
    Buffer            buf;
    int                maxitems;
    int                nitems;
    HeapTupleData    tupdata[FLEXIBLE_ARRAY_MEMBER];
} HeapBatch;

// and
hb = palloc(offsetof(HeapBatch, tupdata) + sizeof(HeapTupleData) * maxitems);
```

2) pgstat_count_heap_getnext_batch()

```
#define pgstat_count_heap_getnext_batch(rel, n)                        \
    do {                                                            \
        if (pgstat_should_count_relation(rel))                        \
            (rel)->pgstat_info->counts.tuples_returned += n;        \
    } while (0)
```

"+= n" -> "+= (n)", just like pgstat_count_index_tuples().

0002
====

1) TupleBatchCreate()

```
    /* Single allocation for TupleBatch + inslots + outslots arrays */
    alloc_size = sizeof(TupleBatch) + 2 * sizeof(TupleTableSlot *) * capacity;
    b = palloc(alloc_size);
    inslots = (TupleTableSlot **) ((char *) b + sizeof(TupleBatch));
    outslots = (TupleTableSlot **) ((char *) b + sizeof(TupleBatch) +
        sizeof(TupleTableSlot *) * capacity);
```

Do we need a MAXALIGN() here to avoid unaligned access?

2) TupleBatchReset()

```
    for (int i = 0; i < b->maxslots; i++)
    {
        ExecClearTuple(b->inslots[i]);
        if (drop_slots)
            ExecDropSingleTupleTableSlot(b->inslots[i]);
    }
```

ExecDropSingleTupleTableSlot() will call ExecClearTuple(), so ExecClearTuple() will be
called twice if drop_slots is true, I think we can avoid this.

3) ScanCanUseBatching()

In heap_beginscan(), we may disable page-at-a-time mode:

```
    /*
     * Disable page-at-a-time mode if it's not a MVCC-safe snapshot.
     */
    if (!(snapshot && IsMVCCSnapshot(snapshot)))
        scan->rs_base.rs_flags &= ~SO_ALLOW_PAGEMODE;
```

It seems that ScanCanUseBatching() didn't consider this.

4) struct TupleBatch

```
    struct TupleTableSlot **inslots; /* slots for tuples read "into" batch */
    struct TupleTableSlot **outslots; /* slots for tuples going "out of"
                                       * batch */
    struct TupleTableSlot **activeslots;
```

I think we can remove the word "struct".

5) ExecScanExtendedBatchSlot()

```
        /* Get next input slot from current batch, or refill */
        if (!TupleBatchHasMore(b))
        {
            if (!accessBatchMtd(node))
                return NULL;
        }
```

I think we cannot just return NULL here, see comments in ExecScanExtended():

```
        /*
         * if the slot returned by the accessMtd contains NULL, then it means
         * there is nothing more to scan so we just return an empty slot,
         * being careful to use the projection result slot so it has correct
         * tupleDesc.
         */
        if (TupIsNull(slot))
        {
            if (projInfo)
                return ExecClearTuple(projInfo->pi_state.resultslot);
            else
                return slot;
        }
```

And why not just write this function like ExecScanExtended() and ExecScanFetch()?

--
Regards,
ChangAo Chen

pgsql-hackers by date:

Previous
From: Sami Imseih
Date:
Subject: Re: Fix pg_stat_get_backend_wait_event() for aux processes
Next
From: Heikki Linnakangas
Date:
Subject: Re: Refactor recovery conflict signaling a little