Re: Batch insert in CTAS/MatView code - Mailing list pgsql-hackers

From Paul Guo
Subject Re: Batch insert in CTAS/MatView code
Date
Msg-id CAEET0ZE7YGKfWvcCYv1f+MO7oM_CmmJqMrmDYod_9wn5+3P2Uw@mail.gmail.com
In response to Re: Batch insert in CTAS/MatView code  (Andres Freund <andres@anarazel.de>)
Responses Re: Batch insert in CTAS/MatView code  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers

> > However, I can also see that there is no better alternative.  We need to
> > compute the size of accumulated tuples so far, in order to decide whether
> > to stop accumulating tuples.  There is no convenient way to obtain the
> > length of the tuple, given a slot.  How about making that decision solely
> > based on number of tuples, so that we can avoid ExecFetchSlotHeapTuple call
> > altogether?
>
> ... maybe we should add a new operation to slots, that returns the
> (approximate?) size of a tuple?

Hm, I'm not convinced that it's worth adding that as a dedicated
operation. It's not that clear what it'd exactly mean anyway - what
would it measure? As referenced in the slot? As if it were stored on
disk? etc?

I wonder if the right answer wouldn't be to just measure the size of a
memory context containing the batch slots, or something like that.
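To illustrate the memory-context idea, here is a minimal, self-contained sketch. It does not use the PostgreSQL MemoryContext API; BatchArena, arena_alloc, arena_reset and batch_should_flush are all hypothetical names invented for this example. The point is only that if every batch allocation goes through one accounting allocator, the flush decision can look at a running total instead of asking each slot for a tuple length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_MAX_CHUNKS 1024

/* Hypothetical stand-in for a memory context that tracks its own usage. */
typedef struct BatchArena
{
    size_t  bytes_allocated;            /* bytes handed out since last reset */
    size_t  nchunks;                    /* number of live chunks */
    void   *chunks[ARENA_MAX_CHUNKS];   /* chunk pointers, freed on reset */
} BatchArena;

/* Allocate from the arena and account for the requested size. */
static void *
arena_alloc(BatchArena *arena, size_t size)
{
    void   *p;

    assert(arena->nchunks < ARENA_MAX_CHUNKS);
    p = malloc(size);
    if (p == NULL)
        abort();
    arena->chunks[arena->nchunks++] = p;
    arena->bytes_allocated += size;
    return p;
}

/* Free everything in the arena, as one would after flushing a batch. */
static void
arena_reset(BatchArena *arena)
{
    for (size_t i = 0; i < arena->nchunks; i++)
        free(arena->chunks[i]);
    arena->nchunks = 0;
    arena->bytes_allocated = 0;
}

/* Flush decision based only on the arena's running total. */
static int
batch_should_flush(const BatchArena *arena, size_t max_bytes)
{
    return arena->bytes_allocated >= max_bytes;
}
```

A caller would copy each incoming tuple with arena_alloc(), check batch_should_flush() after every tuple, and call arena_reset() once the batch has been written. The total is approximate by design: it measures in-memory footprint, not on-disk size.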


Probably a better way is to move that logic (appending a slot to the batch,
deciding when to flush, flushing, and cleaning up the slots) into
table_multi_insert()? In general, the underlying implementation of
table_multi_insert() should be able to determine the tuple sizes easily.
One concern is that COPY is currently the only caller of multi-insert in
the tree, so I am not sure whether future callers will want their own
logic (we could add a flag to allow customization, but that seems a bit
over-designed?).
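A rough sketch of what that encapsulation could look like is below. This is not the actual table_multi_insert() signature or any existing PostgreSQL API: MultiInsertBuffer, multi_insert_add, multi_insert_flush and the two limits are hypothetical, and tuples are reduced to their sizes to keep the example self-contained. The caller only adds tuples; the buffer itself decides when to flush, by tuple count or by accumulated bytes.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_MAX_TUPLES 4      /* assumed limits, kept small for illustration */
#define BATCH_MAX_BYTES  64

/* Callback that writes out one batch; stands in for the real insert path. */
typedef void (*FlushFn) (const size_t *sizes, int ntuples, void *arg);

typedef struct MultiInsertBuffer
{
    size_t  sizes[BATCH_MAX_TUPLES];    /* sizes of the buffered tuples */
    int     ntuples;                    /* number of buffered tuples */
    size_t  nbytes;                     /* total bytes buffered */
    FlushFn flush;                      /* how to write a batch */
    void   *flush_arg;                  /* opaque state for the callback */
} MultiInsertBuffer;

/* Write out whatever is buffered and reset the buffer. */
static void
multi_insert_flush(MultiInsertBuffer *buf)
{
    if (buf->ntuples == 0)
        return;
    buf->flush(buf->sizes, buf->ntuples, buf->flush_arg);
    buf->ntuples = 0;
    buf->nbytes = 0;
}

/* Buffer one tuple; flush when either limit is reached. */
static void
multi_insert_add(MultiInsertBuffer *buf, size_t tuple_size)
{
    assert(buf->ntuples < BATCH_MAX_TUPLES);
    buf->sizes[buf->ntuples++] = tuple_size;
    buf->nbytes += tuple_size;
    if (buf->ntuples >= BATCH_MAX_TUPLES || buf->nbytes >= BATCH_MAX_BYTES)
        multi_insert_flush(buf);
}

/* Example callback: just counts batches and tuples written. */
typedef struct FlushStats
{
    int     nbatches;
    int     ntuples;
} FlushStats;

static void
count_flush(const size_t *sizes, int ntuples, void *arg)
{
    FlushStats *stats = arg;

    (void) sizes;
    stats->nbatches++;
    stats->ntuples += ntuples;
}
```

With this shape, CTAS, matview refresh and COPY would each call multi_insert_add() per tuple and multi_insert_flush() once at the end. A caller that wants different thresholds would need them as parameters, which is the customization question raised above.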
