Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM
Date
Msg-id CALj2ACWTrx1zxWvq8Uj2rEwCsDgQHeJ53WdvzZUw3kW+_VPG6A@mail.gmail.com
Whole thread Raw
In response to Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Thu, Apr 25, 2024 at 10:11 PM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Wed, 2024-04-24 at 18:19 +0530, Bharath Rupireddy wrote:
> > I added a flush callback named TableModifyBufferFlushCallback; when
> > provided by callers invoked after tuples are flushed to disk from the
> > buffers but before the AM frees them up. Index insertions and AFTER
> > ROW INSERT triggers can be executed in this callback. See the v19-
> > 0001 patch for how AM invokes the flush callback, and see either v19-
> > 0003 or v19-0004 or v19-0005 for how a caller can supply the callback
> > and required context to execute index insertions and AR triggers.
>
> The flush callback takes a pointer to an array of slot pointers, and I
> don't think that's the right API. I think the callback should be called
> on each slot individually.
>
> We shouldn't assume that a table AM stores buffered inserts as an array
> of slot pointers. A TupleTableSlot has a fair amount of memory overhead
> (64 bytes), so most AMs wouldn't want to pay that overhead for every
> tuple. COPY does, but that's because the number of buffered tuples is
> fairly small.

I get your point. An AM can choose to implement the buffering strategy
by just storing the plain tuple rather than the tuple slots in which
case the flush callback with an array of tuple slots won't work.
Therefore, I now changed the flush callback to accept only a single
tuple slot.

> > > 11. Deprecate the multi_insert API.
> >
> > I did remove both table_multi_insert and table_finish_bulk_insert in
> > v19-0006.
>
> That's OK with me. Let's leave those functions out for now.

Okay. Dropped the 0006 patch for now.

Please see the attached v20 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: A failure in prepared_xacts test
Next
From: shveta malik
Date:
Subject: Re: Synchronizing slots from primary to standby