Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM
Msg-id CAEze2WgiW89oQkG5TEQ-qmvUdSUG2OQYaxrdn=6HJKehEO1puw@mail.gmail.com
In response to Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM
List pgsql-hackers
On Tue, 27 Aug 2024 at 07:42, Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Mon, 2024-08-26 at 23:59 +0200, Matthias van de Meent wrote:
> > Specifically, I'm having trouble seeing how this could be used to
> > implement ```INSERT INTO ... SELECT ... RETURNING ctid``` as I see no
> > returning output path for the newly inserted tuples' data, which is
> > usually required for our execution nodes' output path. Is support for
> > RETURN-clauses planned for this API? In a previous iteration, the
> > flush operation was capable of returning a TTS, but that seems to have
> > been dropped, and I can't quite figure out why.
>
> I'm not sure where that was lost, but I suspect when we changed
> flushing to use a callback. I didn't get to v23-0003 yet, but I think
> you're right that the current flushing mechanism isn't right for
> returning tuples. Thank you.
>
> One solution: when the buffer is flushed, we can return an iterator
> over the buffered tuples to the caller. The caller can then use the
> iterator to insert into indexes, return a tuple to the executor, etc.,
> and then release the iterator when done (freeing the buffer).

I think that would work, but it would need to be accommodated in the
table_modify_buffer_insert path too, not just the _flush path: the
heap AM also flushes during inserts whenever its internal buffer is
full, not only at the end of modifications.
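
To make that control flow concrete, here is a rough standalone sketch of
an insert path that can hand back an iterator whenever it had to flush.
It is not the patch's API: every Sketch*/sketch_* name, the tiny buffer
capacity, and the int payload standing in for a tuple slot are
assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capacity; kept tiny so that flushes are easy to trigger. */
#define SKETCH_BUFFER_CAPACITY 4

/* Stand-in for a buffered tuple (the real code would hold tuple slots). */
typedef struct SketchTuple
{
    int value;
} SketchTuple;

/* Hypothetical buffer state: pending tuples plus the last flushed batch. */
typedef struct SketchModifyBuffer
{
    SketchTuple pending[SKETCH_BUFFER_CAPACITY];
    size_t      npending;
    SketchTuple flushed[SKETCH_BUFFER_CAPACITY];
    size_t      nflushed;
} SketchModifyBuffer;

/* Iterator over the most recently flushed batch; valid until the next flush. */
typedef struct SketchFlushIter
{
    const SketchTuple *tuples;
    size_t             ntuples;
    size_t             next;
} SketchFlushIter;

/*
 * Flush all pending tuples. If iter is non-NULL, it is set up to walk the
 * flushed batch; passing NULL keeps the simpler "fire and forget" flow.
 */
static void
sketch_flush(SketchModifyBuffer *buf, SketchFlushIter *iter)
{
    for (size_t i = 0; i < buf->npending; i++)
        buf->flushed[i] = buf->pending[i];
    buf->nflushed = buf->npending;
    buf->npending = 0;

    if (iter != NULL)
    {
        iter->tuples = buf->flushed;
        iter->ntuples = buf->nflushed;
        iter->next = 0;
    }
}

/*
 * Buffer one tuple. If the buffer was already full, it is flushed first and
 * true is returned, so the caller knows iter (if given) now has tuples.
 */
static bool
sketch_insert(SketchModifyBuffer *buf, SketchTuple tuple, SketchFlushIter *iter)
{
    bool did_flush = false;

    if (buf->npending == SKETCH_BUFFER_CAPACITY)
    {
        sketch_flush(buf, iter);
        did_flush = true;
    }

    assert(buf->npending < SKETCH_BUFFER_CAPACITY);
    buf->pending[buf->npending++] = tuple;
    return did_flush;
}

/* Return the next flushed tuple, or NULL once the batch is exhausted. */
static const SketchTuple *
sketch_iter_next(SketchFlushIter *iter)
{
    if (iter->next >= iter->ntuples)
        return NULL;
    return &iter->tuples[iter->next++];
}
```

A caller that needs RETURNING-style output would check the result of
sketch_insert() after every call and drain the iterator before inserting
again, then do one last sketch_flush() at the end; a caller that does not
care passes NULL.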

> That control flow is less convenient for most callers, though, so
> perhaps that should be optional?

That would be OK with me.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)