Re: New Table Access Methods for Multi and Single Inserts - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: New Table Access Methods for Multi and Single Inserts
Date
Msg-id CALj2ACWRJ-mVyQbKauKb7LJ3=rW0v9qWzrHoQW-iz73Tve3EUw@mail.gmail.com
In response to New Table Access Methods for Multi and Single Inserts  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Tue, Dec 8, 2020 at 6:27 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> Hi,
>
> Currently, for any component (such as COPY, CTAS[1], CREATE/REFRESH
> Mat View[1], INSERT INTO SELECTs[2]), multi insert logic such as buffer
> slot allocation, maintenance, the decision to flush, and cleanup needs
> to be implemented outside the table_multi_insert() API. The main problem
> is that this fails to take into consideration the underlying storage
> engine's capabilities; for more details on this point, refer to the
> discussion in the multi inserts in CTAS thread[1]. This also creates a
> lot of duplicate code, which is more error prone and harder to maintain.
>
> More importantly, in another thread [3] @Andres Freund suggested
> having table insert APIs that look more like the 'scan' APIs, i.e.
> insert_begin, insert, insert_end. The main advantage of doing this is
> (quoting from his statement in [3]) - "more importantly it'd
> allow an AM to optimize operations across multiple inserts, which is
> important for column stores."
>
> I propose introducing new table access methods for both multi and
> single inserts based on the prototype suggested by Andres in [3]. The
> main design goal of these new APIs is to give tableam developers
> flexibility in implementing multi insert logic that depends on the
> underlying storage engine.
>
> Below are the APIs. I suggest having a look at
> v1-0001-New-Table-AMs-for-Multi-and-Single-Inserts.patch for details
> of the new data structure and the API functionality. Note that I have
> temporarily used the XX_v2 names; we can change them later.
>
> TableInsertState* table_insert_begin(initial_args);
> void table_insert_v2(TableInsertState *state, TupleTableSlot *slot);
> void table_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot);
> void table_multi_insert_flush(TableInsertState *state);
> void table_insert_end(TableInsertState *state);
>
> I'm attaching a few patches (just to show that these APIs work, avoid
> a lot of duplicate code and make life easier). Better comments can
> be added later. If these APIs and patches look okay, we can even
> consider using them in other places such as nodeModifyTable.c and
> so on.
>
> v1-0001-New-Table-AMs-for-Multi-and-Single-Inserts.patch --->
> introduces new table access methods for multi and single inserts. Also
> implements/rearranges the outside code for heap am into these new
> APIs.
> v1-0002-CTAS-and-REFRESH-Mat-View-With-New-Multi-Insert-Table-AM.patch
> ---> adds new multi insert table access methods to CREATE TABLE AS,
> CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW.
> v1-0003-ATRewriteTable-With-New-Single-Insert-Table-AM.patch ---> adds
> new single insert table access method to ALTER TABLE rewrite table
> code.
> v1-0004-COPY-With-New-Multi-and-Single-Insert-Table-AM.patch ---> adds
> new single and multi insert table access method to COPY code.
>
> Thoughts?
>
> [1] - https://www.postgresql.org/message-id/4eee0730-f6ec-e72d-3477-561643f4b327%40swarm64.com
> [2] - https://www.postgresql.org/message-id/20201124020020.GK24052%40telsasoft.com
> [3] - https://www.postgresql.org/message-id/20200924024128.kyk3r5g7dnu3fxxx%40alap3.anarazel.de
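
To make the begin/insert/flush/end shape of the API quoted above easier to picture, here is a minimal standalone sketch in plain C. It does not use PostgreSQL headers and is not the code from the attached patches: ToyTuple, ToyInsertState and all toy_* functions are hypothetical stand-ins for TupleTableSlot, TableInsertState and the table_* functions. The fixed flush threshold and the final flush inside the end call are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for TupleTableSlot: a single integer payload. */
typedef struct ToyTuple
{
    int         value;
} ToyTuple;

/* Hypothetical stand-in for TableInsertState. */
typedef struct ToyInsertState
{
    ToyTuple   *buffer;     /* tuples waiting to be written */
    size_t      capacity;   /* flush threshold chosen by the "AM" */
    size_t      nbuffered;  /* tuples currently in the buffer */
    size_t      nwritten;   /* tuples handed to storage so far */
    size_t      nflushes;   /* number of batch writes performed */
} ToyInsertState;

/* Allocate the state once per insert operation; returns NULL on OOM. */
ToyInsertState *
toy_insert_begin(size_t capacity)
{
    ToyInsertState *state;

    assert(capacity > 0);
    state = malloc(sizeof(*state));
    if (state == NULL)
        return NULL;
    state->buffer = malloc(capacity * sizeof(ToyTuple));
    if (state->buffer == NULL)
    {
        free(state);
        return NULL;
    }
    state->capacity = capacity;
    state->nbuffered = 0;
    state->nwritten = 0;
    state->nflushes = 0;
    return state;
}

/* Write out whatever is buffered; a no-op when the buffer is empty. */
void
toy_multi_insert_flush(ToyInsertState *state)
{
    if (state->nbuffered == 0)
        return;
    /* A real AM would write the batch here; the toy only counts it. */
    state->nwritten += state->nbuffered;
    state->nflushes++;
    state->nbuffered = 0;
}

/* Buffer one tuple; the "AM", not the caller, decides when to flush. */
void
toy_multi_insert(ToyInsertState *state, const ToyTuple *tuple)
{
    state->buffer[state->nbuffered++] = *tuple;
    if (state->nbuffered == state->capacity)
        toy_multi_insert_flush(state);
}

/* Single insert: the tuple goes straight to storage, unbuffered. */
void
toy_insert(ToyInsertState *state, const ToyTuple *tuple)
{
    (void) tuple;               /* payload unused in this toy */
    state->nwritten++;
}

/* Flush the remainder, release the state, return total tuples written. */
size_t
toy_insert_end(ToyInsertState *state)
{
    size_t      total;

    toy_multi_insert_flush(state);
    total = state->nwritten;
    free(state->buffer);
    free(state);
    return total;
}
```

A caller would invoke toy_insert_begin() once, toy_multi_insert() per tuple, and toy_insert_end() once. With a capacity of 4 and 10 tuples, the sketch flushes automatically twice and writes the remaining 2 tuples in toy_insert_end(). The caller never handles the buffer or the flush decision itself.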

I have added this to the commitfest to get it reviewed further.

https://commitfest.postgresql.org/31/2871/

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
