On Mon, Nov 16, 2020 at 8:02 PM Paul Guo <guopa@vmware.com> wrote:
>
> > On Nov 13, 2020, at 7:21 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Tue, Nov 10, 2020 at 3:47 PM Paul Guo <guopa@vmware.com> wrote:
> >>
> >> Thanks for doing this. There might be another solution - use raw insert interfaces (i.e. raw_heap_insert()).
> >> Attached is the test (not formal) patch that verifies this idea. raw_heap_insert() writes the page into the
> >> table files directly and also writes the FPI xlog when the tuples fill up the whole page. This seems to be
> >> more efficient.
> >>
> >
> > Thanks. Will the new raw_heap_insert() APIs scale well (i.e. extend
> > the table in parallel) with parallelism? The existing
> > table_multi_insert() API scales well, see, for instance, the benefit
> > with parallel copy[1] and parallel multi inserts in CTAS[2].
>
> Yes definitely some work needs to be done to make raw heap insert interfaces fit the parallel work, but
> it seems that there are no hard blocking issues for this?
>

I may be wrong here. If the raw heap insert APIs were to support
parallelism, wouldn't we need some sort of shared memory to coordinate
the workers? And if we add that, don't these raw insert APIs end up
equivalent to the current table_multi_insert() API, which uses a
separate shared ring buffer (bulk insert state) for insertions?

Also, can we think of these raw insert APIs as behaving like the
table_multi_insert() API does for unlogged tables?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com