Re: Table AM Interface Enhancements - Mailing list pgsql-hackers

From Mats Kindahl
Subject Re: Table AM Interface Enhancements
Date
Msg-id CA+14427Qk6_0_Jg8ou05teCpsu7L2EfpV+x7SyiJywH4U3aWUQ@mail.gmail.com
Whole thread Raw
In response to Table AM Interface Enhancements  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On Thu, Nov 23, 2023 at 1:43 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
Hello PostgreSQL Hackers,

I am pleased to submit a series of patches related to the Table Access
Method (AM) interface, which I initially announced during my talk at
PGCon 2023 [1]. These patches are primarily designed to support the
OrioleDB engine, but I believe they could be beneficial for other
table AM implementations as well.

The focus of these patches is to introduce more flexibility and
capabilities into the Table AM interface. This is particularly
relevant for advanced use cases like index-organized tables,
alternative MVCC implementations, etc.

Hi Alexander and great to see some action around in the table access method interface.

Sorry for being late to the game, but wondering a few things about the patches, but I'll start with the first one that caught my eye.

0007-Allow-table-AM-tuple_insert-method-to-return-the--v1.patch

This allows table AM to return a native tuple slot, which is aware of
table AM-specific system attributes.

This patch seems straightforward enough, but from reading the surrounding code and trying to understand the context I am wondering a few things. Reading the thread, I am unsure if this will go in or not, but just wanted to point out a concern I had. My apologies if I am raising an issue that is already resolved.

AFAICT, the general contract for working with table tuple slots is creating them for a particular purpose, filling it in, and then passing around a pointer to it. Since the slot is created using a "source" implementation, the "source" is responsible for memory allocation and also other updates to the state. Please correct me if I have misunderstood how this is intended to work, but this seems like a good API since it avoids unnecessary allocation and, in particular, unbounded creation of new slots affecting memory usage while a query is executing. For a plan you want to execute, you just make sure that you have slots of the right kind in each plan node and there is no need to dynamically allocate more slots. If you want one for the table access method, just make sure to fetch the slot callbacks from the table access method use those correctly. As a result, the number of slots used during execution is bounded

Assuming that I've understood it correct, if a TTS is then created and passed to tuple_insert, and it needs to return a different slot, this raises two questions:
  • As Andres pointed out: who is responsible for taking care of and dealing with the cleanup of the returned slot here? Note that this is not just a matter of releasing memory, there are other stateful things that they might need to deal with that the TAM have created for in the slot. For this, some sort of callback is needed and the tuple_insert implementation needs to call that correctly.
  • The dual is the cleanup of the "original" slot passed in: a slot of a particular kind is passed in and you need to deal with this correctly to release the resources allocated by the original slot, using some sort of callback.
For both these cases, the question is what cleanup function to call.

In most cases, the slot comes from a subplan and is not dynamically allocated, i.e., it cannot just use release() since it is reused later. For example, for ExecScanFetch the slot ss_ScanTupleSlot is returned, which is then used with tuple_insert (unless I've misread the code), which is typically cleared, not released.

If clear() is used instead, and you clear this slot as part of inserting a tuple, you can instead clear a premature intermediate result (ss_ScanTupleSlot, in the example above), which can cause strange issues if this result is needed later. 

So, given that the dynamic allocation of new slots is unbounded within a query and it is complicated to make sure that slots are cleared/reset/released correctly depending on context, this seems to be hard to get to work correctly and not risk introducing bugs. IMHO, it would be preferable to have a very simple contract where you init, set, clear, and release the slot to avoid bugs creeping into the code, which is what the PostgreSQL code mostly has now.

So, the question here is why changing the slot implementation is needed. I do not know the details of OrioleDB, but this slot is immediately used with ExecInsertIndexTuples() after the call in nodeModifyTable. If the need is to pass information from the TAM to the IAM then it might be better to store this information in the execution state. Is there a case where the correct slot is not created, then fixing that location might be better. (I've noticed that the copyFrom code has a somewhat naïve assumption of what slot implementation should be used, but that is a separate discussion.)

Best wishes,
Mats Kindahl

pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)
Next
From: Michael Paquier
Date:
Subject: Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)