Re: Table AM Interface Enhancements - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Table AM Interface Enhancements
Date
Msg-id CAEze2Wj2jnrjyK2PN_2TK0Yu79zruPNakY1r-aF4T-yC+xnfPg@mail.gmail.com
Whole thread Raw
In response to Table AM Interface Enhancements  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: Table AM Interface Enhancements
List pgsql-hackers
Hi,

On Thu, 23 Nov 2023 at 13:43, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> Hello PostgreSQL Hackers,
>
> I am pleased to submit a series of patches related to the Table Access
> Method (AM) interface, which I initially announced during my talk at
> PGCon 2023 [1]. These patches are primarily designed to support the
> OrioleDB engine, but I believe they could be beneficial for other
> table AM implementations as well.
>
> The focus of these patches is to introduce more flexibility and
> capabilities into the Table AM interface. This is particularly
> relevant for advanced use cases like index-organized tables,
> alternative MVCC implementations, etc.
>
> Here's a brief overview of the patches included in this set:

Note: no significant review of the patches, just a first response on
the cover letters and oddities I noticed:

Overall, this patchset adds significant API area to TableAmRoutine,
without adding the relevant documentation on how it's expected to be
used. With the overall size of the patchset also being very
significant, I don't think this patch is reviewable as is; the goal
isn't clear enough, the APIs aren't well explained, and the
interactions with the index API are left up in the air.

> 0001-Allow-locking-updated-tuples-in-tuple_update-and--v1.patch
>
> Optimizes the process of locking concurrently updated tuples during
> update and delete operations. Helpful for table AMs where refinding
> existing tuples is expensive.

Is this essentially an optimized implementation of the "DELETE FROM
... RETURNING *" per-tuple primitive?

> 0003-Allow-table-AM-to-store-complex-data-structures-i-v1.patch
>
> Allows table AM to store complex data structure in rd_amcache rather
> than a single chunk of memory.

I don't think we should allow arbitrarily large and arbitrarily many
chunks of data in the relcache or table caches. Why isn't one chunk
enough?

> 0004-Add-table-AM-tuple_is_current-method-v1.patch
>
> This allows us to abstract how/whether table AM uses transaction identifiers.

I'm not a fan of the indirection here. Also, assuming that table slots
don't outlive transactions, wouldn't this be a more appropriate fit
with the table tuple slot API?

> 0005-Generalize-relation-analyze-in-table-AM-interface-v1.patch
>
> Provides a more flexible API for sampling tuples, beneficial for
> non-standard table types like index-organized tables.
>
> 0006-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v1.patch
>
> Provides a new table AM API method to encapsulate the whole INSERT ...
> ON CONFLICT ... algorithm rather than just implementation of
> speculative tokens.

Does this not still require speculative inserts, with speculative
tokens, for secondary indexes? Why make AMs implement that all over
again?

> 0007-Allow-table-AM-tuple_insert-method-to-return-the--v1.patch
>
> This allows table AM to return a native tuple slot, which is aware of
> table AM-specific system attributes.

This seems reasonable.

> 0008-Let-table-AM-insertion-methods-control-index-inse-v1.patch
>
> Allows table AM to skip index insertions in the executor and handle
> those insertions itself.

Who handles index tuple removal then? I don't see a patch that
describes index AM changes for this...

> 0009-Custom-reloptions-for-table-AM-v1.patch
>
> Enables table AMs to define and override reloptions for tables and indexes.
>
> 0010-Notify-table-AM-about-index-creation-v1.patch
>
> Allows table AMs to prepare or update specific meta-information during
> index creation.

I don't think the described use case of this API is OK - a table AM
cannot know about the internals of index AMs, and is definitely not
allowed to overwrite the information of that index.
If I ask for an index that uses the "btree" index, then that needs to
be the AM actually used, or an error needs to be raised if it is
somehow incompatible with the table AM used. It can't be that we
silently update information and create an index that is explicitly not
what the user asked to create.

I also don't see updates in documentation, which I think is quite a
shame as I have trouble understanding some parts.

> 0012-Introduce-RowID-bytea-tuple-identifier-v1.patch
>
> `This patch introduces 'RowID', a new bytea tuple identifier, to
> overcome the limitations of the current 32-bit block number and 16-bit
> offset-based tuple identifier. This is particularly useful for
> index-organized tables and other advanced use cases.

We don't have any index methods that can handle anything but
block+offset TIDs, and I don't see any changes to the IndexAM APIs to
support these RowID tuples, so what's the plan here? I don't see any
of that in the commit message, nor in the rest of this patchset.

> Each commit message contains a detailed explanation of the changes and
> their rationale. I believe these enhancements will significantly
> improve the flexibility and capabilities of the PostgreSQL Table AM
> interface.

I've noticed there is not a lot of rationale for several of the
changes as to why PostgreSQL needs these changes implemented like
this, amongst which the index-related tableAM changes.

I understand that index-organized tables can be quite useful, but I
don't see design solutions to the more complex questions that would
still be required before we could host such table AMs like OreoleDB's
as a first-party citizen: For index-organized tables, you also need
index AM APIs that support TIDS with more than 48 bits of data
(assuming we actually want primary keys with >48 bits of usable
space), and for undo-based logging you would probably need index APIs
for retail index tuple deletion. Neither is supplied here, nor is
described why these APIs were omitted.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Improving the comments in pqsignal()
Next
From: jian he
Date:
Subject: Re: PATCH: Add REINDEX tag to event triggers