Re: Pluggable toaster - Mailing list pgsql-hackers

From Nikita Malakhov
Subject Re: Pluggable toaster
Date
Msg-id CAN-LCVNvOTHJQSzJZjKyzshQdFeSNh8TSHy6p9PEUPL393ic7Q@mail.gmail.com
In response to Re: Pluggable toaster  (Aleksander Alekseev <aleksander@timescale.com>)
Responses Re: Pluggable toaster  (Aleksander Alekseev <aleksander@timescale.com>)
List pgsql-hackers
Hi!

>I don't argue with most of what you say. I am just pointing out the
>reason why the chosen approach "N TOASTers x M TableAMs" will not
>work:

We assume that the TAM used in a custom Toaster works as it should,
and leave the TAM internals to that TAM's developer - say, we do not want to
change the internals of Heap AM.

We don't want to create some kind of silver bullet. There are existing
problems with TOAST mechanics that are widely known from production
environments, and we suggest a not-too-complex way to solve them.

As I mentioned before, Pluggable TOAST does not change Heap AM, and it is
not meant to change TAMs.

>This is what I meant above when talking about the framework for
>simplifying this task:

That's a kind of generalization of custom TOAST implementations. It is a very
good intention, but keep in mind that different kinds of data require very
different approaches to external storage - say, JSON TOAST works with
maps of keys and values, while the super binary object (experimental name) does
not work with the internals of TOASTed data except for searching. But we thought
about that too, and the reusable code resides in the toast_internals.c source - any
custom Toaster working with Heap could use its insert, update and fetch
methods, but deal with the data on its own.

Even with the general framework there must be a common interface which
would be the entry point for those custom methods developed with the
framework. That's what the TOAST API is - just an interface that all custom
TOAST implementations use to have a common entry point from any TAM,
with syntax support to plug in custom TOAST implementations from SQL.
No less, but no more.
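
To make the idea of a common entry point concrete, here is a minimal sketch in C.
It is only an illustration under stated assumptions: every name in it
(ToastPointer, ToasterRoutine, copy_toaster, toaster_roundtrip) is hypothetical
and is not taken from the patch set, and "external storage" is simulated with
malloc. The point is just that the caller talks to any Toaster through one
table of function pointers and never looks at how the data is stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handle returned by a toaster; NOT the patch's varatt layout. */
typedef struct ToastPointer
{
    void   *external;   /* where this toaster put the value */
    size_t  size;       /* size of the stored value in bytes */
} ToastPointer;

/* Hypothetical common interface: one table of callbacks per toaster. */
typedef struct ToasterRoutine
{
    const char   *name;
    ToastPointer (*toast)(const void *value, size_t size);
    void        *(*detoast)(const ToastPointer *ptr, size_t *size_out);
    void         (*del)(ToastPointer *ptr);
} ToasterRoutine;

/* A trivial toaster: "external storage" is just a heap copy. */
static ToastPointer
copy_toast(const void *value, size_t size)
{
    ToastPointer p = { malloc(size), size };

    if (p.external != NULL)
        memcpy(p.external, value, size);
    else
        p.size = 0;
    return p;
}

static void *
copy_detoast(const ToastPointer *ptr, size_t *size_out)
{
    void *out = malloc(ptr->size);

    if (out != NULL)
    {
        memcpy(out, ptr->external, ptr->size);
        *size_out = ptr->size;
    }
    return out;
}

static void
copy_delete(ToastPointer *ptr)
{
    free(ptr->external);
    ptr->external = NULL;
    ptr->size = 0;
}

const ToasterRoutine copy_toaster = {
    "copy_toaster", copy_toast, copy_detoast, copy_delete
};

/*
 * Caller side (think: the table AM). It uses only the interface, so any
 * other ToasterRoutine could be passed in. Returns 0 if the value survives
 * a toast/detoast round trip, -1 otherwise.
 */
int
toaster_roundtrip(const ToasterRoutine *t, const char *text)
{
    size_t       len = strlen(text) + 1;
    size_t       out_len = 0;
    ToastPointer p = t->toast(text, len);
    char        *back;
    int          ok;

    if (p.external == NULL)
        return -1;
    back = t->detoast(&p, &out_len);
    ok = back != NULL && out_len == len && memcmp(back, text, len) == 0;
    free(back);
    t->del(&p);
    return ok ? 0 : -1;
}
```

A caller would invoke it as toaster_roundtrip(&copy_toaster, "some value").
Swapping in a different ToasterRoutine would not change the caller at all,
which is the property a common API is meant to give.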

Moreover, our patches show that even the Generic (default) TOAST implementation
could still be left as-is, without routing it through our API, though that is logically
wrong because a common API is meant to be common to all TOAST implementations
without exception.

Have I answered your question? Please don't hesitate to point out any unclear
parts; I'd be glad to explain them.

The main idea of the TOAST API is very elegant and lightweight, and it is designed
along the lines of Pluggable Storage (Table AM API).

On Mon, Oct 24, 2022 at 12:10 PM Aleksander Alekseev <aleksander@timescale.com> wrote:
Hi Nikita,

I don't argue with most of what you say. I am just pointing out the
reason why the chosen approach "N TOASTers x M TableAMs" will not
work:

> Don't you think that this is an arguable design decision? Basically
> all we know about the underlying TableAM is that it stores tuples
> _somehow_ and that tuples have TIDs [1]. That's it. We don't know if
> it even has any sort of pages, whether they are fixed in size or not,
> whether it uses shared buffers, etc. It may not even require TOAST.
> [...]

Also I completely agree with:

> Implementing another Table AM just to implement another TOAST strategy seems too
> much, the TAM API is very heavy and complex, and you would have to add it as a contrib.

This is what I meant above when talking about the framework for
simplifying this task:

> It looks like the idea should be actually turned inside out. I.e. what
> would be nice to have is some sort of _framework_ that helps TableAM
> authors to implement TOAST (alternatively, the rest of the TableAM
> except for TOAST) if the TableAM is similar to the default one.

From the user perspective it's much easier to think about one entity -
TableAM, and choosing from heapam_with_default_toast and
heapam_with_different_toast.

From the extension implementer's point of view creating TableAMs is a
difficult task. This is what the framework should address. Ideally the
interface should be as simple as:

CreateParametrizedDefaultHeapAM(SomeTOASTSubstitutionObject, ...other
arguments, in the future...)

Where the extension author should be worried only about an alternative
TOAST implementation.

I think at some point such a framework may address at least one more
issue we have - an inability to change the page size on the table
level. As it was shown by Tomas Vondra [1] the default 8 KB page size
can be suboptimal depending on the load. So it would be nice if the
user could change it without rebuilding PostgreSQL. Naturally this is
out of scope of this particular patchset. I just wanted to point out
opportunities we have here.

[1]: https://www.postgresql.org/message-id/flat/b4861449-6c54-ccf8-e67c-c039228cdc6d%40enterprisedb.com

--
Best regards,
Aleksander Alekseev


--
Regards,
Nikita Malakhov
Postgres Professional 
