Re: Pluggable toaster - Mailing list pgsql-hackers

From Nikita Malakhov
Subject Re: Pluggable toaster
Date
Msg-id CAN-LCVMXN-igZdOH4kunF0p2jobsksNhFXovEXjr-zVF90Nq1A@mail.gmail.com
Whole thread Raw
In response to Re: Pluggable toaster  (Andres Freund <andres@anarazel.de>)
Responses Re: Pluggable toaster
List pgsql-hackers
Hi hackers!

I've made a rebase according to Andres and Aleksander comments. 
Rebased branch resides here:

I've decided to leave only the first 2 patches for review and send less significant
changes after the main patch will be straightened out.
So, here is 
v13-0001-toaster-interface.patch - main TOAST API patch, with reference TOAST
mechanics left as-is.
v13-0002-toaster-default.patch - reference TOAST re-implemented via TOAST API.


>I'm a bit confused by the patch numbering - why isn't there a patch 0001 and
>0005?
Sorry for the confusion, my fault. The first patch (CREATE TABLE .. SET STORAGE)
is already committed into v15, and several of the late patches weren't included.
I've rearranged patch numbers in this iteration.

>I think 0002 needs to be split further - the relevant part isn't that it
>introduces the "dummy toaster" module, it's a large change doing lots of
>things, the addition of the contrib module is irrelevant comparatively.

Done, contrib /dummy_toaster excluded from main patch and placed in branch
as a separate commit.

>As is the patches unfortunately are close to unreviewable. Lots of code gets
>moved around in one patch, then again in the next patch, then again in the
>next.

So I've decided to put here only the first one while I'm working on the latter to clean
this up - I agree, code in latter patches needs some refactoring. Working on it.

>Unfortunately, scanning through these patches, it seems this is a lot of
>complexity, with a (for me) comensurate benefit. There's a lot of more general
>improvements to toasting and the json type that we can do, that are a lot less
>complex than this.

We have very significant improvements for storing large JSON and a couple of
other TOAST improvements which make a lot of sense, but they are based on
this API. But in the first patch reference TOAST is left as-is, and does not use
TOAST API.

>> 2) New VARATT_CUSTOM data structure with fixed header and variable
>> tail to store custom toasted data, with according macros set;

>That's adding overhead to every toast interaction, independent of any new
>infrastructure being used.

We've performed some tests on this and haven't detected significant overhead,

>So we're increasing pg_attribute - often already the largest catalog table in
>a database.

A little bit, with an OID column storing Toaster OID. We do not see any other way
to keep track of Toaster used by the table's column, because it could be changed
any time by ALTER TABLE ... SET TOASTER.

>Am I just missing something, or is atttoaster not actually used in this patch?
>So most of the contrib module added is unreachable code?

It is necessary for Toasters implemented via TOAST API, the first patch does not 
use it directly because reference TOAST is left unchanged. The second one which
implements reference TOAST via TOAST API uses it.

>That seems not great.

About Toasters deletion - we forbid dropping Toasters because if Toaster is dropped 
the data TOASTed with it is lost,  and as was mentioned before, we think that there 
won't be a lot of custom Toasters, likely seems to be less then a dozen.

>That move the whole list around! On a cache hit. Tthis would likely already be
>slower than syscache.

Thank you for the remark, it is questionable approach. I've changed this in current iteration
(patch in attach) to keep Toaster list appended-only if Toaster was not found, and leave 
Toaster cache as a straight list - first element in is the head of the list.

Also, documentation on TOAST API is provided in README.toastapi in the first patch - 
I'd be grateful for comments on it.

Thanks for the feedback!

On Tue, Aug 2, 2022 at 6:37 PM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2022-08-02 09:15:12 +0300, Nikita Malakhov wrote:
> Attach includes:
> v11-0002-toaster-interface.patch - contains TOAST API with default Toaster
> left as-is (reference implementation) and Dummy toaster as an example
> (will be removed later as a part of refactoring?).
>
> v11-0003-toaster-default.patch - implements reference TOAST as Default
> Toaster
> via TOAST API, so Heap AM calls Toast only via API, and does not have direct
> calls to Toast functionality.
>
> v11-0004-toaster-snapshot.patch - supports row versioning for TOASTed values
> and some refactoring.

I'm a bit confused by the patch numbering - why isn't there a patch 0001 and
0005?

I think 0002 needs to be split further - the relevant part isn't that it
introduces the "dummy toaster" module, it's a large change doing lots of
things, the addition of the contrib module is irrelevant comparatively.

As is the patches unfortunately are close to unreviewable. Lots of code gets
moved around in one patch, then again in the next patch, then again in the
next.


Unfortunately, scanning through these patches, it seems this is a lot of
complexity, with a (for me) comensurate benefit. There's a lot of more general
improvements to toasting and the json type that we can do, that are a lot less
complex than this.


> From 6b35d6091248e120d2361cf0a806dbfb161421cf Mon Sep 17 00:00:00 2001
> From: Nikita Malakhov <n.malakhov@postgrespro.ru>
> Date: Tue, 12 Apr 2022 18:37:21 +0300
> Subject: [PATCH] Pluggable TOAST API interface with dummy_toaster contrib
>  module
>
> Pluggable TOAST API is introduced with implemented contrib example
> module.
> Pluggable TOAST API consists of 4 parts:
> 1) SQL syntax supports manipulations with toasters - CREATE TABLE ...
> (column type STORAGE storage_type TOASTER toaster), ALTER TABLE ALTER
> COLUMN column SET TOASTER toaster and Toaster definition.
> TOAST API requires earlier patch with CREATE TABLE SET STORAGE clause;
> New column atttoaster is added to pg_attribute.
> Toaster drop is not allowed for not to lose already toasted data;
> 2) New VARATT_CUSTOM data structure with fixed header and variable
> tail to store custom toasted data, with according macros set;

That's adding overhead to every toast interaction, independent of any new
infrastructure being used.



> 4) Dummy toaster implemented via new TOAST API to be used as sample.
> In this patch regular (default) TOAST function is left as-is and not
> yet implemented via new API.
> TOAST API syntax and code explanation provided in additional docs patch.

I'd make this a separate commit.



> @@ -445,6 +447,8 @@ equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
>                       return false;
>               if (attr1->attstorage != attr2->attstorage)
>                       return false;
> +             if (attr1->atttoaster != attr2->atttoaster)
> +                     return false;

So we're increasing pg_attribute - often already the largest catalog table in
a database.

Am I just missing something, or is atttoaster not actually used in this patch?
So most of the contrib module added is unreachable code?


> +/*
> + * Toasters is very often called so syscache lookup and TsrRoutine allocation are
> + * expensive and we need to cache them.

Ugh.

> + * We believe what there are only a few toasters and there is high chance that
> + * only one or only two of them are heavy used, so most used toasters should be
> + * found as easy as possible. So, let us use a simple list, in future it could
> + * be changed to other structure. For now it will be stored in TopCacheContext
> + * and never destroed in backend life cycle - toasters are never deleted.
> + */

That seems not great.


> +typedef struct ToasterCacheEntry
> +{
> +     Oid                     toasterOid;
> +     TsrRoutine *routine;
> +} ToasterCacheEntry;
> +
> +static List  *ToasterCache = NIL;
> +
> +/*
> + * SearchTsrCache - get cached toaster routine, emits an error if toaster
> + * doesn't exist
> + */
> +TsrRoutine*
> +SearchTsrCache(Oid   toasterOid)
> +{
> +     ListCell                   *lc;
> +     ToasterCacheEntry  *entry;
> +     MemoryContext           ctx;
> +
> +     if (list_length(ToasterCache) > 0)
> +     {
> +             /* fast path */
> +             entry = (ToasterCacheEntry*)linitial(ToasterCache);
> +             if (entry->toasterOid == toasterOid)
> +                     return entry->routine;
> +     }
> +
> +     /* didn't find in first position */
> +     ctx = MemoryContextSwitchTo(CacheMemoryContext);
> +
> +     for_each_from(lc, ToasterCache, 0)
> +     {
> +             entry = (ToasterCacheEntry*)lfirst(lc);
> +
> +             if (entry->toasterOid == toasterOid)
> +             {
> +                     /* remove entry from list, it will be added in a head of list below */
> +                     foreach_delete_current(ToasterCache, lc);

That needs to move later list elements!


> +                     goto out;
> +             }
> +     }
> +
> +     /* did not find entry, make a new one */
> +     entry = palloc(sizeof(*entry));
> +
> +     entry->toasterOid = toasterOid;
> +     entry->routine = GetTsrRoutineByOid(toasterOid, false);
> +
> +out:
> +     ToasterCache = lcons(entry, ToasterCache);

That move the whole list around! On a cache hit. Tthis would likely already be
slower than syscache.


> diff --git a/contrib/dummy_toaster/dummy_toaster.c b/contrib/dummy_toaster/dummy_toaster.c
> index 0d261f6042..02f49052b7 100644
> --- a/contrib/dummy_toaster/dummy_toaster.c
> +++ b/contrib/dummy_toaster/dummy_toaster.c

So this is just changing around the code added in the prior commit. Why was it
then included before?


> +++ b/src/include/access/generic_toaster.h

> +HeapTuple
> +heap_toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
> +                                                     int options);

The generic toast API has heap_* in its name?




> From 4112cd70b05dda39020d576050a98ca3cdcf2860 Mon Sep 17 00:00:00 2001
> From: Nikita Malakhov <n.malakhov@postgrespro.ru>
> Date: Tue, 12 Apr 2022 22:57:21 +0300
> Subject: [PATCH] Versioned rows in TOASTed values for Default Toaster support
>
> Original TOAST mechanics does not support rows versioning
> for TOASTed values.
> Toaster snapshot - refactored generic toaster, implements
> rows versions check in toasted values to share common parts
> of toasted values between different versions of rows.

This misses explaining *WHY* this is changed.



> diff --git a/src/backend/access/common/detoast.c b/src/backend/access/common/detoast.c
> deleted file mode 100644
> index aff8042166..0000000000
> --- a/src/backend/access/common/detoast.c
> +++ /dev/null

These patches really move things around in a largely random way.


> -static bool toastrel_valueid_exists(Relation toastrel, Oid valueid);
> -static bool toastid_valueid_exists(Oid toastrelid, Oid valueid);
> -
> +static void
> +toast_extract_chunk_fields(Relation toastrel, TupleDesc toasttupDesc,
> +                                                Oid valueid, HeapTuple ttup, int32 *seqno,
> +                                                char **chunkdata, int *chunksize);
> +
> +static void
> +toast_write_slice(Relation toastrel, Relation *toastidxs,
> +                               int num_indexes, int validIndex,
> +                               Oid valueid, int32 value_size, int32 slice_offset,
> +                               int32 slice_length, char *slice_data,
> +                               int options,
> +                               void *chunk_header, int chunk_header_size,
> +                               ToastChunkVisibilityCheck visibility_check,
> +                               void *visibility_cxt);

What do all these changes have to do with "Versioned rows in TOASTed
values for Default Toaster support"?


> +static void *
> +toast_fetch_old_chunk(Relation toastrel, SysScanDesc toastscan, Oid valueid,
> +                                       int32 expected_chunk_seq, int32 last_old_chunk_seq,
> +                                       ToastChunkVisibilityCheck visibility_check,
> +                                       void *visibility_cxt,
> +                                       int32 *p_old_chunk_size, ItemPointer old_tid)
> +{
> +     for (;;)
> +     {
> +             HeapTuple       old_toasttup;
> +             char       *old_chunk_data;
> +             int32           old_chunk_seq;
> +             int32           old_chunk_data_size;
> +
> +             old_toasttup = systable_getnext_ordered(toastscan, ForwardScanDirection);
> +
> +             if (old_toasttup)
> +             {
> +                     /* Skip aborted chunks */
> +                     if (!HeapTupleHeaderXminCommitted(old_toasttup->t_data))
> +                     {
> +                             TransactionId xmin = HeapTupleHeaderGetXmin(old_toasttup->t_data);
> +
> +                             Assert(!HeapTupleHeaderXminInvalid(old_toasttup->t_data));
> +
> +                             if (TransactionIdDidAbort(xmin))
> +                                     continue;
> +                     }

Why is there visibility logic in quite random places? Also, it's not "legal"
to call TransactionIdDidAbort() without having checked
TransactionIdIsInProgress() first. And what does this this have to do with
snapshots - it's pretty clearly not implementing snapshot logic.


Greetings,

Andres Freund


--
Regards,
Nikita Malakhov
Postgres Professional 
Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Next
From: Justin Pryzby
Date:
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints