Re: Feature: Add reloption support for table access method - Mailing list pgsql-hackers

From Jelte Fennema
Subject Re: Feature: Add reloption support for table access method
Date
Msg-id CAGECzQRHj820XxYiC224BULnQNO4+MRMaWpM644BGJXkP+va2g@mail.gmail.com
Whole thread Raw
In response to Re: Feature: Add reloption support for table access method  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
I'm definitely in favor of this general idea of supporting custom rel
options, we could use that for Citus its columnar TableAM. There have
been at least two other discussions on this topic:
1. https://www.postgresql.org/message-id/flat/CAFF0-CG4KZHdtYHMsonWiXNzj16gWZpduXAn8yF7pDDub%2BGQMg%40mail.gmail.com
2. https://www.postgresql.org/message-id/flat/429fb58fa3218221bb17c7bf9e70e1aa6cfc6b5d.camel%40j-davis.com

I haven't looked at the patch in detail, but it's probably good to
check what part of those previous discussions apply to it. And if
there's any ideas from those previous attempts that this patch could
borrow.

On Tue, 9 May 2023 at 22:59, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-05-05 16:44:39 +0800, 吴昊 wrote:
> > When I wrote an extension to implement a new storage by table access method. I found some issues
> > that the existing code has strong assumptions for heap tables now. Here are 3 issues that I currently have:
> >
> >
> > 1. Index access method has a callback to handle reloptions, but table access method hasn't. We can't
> > add storage-specific parameters by `WITH` clause when creating a table.
>
> Makes sense to add that.
>
>
> > 2. Existing code strongly assumes that the data file of a table structures by a serial physical files named
> > in a hard coded rule: <relfilenode>[.<segno>]. It may only fit for heap like tables. A new storage may have its
> > owner structure on how the data files are organized. The problem happens when dropping a table.
>
> I agree that it's not great, but I don't think it's particularly easy to fix
> (because things like transactional DDL require fairly tight integration). Most
> of the time it should be possible can also work around the limitations though.
>
>
> > 3. The existing code also assumes that the data file consists of a series of fix-sized block. It may not
> > be desired by other storage. Is there any suggestions on this situation?
>
> That's a requirement of using the buffer manager, but if you don't want to
> rely on that, you can use a different pattern. There's some limitations
> (format of TIDs, most prominently), but you should be able to deal with that.
>
> I don't think it would make sense to support other block sizes in the buffer
> manager.
>
>
> > The rest of this mail is to talk about the first issue. It looks reasonable to add a similar callback in
> > struct TableAmRoutine, and parse reloptions by the callback. This patch is in the attachment file.
>
> Why did you add relkind to the callbacks? The callbacks are specific to a
> certain relkind, so I don't think that makes sense.
>
> I don't think we really need GetTableAmRoutineByAmId() that raises nice
> errors etc - as the AM has already been converted to an oid, we shouldn't need
> to recheck?
>
>
>
> > +bytea *
> > +table_reloptions_am(Oid accessMethodId, Datum reloptions, char relkind, bool validate)
> >  {
> > ...
> >
> > +             /* built-in table access method put here to fetch TAM fast */
> > +             case HEAP_TABLE_AM_OID:
> > +                     tam = GetHeapamTableAmRoutine();
> > +                     break;
> >               default:
> > -                     /* other relkinds are not supported */
> > -                     return NULL;
> > +                     tam = GetTableAmRoutineByAmId(accessMethodId);
> > +                     break;
>
> Why do we need this fastpath? This shouldn't be something called at a
> meaningful frequency?
>
>
> > }
> > +     return table_reloptions(tam->amoptions, reloptions, relkind, validate);
> > }
>
> I'd just pass the tam, instead of an individual function.
>
> > @@ -866,6 +866,11 @@ typedef struct TableAmRoutine
> >                                                                                  struct SampleScanState *scanstate,
> >                                                                                  TupleTableSlot *slot);
> >
> > +     /*
> > +      * This callback is used to parse reloptions for relation/matview/toast.
> > +      */
> > +     bytea       *(*amoptions)(Datum reloptions, char relkind, bool validate);
> > +
> >  } TableAmRoutine;
>
> Did you mean table instead of relation in the comment?
>
>
>
> > Another thing about reloption is that the current API passes a parameter `validate` to tell the parse
> > functioin to check and whether to raise an error. It doesn't have enough context when these reloptioins
> > are used:
> > 1. CREATE TABLE ... WITH(...)
> > 2. ALTER TABLE ... SET ...
> > 3. ALTER TABLE ... RESET ...
> > The reason why the context matters is that some reloptions are disallowed to change after creating
> > the table, while some reloptions are allowed.
>
> What kind of reloption are you thinking of here?
>
> Greetings,
>
> Andres Freund
>
>



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: MERGE lacks ruleutils.c decompiling support!?
Next
From: Peter Eisentraut
Date:
Subject: smgrzeroextend clarification