Thread: Feature: Add reloption support for table access method
Hi pg-hackers,
When I wrote an extension to implement a new storage by table access method. I found some issues
that the existing code has strong assumptions for heap tables now. Here are 3 issues that I currently have:
1. Index access method has a callback to handle reloptions, but table access method hasn't. We can't
add storage-specific parameters by `WITH` clause when creating a table.
2. Existing code strongly assumes that the data file of a table structures by a serial physical files named
in a hard coded rule: <relfilenode>[.<segno>]. It may only fit for heap like tables. A new storage may have its
owner structure on how the data files are organized. The problem happens when dropping a table.
3. The existing code also assumes that the data file consists of a series of fix-sized block. It may not
be desired by other storage. Is there any suggestions on this situation?
The rest of this mail is to talk about the first issue. It looks reasonable to add a similar callback in
struct TableAmRoutine, and parse reloptions by the callback. This patch is in the attachment file.
Another thing about reloption is that the current API passes a parameter `validate` to tell the parse
functioin to check and whether to raise an error. It doesn't have enough context when these reloptioins
are used:
1. CREATE TABLE ... WITH(...)
2. ALTER TABLE ... SET ...
3. ALTER TABLE ... RESET ...
The reason why the context matters is that some reloptions are disallowed to change after creating
the table, while some reloptions are allowed.
I wonder if this change makes sense for you.
The attached patch only supports callback for TAM-specific implementation, not include the change
about usage context.
Regards,
Hao Wu
Attachment
Hi, On 2023-05-05 16:44:39 +0800, 吴昊 wrote: > When I wrote an extension to implement a new storage by table access method. I found some issues > that the existing code has strong assumptions for heap tables now. Here are 3 issues that I currently have: > > > 1. Index access method has a callback to handle reloptions, but table access method hasn't. We can't > add storage-specific parameters by `WITH` clause when creating a table. Makes sense to add that. > 2. Existing code strongly assumes that the data file of a table structures by a serial physical files named > in a hard coded rule: <relfilenode>[.<segno>]. It may only fit for heap like tables. A new storage may have its > owner structure on how the data files are organized. The problem happens when dropping a table. I agree that it's not great, but I don't think it's particularly easy to fix (because things like transactional DDL require fairly tight integration). Most of the time it should be possible can also work around the limitations though. > 3. The existing code also assumes that the data file consists of a series of fix-sized block. It may not > be desired by other storage. Is there any suggestions on this situation? That's a requirement of using the buffer manager, but if you don't want to rely on that, you can use a different pattern. There's some limitations (format of TIDs, most prominently), but you should be able to deal with that. I don't think it would make sense to support other block sizes in the buffer manager. > The rest of this mail is to talk about the first issue. It looks reasonable to add a similar callback in > struct TableAmRoutine, and parse reloptions by the callback. This patch is in the attachment file. Why did you add relkind to the callbacks? The callbacks are specific to a certain relkind, so I don't think that makes sense. I don't think we really need GetTableAmRoutineByAmId() that raises nice errors etc - as the AM has already been converted to an oid, we shouldn't need to recheck? > +bytea * > +table_reloptions_am(Oid accessMethodId, Datum reloptions, char relkind, bool validate) > { > ... > > + /* built-in table access method put here to fetch TAM fast */ > + case HEAP_TABLE_AM_OID: > + tam = GetHeapamTableAmRoutine(); > + break; > default: > - /* other relkinds are not supported */ > - return NULL; > + tam = GetTableAmRoutineByAmId(accessMethodId); > + break; Why do we need this fastpath? This shouldn't be something called at a meaningful frequency? > } > + return table_reloptions(tam->amoptions, reloptions, relkind, validate); > } I'd just pass the tam, instead of an individual function. > @@ -866,6 +866,11 @@ typedef struct TableAmRoutine > struct SampleScanState *scanstate, > TupleTableSlot *slot); > > + /* > + * This callback is used to parse reloptions for relation/matview/toast. > + */ > + bytea *(*amoptions)(Datum reloptions, char relkind, bool validate); > + > } TableAmRoutine; Did you mean table instead of relation in the comment? > Another thing about reloption is that the current API passes a parameter `validate` to tell the parse > functioin to check and whether to raise an error. It doesn't have enough context when these reloptioins > are used: > 1. CREATE TABLE ... WITH(...) > 2. ALTER TABLE ... SET ... > 3. ALTER TABLE ... RESET ... > The reason why the context matters is that some reloptions are disallowed to change after creating > the table, while some reloptions are allowed. What kind of reloption are you thinking of here? Greetings, Andres Freund
> > The rest of this mail is to talk about the first issue. It looks reasonable to add a similar callback in > > struct TableAmRoutine, and parse reloptions by the callback. This patch is in the attachment file. > > Why did you add relkind to the callbacks? The callbacks are specific to a > certain relkind, so I don't think that makes sense.
An implementation of table access method may be used for table/toast/matview, different relkinds
may define different set of reloptions. If they have the same reloption set, just ignore the relkind
parameter.
> I don't think we really need GetTableAmRoutineByAmId() that raises nice > errors etc - as the AM has already been converted to an oid, we shouldn't need > to recheck?
When defining a relation, the function knows only the access method name, not the AM routine struct.
The AmRoutine needs to be looked-up by its access method name or oid. The existing function
calculates AmRoutine by the handler oid, not by am oid. > > +bytea * > > +table_reloptions_am(Oid accessMethodId, Datum reloptions, char relkind, bool validate) > > { > > ... > > > > + /* built-in table access method put here to fetch TAM fast */ > > + case HEAP_TABLE_AM_OID: > > + tam = GetHeapamTableAmRoutine(); > > + break; > > default: > > - /* other relkinds are not supported */ > > - return NULL; > > + tam = GetTableAmRoutineByAmId(accessMethodId); > > + break; > Why do we need this fastpath? This shouldn't be something called at a > meaningful frequency?
OK, it make sense. > > } > > + return table_reloptions(tam->amoptions, reloptions, relkind, validate); > > } > > I'd just pass the tam, instead of an individual function.
It's aligned to index_reloptions, and the function extractRelOptions also uses
an individual function other a pointer to AmRoutine struct.
> Did you mean table instead of relation in the comment?
Yes, the comment doesn't update.
> > Another thing about reloption is that the current API passes a parameter `validate` to tell the parse > > functioin to check and whether to raise an error. It doesn't have enough context when these reloptioins > > are used: > > 1. CREATE TABLE ... WITH(...) > > 2. ALTER TABLE ... SET ... > > 3. ALTER TABLE ... RESET ... > > The reason why the context matters is that some reloptions are disallowed to change after creating > > the table, while some reloptions are allowed. > > What kind of reloption are you thinking of here?
DRAFT: The amoptions in TableAmRoutine may change to
```
bytea *(*amoptions)(Datum reloptions, char relkind, ReloptionContext context);
enum ReloptionContext {
RELOPTION_INIT, // CREATE TABLE ... WITH(...)
RELOPTION_SET, // ALTER TABLE ... SET ...
RELOPTION_RESET, // ALTER TABLE ... RESET ...
RELOPTION_EXTRACT, // build reloptions from pg_class.reloptions
}
```
The callback always validates the reloptions if the context is not RELOPTION_EXTRACT.
If the TAM disallows to update some reloptions, it may throw an error when the context is
one of (RELOPTION_SET, RELOPTION_RESET).
The similar callback `amoptions` in IndexRoutine also applies this change.
BTW, it's hard to find an appropriate header file to define the ReloptionContext, which is
used by index/table AM.
Regards,
Hao Wu
I'm definitely in favor of this general idea of supporting custom rel options, we could use that for Citus its columnar TableAM. There have been at least two other discussions on this topic: 1. https://www.postgresql.org/message-id/flat/CAFF0-CG4KZHdtYHMsonWiXNzj16gWZpduXAn8yF7pDDub%2BGQMg%40mail.gmail.com 2. https://www.postgresql.org/message-id/flat/429fb58fa3218221bb17c7bf9e70e1aa6cfc6b5d.camel%40j-davis.com I haven't looked at the patch in detail, but it's probably good to check what part of those previous discussions apply to it. And if there's any ideas from those previous attempts that this patch could borrow. On Tue, 9 May 2023 at 22:59, Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2023-05-05 16:44:39 +0800, 吴昊 wrote: > > When I wrote an extension to implement a new storage by table access method. I found some issues > > that the existing code has strong assumptions for heap tables now. Here are 3 issues that I currently have: > > > > > > 1. Index access method has a callback to handle reloptions, but table access method hasn't. We can't > > add storage-specific parameters by `WITH` clause when creating a table. > > Makes sense to add that. > > > > 2. Existing code strongly assumes that the data file of a table structures by a serial physical files named > > in a hard coded rule: <relfilenode>[.<segno>]. It may only fit for heap like tables. A new storage may have its > > owner structure on how the data files are organized. The problem happens when dropping a table. > > I agree that it's not great, but I don't think it's particularly easy to fix > (because things like transactional DDL require fairly tight integration). Most > of the time it should be possible can also work around the limitations though. > > > > 3. The existing code also assumes that the data file consists of a series of fix-sized block. It may not > > be desired by other storage. Is there any suggestions on this situation? > > That's a requirement of using the buffer manager, but if you don't want to > rely on that, you can use a different pattern. There's some limitations > (format of TIDs, most prominently), but you should be able to deal with that. > > I don't think it would make sense to support other block sizes in the buffer > manager. > > > > The rest of this mail is to talk about the first issue. It looks reasonable to add a similar callback in > > struct TableAmRoutine, and parse reloptions by the callback. This patch is in the attachment file. > > Why did you add relkind to the callbacks? The callbacks are specific to a > certain relkind, so I don't think that makes sense. > > I don't think we really need GetTableAmRoutineByAmId() that raises nice > errors etc - as the AM has already been converted to an oid, we shouldn't need > to recheck? > > > > > +bytea * > > +table_reloptions_am(Oid accessMethodId, Datum reloptions, char relkind, bool validate) > > { > > ... > > > > + /* built-in table access method put here to fetch TAM fast */ > > + case HEAP_TABLE_AM_OID: > > + tam = GetHeapamTableAmRoutine(); > > + break; > > default: > > - /* other relkinds are not supported */ > > - return NULL; > > + tam = GetTableAmRoutineByAmId(accessMethodId); > > + break; > > Why do we need this fastpath? This shouldn't be something called at a > meaningful frequency? > > > > } > > + return table_reloptions(tam->amoptions, reloptions, relkind, validate); > > } > > I'd just pass the tam, instead of an individual function. > > > @@ -866,6 +866,11 @@ typedef struct TableAmRoutine > > struct SampleScanState *scanstate, > > TupleTableSlot *slot); > > > > + /* > > + * This callback is used to parse reloptions for relation/matview/toast. > > + */ > > + bytea *(*amoptions)(Datum reloptions, char relkind, bool validate); > > + > > } TableAmRoutine; > > Did you mean table instead of relation in the comment? > > > > > Another thing about reloption is that the current API passes a parameter `validate` to tell the parse > > functioin to check and whether to raise an error. It doesn't have enough context when these reloptioins > > are used: > > 1. CREATE TABLE ... WITH(...) > > 2. ALTER TABLE ... SET ... > > 3. ALTER TABLE ... RESET ... > > The reason why the context matters is that some reloptions are disallowed to change after creating > > the table, while some reloptions are allowed. > > What kind of reloption are you thinking of here? > > Greetings, > > Andres Freund > >