Re: Sequence Access Methods, round two - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: Sequence Access Methods, round two |
Date | |
Msg-id | Zd06cdyl0sqPrrVC@paquier.xyz Whole thread Raw |
In response to | Re: Sequence Access Methods, round two (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: Sequence Access Methods, round two
|
List | pgsql-hackers |
On Thu, Feb 22, 2024 at 05:36:00PM +0100, Tomas Vondra wrote: > I took a quick look at this patch series, mostly to understand how it > works and how it might interact with the logical decoding patches > discussed in a nearby thread. Thanks. Both discussions are linked. > 0001 > ------ > > I think this bit in pg_proc.dat is not quite right: > > proallargtypes => '{regclass,bool,int8}', proargmodes => '{i,o,o}', > proargnames => '{seqname,is_called,last_value}', > > the first argument should not be "seqname" but rather "seqid". Ah, right. There are not many system functions that use regclass as arguments, but the existing ones refer more to IDs, not names. > 0002, 0003 > ------------ > seems fine, cosmetic changes Applied these ones as 449e798c77ed and 6e951bf98e2e. > 0004 > ------ > > I don't understand this bit in AlterSequence: > > last_value = newdataform->last_value; > is_called = newdataform->is_called; > > UnlockReleaseBuffer(buf); > > /* Check and set new values */ > init_params(pstate, stmt->options, stmt->for_identity, false, > seqform, &last_value, &reset_state, &is_called, > &need_seq_rewrite, &owned_by); > > Why set the values only to pass them to init_params(), which will just > overwrite them anyway? Or do I get this wrong? The values of "last_value" and is_called may not get updated depending on the options given in the ALTER SEQUENCE query, and they need to use as initial state what's been returned from their last heap lookup. > Also, isn't "reset_state" just a different name for (original) log_cnt? Yep. That's quite the point. That's an implementation detail depending on the interface a sequence AM should use, but the main argument behind this change is that log_cnt is a counter to decide when to WAL-log the changes of a relation, but I have noticed that all the paths of init_params() don't care about log_cnt as being a counter at all: we just want to know if the state of a sequence should be reset. Form_pg_sequence_data is a piece that only the in-core "local" sequence AM cares about in this proposal. > 0005 > ------ > > I don't quite understand what "elt" stands for :-( > > stmt->tableElts = NIL; > > Do we need AT_AddColumnToSequence? It seems to work exactly like > AT_AddColumn. OTOH we do have AT_AddColumnToView too ... Yeah, that's just cleaner to use a separate one, to be able to detect the attributes in the DDL deparsing pieces when gathering these pieces with event triggers. At least that's my take once you extract the piece that a sequence AM may need a table AM to store its data with its own set of attributes (a sequence AM may as well not need a local table for its data). > Thinking about this code: > > case T_CreateSeqStmt: > EventTriggerAlterTableStart(parsetree); > address = DefineSequence(pstate, (CreateSeqStmt *) parsetree); > /* stashed internally */ > commandCollected = true; > EventTriggerAlterTableEnd(); > break; > > Does this actually make sense? I mean, are sequences really relations? > Or was that just a side effect of storing the state in a heap table > (which is more of an implementation detail)? This was becoming handy when creating custom attributes for the underlying table used by a sequence. Sequences are already relations (views are also relations), we store them in pg_class. Now sequences can also use tables internally to store their data, like the in-core "local" sequence AM defined in the patch. At least that's the split done in this patch set. > 0007 > ------ > I wonder why heap_create_with_catalog needs to do this (check that it's > a sequence): > > if ((RELKIND_HAS_TABLE_AM(relkind) && relkind != RELKIND_TOASTVALUE) || > relkind == RELKIND_SEQUENCE) > > Presumably this is to handle sequences that use heap to store the state? > Maybe the comment should explain that. Also, will the other table AMs > need to do something similar, just in case some sequence happens to use > that table AM (which seems out of control of the table AM)? Okay, I can see why this part can be confusing with the state of things in v2. In DefineRelation(), heap_create_with_catalog() passes down the OID of the sequence access method when creating a sequence, not the OID of the table AM it may rely on. There's coverage for that in the regression tests if you remove the check, see the "Try to drop and fail on dependency" in create_am.sql. You have a good point here: there could be a dependency between a table AM and a sequence AM that may depend on it. The best way to tackle that would be to add a DEPENDENCY_NORMAL on the amhandler of the table AM when dealing with a sequence amtype in CreateAccessMethod() in this design. Does that make sense? (This may or may not make sense depending on how the design problem related to the relationship between a sequence AM and its optional table AM is tackled, of course, but at least it makes sense to me in the scope of the design of this patch set.) > I don't understand why DefineSequence need to copy the string: > > stmt->accessMethod = seq->accessMethod ? pstrdup(seq->accessMethod) > : NULL; That's required to pass down the correct sequence AM for DefineRelation() when creating the pg_class entry of a sequence. > RelationInitTableAccessMethod now does not need to handle sequences, or > rather should not be asked to handle sequences. Is there a risk we'd > pass a sequence to the function anyway? Maybe an assert / error would be > appropriate? Hmm. The risk sounds legit. This is something where an assertion based on RELKIND_HAS_TABLE_AM() would be useful. Same argument for RelationInitSequenceAccessMethod() with RELKIND_HAS_SEQUENCE_AM() suggested below. I've added these, for now. > This bit in RelationBuildLocalRelation looks a bit weird ... > > if (RELKIND_HAS_TABLE_AM(relkind)) > RelationInitTableAccessMethod(rel); > else if (relkind == RELKIND_SEQUENCE) > RelationInitSequenceAccessMethod(rel); > > It's not a fault of this patch, but shouldn't we now have something like > RELKIND_HAS_SEQUENCE_AM()? Perhaps, I was not sure. This would just be a check on RELKIND_SEQUENCE, but perhaps that's worth having at the end, and this makes the code more symmetric in the relcache, for one. The comment at the top of RELKIND_HAS_TABLE_AM is wrong with 0007 in place anyway. > logical decoding / replication > -------------------------------- > Now, regarding the logical decoding / replication, would introducing the > sequence AM interfere with that in some way? Either in general, or with > respect to the nearby patch. I think it does not. The semantics of the existing in-core "local" sequence AM are not changed. So what's here is just a large refactoring shaped around the current semantics of the existing computation method. Perhaps it should be smarter about some aspects, but that's not something we'll know about until folks start implementing their own custom methods. On my side, being able to plug in a custom callback into nextval_internal() is the main taker. > That is, what would it take to support logical replication of sequences > with some custom sequence AM? I believe that requires (a) synchronizing > the initial value, and (b) decoding the sequence WAL and (c) apply the > decoded changes. I don't think the sequence AM breaks any of this, as > long as it allows selecting "current value", decoding the values from > WAL, sending them to the subscriber, etc. Sure, that may make sense to support, particularly if one uses a sequence AM that uses a computation method that may not be unique across nodes, and where you may want to copy them. I don't think that this is problem for something like the proposal of this thread or what the other thread does, they can tackle separate areas (the logirep patch has a lot of value for rolling upgrades where one uses logical replication to create the new node and somebody does not want to bother with a custom computation). > I guess the decoding would be up to the RMGR, and this patch maintains > the 1:1 mapping of sequences to relfilenodes, right? (That is, CREATE > and ALTER SEQUENCE would still create a new relfilenode, which is rather > important to decide if a sequence change is transactional.) Yeah, one "local" sequence would have one relfilenode. A sequence AM may want something different, like not using shared buffers, or just not use a relfilenode at all. > It seems to me this does not change the non-transactional behavior of > sequences, right? This patch set does nothing about the non-transactional behavior of sequences. That seems out of scope to me from the start of what I have sent here. > alternative sequence AMs > -------------------------- > I understand one of the reasons for adding sequence AMs is to allow > stuff like global/distributed sequences, etc. But will people actually > use that? Good question. I have users who would be happy with that, hiding behind sequences custom computations rather than plug in a bunch of default expressions to various attributes. You can do that today, but this has this limitations depending on how much control one has over their applications (for example this cannot be easily achieved with generated columns in a schema). > For example, I believe Simon originally proposed this in 2016 because > the plan was to implement distributed sequences in BDR on top of it. But > I believe BDR ultimately went with a very different approach, not with > custom sequence AMs. So I'm a bit skeptical this being suitable for > other active-active systems ... Snowflake IDs are popular AFAIK, thanks to the unicity of the values across nodes. > Especially when the general consensus seems to be that for active-active > systems it's much better to use e.g. UUID, because that does not require > any coordination between the nodes, etc. That means being able to support something larger than 64b values as these are 128b. > I'm not claiming there are no use cases for sequence AMs, of course. For > example the PRNG-based sequences mentioned by Mattias seems interesting. > I don't know how widely useful that is, though, and if it's worth it > (considering they managed to implement it in a different way). Right. I bet that they just plugged a default expression to the attributes involved. When it comes to users at a large scale, a sequence AM makes the change more transparent, especially if DDL queries are replication across multiple logical nodes. > But I think it might be a good idea to implement a PoC of such sequence > AM, if only to verify it can be implemented using the proposed code. You mean the PRNG idea or something else? I have a half-baked implementation for snowflake, actually. Would that be enough? I still need to spend more hours on it to polish it. One part I found more difficult than necessary with the patch set of this thread is the APIs used in commands/sequence.c for the buffer manipulations, requiring more duplications. Not impossible to do outside core, but I've wanted more refactoring of the routines used by the "local" sequence AM of this patch. Plugging in a custom data type on top of the existing sequence objects is something entirely different, where we will need a callback separation anyway at the end, IMHO. This seems like a separate topic to me at the end, as custom computations with 64b to store them is enough based on what I've heard even for hundreds of nodes. I may be wrong and may not think big enough, of course. Attaching a v3 set, fixing one conflict, while on it. -- Michael
Attachment
- v3-0001-Switch-pg_sequence_last_value-to-report-a-tuple-a.patch
- v3-0002-Remove-FormData_pg_sequence_data-from-init_params.patch
- v3-0003-Integrate-addition-of-attributes-for-sequences-wi.patch
- v3-0004-Move-code-for-local-sequences-to-own-file.patch
- v3-0005-Sequence-access-methods-backend-support.patch
- v3-0006-Sequence-access-methods-core-documentation.patch
- v3-0007-Sequence-access-methods-dump-restore-support.patch
- v3-0008-dummy_sequence_am-Example-of-sequence-AM.patch
- signature.asc
pgsql-hackers by date: