Re: Sequence Access Methods, round two - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Sequence Access Methods, round two |
Date | |
Msg-id | ef7114ee-e629-4132-b92d-d9104e2a40d7@enterprisedb.com Whole thread Raw |
In response to | Re: Sequence Access Methods, round two (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: Sequence Access Methods, round two
Re: Sequence Access Methods, round two |
List | pgsql-hackers |
Hi Michael, I took a quick look at this patch series, mostly to understand how it works and how it might interact with the logical decoding patches discussed in a nearby thread. First, some general review comments: 0001 ------ I think this bit in pg_proc.dat is not quite right: proallargtypes => '{regclass,bool,int8}', proargmodes => '{i,o,o}', proargnames => '{seqname,is_called,last_value}', the first argument should not be "seqname" but rather "seqid". 0002, 0003 ------------ seems fine, cosmetic changes 0004 ------ I don't understand this bit in AlterSequence: last_value = newdataform->last_value; is_called = newdataform->is_called; UnlockReleaseBuffer(buf); /* Check and set new values */ init_params(pstate, stmt->options, stmt->for_identity, false, seqform, &last_value, &reset_state, &is_called, &need_seq_rewrite, &owned_by); Why set the values only to pass them to init_params(), which will just overwrite them anyway? Or do I get this wrong? Also, isn't "reset_state" just a different name for (original) log_cnt? 0005 ------ I don't quite understand what "elt" stands for :-( stmt->tableElts = NIL; Do we need AT_AddColumnToSequence? It seems to work exactly like AT_AddColumn. OTOH we do have AT_AddColumnToView too ... Thinking about this code: case T_CreateSeqStmt: EventTriggerAlterTableStart(parsetree); address = DefineSequence(pstate, (CreateSeqStmt *) parsetree); /* stashed internally */ commandCollected = true; EventTriggerAlterTableEnd(); break; Does this actually make sense? I mean, are sequences really relations? Or was that just a side effect of storing the state in a heap table (which is more of an implementation detail)? 0006 ------ no comment, just moving code 0007 ------ I wonder why heap_create_with_catalog needs to do this (check that it's a sequence): if ((RELKIND_HAS_TABLE_AM(relkind) && relkind != RELKIND_TOASTVALUE) || relkind == RELKIND_SEQUENCE) Presumably this is to handle sequences that use heap to store the state? Maybe the comment should explain that. Also, will the other table AMs need to do something similar, just in case some sequence happens to use that table AM (which seems out of control of the table AM)? I don't understand why DefineSequence need to copy the string: stmt->accessMethod = seq->accessMethod ? pstrdup(seq->accessMethod) : NULL; RelationInitTableAccessMethod now does not need to handle sequences, or rather should not be asked to handle sequences. Is there a risk we'd pass a sequence to the function anyway? Maybe an assert / error would be appropriate? This bit in RelationBuildLocalRelation looks a bit weird ... if (RELKIND_HAS_TABLE_AM(relkind)) RelationInitTableAccessMethod(rel); else if (relkind == RELKIND_SEQUENCE) RelationInitSequenceAccessMethod(rel); It's not a fault of this patch, but shouldn't we now have something like RELKIND_HAS_SEQUENCE_AM()? 0008-0010 ----------- no comment logical decoding / replication -------------------------------- Now, regarding the logical decoding / replication, would introducing the sequence AM interfere with that in some way? Either in general, or with respect to the nearby patch. That is, what would it take to support logical replication of sequences with some custom sequence AM? I believe that requires (a) synchronizing the initial value, and (b) decoding the sequence WAL and (c) apply the decoded changes. I don't think the sequence AM breaks any of this, as long as it allows selecting "current value", decoding the values from WAL, sending them to the subscriber, etc. I guess the decoding would be up to the RMGR, and this patch maintains the 1:1 mapping of sequences to relfilenodes, right? (That is, CREATE and ALTER SEQUENCE would still create a new relfilenode, which is rather important to decide if a sequence change is transactional.) It seems to me this does not change the non-transactional behavior of sequences, right? alternative sequence AMs -------------------------- I understand one of the reasons for adding sequence AMs is to allow stuff like global/distributed sequences, etc. But will people actually use that? For example, I believe Simon originally proposed this in 2016 because the plan was to implement distributed sequences in BDR on top of it. But I believe BDR ultimately went with a very different approach, not with custom sequence AMs. So I'm a bit skeptical this being suitable for other active-active systems ... Especially when the general consensus seems to be that for active-active systems it's much better to use e.g. UUID, because that does not require any coordination between the nodes, etc. I'm not claiming there are no use cases for sequence AMs, of course. For example the PRNG-based sequences mentioned by Mattias seems interesting. I don't know how widely useful that is, though, and if it's worth it (considering they managed to implement it in a different way). But I think it might be a good idea to implement a PoC of such sequence AM, if only to verify it can be implemented using the proposed code. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: