Re: logical decoding and replication of sequences - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: logical decoding and replication of sequences |
Date | |
Msg-id | CAA4eK1KG2hj-=5y6Ufc67q8O2P=OgRA584DahK7Sz6UuQvf0qA@mail.gmail.com Whole thread Raw |
In response to | Re: logical decoding and replication of sequences (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: logical decoding and replication of sequences
Re: logical decoding and replication of sequences |
List | pgsql-hackers |
On Fri, Mar 25, 2022 at 3:56 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > > On 3/25/22 05:01, Amit Kapila wrote: > > On Fri, Mar 25, 2022 at 3:29 AM Tomas Vondra > > <tomas.vondra@enterprisedb.com> wrote: > >> > >> Pushed. > >> > > > > Some of the comments given by me [1] don't seem to be addressed or > > responded to. Let me try to say again for the ease of discussion: > > > > D'oh! I got distracted by Petr's response to that message, and missed > this part ... > > > * Don't we need some syncing mechanism between apply worker and > > sequence sync worker so that apply worker skips the sequence changes > > till the sync worker is finished, otherwise, there is a risk of one > > overriding the values of the other? See how we take care of this for a > > table in should_apply_changes_for_rel() and its callers. If we don't > > do this for sequences for some reason then probably a comment > > somewhere is required. > > > > How would that happen? If we're effectively setting the sequence as a > side effect of inserting the data, then why should we even replicate the > sequence? > I was talking just about sequence values here, considering that some sequence is just replicating based on nextval. I think the problem is that apply worker might override what copy has done if an apply worker is behind the sequence sync worker as both can run in parallel. Let me try to take some theoretical example to explain this: Assume, at LSN 10000, the value of sequence s1 is 10. Then by LSN 12000, the value of s1 becomes 20. Now, say copy decides to copy the sequence value till LSN 12000 which means it will make the value as 20 on the subscriber, now, in parallel, apply worker can process LSN 10000 and make it again 10. Apply worker might end up redoing all sequence operations along with some transactional ones where we recreate the file. I am not sure what exact problem it can lead to but I think we don't need to redo the work. We'll have the problem later too, no? > > > * Don't we need explicit privilege checking before applying sequence > > data as we do in commit a2ab9c06ea15fbcb2bfde570986a06b37f52bcca for > > tables? > > > > So essentially something like TargetPrivilegesCheck in the worker? > Right. Few more comments: ================== 1. @@ -636,7 +704,7 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt) get_database_name(MyDatabaseId)); /* FOR ALL TABLES requires superuser */ - if (stmt->for_all_tables && !superuser()) + if (for_all_tables && !superuser()) ereport(ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), errmsg("must be superuser to create FOR ALL TABLES publication"))); Don't we need a similar check for 'for_all_schema' publications? 2. +<varlistentry> +<term> + Int8 +</term> +<listitem> +<para> + 1 if the sequence update is transactions, 0 otherwise. Shall we say transactional instead of transactions? 3. +/* + * Determine object type given the object type set for a schema. + */ +char +pub_get_object_type_for_relkind(char relkind) Shouldn't it be 'relation' instead of 'schema' at the end of the sentence? 4. @@ -1739,13 +1804,13 @@ get_rel_sync_entry(PGOutputData *data, Relation relation) { Oid schemaId = get_rel_namespace(relid); List *pubids = GetRelationPublications(relid); - + char objectType = pub_get_object_type_for_relkind(get_rel_relkind(relid)); A few lines after this we are again getting relkind which is not a big deal but OTOH there doesn't seem to be a need to fetch the same thing twice from the cache. 5. + + /* Check that user is allowed to manipulate the publication tables. */ + if (sequences && pubform->puballsequences) /tables/sequences 6. +apply_handle_sequence(StringInfo s) { ... + + relid = RangeVarGetRelid(makeRangeVar(seq.nspname, + seq.seqname, -1), + RowExclusiveLock, false); ... } As here, we are using missing_ok, if the sequence doesn't exist, it will give a message like: "ERROR: relation "public.s1" does not exist" whereas for tables we give a slightly more clear message like: "ERROR: logical replication target relation "public.t1" does not exist". This is handled via logicalrep_rel_open(). -- With Regards, Amit Kapila.
pgsql-hackers by date: