Re: logical decoding and replication of sequences - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: logical decoding and replication of sequences |
Date | |
Msg-id | c0d61b5f-c122-a057-8063-b7303fb8bdbd@enterprisedb.com Whole thread Raw |
In response to | Re: logical decoding and replication of sequences (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: logical decoding and replication of sequences
|
List | pgsql-hackers |
On 3/25/22 12:21, Amit Kapila wrote: > On Fri, Mar 25, 2022 at 3:56 PM Tomas Vondra > <tomas.vondra@enterprisedb.com> wrote: >> >> >> On 3/25/22 05:01, Amit Kapila wrote: >>> On Fri, Mar 25, 2022 at 3:29 AM Tomas Vondra >>> <tomas.vondra@enterprisedb.com> wrote: >>>> >>>> Pushed. >>>> >>> >>> Some of the comments given by me [1] don't seem to be addressed or >>> responded to. Let me try to say again for the ease of discussion: >>> >> >> D'oh! I got distracted by Petr's response to that message, and missed >> this part ... >> >>> * Don't we need some syncing mechanism between apply worker and >>> sequence sync worker so that apply worker skips the sequence changes >>> till the sync worker is finished, otherwise, there is a risk of one >>> overriding the values of the other? See how we take care of this for a >>> table in should_apply_changes_for_rel() and its callers. If we don't >>> do this for sequences for some reason then probably a comment >>> somewhere is required. >>> >> >> How would that happen? If we're effectively setting the sequence as a >> side effect of inserting the data, then why should we even replicate the >> sequence? >> > > I was talking just about sequence values here, considering that some > sequence is just replicating based on nextval. I think the problem is > that apply worker might override what copy has done if an apply worker > is behind the sequence sync worker as both can run in parallel. Let me > try to take some theoretical example to explain this: > > Assume, at LSN 10000, the value of sequence s1 is 10. Then by LSN > 12000, the value of s1 becomes 20. Now, say copy decides to copy the > sequence value till LSN 12000 which means it will make the value as 20 > on the subscriber, now, in parallel, apply worker can process LSN > 10000 and make it again 10. Apply worker might end up redoing all > sequence operations along with some transactional ones where we > recreate the file. I am not sure what exact problem it can lead to but > I think we don't need to redo the work. > > We'll have the problem later too, no? > Ah, I was confused why this would be an issue for sequences and not for plain tables, but now I realize apply_handle_sequence() is not called in apply_handle_sequence. Yes, that's probably a thinko. >>> * Don't we need explicit privilege checking before applying sequence >>> data as we do in commit a2ab9c06ea15fbcb2bfde570986a06b37f52bcca for >>> tables? >>> >> >> So essentially something like TargetPrivilegesCheck in the worker? >> > > Right. > OK, will do. > Few more comments: > ================== > 1. > @@ -636,7 +704,7 @@ CreatePublication(ParseState *pstate, > CreatePublicationStmt *stmt) > get_database_name(MyDatabaseId)); > > /* FOR ALL TABLES requires superuser */ > - if (stmt->for_all_tables && !superuser()) > + if (for_all_tables && !superuser()) > ereport(ERROR, > (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > errmsg("must be superuser to create FOR ALL TABLES > publication"))); > > Don't we need a similar check for 'for_all_schema' publications? > I think you mean "for_all_sequences", right? > 2. > +<varlistentry> > +<term> > + Int8 > +</term> > +<listitem> > +<para> > + 1 if the sequence update is transactions, 0 otherwise. > > Shall we say transactional instead of transactions? > > 3. > +/* > + * Determine object type given the object type set for a schema. > + */ > +char > +pub_get_object_type_for_relkind(char relkind) > > Shouldn't it be 'relation' instead of 'schema' at the end of the sentence? > > 4. > @@ -1739,13 +1804,13 @@ get_rel_sync_entry(PGOutputData *data, > Relation relation) > { > Oid schemaId = get_rel_namespace(relid); > List *pubids = GetRelationPublications(relid); > - > + char objectType = > pub_get_object_type_for_relkind(get_rel_relkind(relid)); > > A few lines after this we are again getting relkind which is not a big > deal but OTOH there doesn't seem to be a need to fetch the same thing > twice from the cache. > > 5. > + > + /* Check that user is allowed to manipulate the publication tables. */ > + if (sequences && pubform->puballsequences) > > /tables/sequences > > 6. > +apply_handle_sequence(StringInfo s) > { > ... > + > + relid = RangeVarGetRelid(makeRangeVar(seq.nspname, > + seq.seqname, -1), > + RowExclusiveLock, false); > ... > } > > As here, we are using missing_ok, if the sequence doesn't exist, it > will give a message like: "ERROR: relation "public.s1" does not > exist" whereas for tables we give a slightly more clear message like: > "ERROR: logical replication target relation "public.t1" does not > exist". This is handled via logicalrep_rel_open(). > Thanks, I'll look at rewording these comments and messages. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: