Re: logical decoding and replication of sequences - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: logical decoding and replication of sequences
Date
Msg-id c0d61b5f-c122-a057-8063-b7303fb8bdbd@enterprisedb.com
Whole thread Raw
In response to Re: logical decoding and replication of sequences  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: logical decoding and replication of sequences
List pgsql-hackers
On 3/25/22 12:21, Amit Kapila wrote:
> On Fri, Mar 25, 2022 at 3:56 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>>
>> On 3/25/22 05:01, Amit Kapila wrote:
>>> On Fri, Mar 25, 2022 at 3:29 AM Tomas Vondra
>>> <tomas.vondra@enterprisedb.com> wrote:
>>>>
>>>> Pushed.
>>>>
>>>
>>> Some of the comments given by me [1] don't seem to be addressed or
>>> responded to. Let me try to say again for the ease of discussion:
>>>
>>
>> D'oh! I got distracted by Petr's response to that message, and missed
>> this part ...
>>
>>> * Don't we need some syncing mechanism between apply worker and
>>> sequence sync worker so that apply worker skips the sequence changes
>>> till the sync worker is finished, otherwise, there is a risk of one
>>> overriding the values of the other? See how we take care of this for a
>>> table in should_apply_changes_for_rel() and its callers. If we don't
>>> do this for sequences for some reason then probably a comment
>>> somewhere is required.
>>>
>>
>> How would that happen? If we're effectively setting the sequence as a
>> side effect of inserting the data, then why should we even replicate the
>> sequence?
>>
> 
> I was talking just about sequence values here, considering that some
> sequence is just replicating based on nextval. I think the problem is
> that apply worker might override what copy has done if an apply worker
> is behind the sequence sync worker as both can run in parallel. Let me
> try to take some theoretical example to explain this:
> 
> Assume, at LSN 10000, the value of sequence s1 is 10. Then by LSN
> 12000, the value of s1 becomes 20. Now, say copy decides to copy the
> sequence value till LSN 12000 which means it will make the value as 20
> on the subscriber, now, in parallel, apply worker can process LSN
> 10000 and make it again 10. Apply worker might end up redoing all
> sequence operations along with some transactional ones where we
> recreate the file. I am not sure what exact problem it can lead to but
> I think we don't need to redo the work.
> 
>  We'll have the problem later too, no?
>

Ah, I was confused why this would be an issue for sequences and not for
plain tables, but now I realize apply_handle_sequence() is not called in
apply_handle_sequence. Yes, that's probably a thinko.


>>> * Don't we need explicit privilege checking before applying sequence
>>> data as we do in commit a2ab9c06ea15fbcb2bfde570986a06b37f52bcca for
>>> tables?
>>>
>>
>> So essentially something like TargetPrivilegesCheck in the worker?
>>
> 
> Right.
> 

OK, will do.

> Few more comments:
> ==================
> 1.
> @@ -636,7 +704,7 @@ CreatePublication(ParseState *pstate,
> CreatePublicationStmt *stmt)
>                        get_database_name(MyDatabaseId));
> 
>     /* FOR ALL TABLES requires superuser */
> -   if (stmt->for_all_tables && !superuser())
> +   if (for_all_tables && !superuser())
>         ereport(ERROR,
>                 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
>                  errmsg("must be superuser to create FOR ALL TABLES
> publication")));
> 
> Don't we need a similar check for 'for_all_schema' publications?
> 

I think you mean "for_all_sequences", right?

> 2.
> +<varlistentry>
> +<term>
> +        Int8
> +</term>
> +<listitem>
> +<para>
> +                1 if the sequence update is transactions, 0 otherwise.
> 
> Shall we say transactional instead of transactions?
> 
> 3.
> +/*
> + * Determine object type given the object type set for a schema.
> + */
> +char
> +pub_get_object_type_for_relkind(char relkind)
> 
> Shouldn't it be 'relation' instead of 'schema' at the end of the sentence?
> 
> 4.
> @@ -1739,13 +1804,13 @@ get_rel_sync_entry(PGOutputData *data,
> Relation relation)
>     {
>         Oid         schemaId = get_rel_namespace(relid);
>         List       *pubids = GetRelationPublications(relid);
> -
> +       char        objectType =
> pub_get_object_type_for_relkind(get_rel_relkind(relid));
> 
> A few lines after this we are again getting relkind which is not a big
> deal but OTOH there doesn't seem to be a need to fetch the same thing
> twice from the cache.
> 
> 5.
> +
> +   /* Check that user is allowed to manipulate the publication tables. */
> +   if (sequences && pubform->puballsequences)
> 
> /tables/sequences
> 
> 6.
> +apply_handle_sequence(StringInfo s)
> {
> ...
> +
> +   relid = RangeVarGetRelid(makeRangeVar(seq.nspname,
> +                                         seq.seqname, -1),
> +                            RowExclusiveLock, false);
> ...
> }
> 
> As here, we are using missing_ok, if the sequence doesn't exist, it
> will give a message like: "ERROR:  relation "public.s1" does not
> exist" whereas for tables we give a slightly more clear message like:
> "ERROR:  logical replication target relation "public.t1" does not
> exist". This is handled via logicalrep_rel_open().
> 

Thanks, I'll look at rewording these comments and messages.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: logical decoding and replication of sequences
Next
From: "Daniel Verite"
Date:
Subject: Re: Add header support to text format and matching feature