Home > mailing lists

Re: logical decoding and replication of sequences - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: logical decoding and replication of sequences
Date	March 25, 2022 14:21:12
Msg-id	CAA4eK1KG2hj-=5y6Ufc67q8O2P=OgRA584DahK7Sz6UuQvf0qA@mail.gmail.com Whole thread Raw
In response to	Re: logical decoding and replication of sequences (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses	Re: logical decoding and replication of sequences Re: logical decoding and replication of sequences
List	pgsql-hackers

Tree view

On Fri, Mar 25, 2022 at 3:56 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
>
> On 3/25/22 05:01, Amit Kapila wrote:
> > On Fri, Mar 25, 2022 at 3:29 AM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >>
> >> Pushed.
> >>
> >
> > Some of the comments given by me [1] don't seem to be addressed or
> > responded to. Let me try to say again for the ease of discussion:
> >
>
> D'oh! I got distracted by Petr's response to that message, and missed
> this part ...
>
> > * Don't we need some syncing mechanism between apply worker and
> > sequence sync worker so that apply worker skips the sequence changes
> > till the sync worker is finished, otherwise, there is a risk of one
> > overriding the values of the other? See how we take care of this for a
> > table in should_apply_changes_for_rel() and its callers. If we don't
> > do this for sequences for some reason then probably a comment
> > somewhere is required.
> >
>
> How would that happen? If we're effectively setting the sequence as a
> side effect of inserting the data, then why should we even replicate the
> sequence?
>

I was talking just about sequence values here, considering that some
sequence is just replicating based on nextval. I think the problem is
that apply worker might override what copy has done if an apply worker
is behind the sequence sync worker as both can run in parallel. Let me
try to take some theoretical example to explain this:

Assume, at LSN 10000, the value of sequence s1 is 10. Then by LSN
12000, the value of s1 becomes 20. Now, say copy decides to copy the
sequence value till LSN 12000 which means it will make the value as 20
on the subscriber, now, in parallel, apply worker can process LSN
10000 and make it again 10. Apply worker might end up redoing all
sequence operations along with some transactional ones where we
recreate the file. I am not sure what exact problem it can lead to but
I think we don't need to redo the work.

 We'll have the problem later too, no?
>
> > * Don't we need explicit privilege checking before applying sequence
> > data as we do in commit a2ab9c06ea15fbcb2bfde570986a06b37f52bcca for
> > tables?
> >
>
> So essentially something like TargetPrivilegesCheck in the worker?
>

Right.

Few more comments:
==================
1.
@@ -636,7 +704,7 @@ CreatePublication(ParseState *pstate,
CreatePublicationStmt *stmt)
                       get_database_name(MyDatabaseId));

    /* FOR ALL TABLES requires superuser */
-   if (stmt->for_all_tables && !superuser())
+   if (for_all_tables && !superuser())
        ereport(ERROR,
                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                 errmsg("must be superuser to create FOR ALL TABLES
publication")));

Don't we need a similar check for 'for_all_schema' publications?

2.
+<varlistentry>
+<term>
+        Int8
+</term>
+<listitem>
+<para>
+                1 if the sequence update is transactions, 0 otherwise.

Shall we say transactional instead of transactions?

3.
+/*
+ * Determine object type given the object type set for a schema.
+ */
+char
+pub_get_object_type_for_relkind(char relkind)

Shouldn't it be 'relation' instead of 'schema' at the end of the sentence?

4.
@@ -1739,13 +1804,13 @@ get_rel_sync_entry(PGOutputData *data,
Relation relation)
    {
        Oid         schemaId = get_rel_namespace(relid);
        List       *pubids = GetRelationPublications(relid);
-
+       char        objectType =
pub_get_object_type_for_relkind(get_rel_relkind(relid));

A few lines after this we are again getting relkind which is not a big
deal but OTOH there doesn't seem to be a need to fetch the same thing
twice from the cache.

5.
+
+   /* Check that user is allowed to manipulate the publication tables. */
+   if (sequences && pubform->puballsequences)

/tables/sequences

6.
+apply_handle_sequence(StringInfo s)
{
...
+
+   relid = RangeVarGetRelid(makeRangeVar(seq.nspname,
+                                         seq.seqname, -1),
+                            RowExclusiveLock, false);
...
}

As here, we are using missing_ok, if the sequence doesn't exist, it
will give a message like: "ERROR:  relation "public.s1" does not
exist" whereas for tables we give a slightly more clear message like:
"ERROR:  logical replication target relation "public.t1" does not
exist". This is handled via logicalrep_rel_open().

-- 
With Regards,
Amit Kapila.

pgsql-hackers by date:

From: Michael Paquier
Date: 25 March 2022, 14:18:31
Subject: Re: Allow file inclusion in pg_hba and pg_ident files

From: Tomas Vondra
Date: 25 March 2022, 14:59:42
Subject: Re: logical decoding and replication of sequences

Re: logical decoding and replication of sequences - Mailing list pgsql-hackers

Previous

Next