Re: logical replication worker accesses catalogs in error context callback - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: logical replication worker accesses catalogs in error context callback
Date
Msg-id CALj2ACWRUYJDCsAA-yRJuROYwEo_tRSsGpGcNcFA=eyiTvBjnQ@mail.gmail.com
Whole thread Raw
In response to Re: logical replication worker accesses catalogs in error context callback  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, Feb 4, 2021 at 5:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Thanks Amit. I verified it with gdb. I attached gdb to the logical
> > replication worker. In slot_store_data's for loop, I intentionally set
> > CurrentTransactionState->state = TRANS_DEFAULT,
> >
>
> What happens if you don't change CurrentTransactionState->state?

Just a refresher -  the problem we are trying to solve with the 2
patches proposed in this thread [1] is that the error callbacks
shouldn't be accessing system catalogues because the txn might be
broken by then or another error can be raised in system catalogue
access paths (if we have a cache miss and access the catalogue table
from the disk). So the required information that's needed in the error
callback is to be filled in the function that register the callback.

Now, coming to testing the patches: I tried to use the error callback
when we are outside of the xact. For 0001patch i.e.
slot_store_error_callback, I called it after the
apply_handle_commit_internal in apply_handle_commit. The expectation
is that the patches proposed in this thread [1], should just be usable
even outside of the xact.

HEAD: The logical replication worker crashes at
Assert(IsTransactionState() in SearchCatCacheInternal. See [2] for the
testing patch that's used.

Patched: The testing DEBUG message gets printed in the subscriber
server log and no crash happens. See [3] for the testing patch that's
used.
2021-02-05 12:02:32.340 IST [2361724] DEBUG:  calling slot_store_error_callback
2021-02-05 12:02:32.340 IST [2361724] CONTEXT:  processing remote data
for replication target relation "public.t1" column "a1", remote type
remote_test_type, local type local_test_type

Similarly, the 0002 patch in [1] which is avoiding system catalogues
in conversion_error_callback postgres_fdw can also be tested.

Hope this testing is enough for the patches. Please let me know if
anything more is required.

[1] - https://www.postgresql.org/message-id/CALj2ACUkvABzq6yeKQZsgng5Rd_NF%2BikhTvL9k4NR0GSRKsSdg%40mail.gmail.com
[2] -
@@ -713,6 +717,26 @@ apply_handle_commit(StringInfo s)
      apply_handle_commit_internal(s, &commit_data);
 +    if (rel_for_err_cb_chk)
+    {
+        /* We are out of xact. */
+        Assert(!IsTransactionState());
+
+        /* Push callback + info on the error context stack */
+        errarg.rel = rel_for_err_cb_chk;
+        errarg.local_attnum = 0;
+        errarg.remote_attnum = 0;
+        errcallback.callback = slot_store_error_callback;
+        errcallback.arg = (void *) &errarg;
+        errcallback.previous = error_context_stack;
+        error_context_stack = &errcallback;
+
+        elog(DEBUG1, "calling slot_store_error_callback");
+
+        /* Pop the error context stack */
+        error_context_stack = errcallback.previous;
+        rel_for_err_cb_chk = NULL;
+    }
[3]
 @@ -719,6 +724,26 @@ apply_handle_commit(StringInfo s)
      apply_handle_commit_internal(s, &commit_data);
 +    if (rel_for_err_cb_chk)
+    {
+        /* We are out of xact. */
+        Assert(!IsTransactionState());
+        errarg.rel = rel_for_err_cb_chk;
+        errarg.remote_attnum = 0;
+        errarg.local_typename = "local_test_type";
+        errarg.remote_typename = "remote_test_type";
+        errcallback.callback = slot_store_error_callback;
+        errcallback.arg = (void *) &errarg;
+        errcallback.previous = error_context_stack;
+        error_context_stack = &errcallback;
+
+        elog(DEBUG1, "calling slot_store_error_callback");
+
+        /* Pop the error context stack */
+        error_context_stack = errcallback.previous;
+        rel_for_err_cb_chk = NULL;
+    }

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Support for NSS as a libpq TLS backend
Next
From: Amit Kapila
Date:
Subject: Re: pg_replication_origin_drop API potential race condition