Re: [BUG] Logical replica crash if there was an error in a function. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [BUG] Logical replica crash if there was an error in a function.
Date
Msg-id 1828454.1667412148@sss.pgh.pa.us
Whole thread Raw
In response to Re: [BUG] Logical replica crash if there was an error in a function.  (Michael Paquier <michael@paquier.xyz>)
Responses Re: [BUG] Logical replica crash if there was an error in a function.
List pgsql-hackers
Michael Paquier <michael@paquier.xyz> writes:
> Yeah, the query string is not available in this context, but it surely
> looks wrong to me to assume that something as low-level as
> function_parse_error_transpose() needs to be updated for the sake of a
> logical worker, while we have other areas that would expect a portal
> to be set up.

After thinking about this some more, I'm withdrawing my opposition to
fixing it by making function_parse_error_transpose() cope with not
having an active portal.  I have a few reasons:

* A Portal is intended to contain an executor state.  While worker.c
does fake up an EState, there's certainly no plan tree or planstate tree,
and I doubt it'd be sane to create dummy ones.  So even if we made a
Portal, it'd be lacking a lot of the stuff one would expect to find there.
I fear that moving the cause of this sort of problem from "there's no
ActivePortal" to "there's an ActivePortal but it lacks field X" wouldn't
be an improvement.

* There is actually just one other consumer of ActivePortal,
namely EnsurePortalSnapshotExists, and that doesn't offer a lot of
support for the idea that ActivePortal must always be set.  It says

     * Nothing to do if a snapshot is set.  (We take it on faith that the
     * outermost active snapshot belongs to some Portal; or if there is no
     * Portal, it's somebody else's responsibility to manage things.)

and "it's somebody else's responsibility" summarizes the situation
here pretty perfectly.  worker.c *does* set up an active snapshot.

* The comment in function_parse_error_transpose() freely admits that
looking at the ActivePortal is a hack.  It works, more or less, for
the intended case of reporting a function-body syntax error nicely
during CREATE FUNCTION.  But it's capable of getting false-positive
matches, so maybe someday we should replace it with something more
bulletproof.

* There isn't any strong reason why function_parse_error_transpose()
has to succeed at finding the original query text.  Its fallback
approach of treating the syntax error position as internal to the
function body text is fine, in fact it's just what we want here.


So I'm now good with the idea of just not failing.  I don't like
the patch as presented though.  First, the cfbot is quite rightly
complaining about the "uninitialized variable" warning it draws.
Second, I don't see a good reason to tie the change to logical
replication in any way.  Let's just change the Assert to an if(),
as attached.

            regards, tom lane

diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index a9fe45e347..e03b98bcd2 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -1017,7 +1017,6 @@ function_parse_error_transpose(const char *prosrc)
 {
     int            origerrposition;
     int            newerrposition;
-    const char *queryText;
 
     /*
      * Nothing to do unless we are dealing with a syntax error that has a
@@ -1035,11 +1034,22 @@ function_parse_error_transpose(const char *prosrc)
     }
 
     /* We can get the original query text from the active portal (hack...) */
-    Assert(ActivePortal && ActivePortal->status == PORTAL_ACTIVE);
-    queryText = ActivePortal->sourceText;
+    if (ActivePortal && ActivePortal->status == PORTAL_ACTIVE)
+    {
+        const char *queryText = ActivePortal->sourceText;
 
-    /* Try to locate the prosrc in the original text */
-    newerrposition = match_prosrc_to_query(prosrc, queryText, origerrposition);
+        /* Try to locate the prosrc in the original text */
+        newerrposition = match_prosrc_to_query(prosrc, queryText,
+                                               origerrposition);
+    }
+    else
+    {
+        /*
+         * Quietly give up if no ActivePortal.  This is an unusual situation
+         * but it can happen in, e.g., logical replication workers.
+         */
+        newerrposition = -1;
+    }
 
     if (newerrposition > 0)
     {

pgsql-hackers by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: Glossary and initdb definition work for "superuser" and database/cluster
Next
From: Justin Pryzby
Date:
Subject: Re: pg15 inherited stats expressions: cache lookup failed for statistics object