Re: BUG #17434: CREATE/DROP DATABASE can be executed in the same transaction with other commands - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #17434: CREATE/DROP DATABASE can be executed in the same transaction with other commands
Date
Msg-id 2350990.1658848101@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #17434: CREATE/DROP DATABASE can be executed in the same transaction with other commands  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: BUG #17434: CREATE/DROP DATABASE can be executed in the same transaction with other commands  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-bugs
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> I guess I am expecting exec_execute_message to have:

> if (completed && use_implicit_block)
> {
>     EndImplicitTransactionBlock();
>     finish_xact_command();
> } else if (completed) [existing code continues]

The problem with that is "where do we get use_implicit_block from"?
In simple query mode it's set if the simple-query message contains
more than one statement.  But the issue we face in extended mode is
precisely that we don't know if the client will try to send another
statement before Sync.

I spent some time thinking about alternative solutions for this.
AFAICS the only other feasible approach is to continue to not do
finish_xact_command() until Sync, but change state so that any
message that tries to do other work will be rejected.  But that's
not really at all attractive, for these reasons:

1. Rejecting other message types implies an error (unless we
get REALLY weird), which implies a rollback, which gets us into
the same inconsistent state as a user-issued rollback.

2. Once we've completed the CREATE DATABASE or whatever, we really
have got to commit or we end with inconsistent state.  So it does
not seem like a good plan to sit and wait for the client, even if
we were certain that it'd eventually issue Sync.  The longer we
sit, the more chance of something interfering --- database shutdown,
network connection drop, etc.

3. This approach winds up throwing errors for cases that used
to work, eg multiple CREATE DATABASE commands before Sync.
The immediate-silent-commit approach doesn't.  The only compatibility
break is that you can't ROLLBACK after CREATE DATABASE ... but that's
precisely the case that doesn't work anyway.

Ideally we'd dodge all of this mess by making all our DDL fully
transactional and getting rid of PreventInTransactionBlock.
I'm not sure that will ever happen; but I am sad that so many
new calls of it have been introduced by the logical replication
stuff.  (Doesn't look like anybody bothered to teach psql's
command_no_begin() about those, either.)  In any case, that's a
long-term direction to pursue, not something that could yield
a back-patchable fix.

Anyway, here's an updated patch, now with docs.  I was surprised
to realize that protocol.sgml has no explicit mention of pipelining,
even though extended query protocol was intentionally set up to make
that possible.  So I added a <sect2> about that, which provides a home
for the caveat about immediate-commit commands.

            regards, tom lane

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index c0b89a3c01..6c5d1dcb1b 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1050,6 +1050,66 @@ SELCT 1/0;<!-- this typo is intentional -->
    </note>
   </sect2>

+  <sect2 id="protocol-flow-pipelining">
+   <title>Pipelining</title>
+
+   <indexterm zone="protocol-flow-pipelining">
+    <primary>pipelining</primary>
+    <secondary>protocol specification</secondary>
+   </indexterm>
+
+   <para>
+    Use of the extended query protocol
+    allows <firstterm>pipelining</firstterm>, which means sending a series
+    of queries without waiting for earlier ones to complete.  This reduces
+    the number of network round trips needed to complete a given series of
+    operations.  However, the user must carefully consider the required
+    behavior if one of the steps fails, since later queries will already
+    be in flight to the server.
+   </para>
+
+   <para>
+    One way to deal with that is to wrap the whole query series in a
+    single transaction, and withhold sending the
+    final <command>COMMIT</command> until success of the series is known;
+    if there is any problem, send <command>ROLLBACK</command> instead.
+    This is a bit tedious though, and it still requires an extra network
+    round trip for the <command>COMMIT</command>
+    or <command>ROLLBACK</command>.  Also, one might wish for some of the
+    commands to commit independently of others.
+   </para>
+
+   <para>
+    The extended query protocol provides another way to manage this
+    concern, which is to omit sending Sync messages between steps that
+    are dependent.  Since, after an error, the backend will skip command
+    messages until it finds Sync, this allows later commands in a pipeline
+    to be skipped automatically when an earlier one fails, without the
+    client having to manage that explicitly with <command>COMMIT</command>
+    and <command>ROLLBACK</command>.  Independently-committable segments
+    of the pipeline can be separated by Sync messages.
+   </para>
+
+   <para>
+    If the client has not issued an explicit <command>BEGIN</command>,
+    then each Sync ordinarily causes an implicit <command>COMMIT</command>
+    if the preceding step(s) succeeded, or an
+    implicit <command>ROLLBACK</command> if they failed.  However, there
+    are a few DDL commands (such as <command>CREATE DATABASE</command>)
+    that force an immediate commit to preserve database consistency.
+    A Sync immediately following one of these has no effect except to
+    respond with ReadyForQuery.
+   </para>
+
+   <para>
+    When using this method, completion of the pipeline must be determined
+    by counting ReadyForQuery messages and waiting for that to reach the
+    number of Syncs sent.  Counting command completion responses is
+    unreliable, since some of the commands may not be executed and thus not
+    produce a completion message.
+   </para>
+  </sect2>
+
   <sect2>
    <title>Function Call</title>

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 116de1175b..ce1417b8f0 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -3453,6 +3453,9 @@ AbortCurrentTransaction(void)
  *    could issue more commands and possibly cause a failure after the statement
  *    completes).  Subtransactions are verboten too.
  *
+ *    We must also set XACT_FLAGS_NEEDIMMEDIATECOMMIT in MyXactFlags, to ensure
+ *    that postgres.c follows through by committing after the statement is done.
+ *
  *    isTopLevel: passed down from ProcessUtility to determine whether we are
  *    inside a function.  (We will always fail if this is false, but it's
  *    convenient to centralize the check here instead of making callers do it.)
@@ -3494,7 +3497,9 @@ PreventInTransactionBlock(bool isTopLevel, const char *stmtType)
     if (CurrentTransactionState->blockState != TBLOCK_DEFAULT &&
         CurrentTransactionState->blockState != TBLOCK_STARTED)
         elog(FATAL, "cannot prevent transaction chain");
-    /* all okay */
+
+    /* All okay.  Set the flag to make sure the right thing happens later. */
+    MyXactFlags |= XACT_FLAGS_NEEDIMMEDIATECOMMIT;
 }

 /*
@@ -3591,6 +3596,13 @@ IsInTransactionBlock(bool isTopLevel)
         CurrentTransactionState->blockState != TBLOCK_STARTED)
         return true;

+    /*
+     * If we tell the caller we're not in a transaction block, then inform
+     * postgres.c that it had better commit when the statement is done.
+     * Otherwise our report could be a lie.
+     */
+    MyXactFlags |= XACT_FLAGS_NEEDIMMEDIATECOMMIT;
+
     return false;
 }

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d0bbd30d2b..078fbdb5a0 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1277,6 +1277,13 @@ exec_simple_query(const char *query_string)
         }
         else
         {
+            /*
+             * We had better not see XACT_FLAGS_NEEDIMMEDIATECOMMIT set if
+             * we're not calling finish_xact_command().  (The implicit
+             * transaction block should have prevented it from getting set.)
+             */
+            Assert(!(MyXactFlags & XACT_FLAGS_NEEDIMMEDIATECOMMIT));
+
             /*
              * We need a CommandCounterIncrement after every query, except
              * those that start or end a transaction block.
@@ -2092,32 +2099,16 @@ exec_execute_message(const char *portal_name, long max_rows)

     /*
      * We must copy the sourceText and prepStmtName into MessageContext in
-     * case the portal is destroyed during finish_xact_command. Can avoid the
-     * copy if it's not an xact command, though.
+     * case the portal is destroyed during finish_xact_command.  We do not
+     * make a copy of the portalParams though, preferring to just not print
+     * them in that case.
      */
-    if (is_xact_command)
-    {
-        sourceText = pstrdup(portal->sourceText);
-        if (portal->prepStmtName)
-            prepStmtName = pstrdup(portal->prepStmtName);
-        else
-            prepStmtName = "<unnamed>";
-
-        /*
-         * An xact command shouldn't have any parameters, which is a good
-         * thing because they wouldn't be around after finish_xact_command.
-         */
-        portalParams = NULL;
-    }
+    sourceText = pstrdup(portal->sourceText);
+    if (portal->prepStmtName)
+        prepStmtName = pstrdup(portal->prepStmtName);
     else
-    {
-        sourceText = portal->sourceText;
-        if (portal->prepStmtName)
-            prepStmtName = portal->prepStmtName;
-        else
-            prepStmtName = "<unnamed>";
-        portalParams = portal->portalParams;
-    }
+        prepStmtName = "<unnamed>";
+    portalParams = portal->portalParams;

     /*
      * Report query to various monitoring facilities.
@@ -2216,13 +2207,24 @@ exec_execute_message(const char *portal_name, long max_rows)

     if (completed)
     {
-        if (is_xact_command)
+        if (is_xact_command || (MyXactFlags & XACT_FLAGS_NEEDIMMEDIATECOMMIT))
         {
             /*
              * If this was a transaction control statement, commit it.  We
              * will start a new xact command for the next command (if any).
+             * Likewise if the statement required immediate commit.  Without
+             * this provision, we wouldn't force commit until Sync is
+             * received, which creates a hazard if the client tries to
+             * pipeline immediate-commit statements.
              */
             finish_xact_command();
+
+            /*
+             * These commands typically don't have any parameters, and even if
+             * one did we couldn't print them now because the storage went
+             * away during finish_xact_command.  So pretend there were none.
+             */
+            portalParams = NULL;
         }
         else
         {
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 7d2b35213d..300baae120 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -107,6 +107,12 @@ extern PGDLLIMPORT int MyXactFlags;
  */
 #define XACT_FLAGS_ACQUIREDACCESSEXCLUSIVELOCK    (1U << 1)

+/*
+ * XACT_FLAGS_NEEDIMMEDIATECOMMIT - records whether the top level statement
+ * is one that requires immediate commit, such as CREATE DATABASE.
+ */
+#define XACT_FLAGS_NEEDIMMEDIATECOMMIT            (1U << 2)
+
 /*
  *    start- and end-of-transaction callbacks for dynamically loaded modules
  */

pgsql-bugs by date:

Previous
From: Marco Boeringa
Date:
Subject: Re: Fwd: "SELECT COUNT(*) FROM" still causing issues (deadlock) in PostgreSQL 14.3/4?
Next
From: "David G. Johnston"
Date:
Subject: Re: BUG #17434: CREATE/DROP DATABASE can be executed in the same transaction with other commands