Thread: Logging transaction IDs for DDL.

Logging transaction IDs for DDL.

From
Vik Fearing
Date:
I just noticed a problem with log_statement = 'ddl' and log_line_prefix
containing '%x'.  If the statement is the first in the transaction, it
will be logged before it is executed, and more importantly, before a
transaction ID is assigned.  That means that %x will be 0.

If the administrator has configured postgres like this in order to ease
PITR on bad DDL, it's actually of no use whatsoever and they'll have to
use pg_waldump to try to figure things out.

PFA a simple patch that I hope addresses the issue in a clean way.  It
also handles the same problem for 'mod'.

I'm undecided whether this is a bugfix or an improvement.  I'm leaning
towards bugfix.
-- 
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Attachment

Re: Logging transaction IDs for DDL.

From
Andres Freund
Date:
Hi,

On 2018-06-14 00:34:54 +0200, Vik Fearing wrote:
> I just noticed a problem with log_statement = 'ddl' and log_line_prefix
> containing '%x'.  If the statement is the first in the transaction, it
> will be logged before it is executed, and more importantly, before a
> transaction ID is assigned.  That means that %x will be 0.

This isn't particularly related to DDL, no? It's a general weakness of
how %x is handled. What's even worse is that by
log_min_duration_statement time the xid is already cleared out, when
using a single-statement transaction.


> If the administrator has configured postgres like this in order to ease
> PITR on bad DDL, it's actually of no use whatsoever and they'll have to
> use pg_waldump to try to figure things out.

To me that part seems like it'd be better handled if we had an option
that allows to log at a time where the xid is already assigned. Making
log_statement assign xids imo has a number of problems, but preserving
the xid for logging at the *end* of a statement should be much less
problematic.


> I'm undecided whether this is a bugfix or an improvement.  I'm leaning
> towards bugfix.

It's been reported a couple times, without anybody working to fix
it. Your fix has a noticable regression potential, so I'm not
immediately in favor of backpatching.


> -- 
> Vik Fearing                                          +33 6 46 75 15 36
> http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

> diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
> index f4133953be..27027a0fa8 100644
> --- a/src/backend/tcop/postgres.c
> +++ b/src/backend/tcop/postgres.c
> @@ -38,6 +38,7 @@
>  
>  #include "access/parallel.h"
>  #include "access/printtup.h"
> +#include "access/transam.h"
>  #include "access/xact.h"
>  #include "catalog/pg_type.h"
>  #include "commands/async.h"
> @@ -2098,16 +2099,31 @@ check_log_statement(List *stmt_list)
>  
>      if (log_statement == LOGSTMT_NONE)
>          return false;
> -    if (log_statement == LOGSTMT_ALL)
> +
> +    /*
> +     * If we're supposed to log mod or ddl statements, we need to make sure we
> +     * have a TransactionId so it appears in log_line_prefix's %x wildcard.
> +     * If the user is logging all statements, we can fast-track out if we
> +     * already have a TransactionId, otherwise we need to loop through the
> +     * statements.
> +     */
> +    if (log_statement == LOGSTMT_ALL && TransactionIdIsValid(GetTopTransactionIdIfAny()))
>          return true;
>
>      /* Else we have to inspect the statement(s) to see whether to log */
>      foreach(stmt_item, stmt_list)
>      {
>          Node       *stmt = (Node *) lfirst(stmt_item);
> +        LogStmtLevel level = GetCommandLogLevel(stmt);
> +
> +        if (level <= log_statement)
> +        {
> +            /* Make sure we have a TransactionId for mod and ddl statements. */
> +            if (level == LOGSTMT_DDL || level == LOGSTMT_MOD)
> +                (void) GetTopTransactionId();
>  
> -        if (GetCommandLogLevel(stmt) <= log_statement)
>              return true;
> +        }
>      }

I'm not at all on board with this.  For one this'll break in a number of
weird ways on a standby (wrong error messages at the least). For another
it increases the overhead of logging quite noticably. Thirdly moving
transaction assignment to a different place based on logging settings
seems like it'll be very confusing too.

Greetings,

Andres Freund


Re: Logging transaction IDs for DDL.

From
Magnus Hagander
Date:
On Thu, Jun 14, 2018 at 12:34 AM, Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
I just noticed a problem with log_statement = 'ddl' and log_line_prefix
containing '%x'.  If the statement is the first in the transaction, it
will be logged before it is executed, and more importantly, before a
transaction ID is assigned.  That means that %x will be 0.

If the administrator has configured postgres like this in order to ease
PITR on bad DDL, it's actually of no use whatsoever and they'll have to
use pg_waldump to try to figure things out.

PFA a simple patch that I hope addresses the issue in a clean way.  It
also handles the same problem for 'mod'.

I'm undecided whether this is a bugfix or an improvement.  I'm leaning
towards bugfix.

FWIW, I'm with Andres on the general points -- especially about having to look at overheads etc...

For your specific usecase, and until the actual logging can be rethought on it, I suggest you take a look at event triggers for it. E.g.:


--