Thread: %d in log_line_prefix doesn't work for bg/autovacuum workers

%d in log_line_prefix doesn't work for bg/autovacuum workers

From

Andres Freund

Date:

16 May 2014, 17:55:02

Hi,

elog.c's log_line_prefix() processes %d with:           case 'd':               if (MyProcPort)               {
         const char *dbname = MyProcPort->database_name;
 
                   if (dbname == NULL || *dbname == '\0')                       dbname = _("[unknown]");
  if (padding != 0)                       appendStringInfo(buf, "%*s", padding, dbname);                   else
             appendStringInfoString(buf, dbname);               }               else if (padding != 0)
appendStringInfoSpaces(buf,                                          padding > 0 ? padding : -padding);
 
write_csvlog() uses similar logic.

Unfortunately MyProcPort only exists in user initiated backends.

It's imo pretty annoying that neither bgworkers nor autovacuum workers
show the proper database in the log. Why don't we just populate a global
variable in InitPostgres() once we're sure which database the backend is
connected to? We could fill fake MyProcPorts, but that doesn't seem like
a good idea to me.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Tom Lane

Date:

16 May 2014, 18:02:50

Andres Freund <andres@2ndquadrant.com> writes:
> elog.c's log_line_prefix() processes %d with:
>             case 'd':
>                 if (MyProcPort)
>                 {
>                     const char *dbname = MyProcPort->database_name;

>                     if (dbname == NULL || *dbname == '\0')
>                         dbname = _("[unknown]");
>                     if (padding != 0)
>                         appendStringInfo(buf, "%*s", padding, dbname);

Not directly related to your gripe, but: where did this "padding" logic
come from, and what prevents it from creating invalidly-encoded output by
means of truncating multibyte characters in the middle?  Not to mention
that if glibc has a different idea of the prevailing encoding than we do,
it is quite likely to mess up completely.  We've been burnt badly enough
by use of this coding technique that it should never have been accepted in
someplace as critical as elog.c.
        regards, tom lane

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Andres Freund

Date:

16 May 2014, 18:06:41

On 2014-05-16 14:02:44 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > elog.c's log_line_prefix() processes %d with:
> >             case 'd':
> >                 if (MyProcPort)
> >                 {
> >                     const char *dbname = MyProcPort->database_name;
> 
> >                     if (dbname == NULL || *dbname == '\0')
> >                         dbname = _("[unknown]");
> >                     if (padding != 0)
> >                         appendStringInfo(buf, "%*s", padding, dbname);
> 
> Not directly related to your gripe, but: where did this "padding" logic
> come from, and what prevents it from creating invalidly-encoded output by
> means of truncating multibyte characters in the middle?

Isn't that syntax just the *minimal* width?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Tom Lane

Date:

16 May 2014, 18:51:09

Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-05-16 14:02:44 -0400, Tom Lane wrote:
>> Not directly related to your gripe, but: where did this "padding" logic
>> come from, and what prevents it from creating invalidly-encoded output by
>> means of truncating multibyte characters in the middle?

> Isn't that syntax just the *minimal* width?

Ah, you're right, so sprintf shouldn't attempt to truncate the data
anywhere.  Nonetheless, this has created a hazard that wasn't there
before: with any padding spec, sprintf has to determine the
width-in-characters of the supplied string.  If glibc thinks the data
is invalid according to *its* idea of the prevailing encoding, it will
do something we won't like.  My recollection is it refuses to print
anything at all.
        regards, tom lane

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Andres Freund

Date:

16 May 2014, 19:13:13

On 2014-05-16 14:51:01 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-05-16 14:02:44 -0400, Tom Lane wrote:
> >> Not directly related to your gripe, but: where did this "padding" logic
> >> come from, and what prevents it from creating invalidly-encoded output by
> >> means of truncating multibyte characters in the middle?
> 
> > Isn't that syntax just the *minimal* width?
> 
> Ah, you're right, so sprintf shouldn't attempt to truncate the data
> anywhere.  Nonetheless, this has created a hazard that wasn't there
> before: with any padding spec, sprintf has to determine the
> width-in-characters of the supplied string.

Shouldn't we already have setup the correct locale at that point beside
a couple of lines inside InitPostgres()?

> If glibc thinks the data is invalid according to *its* idea of the
> prevailing encoding, it will do something we won't like.  My
> recollection is it refuses to print anything at all.

Since this is just about things like database,user, application name,
not the actual log message it's probably not too bad if that
happens. I.e. still debuggable. The message itself is separately
appended, so it should still go through unhindered.

Nnote that I haven't been involved in the feature and haven't thought
much about related issues...

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Tom Lane

Date:

16 May 2014, 19:44:59

Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-05-16 14:51:01 -0400, Tom Lane wrote:
>> Ah, you're right, so sprintf shouldn't attempt to truncate the data
>> anywhere.  Nonetheless, this has created a hazard that wasn't there
>> before: with any padding spec, sprintf has to determine the
>> width-in-characters of the supplied string.

> Shouldn't we already have setup the correct locale at that point beside
> a couple of lines inside InitPostgres()?

Define "correct locale".  Keep in mind that names in shared catalogs
might be in any encoding.  But according to previous research, we don't
really guarantee that glibc thinks the encoding is what we think, anyway.
The commit messages for 54cd4f04576833abc394e131288bf3dd7dcf4806 and
ed437e2b27c48219a78f3504b0d05c17c2082d02 are relevant here.  The second
one suggests that only "%.*s" is at risk not "%*s", but I'm not really
convinced right now.  My recollection is that glibc will abandon
processing either format spec if it thinks the string is wrongly encoded.

> Since this is just about things like database,user, application name,
> not the actual log message it's probably not too bad if that
> happens.

[ shrug... ]  People *will* complain if data disappears from their logs.
        regards, tom lane

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Andres Freund

Date:

16 May 2014, 20:09:56

On 2014-05-16 15:44:53 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-05-16 14:51:01 -0400, Tom Lane wrote:
> >> Ah, you're right, so sprintf shouldn't attempt to truncate the data
> >> anywhere.  Nonetheless, this has created a hazard that wasn't there
> >> before: with any padding spec, sprintf has to determine the
> >> width-in-characters of the supplied string.
> 
> > Shouldn't we already have setup the correct locale at that point beside
> > a couple of lines inside InitPostgres()?
> 
> Define "correct locale".  Keep in mind that names in shared catalogs
> might be in any encoding.

Even the name of the database connected to? I've never checked but I had
assumed it be valid in that database's encoding. But evidently not...

Why doesn't name_text() have to do an encodign check btw? As there
doesn't seem to be much input checking in namein it's a pretty easy way
to get bogus data in.

> But according to previous research, we don't
> really guarantee that glibc thinks the encoding is what we think, anyway.
> The commit messages for 54cd4f04576833abc394e131288bf3dd7dcf4806 and
> ed437e2b27c48219a78f3504b0d05c17c2082d02 are relevant here.  The second
> one suggests that only "%.*s" is at risk not "%*s", but I'm not really
> convinced right now.  My recollection is that glibc will abandon
> processing either format spec if it thinks the string is wrongly encoded.

I've tested it here. From what it looks like it prints just fine but
misjudges the width for padding.

> > Since this is just about things like database,user, application name,
> > not the actual log message it's probably not too bad if that
> > happens.
> 
> [ shrug... ]  People *will* complain if data disappears from their logs.

You propose to revert?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Tom Lane

Date:

16 May 2014, 20:19:35

Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-05-16 15:44:53 -0400, Tom Lane wrote:
>> But according to previous research, we don't
>> really guarantee that glibc thinks the encoding is what we think, anyway.
>> The commit messages for 54cd4f04576833abc394e131288bf3dd7dcf4806 and
>> ed437e2b27c48219a78f3504b0d05c17c2082d02 are relevant here.  The second
>> one suggests that only "%.*s" is at risk not "%*s", but I'm not really
>> convinced right now.  My recollection is that glibc will abandon
>> processing either format spec if it thinks the string is wrongly encoded.

> I've tested it here. From what it looks like it prints just fine but
> misjudges the width for padding.

Oh, good.  We can live with that.

> You propose to revert?

Not if the data gets printed.  I was afraid it wouldn't be.

(In any case, we could retain the feature and just do the padding
computations ourselves.  But I now think we don't have to.)
        regards, tom lane

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Andres Freund

Date:

16 May 2014, 20:31:10

On 2014-05-16 16:19:28 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-05-16 15:44:53 -0400, Tom Lane wrote:
> >> But according to previous research, we don't
> >> really guarantee that glibc thinks the encoding is what we think, anyway.
> >> The commit messages for 54cd4f04576833abc394e131288bf3dd7dcf4806 and
> >> ed437e2b27c48219a78f3504b0d05c17c2082d02 are relevant here.  The second
> >> one suggests that only "%.*s" is at risk not "%*s", but I'm not really
> >> convinced right now.  My recollection is that glibc will abandon
> >> processing either format spec if it thinks the string is wrongly encoded.
> 
> > I've tested it here. From what it looks like it prints just fine but
> > misjudges the width for padding.
> 
> Oh, good.  We can live with that.

Yea, seems pretty harmless. It's a pretty new glibc version though, so
things might have been different in the past. On the other hand I can
see more justification to stop processing on input than output...

> (In any case, we could retain the feature and just do the padding
> computations ourselves.  But I now think we don't have to.)

Might actually make the code somewhat less ugly... But I am not
volunteering.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Andres Freund

Date:

16 May 2014, 21:27:52

Hi,

On 2014-05-16 19:54:56 +0200, Andres Freund wrote:
> Hi,
>
> elog.c's log_line_prefix() processes %d with:
>             case 'd':
>                 if (MyProcPort)
>                 {
>                     const char *dbname = MyProcPort->database_name;
>
>                     if (dbname == NULL || *dbname == '\0')
>                         dbname = _("[unknown]");
>                     if (padding != 0)
>                         appendStringInfo(buf, "%*s", padding, dbname);
>                     else
>                         appendStringInfoString(buf, dbname);
>                 }
>                 else if (padding != 0)
>                     appendStringInfoSpaces(buf,
>                                            padding > 0 ? padding : -padding);
> write_csvlog() uses similar logic.
>
> Unfortunately MyProcPort only exists in user initiated backends.
>
> It's imo pretty annoying that neither bgworkers nor autovacuum workers
> show the proper database in the log. Why don't we just populate a global
> variable in InitPostgres() once we're sure which database the backend is
> connected to? We could fill fake MyProcPorts, but that doesn't seem like
> a good idea to me.

The attached simple patch implements the former idea.

We could probably replace a couple of get_database_name(MyDatabaseId)
calls by it, but that doesn't look that important.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

0001-Support-showing-the-database-in-log-messages-origina.patch

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Tom Lane

Date:

16 May 2014, 21:48:36

Andres Freund <andres@2ndquadrant.com> writes:
>> It's imo pretty annoying that neither bgworkers nor autovacuum workers
>> show the proper database in the log. Why don't we just populate a global
>> variable in InitPostgres() once we're sure which database the backend is
>> connected to? We could fill fake MyProcPorts, but that doesn't seem like
>> a good idea to me.

> The attached simple patch implements the former idea.

This is basically reintroducing a variable that used to exist and was
intentionally gotten rid of, on the grounds that it'd fail to track
any renaming of the session's current database (cf commit b256f24264).
I'm not sure how much effort ought to be put into dealing with that corner
case; but let's not reintroduce old bugs just for failure to remember
them.

The existing implementation of log line %d has the same issue of course,
so this is not a very good argument not to change %d.  It *is* a reason
not to blindly change get_database_name(MyDatabaseId) calls, which were
coded that way precisely for this reason.  It might also be a reason
not to blithely expose a global variable like this, which would encourage
making of the same mistake in places that might be more critical than %d.
        regards, tom lane

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Andres Freund

Date:

16 May 2014, 22:17:55

On 2014-05-16 17:48:28 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> >> It's imo pretty annoying that neither bgworkers nor autovacuum workers
> >> show the proper database in the log. Why don't we just populate a global
> >> variable in InitPostgres() once we're sure which database the backend is
> >> connected to? We could fill fake MyProcPorts, but that doesn't seem like
> >> a good idea to me.
> 
> > The attached simple patch implements the former idea.
> 
> This is basically reintroducing a variable that used to exist and was
> intentionally gotten rid of, on the grounds that it'd fail to track
> any renaming of the session's current database (cf commit b256f24264).
> I'm not sure how much effort ought to be put into dealing with that corner
> case; but let's not reintroduce old bugs just for failure to remember
> them.

I did look whether there's a race making MyDatabaseName out of date. I
didn't see any. There's a windows where it could be renamed between
where I'd done the MyDatabaseName = ... and the LockSharedObject() a few
lines below. But directly afterwards there's a check that the database name is
still correct.
We could move the filling of MyDatabaseName to lessen the amount of
comments that need to be added.

It looks like the interlock around database names didn't exist back in
b256f24264 - so the situation simply has changed. Looks like that has
happend in 2667d56a3b489e5645f0.

> The existing implementation of log line %d has the same issue of course,
> so this is not a very good argument not to change %d.  It *is* a reason
> not to blindly change get_database_name(MyDatabaseId) calls, which were
> coded that way precisely for this reason.  It might also be a reason
> not to blithely expose a global variable like this, which would encourage
> making of the same mistake in places that might be more critical than %d.

I think exposing the variable somewhat reasonable. It allows to to print
the database name in situations where we'd normally not dare because of
the syscache lookup... And given that it doesn't look racy anymore the
danger seems pretty low.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Andres Freund

Date:

16 May 2014, 23:00:42

On 2014-05-17 00:17:46 +0200, Andres Freund wrote:
> On 2014-05-16 17:48:28 -0400, Tom Lane wrote:
> > This is basically reintroducing a variable that used to exist and was
> > intentionally gotten rid of, on the grounds that it'd fail to track
> > any renaming of the session's current database (cf commit b256f24264).
> > I'm not sure how much effort ought to be put into dealing with that corner
> > case; but let's not reintroduce old bugs just for failure to remember
> > them.
>
> I did look whether there's a race making MyDatabaseName out of date. I
> didn't see any. There's a windows where it could be renamed between
> where I'd done the MyDatabaseName = ... and the LockSharedObject() a few
> lines below. But directly afterwards there's a check that the database name is
> still correct.
> We could move the filling of MyDatabaseName to lessen the amount of
> comments that need to be added.

Moving it also provides the name in bootstrap mode. No idea whether
there'll ever be a use in that but given the variable may later replace
some get_database_name(MyDatabaseId) calls it seems to be a benefit.

I'd briefly changed elog.c to only use MydatabaseName, thinking that
there seems to be little reason to continue using database_name in
elog.c once the variable is introduced. But that's not a good idea:
MyProcPort->database_name exists earlier.
It's imo a bit debatable whether it's a good idea to log database names
in log_line_prefix that don't exist. But that's a separate change.

I've changed it though so that MyDatabaseName is preferred over
MyProc. It's imo easier to see that it has the correct value. And it'll
be correct if we ever support mapping authentication and real database
names or something.

Revised patch attached.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

0001-Support-showing-the-database-in-log-entries-issued-b.patch

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Tom Lane

Date:

17 May 2014, 16:24:07

Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-05-17 00:17:46 +0200, Andres Freund wrote:
>> On 2014-05-16 17:48:28 -0400, Tom Lane wrote:
>>> This is basically reintroducing a variable that used to exist and was
>>> intentionally gotten rid of, on the grounds that it'd fail to track
>>> any renaming of the session's current database (cf commit b256f24264).

>> I did look whether there's a race making MyDatabaseName out of date. I
>> didn't see any.

Hm.  Actually, it looks like commit b256f24264 was a bit schizophrenic:
although it removed this particular way in which a database rename could
cause trouble, it nonetheless *added* checks that there were no active
sessions in the database.  As you say, the checks were race-prone at the
time and have since been made bulletproof (though I could not find the
commit you mentioned?).  But the intent was there.

I think the key issue comes down to this comment in RenameDatabase:
    * XXX Client applications probably store the current database somewhere,    * so renaming it could cause confusion.
On the other hand, there may not    * be an actual problem besides a little confusion, so think about this    * and
decide.

If we're willing to decide that we will never support renaming an active
database, then the approach you're proposing makes sense.  And it seems
to me that this issue of potentially confusing client apps is much the
strongest argument for making such a decision.  But is it strong enough?

Given that there haven't been complaints in the past ten years about how
you can't rename an active database, I'm OK personally with locking this
down forever.  But I wonder if anyone wants to argue the contrary.

> I'd briefly changed elog.c to only use MydatabaseName, thinking that
> there seems to be little reason to continue using database_name in
> elog.c once the variable is introduced. But that's not a good idea:
> MyProcPort->database_name exists earlier.

If we do this at all, I think actually that is a good idea.
MyProcPort->database_name is a *requested* db name, but arguably the
%d log line field should represent the *actual* connected db name,
which means it shouldn't start being reported until we have some
confidence that such a DB exists.  I'd personally fill MyDatabaseName
a bit sooner than you have here, probably about where MyProc->databaseId
gets set.  But I don't think elog.c should be looking at more than one
source of truth for %d.
        regards, tom lane

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Andres Freund

Date:

17 May 2014, 19:29:09

Hi,

On 2014-05-17 12:23:51 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-05-17 00:17:46 +0200, Andres Freund wrote:
> >> On 2014-05-16 17:48:28 -0400, Tom Lane wrote:
> >>> This is basically reintroducing a variable that used to exist and was
> >>> intentionally gotten rid of, on the grounds that it'd fail to track
> >>> any renaming of the session's current database (cf commit b256f24264).
> 
> >> I did look whether there's a race making MyDatabaseName out of date. I
> >> didn't see any.
> 
> Hm.  Actually, it looks like commit b256f24264 was a bit schizophrenic:
> although it removed this particular way in which a database rename could
> cause trouble, it nonetheless *added* checks that there were no active
> sessions in the database.

I think it was just about making it harder to hit issues related to
renames. But since there wasn't enough infrastructure to make it
bulletproof yet...

>  As you say, the checks were race-prone at the
> time and have since been made bulletproof (though I could not find the
> commit you mentioned?).  But the intent was there.

It's 52667d56a3b489e5645f069522631824b7ffc520. I've lost the initial 5
when copying it into emacs.

> I think the key issue comes down to this comment in RenameDatabase:
> 
>      * XXX Client applications probably store the current database somewhere,
>      * so renaming it could cause confusion.  On the other hand, there may not
>      * be an actual problem besides a little confusion, so think about this
>      * and decide.
> 
> If we're willing to decide that we will never support renaming an active
> database, then the approach you're proposing makes sense.  And it seems
> to me that this issue of potentially confusing client apps is much the
> strongest argument for making such a decision.  But is it strong enough?
> 
> Given that there haven't been complaints in the past ten years about how
> you can't rename an active database, I'm OK personally with locking this
> down forever.  But I wonder if anyone wants to argue the contrary.

I don't see much need to allow it. But even if we're interested in
allowing rename properly there'll definitely be the need to update the
database name used in log messages. This doesn't seem to make it
harder. My guess would be that we'd need some kind of invalidation
message for it...

> > I'd briefly changed elog.c to only use MydatabaseName, thinking that
> > there seems to be little reason to continue using database_name in
> > elog.c once the variable is introduced. But that's not a good idea:
> > MyProcPort->database_name exists earlier.
> 
> If we do this at all, I think actually that is a good idea.
> MyProcPort->database_name is a *requested* db name, but arguably the
> %d log line field should represent the *actual* connected db name,
> which means it shouldn't start being reported until we have some
> confidence that such a DB exists.

Hm. I tentatively think that it's still reasonable to report it
earlier. There's a bunch of things (option processing, authentication)
that can error out before the database name is finally validated. It
seems somewhat helpful to give context there.
It's easy enough to get data from connecting clients into the log
anyway, so there doesn't seem to be a a security argument either.

> I'd personally fill MyDatabaseName
> a bit sooner than you have here, probably about where MyProc->databaseId
> gets set.

The reason I placed it where I did is that it's the first location where
we can be sure the database conected to is the correct one because now
the lock on the database is held and we've rechecked the name.

> But I don't think elog.c should be looking at more than one source of
> truth for %d.

Ok.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Tom Lane

Date:

17 May 2014, 20:23:40

Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-05-17 12:23:51 -0400, Tom Lane wrote:
>> I think the key issue comes down to this comment in RenameDatabase:
>> 
>> * XXX Client applications probably store the current database somewhere,
>> * so renaming it could cause confusion.  On the other hand, there may not
>> * be an actual problem besides a little confusion, so think about this
>> * and decide.
>> 
>> If we're willing to decide that we will never support renaming an active
>> database, then the approach you're proposing makes sense.  And it seems
>> to me that this issue of potentially confusing client apps is much the
>> strongest argument for making such a decision.  But is it strong enough?
>> 
>> Given that there haven't been complaints in the past ten years about how
>> you can't rename an active database, I'm OK personally with locking this
>> down forever.  But I wonder if anyone wants to argue the contrary.

> I don't see much need to allow it. But even if we're interested in
> allowing rename properly there'll definitely be the need to update the
> database name used in log messages.

I think the client-side issues are far worse.  For example, if we allow
renaming active databases then the subprocesses in a parallel pg_dump or
pg_restore could connect to the wrong database, ie not the one the leader
process is connected to.  The very best-case outcome of that is confusing
error messages, and worst-case could be pretty nasty (perhaps even
security issues?).  We could no doubt teach pg_dump and pg_restore to
cross-check database OIDs after reconnecting, but lots of other
applications could have comparable problems.

>> If we do this at all, I think actually that is a good idea.
>> MyProcPort->database_name is a *requested* db name, but arguably the
>> %d log line field should represent the *actual* connected db name,
>> which means it shouldn't start being reported until we have some
>> confidence that such a DB exists.

> Hm. I tentatively think that it's still reasonable to report it
> earlier. There's a bunch of things (option processing, authentication)
> that can error out before the database name is finally validated. It
> seems somewhat helpful to give context there.

What context?  By definition, no such processing can depend on which
database you're trying to connect to.

> The reason I placed it where I did is that it's the first location where
> we can be sure the database conected to is the correct one because now
> the lock on the database is held and we've rechecked the name.

No, if we've completed the previous lookup then the DB exists.  The
recheck is to catch the possibility that a rename or drop is completing
concurrently --- but I don't think it's unreasonable to show the
database's (old) name as soon as we've seen evidence that it does/did
exist.
        regards, tom lane

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Andres Freund

Date:

17 May 2014, 20:34:17

On 2014-05-17 16:23:26 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-05-17 12:23:51 -0400, Tom Lane wrote:
> >> Given that there haven't been complaints in the past ten years about how
> >> you can't rename an active database, I'm OK personally with locking this
> >> down forever.  But I wonder if anyone wants to argue the contrary.
> 
> > I don't see much need to allow it. But even if we're interested in
> > allowing rename properly there'll definitely be the need to update the
> > database name used in log messages.
> 
> I think the client-side issues are far worse.

No disagreements from me there. Those just aren't particulary related to
MyDatabaseName existing...

> For example, if we allow
> renaming active databases then the subprocesses in a parallel pg_dump or
> pg_restore could connect to the wrong database, ie not the one the leader
> process is connected to.  The very best-case outcome of that is confusing
> error messages, and worst-case could be pretty nasty (perhaps even
> security issues?).

Yea, I am pretty sure there's some nastyness that way. Don't really want
to go there for a feature that's not even been requested.

>  We could no doubt teach pg_dump and pg_restore to
> cross-check database OIDs after reconnecting, but lots of other
> applications could have comparable problems.

I guess pg_dump would have to lock the database before dumping....

> >> If we do this at all, I think actually that is a good idea.
> >> MyProcPort->database_name is a *requested* db name, but arguably the
> >> %d log line field should represent the *actual* connected db name,
> >> which means it shouldn't start being reported until we have some
> >> confidence that such a DB exists.
> 
> > Hm. I tentatively think that it's still reasonable to report it
> > earlier. There's a bunch of things (option processing, authentication)
> > that can error out before the database name is finally validated. It
> > seems somewhat helpful to give context there.
> 
> What context?  By definition, no such processing can depend on which
> database you're trying to connect to.

With context I mean printing a value for log_line_prefix's '%d'. If you
have a application constantly connecting with the wrong username it
might be easier to diagnose if you know the database it's trying to
connect to.

> > The reason I placed it where I did is that it's the first location where
> > we can be sure the database conected to is the correct one because now
> > the lock on the database is held and we've rechecked the name.
> 
> No, if we've completed the previous lookup then the DB exists.  The
> recheck is to catch the possibility that a rename or drop is completing
> concurrently --- but I don't think it's unreasonable to show the
> database's (old) name as soon as we've seen evidence that it does/did
> exist.

Fair enough. It's imo a minor judgement call with reasons for going
either way.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Christoph Berg

Date:

17 May 2014, 21:10:49

Re: Tom Lane 2014-05-17 <22961.1400343831@sss.pgh.pa.us>
> I think the key issue comes down to this comment in RenameDatabase:
> 
>      * XXX Client applications probably store the current database somewhere,
>      * so renaming it could cause confusion.  On the other hand, there may not
>      * be an actual problem besides a little confusion, so think about this
>      * and decide.
> 
> If we're willing to decide that we will never support renaming an active
> database, then the approach you're proposing makes sense.  And it seems
> to me that this issue of potentially confusing client apps is much the
> strongest argument for making such a decision.  But is it strong enough?
> 
> Given that there haven't been complaints in the past ten years about how
> you can't rename an active database, I'm OK personally with locking this
> down forever.  But I wonder if anyone wants to argue the contrary.

Fwiw, we ran into exactly this question yesterday during a customer
training session. Their plan is to provide a database with updated
content every few hours, so their idea was to populate db_new, rename
db to db_old, rename db_new to db, and eventually drop db_old from
cron. This fails when sessions are connected to db. Surprisingly, it
actually works if you do the db renaming on a master server, but let
the (anyway readonly) client connect to a streaming slave. (On 9.2,
didn't test 9.4.)

We also looked into the confused-client problem, but
current_database() reports the new name correctly, and hence figured
there shouldn't be any problem with this approach, despite it
obviously being slightly out of spec because of the dependency on
running on a SR slave. I even jokingly noted that this will probably
get fixed in one or the other direction once I mention it on the
mailing list ;)

So here's your complaint/request that renaming active databases should
be possible...

Christoph
-- 
cb@df7cb.de | http://www.df7cb.de/

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Christoph Berg

Date:

17 May 2014, 21:16:23

Re: Andres Freund 2014-05-17 <20140517203404.GB4484@awork2.anarazel.de>
> > For example, if we allow
> > renaming active databases then the subprocesses in a parallel pg_dump or
> > pg_restore could connect to the wrong database, ie not the one the leader
> > process is connected to.  The very best-case outcome of that is confusing
> > error messages, and worst-case could be pretty nasty (perhaps even
> > security issues?).
> 
> Yea, I am pretty sure there's some nastyness that way. Don't really want
> to go there for a feature that's not even been requested.

Does pg_dumpall protect against connecting to the wrong database if
renames are on the way? To me, this sounds like the same problem as
with pg_dump -j.

> >  We could no doubt teach pg_dump and pg_restore to
> > cross-check database OIDs after reconnecting, but lots of other
> > applications could have comparable problems.
> 
> I guess pg_dump would have to lock the database before dumping....

Sounds like a sane idea for both.

Christoph
-- 
cb@df7cb.de | http://www.df7cb.de/

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Andres Freund

Date:

17 May 2014, 21:19:40

On 2014-05-17 23:10:42 +0200, Christoph Berg wrote:
> > Given that there haven't been complaints in the past ten years about how
> > you can't rename an active database, I'm OK personally with locking this
> > down forever.  But I wonder if anyone wants to argue the contrary.
> 
> Fwiw, we ran into exactly this question yesterday during a customer
> training session. Their plan is to provide a database with updated
> content every few hours, so their idea was to populate db_new, rename
> db to db_old, rename db_new to db, and eventually drop db_old from
> cron. This fails when sessions are connected to db. Surprisingly, it
> actually works if you do the db renaming on a master server, but let
> the (anyway readonly) client connect to a streaming slave. (On 9.2,
> didn't test 9.4.)

Ick. That's a bug imo. It should cause a recovery conflict disconnecting
active sessions with force...
I'd very much discourage your users from relying on this. Independent of
my complaint about %d this isn't something that's supported and it won't
change unless somebody puts non-neglegible resources into it.

> We also looked into the confused-client problem, but
> current_database() reports the new name correctly, and hence figured
> there shouldn't be any problem with this approach, despite it
> obviously being slightly out of spec because of the dependency on
> running on a SR slave.

At the very least log_line_prefix's %d will log the wrong thing.

> I even jokingly noted that this will probably
> get fixed in one or the other direction once I mention it on the
> mailing list ;)

Heh.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Christoph Berg

Date:

17 May 2014, 21:35:48

Re: Andres Freund 2014-05-17 <20140517211930.GA10899@awork2.anarazel.de>
> On 2014-05-17 23:10:42 +0200, Christoph Berg wrote:
> > > Given that there haven't been complaints in the past ten years about how
> > > you can't rename an active database, I'm OK personally with locking this
> > > down forever.  But I wonder if anyone wants to argue the contrary.
> > 
> > Fwiw, we ran into exactly this question yesterday during a customer
> > training session. Their plan is to provide a database with updated
> > content every few hours, so their idea was to populate db_new, rename
> > db to db_old, rename db_new to db, and eventually drop db_old from
> > cron. This fails when sessions are connected to db. Surprisingly, it
> > actually works if you do the db renaming on a master server, but let
> > the (anyway readonly) client connect to a streaming slave. (On 9.2,
> > didn't test 9.4.)
> 
> Ick. That's a bug imo. It should cause a recovery conflict disconnecting
> active sessions with force...
> I'd very much discourage your users from relying on this. Independent of
> my complaint about %d this isn't something that's supported and it won't
> change unless somebody puts non-neglegible resources into it.

They want to pg_terminate_backend() the sessions on db_old anyway, but
to get that point, you need to be able to rename the db. The
alternative would be to disallow connections to the db (for which
there is no real nice way), kill existing connections, do the
renaming, and then reenable connections. That's much more effort,
though.

Fwiw, this wasn't the first time I've heard of that idea, it also
doesn't sound too far-fetched for me. I guess people usually go "damn,
I can't rename active dbs, let's try something else" instead of
complaining on the mailing lists in that case.

> > We also looked into the confused-client problem, but
> > current_database() reports the new name correctly, and hence figured
> > there shouldn't be any problem with this approach, despite it
> > obviously being slightly out of spec because of the dependency on
> > running on a SR slave.
> 
> At the very least log_line_prefix's %d will log the wrong thing.

I guess if you are doing database renames, you can live with that,
especially if you are doing the renames only for the purpose of
putting an updated db in place, and then throwing away the old db
quickly.

Christoph
-- 
cb@df7cb.de | http://www.df7cb.de/

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Andres Freund

Date:

17 May 2014, 21:50:33

On 2014-05-17 23:35:43 +0200, Christoph Berg wrote:
> They want to pg_terminate_backend() the sessions on db_old anyway, but
> to get that point, you need to be able to rename the db. The
> alternative would be to disallow connections to the db (for which
> there is no real nice way), kill existing connections, do the
> renaming, and then reenable connections. That's much more effort,
> though.

There should really be a way to set datallowcon... But anyway, you can
set the connection limit to 0, that should mostly work?

> Fwiw, this wasn't the first time I've heard of that idea, it also
> doesn't sound too far-fetched for me. I guess people usually go "damn,
> I can't rename active dbs, let's try something else" instead of
> complaining on the mailing lists in that case.

Hm.

> > > We also looked into the confused-client problem, but
> > > current_database() reports the new name correctly, and hence figured
> > > there shouldn't be any problem with this approach, despite it
> > > obviously being slightly out of spec because of the dependency on
> > > running on a SR slave.
> > 
> > At the very least log_line_prefix's %d will log the wrong thing.
> 
> I guess if you are doing database renames, you can live with that,
> especially if you are doing the renames only for the purpose of
> putting an updated db in place, and then throwing away the old db
> quickly.

I don't know. If you switch databases around, for example, and sessions
on both report the original database name things really get confusing.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: %d in log_line_prefix doesn't work for bg/autovacuum workers

From

Alvaro Herrera

Date:

18 May 2014, 20:21:47

Andres Freund wrote:
> On 2014-05-17 23:35:43 +0200, Christoph Berg wrote:

> > Fwiw, this wasn't the first time I've heard of that idea, it also
> > doesn't sound too far-fetched for me. I guess people usually go "damn,
> > I can't rename active dbs, let's try something else" instead of
> > complaining on the mailing lists in that case.
> 
> Hm.

http://www.postgresql.org/message-id/1305688547-sup-7028@alvh.no-ip.org

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services