Thread: %d in log_line_prefix doesn't work for bg/autovacuum workers
Hi, elog.c's log_line_prefix() processes %d with: case 'd': if (MyProcPort) { const char *dbname = MyProcPort->database_name; if (dbname == NULL || *dbname == '\0') dbname = _("[unknown]"); if (padding != 0) appendStringInfo(buf, "%*s", padding, dbname); else appendStringInfoString(buf, dbname); } else if (padding != 0) appendStringInfoSpaces(buf, padding > 0 ? padding : -padding); write_csvlog() uses similar logic. Unfortunately MyProcPort only exists in user initiated backends. It's imo pretty annoying that neither bgworkers nor autovacuum workers show the proper database in the log. Why don't we just populate a global variable in InitPostgres() once we're sure which database the backend is connected to? We could fill fake MyProcPorts, but that doesn't seem like a good idea to me. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > elog.c's log_line_prefix() processes %d with: > case 'd': > if (MyProcPort) > { > const char *dbname = MyProcPort->database_name; > if (dbname == NULL || *dbname == '\0') > dbname = _("[unknown]"); > if (padding != 0) > appendStringInfo(buf, "%*s", padding, dbname); Not directly related to your gripe, but: where did this "padding" logic come from, and what prevents it from creating invalidly-encoded output by means of truncating multibyte characters in the middle? Not to mention that if glibc has a different idea of the prevailing encoding than we do, it is quite likely to mess up completely. We've been burnt badly enough by use of this coding technique that it should never have been accepted in someplace as critical as elog.c. regards, tom lane
On 2014-05-16 14:02:44 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > elog.c's log_line_prefix() processes %d with: > > case 'd': > > if (MyProcPort) > > { > > const char *dbname = MyProcPort->database_name; > > > if (dbname == NULL || *dbname == '\0') > > dbname = _("[unknown]"); > > if (padding != 0) > > appendStringInfo(buf, "%*s", padding, dbname); > > Not directly related to your gripe, but: where did this "padding" logic > come from, and what prevents it from creating invalidly-encoded output by > means of truncating multibyte characters in the middle? Isn't that syntax just the *minimal* width? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-05-16 14:02:44 -0400, Tom Lane wrote: >> Not directly related to your gripe, but: where did this "padding" logic >> come from, and what prevents it from creating invalidly-encoded output by >> means of truncating multibyte characters in the middle? > Isn't that syntax just the *minimal* width? Ah, you're right, so sprintf shouldn't attempt to truncate the data anywhere. Nonetheless, this has created a hazard that wasn't there before: with any padding spec, sprintf has to determine the width-in-characters of the supplied string. If glibc thinks the data is invalid according to *its* idea of the prevailing encoding, it will do something we won't like. My recollection is it refuses to print anything at all. regards, tom lane
On 2014-05-16 14:51:01 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-05-16 14:02:44 -0400, Tom Lane wrote: > >> Not directly related to your gripe, but: where did this "padding" logic > >> come from, and what prevents it from creating invalidly-encoded output by > >> means of truncating multibyte characters in the middle? > > > Isn't that syntax just the *minimal* width? > > Ah, you're right, so sprintf shouldn't attempt to truncate the data > anywhere. Nonetheless, this has created a hazard that wasn't there > before: with any padding spec, sprintf has to determine the > width-in-characters of the supplied string. Shouldn't we already have setup the correct locale at that point beside a couple of lines inside InitPostgres()? > If glibc thinks the data is invalid according to *its* idea of the > prevailing encoding, it will do something we won't like. My > recollection is it refuses to print anything at all. Since this is just about things like database,user, application name, not the actual log message it's probably not too bad if that happens. I.e. still debuggable. The message itself is separately appended, so it should still go through unhindered. Nnote that I haven't been involved in the feature and haven't thought much about related issues... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-05-16 14:51:01 -0400, Tom Lane wrote: >> Ah, you're right, so sprintf shouldn't attempt to truncate the data >> anywhere. Nonetheless, this has created a hazard that wasn't there >> before: with any padding spec, sprintf has to determine the >> width-in-characters of the supplied string. > Shouldn't we already have setup the correct locale at that point beside > a couple of lines inside InitPostgres()? Define "correct locale". Keep in mind that names in shared catalogs might be in any encoding. But according to previous research, we don't really guarantee that glibc thinks the encoding is what we think, anyway. The commit messages for 54cd4f04576833abc394e131288bf3dd7dcf4806 and ed437e2b27c48219a78f3504b0d05c17c2082d02 are relevant here. The second one suggests that only "%.*s" is at risk not "%*s", but I'm not really convinced right now. My recollection is that glibc will abandon processing either format spec if it thinks the string is wrongly encoded. > Since this is just about things like database,user, application name, > not the actual log message it's probably not too bad if that > happens. [ shrug... ] People *will* complain if data disappears from their logs. regards, tom lane
On 2014-05-16 15:44:53 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-05-16 14:51:01 -0400, Tom Lane wrote: > >> Ah, you're right, so sprintf shouldn't attempt to truncate the data > >> anywhere. Nonetheless, this has created a hazard that wasn't there > >> before: with any padding spec, sprintf has to determine the > >> width-in-characters of the supplied string. > > > Shouldn't we already have setup the correct locale at that point beside > > a couple of lines inside InitPostgres()? > > Define "correct locale". Keep in mind that names in shared catalogs > might be in any encoding. Even the name of the database connected to? I've never checked but I had assumed it be valid in that database's encoding. But evidently not... Why doesn't name_text() have to do an encodign check btw? As there doesn't seem to be much input checking in namein it's a pretty easy way to get bogus data in. > But according to previous research, we don't > really guarantee that glibc thinks the encoding is what we think, anyway. > The commit messages for 54cd4f04576833abc394e131288bf3dd7dcf4806 and > ed437e2b27c48219a78f3504b0d05c17c2082d02 are relevant here. The second > one suggests that only "%.*s" is at risk not "%*s", but I'm not really > convinced right now. My recollection is that glibc will abandon > processing either format spec if it thinks the string is wrongly encoded. I've tested it here. From what it looks like it prints just fine but misjudges the width for padding. > > Since this is just about things like database,user, application name, > > not the actual log message it's probably not too bad if that > > happens. > > [ shrug... ] People *will* complain if data disappears from their logs. You propose to revert? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-05-16 15:44:53 -0400, Tom Lane wrote: >> But according to previous research, we don't >> really guarantee that glibc thinks the encoding is what we think, anyway. >> The commit messages for 54cd4f04576833abc394e131288bf3dd7dcf4806 and >> ed437e2b27c48219a78f3504b0d05c17c2082d02 are relevant here. The second >> one suggests that only "%.*s" is at risk not "%*s", but I'm not really >> convinced right now. My recollection is that glibc will abandon >> processing either format spec if it thinks the string is wrongly encoded. > I've tested it here. From what it looks like it prints just fine but > misjudges the width for padding. Oh, good. We can live with that. > You propose to revert? Not if the data gets printed. I was afraid it wouldn't be. (In any case, we could retain the feature and just do the padding computations ourselves. But I now think we don't have to.) regards, tom lane
On 2014-05-16 16:19:28 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-05-16 15:44:53 -0400, Tom Lane wrote: > >> But according to previous research, we don't > >> really guarantee that glibc thinks the encoding is what we think, anyway. > >> The commit messages for 54cd4f04576833abc394e131288bf3dd7dcf4806 and > >> ed437e2b27c48219a78f3504b0d05c17c2082d02 are relevant here. The second > >> one suggests that only "%.*s" is at risk not "%*s", but I'm not really > >> convinced right now. My recollection is that glibc will abandon > >> processing either format spec if it thinks the string is wrongly encoded. > > > I've tested it here. From what it looks like it prints just fine but > > misjudges the width for padding. > > Oh, good. We can live with that. Yea, seems pretty harmless. It's a pretty new glibc version though, so things might have been different in the past. On the other hand I can see more justification to stop processing on input than output... > (In any case, we could retain the feature and just do the padding > computations ourselves. But I now think we don't have to.) Might actually make the code somewhat less ugly... But I am not volunteering. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hi, On 2014-05-16 19:54:56 +0200, Andres Freund wrote: > Hi, > > elog.c's log_line_prefix() processes %d with: > case 'd': > if (MyProcPort) > { > const char *dbname = MyProcPort->database_name; > > if (dbname == NULL || *dbname == '\0') > dbname = _("[unknown]"); > if (padding != 0) > appendStringInfo(buf, "%*s", padding, dbname); > else > appendStringInfoString(buf, dbname); > } > else if (padding != 0) > appendStringInfoSpaces(buf, > padding > 0 ? padding : -padding); > write_csvlog() uses similar logic. > > Unfortunately MyProcPort only exists in user initiated backends. > > It's imo pretty annoying that neither bgworkers nor autovacuum workers > show the proper database in the log. Why don't we just populate a global > variable in InitPostgres() once we're sure which database the backend is > connected to? We could fill fake MyProcPorts, but that doesn't seem like > a good idea to me. The attached simple patch implements the former idea. We could probably replace a couple of get_database_name(MyDatabaseId) calls by it, but that doesn't look that important. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Andres Freund <andres@2ndquadrant.com> writes: >> It's imo pretty annoying that neither bgworkers nor autovacuum workers >> show the proper database in the log. Why don't we just populate a global >> variable in InitPostgres() once we're sure which database the backend is >> connected to? We could fill fake MyProcPorts, but that doesn't seem like >> a good idea to me. > The attached simple patch implements the former idea. This is basically reintroducing a variable that used to exist and was intentionally gotten rid of, on the grounds that it'd fail to track any renaming of the session's current database (cf commit b256f24264). I'm not sure how much effort ought to be put into dealing with that corner case; but let's not reintroduce old bugs just for failure to remember them. The existing implementation of log line %d has the same issue of course, so this is not a very good argument not to change %d. It *is* a reason not to blindly change get_database_name(MyDatabaseId) calls, which were coded that way precisely for this reason. It might also be a reason not to blithely expose a global variable like this, which would encourage making of the same mistake in places that might be more critical than %d. regards, tom lane
On 2014-05-16 17:48:28 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > >> It's imo pretty annoying that neither bgworkers nor autovacuum workers > >> show the proper database in the log. Why don't we just populate a global > >> variable in InitPostgres() once we're sure which database the backend is > >> connected to? We could fill fake MyProcPorts, but that doesn't seem like > >> a good idea to me. > > > The attached simple patch implements the former idea. > > This is basically reintroducing a variable that used to exist and was > intentionally gotten rid of, on the grounds that it'd fail to track > any renaming of the session's current database (cf commit b256f24264). > I'm not sure how much effort ought to be put into dealing with that corner > case; but let's not reintroduce old bugs just for failure to remember > them. I did look whether there's a race making MyDatabaseName out of date. I didn't see any. There's a windows where it could be renamed between where I'd done the MyDatabaseName = ... and the LockSharedObject() a few lines below. But directly afterwards there's a check that the database name is still correct. We could move the filling of MyDatabaseName to lessen the amount of comments that need to be added. It looks like the interlock around database names didn't exist back in b256f24264 - so the situation simply has changed. Looks like that has happend in 2667d56a3b489e5645f0. > The existing implementation of log line %d has the same issue of course, > so this is not a very good argument not to change %d. It *is* a reason > not to blindly change get_database_name(MyDatabaseId) calls, which were > coded that way precisely for this reason. It might also be a reason > not to blithely expose a global variable like this, which would encourage > making of the same mistake in places that might be more critical than %d. I think exposing the variable somewhat reasonable. It allows to to print the database name in situations where we'd normally not dare because of the syscache lookup... And given that it doesn't look racy anymore the danger seems pretty low. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-05-17 00:17:46 +0200, Andres Freund wrote: > On 2014-05-16 17:48:28 -0400, Tom Lane wrote: > > This is basically reintroducing a variable that used to exist and was > > intentionally gotten rid of, on the grounds that it'd fail to track > > any renaming of the session's current database (cf commit b256f24264). > > I'm not sure how much effort ought to be put into dealing with that corner > > case; but let's not reintroduce old bugs just for failure to remember > > them. > > I did look whether there's a race making MyDatabaseName out of date. I > didn't see any. There's a windows where it could be renamed between > where I'd done the MyDatabaseName = ... and the LockSharedObject() a few > lines below. But directly afterwards there's a check that the database name is > still correct. > We could move the filling of MyDatabaseName to lessen the amount of > comments that need to be added. Moving it also provides the name in bootstrap mode. No idea whether there'll ever be a use in that but given the variable may later replace some get_database_name(MyDatabaseId) calls it seems to be a benefit. I'd briefly changed elog.c to only use MydatabaseName, thinking that there seems to be little reason to continue using database_name in elog.c once the variable is introduced. But that's not a good idea: MyProcPort->database_name exists earlier. It's imo a bit debatable whether it's a good idea to log database names in log_line_prefix that don't exist. But that's a separate change. I've changed it though so that MyDatabaseName is preferred over MyProc. It's imo easier to see that it has the correct value. And it'll be correct if we ever support mapping authentication and real database names or something. Revised patch attached. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-05-17 00:17:46 +0200, Andres Freund wrote: >> On 2014-05-16 17:48:28 -0400, Tom Lane wrote: >>> This is basically reintroducing a variable that used to exist and was >>> intentionally gotten rid of, on the grounds that it'd fail to track >>> any renaming of the session's current database (cf commit b256f24264). >> I did look whether there's a race making MyDatabaseName out of date. I >> didn't see any. Hm. Actually, it looks like commit b256f24264 was a bit schizophrenic: although it removed this particular way in which a database rename could cause trouble, it nonetheless *added* checks that there were no active sessions in the database. As you say, the checks were race-prone at the time and have since been made bulletproof (though I could not find the commit you mentioned?). But the intent was there. I think the key issue comes down to this comment in RenameDatabase: * XXX Client applications probably store the current database somewhere, * so renaming it could cause confusion. On the other hand, there may not * be an actual problem besides a little confusion, so think about this * and decide. If we're willing to decide that we will never support renaming an active database, then the approach you're proposing makes sense. And it seems to me that this issue of potentially confusing client apps is much the strongest argument for making such a decision. But is it strong enough? Given that there haven't been complaints in the past ten years about how you can't rename an active database, I'm OK personally with locking this down forever. But I wonder if anyone wants to argue the contrary. > I'd briefly changed elog.c to only use MydatabaseName, thinking that > there seems to be little reason to continue using database_name in > elog.c once the variable is introduced. But that's not a good idea: > MyProcPort->database_name exists earlier. If we do this at all, I think actually that is a good idea. MyProcPort->database_name is a *requested* db name, but arguably the %d log line field should represent the *actual* connected db name, which means it shouldn't start being reported until we have some confidence that such a DB exists. I'd personally fill MyDatabaseName a bit sooner than you have here, probably about where MyProc->databaseId gets set. But I don't think elog.c should be looking at more than one source of truth for %d. regards, tom lane
Hi, On 2014-05-17 12:23:51 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-05-17 00:17:46 +0200, Andres Freund wrote: > >> On 2014-05-16 17:48:28 -0400, Tom Lane wrote: > >>> This is basically reintroducing a variable that used to exist and was > >>> intentionally gotten rid of, on the grounds that it'd fail to track > >>> any renaming of the session's current database (cf commit b256f24264). > > >> I did look whether there's a race making MyDatabaseName out of date. I > >> didn't see any. > > Hm. Actually, it looks like commit b256f24264 was a bit schizophrenic: > although it removed this particular way in which a database rename could > cause trouble, it nonetheless *added* checks that there were no active > sessions in the database. I think it was just about making it harder to hit issues related to renames. But since there wasn't enough infrastructure to make it bulletproof yet... > As you say, the checks were race-prone at the > time and have since been made bulletproof (though I could not find the > commit you mentioned?). But the intent was there. It's 52667d56a3b489e5645f069522631824b7ffc520. I've lost the initial 5 when copying it into emacs. > I think the key issue comes down to this comment in RenameDatabase: > > * XXX Client applications probably store the current database somewhere, > * so renaming it could cause confusion. On the other hand, there may not > * be an actual problem besides a little confusion, so think about this > * and decide. > > If we're willing to decide that we will never support renaming an active > database, then the approach you're proposing makes sense. And it seems > to me that this issue of potentially confusing client apps is much the > strongest argument for making such a decision. But is it strong enough? > > Given that there haven't been complaints in the past ten years about how > you can't rename an active database, I'm OK personally with locking this > down forever. But I wonder if anyone wants to argue the contrary. I don't see much need to allow it. But even if we're interested in allowing rename properly there'll definitely be the need to update the database name used in log messages. This doesn't seem to make it harder. My guess would be that we'd need some kind of invalidation message for it... > > I'd briefly changed elog.c to only use MydatabaseName, thinking that > > there seems to be little reason to continue using database_name in > > elog.c once the variable is introduced. But that's not a good idea: > > MyProcPort->database_name exists earlier. > > If we do this at all, I think actually that is a good idea. > MyProcPort->database_name is a *requested* db name, but arguably the > %d log line field should represent the *actual* connected db name, > which means it shouldn't start being reported until we have some > confidence that such a DB exists. Hm. I tentatively think that it's still reasonable to report it earlier. There's a bunch of things (option processing, authentication) that can error out before the database name is finally validated. It seems somewhat helpful to give context there. It's easy enough to get data from connecting clients into the log anyway, so there doesn't seem to be a a security argument either. > I'd personally fill MyDatabaseName > a bit sooner than you have here, probably about where MyProc->databaseId > gets set. The reason I placed it where I did is that it's the first location where we can be sure the database conected to is the correct one because now the lock on the database is held and we've rechecked the name. > But I don't think elog.c should be looking at more than one source of > truth for %d. Ok. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-05-17 12:23:51 -0400, Tom Lane wrote: >> I think the key issue comes down to this comment in RenameDatabase: >> >> * XXX Client applications probably store the current database somewhere, >> * so renaming it could cause confusion. On the other hand, there may not >> * be an actual problem besides a little confusion, so think about this >> * and decide. >> >> If we're willing to decide that we will never support renaming an active >> database, then the approach you're proposing makes sense. And it seems >> to me that this issue of potentially confusing client apps is much the >> strongest argument for making such a decision. But is it strong enough? >> >> Given that there haven't been complaints in the past ten years about how >> you can't rename an active database, I'm OK personally with locking this >> down forever. But I wonder if anyone wants to argue the contrary. > I don't see much need to allow it. But even if we're interested in > allowing rename properly there'll definitely be the need to update the > database name used in log messages. I think the client-side issues are far worse. For example, if we allow renaming active databases then the subprocesses in a parallel pg_dump or pg_restore could connect to the wrong database, ie not the one the leader process is connected to. The very best-case outcome of that is confusing error messages, and worst-case could be pretty nasty (perhaps even security issues?). We could no doubt teach pg_dump and pg_restore to cross-check database OIDs after reconnecting, but lots of other applications could have comparable problems. >> If we do this at all, I think actually that is a good idea. >> MyProcPort->database_name is a *requested* db name, but arguably the >> %d log line field should represent the *actual* connected db name, >> which means it shouldn't start being reported until we have some >> confidence that such a DB exists. > Hm. I tentatively think that it's still reasonable to report it > earlier. There's a bunch of things (option processing, authentication) > that can error out before the database name is finally validated. It > seems somewhat helpful to give context there. What context? By definition, no such processing can depend on which database you're trying to connect to. > The reason I placed it where I did is that it's the first location where > we can be sure the database conected to is the correct one because now > the lock on the database is held and we've rechecked the name. No, if we've completed the previous lookup then the DB exists. The recheck is to catch the possibility that a rename or drop is completing concurrently --- but I don't think it's unreasonable to show the database's (old) name as soon as we've seen evidence that it does/did exist. regards, tom lane
On 2014-05-17 16:23:26 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-05-17 12:23:51 -0400, Tom Lane wrote: > >> Given that there haven't been complaints in the past ten years about how > >> you can't rename an active database, I'm OK personally with locking this > >> down forever. But I wonder if anyone wants to argue the contrary. > > > I don't see much need to allow it. But even if we're interested in > > allowing rename properly there'll definitely be the need to update the > > database name used in log messages. > > I think the client-side issues are far worse. No disagreements from me there. Those just aren't particulary related to MyDatabaseName existing... > For example, if we allow > renaming active databases then the subprocesses in a parallel pg_dump or > pg_restore could connect to the wrong database, ie not the one the leader > process is connected to. The very best-case outcome of that is confusing > error messages, and worst-case could be pretty nasty (perhaps even > security issues?). Yea, I am pretty sure there's some nastyness that way. Don't really want to go there for a feature that's not even been requested. > We could no doubt teach pg_dump and pg_restore to > cross-check database OIDs after reconnecting, but lots of other > applications could have comparable problems. I guess pg_dump would have to lock the database before dumping.... > >> If we do this at all, I think actually that is a good idea. > >> MyProcPort->database_name is a *requested* db name, but arguably the > >> %d log line field should represent the *actual* connected db name, > >> which means it shouldn't start being reported until we have some > >> confidence that such a DB exists. > > > Hm. I tentatively think that it's still reasonable to report it > > earlier. There's a bunch of things (option processing, authentication) > > that can error out before the database name is finally validated. It > > seems somewhat helpful to give context there. > > What context? By definition, no such processing can depend on which > database you're trying to connect to. With context I mean printing a value for log_line_prefix's '%d'. If you have a application constantly connecting with the wrong username it might be easier to diagnose if you know the database it's trying to connect to. > > The reason I placed it where I did is that it's the first location where > > we can be sure the database conected to is the correct one because now > > the lock on the database is held and we've rechecked the name. > > No, if we've completed the previous lookup then the DB exists. The > recheck is to catch the possibility that a rename or drop is completing > concurrently --- but I don't think it's unreasonable to show the > database's (old) name as soon as we've seen evidence that it does/did > exist. Fair enough. It's imo a minor judgement call with reasons for going either way. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Tom Lane 2014-05-17 <22961.1400343831@sss.pgh.pa.us> > I think the key issue comes down to this comment in RenameDatabase: > > * XXX Client applications probably store the current database somewhere, > * so renaming it could cause confusion. On the other hand, there may not > * be an actual problem besides a little confusion, so think about this > * and decide. > > If we're willing to decide that we will never support renaming an active > database, then the approach you're proposing makes sense. And it seems > to me that this issue of potentially confusing client apps is much the > strongest argument for making such a decision. But is it strong enough? > > Given that there haven't been complaints in the past ten years about how > you can't rename an active database, I'm OK personally with locking this > down forever. But I wonder if anyone wants to argue the contrary. Fwiw, we ran into exactly this question yesterday during a customer training session. Their plan is to provide a database with updated content every few hours, so their idea was to populate db_new, rename db to db_old, rename db_new to db, and eventually drop db_old from cron. This fails when sessions are connected to db. Surprisingly, it actually works if you do the db renaming on a master server, but let the (anyway readonly) client connect to a streaming slave. (On 9.2, didn't test 9.4.) We also looked into the confused-client problem, but current_database() reports the new name correctly, and hence figured there shouldn't be any problem with this approach, despite it obviously being slightly out of spec because of the dependency on running on a SR slave. I even jokingly noted that this will probably get fixed in one or the other direction once I mention it on the mailing list ;) So here's your complaint/request that renaming active databases should be possible... Christoph -- cb@df7cb.de | http://www.df7cb.de/
Re: Andres Freund 2014-05-17 <20140517203404.GB4484@awork2.anarazel.de> > > For example, if we allow > > renaming active databases then the subprocesses in a parallel pg_dump or > > pg_restore could connect to the wrong database, ie not the one the leader > > process is connected to. The very best-case outcome of that is confusing > > error messages, and worst-case could be pretty nasty (perhaps even > > security issues?). > > Yea, I am pretty sure there's some nastyness that way. Don't really want > to go there for a feature that's not even been requested. Does pg_dumpall protect against connecting to the wrong database if renames are on the way? To me, this sounds like the same problem as with pg_dump -j. > > We could no doubt teach pg_dump and pg_restore to > > cross-check database OIDs after reconnecting, but lots of other > > applications could have comparable problems. > > I guess pg_dump would have to lock the database before dumping.... Sounds like a sane idea for both. Christoph -- cb@df7cb.de | http://www.df7cb.de/
On 2014-05-17 23:10:42 +0200, Christoph Berg wrote: > > Given that there haven't been complaints in the past ten years about how > > you can't rename an active database, I'm OK personally with locking this > > down forever. But I wonder if anyone wants to argue the contrary. > > Fwiw, we ran into exactly this question yesterday during a customer > training session. Their plan is to provide a database with updated > content every few hours, so their idea was to populate db_new, rename > db to db_old, rename db_new to db, and eventually drop db_old from > cron. This fails when sessions are connected to db. Surprisingly, it > actually works if you do the db renaming on a master server, but let > the (anyway readonly) client connect to a streaming slave. (On 9.2, > didn't test 9.4.) Ick. That's a bug imo. It should cause a recovery conflict disconnecting active sessions with force... I'd very much discourage your users from relying on this. Independent of my complaint about %d this isn't something that's supported and it won't change unless somebody puts non-neglegible resources into it. > We also looked into the confused-client problem, but > current_database() reports the new name correctly, and hence figured > there shouldn't be any problem with this approach, despite it > obviously being slightly out of spec because of the dependency on > running on a SR slave. At the very least log_line_prefix's %d will log the wrong thing. > I even jokingly noted that this will probably > get fixed in one or the other direction once I mention it on the > mailing list ;) Heh. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Andres Freund 2014-05-17 <20140517211930.GA10899@awork2.anarazel.de> > On 2014-05-17 23:10:42 +0200, Christoph Berg wrote: > > > Given that there haven't been complaints in the past ten years about how > > > you can't rename an active database, I'm OK personally with locking this > > > down forever. But I wonder if anyone wants to argue the contrary. > > > > Fwiw, we ran into exactly this question yesterday during a customer > > training session. Their plan is to provide a database with updated > > content every few hours, so their idea was to populate db_new, rename > > db to db_old, rename db_new to db, and eventually drop db_old from > > cron. This fails when sessions are connected to db. Surprisingly, it > > actually works if you do the db renaming on a master server, but let > > the (anyway readonly) client connect to a streaming slave. (On 9.2, > > didn't test 9.4.) > > Ick. That's a bug imo. It should cause a recovery conflict disconnecting > active sessions with force... > I'd very much discourage your users from relying on this. Independent of > my complaint about %d this isn't something that's supported and it won't > change unless somebody puts non-neglegible resources into it. They want to pg_terminate_backend() the sessions on db_old anyway, but to get that point, you need to be able to rename the db. The alternative would be to disallow connections to the db (for which there is no real nice way), kill existing connections, do the renaming, and then reenable connections. That's much more effort, though. Fwiw, this wasn't the first time I've heard of that idea, it also doesn't sound too far-fetched for me. I guess people usually go "damn, I can't rename active dbs, let's try something else" instead of complaining on the mailing lists in that case. > > We also looked into the confused-client problem, but > > current_database() reports the new name correctly, and hence figured > > there shouldn't be any problem with this approach, despite it > > obviously being slightly out of spec because of the dependency on > > running on a SR slave. > > At the very least log_line_prefix's %d will log the wrong thing. I guess if you are doing database renames, you can live with that, especially if you are doing the renames only for the purpose of putting an updated db in place, and then throwing away the old db quickly. Christoph -- cb@df7cb.de | http://www.df7cb.de/
On 2014-05-17 23:35:43 +0200, Christoph Berg wrote: > They want to pg_terminate_backend() the sessions on db_old anyway, but > to get that point, you need to be able to rename the db. The > alternative would be to disallow connections to the db (for which > there is no real nice way), kill existing connections, do the > renaming, and then reenable connections. That's much more effort, > though. There should really be a way to set datallowcon... But anyway, you can set the connection limit to 0, that should mostly work? > Fwiw, this wasn't the first time I've heard of that idea, it also > doesn't sound too far-fetched for me. I guess people usually go "damn, > I can't rename active dbs, let's try something else" instead of > complaining on the mailing lists in that case. Hm. > > > We also looked into the confused-client problem, but > > > current_database() reports the new name correctly, and hence figured > > > there shouldn't be any problem with this approach, despite it > > > obviously being slightly out of spec because of the dependency on > > > running on a SR slave. > > > > At the very least log_line_prefix's %d will log the wrong thing. > > I guess if you are doing database renames, you can live with that, > especially if you are doing the renames only for the purpose of > putting an updated db in place, and then throwing away the old db > quickly. I don't know. If you switch databases around, for example, and sessions on both report the original database name things really get confusing. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote: > On 2014-05-17 23:35:43 +0200, Christoph Berg wrote: > > Fwiw, this wasn't the first time I've heard of that idea, it also > > doesn't sound too far-fetched for me. I guess people usually go "damn, > > I can't rename active dbs, let's try something else" instead of > > complaining on the mailing lists in that case. > > Hm. http://www.postgresql.org/message-id/1305688547-sup-7028@alvh.no-ip.org -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services