Thread: "recovering prepared transaction" after server restart message
There have been several reports that people could not vacuum any more or observed strange locks even after server restart. The reason was that they still had uncommitted prepared transactions around. I wonder if it could help to change the log level from ereport(LOG, (errmsg("recovering prepared transaction %u", xid))); to WARNING maybe in order to make that message more striking within the normal startup messages. Joachim
"Joachim Wieland" <joe@mcknight.de> writes: > There have been several reports that people could not vacuum any more or > observed strange locks even after server restart. The reason was that they > still had uncommitted prepared transactions around. > I wonder if it could help to change the log level from > ereport(LOG, > (errmsg("recovering prepared transaction %u", xid))); > to WARNING maybe in order to make that message more striking within the > normal startup messages. That doesn't seem like a good idea. In the first place, recovering prepared xacts is exactly what system restart is *supposed* to do, and so calling it a WARNING seems out of line. (I'm not real sure why that message is even LOG level, rather than DEBUG1 or below.) In the second place, this wouldn't help anyone unless they tried to fix their problem by restarting the server --- a mentality suited only for Windows users, and certainly not something a production system is going to do lightly --- and then thought to look in the postmaster log, which the average Windows user wouldn't. I agree that there's a usability issue here though; I've been burnt by forgotten prepared xacts myself (eg by control-C'ing pg_regress at just the wrong time). Would it help if we included prepared xacts in the pg_stat_activity view? Is there any other place we could make them more visible during normal server operation? regards, tom lane
On Fri, 2006-11-03 at 01:48 -0500, Tom Lane wrote: > I agree that there's a usability issue here though; I've been burnt by > forgotten prepared xacts myself (eg by control-C'ing pg_regress at just > the wrong time). Would it help if we included prepared xacts in the > pg_stat_activity view? > Is there any other place we could make them > more visible during normal server operation? We only care when they break, otherwise its just situation normal, yes? Is there a way to see prepared transactions where the original session that prepared then has died? Perhaps the message at startup should be "you have at least one prepared transaction that needs resolution". We need something at that point, otherwise a PITR recovery is fairly likely to contain them and we need to know that. Otherwise on a system using prepared transactions heavily you may not spot the odd or two that have crashed and need resolution. Presumably they will effect oldestxmin, so its fairly important to be able to resolve them in a timely manner or at least know they are there. Not that I am advising their use though.... -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes: > We only care when they break, otherwise its just situation normal, yes? No, the trouble case is where the XA manager that owns the transaction has forgotten about it. > Is there a way to see prepared transactions where the original session > that prepared then has died? Perhaps the message at startup should be > "you have at least one prepared transaction that needs resolution". I am completely baffled by this focus on database startup time. That's not where the problem is. regards, tom lane
On Fri, 3 Nov 2006, Tom Lane wrote: > > > Is there a way to see prepared transactions where the original session > > that prepared then has died? Perhaps the message at startup should be > > "you have at least one prepared transaction that needs resolution". > > I am completely baffled by this focus on database startup time. That's > not where the problem is. > > regards, tom lane > I'm not alluding to anyone in particular, just responding to the focus on startup time; When I joined Ingres as a Consultant (back when that was a revered job), we saw this a lot, too, bubbling through the ranks from technical support. Engineering was having a cow over it. We Consultants were expected to backline such problems and be the interface between engineering and the rest of the world. What we found was that in what we'd call the ligitimate cases, the cause for concern over startup time had to do with bugs that forced, one way or another, a server restart. Illigitimate cases - the VAST majority - were the result of, well, let's call them less-than-successful DBAs, thrashing their installations with their management breathing down their necks, often with flailing arms and fire coming out of their mouths saying things like, "I bet my business on this!"... The usual causes there were inappropriate configurations, and a critical cause of _that_ was an instalation toolset that didn't help people size/position things properly. Often a sales guy or trainee would configure a test system and then the customer would put that into production without ever reexamining the settings. I realized there was an opportunity here; I put together a training program and we sold it as a service along with installation to new customers to help them get off on the right foot. Once we did that, new customers were essentially put on notice that they could either pay us to help set them up, or they could do it, but that continuing along with what the salesman or junior techie had done wasn't sufficient for a production environment that you could bet your business on. ...The complaint and concern about startup time dropped out of sight nearly immediately... Opportunity here, for PostgreSql: A Technical Document of some kind entitled something like: "How to move your testing environment into production." No, unfortunately, I can't volunteer to be the point person on this one. And to the underlying question: is this the case with PostgreSql? I can't say... Regards, Richard -- Richard Troy, Chief Scientist Science Tools Corporation 510-924-1363 or 202-747-1263 rtroy@ScienceTools.com, http://ScienceTools.com/
Tom Lane wrote: > "Simon Riggs" <simon@2ndquadrant.com> writes: >> We only care when they break, otherwise its just situation normal, yes? > > No, the trouble case is where the XA manager that owns the transaction > has forgotten about it. Yeah, and there's no way the DBMS can detect that. >> Is there a way to see prepared transactions where the original session >> that prepared then has died? Perhaps the message at startup should be >> "you have at least one prepared transaction that needs resolution". > > I am completely baffled by this focus on database startup time. That's > not where the problem is. Agreed. Though one way to have orphaned prepared transactions is to recover from a PITR backup or bring a warm stand-by live. The transaction manager might have committed a transaction after the backup was taken. Recovering from the backup resurrects the transaction again and the TM won't know about it. The problem of orphaned transactions is most likely to occur on a dev/test environment, where the TM is run on a developer's laptop and might be killed and reinstalled or reconfigured at any time. And unfortunately there's also a lot of broken TMs out there that don't recover from crashes properly. I think it's a good idea to at least LOG about prepared transactions at startup. But it would be nice to also have a timeout, after which a big fat WARNING would be printed. I don't believe in killing transactions automatically though, that's a job for the administrator. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
This thread been saved for the 8.3 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --------------------------------------------------------------------------- Joachim Wieland wrote: > There have been several reports that people could not vacuum any more or > observed strange locks even after server restart. The reason was that they > still had uncommitted prepared transactions around. > > > I wonder if it could help to change the log level from > > ereport(LOG, > (errmsg("recovering prepared transaction %u", xid))); > > to WARNING maybe in order to make that message more striking within the > normal startup messages. > > > > Joachim > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Added to TODO: * Improve logging of prepared statements recovered during startup http://archives.postgresql.org/pgsql-hackers/2006-11/msg00092.php --------------------------------------------------------------------------- Joachim Wieland wrote: > There have been several reports that people could not vacuum any more or > observed strange locks even after server restart. The reason was that they > still had uncommitted prepared transactions around. > > > I wonder if it could help to change the log level from > > ereport(LOG, > (errmsg("recovering prepared transaction %u", xid))); > > to WARNING maybe in order to make that message more striking within the > normal startup messages. > > > > Joachim > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +