Thread: Vacuuming leaked temp tables (once again)
This thread http://archives.postgresql.org/pgsql-hackers/2008-01/msg00134.php kind of wandered off into the weeds after identifying a semi-related bug in CLUSTER, but the original problem still remains: if a backend crashes after creating some temp tables, the tables remain present. Such tables will get recycled next time someone reuses the same pg_temp_NNN schema. But if the failed backend had been occupying an unusually high-numbered BackendId slot, then its pg_temp_NNN schema might go unused for a long time --- long enough for the temp tables to pose an xid-wraparound problem. There's another report of this issue today in pgsql-general.

The only solution proposed in that thread was to auto-delete temp tables at postmaster restart; which I opposed on the grounds that throwing away data right after a crash was a terrible idea from a forensic standpoint. I still think that, but I had another idea about how to cope with the situation. It's reasonably easy to tell (by looking into the sinval state) whether a given BackendId slot is actually in use, so we could detect whether a temp table actually belongs to a live backend or not. What I'm thinking is we should adjust autovacuum so that it will apply anti-wraparound vacuuming operations even to temp tables, if they belong to pg_temp schemas that belong to inactive BackendId slots. This'd fix the wraparound issue without any risk of discarding data that someone might want back.

Note that this should be safe even if someone claims the pg_temp_NNN schema and tries to drop the old temp table while we're vacuuming it. Operations on temp tables take the normal types of locks, so that will get interlocked properly.

A small hole in this idea is that the BackendId slot might be occupied by some new backend that actually hasn't created any temp tables yet (hence not "taken possession" of the pg_temp_NNN schema). We could fix that by making each backend's has-temp-tables state globally visible. However, I'm inclined to think it's not really an issue, because you wouldn't get into trouble unless this was always the case over many repeated autovacuum visits to the table, which seems pretty improbable.

Another issue is that leftover temp tables would be significantly more likely to be self-inconsistent than normal tables, since operations on them are not WAL-logged and it's entirely likely that the owning backend crashed with some dirty pages not written out from its local buffers. AFAICS this shouldn't be any big problem for vacuuming the table proper, since heap pages are pretty independent, at least at the level understood by plain vacuum. There is a risk that indexes would be corrupt enough to make vacuum error out, thus preventing the xid wraparound cleanup from completing. But that leaves us no worse off than we are now, and at least there would be signs of distress in the postmaster log for the DBA to see.

Or we could have autovacuum just drop orphaned temp tables, *if* they have gotten old enough to need anti-wraparound vacuuming. While I'm still uncomfortable with having autovac drop anything, at least this would avoid the worst cases of "gee I really needed that data to investigate the crash". The main attractions of this idea are avoiding the corrupt-index issue and not doing vacuuming work that's 99.99% sure to be useless.

Thoughts?

regards, tom lane
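The slot-liveness test described above can be sketched in miniature. This is only an illustrative model, not server code: `active_backend_ids` stands in for the real lookup into shared sinval state, and the name parsing simply relies on the pg_temp_NNN convention.

```python
import re

def backend_id_for_temp_schema(schema_name):
    """Extract NNN from a pg_temp_NNN schema name; None if not a temp schema."""
    m = re.match(r"^pg_temp_(\d+)$", schema_name)
    return int(m.group(1)) if m else None

def is_orphaned_temp_schema(schema_name, active_backend_ids):
    """A temp schema is orphaned when no live backend occupies its BackendId
    slot.  active_backend_ids models the sinval-state check; the real server
    would consult shared memory, not a Python set."""
    backend_id = backend_id_for_temp_schema(schema_name)
    if backend_id is None:
        return False                      # not a temp schema at all
    return backend_id not in active_backend_ids

# Backends 1 and 2 are alive, so pg_temp_7 must be left over from a crash:
print(is_orphaned_temp_schema("pg_temp_7", {1, 2}))   # → True
```

Note how the "small hole" shows up in this model: a slot occupied by a new backend that hasn't taken possession of the schema yet reads as not orphaned, which errs on the safe side of never dropping anything live.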
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > The main attractions of this idea are avoiding the corrupt-index issue and > not doing vacuuming work that's 99.99% sure to be useless. It does seem strange to me to vacuum a table you're pretty sure is useless *and* quite likely corrupt. Could autovacuum emit log messages as soon as it sees such tables and start dropping them at some point later? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
Tom Lane wrote: > Another issue is that leftover temp tables would be significantly more > likely to be self-inconsistent than normal tables, since operations on > them are not WAL-logged and it's entirely likely that the owning backend > crashed with some dirty pages not written out from its local buffers. > AFAICS this shouldn't be any big problem for vacuuming the table proper, > since heap pages are pretty independent, at least at the level > understood by plain vacuum. There's the torn-page problem as well. Highly improbable, but it seems possible to me to have an inconsistent heap page with for example broken redirecting line pointers or something like that, that would cause crashes or assertion failures on vacuum. > Or we could have autovacuum just drop orphaned temp tables, *if* > they have gotten old enough to need anti-wraparound vacuuming. > While I'm still uncomfortable with having autovac drop anything, > at least this would avoid the worst cases of "gee I really needed > that data to investigate the crash". The main attractions of this > idea are avoiding the corrupt-index issue and not doing vacuuming > work that's 99.99% sure to be useless. That sounds a lot simpler and better to me. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes: > Could autovacuum emit log messages as soon as it sees such tables and start > dropping them at some point later? We might have to rearrange the logic a bit to make that happen (I'm not sure what order things get tested in), but a log message does seem like a good idea. I'd go for logging anytime an orphaned table is seen, and dropping once it's past the anti-wraparound horizon. regards, tom lane
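The log-then-drop behavior being agreed on here can be sketched as a small decision function. All names are hypothetical: `freeze_max_age` plays the role of autovacuum_freeze_max_age, and the table's relfrozenxid age is assumed to be precomputed.

```python
def orphan_temp_action(is_orphaned, relfrozenxid_age, freeze_max_age):
    """What autovacuum would do with a temp table under the proposal: skip it
    if a live backend owns it (as today), log every time an orphan is seen,
    and drop it only once it crosses the anti-wraparound horizon."""
    if not is_orphaned:
        return "skip"        # a live backend owns it; leave it alone
    if relfrozenxid_age > freeze_max_age:
        return "drop"        # past the horizon: vacuuming it is pointless
    return "log"             # warn the DBA, but keep the data for forensics

print(orphan_temp_action(True, 250_000_000, 200_000_000))   # → drop
```

The ordering matters: the orphan check comes first so ordinary temp tables are never touched, and the age check gates the only destructive branch.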
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > The only solution proposed in that thread was to auto-delete temp > tables at postmaster restart; which I opposed on the grounds that > throwing away data right after a crash was a terrible idea from a > forensic standpoint. Why not just rename the files out of the way, and nuke the entries from the catalog? Something like "filename.crash" or similar that an admin can have scripts in place to check for and who could then go handle as appropriate (remove, investigate, etc). If there's data in the catalog that you think might be bad to lose, then include it in some kind of format in a "filename.crash.catalog" or similar file. Maybe also spit out a warning or error or something on backend start when this rename is done, and on subsequent starts if the file remains. What I really don't like is keeping something that is likely useless and possibly deadly to a backend if corrupt & accessed sitting around. I'm also not a big fan of keeping what is essentially 'garbage' around in the catalog which could just lead to confusion later if someone's looking at "what actual temporary tables should there be" and seeing others that they wouldn't expect. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> The only solution proposed in that thread was to auto-delete temp >> tables at postmaster restart; which I opposed on the grounds that >> throwing away data right after a crash was a terrible idea from a >> forensic standpoint. > Why not just rename the files out of the way, and nuke the entries from > the catalog? It's usually tough to make any sense of the contents of a table if you don't have the catalog entries. Anyway, that approach would put the onus on the admin to clean things up eventually, which isn't all that appealing. Bear in mind that temp table contents are subject to summary deletion during normal operation anyway. What I opposed back in January was deleting them *immediately* after a crash, but that doesn't mean I'm in favor of keeping them indefinitely. regards, tom lane
Stephen Frost wrote: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: > > The only solution proposed in that thread was to auto-delete temp > > tables at postmaster restart; which I opposed on the grounds that > > throwing away data right after a crash was a terrible idea from a > > forensic standpoint. > > Why not just rename the files out of the way, and nuke the entries from > the catalog? Something like "filename.crash" or similar that an admin > can have scripts in place to check for and who could then go handle as > appropriate (remove, investigate, etc). This was my thought too. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Tom Lane wrote: > Gregory Stark <stark@enterprisedb.com> writes: > > Could autovacuum emit log messages as soon as it sees such tables and start > > dropping them at some point later? > > We might have to rearrange the logic a bit to make that happen (I'm not > sure what order things get tested in), but a log message does seem like > a good idea. I'd go for logging anytime an orphaned table is seen, > and dropping once it's past the anti-wraparound horizon. I don't think this requires much of a rearrangement -- see autovacuum.c 1921ff. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Heikki Linnakangas wrote: >> Or we could have autovacuum just drop orphaned temp tables, *if* >> they have gotten old enough to need anti-wraparound vacuuming. >> While I'm still uncomfortable with having autovac drop anything, >> at least this would avoid the worst cases of "gee I really needed >> that data to investigate the crash". The main attractions of this >> idea are avoiding the corrupt-index issue and not doing vacuuming >> work that's 99.99% sure to be useless. > > That sounds a lot simpler and better to me. Yeah, when I read the original this one struck me as almost a no-brainer choice. cheers andrew
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane wrote: >> We might have to rearrange the logic a bit to make that happen (I'm not >> sure what order things get tested in), but a log message does seem like >> a good idea. I'd go for logging anytime an orphaned table is seen, >> and dropping once it's past the anti-wraparound horizon. > I don't think this requires much of a rearrangement -- see autovacuum.c > 1921ff. So everyone is happy with the concept of doing it as above? If so, I'll work on it this weekend sometime. regards, tom lane
Tom Lane wrote: > We might have to rearrange the logic a bit to make that happen (I'm not > sure what order things get tested in), but a log message does seem like > a good idea. I'd go for logging anytime an orphaned table is seen, > and dropping once it's past the anti-wraparound horizon. Is there an easy way for an Admin to clean up the lost temp tables that autovacuum is complaining about? It seems like it could be a long time and a lot of log messages between when they are first orphaned and finally dropped due to anti-wraparound protection.
Tom Lane writes: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> Tom Lane wrote: >>> We might have to rearrange the logic a bit to make that happen (I'm not sure what order things get tested in), but a log message does seem like a good idea. I'd go for logging anytime an orphaned table is seen, and dropping once it's past the anti-wraparound horizon. >> I don't think this requires much of a rearrangement -- see autovacuum.c 1921ff. > So everyone is happy with the concept of doing it as above? If so, I'll work on it this weekend sometime. I think it is the most reasonable thing to do. Regarding the log messages about orphaned tables, it would be nice if you could add a hint/detail message explaining how to clean up those tables. If that's possible. Best Regards, Michael Paesold
"Matthew T. O'Connor" <matthew@zeut.net> writes: > Is there an easy way for an Admin clean-up the lost temp tables that > autovacuum is complaining about? It seems like it could be along time > and a lot of log messages between when they are first orphaned and and > finally dropped due to anti-wraparound protection. Drop the particular temp schema, maybe? The log message should probably make sure to specify the schema name. regards, tom lane
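A hedged illustration of Tom's point about naming the schema (the wording and function name here are invented, not taken from any patch): once the pg_temp_NNN schema appears in the message, the admin's cleanup is a single DROP.

```python
def orphan_log_message(schema, table):
    """Hypothetical log wording; the essential part is that the pg_temp_NNN
    schema name appears, so the admin knows exactly what to drop."""
    return (f'autovacuum: found orphan temporary table "{schema}.{table}" '
            f'-- to remove it now, run: DROP SCHEMA {schema} CASCADE')

print(orphan_log_message("pg_temp_7", "scratch"))
```

Dropping the schema with CASCADE removes every leftover temp table in it at once, which is exactly the "drop the particular temp schema" cleanup suggested above.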
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane wrote: >> We might have to rearrange the logic a bit to make that happen (I'm not >> sure what order things get tested in), but a log message does seem like >> a good idea. I'd go for logging anytime an orphaned table is seen, >> and dropping once it's past the anti-wraparound horizon. > I don't think this requires much of a rearrangement -- see autovacuum.c > 1921ff. Hmm, maybe I'm missing something but I see no good way to do it without refactoring relation_check_autovac. Since that function is only called in one place, I'm thinking of just inlining it; do you see a reason not to? regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Tom Lane wrote: > >> We might have to rearrange the logic a bit to make that happen (I'm not > >> sure what order things get tested in), but a log message does seem like > >> a good idea. I'd go for logging anytime an orphaned table is seen, > >> and dropping once it's past the anti-wraparound horizon. > > > I don't think this requires much of a rearrangement -- see autovacuum.c > > 1921ff. > > Hmm, maybe I'm missing something but I see no good way to do it without > refactoring relation_check_autovac. Hmm, oops :-) > Since that function is only called in one place, I'm thinking of just > inlining it; do you see a reason not to? Nope, go ahead. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Fri, 2008-06-27 at 12:43 -0400, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >> The only solution proposed in that thread was to auto-delete temp
> >> tables at postmaster restart; which I opposed on the grounds that
> >> throwing away data right after a crash was a terrible idea from a
> >> forensic standpoint.
>
> > Why not just rename the files out of the way, and nuke the entries from
> > the catalog?
>
> It's usually tough to make any sense of the contents of a table if you
> don't have the catalog entries. Anyway, that approach would put the
> onus on the admin to clean things up eventually, which isn't all that
> appealing.

We need to learn to live without the catalog entries anyway. In Hot Standby mode, creating a temp table would cause a write to the catalog, which would then fail. If we want to allow creating temp tables in Hot Standby mode then we must avoid inserting them into, and later deleting them from, the catalog. So it would seem that we need a way of handling temp tables that doesn't rely on catalog entries at all.

My proposal would be that we identify all temp table names by the xid that first creates them (and their name...). In memory we would keep track of which files go with which tablenames, so we can still locate them easily. If we lose that context we can always see which files are in need of vacuum using that xid.

Are we also at risk from very long lived sessions that use temp tables? Do we need to put in a check for sessions to vacuum old temp tables that are still valid?

I'm happy to work on this if a solution to all of the above emerges.

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
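The xid-embedded naming Simon floats might look like the following sketch. The t&lt;xid&gt;_&lt;name&gt; format is invented purely for illustration, and the plain `<` comparison glosses over the fact that real xid ordering is circular (mod 2^32).

```python
import re

def temp_relation_filename(creating_xid, relname):
    """Embed the creating xid in the file name, so leftover files are
    self-describing even when there are no catalog entries to consult."""
    return f"t{creating_xid}_{relname}"

def leftover_files_needing_vacuum(filenames, freeze_horizon_xid):
    """Pick out leftover temp files whose embedded creating xid has fallen
    behind the freeze horizon.  (Simplified: real xids wrap at 2^32.)"""
    stale = []
    for name in filenames:
        m = re.match(r"^t(\d+)_", name)
        if m and int(m.group(1)) < freeze_horizon_xid:
            stale.append(name)
    return stale

files = [temp_relation_filename(500, "a"), temp_relation_filename(9000, "b")]
print(leftover_files_needing_vacuum(files, 1000))   # → ['t500_a']
```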
Simon Riggs <simon@2ndquadrant.com> writes: > So it would seem that we need a way of handling temp tables that doesn't > rely on catalog entries at all. That's a complete non-starter; I need go no farther than to point out that it would break clients that expect to see their temp tables reflected in pg_class and so forth. There's been some idle musing in the past about causing pg_class and friends to have inheritance-child tables that are "temp", both in the sense of being backend-local and of not having guaranteed storage, and arranging to store the rows needed for a backend's temp tables in there. There's still a long way to go to make that happen, but at least it would be reasonably transparent on the client side. > Are we also at risk from very long lived sessions that use temp tables? Yeah, one of the bigger problems with it is that no one else could even see whether a backend's temp table was at risk of wraparound (or even before actual wraparound, in need of freezing because pg_clog is supposed to be truncated). Possibly a backend could advertise a min frozen xid for its temp tables in the PGPROC array. regards, tom lane
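The advertise-in-PGPROC idea can be modeled briefly. The circular comparison mirrors the style of TransactionIdPrecedes (the wrapped difference treated as a signed 32-bit number); the per-backend list is only a stand-in for PGPROC slots, not the real structure.

```python
MOD = 2 ** 32

def xid_precedes(a, b):
    """True if xid a is older than xid b under circular (mod 2^32) ordering,
    in the style of TransactionIdPrecedes."""
    return (a - b) % MOD >= 2 ** 31

def oldest_advertised_frozen_xid(per_backend_min_xids):
    """Each backend advertises the min frozen xid among its temp tables
    (None when it has none); scanning the slots yields the oldest xid that
    pg_clog truncation must still preserve."""
    oldest = None
    for xid in per_backend_min_xids:
        if xid is None:
            continue
        if oldest is None or xid_precedes(xid, oldest):
            oldest = xid
    return oldest

print(oldest_advertised_frozen_xid([None, 2000, 500, None]))   # → 500
```

With such an advertisement in place, anyone (autovacuum included) could see how far behind a session's temp tables are without being able to read its catalogs.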
On Fri, 2008-07-11 at 17:26 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > So it would seem that we need a way of handling temp tables that doesn't > > rely on catalog entries at all. > > That's a complete non-starter; I need go no farther than to point out > that it would break clients that expect to see their temp tables > reflected in pg_class and so forth. What does the SQL Standard say about the Information Schema, I wonder? > There's been some idle musing in the past about causing pg_class and > friends to have inheritance-child tables that are "temp", both in the > sense of being backend-local and of not having guaranteed storage, > and arranging to store the rows needed for a backend's temp tables > in there. There's still a long way to go to make that happen, but > at least it would be reasonably transparent on the client side. OK, very cool if we could make it work. I realised it would have to apply all the way through to pg_statistic. Brain dump a little more, while we're on the subject? This aspect is something I've not spent any time on yet, so even a few rungs up the ladder will help lots. Thanks. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
On Sat, Jul 12, 2008 at 12:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > > On Fri, 2008-07-11 at 17:26 -0400, Tom Lane wrote: >> Simon Riggs <simon@2ndquadrant.com> writes: >> > So it would seem that we need a way of handling temp tables that doesn't >> > rely on catalog entries at all. >> >> That's a complete non-starter; I need go no farther than to point out >> that it would break clients that expect to see their temp tables >> reflected in pg_class and so forth. > > What does the SQL Standard say about the Information Schema, I wonder? Many apps were written long before we had one. Not to mention that it doesn't provide anything like all the info that a PostgreSQL-specific tool (though not necessarily user apps) would likely need. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
On Sat, 2008-07-12 at 00:57 +0100, Dave Page wrote: > On Sat, Jul 12, 2008 at 12:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > > > > On Fri, 2008-07-11 at 17:26 -0400, Tom Lane wrote: > >> Simon Riggs <simon@2ndquadrant.com> writes: > >> > So it would seem that we need a way of handling temp tables that doesn't > >> > rely on catalog entries at all. > >> > >> That's a complete non-starter; I need go no farther than to point out > >> that it would break clients that expect to see their temp tables > >> reflected in pg_class and so forth. > > > > What does the SQL Standard say about the Information Schema I wonder? > > Many apps were written long before we had one. Not too mention that it > doesn't provide anything like all the info that PostgreSQL-specific > tool (though not necessarily user apps) would likely need. So are you saying a) that other sessions need to be able to see pg_class entries for temporary tables created by different sessions? b) that you need to be able to see pg_class entries for temporary tables created only in your session? Hopefully you just mean (b)?? a) would simply not be possible at the same time as having us avoid pg_class writes in Hot Standby mode. We would have a choice of seeing temp tables, or allowing temp tables to work in Hot Standby, not both. b) is possible, if we follow the route of taking a locally inherited copy of pg_class. The SQL Standard Information Schema does show LOCAL TEMPORARY and GLOBAL TEMPORARY tables. Our implementation of temp tables differs from the standard, so I think (b) is fully in line with that. If anybody did want (a), then I'd suggest that we use the keyword GLOBAL TEMPORARY table to denote something that would be put in pg_class and visible to all, and thus unusable in hot standby mode. I'm not planning on building that. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
On Sat, Jul 12, 2008 at 9:16 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > > So are you saying > > a) that other sessions need to be able to see pg_class entries for > temporary tables created by different sessions? > > b) that you need to be able to see pg_class entries for temporary tables > created only in your session? > > Hopefully you just mean (b)?? Not necessarily - it may be useful for an admin to peek at what tables are created by other sessions. I don't feel strongly about that though, more so b). > The SQL Standard Information Schema does show LOCAL TEMPORARY and GLOBAL > TEMPORARY tables. Our implementation of temp tables differs from the > standard, so I think (b) is fully in line with that. My point is that any good admin tool will use pg_class directly, as information_schema doesn't include any of the Postgres specific info that such a tool would want. FYI, pgAdmin doesn't do anything with temp tables at the moment anyway - I'm thinking of other apps that may do. An inheritance schema for pg_class may well work for them. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
On Sat, 2008-07-12 at 09:56 +0100, Dave Page wrote:
> An inheritance schema for pg_class may well work for them.

OK, proposal is something like this:

We must avoid writes to many system tables to allow temp tables to function.

Priority 1
pg_class - base definition of table
pg_attribute - columns
pg_attrdef - defaults
pg_statistic - ability to store ANALYZE and VACUUM results
pg_inherits

Priority 2 - would we need any of these?
pg_depend
pg_rules
pg_rewrite
pg_triggers
pg_indexes

Each of the above tables would have a matching temp_* version. We wouldn't want to create a full temp schema every time we connect; we would only do that when a temp table was created. When the first temp table is requested we create the temp schema, using reserved oids established at bootstrap time. We'll need a temp-bootstrap process for the creation of temp_pg_class and temp_pg_attribute, then we can create the other catalog tables more normally.

We hack find_inheritance_children() so it returns the oid of the inherited temp catalog table for appropriate catalog tables. That way we don't need to write to pg_inherits in order to have the catalog tables recognise they have inheritance children. So all direct accesses to catalog tables would result in temp tables showing up in the result set, but only within the same session. In Hot Standby mode we would ignore the inheritance hint and *always* search pg_inherits every time we access a table.

Would also need to go through all catalog code so that it accessed the correct catalog table depending upon whether it's permanent or temp. Seems like it would be a fairly big patch. The temp-bootstrap process is still just hand-waving though.

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
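The find_inheritance_children() hack could be modeled like so. The parent OIDs are the real pg_class/pg_attribute OIDs, but the temp child OIDs and the function shape are invented for the sketch.

```python
TEMP_CHILD_OF = {
    1259: 91259,   # pg_class     -> hypothetical temp_pg_class
    1249: 91249,   # pg_attribute -> hypothetical temp_pg_attribute
}

def find_inheritance_children(parent_oid, in_hot_standby, scan_pg_inherits):
    """In normal mode, short-circuit to the reserved temp child without
    writing to (or even reading) pg_inherits; in Hot Standby, ignore the
    hint and always do the real scan."""
    if in_hot_standby:
        return scan_pg_inherits(parent_oid)
    if parent_oid in TEMP_CHILD_OF:
        return [TEMP_CHILD_OF[parent_oid]]
    return scan_pg_inherits(parent_oid)

print(find_inheritance_children(1259, False, lambda oid: []))   # → [91259]
```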
Simon Riggs <simon@2ndquadrant.com> writes: > Seems like it would be a fairly big patch. The temp-bootstrap process is > still just hand-waving though. Yeah, and it seems fairly messy. The idea I'd had was that all the catalog entries for (a single set of) inheritance child tables are Just There in the output of initdb. The tricky part is that each session that makes use of these tables needs to have its own copy of their contents. (I note that this is real close to the SQL spec's notion of how a temp table works --- maybe we could usefully combine this with a patch to provide spec-compliant temp tables?) I think that could be solved with some magic that made their pg_class entries reflect a per-session relfilenode value. This seems no worse than your proposed magic in pg_inherits, and it eliminates the problem of needing to "bootstrap" all the temp-file catalog infrastructure in every session. regards, tom lane
On Sat, 2008-07-12 at 12:04 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > Seems like it would be a fairly big patch. The temp-bootstrap process is > > still just hand-waving though. > > Yeah, and it seems fairly messy. The idea I'd had was that all the > catalog entries for (a single set of) inheritance child tables are Just > There in the output of initdb. Yeh, not having a temp-bootstrap process at all is best. > The tricky part is that each session that > makes use of these tables needs to have its own copy of their contents. > I think that could be solved with > some magic that made their pg_class entries reflect a per-session > relfilenode value. This seems no worse than your proposed magic in > pg_inherit, and it eliminates the problem of needing to "bootstrap" > all the temp-file catalog infrastructure in every session. Agreed. Perhaps the magic would be to create sub-directories in the pg_temp namespace based upon backend pid. That way all relfilenode values would be the same, but would refer to different objects. Side thought... We can't generate Oids in Hot Standby mode, since they'll end up duplicating values from the primary. We probably need to reserve the top 16384 Oid values for use as temp object Oids. > (I note that this is real close to the SQL spec's notion of how a temp > table works --- maybe we could usefully combine this with a patch to > provide spec-compliant temp tables?) Yeh, I read that and thought something similar. But we're talking about temp additions to catalog tables, not all temp tables. If we tried to implement spec-compliant temp tables we would need to write to catalog tables again, which is what we are trying to avoid! -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
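Simon's per-pid subdirectory idea, as a path-construction sketch (the layout and names are illustrative, not the actual on-disk format): two backends can then share one catalog-visible relfilenode while touching distinct files.

```python
import posixpath

def temp_relation_path(data_dir, db_oid, backend_pid, relfilenode):
    """Per-backend subdirectory keyed by pid: the relfilenode stored in the
    shared catalog entries is identical across sessions, but each session
    resolves it to its own file on disk."""
    return posixpath.join(data_dir, "base", str(db_oid),
                          f"pg_temp_{backend_pid}", str(relfilenode))

print(temp_relation_path("/pgdata", 16384, 4242, 2619))
# → /pgdata/base/16384/pg_temp_4242/2619
```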
Simon Riggs <simon@2ndquadrant.com> writes: > Yeh, I read that and thought something similar. But we're talking about > temp additions to catalog tables, not all temp tables. If we tried to > implement spec-compliant temp tables we would need to write to catalog > tables again, which is what we are trying to avoid! No, because a spec-compliant temp table is a persistent object and *should* be reflected in the permanent catalogs. What you meant to say is that hot-standby sessions would only be able to use our traditional type of temp tables. [ thinks for a bit ... ] actually, maybe a hot standby session could be allowed to use a *pre-existing* spec-compliant temp table. It couldn't make a new one though. regards, tom lane