Re: Autovacuum loose ends - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Autovacuum loose ends |
Date | |
Msg-id | 21177.1121364298@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Autovacuum loose ends (Alvaro Herrera <alvherre@alvh.no-ip.org>) |
List | pgsql-hackers |
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > Oh, is that right? Actually in the end I forgot about temp tables so I > didn't handle them specially, but now I remember that when I started > looking at Matthew's integration code I thought that temp tables should > be analyzed if they happen to have a lot of new tuples, so that the > planner would have good stats about them. You can't analyze temp tables of other sessions at all (and analyze.c won't try) because you can't be certain of being able to access their data. The valid data might only exist inside the local buffers of the owning backend. Same goes for VACUUM --- basically, autovac has to keep its fingers off temp tables altogether. (See nearby discussion with Tom O'Connell for graphic evidence.) >> * For that matter I'm unconvinced that it's a good idea to try to force >> the pgstat DB to pick up every table in every database. If there's no >> entry it's because the table is not getting modified, and therefore it >> seems to me that we can just leave well enough alone. The code really >> is not very good about doing nothing where nothing is called for ;-) > Hmm. The problem is that the table may merit a first ANALYZE, and in a > second run we need to know that another one is not needed. How would we > know that, if we don't keep track on it in the pgstat DB? If it's being touched at all, then it will get into the pgstat DB because of the actions of regular backends. I am questioning the need for autovac to force things into the pgstat DB when they are evidently not being used otherwise. (This argument becomes stronger if we don't reset the stats contents at postmaster start, which is another open issue. I think we probably do need to toss the old stats file after a WAL recovery, but maybe it need not happen otherwise.) > Also, remember that there were mentions of changing wraparound Xid to be > kept track of on a per-table basis, instead of per-database (for 8.2 I > assume). If this happens we will _need_ to check every table. Exactly ... whether it is in pgstat or not. That does not translate to a need to force pgstat entries to be made. >> * Or actually, it would vacuum template0, except that since no regular >> backend ever connects to template0, there will be no stats DB entry for >> it and so the loop in AutoVacMain will ignore it. This is definitely >> BAD as it means that a database that's not been touched since postmaster >> start will never be vacuumed, not even for XID wraparound prevention. >> That test needs to be weakened. > See, that's what I'm talking about :-) Yeah, but you're drawing the wrong conclusion. I'm saying that for a DB or table that is not present in pgstat, it is reasonable to assume it is not being used, and so our only responsibility is to prevent XID wraparound on it --- which we can determine from the pg_database entry (or pg_class entry if per-table wrap management happens). We do not need to force a pgstat entry to be created, and we should not try. > Another idea would be keeping a per-database dead tuple counter, or some > other metric, and use that as a parameter in choosing what database to > vacuum. The current test (last autovac start time) is certainly very > naive. Yeah, keeping per-database totals of the dead tuple counts would help, and would cost little inside the stats collector AFAICS. > Not sure if it's possible to check current Xid without > connecting to a database first. We could trivially expand the database flat file to include its datfrozenxid, and any other fields we need from pg_database. >> * I'm still pretty concerned about the handling of shared catalogs. > Maybe we can store them in pgstat in a pseudo-database with Oid=0, and > special case them everywhere. Yeah, that's what I suggested before. I haven't thought of any holes in the idea yet. The "special casing" shouldn't be hard --- you can just use the same info set up for the Relation's rd_lockInfo. regards, tom lane
pgsql-hackers by date: