Re: Proposal: Another attempt at vacuum improvements - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Proposal: Another attempt at vacuum improvements |
Date | |
Msg-id | BANLkTi=STNpQFTa-e97OSwz9ts=kKQQP6Q@mail.gmail.com |
In response to | Re: Proposal: Another attempt at vacuum improvements (Pavan Deolasee <pavan.deolasee@gmail.com>) |
List | pgsql-hackers |
On Thu, May 26, 2011 at 6:40 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>>> There are some other issues that we should think about too. Like
>>> recording free space and managing visibility map. The free space is
>>> recorded in the second pass today, but I don't see any reason why
>>> that can't be moved to the first pass. It's not clear though if we
>>> should also record free space after retail page vacuum or leave it as
>>> it is.
>>
>> Not sure. Any idea why it's like that, or why we might want to change it?
>
> I think it precedes the HOT days when the dead space was reclaimed
> only during the second scan. Even post-HOT, if we know we would
> revisit the page anyways during the second scan, it makes sense to
> delay recording free space because the dead line pointers can add to
> it (if they are towards the end of the line pointer array). I remember
> discussing this briefly during HOT, but can't recollect why we decided
> not to update the FSM after retail vacuum. But the entire focus then
> was to keep things simple and that could be one reason.

It's important to keep in mind that page-at-a-time vacuum is happening
in the middle of a routine INSERT/UPDATE/DELETE operation, so we don't
want to do anything too expensive there. Whether updating the FSM
falls into that category or not, I am not sure.

>> Currently, I believe the only way a page can get marked all-visible is
>> by vacuum. But if we make this change, then it would be possible for
>> a HOT cleanup to encounter a situation where all-visible could be set.
>> We probably want to make that work.
>
> Yes. That's certainly an option.
>
> We did not discuss where to store the information about the start-LSN
> of the last successful index vacuum. I am thinking about a new
> pg_class attribute, just because I can't think of anything better. Any
> suggestion?

That seems fairly grotty, but I don't have a lot of brilliant ideas.

One possibility that occurred to me was to stick it in the special
space on the first page of the relation. But that would mean that
every HOT cleanup would need to look at that page, which seems poor.
Even if we cached it after the first access, it still seems kinda
poor. But it would make the unlogged case easier to handle... and we
have thought previously about including some metadata in the relation
file itself to help with forensics (which table was this, anyway?).
So I don't know.

> Also for the first version, I wonder if we should let the unlogged and
> temp tables be handled by the usual two pass vacuum. Once we have
> proven that one pass is better, we will extend that to other tables as
> discussed on this thread.

We can certainly do that for testing. Whether we want to commit it
that way, I'm not sure.

> Do we need a modified syntax for vacuum, like "VACUUM mytab SKIP
> INDEX" or something similar? That way, user can just vacuum the heap
> if she wishes so and can also help us with testing.

There's an extensible-options syntax you can use... VACUUM (index off) mytab.

> Do we need more autovacuum tuning parameters to control when to vacuum
> just the heap and when to vacuum the index as well? Again, we can
> discuss and decide this later, but just wanted to mention this here.

Let's make tuning that a separate effort.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company