Re: Improving free space usage (was: Reducing relation locking - Mailing list pgsql-hackers
From | Hannu Krosing |
---|---|
Subject | Re: Improving free space usage (was: Reducing relation locking |
Date | |
Msg-id | 1134079214.3577.13.camel@localhost.localdomain Whole thread Raw |
In response to | Improving free space usage (was: Reducing relation locking overhead) ("Jim C. Nasby" <jim@nasby.net>) |
Responses |
Re: Improving free space usage (was: Reducing relation locking overhead)
|
List | pgsql-hackers |
Ühel kenal päeval, N, 2005-12-08 kell 12:57, kirjutas Jim C. Nasby: > On Thu, Dec 08, 2005 at 11:58:50AM +0200, Hannu Krosing wrote: > > > > The problem with separate list is that it can be huge. For example on a > > > > table with 200 inserts/updates per second an index build lasting 6 hours > > > > would accumulate total on 6*3600*200 = 4320000 new tuples. > > > > > > Sure, but it's unlikely that such a table would be very wide, so 4.3M > > > tuples would probably only amount to a few hundred MB of data. It's also > > > possible that this list could be vacuumed by whatever the regular vacuum > > > process is for the table. > > > > I think that keeping such list as part the table at well defined > > location (like pages from N to M) is the best strategy, as it will > > automatically make all new tuples available to parallel processes and > > avoids both duplicate storage as well as the the need for changing > > insert/update code. > > There's one thing I hate about that idea though: good luck trying to > move those tuples somewhere else after the index build is done and you > now want to shrink the table back down to a more normal size. I feel your pain. To solve similar problem I have been forced to write scripts that do updates of pk_column=pk_column until the tuple moves to another page as shown by ctid . Not a sensible thing to do (do a lot of updates to *increase* performance), but necessary nevertheless considering current postgres behaviour. > If we had > a better way to do that it would be much more palatable, but right now > on a heavily updated table this would result in a lot of bloat. Actually any long transaction would do that. > Along those lines, I've wondered if it makes sense to add more > flexibility in how free space is reclaimed in a table. One obvious > possibility is to have a strategy where new tuples will always look to > the FSM for space (instead of going into the current page if possible), > and the FSM will always hand out the earliest page in the table it has. > This mode would have the effect of moving tuples towards the front of > the table, allowing for space reclamation. A variation might be that > this mode will not effect tuples that are generated as part of an UPDATE > and are in the first x% of the table, since it doesn't make sense to > move a tuple from page 2 to page 1 in a 1000 page table. This % could be depending on some "fill factor" of the table, aiming not to move tuples, that would end up in the final volume of a balance table, which, in case of heavily updated table, would probably be 2 to 3 times the volume of densely populated table. > Another possibility is to always go to the FSM and to have the FSM hand > back the page that is closest to the new tuple according to a certain > index. This would allow for ALTER TABLE CLUSTER to be much more > self-maintaining. The downside is that you'd have to do a lookup on that > index, but presumably if the index is important enough to cluster on > then it should be well-cached. There's probably some other tweaks that > could be done as well to make this more performant. Yes, I agree on all your points about better placement of new tuples, all they would be useful indeed. -------------- Hannu
pgsql-hackers by date: