Thread: Postgresql backend to perform vacuum automatically
From what I read in the recent exchanges in the PostgreSQL vs ORACLE thread it would seem a good idea for the backend to keep track of the number of update performed on a database and after a certain threshold start a vacuum in a separate process by itself.
Any comments?
Nicolas Bazin wrote: > >From what I read in the recent exchanges in the PostgreSQL vs ORACLE thread it would seem a good idea for the backendto keep track of the number of update performed on a database and after a certain threshold start a vacuum in a separateprocess by itself. > Any comments? Yes, makes sense to me, especially now that we have a nolocking vacuum. Tom, do you have any ideas on this? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Nicolas Bazin wrote: > From what I read in the recent exchanges in the PostgreSQL vs ORACLE thread it would seem a good idea for the backend tokeep track of the number of update performed on a database and after a certain threshold start a vacuum in a separate processby itself. >> Any comments? > Yes, makes sense to me, especially now that we have a nolocking vacuum. > Tom, do you have any ideas on this? There's a TODO item about this already. * Provide automatic scheduling of background vacuum (Tom) regards, tom lane
> > Yes, makes sense to me, especially now that we have a nolocking vacuum. > > Tom, do you have any ideas on this? > > There's a TODO item about this already. > > * Provide automatic scheduling of background vacuum (Tom) I think an good system would have some parameters something like this in postgresql.conf: vacuum_update_threshold = num of operations that cause frags on a table before a vacuum is run, 0 = no requirement, if followed by a percent (%) sign, indicates percentage of rows changed, rather than an absolute number vacuum_idle_threshold = system load below which the system must be before a vacuum can be performed. 0 = no requirement vacuum_time_threshold = seconds since last vacuum before which another vacuum cannot occur. 0 = no requirement. If all 3 are 0, then no auto-vacuuming is performed at all. There's probably trouble if only the idle threshold is set to zero. And the same for the 'analyze' command? If they want it on a per-table basis, then they can just do it themselves with a cronjob! Chris
> vacuum_idle_threshold > = system load below which the system must be before a vacuum can be > performed. 0 = no requirement > > vacuum_time_threshold > = seconds since last vacuum before which another vacuum cannot occur. 0 = no > requirement. > > If all 3 are 0, then no auto-vacuuming is performed at all. There's > probably trouble if only the idle threshold is set to zero. > > And the same for the 'analyze' command? > > If they want it on a per-table basis, then they can just do it themselves > with a cronjob! Yes, and Jan's statistics stuff has stats on table activity. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Tom Lane wrote: > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Nicolas Bazin wrote: > > From what I read in the recent exchanges in the PostgreSQL vs ORACLE thread it would seem a good idea for the backendto keep track of the number of update performed on a database and after a certain threshold start a vacuum in a separateprocess by itself. > >> Any comments? > > > Yes, makes sense to me, especially now that we have a nolocking vacuum. > > Tom, do you have any ideas on this? > > There's a TODO item about this already. > > * Provide automatic scheduling of background vacuum (Tom) I have been thinking about this. I like the idea, but it may be problematic. I suggested running a vacuum process on a constant low priority in the background, and it was pointed out that this may cause some deadlock issues. For vacuum to run in the background, it needs to be more regulated or targeted. It needs to be able to know which tables need it. Many tables are static and never get updated, vacuuming them would be pointless. It needs to be sensitive to database load, and be tunable to vacuum only when safe or necessary to reduce load. Are there tables that track information that would be useful for guiding vacuum? Could I write a program which queries some statistical tables and and knows which tables need to be vacuumed? If the info is around, I could whip up something pretty easily, I think.
> Are there tables that track information that would be useful for guiding > vacuum? Could I write a program which queries some statistical tables and and > knows which tables need to be vacuumed? > > If the info is around, I could whip up something pretty easily, I think. Sure, Jan's new statistics tables in 7.2 had just that info.test=> \d pg_stat_user_tables View "pg_stat_user_tables" Column | Type | Modifiers ---------------+---------+----------- relid | oid | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | numeric | idx_tup_fetch| numeric | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint | -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > > > Are there tables that track information that would be useful for guiding > > vacuum? Could I write a program which queries some statistical tables and and > > knows which tables need to be vacuumed? > > > > If the info is around, I could whip up something pretty easily, I think. > > Sure, Jan's new statistics tables in 7.2 had just that info. > > test=> \d pg_stat_user_tables > View "pg_stat_user_tables" > Column | Type | Modifiers > ---------------+---------+----------- > relid | oid | > relname | name | > seq_scan | bigint | > seq_tup_read | bigint | > idx_scan | numeric | > idx_tup_fetch | numeric | > n_tup_ins | bigint | > n_tup_upd | bigint | > n_tup_del | bigint | I have a system running 7.2b2, does it update these fields? Is it an option? All I am getting is zeros. relid | relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del -----------+----------------------+----------+--------------+----------+---------------+-----------+-----------+----------- 16559 | snf_cache | 0 | 0 | 0 | 0 | 0 | 0 | 0 8689760 | fav_stat | 0 | 0 | | | 0 | 0 | 0 19174244 | mbr_art_aff | 0 | 0 | 0 | 0 | 0 | 0 | 0 20788376 | artist_affinity | 0 | 0 | 0 | 0 | 0 | 0 | 0 83345144 | saffweak | 0 | 0 | | | 0 | 0 | 0 94811871 | pga_queries | 0 | 0 | | | 0 | 0 | 0 94812980 | pga_forms | 0 | 0 | | | 0 | 0 | 0 94813425 | pga_scripts | 0 | 0 | | | 0 | 0 | 0 94813869 | pga_reports | 0 | 0 | | | 0 | 0 | 0 94814245 | pga_schema | 0 | 0 | | | 0 | 0 | 0116675008 | int_song_affinity | 0 | 0 | | | 0 | 0 | 0166147508 | favorites | 0 | 0 | 0 | 0 | 0 | 0 | 0173869647 | song_affinity_orig_t | 0 | 0 | | | 0 | 0 | 0176339567 | song_affinity | 0 | 0 | 0 | 0 | 0 | 0 | 0178658941 | song_affinity_array | 0 | 0 | 0 | 0 | 0 | 0 | 0186403716 | mbr_art_aff_t | 0 | 0 | | | 0 | 0 | 0
Yes, I am seeing only zeros too. Jan? --------------------------------------------------------------------------- mlw wrote: > Bruce Momjian wrote: > > > > > Are there tables that track information that would be useful for guiding > > > vacuum? Could I write a program which queries some statistical tables and and > > > knows which tables need to be vacuumed? > > > > > > If the info is around, I could whip up something pretty easily, I think. > > > > Sure, Jan's new statistics tables in 7.2 had just that info. > > > > test=> \d pg_stat_user_tables > > View "pg_stat_user_tables" > > Column | Type | Modifiers > > ---------------+---------+----------- > > relid | oid | > > relname | name | > > seq_scan | bigint | > > seq_tup_read | bigint | > > idx_scan | numeric | > > idx_tup_fetch | numeric | > > n_tup_ins | bigint | > > n_tup_upd | bigint | > > n_tup_del | bigint | > > I have a system running 7.2b2, does it update these fields? Is it an option? > All I am getting is zeros. > > relid | relname | seq_scan | seq_tup_read | idx_scan | > idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del > -----------+----------------------+----------+--------------+----------+---------------+-----------+-----------+----------- > 16559 | snf_cache | 0 | 0 | 0 > | 0 | 0 | 0 | 0 > 8689760 | fav_stat | 0 | 0 | > | | 0 | 0 | 0 > 19174244 | mbr_art_aff | 0 | 0 | 0 > | 0 | 0 | 0 | 0 > 20788376 | artist_affinity | 0 | 0 | 0 > | 0 | 0 | 0 | 0 > 83345144 | saffweak | 0 | 0 | > | | 0 | 0 | 0 > 94811871 | pga_queries | 0 | 0 | > | | 0 | 0 | 0 > 94812980 | pga_forms | 0 | 0 | > | | 0 | 0 | 0 > 94813425 | pga_scripts | 0 | 0 | > | | 0 | 0 | 0 > 94813869 | pga_reports | 0 | 0 | > | | 0 | 0 | 0 > 94814245 | pga_schema | 0 | 0 | > | | 0 | 0 | 0 > 116675008 | int_song_affinity | 0 | 0 | > | | 0 | 0 | 0 > 166147508 | favorites | 0 | 0 | 0 > | 0 | 0 | 0 | 0 > 173869647 | song_affinity_orig_t | 0 | 0 | > | | 0 | 0 | 0 > 176339567 | song_affinity | 0 | 0 | 0 > | 0 | 0 | 0 | 0 > 178658941 | song_affinity_array | 0 | 0 | 0 > | 0 | 0 | 0 | 0 > 186403716 | mbr_art_aff_t | 0 | 0 | > | | 0 | 0 | 0 > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Yes, I am seeing only zeros too. Jan? Did you turn on statistics gathering? See the Admin Guide's discussion of database monitoring. regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Yes, I am seeing only zeros too. Jan? > > Did you turn on statistics gathering? See the Admin Guide's discussion > of database monitoring. Oops, now I remember. Those are off by default and only the query string is on by default. Thanks. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > > Tom Lane wrote: > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > Yes, I am seeing only zeros too. Jan? > > > > Did you turn on statistics gathering? See the Admin Guide's discussion > > of database monitoring. > > Oops, now I remember. Those are off by default and only the query > string is on by default. Thanks. This raises the question, by turning these on, does that affect database performance? If so, it may not be the answer for a selective vacuum. If they do not affect performance, then why have them off?
mlw wrote: > Bruce Momjian wrote: > > > > Tom Lane wrote: > > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > > Yes, I am seeing only zeros too. Jan? > > > > > > Did you turn on statistics gathering? See the Admin Guide's discussion > > > of database monitoring. > > > > Oops, now I remember. Those are off by default and only the query > > string is on by default. Thanks. > > This raises the question, by turning these on, does that affect database > performance? > > If so, it may not be the answer for a selective vacuum. > > If they do not affect performance, then why have them off? I think Jan said 2-3%. If we can get autovacuum from it, it would be a win to keep it on all the time, perhaps. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > > mlw wrote: > > Bruce Momjian wrote: > > > > > > Tom Lane wrote: > > > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > > > Yes, I am seeing only zeros too. Jan? > > > > > > > > Did you turn on statistics gathering? See the Admin Guide's discussion > > > > of database monitoring. > > > > > > Oops, now I remember. Those are off by default and only the query > > > string is on by default. Thanks. > > > > This raises the question, by turning these on, does that affect database > > performance? > > > > If so, it may not be the answer for a selective vacuum. > > > > If they do not affect performance, then why have them off? > > I think Jan said 2-3%. If we can get autovacuum from it, it would be a > win to keep it on all the time, perhaps. Assuming that the statistics get updated: How often should the sats table be queried? What sort of configurability would be needed?
> > > If they do not affect performance, then why have them off? > > > > I think Jan said 2-3%. If we can get autovacuum from it, it would be a > > win to keep it on all the time, perhaps. > > Assuming that the statistics get updated: > > How often should the sats table be queried? > What sort of configurability would be needed? You could wake up every few minutes and see how the values have changed. I don't remember if there is a way to clear that stats so you can see just the changes in the past five minutes. Vacuum the table that had activity. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
-----Original Message----- From: Bruce Momjian [mailto:pgman@candle.pha.pa.us] Sent: Tuesday, March 05, 2002 12:59 PM To: mlw Cc: Tom Lane; Nicolas Bazin; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Postgresql backend to perform vacuum automatically > > > If they do not affect performance, then why have them off? > > > > I think Jan said 2-3%. If we can get autovacuum from it, it would be a > > win to keep it on all the time, perhaps. > > Assuming that the statistics get updated: > > How often should the sats table be queried? > What sort of configurability would be needed? You could wake up every few minutes and see how the values have changed. I don't remember if there is a way to clear that stats so you can see just the changes in the past five minutes. Vacuum the table that had activity. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>> How long does it take to vacuum a table with 12 indexes and 100 million rows in it? This idea is making me very nervous. Suppose (for instance) that we have regular updates to some table, and it is constantly getting vacuum attempts thrown at it. Now imagine a large systems with many large tables which are frequently receiving updates. Would 100 simultaneous vacuum operations be a good thing when .0001% of the table has changed [on average] for each of them? I know for sure "update statistics" at regular intervals on some of the SQL systems I have used would be sheer suicide. Better make it configurable, that's for sure. <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< <<<<
On Tue, 2002-03-05 at 15:59, Bruce Momjian wrote: > > > > If they do not affect performance, then why have them off? > > > > > > I think Jan said 2-3%. If we can get autovacuum from it, it would be a > > > win to keep it on all the time, perhaps. > > > > Assuming that the statistics get updated: > > > > How often should the sats table be queried? > > What sort of configurability would be needed? > > You could wake up every few minutes and see how the values have changed. > I don't remember if there is a way to clear that stats so you can see > just the changes in the past five minutes. Vacuum the table that had > activity. Ick -- polling. The statistics process should be able to wake somebody up / notify the postmaster when the statistics change such that a vacuum is required. Neil -- Neil Padgett Red Hat Canada Ltd. E-Mail: npadgett@redhat.com 2323 Yonge Street, Suite #300, Toronto, ON M4P 2C9
Neil Padgett wrote: > On Tue, 2002-03-05 at 15:59, Bruce Momjian wrote: > > > > > If they do not affect performance, then why have them off? > > > > > > > > I think Jan said 2-3%. If we can get autovacuum from it, it would be a > > > > win to keep it on all the time, perhaps. > > > > > > Assuming that the statistics get updated: > > > > > > How often should the sats table be queried? > > > What sort of configurability would be needed? > > > > You could wake up every few minutes and see how the values have changed. > > I don't remember if there is a way to clear that stats so you can see > > just the changes in the past five minutes. Vacuum the table that had > > activity. > > Ick -- polling. The statistics process should be able to wake somebody > up / notify the postmaster when the statistics change such that a vacuum > is required. Yes, that would tie that stats collector closer to auto-vacuum, but it certainly could be done. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > > Neil Padgett wrote: > > On Tue, 2002-03-05 at 15:59, Bruce Momjian wrote: > > > > > > If they do not affect performance, then why have them off? > > > > > > > > > > I think Jan said 2-3%. If we can get autovacuum from it, it would be a > > > > > win to keep it on all the time, perhaps. > > > > > > > > Assuming that the statistics get updated: > > > > > > > > How often should the sats table be queried? > > > > What sort of configurability would be needed? > > > > > > You could wake up every few minutes and see how the values have changed. > > > I don't remember if there is a way to clear that stats so you can see > > > just the changes in the past five minutes. Vacuum the table that had > > > activity. > > > > Ick -- polling. The statistics process should be able to wake somebody > > up / notify the postmaster when the statistics change such that a vacuum > > is required. > > Yes, that would tie that stats collector closer to auto-vacuum, but it > certainly could be done. Using an alert can be done, but polling is easier for a proof of concept. I dont see too much difficulty there. We could use notify.
> > > > If they do not affect performance, then why have them off? > > > > > > I think Jan said 2-3%. If we can get autovacuum from it, it would be a > > > win to keep it on all the time, perhaps. > > > > Assuming that the statistics get updated: > > > > How often should the sats table be queried? > > What sort of configurability would be needed? > > You could wake up every few minutes and see how the values have changed. > I don't remember if there is a way to clear that stats so you can see > just the changes in the past five minutes. Vacuum the table that had > activity. I cannot envision querying the stats every 4 seconds, especially if the stats thread already has most of the info in hand. I still think, that for best results the vacuums should happen continuously for single pages based on a hook in wal or the buffer manager. Do I remember correctly, that the active page (the one receiving the next row) already has a strategy for slot reuse ? Maybe this strategy should be the followed more aggressively ? Seems the worst case is a few row table that permanently get updated, it should be possible to harness this situation with above method. Andreas
>>>>> "Bruce" == Bruce Momjian <pgman@candle.pha.pa.us> writes: >> Are there tables that track information that would be useful >> for guiding vacuum? Could I write a program whichqueries some >> statistical tables and and knows which tables need to be >> vacuumed? >> >> If the info isaround, I could whip up something pretty easily, >> I think. Bruce> Sure, Jan's new statistics tables in 7.2 had just that Bruce> info. Bruce> test=> \d pg_stat_user_tables I don't have that table. Only 'pg_statistic'... ? -- Noriega Peking smuggle Iran jihad spy domestic disruption Cocaine CIA kibo Waco, Texas toluene iodine Qaddafi FSF [See http://www.aclu.org/echelonwatch/index.html for more about this]
Zeugswetter Andreas SB SD wrote: > > > > > > If they do not affect performance, then why have them off? > > > > > > > > I think Jan said 2-3%. If we can get autovacuum from it, it would be a > > > > win to keep it on all the time, perhaps. > > > > > > Assuming that the statistics get updated: > > > > > > How often should the sats table be queried? > > > What sort of configurability would be needed? > > > > You could wake up every few minutes and see how the values have changed. > > I don't remember if there is a way to clear that stats so you can see > > just the changes in the past five minutes. Vacuum the table that had > > activity. > > I cannot envision querying the stats every 4 seconds, especially if the stats > thread already has most of the info in hand. > > I still think, that for best results the vacuums should happen continuously > for single pages based on a hook in wal or the buffer manager. Do I remember > correctly, that the active page (the one receiving the next row) already has > a strategy for slot reuse ? Maybe this strategy should be the followed more > aggressively ? > > Seems the worst case is a few row table that permanently get updated, > it should be possible to harness this situation with above method. Perhaps the statistics thread can trigger a semaphore when it sees that vacuum needs to be run?
Turbo Fredriksson wrote: > >>>>> "Bruce" == Bruce Momjian <pgman@candle.pha.pa.us> writes: > > >> Are there tables that track information that would be useful > >> for guiding vacuum? Could I write a program which queries some > >> statistical tables and and knows which tables need to be > >> vacuumed? > >> > >> If the info is around, I could whip up something pretty easily, > >> I think. > > Bruce> Sure, Jan's new statistics tables in 7.2 had just that > Bruce> info. > > Bruce> test=> \d pg_stat_user_tables > > I don't have that table. Only 'pg_statistic'... ? Upgrade to 7.2. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: > I still think, that for best results the vacuums should happen continuously > for single pages based on a hook in wal or the buffer manager. Not possible unless you are willing to have SELECTs grab much stronger locks than they do now (viz, the same kind of lock that VACUUM does). regards, tom lane
>>>>> "Bruce" == Bruce Momjian <pgman@candle.pha.pa.us> writes: Bruce> Turbo Fredriksson wrote: >> >>>>> "Bruce" == Bruce Momjian <pgman@candle.pha.pa.us> writes: >> >> >> Arethere tables that track information that would be useful >> >> for guiding vacuum? Could I write a program which queries >> some >> statistical tables and and knows which tables need to >> be >> vacuumed? >> >> >> >> If the infois around, I could whip up something pretty >> easily, >> I think. >> Bruce> Sure, Jan's new statistics tablesin 7.2 had just that Bruce> info. >> Bruce> test=> \d pg_stat_user_tables >> I don't have that table. Only'pg_statistic'... ? Bruce> Upgrade to 7.2. Doh! I have :) -- FSF Peking AK-47 Nazi arrangements quiche explosion jihad ammunition North Korea toluene spy Treasury FBI congress [See http://www.aclu.org/echelonwatch/index.html for more about this]
Tom Lane wrote: > "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: > > I still think, that for best results the vacuums should happen continuously > > for single pages based on a hook in wal or the buffer manager. > > Not possible unless you are willing to have SELECTs grab much stronger > locks than they do now (viz, the same kind of lock that VACUUM does). Remember, you can't recycle the page until all transactions are done seeing it as active. I think something periodic will do the job just fine. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> > I still think, that for best results the vacuums should happen continuously > > for single pages based on a hook in wal or the buffer manager. > > Not possible unless you are willing to have SELECTs grab much stronger > locks than they do now (viz, the same kind of lock that VACUUM does). I am talking about slots, that are marked deleted before oldest tx in progress. It should be possible to overwrite those only with the "pin" on the page. A pageread for a select does wait (spinlock) while a new txinfo is written, this is necessary since the txinfo is more than one byte, no ? It should be more or less the same situation like using a slot from the freelist, no ? Update and delete would need to check the "old page" for %free and add it to the freelist, like vacuum does. This would avoid the (imho large) overhead vacuum imposes of reading static pages that have not been modified in ages. Andreas
If I may pipe in to this discussion, I have some experience with this. With threaded postgres, the thread that is reponsible for writing out buffer pages keeps track of the number of writes on a particular relation (implied update, insert, or delete but not always). That is multiplied by a tolerance factor and accumulated until a limit is reached. After much trial and error I came up with this formula for the tolerance factor. live_tuple_count of previous vacuum * 0.01 + dead_tuple_count of previous vacuum * 0.1 + 100 to force vacuums on small relations all divided by live_tuple_count + 1 with bounds on the change in tolerance factor from the last vacuum (3.0x max increase or 80% max decrease) Lastly, the second phase of vacuum_lazy only gets triggered if more than 10% of the relation is dead. All this seems to keep relations pretty trim without getting in the way. Tables that have a large ratio of dead to live tuples get vacuumed more frequently which actually helps performance by clearing out old tuples which don't have to be scanned by other threads doing updates. Myron Scott mkscott@sacadia.com On Thu, 7 Mar 2002, Zeugswetter Andreas SB SD wrote: > > > I still think, that for best results the vacuums should happen continuously > > > for single pages based on a hook in wal or the buffer manager. > > > > Not possible unless you are willing to have SELECTs grab much stronger > > locks than they do now (viz, the same kind of lock that VACUUM does). > > I am talking about slots, that are marked deleted before oldest tx in progress. > It should be possible to overwrite those only with the "pin" on the page. > A pageread for a select does wait (spinlock) while a new txinfo is written, > this is necessary since the txinfo is more than one byte, no ? > It should be more or less the same situation like using a slot from the > freelist, no ? > Update and delete would need to check the "old page" for %free and add it to > the freelist, like vacuum does. > > This would avoid the (imho large) overhead vacuum imposes of reading static > pages that have not been modified in ages. > > Andreas > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html >
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: >> Not possible unless you are willing to have SELECTs grab much stronger >> locks than they do now (viz, the same kind of lock that VACUUM does). > I am talking about slots, that are marked deleted before oldest tx in progress. > It should be possible to overwrite those only with the "pin" on the page. No, it is not. You can't physically remove a tuple before you've removed all index entries that point to it. That requires, at minimum, a table lock that ensures that no new indexes are being created meanwhile. Which is what VACUUM uses. There's also a serious question about whether this would really improve performance. "Retail" deletion of index tuples would be fairly inefficient over the long run, compared to the bulk deletion technique used by VACUUM. regards, tom lane