Thread: Postgresql backend to perform vacuum automatically

Postgresql backend to perform vacuum automatically

From
"Nicolas Bazin"
Date:
From what I read in the recent exchanges in the PostgreSQL vs ORACLE thread it would seem a good idea for the backend to keep track of the number of update performed on a database and after a certain threshold start a vacuum in a separate process by itself.
Any comments?
 

Re: Postgresql backend to perform vacuum automatically

From
Bruce Momjian
Date:
Nicolas Bazin wrote:
> >From what I read in the recent exchanges in the PostgreSQL vs ORACLE thread it would seem a good idea for the
backendto keep track of the number of update performed on a database and after a certain threshold start a vacuum in a
separateprocess by itself.
 
> Any comments?

Yes, makes sense to me, especially now that we have a nolocking vacuum. 
Tom, do you have any ideas on this?

-- Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Postgresql backend to perform vacuum automatically

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Nicolas Bazin wrote:
> From what I read in the recent exchanges in the PostgreSQL vs ORACLE thread it would seem a good idea for the backend
tokeep track of the number of update performed on a database and after a certain threshold start a vacuum in a separate
processby itself.
 
>> Any comments?

> Yes, makes sense to me, especially now that we have a nolocking vacuum. 
> Tom, do you have any ideas on this?

There's a TODO item about this already.

* Provide automatic scheduling of background vacuum (Tom)
        regards, tom lane


Re: Postgresql backend to perform vacuum automatically

From
"Christopher Kings-Lynne"
Date:
> > Yes, makes sense to me, especially now that we have a nolocking vacuum.
> > Tom, do you have any ideas on this?
>
> There's a TODO item about this already.
>
> * Provide automatic scheduling of background vacuum (Tom)

I think an good system would have some parameters something like this in
postgresql.conf:

vacuum_update_threshold
= num of operations that cause frags on a table before a vacuum is run, 0 =
no requirement, if followed by a percent (%) sign, indicates percentage of
rows changed, rather than an absolute number

vacuum_idle_threshold
= system load below which the system must be before a vacuum can be
performed.  0 = no requirement

vacuum_time_threshold
= seconds since last vacuum before which another vacuum cannot occur. 0 = no
requirement.

If all 3 are 0, then no auto-vacuuming is performed at all.  There's
probably trouble if only the idle threshold is set to zero.

And the same for the 'analyze' command?

If they want it on a per-table basis, then they can just do it themselves
with a cronjob!

Chris




Re: Postgresql backend to perform vacuum automatically

From
Bruce Momjian
Date:
> vacuum_idle_threshold
> = system load below which the system must be before a vacuum can be
> performed.  0 = no requirement
> 
> vacuum_time_threshold
> = seconds since last vacuum before which another vacuum cannot occur. 0 = no
> requirement.
> 
> If all 3 are 0, then no auto-vacuuming is performed at all.  There's
> probably trouble if only the idle threshold is set to zero.
> 
> And the same for the 'analyze' command?
> 
> If they want it on a per-table basis, then they can just do it themselves
> with a cronjob!

Yes, and Jan's statistics stuff has stats on table activity.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Postgresql backend to perform vacuum automatically

From
mlw
Date:
Tom Lane wrote:
> 
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Nicolas Bazin wrote:
> > From what I read in the recent exchanges in the PostgreSQL vs ORACLE thread it would seem a good idea for the
backendto keep track of the number of update performed on a database and after a certain threshold start a vacuum in a
separateprocess by itself.
 
> >> Any comments?
> 
> > Yes, makes sense to me, especially now that we have a nolocking vacuum.
> > Tom, do you have any ideas on this?
> 
> There's a TODO item about this already.
> 
> * Provide automatic scheduling of background vacuum (Tom)

I have been thinking about this. I like the idea, but it may be problematic.

I suggested running a vacuum process on a constant low priority in the
background, and it was pointed out that this may cause some deadlock issues.

For vacuum to run in the background, it needs to be more regulated or targeted.

It needs to be able to know which tables need it. Many tables are static and
never get updated, vacuuming them would be pointless. It needs to be sensitive
to database load, and be tunable to vacuum only when safe or necessary to
reduce load.

Are there tables that track information that would be useful for guiding
vacuum? Could I write a program which queries some statistical tables and and
knows which tables need to be vacuumed?

If the info is around, I could whip up something pretty easily, I think.


Re: Postgresql backend to perform vacuum automatically

From
Bruce Momjian
Date:
> Are there tables that track information that would be useful for guiding
> vacuum? Could I write a program which queries some statistical tables and and
> knows which tables need to be vacuumed?
> 
> If the info is around, I could whip up something pretty easily, I think.

Sure, Jan's new statistics tables in 7.2 had just that info.test=> \d pg_stat_user_tables     View
"pg_stat_user_tables"   Column     |  Type   | Modifiers ---------------+---------+----------- relid         | oid
| relname       | name    |  seq_scan      | bigint  |  seq_tup_read  | bigint  |  idx_scan      | numeric |
idx_tup_fetch| numeric |  n_tup_ins     | bigint  |  n_tup_upd     | bigint  |  n_tup_del     | bigint  | 
 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Postgresql backend to perform vacuum automatically

From
mlw
Date:
Bruce Momjian wrote:
> 
> > Are there tables that track information that would be useful for guiding
> > vacuum? Could I write a program which queries some statistical tables and and
> > knows which tables need to be vacuumed?
> >
> > If the info is around, I could whip up something pretty easily, I think.
> 
> Sure, Jan's new statistics tables in 7.2 had just that info.
> 
>         test=> \d pg_stat_user_tables
>              View "pg_stat_user_tables"
>             Column     |  Type   | Modifiers
>         ---------------+---------+-----------
>          relid         | oid     |
>          relname       | name    |
>          seq_scan      | bigint  |
>          seq_tup_read  | bigint  |
>          idx_scan      | numeric |
>          idx_tup_fetch | numeric |
>          n_tup_ins     | bigint  |
>          n_tup_upd     | bigint  |
>          n_tup_del     | bigint  |

I have a system running 7.2b2, does it update these fields? Is it an option?
All I am getting is zeros.
  relid   |       relname        | seq_scan | seq_tup_read | idx_scan |
idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del

-----------+----------------------+----------+--------------+----------+---------------+-----------+-----------+-----------
  16559 | snf_cache            |        0 |            0 |        0
 
|             0 |         0 |         0 |         0  8689760 | fav_stat             |        0 |            0 |

|               |         0 |         0 |         0 19174244 | mbr_art_aff          |        0 |            0 |
0
|             0 |         0 |         0 |         0 20788376 | artist_affinity      |        0 |            0 |
0
|             0 |         0 |         0 |         0 83345144 | saffweak             |        0 |            0 |

|               |         0 |         0 |         0 94811871 | pga_queries          |        0 |            0 |

|               |         0 |         0 |         0 94812980 | pga_forms            |        0 |            0 |

|               |         0 |         0 |         0 94813425 | pga_scripts          |        0 |            0 |

|               |         0 |         0 |         0 94813869 | pga_reports          |        0 |            0 |

|               |         0 |         0 |         0 94814245 | pga_schema           |        0 |            0 |

|               |         0 |         0 |         0116675008 | int_song_affinity    |        0 |            0 |

|               |         0 |         0 |         0166147508 | favorites            |        0 |            0 |
0
|             0 |         0 |         0 |         0173869647 | song_affinity_orig_t |        0 |            0 |

|               |         0 |         0 |         0176339567 | song_affinity        |        0 |            0 |
0
|             0 |         0 |         0 |         0178658941 | song_affinity_array  |        0 |            0 |
0
|             0 |         0 |         0 |         0186403716 | mbr_art_aff_t        |        0 |            0 |

|               |         0 |         0 |         0


Re: Postgresql backend to perform vacuum automatically

From
Bruce Momjian
Date:
Yes, I am seeing only zeros too.  Jan?

---------------------------------------------------------------------------

mlw wrote:
> Bruce Momjian wrote:
> > 
> > > Are there tables that track information that would be useful for guiding
> > > vacuum? Could I write a program which queries some statistical tables and and
> > > knows which tables need to be vacuumed?
> > >
> > > If the info is around, I could whip up something pretty easily, I think.
> > 
> > Sure, Jan's new statistics tables in 7.2 had just that info.
> > 
> >         test=> \d pg_stat_user_tables
> >              View "pg_stat_user_tables"
> >             Column     |  Type   | Modifiers
> >         ---------------+---------+-----------
> >          relid         | oid     |
> >          relname       | name    |
> >          seq_scan      | bigint  |
> >          seq_tup_read  | bigint  |
> >          idx_scan      | numeric |
> >          idx_tup_fetch | numeric |
> >          n_tup_ins     | bigint  |
> >          n_tup_upd     | bigint  |
> >          n_tup_del     | bigint  |
> 
> I have a system running 7.2b2, does it update these fields? Is it an option?
> All I am getting is zeros.
> 
>    relid   |       relname        | seq_scan | seq_tup_read | idx_scan |
> idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
>
-----------+----------------------+----------+--------------+----------+---------------+-----------+-----------+-----------
>      16559 | snf_cache            |        0 |            0 |        0
> |             0 |         0 |         0 |         0
>    8689760 | fav_stat             |        0 |            0 |         
> |               |         0 |         0 |         0
>   19174244 | mbr_art_aff          |        0 |            0 |        0
> |             0 |         0 |         0 |         0
>   20788376 | artist_affinity      |        0 |            0 |        0
> |             0 |         0 |         0 |         0
>   83345144 | saffweak             |        0 |            0 |         
> |               |         0 |         0 |         0
>   94811871 | pga_queries          |        0 |            0 |         
> |               |         0 |         0 |         0
>   94812980 | pga_forms            |        0 |            0 |         
> |               |         0 |         0 |         0
>   94813425 | pga_scripts          |        0 |            0 |         
> |               |         0 |         0 |         0
>   94813869 | pga_reports          |        0 |            0 |         
> |               |         0 |         0 |         0
>   94814245 | pga_schema           |        0 |            0 |         
> |               |         0 |         0 |         0
>  116675008 | int_song_affinity    |        0 |            0 |         
> |               |         0 |         0 |         0
>  166147508 | favorites            |        0 |            0 |        0
> |             0 |         0 |         0 |         0
>  173869647 | song_affinity_orig_t |        0 |            0 |         
> |               |         0 |         0 |         0
>  176339567 | song_affinity        |        0 |            0 |        0
> |             0 |         0 |         0 |         0
>  178658941 | song_affinity_array  |        0 |            0 |        0
> |             0 |         0 |         0 |         0
>  186403716 | mbr_art_aff_t        |        0 |            0 |         
> |               |         0 |         0 |         0
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Postgresql backend to perform vacuum automatically

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Yes, I am seeing only zeros too.  Jan?

Did you turn on statistics gathering?  See the Admin Guide's discussion
of database monitoring.
        regards, tom lane


Re: Postgresql backend to perform vacuum automatically

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Yes, I am seeing only zeros too.  Jan?
> 
> Did you turn on statistics gathering?  See the Admin Guide's discussion
> of database monitoring.

Oops, now I remember.  Those are off by default and only the query
string is on by default.  Thanks.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Postgresql backend to perform vacuum automatically

From
mlw
Date:
Bruce Momjian wrote:
> 
> Tom Lane wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > Yes, I am seeing only zeros too.  Jan?
> >
> > Did you turn on statistics gathering?  See the Admin Guide's discussion
> > of database monitoring.
> 
> Oops, now I remember.  Those are off by default and only the query
> string is on by default.  Thanks.

This raises the question, by turning these on, does that affect database
performance?

If so, it may not be the answer for a selective vacuum.

If they do not affect performance, then why have them off?


Re: Postgresql backend to perform vacuum automatically

From
Bruce Momjian
Date:
mlw wrote:
> Bruce Momjian wrote:
> > 
> > Tom Lane wrote:
> > > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > > Yes, I am seeing only zeros too.  Jan?
> > >
> > > Did you turn on statistics gathering?  See the Admin Guide's discussion
> > > of database monitoring.
> > 
> > Oops, now I remember.  Those are off by default and only the query
> > string is on by default.  Thanks.
> 
> This raises the question, by turning these on, does that affect database
> performance?
> 
> If so, it may not be the answer for a selective vacuum.
> 
> If they do not affect performance, then why have them off?

I think Jan said 2-3%.  If we can get autovacuum from it, it would be a
win to keep it on all the time, perhaps.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Postgresql backend to perform vacuum automatically

From
mlw
Date:
Bruce Momjian wrote:
> 
> mlw wrote:
> > Bruce Momjian wrote:
> > >
> > > Tom Lane wrote:
> > > > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > > > Yes, I am seeing only zeros too.  Jan?
> > > >
> > > > Did you turn on statistics gathering?  See the Admin Guide's discussion
> > > > of database monitoring.
> > >
> > > Oops, now I remember.  Those are off by default and only the query
> > > string is on by default.  Thanks.
> >
> > This raises the question, by turning these on, does that affect database
> > performance?
> >
> > If so, it may not be the answer for a selective vacuum.
> >
> > If they do not affect performance, then why have them off?
> 
> I think Jan said 2-3%.  If we can get autovacuum from it, it would be a
> win to keep it on all the time, perhaps.

Assuming that the statistics get updated:

How often should the sats table be queried?
What sort of configurability would be needed?


Re: Postgresql backend to perform vacuum automatically

From
Bruce Momjian
Date:
> > > If they do not affect performance, then why have them off?
> > 
> > I think Jan said 2-3%.  If we can get autovacuum from it, it would be a
> > win to keep it on all the time, perhaps.
> 
> Assuming that the statistics get updated:
> 
> How often should the sats table be queried?
> What sort of configurability would be needed?

You could wake up every few minutes and see how the values have changed.
I don't remember if there is a way to clear that stats so you can see
just the changes in the past five minutes.  Vacuum the table that had
activity.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Postgresql backend to perform vacuum automatically

From
"Dann Corbit"
Date:
-----Original Message-----
From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
Sent: Tuesday, March 05, 2002 12:59 PM
To: mlw
Cc: Tom Lane; Nicolas Bazin; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Postgresql backend to perform vacuum
automatically


> > > If they do not affect performance, then why have them off?
> >
> > I think Jan said 2-3%.  If we can get autovacuum from it, it would
be a
> > win to keep it on all the time, perhaps.
>
> Assuming that the statistics get updated:
>
> How often should the sats table be queried?
> What sort of configurability would be needed?

You could wake up every few minutes and see how the values have changed.
I don't remember if there is a way to clear that stats so you can see
just the changes in the past five minutes.  Vacuum the table that had
activity.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>
How long does it take to vacuum a table with 12 indexes and 100 million
rows in it?  This idea is making me very nervous.  Suppose (for
instance)
that we have regular updates to some table, and it is constantly getting

vacuum attempts thrown at it.

Now imagine a large systems with many large tables which are frequently
receiving updates.  Would 100 simultaneous vacuum operations be a good
thing when .0001% of the table has changed [on average] for each of
them?

I know for sure "update statistics" at regular intervals on some of the
SQL systems I have used would be sheer suicide.

Better make it configurable, that's for sure.
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<


Re: Postgresql backend to perform vacuum automatically

From
Neil Padgett
Date:
On Tue, 2002-03-05 at 15:59, Bruce Momjian wrote:
> > > > If they do not affect performance, then why have them off?
> > > 
> > > I think Jan said 2-3%.  If we can get autovacuum from it, it would be a
> > > win to keep it on all the time, perhaps.
> > 
> > Assuming that the statistics get updated:
> > 
> > How often should the sats table be queried?
> > What sort of configurability would be needed?
> 
> You could wake up every few minutes and see how the values have changed.
> I don't remember if there is a way to clear that stats so you can see
> just the changes in the past five minutes.  Vacuum the table that had
> activity.

Ick -- polling. The statistics process should be able to wake somebody
up / notify the postmaster when the statistics change such that a vacuum
is required.

Neil

-- 
Neil Padgett
Red Hat Canada Ltd.                       E-Mail:  npadgett@redhat.com
2323 Yonge Street, Suite #300, 
Toronto, ON  M4P 2C9



Re: Postgresql backend to perform vacuum automatically

From
Bruce Momjian
Date:
Neil Padgett wrote:
> On Tue, 2002-03-05 at 15:59, Bruce Momjian wrote:
> > > > > If they do not affect performance, then why have them off?
> > > > 
> > > > I think Jan said 2-3%.  If we can get autovacuum from it, it would be a
> > > > win to keep it on all the time, perhaps.
> > > 
> > > Assuming that the statistics get updated:
> > > 
> > > How often should the sats table be queried?
> > > What sort of configurability would be needed?
> > 
> > You could wake up every few minutes and see how the values have changed.
> > I don't remember if there is a way to clear that stats so you can see
> > just the changes in the past five minutes.  Vacuum the table that had
> > activity.
> 
> Ick -- polling. The statistics process should be able to wake somebody
> up / notify the postmaster when the statistics change such that a vacuum
> is required.

Yes, that would tie that stats collector closer to auto-vacuum, but it
certainly could be done.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Postgresql backend to perform vacuum automatically

From
mlw
Date:
Bruce Momjian wrote:
> 
> Neil Padgett wrote:
> > On Tue, 2002-03-05 at 15:59, Bruce Momjian wrote:
> > > > > > If they do not affect performance, then why have them off?
> > > > >
> > > > > I think Jan said 2-3%.  If we can get autovacuum from it, it would be a
> > > > > win to keep it on all the time, perhaps.
> > > >
> > > > Assuming that the statistics get updated:
> > > >
> > > > How often should the sats table be queried?
> > > > What sort of configurability would be needed?
> > >
> > > You could wake up every few minutes and see how the values have changed.
> > > I don't remember if there is a way to clear that stats so you can see
> > > just the changes in the past five minutes.  Vacuum the table that had
> > > activity.
> >
> > Ick -- polling. The statistics process should be able to wake somebody
> > up / notify the postmaster when the statistics change such that a vacuum
> > is required.
> 
> Yes, that would tie that stats collector closer to auto-vacuum, but it
> certainly could be done.

Using an alert can be done, but polling is easier for a proof of concept. I
dont see too much difficulty there. We could use notify.


Re: Postgresql backend to perform vacuum automatically

From
"Zeugswetter Andreas SB SD"
Date:
> > > > If they do not affect performance, then why have them off?
> > >
> > > I think Jan said 2-3%.  If we can get autovacuum from it, it would be a
> > > win to keep it on all the time, perhaps.
> >
> > Assuming that the statistics get updated:
> >
> > How often should the sats table be queried?
> > What sort of configurability would be needed?
>
> You could wake up every few minutes and see how the values have changed.
> I don't remember if there is a way to clear that stats so you can see
> just the changes in the past five minutes.  Vacuum the table that had
> activity.

I cannot envision querying the stats every 4 seconds, especially if the stats
thread already has most of the info in hand.

I still think, that for best results the vacuums should happen continuously
for single pages based on a hook in wal or the buffer manager. Do I remember
correctly, that the active page (the one receiving the next row) already has
a strategy for slot reuse ? Maybe this strategy should be the followed more
aggressively ?

Seems the worst case is a few row table that permanently get updated,
it should be possible to harness this situation with above method.

Andreas


Re: Postgresql backend to perform vacuum automatically

From
Turbo Fredriksson
Date:
>>>>> "Bruce" == Bruce Momjian <pgman@candle.pha.pa.us> writes:
   >> Are there tables that track information that would be useful   >> for guiding vacuum? Could I write a program
whichqueries some   >> statistical tables and and knows which tables need to be   >> vacuumed?   >>    >> If the info
isaround, I could whip up something pretty easily,   >> I think.
 
   Bruce> Sure, Jan's new statistics tables in 7.2 had just that   Bruce> info.   Bruce>     test=> \d
pg_stat_user_tables

I don't have that table. Only 'pg_statistic'... ?
-- 
Noriega Peking smuggle Iran jihad spy domestic disruption Cocaine CIA
kibo Waco, Texas toluene iodine Qaddafi FSF
[See http://www.aclu.org/echelonwatch/index.html for more about this]


Re: Postgresql backend to perform vacuum automatically

From
mlw
Date:
Zeugswetter Andreas SB SD wrote:
> 
> > > > > If they do not affect performance, then why have them off?
> > > >
> > > > I think Jan said 2-3%.  If we can get autovacuum from it, it would be a
> > > > win to keep it on all the time, perhaps.
> > >
> > > Assuming that the statistics get updated:
> > >
> > > How often should the sats table be queried?
> > > What sort of configurability would be needed?
> >
> > You could wake up every few minutes and see how the values have changed.
> > I don't remember if there is a way to clear that stats so you can see
> > just the changes in the past five minutes.  Vacuum the table that had
> > activity.
> 
> I cannot envision querying the stats every 4 seconds, especially if the stats
> thread already has most of the info in hand.
> 
> I still think, that for best results the vacuums should happen continuously
> for single pages based on a hook in wal or the buffer manager. Do I remember
> correctly, that the active page (the one receiving the next row) already has
> a strategy for slot reuse ? Maybe this strategy should be the followed more
> aggressively ?
> 
> Seems the worst case is a few row table that permanently get updated,
> it should be possible to harness this situation with above method.

Perhaps the statistics thread can trigger a semaphore when it sees that vacuum
needs to be run?


Re: Postgresql backend to perform vacuum automatically

From
Bruce Momjian
Date:
Turbo Fredriksson wrote:
> >>>>> "Bruce" == Bruce Momjian <pgman@candle.pha.pa.us> writes:
> 
>     >> Are there tables that track information that would be useful
>     >> for guiding vacuum? Could I write a program which queries some
>     >> statistical tables and and knows which tables need to be
>     >> vacuumed?
>     >> 
>     >> If the info is around, I could whip up something pretty easily,
>     >> I think.
> 
>     Bruce> Sure, Jan's new statistics tables in 7.2 had just that
>     Bruce> info.
>     
>     Bruce>     test=> \d pg_stat_user_tables
> 
> I don't have that table. Only 'pg_statistic'... ?

Upgrade to 7.2.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Postgresql backend to perform vacuum automatically

From
Tom Lane
Date:
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
> I still think, that for best results the vacuums should happen continuously
> for single pages based on a hook in wal or the buffer manager.

Not possible unless you are willing to have SELECTs grab much stronger
locks than they do now (viz, the same kind of lock that VACUUM does).
        regards, tom lane


Re: Postgresql backend to perform vacuum automatically

From
Turbo Fredriksson
Date:
>>>>> "Bruce" == Bruce Momjian <pgman@candle.pha.pa.us> writes:
   Bruce> Turbo Fredriksson wrote:   >> >>>>> "Bruce" == Bruce Momjian <pgman@candle.pha.pa.us> writes:   >>    >> >>
Arethere tables that track information that would be useful   >> >> for guiding vacuum? Could I write a program which
queries  >> some >> statistical tables and and knows which tables need to   >> be >> vacuumed?   >> >>    >> >> If the
infois around, I could whip up something pretty   >> easily, >> I think.   >>    Bruce> Sure, Jan's new statistics
tablesin 7.2 had just that   Bruce> info.   >>   Bruce> test=> \d pg_stat_user_tables   >>  I don't have that table.
Only'pg_statistic'... ?
 
   Bruce> Upgrade to 7.2.

Doh! I have :)
-- 
FSF Peking AK-47 Nazi arrangements quiche explosion jihad ammunition
North Korea toluene spy Treasury FBI congress
[See http://www.aclu.org/echelonwatch/index.html for more about this]


Re: Postgresql backend to perform vacuum automatically

From
Bruce Momjian
Date:
Tom Lane wrote:
> "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
> > I still think, that for best results the vacuums should happen continuously
> > for single pages based on a hook in wal or the buffer manager.
> 
> Not possible unless you are willing to have SELECTs grab much stronger
> locks than they do now (viz, the same kind of lock that VACUUM does).

Remember, you can't recycle the page until all transactions are done
seeing it as active.  I think something periodic will do the job just
fine.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Postgresql backend to perform vacuum automatically

From
"Zeugswetter Andreas SB SD"
Date:
> > I still think, that for best results the vacuums should happen continuously
> > for single pages based on a hook in wal or the buffer manager.
>
> Not possible unless you are willing to have SELECTs grab much stronger
> locks than they do now (viz, the same kind of lock that VACUUM does).

I am talking about slots, that are marked deleted before oldest tx in progress.
It should be possible to overwrite those only with the "pin" on the page.
A pageread for a select does wait (spinlock) while a new txinfo is written,
this is necessary since the txinfo is more than one byte, no ?
It should be more or less the same situation like using a slot from the
freelist, no ?
Update and delete would need to check the "old page" for %free and add it to
the freelist, like vacuum does.

This would avoid the (imho large) overhead vacuum imposes of reading static
pages that have not been modified in ages.

Andreas


Re: Postgresql backend to perform vacuum automatically

From
Date:
If I may pipe in to this discussion, I have some experience with this.
With threaded postgres, the thread that is reponsible for writing out
buffer pages keeps track of the number of writes on a particular relation
(implied update, insert, or delete but not always).  That is multiplied by
a tolerance factor and accumulated until a limit is reached.  After much
trial and error I came up with this formula for the tolerance factor.

live_tuple_count of previous vacuum * 0.01 
+
dead_tuple_count of previous vacuum * 0.1
+
100 to force vacuums on small relations

all divided by live_tuple_count + 1

with bounds on the change in tolerance factor from 
the last vacuum (3.0x max increase or 80% max decrease)

Lastly, the second phase of vacuum_lazy only gets triggered 
if more than 10% of the relation is dead.

All this seems to keep relations pretty trim without getting
in the way.  Tables that have a large ratio of dead to live tuples 
get vacuumed more frequently which actually helps performance by clearing 
out old tuples which don't have to be scanned by other threads doing
updates.

Myron Scott
mkscott@sacadia.com

On Thu, 7 Mar 2002, Zeugswetter Andreas SB SD wrote:

> > > I still think, that for best results the vacuums should happen continuously
> > > for single pages based on a hook in wal or the buffer manager.
> > 
> > Not possible unless you are willing to have SELECTs grab much stronger
> > locks than they do now (viz, the same kind of lock that VACUUM does).
> 
> I am talking about slots, that are marked deleted before oldest tx in progress.
> It should be possible to overwrite those only with the "pin" on the page.
> A pageread for a select does wait (spinlock) while a new txinfo is written,
> this is necessary since the txinfo is more than one byte, no ?
> It should be more or less the same situation like using a slot from the 
> freelist, no ?
> Update and delete would need to check the "old page" for %free and add it to 
> the freelist, like vacuum does.
> 
> This would avoid the (imho large) overhead vacuum imposes of reading static 
> pages that have not been modified in ages.
> 
> Andreas
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
> 
> http://www.postgresql.org/users-lounge/docs/faq.html
> 



Re: Postgresql backend to perform vacuum automatically

From
Tom Lane
Date:
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
>> Not possible unless you are willing to have SELECTs grab much stronger
>> locks than they do now (viz, the same kind of lock that VACUUM does).

> I am talking about slots, that are marked deleted before oldest tx in progress.
> It should be possible to overwrite those only with the "pin" on the page.

No, it is not.

You can't physically remove a tuple before you've removed all index
entries that point to it.  That requires, at minimum, a table lock that
ensures that no new indexes are being created meanwhile.  Which is what
VACUUM uses.

There's also a serious question about whether this would really improve
performance.  "Retail" deletion of index tuples would be fairly
inefficient over the long run, compared to the bulk deletion technique
used by VACUUM.
        regards, tom lane