Thread: detection of VACUUM in progress
Hello, Is there any way how to detect running command VACUUM by reading pg_* tables? The idea is to detectect when table is not accessible due maintainance. The approach of explicitely setting a flag into statustable is not very convenient, while I want to cover also non-systematic launching of this command Regards, Bohdan
Bohdan Linda <bohdan.linda@seznam.cz> writes: > Is there any way how to detect running command VACUUM by reading pg_* tables? > The idea is to detectect when table is not accessible due maintainance. Um, ordinary VACUUM doesn't render the table "not accessible". If you're using VACUUM FULL, maybe the right answer is to not do that. But to answer your question, you could look in pg_locks to see if there's an exclusive lock on the table. regards, tom lane
On Tue, Aug 30, 2005 at 05:35:13PM +0200, Bohdan Linda wrote: > > Is there any way how to detect running command VACUUM by reading pg_* tables? If you have stats_command_string enabled then you could query pg_stat_activity, but be aware that the results will be stale by the time you see them (there's a slight lag in updating the stats tables, and a VACUUM might start or complete immediately after you issue the query but before you read the results). This method is therefore unreliable. > The idea is to detectect when table is not accessible due maintainance. In modern versions of PostgreSQL a plain VACUUM (without FULL) should have little impact on a table's accessibility unless you're doing something that needs a strong lock (DDL, etc.). But VACUUM FULL and a few other commands do prevent other transactions from accessing a table, so if you want to check for accessibility then you need to check for more than just VACUUM. You might be able to use statement_timeout to make statements fail if they take longer than expected, and infer from the failure that the table is unavailable. But again, that information could be stale by the time you see it -- the table might become available immediately after you decide that it isn't. What problem are you trying to solve? If we knew what you're really trying to do then we might be able to make suggestions. -- Michael Fuhr
On Tue, Aug 30, 2005 at 06:07:24PM +0200, Michael Fuhr wrote: > tables, and a VACUUM might start or complete immediately after you > issue the query but before you read the results). This method is > therefore unreliable. I intend to do the VACUUM FULL during quiet hours, thus the chance of fitting exactly to the time that VACUUM started and it is not reflected in the tables is quite small. And even if it would happend, very likely it will affect only one user, who may get around hitting "refresh" button. > > > What problem are you trying to solve? If we knew what you're really > trying to do then we might be able to make suggestions. I have database, which gets around 240 000 new lines each day and about the same is also deleted each day. The table has something around 8M lines in average and simple query takes about 70s to complete(V210 1x UltraSPARC-IIIi). As this time is quite high, I need "defragment" database on daily basis. These queries get visualized in web application. My problem is, how to make the web application aware that maintainace (VACUUM FULL) is in place, but the database is not down. I really would not like to do it via extra status table, while sometimes it may happend, that someone will run VACUUM FULL ad-hoc-ly in good-faith and will forget to update the status table.
Bohdan Linda wrote: > On Tue, Aug 30, 2005 at 06:07:24PM +0200, Michael Fuhr wrote: > >>tables, and a VACUUM might start or complete immediately after you >>issue the query but before you read the results). This method is >>therefore unreliable. > > > I intend to do the VACUUM FULL during quiet hours, thus the chance of > fitting exactly to the time that VACUUM started and it is not reflected in > the tables is quite small. And even if it would happend, very likely it will > affect only one user, who may get around hitting "refresh" button. > >> >>What problem are you trying to solve? If we knew what you're really >>trying to do then we might be able to make suggestions. > > > I have database, which gets around 240 000 new lines each day and about the > same is also deleted each day. The table has something around 8M lines in > average and simple query takes about 70s to complete(V210 1x > UltraSPARC-IIIi). As this time is quite high, I need "defragment" database > on daily basis. These queries get visualized in web application. My > problem is, how to make the web application aware that maintainace (VACUUM > FULL) is in place, but the database is not down. I really would not like > to do it via extra status table, while sometimes it may happend, that > someone will run VACUUM FULL ad-hoc-ly in good-faith and will forget to > update the status table. > From the postgresql manual http://www.postgresql.org/docs/8.0/interactive/maintenance.html : " The standard form of VACUUM is best used with the goal of maintaining a fairly level steady-state usage of disk space. If you need to return disk space to the operating system you can use VACUUM FULL — but what's the point of releasing disk space that will only have to be allocated again soon? Moderately frequent standard VACUUM runs are a better approach than infrequent VACUUM FULL runs for maintaining heavily-updated tables." From this I conclude that an ordinary VACUUM is sufficent to your purpose cause you insert/delete almost the same amount of data daily. But then again I can be mistaken so if anyone can back me up here or throw the manual on me will be nice ;P Cheers
> > From the postgresql manual > http://www.postgresql.org/docs/8.0/interactive/maintenance.html : > " The standard form of VACUUM is best used with the goal of maintaining > a fairly level steady-state usage of disk space. If you need to return > disk space to the operating system you can use VACUUM FULL ? but what's > the point of releasing disk space that will only have to be allocated > again soon? Moderately frequent standard VACUUM runs are a better > approach than infrequent VACUUM FULL runs for maintaining > heavily-updated tables." > > From this I conclude that an ordinary VACUUM is sufficent to your > purpose cause you insert/delete almost the same amount of data daily. > > But then again I can be mistaken so if anyone can back me up here or > throw the manual on me will be nice ;P If I vacuum only the table, the records may be used by new lines, that is fine. Problem is, that when creating select on such table, it takes more pages to be read from the IO (it will read laso deleted rows) thus the select will last a bit longer. regards, Bohdan
Your table size should stabilize to the size you currently get before VACUUM FULL.... If you're afraid about having too many "deleted" rows, you just have to run VACUUM more often... I think that you have to test how often you have to run VACUUM so that your performance is not impacted.... Or perhaps you could try pg_autovacuum which will run VACUUM for you based on statistics... Regards, Patrick ---------------------------------------------------------------------------- --------------- Patrick Fiche email : patrick.fiche@aqsacom.com tel : 01 69 29 36 18 ---------------------------------------------------------------------------- --------------- -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Bohdan Linda Sent: mercredi 31 aout 2005 14:19 To: Ben-Nes Yonatan Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] detection of VACUUM in progress > > From the postgresql manual > http://www.postgresql.org/docs/8.0/interactive/maintenance.html : > " The standard form of VACUUM is best used with the goal of maintaining > a fairly level steady-state usage of disk space. If you need to return > disk space to the operating system you can use VACUUM FULL ? but what's > the point of releasing disk space that will only have to be allocated > again soon? Moderately frequent standard VACUUM runs are a better > approach than infrequent VACUUM FULL runs for maintaining > heavily-updated tables." > > From this I conclude that an ordinary VACUUM is sufficent to your > purpose cause you insert/delete almost the same amount of data daily. > > But then again I can be mistaken so if anyone can back me up here or > throw the manual on me will be nice ;P If I vacuum only the table, the records may be used by new lines, that is fine. Problem is, that when creating select on such table, it takes more pages to be read from the IO (it will read laso deleted rows) thus the select will last a bit longer. regards, Bohdan ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster
Bohdan Linda wrote: >From the postgresql manual >>http://www.postgresql.org/docs/8.0/interactive/maintenance.html : >>" The standard form of VACUUM is best used with the goal of maintaining >>a fairly level steady-state usage of disk space. If you need to return >>disk space to the operating system you can use VACUUM FULL ? but what's >>the point of releasing disk space that will only have to be allocated >>again soon? Moderately frequent standard VACUUM runs are a better >>approach than infrequent VACUUM FULL runs for maintaining >>heavily-updated tables." >> >From this I conclude that an ordinary VACUUM is sufficent to your >>purpose cause you insert/delete almost the same amount of data daily. >> >>But then again I can be mistaken so if anyone can back me up here or >>throw the manual on me will be nice ;P > > > > If I vacuum only the table, the records may be used by new lines, that is > fine. Problem is, that when creating select on such table, it takes more > pages to be read from the IO (it will read laso deleted rows) thus the > select will last a bit longer. > > regards, > Bohdan > As far as I understand the vacuum process, it does delete the "deleted rows" so the next queries after the vacuum will not read those rows. Of course that the table will be the same size as with the "deleted rows" but I dont think that with a proper index it will result in any meaningful overhead... But then again its just my opinion and anyway lately im feeling increasingly amateur at this subject... :) Cheers! Ben-Nes Yonatan
On Wed, 2005-08-31 at 07:18, Bohdan Linda wrote: > > > > From the postgresql manual > > http://www.postgresql.org/docs/8.0/interactive/maintenance.html : > > " The standard form of VACUUM is best used with the goal of maintaining > > a fairly level steady-state usage of disk space. If you need to return > > disk space to the operating system you can use VACUUM FULL ? but what's > > the point of releasing disk space that will only have to be allocated > > again soon? Moderately frequent standard VACUUM runs are a better > > approach than infrequent VACUUM FULL runs for maintaining > > heavily-updated tables." > > > > From this I conclude that an ordinary VACUUM is sufficent to your > > purpose cause you insert/delete almost the same amount of data daily. > > > > But then again I can be mistaken so if anyone can back me up here or > > throw the manual on me will be nice ;P > > > If I vacuum only the table, the records may be used by new lines, that is > fine. Problem is, that when creating select on such table, it takes more > pages to be read from the IO (it will read laso deleted rows) thus the > select will last a bit longer. It really depends on what percentage of rows you're updating. If you are updating 240,000 rows a day, and have a database with 100M rows, then that's not too bad. Regular vacuums once a day would be plenty. If you're updating 240,000 rows a day, spread out over the day, and you have a table that has 10,000 rows, then you will need to run vacuum far more often to keep the table at a steady state of 10,000 to 20,000 rows. If you're doing the 240,000 updates all at once on a small table, then you might well be a candidate for a vacuum full. So, again, it's about percentages really. If 240k rows represents 1% of your table, then daily, regular vacuums will do fine. Personally, I just install pg_autovacuum and check on it once a week or so to make sure it's doing its job.