Thread: Autovacuum query

Autovacuum query

From
Mitu Verma
Date:

Correcting the subject

 

From: Mitu Verma
Sent: March 26, 2015 9:28 AM
To: pgsql-general@postgresql.org
Cc: pgsql-general@postgresql.org
Subject: RE: [GENERAL] Populating missing dates in postgresql data

 

Hi,

 

We have a customer complaining about the time taken by one of the application scripts while deleting older data from the log tables.

During the deletion, customer reported that he often sees the below error and because of which table size doesn’t reduce.

 

ERROR: canceling autovacuum task

Date: 2015-03-14 04:29:19

Context: automatic analyze of table "fm_db_Server3.mmsuper.audittraillogentry"

 

We have the following queries in this regard:

-          How often is the autovacuum task invoked by postgres

-          If the task gets cancelled (while we were deleting data from the database) would this task be re-invoked at a later time?

-          If insertion of data into a table also impact this task?

-          If we can manually schedule this task to a particular time (like off peak hours)?

 

 

Regards

Mitu

 

 

 

Re: Autovacuum query

From
"David G. Johnston"
Date:
On Wed, Mar 25, 2015 at 8:58 PM, Mitu Verma <mitu.verma@ericsson.com> wrote:

Correcting the subject


​And this is why it is considered good form to do "compose new message" instead of replying to an existing one.  Injecting your new topic into an existing unrelated mail thread is mildly annoying.

David J.
 

Re: Autovacuum query

From
Bill Moran
Date:
On Thu, 26 Mar 2015 03:58:59 +0000
Mitu Verma <mitu.verma@ericsson.com> wrote:
>
> We have a customer complaining about the time taken by one of the application scripts while deleting older data from
thelog tables. 
> During the deletion, customer reported that he often sees the below error and because of which table size doesn?t
reduce.
>
> ERROR: canceling autovacuum task
> Date: 2015-03-14 04:29:19
> Context: automatic analyze of table "fm_db_Server3.mmsuper.audittraillogentry"
>
> We have the following queries in this regard:
>
> -          How often is the autovacuum task invoked by postgres

As needed. Read:
http://www.postgresql.org/docs/9.4/static/routine-vacuuming.html

> -          If the task gets cancelled (while we were deleting data from the database) would this task be re-invoked
ata later time? 

Yes. Read the above

> -          If insertion of data into a table also impact this task?

No. Vacuum operations are not triggered by INSERTs.

> -          If we can manually schedule this task to a particular time (like off peak hours)?

Yes, but given the questions you're asking, you probably do not
have a good enough understanding of the situation to schedule it
correctly and will make the problem worse. You can run it manually
any time you want, but I don't recommend that you disable
autovacuum unless you have a good understanding of what you're
doing.

Let me take a guess at the problem: The table gets LOTs of inserts,
constantly, and somewhere there's a job that runs out of cron or
some similar scheduler that DELETEs a lot of those rows in a big
chunk. The DELETE process probably runs infrequently, like once
a day or even once a week because the designers thought it would
be best to get everything taken care of all at once during some
real or perceived slow period on the database.

One solution to this is to run the DELETE process more frequently,
such as every 15 minutes. In such a case, the process will run
much faster, make less changes, and require less work on the part
of autovacuum to clean up after. People frequently complain that
"this will impact performance if run during normal use hours,"
but in every case I've seen, nobody had actually tested to see
if that statement was true, and running smaller purges more
frequently actually solved the problem.

Another option would be to manually run vacuum after the big
DELETE runs.
See http://www.postgresql.org/docs/9.4/static/sql-vacuum.html

Don't fall into the trap of running VACUUM FULL. This is usually
a bad idea. If the client is complaining about reclaiming disk
space, start asking some hard questions: How much space is too
much? Why are you convinced that the space is wasted?
Usually the correct answer is to add more disk space, since
Postgres tends to fall into a groove with a particular table
whereby the "unused" space is actually being used and reclaimed
by data tuples as the data in the table changes. It's not
unusal for the table to be 2x the size of the actual data on
a heavily updated table.

--
Bill Moran


Re: Autovacuum query

From
Steven Erickson
Date:
Another option, depending on the nature of the data and deletes, would be to partition the table.  I created 7 tables
thatinherited from the original, one table for each day of the week.  A nightly cron job then runs, leaving alone
yesterday'sand today's tables but truncating the other 5.  Runs in < 10 msec and vacuum doesn't need to run. 

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Bill Moran
Sent: Thursday, March 26, 2015 6:07 AM
To: Mitu Verma
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Autovacuum query

On Thu, 26 Mar 2015 03:58:59 +0000
Mitu Verma <mitu.verma@ericsson.com> wrote:
>
> We have a customer complaining about the time taken by one of the application scripts while deleting older data from
thelog tables. 
> During the deletion, customer reported that he often sees the below error and because of which table size doesn?t
reduce.
>
> ERROR: canceling autovacuum task
> Date: 2015-03-14 04:29:19
> Context: automatic analyze of table "fm_db_Server3.mmsuper.audittraillogentry"
>
> We have the following queries in this regard:
>
> -          How often is the autovacuum task invoked by postgres

As needed. Read:
http://www.postgresql.org/docs/9.4/static/routine-vacuuming.html

> -          If the task gets cancelled (while we were deleting data from the database) would this task be re-invoked
ata later time? 

Yes. Read the above

> -          If insertion of data into a table also impact this task?

No. Vacuum operations are not triggered by INSERTs.

> -          If we can manually schedule this task to a particular time (like off peak hours)?

Yes, but given the questions you're asking, you probably do not have a good enough understanding of the situation to
scheduleit correctly and will make the problem worse. You can run it manually any time you want, but I don't recommend
thatyou disable autovacuum unless you have a good understanding of what you're doing. 

Let me take a guess at the problem: The table gets LOTs of inserts, constantly, and somewhere there's a job that runs
outof cron or some similar scheduler that DELETEs a lot of those rows in a big chunk. The DELETE process probably runs
infrequently,like once a day or even once a week because the designers thought it would be best to get everything taken
careof all at once during some real or perceived slow period on the database. 

One solution to this is to run the DELETE process more frequently, such as every 15 minutes. In such a case, the
processwill run much faster, make less changes, and require less work on the part of autovacuum to clean up after.
Peoplefrequently complain that "this will impact performance if run during normal use hours," 
but in every case I've seen, nobody had actually tested to see if that statement was true, and running smaller purges
morefrequently actually solved the problem. 

Another option would be to manually run vacuum after the big DELETE runs.
See http://www.postgresql.org/docs/9.4/static/sql-vacuum.html

Don't fall into the trap of running VACUUM FULL. This is usually a bad idea. If the client is complaining about
reclaimingdisk space, start asking some hard questions: How much space is too much? Why are you convinced that the
spaceis wasted? 
Usually the correct answer is to add more disk space, since Postgres tends to fall into a groove with a particular
tablewhereby the "unused" space is actually being used and reclaimed by data tuples as the data in the table changes.
It'snot unusal for the table to be 2x the size of the actual data on a heavily updated table. 

--
Bill Moran


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


NOTICE: This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged
information.Any unauthorized use, disclosure or distribution is prohibited. If you are not the intended recipient,
pleasecontact the sender by reply email and destroy all copies of the original message. 


Re: Autovacuum query

From
Jan de Visser
Date:
On March 25, 2015 09:31:24 PM David G. Johnston wrote:
> On Wed, Mar 25, 2015 at 8:58 PM, Mitu Verma <mitu.verma@ericsson.com> wrote:
> >  Correcting the subject
>
> ​And this is why it is considered good form to do "compose new message"
> instead of replying to an existing one.  Injecting your new topic into an
> existing unrelated mail thread is mildly annoying.

Wildly off-topic, but I'm blaming Google. Their thread detection logic is so
good that people actually don't know anymore how it was supposed to work, back
in the bad old days where threads were managed by In-Reply-To headers.

>
> David J.
> ​