Thread: vacuum

vacuum

From
Mike Cianflone
Date:
    I know that vacuum has come up in the past, and I even saw the
discussion about putting a cron entry to have it run every once in a while,
but I don't remember seeing anything about having it kick off via a trigger
every so many inserts.
    Is there a relative consensus for how often to run vacuum? I have a
table of about 8 columns that I fill with 100,000 items simply via a "\i
alarms.sql". After 1,000 items or so it gets extremely slow to fill with
data, and will take over a day to fill the entire thing unless I run vacuum
once a minute.
    Is there a way to run vacuum every so many inserts, updates, and
deletes? (With one table the trigger is easy, but is there a global way to
check no matter which tables are used?)
Do other DBMSs such as Oracle run into the same problem and need to
be vacuumed, in their own sort of way?

Thanks for any insight,
Mike


Re: vacuum

From
Doug McNaught
Date:
Mike Cianflone <mcianflone@littlefeet-inc.com> writes:

>     I know that vacuum has come up in the past, and even saw the
> discussion about putting a cron entry to have it run every once in a while,
> but I don't remember seeing anything about having it kick off via a trigger
> every so many inserts.
>     Is there a relative consensus for how often to run vacuum? I have a
> table of about 8 columns that I fill with 100,000 items simply via a "\i
> alarms.sql". After 1,000 items or so it gets extremely slow to fill with
> data, and will take over a day to fill the entire thing unless I run vacuum
> once a minute.

Ummm...  Are you wrapping the inserts in a transaction?  If not, no
wonder it's slow--you're paying transaction overhead for each insert.

I don't think VACUUM has that much effect on insert performance, as
insert doesn't leave empty space in existing pages.

Try using COPY instead of INSERT, or at least wrap all your inserts in 
a transaction.  Also, drop indexes before running the insert, and
recreate them afterwards.
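A minimal sketch of those suggestions (the alarms schema here is made up,
since alarms.sql wasn't posted):

```sql
-- Hypothetical table standing in for the unposted alarms.sql schema
CREATE TABLE alarms (id integer, severity integer, msg text);

-- Slow: outside an explicit transaction, every INSERT commits on its own.
-- Fast: one BEGIN/COMMIT around the whole load pays the overhead once.
BEGIN;
INSERT INTO alarms VALUES (1, 3, 'link down');
INSERT INTO alarms VALUES (2, 1, 'link up');
COMMIT;

-- Faster still: COPY loads all rows in a single command
-- (tab-delimited file, one row per line).
COPY alarms FROM '/tmp/alarms.dat';
```

With indexes dropped before the load and recreated afterwards, the index is
built once at the end instead of being maintained on every row.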

-Doug
-- 
The rain man gave me two cures; he said jump right in,
The first was Texas medicine--the second was just railroad gin,
And like a fool I mixed them, and it strangled up my mind,
Now people just get uglier, and I got no sense of time...          --Dylan


RE: vacuum

From
Mike Cianflone
Date:
    After the comment by someone about the UPDATE being responsible for
the need to vacuum (sorry, I didn't know that), I looked into a stored
procedure that gets triggered during an insert. The stored procedure does an
UPDATE on another table for every insert, so inserting 100,000 items into
the table causes 100,000 updates on the other table. I noticed that the
other table's file gets very large (right now it's over a megabyte and the
insert is only 10% complete), even though there are only about 5 rows in
that table. Since that table has the UPDATE happening to it, it keeps
growing; a vacuum chops it back down to 8K.
    I tried increasing the buffer size, and that made the 100,000
inserts (with the corresponding updates) go longer before hitting the
barrier and slowing down tremendously (until another vacuum is done).
    Since vacuum isn't tied to a time but rather to the size of the
buffers (or the indices?), it would seem plausible to do as another person
mentioned and have vacuum kick off when the buffers are xx% full.

Mike



-----Original Message-----
From: Zeugswetter Andreas SB [mailto:ZeugswetterA@wien.spardat.at]
Sent: Wednesday, June 13, 2001 1:04 AM
To: 'Mike Cianflone'; Hackers List
Subject: AW: [HACKERS] vacuum



>     Is there a relative consensus for how often to run vacuum? I have a
> table of about 8 columns that I fill with 100,000 items simply via a "\i
> alarms.sql". After 1,000 items or so it gets extremely slow to fill with
> data, and will take over a day to fill the entire thing unless I run
vacuum
> once a minute.

You will have to tell us what exactly your alarms.sql does and what indexes
your table has. The behavior above is certainly not to be expected in
general; the "vacuum once a minute" in particular is highly suspicious.

For a series of insert-only statements, vacuum is not supposed to help at
all, so there must be an update hidden somewhere.

Andreas

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org


RE: vacuum

From
Peter Eisentraut
Date:
Mike Cianflone writes:

>     After the comment by someone about the UPDATE being responsible for
> the reason for vacuuming (sorry, I didn't know that), I looked into a stored
> procedure that gets triggered during an insert. The stored procedure does an
> UPDATE on another table, for every insert. So inserting 100,000 items into
> the table causes an update on 100,000 items in another table. I noticed that
> the other table's file size gets very large (right now it's over a megabyte
> and only 10% complete inserting), even though there are only about 5 items
> in that table. Since that table has the UPDATE happening to it, it's getting
> large. A vacuum chops it down to 8K.

An UPDATE basically acts as "append and mark old record obsolete".  This
is the so-called non-overwriting storage manager which keeps all data,
even deleted data, indefinitely until you run vacuum (hence the name).
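One way to watch this happen (the table name is made up; relpages in
pg_class reflects the on-disk size as of the last vacuum or analyze):

```sql
CREATE TABLE counters (n integer);
INSERT INTO counters VALUES (0);

-- Each UPDATE appends a new row version; the old one stays on disk
UPDATE counters SET n = n + 1;   -- repeat many times: the file keeps growing

VACUUM counters;                 -- reclaims all the obsolete row versions
SELECT relpages FROM pg_class WHERE relname = 'counters';
```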

>     I tried increasing the buffer size, and that made the 100,000
> inserts (with the corresponding update) go longer before hitting the barrier
> and slowing down tremendously (until another vacuum is done).

Increasing the buffer size will keep the data in memory longer before
pushing it out to disk.  The net result will be the same though.

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter



RE: vacuum

From
The Hermit Hacker
Date:
Now that you've narrowed it down to a specific table, at least you can
specifically vacuum just that table and ignore the rest of the database
...might help a bit?
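That is, rather than vacuuming the whole database:

```sql
-- Vacuum just the one heavily-updated table instead of everything;
-- "mytable" stands in for whatever table the trigger updates.
VACUUM mytable;
```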

On Wed, 13 Jun 2001, Mike Cianflone wrote:

>     After the comment by someone about the UPDATE being responsible for
> the reason for vacuuming (sorry, I didn't know that), I looked into a stored
> procedure that gets triggered during an insert. The stored procedure does an
> UPDATE on another table, for every insert. So inserting 100,000 items into
> the table causes an update on 100,000 items in another table. I noticed that
> the other table's file size gets very large (right now it's over a megabyte
> and only 10% complete inserting), even though there are only about 5 items
> in that table. Since that table has the UPDATE happening to it, it's getting
> large. A vacuum chops it down to 8K.
>     I tried increasing the buffer size, and that made the 100,000
> inserts (with the corresponding update) go longer before hitting the barrier
> and slowing down tremendously (until another vacuum is done).
>
>     Since vacuum isn't tied to a time, but rather the size of the
> buffers? or the indices? it would seem plausible to do as another person had
> mentioned and have vacuum kick off when the buffers are xx% full.
>
> Mike

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org



Re: vacuum

From
"Ross J. Reedstrom"
Date:
On Wed, Jun 13, 2001 at 06:24:10PM -0300, The Hermit Hacker wrote:
> 
> Now that you've narrowed it down to a specific table, at least you can
> specifically vacuum just that table and ignore the rest of the database
> ...might help a bit?

Even better: since he's loading a script anyway, the script could start
a transaction, disable the trigger, load the data, bulk UPDATE the other
table, and re-enable the trigger. I bet it takes only a few minutes to
do the whole thing that way.
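A sketch of that approach (all names here are hypothetical; PostgreSQL of
this vintage has no way to disable a trigger in place, so the sketch drops
and recreates it):

```sql
BEGIN;

-- Remove the per-row trigger for the duration of the load
DROP TRIGGER alarms_ins_trig ON alarms;

-- Bulk-load the data in one command
COPY alarms FROM '/tmp/alarms.dat';

-- Apply the side effect once, instead of 100,000 times
UPDATE summary SET alarm_count = (SELECT count(*) FROM alarms);

-- Put the trigger back
CREATE TRIGGER alarms_ins_trig AFTER INSERT ON alarms
    FOR EACH ROW EXECUTE PROCEDURE bump_summary();

COMMIT;
```

Because the whole script runs in one transaction, the other table receives
a single new row version rather than 100,000 dead ones.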

Ross
