Thread: Re: 7.3.4 on Linux: UPDATE .. foo=foo+1 degrades massivly

Re: 7.3.4 on Linux: UPDATE .. foo=foo+1 degrades massivly

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Guy Fraser [mailto:guy@incentre.net]
> Sent: Thursday, April 22, 2004 8:44 AM
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] 7.3.4 on Linux: UPDATE .. foo=foo+1
> degrades massivly
>
>
> Dann Corbit wrote:
>
> >>>A following VACCUM brings back return times to 'start' -
> >>>
> >>>
> >>but I cannot
> >>
> >>
> >>>run VACUUM any other minute (?). And it exactly vaccums as
> >>>
> >>>
> >>many tuples
> >>
> >>
> >>>as I updated.. sure thing:
> >>>
> >>>
> >>Why not? You only have to vacuum this one table. Vacuuming it
> >>once a minute should be doable.
> >>
> >>
> >
> >Shouldn't the Database server be the entity that decides
> when vacuum is
> >needed?
> >
>
> How is the database supposed to know when you want to purge
> records? Once a vacuum has been run, the table can not be
> rolled back or time traveled.

When I say commit or rollback, I don't need the dead records any longer.

> >Something is very, very strange about the whole PostgreSQL
> maintenance
> >model.
> >
> Giving the administrator full control over database
> management is a good
> thing.
> If you want to write a cron job, to purge records
> automaticaly, thats your prerogative. Not every one needs to,
> nor want's to constantly purge records.
>
> Most of my databases collect information and changing
> information in them would be taboo. Since records are not
> updated or deleted their is no reason to vacuum the
> collection tables, and they collect between 400 K to 40 M
> records per period.

The same sort of argument can be made for optimizer decisions.  If I sat
and thought about it, once in a while I could outguess the planner.  But
I would much rather have the planner do it for me, even if it is not
totally optimal once in a while.

> >Oracle uses MVCC and I do not have to UPDATE STATISTICS
> constantly to
> >keep the system from going into the toilet.
> >
> Does Oracle purge records automaticaly?
> If so how do you configure it, and what are the default parameters?
>
> >Also, I should be able to do an update on every row in a
> database table
> >without causing severe problems.  Every other database
> system I know of
> >does not have this problem.
> >
> >If I have a million row table with a column called
> is_current, and I do
> >this:
> >UPDATE tname SET is_current = 0;
> >Horrible things happen.
> >
> >Just an idea:
> >Why not recognize that more rows will be modified than the
> row setting
> >can support and actually break the command into batches internally?
> >
> It sounds like you have significant hardware limitations.
>
> I have a database I use for traffic analysys, that has over
> 40,000,000
> records, I have
> done some complicated queries with multiple subselects and joins. The
> complicated
> queries take a long time to complete, but they work. I have also done
> updates that
> affected at least 5% of the records, then vacuumed the table shortly
> there after.
>
> The bigger the table the more "scatch pad" disk space, and
> memory you need.

I think that the problems I am seeing are due to using a much older
version of PostgreSQL.  We use 7.1.3 here, because we have thoroughly
tested it (many thousands of tests are in our regression suite).  But if
I delete too many records, the only way I can reclaim the space is to
drop the table.

We are working with the beta of 7.5 and perhaps it will cure all the
ills that remain.

Re: 7.3.4 on Linux: UPDATE .. foo=foo+1 degrades massivly

From
"scott.marlowe"
Date:
On Thu, 22 Apr 2004, Dann Corbit wrote:

> > -----Original Message-----
> > From: Guy Fraser [mailto:guy@incentre.net]
> > Sent: Thursday, April 22, 2004 8:44 AM
> > To: pgsql-general@postgresql.org
> > Subject: Re: [GENERAL] 7.3.4 on Linux: UPDATE .. foo=foo+1
> > degrades massivly
> >
> >
> > Dann Corbit wrote:
> >
> > >>>A following VACCUM brings back return times to 'start' -
> > >>>
> > >>>
> > >>but I cannot
> > >>
> > >>
> > >>>run VACUUM any other minute (?). And it exactly vaccums as
> > >>>
> > >>>
> > >>many tuples
> > >>
> > >>
> > >>>as I updated.. sure thing:
> > >>>
> > >>>
> > >>Why not? You only have to vacuum this one table. Vacuuming it
> > >>once a minute should be doable.
> > >>
> > >>
> > >
> > >Shouldn't the Database server be the entity that decides
> > when vacuum is
> > >needed?
> > >
> >
> > How is the database supposed to know when you want to purge
> > records? Once a vacuum has been run, the table can not be
> > rolled back or time traveled.
>
> When I say commit or rollback, I don't need the dead records any longer.

OK.  Scenario.  There are 5,000 users connected.  When one doesn't need
records anymore, should the database now poll att 4999 other users to see
if it's ok to flush those tuples?  What method are you going to use to
find out if the tuples are invisible to every other transaction /
connection currently running?

> I think that the problems I am seeing are due to using a much older
> version of PostgreSQL.  We use 7.1.3 here, because we have thoroughly
> tested it (many thousands of tests are in our regression suite).  But if
> I delete too many records, the only way I can reclaim the space is to
> drop the table.
>
> We are working with the beta of 7.5 and perhaps it will cure all the
> ills that remain.

That is an old version.  I would say that PostgreSQL is one of the few
products that has increased in stability with nearly every single release
cycle.  Upgrading to 7.4 or 7.5 would probably be a very good idea.

Index bloat was fixed in 7.4.  I'm not sure if there were any table bloat
problems not solved by full vacuums as far back as 7.1, but I don't
remember any problems with vacuuming back then.


Re: 7.3.4 on Linux: UPDATE .. foo=foo+1 degrades massivly

From
"Glen Parker"
Date:
> I think that the problems I am seeing are due to using a much
> older version of PostgreSQL.  We use 7.1.3 here, because we
> have thoroughly tested it (many thousands of tests are in our
> regression suite).  But if I delete too many records, the
> only way I can reclaim the space is to drop the table.

Maybe this problem is related to indexes.  Vacuuming does not work on
indexes, so from time to time you have to issue a REINDEX on tables that
have lots of deletions.  This is problem is supposed to be going away soon,
or mostly so...  I think 7.4 contained an improvement here.

VACUUM FULL really really does reclaim all space in the heap.  Try REINDEX
next time.

> We are working with the beta of 7.5 and perhaps it will cure
> all the ills that remain.