Re: Buglist - Mailing list pgsql-general

From scott.marlowe
Subject Re: Buglist
Date
Msg-id Pine.LNX.4.33.0308200905420.12206-100000@css120.ihs.com
In response to Re: Buglist  (Vivek Khera <khera@kcilink.com>)
List pgsql-general
On Wed, 20 Aug 2003, Vivek Khera wrote:

> >>>>> "BW" == Bruno Wolff, <Bruno> writes:
>
> >> to see it incremental.  This would result in pretty much near zero
> >> internal fragmentation, I think.
>
> BW> Why do you care about the details of the implementation (rather than
> BW> the performance)? If it were faster to do it that way, that's how it would
> BW> have been done in the first place. The cost of doing the above is almost
> BW> certainly going to be an overall performance loser.
>
> I care for the performance.  And how are you so sure that it was
> faster the way it is now?  Are you sure it was not done this way
> because of ease of implementation?
>
> Seriously, how much slower can it be if the backend were to do the
> checking for external references upon updating/deleting a row?  The
> cost would be distributed across time as opposed to concentrated at
once within a vacuum process.  I am fairly certain it would reduce
> disk bandwidth requirements since at least one necessary page will
> already be in memory.

Time for a mental exercise.

Our server has 2 users.  Each backend has to check with all the other
backends when it deletes a tuple (every update is also a delete; remember,
every change in an MVCC database is a create / delete cycle).  Let's name
the time it takes to do the update / mark deleted versus the time it takes
to contact each of those other backends: Tw is the time to do the work
here, and Tc is the time to do the cleanup (i.e. vacuum the tuple).  Note
that we'd also need a Ta for answering the requests of all the other
backends, but we can assume that, on average, for each request a backend
makes, it will receive exactly that many from each other backend running.
Let x represent the number of backends, so the answer time is equal to
x * Tc.

Time = Tw + Tc + Ta

Time = Tw + Tc + (x * Tc)

Time = Tw + ((x+1) * Tc)
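
To see how fast that last term blows up, here's a quick back-of-the-envelope
sketch in Python (the Tw and Tc values and the backend counts are made-up
numbers, purely to illustrate the shape of the curve):

    # Toy model: per-delete cost if the deleting backend had to check with
    # every other backend inline, versus deferring cleanup to a lazy vacuum.
    Tw = 1.0   # made-up: time to do the work (mark tuple deleted), in ms
    Tc = 0.2   # made-up: time to contact one backend / clean one tuple, in ms

    for x in (2, 10, 100, 500):        # number of backends
        inline = Tw + (x + 1) * Tc     # Time = Tw + ((x+1) * Tc)
        deferred = Tw                  # cleanup happens later, in the vacuum
        print("%4d backends: inline %6.1f ms   deferred %4.1f ms"
              % (x, inline, deferred))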

Our cleanup time therefore grows at an ugly rate as the number of backends
increases.  Lazy vacuuming (the newer, non-full vacuum) instead lets the
database reclaim that lost space in the background.

Many folks mistake this vacuum process for its older, slower cousin, full
vacuum, which eats a lot more disk bandwidth and slows the machine down.

On a Dual CPU X86 box a lazy vacuum running in a continuous loop will eat
about 5% of one CPU and drop pgbench scores by 10 to 15%.  The important
thing here is that the machine will still run quite snappily when you
throw several hundred clients at it, since the lazy vacuum just sits in
the background using the spare cycles and not much more.
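
In case it helps, here is roughly what I mean by running a lazy vacuum in a
continuous loop.  This is only a sketch: the psycopg2 driver, the 60 second
pause, and the "pgbench" database name are my own assumptions, not anything
in the server itself.

    import time
    import psycopg2

    conn = psycopg2.connect("dbname=pgbench")   # assumed database name
    conn.autocommit = True      # VACUUM can't run inside a transaction block
    cur = conn.cursor()

    while True:
        cur.execute("VACUUM;")  # lazy (non-full) vacuum, reclaims dead tuples
        time.sleep(60)          # leave the spare cycles to the real clients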

That means your storage usage may balloon somewhat under intense usage, but
you won't have an IPC storm killing the performance of the PostgreSQL server.

Knowing the PostgreSQL development team, I'm sure the reasons for the
choice they made are stated in the hackers mailing list somewhere, so I'm
going to go look.  But trust me on one thing: the folks programming this
database rarely do something just because it's easier / faster to
implement without putting an entry in the TODO list about making it better
some day.

