Andrew Sullivan wrote:
> Sorry I haven't had a chance to reply to this sooner.
> The vacuum delay stuff that you're working on may help, but I can't
> really believe it's your salvation if this is happening after only a
> few minutes. No matter how much you're doing inside those functions,
> you surely can't be causing so many dead tuples that a vacuum is
> necessary that soon. Did you try not vacuuming for a little while to
> see if it helps?

I discussed it later in the thread, but we're adding about 400K rows per
hour and deleting most of them after processing (note this is a
commercial app, written and maintained by another department -- I can
recommend changes, but this late into their release cycle they are very
reluctant to change the app). This is 7 x 24 data collection from
equipment, so there is no "slow" time to use as a maintenance window.
But since the server in question is a test machine, I was able to shut
everything off long enough to do a full vacuum -- it took about 12 hours.

> I didn't see it anywhere in this thread, but are you quite sure that
> you're not swapping? Note that vmstat on multiprocessor Solaris
> machines is not notoriously useful. You may want to have a look at
> what the example stuff in the SE Toolkit tells you, or what you get
> from sar. I believe you have to use a special kernel setting on
> Solaris to mark shared memory as being ineligible for swap.

I'm (reasonably) sure there is no swapping. Minimum free memory (from
top) is about 800 MB, and "vmstat -S" shows no swap-in or swap-out.

I've been playing with a version of Jan's performance patch in the past
few hours. Based on my simulations, a 1 ms delay every 10 pages appears
to be just about right. The performance hit is negligible (judging by
overall test time and the CPU % used by the vacuum process). I still
have a bit more analysis to do, but this is looking pretty good. More
later...
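
To make the "1 ms delay every 10 pages" idea concrete, here is a minimal
sketch of that kind of throttling loop. It is not the code from Jan's
patch; the names throttled_scan, process_page, pages_per_nap and
nap_seconds are hypothetical and exist only for this illustration. The
sleep function is passed in so the loop can be exercised without actually
waiting.

```python
import time


def throttled_scan(num_pages, process_page, pages_per_nap=10,
                   nap_seconds=0.001, sleep=time.sleep):
    """Visit pages 0..num_pages-1, napping after every pages_per_nap pages.

    Hypothetical illustration only: process_page stands in for whatever
    work the vacuum does on one page. Returns the number of naps taken.
    """
    naps = 0
    for page_no in range(num_pages):
        process_page(page_no)
        # Yield to other backends once a full batch of pages is done.
        if (page_no + 1) % pages_per_nap == 0:
            sleep(nap_seconds)
            naps += 1
    return naps


def added_delay_seconds(num_pages, pages_per_nap=10, nap_seconds=0.001):
    """Lower bound on the extra wall-clock time the naps add to one pass."""
    return (num_pages // pages_per_nap) * nap_seconds


if __name__ == "__main__":
    visited = []
    naps = throttled_scan(25, visited.append)
    print(len(visited), "pages visited,", naps, "naps")
    print("requested delay for 1,000,000 pages:",
          added_delay_seconds(1_000_000), "seconds")
```

The added_delay_seconds figure is only the requested sleep time. The real
nap can be longer than 1 ms depending on the operating system's timer
resolution, so measured overhead may be higher than this estimate.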
Joe