Thread: RE: [HACKERS] Re: [PORTS] vacuum takes too long

RE: [HACKERS] Re: [PORTS] vacuum takes too long

From
"Jackson, DeJuan"
Date:
> > > > Wish I knew the answer.  I can guess, but that isn't
> going to help.
> > >
> > >     Guess == throwing in ideas, even if they are
> incorrect...the way I
> > > figure it, I through out alot of guesses...some of them
> spark ideas in
> > > others and we see some really neat ideas come out of it :)
> >
> > Yes.  Good.  I have been bugging Vadim about possible row
> reuse, but I
> > don't know enough to understand the options.
>
> I'm not sure about the row-reuse thing.  What sort of
> performance hit will
> it have.  As it is now, you add a row by zipping down to the
> end, add the
> row...bang, finished.  with row-reuse, you have to search for
> a good fit,
> which could take time...
>
> Hrmmm...let's look at Oracle's "model"...bear in mind that I
> haven't dived
> very deep into it, so I could be totally off base here, but,
> with Oracle,
> you have a seperate "group" of processes started up for each
> 'instance',
> where, if I'm correct, an instance is the same as our database(?)...
>
> How hard would it be for us to implement something similar?  When you
> start up the postmaster, it starts up 1 postgres "master
> process" for each
> database that it knows about.  The point of the master process is
> effectively the garbage collector for the database, as well
> as the central
> 'traffic cop'...
>
> so, for example, I have 4 databases on my server...when you
> start up the
> system with your normal 'postmaster' process, it forks off 4
> processes,
> one for each database.  When you connect to port #### for
> database XXXX,
> the listening process (main postmaster) shunts the process over to the
> appropriate 'traffic cop' for handling...
>
> The 'traffic cop' would keep track of the number of connections to the
> database are currently open, and when zero, which woudl indicate idle
> time, process through a table in the database to clean it up.
>  As soon as
> a new connection comes in, it would "finish" its cleanup by
> making sure
> the table is in a 'sane state' (ie. finish up with its
> current record) and
> then fork off the process, to wait quietly until its idle again...
>
> Then each database could effectively have their own shared memory pool
> that could be adjusted on a per database basis.  Maybe even
> add a 'change
> threshold', where after X transactions (update, insert or delete), the
> table gets auto-vacuum'd (no analyze, just vacuum)...the
> threshold could
> be set on a per-table basis...the 'traffic cop' should be
> able to easily
> keep track of those sort of stats internally...no?
>
> Hell, the 'traffic cop' *should* be able to keep reasonably
> accurate stats
> to update the same tables that a 'vacuum analyze' maintains, adjusting
> those values periodically to give a semi-accurate picture.
> Periodically,
> a normal 'analyze' would have to be run...
>
> Its a thought...haven't got a clue as to the complexity of
> implementing,
> but...*shrug*

With MVCC an occasional 'vacuum analyze' should only be noticed from the
performance improvements.  As far as I can tell most of the work done by
an analyze is in reading the table data.  If you make sure to write the
new information at the end of the transaction you only lock the indexes
for the amount of time it takes to write them.

I see a 'vacuum analyze' being less of a problem than 'vacuum'.
Any of you experts can contradict my assumptions.
    Just my guess,
    -DEJ

RE: [HACKERS] Re: [PORTS] vacuum takes too long

From
The Hermit Hacker
Date:
On Thu, 7 Jan 1999, Jackson, DeJuan wrote:

> With MVCC an occasional 'vacuum analyze' should only be noticed from the
> performance improvements.  As far as I can tell most of the work done by
> an analyze is in reading the table data.  If you make sure to write the
> new information at the end of the transaction you only lock the indexes
> for the amount of time it takes to write them.
>
> I see a 'vacuum analyze' being less of a problem than 'vacuum'.
> Any of you experts can contradict my assumptions.

Good point...I seem to recall that at one point, there was a lock imposed
on one of hte pg_ tables when a vacuum is tarted, since it has to update a
couple of the rows in that table...has that lock been removed with MVCC?
Vadim?



Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org


Re: [HACKERS] Re: [PORTS] vacuum takes too long

From
Bruce Momjian
Date:
> With MVCC an occasional 'vacuum analyze' should only be noticed from the
> performance improvements.  As far as I can tell most of the work done by
> an analyze is in reading the table data.  If you make sure to write the
> new information at the end of the transaction you only lock the indexes
> for the amount of time it takes to write them.
>
> I see a 'vacuum analyze' being less of a problem than 'vacuum'.
> Any of you experts can contradict my assumptions.

The problem is that vacuum analyze does both vacuum and analyze.
Analyze takes so long, we figured we might as well vacuum too.  Maybe we
need to change that.


--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [HACKERS] Re: [PORTS] vacuum takes too long

From
The Hermit Hacker
Date:
On Thu, 7 Jan 1999, Bruce Momjian wrote:

> > With MVCC an occasional 'vacuum analyze' should only be noticed from the
> > performance improvements.  As far as I can tell most of the work done by
> > an analyze is in reading the table data.  If you make sure to write the
> > new information at the end of the transaction you only lock the indexes
> > for the amount of time it takes to write them.
> >
> > I see a 'vacuum analyze' being less of a problem than 'vacuum'.
> > Any of you experts can contradict my assumptions.
>
> The problem is that vacuum analyze does both vacuum and analyze.
> Analyze takes so long, we figured we might as well vacuum too.  Maybe we
> need to change that.

There is, IMHO, no problem with them being combined...in the past, the
problem was that the whole system was effectively locked up while a vacuum
analyze was being run because one of the 'statistics' tables was being
locked during the whole thing, instead of when required...

As DeJuan points out, though, this should no longer be a problem with
MVCC...

Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org