Re: [HACKERS] Re: [PORTS] vacuum takes too long - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [HACKERS] Re: [PORTS] vacuum takes too long
Date
Msg-id 199907070031.UAA27558@candle.pha.pa.us
Whole thread Raw
In response to Re: [HACKERS] Re: [PORTS] vacuum takes too long  (The Hermit Hacker <scrappy@hub.org>)
List pgsql-hackers
Added to TODO:

* Create a background process for each database that runs while
  database is idle, finding superceeded rows, gathering stats and vacuuming



> On Wed, 6 Jan 1999, Bruce Momjian wrote:
>
> > > > Wish I knew the answer.  I can guess, but that isn't going to help.
> > >
> > >     Guess == throwing in ideas, even if they are incorrect...the way I
> > > figure it, I through out alot of guesses...some of them spark ideas in
> > > others and we see some really neat ideas come out of it :)
> >
> > Yes.  Good.  I have been bugging Vadim about possible row reuse, but I
> > don't know enough to understand the options.
>
> I'm not sure about the row-reuse thing.  What sort of performance hit will
> it have.  As it is now, you add a row by zipping down to the end, add the
> row...bang, finished.  with row-reuse, you have to search for a good fit,
> which could take time...
>
> Hrmmm...let's look at Oracle's "model"...bear in mind that I haven't dived
> very deep into it, so I could be totally off base here, but, with Oracle,
> you have a seperate "group" of processes started up for each 'instance',
> where, if I'm correct, an instance is the same as our database(?)...
>
> How hard would it be for us to implement something similar?  When you
> start up the postmaster, it starts up 1 postgres "master process" for each
> database that it knows about.  The point of the master process is
> effectively the garbage collector for the database, as well as the central
> 'traffic cop'...
>
> so, for example, I have 4 databases on my server...when you start up the
> system with your normal 'postmaster' process, it forks off 4 processes,
> one for each database.  When you connect to port #### for database XXXX,
> the listening process (main postmaster) shunts the process over to the
> appropriate 'traffic cop' for handling...
>
> The 'traffic cop' would keep track of the number of connections to the
> database are currently open, and when zero, which woudl indicate idle
> time, process through a table in the database to clean it up.  As soon as
> a new connection comes in, it would "finish" its cleanup by making sure
> the table is in a 'sane state' (ie. finish up with its current record) and
> then fork off the process, to wait quietly until its idle again...
>
> Then each database could effectively have their own shared memory pool
> that could be adjusted on a per database basis.  Maybe even add a 'change
> threshold', where after X transactions (update, insert or delete), the
> table gets auto-vacuum'd (no analyze, just vacuum)...the threshold could
> be set on a per-table basis...the 'traffic cop' should be able to easily
> keep track of those sort of stats internally...no?
>
> Hell, the 'traffic cop' *should* be able to keep reasonably accurate stats
> to update the same tables that a 'vacuum analyze' maintains, adjusting
> those values periodically to give a semi-accurate picture.  Periodically,
> a normal 'analyze' would have to be run...
>
> Its a thought...haven't got a clue as to the complexity of implementing,
> but...*shrug*
>
> Marc G. Fournier
> Systems Administrator @ hub.org
> primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
>
>
>


--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

pgsql-hackers by date:

Previous
From: "Hiroshi Inoue"
Date:
Subject: RE: [HACKERS] Fwd: Joins and links
Next
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] Another speedup idea (two, even)