On Tuesday 2006-06-06 20:11, Mark Woodward wrote:
> > Mark Woodward wrote:
> >> OK, here's my problem, I have a nature study where we have about 10
> >> video
> >> cameras taking 15 frames per second.
> >> For each frame we make a few transactions on a PostgreSQL database.
> >
> > Maybe if you grouped multiple operations on bigger transactions, the I/O
> > savings could be enough to buy you the ability to vacuum once in a
> > while. Or consider buffering somehow -- save the data elsewhere, and
> > have some sort of daemon to put it into the database. This would allow
> > you to cope with the I/O increase during vacuum.
>
> The problem is sufficiently large that any minor modification can easily
> hide the problem for a predictable amount of time. My hope was that someone
> would have a real "long term" workaround.
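For what it's worth, the batching/buffering idea quoted above might look
roughly like this. This is only a sketch, using psycopg2 with an assumed
"frame_events" table and an arbitrary batch size; the point is simply that
committing once per batch instead of once per frame cuts the per-transaction
overhead:

import psycopg2

# Illustrative only: connection string, table, columns and batch size are
# all assumptions, not part of the original setup.
BATCH_SIZE = 150  # e.g. one second of data from 10 cameras at 15 fps

conn = psycopg2.connect("dbname=naturestudy")
cur = conn.cursor()
buffer = []

def record_frame(camera_id, frame_no, payload):
    """Buffer one row per frame and flush the buffer as one transaction."""
    buffer.append((camera_id, frame_no, payload))
    if len(buffer) >= BATCH_SIZE:
        cur.executemany(
            "INSERT INTO frame_events (camera_id, frame_no, payload) "
            "VALUES (%s, %s, %s)",
            buffer,
        )
        conn.commit()   # one commit per batch instead of one per frame
        del buffer[:]

The same buffer could just as well live in a separate daemon process, as
suggested, so the cameras never block on the database at all; though, as
Mark says, this only postpones the problem rather than removing it.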
I'm not certain that I understand the original problem correctly so I am going
to restate it.
VACUUM needs to be run for two reasons:
1) To reclaim transaction IDs, i.e. to keep the transaction counter from
wrapping around.
2) To reclaim the space held by records marked for deletion (dead tuples).
VACUUM needs to be run over the entire database. If the amount of data in the
database is N, then VACUUM is O(N); roughly, VACUUM scales linearly with the
size of the database.
In the digital video problem:
Data is stored online indefinitely. (It is not archived off.)
(Virtually no records are ever deleted.)
Data comes in at a constant rate (frames per unit time).
The database therefore grows at that constant rate, frames/time; call it D.
It follows that no matter how much tuning is done, given constant hardware,
VACUUM grows to consume so many resources that it eventually becomes
impossible to process frames/time[m] before frames/time[m+1] arrives.
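To make that concrete, here is a purely illustrative back-of-envelope model
(every constant below is an assumption, not a measurement): if VACUUM costs a
roughly fixed amount of time per row and the tables only ever grow, the time
each pass takes rises linearly with N until it overruns whatever idle budget
exists for maintenance.

# Illustrative model only; all constants are assumptions.
FRAMES_PER_SEC = 10 * 15           # 10 cameras * 15 fps
ROWS_PER_FRAME = 3                 # "a few transactions" per frame
VACUUM_SEC_PER_ROW = 2e-6          # assumed cost of scanning one row
IDLE_BUDGET_SEC = 4 * 3600         # assumed daily slack for maintenance

rows_per_day = FRAMES_PER_SEC * ROWS_PER_FRAME * 86400

day, total_rows = 0, 0
while True:
    day += 1
    total_rows += rows_per_day
    vacuum_time = total_rows * VACUUM_SEC_PER_ROW   # O(N): scans everything
    if vacuum_time > IDLE_BUDGET_SEC:
        print("day %d: VACUUM needs %.1f h, more than the %.0f h budget"
              % (day, vacuum_time / 3600.0, IDLE_BUDGET_SEC / 3600.0))
        break

With these made-up numbers the crossover comes after a few months; with real
numbers it comes sooner or later, but given constant hardware it always comes.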
Ideally, the cost of the transaction management system would be proportional
to the marginal change in the size of the database rather than to its gross
size. That is, VACUUM being O(N) should be replaced (or there should be an
optional alternative) by something that scales with D, i.e. O(D^k), where any
k > 1 involves a tradeoff against what a full VACUUM provides.
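One workaround in that spirit (a sketch only, with an assumed partition
naming scheme, not a change to VACUUM itself) is to split the frame data into
time-based child tables and vacuum only the partitions that have recently
received writes. The vacuum work then tracks the incoming rate D rather than
the total size N, because the older partitions are append-only history and
never accumulate dead tuples:

import datetime
import psycopg2

# Sketch only: the "frame_events_YYYYMMDD" partitions are an assumed layout.
conn = psycopg2.connect("dbname=naturestudy")
conn.autocommit = True          # VACUUM cannot run inside a transaction block
cur = conn.cursor()

# Vacuum just today's and yesterday's partitions.
for days_back in (0, 1):
    d = datetime.date.today() - datetime.timedelta(days=days_back)
    cur.execute("VACUUM ANALYZE frame_events_%s" % d.strftime("%Y%m%d"))

This does nothing for reason 1 above (an occasional database-wide VACUUM is
still needed to keep the transaction counter from wrapping), but it keeps the
routine space-reclamation cost proportional to D rather than N.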