On Oct 28, 2007, at 2:54 PM, Josh Berkus wrote:
> I'd actually be curious what incremental changes you could see
> making to PostgreSQL for better in-memory operation. Ideas?
It would be difficult to make PostgreSQL really competitive for in-
memory operation, primarily because a contrary assumption pervades
the entire design. You would need to rip out a lot of the guts of
it. I was not even intending to suggest that it would be a good idea
or trivial to adapt PostgreSQL to in-memory operation, but since I am
at least somewhat familiar with the research I thought I'd offer a
useful link that detailed the kinds of considerations involved. That
said, I have seriously considered the idea, since I have a major
project that requires that kind of capability and there is some
utility in reusing parts of PostgreSQL if possible, particularly since
the project was prototyped on it. In my specific case I also need to
shoehorn in a new type of access method that PostgreSQL has no
conceptual support for, so it will probably be easier to build a
(mostly) new database engine altogether.
Personally, if I were designing a distributed in-memory database, I
would use a somewhat more conservative set of assumptions than
Stonebraker so that it would have more general applicability. For
example, his assumption of extremely short CPU times for a
transaction (<1 millisecond) is not even valid for some types of
OLTP loads, never mind the numerous uses that are not strictly
OLTP-like but which nonetheless are built on relatively short
transactions; in the Stonebraker design, transactions that run longer
than that would be a pathology. Unfortunately, if you remove that
assumption, the design starts to unravel noticeably. Nonetheless,
there are other viable design paths that, while not over-fitted to
OLTP, could still offer large gains.
I think the market is right for a well-designed, distributed
in-memory database, but making incremental changes to a solid
disk-based engine would mean starting with an architecture that is
inferior for the purpose and hard to get away from. It seems expedient
in the short term but is bad engineering in the long term -- think
MySQL.
Cheers,
J. Andrew Rogers