Re: Inverted-list databases (was: Working on huge RAM based datasets) - Mailing list pgsql-performance

From Mischa Sandberg
Subject Re: Inverted-list databases (was: Working on huge RAM based datasets)
Date
Msg-id bVEHc.6984$iw3.4810@clgrps13
In response to Odd sorting behaviour  ("Steinar H. Gunderson" <sgunderson@bigfoot.com>)
List pgsql-performance
"Andy Ballingall" <andy_ballingall@bigfoot.com> wrote in message
news:011301c46597$15d145c0$0300a8c0@lappy...

> On another thread (not in this mailing list), someone mentioned that there
> is a class of databases which, rather than caching bits of database file
> (be it in the OS buffer cache or the postmaster workspace), construct a
> well indexed memory representation of the entire data in the postmaster
> workspace (or its equivalent), and this, remaining persistent, allows the DB
> to service backend queries far quicker than if the postmaster was working
> with the assumption that most of the data was on disk (even if, in practice,
> large amounts or perhaps even all of it resides in OS cache).

As a historical note, System R (granddaddy of all relational dbs) worked this
way, and it did so under memory constraints that are ridiculous by modern
standards.

Space-conscious MOLAP databases do this, FWIW.

Sybase 11 bitmap indexes pretty much amount to this, too.

I've built a SQL engine that used bitmap indexes within B-Tree indexes,
making it practical to index every field of every table (the purpose of the
engine).
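
To make the bitmap idea concrete, here is a minimal Python sketch. It is not
the engine described above: a plain dict stands in for the B-Tree that maps
each value to its bitmap, Python integers stand in for the bitmaps, and the
names (BitmapIndex, build_indexes, the sample rows) are made up for
illustration. It only shows why indexing every field becomes cheap to combine:
a multi-column predicate is a bitwise AND of per-column bitmaps.

```python
from collections import defaultdict


class BitmapIndex:
    """Maps each distinct value of one column to a bitmap of row numbers."""

    def __init__(self):
        # value -> int used as a bitset (bit n set means row n has the value).
        # Assumption: a dict replaces the B-Tree keyed on column values.
        self.bitmaps = defaultdict(int)

    def add(self, row_id, value):
        self.bitmaps[value] |= 1 << row_id

    def rows_equal(self, value):
        # .get() avoids creating an entry for values that were never indexed.
        return self.bitmaps.get(value, 0)


def build_indexes(rows):
    """Build one bitmap index per column, i.e. index every field."""
    indexes = defaultdict(BitmapIndex)
    for row_id, row in enumerate(rows):
        for column, value in row.items():
            indexes[column].add(row_id, value)
    return indexes


def bitmap_to_rows(bitmap):
    """Expand a bitmap into the sorted list of row numbers it contains."""
    rows = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_id)
        bitmap >>= 1
        row_id += 1
    return rows


# Hypothetical sample data.
rows = [
    {"city": "Oslo", "status": "active"},
    {"city": "Calgary", "status": "active"},
    {"city": "Oslo", "status": "closed"},
    {"city": "Oslo", "status": "active"},
]
indexes = build_indexes(rows)

# WHERE city = 'Oslo' AND status = 'active' becomes a single bitwise AND.
matches = indexes["city"].rows_equal("Oslo") & indexes["status"].rows_equal("active")
print(bitmap_to_rows(matches))  # [0, 3]
```

A real implementation would compress the bitmaps and keep them in the leaves
of an ordered index so that range predicates can OR together a run of
neighbouring values.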

You can also build special-purpose in-memory representations to test for
existence (of a key), when you expect a lot of failures. Google
"superimposed coding", e.g. http://www.dbcsoftware.com/dbcnews/NOV94.TXT

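As a rough illustration of the existence test, here is a small Python sketch
of superimposed coding in its simplest form, which is essentially what is also
known as a Bloom filter. Each key is hashed to a few bit positions, and the
codes of all keys are ORed ("superimposed") into one signature. The class
name, the parameters and the choice of SHA-256 are assumptions for the
example, not taken from the article linked above.

```python
import hashlib


class SuperimposedFilter:
    """In-memory existence test: 'definitely absent' or 'possibly present'."""

    def __init__(self, num_bits=1024, bits_per_key=4):
        # Both sizes are illustrative defaults, not tuned values.
        self.num_bits = num_bits
        self.bits_per_key = bits_per_key
        self.signature = 0  # int used as a bitset of num_bits bits

    def _code(self, key):
        # Derive bits_per_key bit positions from the key and OR them together.
        code = 0
        for i in range(self.bits_per_key):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            position = int.from_bytes(digest[:8], "big") % self.num_bits
            code |= 1 << position
        return code

    def add(self, key):
        # Superimpose this key's code onto the shared signature.
        self.signature |= self._code(key)

    def might_contain(self, key):
        # False means the key was never added; True may be a false positive.
        code = self._code(key)
        return (self.signature & code) == code


key_filter = SuperimposedFilter()
for key in ("alpha", "beta", "gamma"):
    key_filter.add(key)

print(key_filter.might_contain("beta"))   # True: the key was added
print(key_filter.might_contain("delta"))  # False unless it is a false positive
```

The payoff comes when most lookups fail: a negative answer from the filter
costs a few hashes and no trip to the real index or the disk. Only a positive
answer needs to be confirmed against the actual data.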

