Thanks to all who responded so far. I got some more insights from Mike
Stonebraker himself in the USENIX talk Scott pointed to before.
I'd like to revise the four points I enumerated in my initial question
a little and sort out what PG already does or could do:
1. Buffer Pool
To avoid being I/O bound, Mike proposes in-memory database structures.
He argues that the "old elephants" cannot implement this because it
would require a huge code rewrite: they would need to store
memory-oriented structures instead of disk-oriented ones.
Now I'm still wondering why PG couldn't do that, perhaps in
combination with unlogged tables. I don't have an overview of the
relevant code, but I think it's worth discussing even if implementing
memory-oriented structures would be too difficult.
2. Locking
This critique obviously doesn't hold for PG since we already have MVCC here.
3. WAL logging
Here Mike proposes replication over several nodes as an alternative to
WAL which fits nicely with High Availability. PG 9 has built-in
replication but just not for unlogged tables :-<
4. Latches
This is an issue I had never heard of before. I found some mention of
latches in the code, but it doesn't seem to be related to concurrent
access to btree structures, as Mike suggests.
So if anyone could confirm that this problem exists and produces
overhead, I'd be interested to hear.
Mike proposes single-threaded processing on many cores, where each core
processes a non-overlapping shard.
But he also calls for ideas to invent btrees which can be processed
concurrently with as few memory locks as possible (instead of trying
to make btrees faster).
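To make the shard idea concrete for myself, here is a rough sketch in
Python. It is only my reading of the proposal, not code from the talk or
from PG: the names shard_for, ShardWorker and run_sharded are made up,
and I use threads as a stand-in for one single-threaded process per core.
Each key belongs to exactly one shard, and each shard's data is touched
by exactly one worker, so the data itself needs no lock or latch.

```python
# Sketch only: shard_for, ShardWorker and run_sharded are hypothetical
# names. Threads stand in for "one single-threaded process per core".
import queue
import threading
import zlib


def shard_for(key, n_shards):
    """Map a key to exactly one shard (stable hash, so shards never overlap)."""
    return zlib.crc32(key.encode("utf-8")) % n_shards


class ShardWorker(threading.Thread):
    """Owns one shard's data and applies its operations one at a time."""

    def __init__(self):
        super().__init__()
        self.inbox = queue.Queue()  # hand-off point for incoming operations
        self.data = {}              # touched only by this worker

    def run(self):
        while True:
            op = self.inbox.get()
            if op is None:          # shutdown marker
                return
            key, delta = op
            # No lock on self.data: this thread is its only user.
            self.data[key] = self.data.get(key, 0) + delta


def run_sharded(ops, n_shards=4):
    """Route each (key, delta) operation to the worker that owns the key."""
    workers = [ShardWorker() for _ in range(n_shards)]
    for w in workers:
        w.start()
    for key, delta in ops:
        workers[shard_for(key, n_shards)].inbox.put((key, delta))
    for w in workers:
        w.inbox.put(None)
        w.join()
    return [w.data for w in workers]
```

The only synchronization left is in the queue that hands operations to a
worker. A transaction that touches keys in more than one shard is not
covered by this sketch.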
So to me the bottom line is that PG already has reduced overhead at
least for issue #2 and perhaps for #4.
What remains to be investigated in PG are in-memory optimization (#1)
and replication (#3) together with High Availability.
Yours, Stefan
2012/2/26 Karsten Hilbert <Karsten.Hilbert@gmx.net>:
> On Sun, Feb 26, 2012 at 08:37:54AM -0600, Andy Colson wrote:
>
>>> 3. WAL logging
>>
>> PG writes a transaction twice. Once to WAL and once to
>> the DB. WAL is a simple and quick write, and is only ever
>> used if your computer crashes and PG has to re-play
>> transactions to get the db into a good/known state. It's a
>> safety measure that doesn't really take much time, and I
>> don't think I've heard of anyone being WAL bound. Although
>> it does increase IO ops, it's not the biggest usage of IO.
>> This one falls under "let's be safe" which is something NoSQL
>> did away with. It's not something I want to give up,
>> personally. I like using a net.
>
> And, one could still effectively disable WAL by using
> unlogged tables.
>
> Karsten
> --
> GPG key ID E4071346 @ gpg-keyserver.de
> E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346