Re: 7.4 Wishlist - Mailing list pgsql-hackers

From Kevin Brown
Subject Re: 7.4 Wishlist
Msg-id 20021203204958.GA31342@filer
In response to Re: 7.4 Wishlist  ("Al Sutton" <al@alsutton.com>)
Responses Re: 7.4 Wishlist
List pgsql-hackers
Al Sutton wrote:
> Point to Point and Broadcast replication
> ----------------------------------------
> With point to point you specify multiple endpoints, with broadcast you can
> specify a subnet address and the updates are broadcast over that subnet.
> 
> The difference being that point to point works well for cross network
> replication, or where you have a few replicas. I have multiple database
> servers which could have a dedicated class C network that they are all on;
> by broadcasting updates you can cut down the amount of traffic on that net by
> a factor of n minus 1 (where n is the number of servers involved).

Yech.  Now you can't use TCP anymore, so the underlying replication
code has to handle all the issues that TCP deals with transparently,
like error checking, retransmits, data windows, etc.  I don't think
it's wise to assume that your transport layer is 100% reliable.
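
To make that concrete, here is a minimal sketch of the bookkeeping a
broadcast receiver would need just to notice lost, duplicated, or
reordered updates, which is part of what TCP otherwise handles for you.
The class name, the sequence numbering, and the in-memory buffering are
all assumptions made for illustration; nothing like this exists in
PostgreSQL, and a real protocol would also need checksums, timeouts,
and flow control.

```python
class BroadcastReceiver:
    """Hypothetical receiver for sequence-numbered broadcast updates."""

    def __init__(self):
        self.next_seq = 0   # next sequence number we are able to apply
        self.pending = {}   # out-of-order updates, keyed by sequence number
        self.applied = []   # payloads applied so far, in order

    def receive(self, seq, payload):
        """Accept one datagram; return the sequence numbers still missing."""
        if seq >= self.next_seq:
            # Buffer new updates; a duplicate keeps the first copy.
            self.pending.setdefault(seq, payload)
        # Apply every buffered update that is now contiguous.
        while self.next_seq in self.pending:
            self.applied.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        if not self.pending:
            return []
        # Gaps below the highest buffered update were lost (or are late)
        # and would have to be re-requested from the sender.
        return [s for s in range(self.next_seq, max(self.pending))
                if s not in self.pending]
```

Every non-empty return value here is a retransmit request the
replication code would have to send and track itself.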

Further, this doesn't even address the problem of bringing up a leaf
server that's been down a while.  It can be significantly out of date
relative to the other servers on the subnet.

I suspect you'll be better off implementing a replication protocol
that has the leaf nodes keeping each other up to date, to minimize the
traffic coming from the next level up.  Then you can use TCP for the
connections but minimize the traffic generated by any given node.
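
As a rough sketch of what I mean (the function name, the "parent"
label, and the tree-shaped relay order are all my own assumptions, not
an existing design), each update could be handed to one leaf and then
forwarded leaf to leaf over ordinary TCP connections:

```python
def relay_plan(leaves, fanout=2):
    """Return {sender: [receivers]} for one update relayed among leaves.

    The upstream node (called "parent" here) sends to the first leaf only;
    leaf i then forwards to leaves fanout*i + 1 .. fanout*i + fanout.
    """
    plan = {"parent": leaves[:1]}
    for i, leaf in enumerate(leaves):
        children = leaves[fanout * i + 1 : fanout * i + fanout + 1]
        if children:
            plan[leaf] = children
    return plan
```

With five leaves and a fanout of 2, the parent sends one copy instead
of five, and no leaf sends more than two.  The total number of
transmissions is the same as with point to point; the load is just
spread across the leaves.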

> Ability to use raw partitions
> ----------------------------
> 
> I've not seen an install of PostgreSQL yet that didn't put the database
> files onto a filesystem, so I'm assuming it's the only way of doing it. By
> using the filesystem the files are at the mercy of filesystem handler code
> as to where they end up on the disk, and thus the speed of access will
> always have some dependency on the speed of the filesystem.
> 
> With a raw partition it would be possible to use two devices (e.g. /dev/hde
> and /dev/hdg on an eight channel IDE Linux box), and PostgreSQL could then
> ensure the WALs were located on one disk with the entries running
> sequentially, and that the database files were located on the other disk in
> the most appropriate location (e.g. index data starting near the center of
> the disk, and user table data starting near the outside).

Yeah, but now you have to worry about optimizing placement of blocks,
optimizing writes, etc.  These are things the OS should worry about,
not the database server.

If you're really that concerned about these issues, store the WAL on
one (empty) filesystem and the tables on another (empty and separate)
filesystem.  With any reasonable filesystem you'll get reasonably
close to optimal performance, especially if the filesystem code is
capable of analyzing the write patterns and adapting itself
accordingly.

In short, I'd much rather spend the effort improving the filesystem
(where everyone can benefit) than improving PostgreSQL (where only
PostgreSQL users can benefit) for this item.

The one good reason for making it possible to use raw partitions is to
make it possible to use the PostgreSQL engine as a filesystem!  :-)


> Win32 Port
> ------------
> I've explained the reasons before. Apart from that it's always useful to
> open PostgreSQL up to a larger audience.

Agreed.


- Kevin Brown


