Thread: PostgreSQL alternative to "Oracle Real Application Cluster"

PostgreSQL alternative to "Oracle Real Application Cluster"

From
Hubert Fröhlich
Date:
Hello list

I am working with PostgreSQL for the Bavarian Cadastral Administration.
We want to update our WWW online services so that they can use more
data. For that reason we want to set up an archive of geographical data
(ca. 400 GB of vector data and maybe also 4 TB of raster data), mainly used
for HTTP services, i.e. the focus is on read operations (although we
also have to do some kind of replication from primary storage). The
system should be highly performant and highly available. It is
designed mainly for OpenGIS-standardized WWW applications on
geographical data such as UMN MapServer (more than several hundred thousand
web hits for geographical data alone). It is not yet clear whether we will
put the raster data simply in a file system or (as BLOBs) into a database.

We use a lot of open source software, not only PostgreSQL. Using
open source software is a major pillar of our IT strategy.

Now we have got a proposal for hardware and software for this machine which
is based mainly on the concept of the "Oracle Real Application Cluster"
on RedHat Linux boxes & file servers.
The concept is said to offer high performance, but it does not
fit into our IT strategy very well. We are a bit cautious concerning
promises, as Oracle Real Application Cluster seems to be some kind of
"black box" and we do not know HOW it really works.

What we would like to have is some alternative concept which allows us
to use PostgreSQL on powerful hardware to get highly performant and
highly available database access to our terabytes.

a) Does PostgreSQL have any features that make use of clustered hardware?
b) If not, what could be an alternative hardware concept?


Can anybody give me some advice, or point me to somebody who could help
us a bit further, or to some web page?

  Any help will be greatly appreciated.

Thanks a lot.

Yours,

Hubert


--
-------------------------------------------------------------------------------
Dr.-Ing. Hubert Fröhlich
Bezirksfinanzdirektion München
Alexandrastr. 3, D-80538 München, GERMANY
Tel. :+49 (0)89 / 2190 - 2980
Fax  :+49 (0)89 / 2190 - 2459
hubert.froehlich@bvv.bayern.de


Re: PostgreSQL alternative to "Oracle Real Application

From
Ron Johnson
Date:
On Wed, 2003-06-18 at 05:19, Hubert Fröhlich wrote:
> Hello list
>
> I am working with PostgreSQL for the Bavarian Cadastral Administration.
> We want to update our WWW online services so that they can use more
> data. For that reason we want to set up an archive of geographical data
> (ca. 400GB vector data and - maybe also 4 TB of raster data) mainly used
> for http services, i.e. the focus is on read operations. (although we
> have to do also some kind of replication from primary storage ) The
> system should be highly performant  and highly available. The use is
> designed mainly for OpenGIS standardized WWW applications on
> geographical data such as UMN MapServer ( > several 100000 web hits for
> geographical data alone). Yet it is not quite clear if we put the raster
> data simply in a file system or (as BLOBs) into a database.
>
> We use a lot of OpenSource software, not only PostgreSQL. Using
> OpenSource software is a major column of our IT strategy.
>
> Now we have got a proposal for hard- and software for this machine which
> is based mainly on the concept of the "Oracle Real Application Cluster"
> on RedHat Linux boxes & file servers.
> The concept is said to offer big performance - but this concept does not
> fit into our IT strategy very well. We are a bit cautious concerning
> promises, as Oracle Real Application Cluster seems to be some kind of
> "black box" and we do not know HOW it really works.
>
> What we would like to have is some alternative concept which allows us
> to use PostgreSQL on a powerful hardware to get highly performant and
> highly available database access on our terabytes.
>
> a) Does PostgreSQL have some features using a clustered hardware?
> b) If no, what could be an alternative hardware concept ?

You'll only be able to run 1 postmaster at a time, but in order
to only need one copy of the database and the raster data, maybe
the GlobalFileSystem would satisfy some or all of your needs.

Commercial version:
http://www.sistina.com/products_gfs.htm

OSS version:
http://opengfs.sourceforge.net/

Any "non-database" files could still be accessed by multiple web
servers.

--
+-----------------------------------------------------------+
| Ron Johnson, Jr.     Home: ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson |
|                                                           |
| "Oh, great altar of passive entertainment, bestow upon me |
|  thy discordant images at such speed as to render linear  |
|  thought impossible" (Calvin, regarding TV)               |
+-----------------------------------------------------------+


Re: PostgreSQL alternative to "Oracle Real Application

From
"scott.marlowe"
Date:
On Wed, 18 Jun 2003, Hubert Fröhlich wrote:

> Hello list
>
> I am working with PostgreSQL for the Bavarian Cadastral Administration.
> We want to update our WWW online services so that they can use more
> data. For that reason we want to set up an archive of geographical data
> (ca. 400GB vector data and - maybe also 4 TB of raster data) mainly used
> for http services, i.e. the focus is on read operations. (although we
> have to do also some kind of replication from primary storage ) The
> system should be highly performant  and highly available. The use is
> designed mainly for OpenGIS standardized WWW applications on
> geographical data such as UMN MapServer ( > several 100000 web hits for
> geographical data alone). Yet it is not quite clear if we put the raster
> data simply in a file system or (as BLOBs) into a database.

BLOBs pretty much ARE simply files in a file system in PostgreSQL.  It's
usually better to store the file in a file system and store the path
in PostgreSQL, along with keywords etc. to search on.
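
A minimal sketch of the path-plus-keywords idea above. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; against PostgreSQL the same kind of SQL would go through a database driver. The table name raster_files, its columns, and the example paths are hypothetical, not anything from this thread.

```python
import sqlite3

def create_catalog(conn):
    """Create a table holding file paths and search keywords, not image bytes."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raster_files ("
        " id INTEGER PRIMARY KEY,"
        " path TEXT NOT NULL UNIQUE,"
        " keywords TEXT NOT NULL)"
    )

def register_file(conn, path, keywords):
    """Record where a raster file lives and the keywords to search it by."""
    conn.execute(
        "INSERT INTO raster_files (path, keywords) VALUES (?, ?)",
        (path, " ".join(keywords)),
    )

def find_paths(conn, keyword):
    """Return the paths of all files whose keyword list contains the keyword."""
    rows = conn.execute(
        "SELECT path FROM raster_files"
        " WHERE ' ' || keywords || ' ' LIKE ? ORDER BY path",
        ("% " + keyword + " %",),
    )
    return [row[0] for row in rows]

# Hypothetical example data: the images stay on disk, only metadata is stored.
conn = sqlite3.connect(":memory:")
create_catalog(conn)
register_file(conn, "/data/raster/tile_001.tif", ["munich", "orthophoto"])
register_file(conn, "/data/raster/tile_002.tif", ["augsburg", "orthophoto"])
print(find_paths(conn, "orthophoto"))
```

The web application would look up the path in the database and then read the file itself from the file system.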

> We use a lot of OpenSource software, not only PostgreSQL. Using
> OpenSource software is a major column of our IT strategy.

Smart move.  You never know what vendor X is gonna want next year for
licensing fees after all.  What starts as an affordable project can
quickly become very expensive should your license fees jump up by a factor
of 10.

> Now we have got a proposal for hard- and software for this machine which
> is based mainly on the concept of the "Oracle Real Application Cluster"
> on RedHat Linux boxes & file servers.
>
> The concept is said to offer big performance - but this concept does not
> fit into our IT strategy very well. We are a bit cautious concerning
> promises, as Oracle Real Application Cluster seems to be some kind of
> "black box" and we do not know HOW it really works.

I've heard a lot of good things about RAC, but haven't used it myself.  Don't
believe promises, only benchmarks of your own design on your own data.  Is
Oracle willing to let you "test drive" the RAC solution, or do they
insist you pay up in full to see it in operation?  If you can't test it,
assume it won't work until you can test it.  Promises are religion, proof
is science.
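
One way to start on a benchmark of your own design is a small timing harness like the sketch below. It is only an illustration: run_query is a hypothetical callable that you would replace with a real request against your own data, and the statistics reported (median, 95th percentile, maximum) are an assumption about what is worth measuring.

```python
import statistics
import time

def benchmark(run_query, repetitions=100):
    """Call run_query repeatedly and return latency statistics in seconds."""
    if repetitions < 1:
        raise ValueError("repetitions must be at least 1")
    latencies = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    # Nearest-rank 95th percentile: ceil(0.95 * n) - 1, in integer arithmetic.
    p95_index = (95 * len(latencies) + 99) // 100 - 1
    return {
        "runs": len(latencies),
        "median": statistics.median(latencies),
        "p95": latencies[p95_index],
        "max": latencies[-1],
    }

# Stand-in workload; replace the lambda with a real query against your data.
result = benchmark(lambda: sum(range(1000)), repetitions=50)
print(result)
```

Running the same harness against each candidate system, with the same queries and the same data, gives numbers that can actually be compared.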

> What we would like to have is some alternative concept which allows us
> to use PostgreSQL on a powerful hardware to get highly performant and
> highly available database access on our terabytes.

Good thought.  Keep in mind that whatever you spend on Oracle licenses you
can spend on hardware for Postgresql, and Postgresql doesn't JUST run on
linux on X86.  It can run on IBM mainframes (with Linux), Sun Sparc
(solaris, linux or BSD), SGI Altix (up to 64 CPUs running linux) etc...

Plus, each year after that, you can buy more hardware for the cost of
yearly licensing on Oracle.

This may not be enough for some situations, but for most, it makes
Postgresql the faster option.  Plus, if you pay money for more advanced
hardware, you likely don't need hardware failover, since mainframes & big
iron unix usually have fault tolerant hardware.

> a) Does PostgreSQL have some features using a clustered hardware?

Nothing is built into Postgresql to do clustering.  There are many
different clustering / replication / failover setups out there.

www.pgsql.com sells a support contract that includes a commercial version
of rserv that works well for certain types of setups.  usogres is another
package, and there are several more.

> b) If no, what could be an alternative hardware concept ?

Big iron.  This is especially attractive if you already have big iron with
spare cycles or expansion capabilities lying about.

> Can anybody give me some advice or some hints to somebody who could help
> us a bit further, some web page ...

Do a search on google for postgresql and replication.  Plus search the
archives of the pgsql mailing lists at fts.postgresql.org and see if
anything looks good.


Re: PostgreSQL alternative to "Oracle Real Application Cluster"

From
Paul Thomas
Date:
On 18/06/2003 11:19 Hubert Fröhlich wrote:
>
> What we would like to have is some alternative concept which allows us
> to use PostgreSQL on a powerful hardware to get highly performant and
> highly available database access on our terabytes.
>
> a) Does PostgreSQL have some features using a clustered hardware?
> b) If no, what could be an alternative hardware concept ?
>
>
> Can anybody give me some advice or some hints to somebody who could help
> us a bit further, some web page ...

Have you considered a simple load balancing set-up? You don't mention how
you generate the web pages - hopefully not a scripting language if
performance matters! If, for  instance, your application is written in
Java/JSPs then you could set up several servers running Tomcat and
PostgreSQL and load balance them from an Apache web server with mod_jk.

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller Business |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+

Re: PostgreSQL alternative to "Oracle Real Application Cluster"

From
"Arjen van der Meijden"
Date:
> Paul Thomas wrote:
>
> On 18/06/2003 11:19 Hubert Fröhlich wrote:
> >
> > What we would like to have is some alternative concept
> which allows us
> > to use PostgreSQL on a powerful hardware to get highly
> performant and
> > highly available database access on our terabytes.
> >
> > a) Does PostgreSQL have some features using a clustered hardware?
> > b) If no, what could be an alternative hardware concept ?
> >
> >
> > Can anybody give me some advice or some hints to somebody who could
> > help
> > us a bit further, some web page ...
>
> Have you considered a simple load balancing set-up? You don't
> mention how
> you generate the web pages - hopefully not a scripting language if
> performance matters! If, for  instance, your application is
> written in
> Java/JSPs then you could set up several servers running Tomcat and
> PostgreSQL and load balance them from an Apache web server
> with mod_jk.
Where would the 400-4000GB of data go in this setup? On all the distinct
PostgreSQL servers? On a single SAN/NAS? (PostgreSQL doesn't really work
with that, does it? At least not in a load-balancing setup.)
And why can't a scripting language be used with load balancing? I know
it is hard or impossible to get a connection pool in such setups, but
that doesn't mean they can't be used with load balancing...

Arjen




Re: PostgreSQL alternative to "Oracle Real Application Cluster"

From
Paul Thomas
Date:
On 18/06/2003 14:39 Arjen van der Meijden wrote:
> > Have you considered a simple load balancing set-up? You don't
> > mention how
> > you generate the web pages - hopefully not a scripting language if
> > performance matters! If, for  instance, your application is
> > written in
> > Java/JSPs then you could set up several servers running Tomcat and
> > PostgreSQL and load balance them from an Apache web server
> > with mod_jk.
> Where would the 400-4000GB of data go in this setup? On all the distinct
> postgresql-servers? On a single SAN/NAS? (Postgresql doesn't really work
> with that, does it? At least not in a loadbalancing setup).

I think each server would need its own copy of the data as (from what I
read on this list) 2 postmasters cannot share a common database. So if the
OP wants to use PG then he is going to have to have duplicate databases.
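
As a rough illustration of what duplicate copies would cost in disk space, the sketch below compares two layouts using the sizes from the original post. It assumes 1 TB = 1000 GB, and it assumes (as suggested elsewhere in this thread) that the raster data could instead live once on shared file storage while only the vector database is duplicated; the server counts are hypothetical.

```python
VECTOR_GB = 400    # "ca. 400GB vector data" from the original post
RASTER_GB = 4000   # "maybe also 4 TB of raster data", taking 1 TB = 1000 GB

def storage_full_copies(servers):
    """Total GB if every server holds both the vector and the raster data."""
    return servers * (VECTOR_GB + RASTER_GB)

def storage_shared_raster(servers):
    """Total GB if only the vector database is duplicated and raster files are stored once."""
    return servers * VECTOR_GB + RASTER_GB

# Hypothetical server counts, printed as: servers, full copies, shared raster.
for n in (2, 3, 4):
    print(n, storage_full_copies(n), storage_shared_raster(n))
```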

> And why can't a scripting language be used with load balancing? I know
> it is hard or impossible to get a connectionpool in such setups, but
> that doesn't mean they can't be used with loadbalancing...

I was referring to the performance of scripting languages in web
applications. Java web applications are generally much faster and, as I
pointed out, there is an OSS load balancing option available for Java.

regards

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller Business |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+

Re: PostgreSQL alternative to "Oracle Real Application

From
Jonathan Bartlett
Date:
PLEASE NOTE -

I worked at EDS for a while managing UNIX boxes on the front line.
Oracle, even with clustering, did not always fail over or work properly.
Many times it had to be manually failed over because the database didn't
detect that one was down.  In other cases, the drivers were hosed and we
had to restart the entire application (on ~80 different boxes) to get them
to point to the new live target.

Don't believe the hype.  It's just that - hype.  It's better than nothing,
but certainly not "unbreakable".

Jon

On 18 Jun 2003, Ron Johnson wrote:

> [snip]
>
> You'll only be able to run 1 postmaster at a time, but in order
> to only need one copy of the database and the raster data, maybe
> the GlobalFileSystem would satisfy some or all of your needs.
>
> Commercial version:
> http://www.sistina.com/products_gfs.htm
>
> OSS version:
> http://opengfs.sourceforge.net/
>
> Any "non-database" files could still be accessed by multiple web
> servers.


Re: PostgreSQL alternative to "Oracle Real Application Cluster"

From
"Matthew Nuzum"
Date:
Think about something like this:

A = Data storage.  Probably a SAN or something very reliable and big enough to
hold all the data for images.

B = A database server containing the "live", real, up-to-the-minute data.

C = An application server used to manage B.

D = A replicated database copy of B.  Read-only.

E = One or more web servers, load balancers or caching servers accessible to
your users.

+---+        +---+     +---+
| A | ------ | B |-----| C |
+---+        +---+     +---+
 | \
 |  \______________
 |                 \
+---+   +---+     +---+   +---+
| E | - | D |     | E | - | D |
+---+   +---+     +---+   +---+


I hope my diagram is clear.  This isn't really too difficult a setup.
Imagine for now three database servers.  The master db server (B) is updated
infrequently.  The other database servers (D) each keep an up-to-date
read-only copy of B.  Each of these database servers can feed data to
several web servers.  The web servers are accessible to your users and do
most of the work.
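
The split between the master (B) and the read-only copies (D) can be sketched as a small routing rule. This is only an illustration of the idea: the class name ReplicaRouter, the round-robin policy, and the server names D1 and D2 are assumptions, not part of any existing PostgreSQL tool.

```python
import itertools

class ReplicaRouter:
    """Send writes to the master and spread reads over read-only replicas."""

    def __init__(self, master, replicas):
        if not replicas:
            raise ValueError("at least one replica is required")
        self.master = master
        # Cycle through the replicas forever, one per read request.
        self._replicas = itertools.cycle(replicas)

    def target_for(self, is_write):
        """Return the name of the server that should handle the request."""
        if is_write:
            return self.master
        return next(self._replicas)

# Hypothetical names: B is the master, D1 and D2 are its read-only copies.
router = ReplicaRouter("B", ["D1", "D2"])
```

Each web server (E) would ask the router which database to connect to; because updates are infrequent, nearly all requests end up on a D server.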

All your image data is stored on a good quality storage medium that is
highly available and fault tolerant.  Each webserver can have read only
access to this data and serve it to requesting web clients directly.

Load can be taken off your storage network and your replica database servers
by using caching servers running something like squid.

No one on this list can tell you what your needs are going to be, but maybe
an example, general purpose cluster might look something like this:

A: One heavy duty SAN.
B: One dual processor XEON server with 4 GB of RAM and fast disk array
C: One application server or an application server cluster suited to your
tasks
D: Two dual processor XEON servers with 4 GB of RAM and fast disk array
E: Two web server clusters consisting of:
E.1: 3 web servers with 1GB RAM and connection to A and able to serve data
from D
E.2: 3 caching servers serving content hosted on E.1
E.3: 1 Load balancer directing requests to E.2 and E.1

So a total configuration like this MIGHT consist of:
1 heavy duty SAN
3 dual processor XEON servers with 4 GB of RAM each and fast disk arrays
13 single processor servers with 1GB of RAM each and HBA connection to SAN
(1 application server (C), 6 web servers (E.1), 6 caching servers (E.2))
2 load balancers
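
The totals above follow from the per-component list; the short tally below reproduces the arithmetic. The dictionary keys are just labels chosen for this sketch.

```python
CLUSTERS = 2  # "Two web server clusters" (E)

# Contents of one web server cluster: E.1, E.2 and E.3.
per_cluster = {"web_servers": 3, "caching_servers": 3, "load_balancers": 1}

totals = {
    "san": 1,                    # A
    "dual_xeon_servers": 1 + 2,  # B plus the two D replicas
    # C (one application server) plus the web and caching servers of every cluster
    "single_cpu_servers": 1 + CLUSTERS * (
        per_cluster["web_servers"] + per_cluster["caching_servers"]
    ),
    "load_balancers": CLUSTERS * per_cluster["load_balancers"],
}
print(totals)
```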

Use any programming language you like.  Perl, PHP and Java can all handle a
setup like this.

Store the meta info about your images in the database, store a link to the
image data in the database but store the images on the filesystem (so that
they don't have to be replicated).

The quantity of E.1 and E.2 per each D would be a big variable and need some
testing and tweaking.

Just an idea.  ;-)

--
Matthew Nuzum
www.bearfruit.org
cobalt@bearfruit.org


> -----Original Message-----
> From: Arjen van der Meijden [mailto:acm@tweakers.net]
> Sent: Wednesday, June 18, 2003 9:39 AM
> To: 'Paul Thomas'; 'pgsql-general @ postgresql . org'
> Subject: Re: PostgreSQL alternative to "Oracle Real Application Cluster"
>
> > Paul Thomas wrote:
> > [snip]
> Where would the 400-4000GB of data go in this setup? On all the distinct
> postgresql-servers? On a single SAN/NAS? (Postgresql doesn't really work
> with that, does it? At least not in a loadbalancing setup).
> And why can't a scripting language be used with load balancing? I know
> it is hard or impossible to get a connectionpool in such setups, but
> that doesn't mean they can't be used with loadbalancing...
>
> Arjen
>
>



Re: PostgreSQL alternative to "Oracle Real Application

From
Ron Johnson
Date:
On Wed, 2003-06-18 at 10:52, Jonathan Bartlett wrote:
> PLEASE NOTE -
>
> I worked at EDS for a while managing UNIX boxes on the front line.
> Oracle, even with clustering, did not always fail over or work properly.
> Many times it had to be manually failed over because the database didn't
> detect that one was down.  In other cases, the drivers were hosed and we
> had to restart the entire application (on ~80 different boxes) to get them
> to point to the new live target.
>
> Don't believe the hype.  It's just that - hype.  It's better than nothing,
> but certainly not "unbreakable".

What (exact, if possible) version of Oracle?

[snip]
--
+-----------------------------------------------------------+
| Ron Johnson, Jr.     Home: ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson |
|                                                           |
| "Oh, great altar of passive entertainment, bestow upon me |
|  thy discordant images at such speed as to render linear  |
|  thought impossible" (Calvin, regarding TV)               |
+-----------------------------------------------------------+


Re: PostgreSQL alternative to "Oracle Real Application

From
Jonathan Bartlett
Date:
> What (exact, if possible) version of Oracle?

My guess would be 8, but I'm not positive.  I'm pretty sure they hadn't
upgraded to 9 yet, and they may have been back as far as 7 - yes, I'm
aware that's a pretty wide window.  Anyway, they were using an
Oracle/Weblogic combo, and it didn't work as smoothly as they wanted it
to :)

Anyway, I know this discussion was on Oracle 9, but I just wanted to throw
out that the vendors use hype more than reality, and the chances that they
have something that is terribly better than what you could cook up at home
is remote.  That's _why_ they don't publish the details.  If you knew the
details, you wouldn't be nearly as excited.

Jon



Re: PostgreSQL alternative to "Oracle Real Application

From
"scott.marlowe"
Date:
On Wed, 18 Jun 2003, Jonathan Bartlett wrote:

> > What (exact, if possible) version of Oracle?
>
> My guess would be 8, but I'm not positive.  I'm pretty sure they hadn't
> upgraded to 9 yet, and they may have been back as far as 7 - yes, I'm
> aware that's a pretty wide window.  Anyway, they were using an
> Oracle/Weblogic combo, and it didn't work as smoothly as they wanted it
> to :)
>
> Anyway, I know this discussion was on Oracle 9, but I just wanted to throw
> out that the vendors use hype more than reality, and the chances that they
> have something that is terribly better than what you could cook up at home
> is remote.  That's _why_ they don't publish the details.  If you knew the
> details, you wouldn't be nearly as excited.

Which feeds back to my earlier point that if the vendor won't set up a test
bench system for you to test on, be extra suspicious of their promises.


Re: PostgreSQL alternative to "Oracle Real Application Cluster"

From
"Shridhar Daithankar"
Date:
Sorry, I lost original post so replying several quote level deep.

On 18 Jun 2003 at 13:22, Matthew Nuzum wrote:
> > Where would the 400-4000GB of data go in this setup? On all the distinct
> > postgresql-servers? On a single SAN/NAS? (Postgresql doesn't really work
> > with that, does it? At least not in a loadbalancing setup).
> > And why can't a scripting language be used with load balancing? I know
> > it is hard or impossible to get a connectionpool in such setups, but
> > that doesn't mean they can't be used with loadbalancing...

Maybe this is another approach; it may be dumb, but consider it.

The data is read-only archive data, not likely to be changed once loaded.
Besides, this is not OLTP-style stuff where heavy concurrent transactions are
involved.

I suggest keeping the BLOBs in the file system rather than in the database. I
know this has already been suggested; I am just elaborating on it.

There is no need to replicate a huge amount of data. The app can store the file
name and checksum in the database. Once loading is complete and correct, an
archive copy of the data can be kept offline for recovery purposes.

The app can retrieve the database record and the file separately and checksum
the file. It may need a small caching layer, but that should be fairly trivial.

Once that is done, the rest of the actual database would be pretty small and
could be replicated if required. The database can be hosted on a small-to-medium
server and the BLOBs can be sent to SAN/NAS.

Data loading needs to be verified, but that is a small cost to pay for the
advantages this method offers.
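
The file-name-plus-checksum scheme could look roughly like the sketch below. The choice of SHA-256 is an assumption (the message only says "checksum"), and the function names are hypothetical; the temporary file merely stands in for a raster image on the SAN/NAS.

```python
import hashlib
import os
import tempfile

def file_checksum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_file(path, expected_checksum):
    """Compare a file on disk against the checksum stored for it in the database."""
    return file_checksum(path) == expected_checksum

# Demonstration with a temporary file standing in for a raster image.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example raster bytes")
    demo_path = tmp.name
stored_checksum = file_checksum(demo_path)  # value the app would save at load time
print(verify_file(demo_path, stored_checksum))
os.remove(demo_path)
```

At load time the app would store the path and checksum in the database; on retrieval, or during a periodic audit, it recomputes the checksum and compares.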

Just a thought..



Bye
 Shridhar

--
No one wants war.        -- Kirk, "Errand of Mercy", stardate 3201.7


Re: PostgreSQL alternative to "Oracle Real Application

From
Ron Johnson
Date:
On Thu, 2003-06-19 at 02:39, Shridhar Daithankar wrote:
> Sorry, I lost original post so replying several quote level deep.
>
> On 18 Jun 2003 at 13:22, Matthew Nuzum wrote:
> > > Where would the 400-4000GB of data go in this setup? On all the distinct
> > > postgresql-servers? On a single SAN/NAS? (Postgresql doesn't really work
> > > with that, does it? At least not in a loadbalancing setup).
> > > And why can't a scripting language be used with load balancing? I know
> > > it is hard or impossible to get a connectionpool in such setups, but
> > > that doesn't mean they can't be used with loadbalancing...
>
> Maybe this is another approach; it may be dumb, but consider it.
>
> The data is read-only archive data, not likely to be changed once loaded.
> Besides, this is not OLTP-style stuff where heavy concurrent transactions are
> involved.
>
> I suggest keeping the BLOBs in the file system rather than in the database. I
> know this has already been suggested; I am just elaborating on it.
>
> There is no need to replicate a huge amount of data. The app can store the file
> name and checksum in the database. Once loading is complete and correct, an
> archive copy of the data can be kept offline for recovery purposes.
>
> The app can retrieve the database record and the file separately and checksum
> the file. It may need a small caching layer, but that should be fairly trivial.
>
> Once that is done, the rest of the actual database would be pretty small and
> could be replicated if required. The database can be hosted on a small-to-medium
> server and the BLOBs can be sent to SAN/NAS.
>
> Data loading needs to be verified, but that is a small cost to pay for the
> advantages this method offers.

Another benefit of this method is that by using the Global File
System, multiple web servers could access those files simultaneously,
thus speeding response time.

--
+-----------------------------------------------------------+
| Ron Johnson, Jr.     Home: ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson |
|                                                           |
| "Oh, great altar of passive entertainment, bestow upon me |
|  thy discordant images at such speed as to render linear  |
|  thought impossible" (Calvin, regarding TV)               |
+-----------------------------------------------------------+