Advice on multi-machine high-reliability setup? - Mailing list pgsql-admin

From postgresql@phreaker.net
Subject Advice on multi-machine high-reliability setup?
Date
Msg-id 20030904125947.A3F73D1B89A@svr1.postgresql.org
Responses FATAL 2: (pg_clog ... no such file or directory)
List pgsql-admin
Hi,

I've done some work with databases, but nothing extreme.
I've just been assigned to set up what I'd call a
"high-reliability" site, as opposed to "high-availability" --
which I'd like too, of course :-)

I've got some ideas on how to achieve my goal, but I fear I'm
not quite up-to-date on the possibilities of replication and
such, and I haven't found much in-depth documentation on the
kind of setup I'd like (pointers anyone?), so I'd appreciate
it if you could give my ideas the once-over and tell me if I'm
missing something.

Requirements:
=============

Basic model is that of a reference database; I'm running a
repository for data which is constantly consulted and regularly updated.
OK, that pretty much describes any database :-) Anyway, a lot
more queries than updates.

I'll have one or more webservers using PHP to insert into the
database.  I want them to be limited to queries and inserts so
that bugs in the PHP scripts can only insert bad data and not
remove good data.  I prefer to have programs that modify the
central data to be server-side logic, and not farm out "UPDATE"s
to PHP scripts.
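One way to enforce "queries and inserts only" would be a dedicated
database user for the PHP scripts, granted SELECT and INSERT but not
UPDATE or DELETE.  A minimal sketch, with hypothetical user and table
names:

```sql
-- Hypothetical web-facing user: it can read and add rows, but a buggy
-- script cannot UPDATE or DELETE existing data.
CREATE USER php_web;
GRANT SELECT, INSERT ON data TO php_web;
GRANT SELECT, INSERT ON tickets TO php_web;
```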

I want to keep a record of all updates (made through client
scripts) for some time, and I do want to be able to query
the updates, which I suppose eliminates simple logging.  I
need a ticket number for each update (I'm avoiding the term
"transaction"), and the capability to see when and how the
update was fulfilled, which updates were made in a given
time-frame, etc.

Consulting has to be immediate, while updates don't have to go
through instantaneously, but I want them rather quick.  One or two
minutes is OK; five minutes starts to be a long time.

If the client scripts say "there was an error, try again"
it's not a severe problem.  If they say "OK" and the data
doesn't make it into the nightly backup, even once, that's
a showstopper, and nobody will use my system.  The smallest
failure I'll accept as a reason for losing acknowledged data
is the near-simultaneous catastrophic loss of hard disks on two
separate machines.  (Yes, software RAID -- or maybe even
hardware RAID if I get my hands on enough controllers -- will make
that four disks lost simultaneously :-)).  I think a good way
would be for updating clients to write to one machine, and delay
acknowledgement until the data is read from the second "slave"
machine, possibly saying after some time "there seems to be a
delay, please recheck <URL with ticket number> in a few minutes
to see if your update went through".  Does that sound feasible?
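The "recheck later" page could be a single query against a ticket
table like the one sketched below under "My ideas"; a nullable "done"
timestamp would distinguish pending from applied updates.  Table and
column names here are hypothetical:

```sql
-- NULL "done" means the update is still pending; a non-NULL value
-- records when it was applied on the query server.
SELECT ticket, submitted, done,
       CASE WHEN done IS NULL THEN 'pending' ELSE 'applied' END AS status
FROM tickets
WHERE ticket = 12345;
```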

I need to be able to exchange machines.  I fully expect my
crummy hardware to break, I know that sooner or later I'll want
to upgrade the OS in ways that require a reboot, or upgrade
crummy hardware, and I don't want the downtime.  I'm working on
the premise that having multiple cheap Linux boxes (I've got
lots left over from layoffs) is better than one or two really big expensive
servers (no money for that anyway).  I want to be able to take
a new machine, install it to specifications (or restore one
from backup !), switch it on, bring it up to date with current
database, and let it take over as a hot backup and/or primary
server.  Or as a web server, of course.

I don't have much of an idea on data size or update frequency,
which is one of the reasons I want to be able to plug in new
machines seamlessly; if load gets high, it means I'm popular,
and I'll be able to get better hardware :-)

My ideas:
=========

I'm thinking that the PHP scripts could write updates to a
"write-only" table/database/server.  I suppose there is a simple
way to make an insert into an auto-increment table and get in
return the unique number to use as a ticket.
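A minimal sketch of that ticket table, assuming a SERIAL column (the
table layout and names are just an illustration).  In PostgreSQL the
client can read back the number it just drew with currval(), which is
session-local and therefore safe under concurrency:

```sql
-- "ticket" is a SERIAL column, so each INSERT draws the next value
-- from the implicit sequence tickets_ticket_seq.
CREATE TABLE tickets (
    ticket     serial PRIMARY KEY,
    submitted  timestamp NOT NULL DEFAULT now(),
    who        text NOT NULL,
    name       text NOT NULL,
    newvalue   text NOT NULL,
    done       timestamp   -- NULL until the update is applied
);

-- The PHP script inserts, then reads back its ticket number.
INSERT INTO tickets (who, name, newvalue) VALUES ('Alice', 'XXX', 'YYY');
SELECT currval('tickets_ticket_seq');
```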

I'd use the newly released replication to have the updates
and the data on a query-only server, an update only being
acknowledged to the user when it's registered on the query-only
server.

Once the update is registered, I use some kind of script
to convert "ticket=aaaaa, time=xx:xx who=Alice name=XXX
newvalue=YYY, done=NULL" into "UPDATE data SET value=YYY,
lastchange=aaaaa where name=XXX; UPDATE tickets set done=time
where ticket=aaaaa;".  Stored Procedure triggered on ticket
creation?  I've never used them . . . do they work across
machines?  That is, what would be the best way to have update
tables on one machine and data tables on another?  If I had
that, changing the data master would be transparent to users,
who'd just notice a five-minute delay before the update went
through instead of 30 (?) seconds.  It would be cool to use
transactions, but I don't think one transaction can refer to two
databases on two machines (yet)?
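For the single-machine case, that conversion could be a PL/pgSQL
trigger fired on ticket creation.  A sketch, reusing the hypothetical
table and column names from above (note that a trigger runs inside one
database, so this by itself wouldn't span two machines):

```sql
-- Apply each new ticket to the data table as soon as it is inserted,
-- then stamp the ticket as done.
CREATE FUNCTION apply_ticket() RETURNS trigger AS '
BEGIN
    UPDATE data SET value = NEW.newvalue, lastchange = NEW.ticket
        WHERE name = NEW.name;
    UPDATE tickets SET done = now() WHERE ticket = NEW.ticket;
    RETURN NEW;
END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER ticket_insert AFTER INSERT ON tickets
    FOR EACH ROW EXECUTE PROCEDURE apply_ticket();
```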

This should also enable me to restore a backup of the primary
data (in case of loss of the primary data), apply the updates
since the backup was made, and end up with an up-to-date
system. Hmm. Is this necessary if I have replication . . . is
replication necessary if I have this?

My doubts:
==========

If I do manage to put updates and data on two different servers,
would it be possible to make a transaction on the data triggered
by the creation of an update ticket?  That way I could make
updates conditional on the data ("You wanted to update name=XXX
from ZZZ (the value when you queried) to YYY, but between your query
of the data and the update request another update went through,
setting the value to TTT").
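That conditional update is the classic optimistic-concurrency check: a
single UPDATE whose WHERE clause includes the value the client last
saw.  A sketch with hypothetical names and values:

```sql
-- Only apply the update if the value is still what the client saw
-- when it queried; zero affected rows means another update slipped in
-- between the query and this ticket, and the ticket can be rejected.
UPDATE data SET value = 'YYY', lastchange = 12345
    WHERE name = 'XXX' AND value = 'ZZZ';
```

In PHP the script would check the affected-row count (pg_affected_rows)
to decide whether to mark the ticket done or rejected.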

Could I maybe have several update databases (one per web server
for instance, each replicating their data onto a shared slave)
so that I can exchange the update master without user-perceived
downtime?

Can I do partial replication?  Querying updates is obviously a
totally different operation from querying the data, and it'll be
much less frequent, so could I have slaves for just the data?

(Sudden fundamental doubt) Replication doesn't just mean better
query time on the slave, it also means failover to the slave if
need be, right?

(Most important) Am I overcomplicating things?  I do want the
slightest data loss to require complete and near-simultaneous loss
of two servers, I do want to be able to add and remove machines
from my almost-cluster without users perceiving downtime, but is
there a simpler way to achieve that goal than the multi-tiered
cascading update system I've outlined?

Thanks a lot for any pointers or comments you can give me.

HAND
