Home > mailing lists

[Q] Cluster design for geographically separated dbs - Mailing list pgsql-general

From	V S P
Subject	[Q] Cluster design for geographically separated dbs
Date	March 7, 2009 20:11:17
Msg-id	1236459827.25013.1304185549@webmail.messagingengine.com Whole thread Raw
In response to	Re: VACUUM (John R Pierce <pierce@hogranch.com>)
Responses	Re: [Q] Cluster design for geographically separated dbs
List	pgsql-general

Tree view

Hello,
I am designing a db to hold often changed user data.

I just wanted to write down what I am thinking and ask people on the
list to comment
if they have any experiences in that area.

My thought is to have
say west-cost and east-cost data center and each user will
go to either East Coast or West Coast

and then within each Coast, I would want to partition by Hash on a user
id.

I am reviewing the Skype paper on the subject


http://kaiv.wordpress.com/2007/07/27/postgresql-cluster-partitioning-with-plproxy-part-i/


And wanted to ask what would be the main challenges I am facing with --
from the experience of the users on this list.

Especially I am not sure how to for example manage 'overlapping unique
IDs' data.

First, say I have a user who is trying to register with the same ID as
somebody
else only in a different data center -- that means that I always have to
check
first in each datacenter if ID exists.  Then based on his/her IP address
I decide what data center is closest (but IP addresses are often not a
good indication of geographical location of the user either, so I will
give them a 'manual' select option)

Then if I have say 'BIG' serial in my tables, but since there is more
than one database -- the 'big-serial' in one database can well overlap
it in another database.

So if I have any tables that must contain data from different databases
-- I have to add something else to the 'foreign' key -- besides the
reference to the big serial. And so on...

Right now - on paper, I am just having quite a few 'extra' fields in my
tables just to support 'UNiqueness' of the record across clusters.

I am not sure if I am doing it the right way (because then I also have
to at some point in time 'Defgrament' the IDs (as the data with
BIGserial keys can be deleted).

It looks to me that If I design things to take advantage of  Skype's
plproxy -- I will be able to leverage, what appears to be, a relatively
easy way to get data between databases (for reports that span clusters).


thanks in advance for any comments,
Vlad
--
  V S P
  toreason@fastmail.fm

--
http://www.fastmail.fm - Email service worth paying for. Try it for free

pgsql-general by date:

From: John R Pierce
Date: 07 March 2009, 18:41:23
Subject: Re: VACUUM

From: Willy-Bas Loos
Date: 07 March 2009, 20:32:19
Subject: open up firewall from "anywhere" to postgres ports?

[Q] Cluster design for geographically separated dbs - Mailing list pgsql-general

Previous

Next