Thread: idea for a geographically distributed database: how best to implement?

idea for a geographically distributed database: how best to implement?

From

Andy Ballingall

Date:

17 November 2005, 07:55:14

Hello,

I've got a database for a website which is a variant of the 'show stuff near
to me' sort of thing.

Rather than host this database on a single server, I have a scheme in mind
to break the database up geographically so that each one can run comfortably
on a small server, but I'm not sure about the best way of implementing it.

Here's the scheme:

--------------------------------
Imagine that the country is split into an array of square cells.
Each cell contains a database that stores information about people who live
in the area covered by the cell.

There's one problem with this scheme. What happens if you live near the edge
of a cell?

My solution is that any inserted data which lies near to the edge of cell A
is *also* inserted in the database of the relevant neighbouring cell - let's
say cell B.

Thus, if someone lives in cell B, but close to the border with cell A,
they'll see the data that is geographically close to 
them, even if it lies in cell A.

--------------------------------

Is this a common pattern?

I could, of course, simply find every insert, update and delete in the
application and alter the code to explicitly update all the relevant
databases, but is there a more elegant way of simply saying: "Do this
transaction on both Database A and Database B" monotonically?

I've had a look at some replication solutions, but they all seem to involve
replicating an entire database. The advantage of my scheme is that if I can
distribute my application over large numbers of small servers, I'll end up
with more bangs for the buck, and it'll be much easier to manage growth by
managing the number of servers, and number of cells hosted on each server.

Thanks for any suggestions!
Andy Ballingall

Re: idea for a geographically distributed database: how best to implement?

From

"codeWarrior"

Date:

17 November 2005, 11:58:07

Sounds like a mis-guided approach to me...

You might want to consider using latitude and longitude or zip codes or 
taking more of a traditional "GIS" approach rather than duplicating data 
across redundant databases.

Another issue is that you end up having to query every database to find 
proximity... Suppose you have 500 "cells" ? You now have to invoke some sort 
of RPC mechanism on 499 other RDBMS, manage 500 user names and passwords, 
retrieve potentially 500 recordsets, merge the data, etc...

Your problems will continue to get more and more complex... You are better 
off with a single RDBMS and a single schema...




"Andy Ballingall" <andy@areyoulocal.co.uk> wrote in message 
news:ECOWS04Mp8nkfZyufzT00010fe4@smtp-out4.blueyonder.co.uk...
> Hello,
>
> I've got a database for a website which is a variant of the 'show stuff 
> near
> to me' sort of thing.
>
> Rather than host this database on a single server, I have a scheme in mind
> to break the database up geographically so that each one can run 
> comfortably
> on a small server, but I'm not sure about the best way of implementing it.
>
> Here's the scheme:
>
> --------------------------------
> Imagine that the country is split into an array of square cells.
> Each cell contains a database that stores information about people who 
> live
> in the area covered by the cell.
>
> There's one problem with this scheme. What happens if you live near the 
> edge
> of a cell?
>
> My solution is that any inserted data which lies near to the edge of cell 
> A
> is *also* inserted in the database of the relevant neighbouring cell - 
> let's
> say cell B.
>
> Thus, if someone lives in cell B, but close to the border with cell A,
> they'll see the data that is geographically close to
> them, even if it lies in cell A.
>
> --------------------------------
>
> Is this a common pattern?
>
> I could, of course, simply find every insert, update and delete in the
> application and alter the code to explicitly update all the relevant
> databases, but is there a more elegant way of simply saying: "Do this
> transaction on both Database A and Database B" monotonically?
>
> I've had a look at some replication solutions, but they all seem to 
> involve
> replicating an entire database. The advantage of my scheme is that if I 
> can
> distribute my application over large numbers of small servers, I'll end up
> with more bangs for the buck, and it'll be much easier to manage growth by
> managing the number of servers, and number of cells hosted on each server.
>
> Thanks for any suggestions!
> Andy Ballingall
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>               http://www.postgresql.org/docs/faq
>

Re: idea for a geographically distributed database: how best

From

Achilleus Mantzios

Date:

17 November 2005, 12:41:18

O codeWarrior έγραψε στις Nov 17, 2005 :

> Sounds like a mis-guided approach to me...
> 
> You might want to consider using latitude and longitude or zip codes or 
> taking more of a traditional "GIS" approach rather than duplicating data 
> across redundant databases.
> 
> Another issue is that you end up having to query every database to find 
> proximity... Suppose you have 500 "cells" ? You now have to invoke some sort 
> of RPC mechanism on 499 other RDBMS, manage 500 user names and passwords, 
> retrieve potentially 500 recordsets, merge the data, etc...
> 
> Your problems will continue to get more and more complex... You are better 
> off with a single RDBMS and a single schema...

Andy, i agree with what codeWarrior says.
But if you are interested in replication, dbmirror is very elegant
(altho not as trendy) simple, and highly customizable replication 
solution.

I have heavily modified dbmirror to even support
Asynchronous Conditional row grained 
Foreign key dependency Traversal oriented 
Lazy Replication!!!
(which ofcourse nobody wants :)

> 
> 
> 
> 
> "Andy Ballingall" <andy@areyoulocal.co.uk> wrote in message 
> news:ECOWS04Mp8nkfZyufzT00010fe4@smtp-out4.blueyonder.co.uk...
> > Hello,
> >
> > I've got a database for a website which is a variant of the 'show stuff 
> > near
> > to me' sort of thing.
> >
> > Rather than host this database on a single server, I have a scheme in mind
> > to break the database up geographically so that each one can run 
> > comfortably
> > on a small server, but I'm not sure about the best way of implementing it.
> >
> > Here's the scheme:
> >
> > --------------------------------
> > Imagine that the country is split into an array of square cells.
> > Each cell contains a database that stores information about people who 
> > live
> > in the area covered by the cell.
> >
> > There's one problem with this scheme. What happens if you live near the 
> > edge
> > of a cell?
> >
> > My solution is that any inserted data which lies near to the edge of cell 
> > A
> > is *also* inserted in the database of the relevant neighbouring cell - 
> > let's
> > say cell B.
> >
> > Thus, if someone lives in cell B, but close to the border with cell A,
> > they'll see the data that is geographically close to
> > them, even if it lies in cell A.
> >
> > --------------------------------
> >
> > Is this a common pattern?
> >
> > I could, of course, simply find every insert, update and delete in the
> > application and alter the code to explicitly update all the relevant
> > databases, but is there a more elegant way of simply saying: "Do this
> > transaction on both Database A and Database B" monotonically?
> >
> > I've had a look at some replication solutions, but they all seem to 
> > involve
> > replicating an entire database. The advantage of my scheme is that if I 
> > can
> > distribute my application over large numbers of small servers, I'll end up
> > with more bangs for the buck, and it'll be much easier to manage growth by
> > managing the number of servers, and number of cells hosted on each server.
> >
> > Thanks for any suggestions!
> > Andy Ballingall
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: Have you checked our extensive FAQ?
> >
> >               http://www.postgresql.org/docs/faq
> > 
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
> 

-- 
-Achilleus

Re: idea for a geographically distributed database: how best to implement?

From

Andy Ballingall

Date:

17 November 2005, 13:01:45

Hi again,

I don't think I've explained my idea well enough:

>You might want to consider using latitude and longitude or zip codes or 
taking more of a traditional "GIS" approach rather than duplicating data 
across redundant databases.

I do use coordinates to position data. The databases aren't redundant. Each
database serves a particular region, the aim being to ensure that no one
database gets too big for the machines in my farm to cope with.

>Another issue is that you end up having to query every database to find 
proximity... Suppose you have 500 "cells" ? You now have to invoke some sort

of RPC mechanism on 499 other RDBMS, manage 500 user names and passwords, 
retrieve potentially 500 recordsets, merge the data, etc...


1. It doesn't matter how many cells there are. A user's details are only
found in the database covering the area he lives, and possibly in the
neighbouring databases (at most 3, if he lives in the corner of the cell).

2. The neighbouring cells of cell A never change, therefore you don't need
to do anything clever to identify which other databases to write to in these
edge conditions, since when the user logs in, the relationship information
is read just once (just like the user's name and picture)

3. No merging ever needs to take place during queries. Here's an example.
Let's say, you, a resident of cell A, say 'show me who lives nearby!'. The
database of cell A has *all* the information. Why? Well, when a resident in
nearby cell B registered, his details were also added to cell A at that
time. Database A doesn't know, or care, how the data it holds got there.

4. Which database do I use to serve all your requests? Well, yes, once, I
have to work out which one it is, based on where you are, but that's almost
no work at all (see next point...)


5. I didn't mention before that my scheme *does* also require a national
database, but the size of that is well defined, as the only jobs it needs to
do are:
a) Know the names of all the places,
b) Know the definition of the existing cells, and their positions - to make
it easy to manage the cell structure, and to know where to send you when you
login
c) A set of global sequences used to generate unique ids for all the other
databases. When your record goes into cell A and cell B, the id of this
record is the *same*...

>Your problems will continue to get more and more complex... You are better 
off with a single RDBMS and a single schema...

I currently have a single database. It is clearly simpler. However, I also
have a headache about what machinery to buy so that:
a) it starts off cheap
b) It is easy to expand, if and when the website takes off.

Yes, if I knew how many people would use this service, and how quickly it
would expand, I'd be able to raise the capital to buy a huge nationwide
server. 

But I don't. So the answer 'how big does your DB server need to be' is
simply 'I can only make a very poor guess'. This doesn't look good in a
business plan. Far better if I can demonstrate an application which can
start off on a single simple server, and can elegantly expand with the user
base, with the farm of servers growing over time.


Unlike many data sets, mine is almost totally partitioned geographically.
There is only *one* little detail - that of visibility of data in
neighbouring cells, and that is sorted out with my idea of duplicating
information between neighbours.

Hope that fills in some gaps...

Thanks for your comments,
Andy



"Andy Ballingall" <andy@areyoulocal.co.uk> wrote in message 
news:ECOWS04Mp8nkfZyufzT00010fe4@smtp-out4.blueyonder.co.uk...
> Hello,
>
> I've got a database for a website which is a variant of the 'show stuff 
> near
> to me' sort of thing.
>
> Rather than host this database on a single server, I have a scheme in mind
> to break the database up geographically so that each one can run 
> comfortably
> on a small server, but I'm not sure about the best way of implementing it.
>
> Here's the scheme:
>
> --------------------------------
> Imagine that the country is split into an array of square cells.
> Each cell contains a database that stores information about people who 
> live
> in the area covered by the cell.
>
> There's one problem with this scheme. What happens if you live near the 
> edge
> of a cell?
>
> My solution is that any inserted data which lies near to the edge of cell 
> A
> is *also* inserted in the database of the relevant neighbouring cell - 
> let's
> say cell B.
>
> Thus, if someone lives in cell B, but close to the border with cell A,
> they'll see the data that is geographically close to
> them, even if it lies in cell A.
>
> --------------------------------
>
> Is this a common pattern?
>
> I could, of course, simply find every insert, update and delete in the
> application and alter the code to explicitly update all the relevant
> databases, but is there a more elegant way of simply saying: "Do this
> transaction on both Database A and Database B" monotonically?
>
> I've had a look at some replication solutions, but they all seem to 
> involve
> replicating an entire database. The advantage of my scheme is that if I 
> can
> distribute my application over large numbers of small servers, I'll end up
> with more bangs for the buck, and it'll be much easier to manage growth by
> managing the number of servers, and number of cells hosted on each server.
>
> Thanks for any suggestions!
> Andy Ballingall
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>               http://www.postgresql.org/docs/faq
> 



---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to      choose an index scan if your joining column's
datatypesdo not      match
 



-- 
No virus found in this incoming message.
Checked by AVG Free Edition.
Version: 7.1.362 / Virus Database: 267.13.3/173 - Release Date: 16/11/2005

Re: idea for a geographically distributed database: how best

From

Andy Ballingall

Date:

17 November 2005, 13:05:57


>Andy, i agree with what codeWarrior says.
But if you are interested in replication, dbmirror is very elegant
(altho not as trendy) simple, and highly customizable replication 
solution.

I have heavily modified dbmirror to even support
Asynchronous Conditional row grained 
Foreign key dependency Traversal oriented 
Lazy Replication!!!
(which ofcourse nobody wants :)


I'll go and check it out, so even if I don't want to use it, I'll at least
know what it means! 

(I considered suggesting using an acronym for it, but 'ACRGFKDTOLR' made my
lips bleed when I tried to say it).

Thanks,
Andy


> 
> 
> 
> 
> "Andy Ballingall" <andy@areyoulocal.co.uk> wrote in message 
> news:ECOWS04Mp8nkfZyufzT00010fe4@smtp-out4.blueyonder.co.uk...
> > Hello,
> >
> > I've got a database for a website which is a variant of the 'show stuff 
> > near
> > to me' sort of thing.
> >
> > Rather than host this database on a single server, I have a scheme in
mind
> > to break the database up geographically so that each one can run 
> > comfortably
> > on a small server, but I'm not sure about the best way of implementing
it.
> >
> > Here's the scheme:
> >
> > --------------------------------
> > Imagine that the country is split into an array of square cells.
> > Each cell contains a database that stores information about people who 
> > live
> > in the area covered by the cell.
> >
> > There's one problem with this scheme. What happens if you live near the 
> > edge
> > of a cell?
> >
> > My solution is that any inserted data which lies near to the edge of
cell 
> > A
> > is *also* inserted in the database of the relevant neighbouring cell - 
> > let's
> > say cell B.
> >
> > Thus, if someone lives in cell B, but close to the border with cell A,
> > they'll see the data that is geographically close to
> > them, even if it lies in cell A.
> >
> > --------------------------------
> >
> > Is this a common pattern?
> >
> > I could, of course, simply find every insert, update and delete in the
> > application and alter the code to explicitly update all the relevant
> > databases, but is there a more elegant way of simply saying: "Do this
> > transaction on both Database A and Database B" monotonically?
> >
> > I've had a look at some replication solutions, but they all seem to 
> > involve
> > replicating an entire database. The advantage of my scheme is that if I 
> > can
> > distribute my application over large numbers of small servers, I'll end
up
> > with more bangs for the buck, and it'll be much easier to manage growth
by
> > managing the number of servers, and number of cells hosted on each
server.
> >
> > Thanks for any suggestions!
> > Andy Ballingall
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: Have you checked our extensive FAQ?
> >
> >               http://www.postgresql.org/docs/faq
> > 
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
> 

-- 
-Achilleus


---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
              http://archives.postgresql.org



-- 
No virus found in this incoming message.
Checked by AVG Free Edition.
Version: 7.1.362 / Virus Database: 267.13.3/173 - Release Date: 16/11/2005

Re: idea for a geographically distributed database: how best to implement?

From

Andy Ballingall

Date:

18 November 2005, 05:09:26

Thanks Aidan,

That's very interesting! I've not used slony yet, so I'll setup some tests
and try exactly this mechanism.

What I like about it is that the changes that have to occur to either add a
cell or to split an existing cell into two to manage growing traffic have a
localised effect on the web of database interdependencies, making it more
straightforward to do maintenance without the website being taken offline
for most people. They'd only notice downtime if they happened to live in or
next to the cells being messed around with. Not having to touch the actual
web application is nice.

The other advantage over my manual hack is that the replicated data is just
that - replicated, so when each database does a backup, it's only the master
data that is getting backed up and restored. In my manual scheme, following
restoration after a node failure,  I'd have to do a postprocessing step to
work out which restored data was actually data slaved from a neighbouring
cell, and refresh it in case the neighbouring data has been updated since
the node went down.

(I hadn't considered backup and restore properly up till now, but I'm going
to add that to the list of good reasons for taking advantage of my
'distantly disjoint' data set. If my national data is shared between 500
cells, then each database is going to be titchy, and backup will take 'no
time at all', as in parallel, 500 streams of database data flow first to
node local storage and thence to safer, remote storage)

Cheers,
Andy

---------

-----Original Message-----
From: Aidan Van Dyk [mailto:aidan@highrise.ca] 
Sent: 17 November 2005 19:54
To: andy@areyoulocal.co.uk; aidan@highrise.ca
Subject: Re: idea for a geographically distributed database: how best to
implement?

Andy,

This is  *very* interesting scenario.  Definitely peaked my interest on this
one.

I haven't tried this scenario before, but I have used "inheritted" tables to
do "splitups".  Basically, just partition your data, and replicate certain
partitions between various neighbours. http://www.postgresql.org/docs/8.1/interactive/ddl-partitioning.html

With Slony, you can get "master slave" replication of particular tables
"sets" (not necessarily whole databases).

So I haven't developped this idea, or tried it, but you might be able to
play with it and make it work.

On *each* database, you have a head table, with no records: CREATE TABLE complete ();

And a bunch of "cell" tables that inherit the "complete" table: CREATE TABLE cell_a () INHERITS "complete"; CREATE
TABLEcell_b () INHERITS "complete";

Basically, you *always* work on the "complete" table, with the following
rules:

CREATE RULE cell_replicate_insert ON INSERT TO complete DO INSTEAD       INSERT INTO cell_a (...);
CREAT RULE cell_replicate_update ON UPDATE TO complete DO INSTEAD       UPDATE cell_a ...;

Use SlonyI to replecate cella to NODE B (master->slave), and on NODE B, do
something similar (except work on b, and replicate b -> NODE A).

This way, all your web/interface code everywhere is using the "complete"
table for any insert/update/select.  When insert/updateing, it will always
go to the "local" table, which Slony "replicates" out to the neighbouring
nodes.  And when selecting on it (the "complete" table), you get data from
your local table, and any "replicated" neighbouring node's tables which are
in the local schema inheriting from "complete" and SlonyI is updating.

Good luck, and if you get anywhere with this, let me know!

Andy Ballingall wrote:

> Hello,
> 
> I've got a database for a website which is a variant of the 'show stuff
> near to me' sort of thing.
> 
> Rather than host this database on a single server, I have a scheme in mind
> to break the database up geographically so that each one can run
> comfortably on a small server, but I'm not sure about the best way of
> implementing it.
> 
> Here's the scheme:
> 
> --------------------------------
> Imagine that the country is split into an array of square cells.
> Each cell contains a database that stores information about people who
> live in the area covered by the cell.
> 
> There's one problem with this scheme. What happens if you live near the
> edge of a cell?
> 
> My solution is that any inserted data which lies near to the edge of cell
> A is *also* inserted in the database of the relevant neighbouring cell -
> let's say cell B.
> 
> Thus, if someone lives in cell B, but close to the border with cell A,
> they'll see the data that is geographically close to
> them, even if it lies in cell A.
> 
> --------------------------------
> 
> Is this a common pattern?
> 
> I could, of course, simply find every insert, update and delete in the
> application and alter the code to explicitly update all the relevant
> databases, but is there a more elegant way of simply saying: "Do this
> transaction on both Database A and Database B" monotonically?
> 
> I've had a look at some replication solutions, but they all seem to
> involve replicating an entire database. The advantage of my scheme is that
> if I can distribute my application over large numbers of small servers,
> I'll end up with more bangs for the buck, and it'll be much easier to
> manage growth by managing the number of servers, and number of cells
> hosted on each server.
> 
> Thanks for any suggestions!
> Andy Ballingall
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
> 
>                http://www.postgresql.org/docs/faq

-- 
No virus found in this incoming message.
Checked by AVG Free Edition.
Version: 7.1.362 / Virus Database: 267.13.3/173 - Release Date: 16/11/2005

Re: idea for a geographically distributed database: how best to implement?

From

Andrew Sullivan

Date:

18 November 2005, 11:35:11

On Fri, Nov 18, 2005 at 09:09:24AM -0000, Andy Ballingall wrote:
> That's very interesting! I've not used slony yet, so I'll setup some tests
> and try exactly this mechanism.

If you do this with Slony, and have any success, I know that the
folks on the Slony list would dearly like to hear about it
(especially if you can provide details of how you did it).  If you
get this working in a production system, I can think of more than
one conference that would _also_ like a paper on it.

A

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.    --Alexander Hamilton

Re: idea for a geographically distributed database: how best to implement?

From

"Bath, David"

Date:

21 November 2005, 19:59:18

On Thu, 17 Nov 2005 19:44, Andy Ballingall wrote:
> I've got a database for a website which is a variant of the 'show stuff near
> to me' sort of thing.
> 
> Rather than host this database on a single server, I have a scheme in mind
> to break the database up geographically so that each one can run comfortably
> on a small server, but I'm not sure about the best way of implementing it.

There are a couple of philosophical perspectives I've come across in previous
work with cadastral data that may be useful.

1) A given point may be in more than one "hierarchy" of geographical units  (ofbiz.org implements this quite well with
differentregion types and  a "region relationship type" table).

2) Some relevant hierarchies involve irregular shapes.  e.g. a) postal/zip codes are not squares  e.g. b) Often a
postal/zipcode system for a country is only loosely          based on states/province, as sometimes "twin cities" can
      straddle a river and state/province boundary, but the postal          centre for both sides is handled from one
side. (The same          often applies for business servicing).  An example that comes          to mind is Mildura
(Victoria/Australia)and its smaller          sibling across the Murray River, Wentworth (New South
Wales/Australia). e.g. c) the "capture" area for many offices are often based on the          state/province, again,
notsquare.

3) There are ways of looking at "approximate nearness" of irregular shapes  (although the same applies to squares)
usingterms such as "centric"  and "centroid", which involve things like drawing the smallest circle  AROUND a region
thatencloses ALL of it, or the largest circle INSIDE  a region.  It might be relevant to consider the "slightly outside
but close" points in your "replication" as well.  (In my previous example,  Mildura and Wentworth would be included in
thedatabases of branch  offices in BOTH Vic and NSW, although one office should be considered  authoritative, as
servicesfor both might be through Victorian branch  offices, although Wentworth customers are likely to contact the NSW
branch office).  It is useful to consider "extending" the diameter  of the circle enclosing the entire region by x%.
Yep,"squaring the  circle" and "circling the square" are useful in the real world.

4) There are a lot of useful bits and pieces at opengis.org and postgis.org

5) There are many useful GIS functions in postgres to decide if a point is  inside an irregular polygon (including
whetherit has both convex and  concave bits), and whether any two shapes overlap.

Hope these points are useful.
-- 
David T. Bath
dave.bath@unix.net

Re: idea for a geographically distributed database: how best to implement?

From

Andy Ballingall

Date:

22 November 2005, 09:22:23

David Bath wrote:
> There are a couple of philosophical perspectives I've come across in
> previous
> work with cadastral data that may be useful...[snipped]

Thanks, David

In this particular application, structures such as postcode sectors,
administrative boundaries etc. are not really of much importance, as most
stuff is a simple coordinate based searches. Even with the problem
partitioned into disjoint regions, within each region, the search remains
trivial, as all the data that the user is allowed to access will be stored
with that region (this includes data replicated from neighbouring regions).

In this context, the interesting task isn't so much the actual database
searching, or the exact definition of the disjoint regions.

The interesting task is to define a system which can dynamically remap the
hosting of regions to specific servers, so that no one server gets too busy.
As demand grows, I simply plug in more 4 blades and press the 'reconfigure'
button (Sorry - I was dreaming for a moment...)

The only limiters are the number of servers available and the activity
within a single region (which must be servable by a single server), but
given the highly localised nature of the application, the regions can be
very small, and I don't expect to ever see a region with more than 1GB of
data - the aim being for all the data to be resident in RAM.

So far, I've already seen some issues. I've been looking at slony-1 to
handle the replication between adjacent regions, and not only is it
asynchronous (I was hoping otherwise...slony-2 seems a long way off), but
changing the db schema has ramifications too. (I.e. changing the schema
means redefining each replication). Still - no show stoppers yet.

Thanks for your insights,

Andy

DEFAULT Constraint based on table type?

From

"Announce"

Date:

28 November 2005, 17:22:50

Lets say I have the following tables.

CREATE TABLE animals(id primary key, name varchar, type varchar);
CREATE TABLE dogs (breed varchar)            INHERITS (animals);
CREATE TABLE birds (bool hasFeathers) INHERITS (animals);

Is there a way I can specify a default on the child table that will populate
the 'type' column? For example, if I am inserting a row in table DOGS, I
would always want the default value for column TYPE to be 'DOG'. If I am
inserting into BIRDS.... type 'BIRD'.

I know that I could add individual triggers on each table that set the TYPE
field to a default value on insert but I wanted a more simple solution like
setting a DEFAULT table-constraint.

Also, In java, this could be done on a parent object by overriding a
constructor or method, using the Class object or instanceof.  Is there
anyway for a table to "know" it's "class" in this scenario?

Thanks,

-Aaron

Re: DEFAULT Constraint based on table type?

From

Jaime Casanova

Date:

28 November 2005, 18:14:18

On 11/28/05, Announce <truthhurts@insightbb.com> wrote:
> Lets say I have the following tables.
>
> CREATE TABLE animals(id primary key, name varchar, type varchar);
> CREATE TABLE dogs (breed varchar)            INHERITS (animals);
> CREATE TABLE birds (bool hasFeathers) INHERITS (animals);
>
> Is there a way I can specify a default on the child table that will populate
> the 'type' column? For example, if I am inserting a row in table DOGS, I
> would always want the default value for column TYPE to be 'DOG'. If I am
> inserting into BIRDS.... type 'BIRD'.
>
> I know that I could add individual triggers on each table that set the TYPE
> field to a default value on insert but I wanted a more simple solution like
> setting a DEFAULT table-constraint.
>
> Also, In java, this could be done on a parent object by overriding a
> constructor or method, using the Class object or instanceof.  Is there
> anyway for a table to "know" it's "class" in this scenario?
>
> Thanks,
>
> -Aaron
>

CREATE TABLE animals(id primary key, name varchar, type varchar);

CREATE TABLE dogs (breed varchar)            INHERITS (animals);
ALTER TABLE dogs ALTER COLUMN type SET DEFAULT 'DOG';

CREATE TABLE birds (bool hasFeathers) INHERITS (animals);
ALTER TABLE birds ALTER COLUMN type SET DEFAULT 'BIRD';


--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)

Re: DEFAULT Constraint based on table type?

From

Rod Taylor

Date:

28 November 2005, 20:06:55

On Mon, 2005-11-28 at 14:22 -0600, Announce wrote:
> Lets say I have the following tables.
> 
> CREATE TABLE animals(id primary key, name varchar, type varchar);
> CREATE TABLE dogs (breed varchar)            INHERITS (animals);
> CREATE TABLE birds (bool hasFeathers) INHERITS (animals);

r=# alter table birds alter type set default 'Bird';
ALTER TABLE
r=# \d birds                       Table "public.birds"  Column    |       Type        |             Modifiers
-------------+-------------------+-----------------------------------id          | integer           | not nullname
  | character varying |type        | character varying | default 'Bird'::character varyinghasfeathers | boolean
 |
 
Inherits: animals

r=# \d animals        Table "public.animals"Column |       Type        | Modifiers
--------+-------------------+-----------id     | integer           | not nullname   | character varying |type   |
charactervarying |
 
Indexes:   "animals_pkey" PRIMARY KEY, btree (id)



--

Re: DEFAULT Constraint based on table type?

From

Oliver Elphick

Date:

29 November 2005, 06:30:06

On Mon, 2005-11-28 at 14:22 -0600, Announce wrote:
> Lets say I have the following tables.
> 
> CREATE TABLE animals(id primary key, name varchar, type varchar);
> CREATE TABLE dogs (breed varchar)            INHERITS (animals);
> CREATE TABLE birds (bool hasFeathers) INHERITS (animals);
> 
> Is there a way I can specify a default on the child table that will populate
> the 'type' column? For example, if I am inserting a row in table DOGS, I
> would always want the default value for column TYPE to be 'DOG'. If I am
> inserting into BIRDS.... type 'BIRD'.
> 
> I know that I could add individual triggers on each table that set the TYPE
> field to a default value on insert but I wanted a more simple solution like
> setting a DEFAULT table-constraint.
> 
> Also, In java, this could be done on a parent object by overriding a
> constructor or method, using the Class object or instanceof.  Is there
> anyway for a table to "know" it's "class" in this scenario?

SELECT tableoid::regproc, * from animals;

-- 
Oliver Elphick                                          olly@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver
GPG: 1024D/A54310EA  92C8 39E7 280E 3631 3F0E  1EC0 5664 7A2F A543 10EA
======================================== Do you want to know God?   http://www.lfix.co.uk/knowing_god.html

Re: DEFAULT Constraint based on table type?

From

Oliver Elphick

Date:

29 November 2005, 06:31:18

On Tue, 2005-11-29 at 10:31 +0000, Oliver Elphick wrote:
> SELECT tableoid::regproc, * from animals;

regproc should be regclass; sorry.

-- 
Oliver Elphick                                          olly@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver
GPG: 1024D/A54310EA  92C8 39E7 280E 3631 3F0E  1EC0 5664 7A2F A543 10EA
======================================== Do you want to know God?   http://www.lfix.co.uk/knowing_god.html