Re: Geographic data sources, queries and questions - Mailing list pgsql-general

From btober
Subject Re: Geographic data sources, queries and questions
Date
Msg-id 465599d0.27d.1693.1458166436@ct.metrocast.net
Whole thread Raw
In response to Geographic data sources, queries and questions  ("Chuck D." <pgsql-list@nullmx.com>)
Responses Re: Geographic data sources, queries and questions
List pgsql-general
>
> >
> > I don't believe this is good design.  You'll have to
have a trigger or
> > something to verify that the country_id+state_id on the
city table are
> > exactly equal to the country_id+state_id on the state
table.  If you
> > don't, you might have something like (using US city
names...) "country:
> > USA -> state: NY" -> "country: Zimbabwe -> state: NY ->
city: New
> > York".
>
> >
> > It isn't a problem of "any country and any state" on the
city table, but
> > a problem of "this state inside that particular
country".  I'd drop the
> > country column.
>
> You are right, this is a bad design.  The country_id on
the city table has to
> go.


I'm not sure it is a bad design. Country has a country_id.
That's the primary key. State has a state_id, and exactly
one country, so really state has a compound primary key,
namely (country_id, state_id). And similarly to produce
relational integrity between state and city, city needs to
reference the state primary key, which means state has to
have all three (country_id, state_id, city_id) as it's
primary key.

This ties in with a subject dear to my heart and discussed
at great length starting here:

"http://archives.postgresql.org/pgsql-general/2006-08/msg00535.php"

The tie-in is not that you necessarily need the infamous
"gap-less" sequence, but that you don't necessarily need
three sequences per se.

While it would be temptingly easy to simply declare all
three separately in each table as

country:
country_id SERIAL, ...

state:
country_id integer,
state_id SERIAL, ...

city:
country_id integer,
state_id integer,
city_id SERIAL, ...

with that naive approach, every row in state has a unique
state_id, and every row in city has a unique city_id. Then
you'll notice that values of country_id are repeated in
state, and state_id values are repeated in city.

And then you'll realize that really it is the combination of
(country_id, state_id) that defines a unique state, and
(country_id, state_id, city_id) that defines a unique city.
It would require the use of stored programs and triggers to
manage these compound keys. But that's what stored programs
and triggers are for.

If you allow the fact that two countries could lay claim to
the same geographic sub-region, then you need a separate
table for an n-m relation.

Then throw in postal codes. Not only can a city have
multiple postal codes, but a postal code can serve more than
one city. And the cities served might be in different
states! (I used to have an example of that, but I can't find
it right now.)

Same with telephone area codes. (You are going there,
eventually, right?)




pgsql-general by date:

Previous
From: Marek Lewczuk
Date:
Subject: Re: the future of pljava development
Next
From: "Merlin Moncure"
Date:
Subject: Re: composite type array and index