Re: How many views... - Mailing list pgsql-general

From Uwe C. Schroeder
Subject Re: How many views...
Date
Msg-id 200411282327.57565.uwe@oss4u.com
Whole thread Raw
In response to Re: How many views...  (Greg Stark <gsstark@mit.edu>)
List pgsql-general
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 28 November 2004 10:49 pm, Greg Stark wrote:
> "Uwe C. Schroeder" <uwe@oss4u.com> writes:
> > I could create a view for every node in the system which calculates the
> > distance in the result set, making it easy to handle for the application:
> > select * from <view> where distance <= 50
> > The problem is, that the data will possibly contain thousands of nodes.
> > I'd also need 2 or 3 views per node - which could lead to 50.000 or even
> > 100.000 views.
>
> Normalize.
>
> Make your view construct the distance for _every_ node. So your query looks
> like:
>
>   select * from <view> where node_id = 99 and distance <= 50

Well, in my case a "node_id" would refer to a zipcode (for the basic version,
later on maybe even pushing it down to street level using more detailed gps
data). The problem I see is that the records the view sits on top of may and
will contain several similar records. Think of UPS: they would want to know
the distance to any recipient of a parcel, however a lot of those recipients
have the same zipcode. I just can't think of a view that retrieves a specific
person in that zipcode based on the zipcode. So there would have to be a
second parameter to it - or a view for each record.
To put it better: The application records customers.
table customer (
customer_id int4 primary key,
zipcode varchar(10),
other stuff about the customer
)

A normalized view just wouldn't return that specific customer plainly based on
the zipcode, because there could/will be a second or third customer in the
same zipcode. For the distance that wouldn't matter, but for the customer
info it would.
I'm just playing with options right now. Maybe/likely I have to revise the
database model. The stored proc works fine, it just could become slow with a
lot of customer records. I think Michael's prior post is the better answer -
limit the possible coordinates to a subset before starting to calculate the
actual distance.

> The danger here is you have to be absolutely sure that Postgres is pushing
> that clause down into your view or the performance will be bad. But if your
> view is at all sane then I think it will. And you'll certainly know if it
> isn't.

That's what I'm afraid of. The database will potentially contain 100.000
customer records once productive - in the US alone, leaving aside what has to
be done about the rest of the world. So, yeah - one will notice the drag on
an expensive calculation, particularly because the application has lists that
show ALL customers sorted by distance to one (changing) location.

    UC

- --
Open Source Solutions 4U, LLC    2570 Fleetwood Drive
Phone:  +1 650 872 2425        San Bruno, CA 94066
Cell:   +1 650 302 2405        United States
Fax:    +1 650 872 2417
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFBqs99jqGXBvRToM4RAsv+AKCsM05f9JR0yMIXfbELrArJ6z9WKACeKfYa
nAsM0NRh09R+Zl7eu+FDS/g=
=DiJv
-----END PGP SIGNATURE-----


pgsql-general by date:

Previous
From: "Andrew Dunstan"
Date:
Subject: Re: [HACKERS] Adding Reply-To: to Lists
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] Adding Reply-To: to Lists configuration ...