Thread: Getting the output of a function used in a where clause

Getting the output of a function used in a where clause

From
"Bill Lawrence"
Date:
<div class="Section1"><p class="MsoNormal"><span class="EmailStyle15"><font color="black" face="Arial" size="2"><span
style="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:
Arial">HI,</span></font></span><p class="MsoNormal"><span class="EmailStyle15"><font color="black" face="Arial"
size="2"><spanstyle="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family: 
Arial"> </span></font></span><p class="MsoNormal"><span class="EmailStyle15"><font color="black" face="Arial"
size="2"><spanstyle="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family: 
Arial">I’m a newbie so please bear with me. I have a function defined (got it from one of your threads… thanks Joe
Conway)which calculates the distance <span style="mso-spacerun: yes"> </span>between 2 zip code centeroids (in
lat,long).This thing works great. However, I want to sort my results by distance without incurring the additional
burdenof executing the function twice. A simplified version of my current SQL (written in a perl cgi) <span
style="mso-spacerun:yes"> </span>that returns a set of zip codes within a given radius is:</span></font></span><p
class="MsoNormal"><spanclass="EmailStyle15"><font color="black" face="Arial" size="2"><span
style="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:
Arial"> </span></font></span><pre><span class="EmailStyle15"><font color="black" face="Arial" size="2"><span
style="font-size:10.0pt;font-family:Arial">$sql= “SELECT zipcode from zipcodes where </span></font></span><font
face="Arial"><spanstyle="font-family:Arial">zipdist($lat1d,$lon1d,lat,long) <=
$dist;”;</span></font></pre><pre><fontcolor="black" face="Arial" size="2"><span
style="font-size:10.0pt;font-family:Arial"> </span></font></pre><pre><fontcolor="black" face="Arial" size="2"><span
style="font-size:10.0pt;font-family:Arial">WhatI want to write is something like:</span></font></pre><pre><font
color="black"face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial"> </span></font></pre><pre><span
class="EmailStyle15"><fontcolor="black" face="Arial" size="2"><span style="font-size: 
10.0pt;font-family:Arial">$sql = “SELECT zipcode, distance from zipcodes where </span></font></span><font
face="Arial"><spanstyle="font-family:Arial">distance <= $dist order by distance;”;</span></font></pre><pre><font
color="black"face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial"> </span></font></pre><pre><font
color="black"face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial">But I don’t the magic SQL phrase to
populatethe distance variable using my nifty function. Do I need to create an output type for
distance?</span></font></pre><pre><fontcolor="black" face="Arial" size="2"><span
style="font-size:10.0pt;font-family:Arial"> </span></font></pre><pre><fontcolor="black" face="Arial" size="2"><span
style="font-size:10.0pt;font-family:Arial">Thanksin advance!</span></font></pre><pre><font color="black" face="Arial"
size="2"><spanstyle="font-size:10.0pt;font-family:Arial"> </span></font></pre><pre><font color="black" face="Arial"
size="2"><spanstyle="font-size:10.0pt;font-family:Arial">Bill</span></font></pre><p class="MsoNormal"><span
class="EmailStyle15"><fontcolor="black" face="Arial" size="2"><span
style="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:
Arial"> </span></font></span><p class="MsoNormal"><span class="EmailStyle15"><font color="black" face="Arial"
size="2"><spanstyle="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family: 
Arial"> </span></font></span><p class="MsoNormal"><font color="black" face="Arial" size="3"><span style="font-size:
12.0pt;font-family:Arial;color:black"> </span></font><font color="black" face="Arial"><span
style="font-family:Arial;color:black;mso-color-alt:
windowtext"></span></font></div>

Re: Getting the output of a function used in a where clause

From
PFC
Date:
try:

SELECT zipcode, zipdist($lat1d,$lon1d,lat,long) as distance from zipcodes  
where distance <= $dist;”;

OR you could use a gist index with a geometric datatype to get it a lot  
faster.


On Sat, 09 Apr 2005 03:43:39 +0200, Bill Lawrence <bill.lawrence@cox.net>  
wrote:

> HI,
>
> I’m a newbie so please bear with me. I have a function defined (got it  
> from
> one of your threads… thanks Joe Conway) which calculates the distance
> between 2 zip code centeroids (in lat,long). This thing works great.
> However, I want to sort my results by distance without incurring the
> additional burden of executing the function twice. A simplified version  
> of
> my current SQL (written in a perl cgi)  that returns a set of zip codes
> within a given radius is:
>
>
> What I want to write is something like:
>
> $sql = “SELECT zipcode, distance from zipcodes where distance <= $dist  
> order
> by distance;”;
>
> But I don’t the magic SQL phrase to populate the distance variable using  
> my
> nifty function. Do I need to create an output type for distance?
>
> Thanks in advance!
>
> Bill
>
>
>




Re: Getting the output of a function used in a where clause

From
"Bill Lawrence"
Date:
Boy I sure thought that would work... I received the following from postgres:

ERROR:  Attribute "distance" not found.

Started looking into gist.... Looks complex.

Any other ideas?


-----Original Message-----
From: PFC [mailto:lists@boutiquenumerique.com]
Sent: Monday, April 11, 2005 1:51 AM
To: Bill Lawrence; pgsql-sql@postgresql.org
Subject: Re: [SQL] Getting the output of a function used in a where clause


try:

SELECT zipcode, zipdist($lat1d,$lon1d,lat,long) as distance from zipcodes
where distance <= $dist;";

OR you could use a gist index with a geometric datatype to get it a lot
faster.


On Sat, 09 Apr 2005 03:43:39 +0200, Bill Lawrence <bill.lawrence@cox.net>
wrote:

> HI,
>
> I'm a newbie so please bear with me. I have a function defined (got it
> from
> one of your threads... thanks Joe Conway) which calculates the distance
> between 2 zip code centeroids (in lat,long). This thing works great.
> However, I want to sort my results by distance without incurring the
> additional burden of executing the function twice. A simplified version
> of
> my current SQL (written in a perl cgi)  that returns a set of zip codes
> within a given radius is:
>
>
> What I want to write is something like:
>
> $sql = "SELECT zipcode, distance from zipcodes where distance <= $dist
> order
> by distance;";
>
> But I don't the magic SQL phrase to populate the distance variable using
> my
> nifty function. Do I need to create an output type for distance?
>
> Thanks in advance!
>
> Bill
>
>
>





Re: Getting the output of a function used in a where clause

From
Scott Marlowe
Date:
Why not just do:

SELECT zipcode, zipdist($lat1d,$lon1d,lat,long) as distance from
zipcodes where zipdist($lat1d,$lon1d,lat,long) <= $dist;";


On Mon, 2005-04-11 at 20:25, Bill Lawrence wrote:
> Boy I sure thought that would work... I received the following from postgres:
> 
> ERROR:  Attribute "distance" not found.
> 
> Started looking into gist.... Looks complex.
> 
> Any other ideas?
> 
> 
> -----Original Message-----
> From: PFC [mailto:lists@boutiquenumerique.com]
> Sent: Monday, April 11, 2005 1:51 AM
> To: Bill Lawrence; pgsql-sql@postgresql.org
> Subject: Re: [SQL] Getting the output of a function used in a where clause
> 
> 
> try:
> 
> SELECT zipcode, zipdist($lat1d,$lon1d,lat,long) as distance from zipcodes 
> where distance <= $dist;";
> 
> OR you could use a gist index with a geometric datatype to get it a lot 
> faster.
> 
> 
> On Sat, 09 Apr 2005 03:43:39 +0200, Bill Lawrence <bill.lawrence@cox.net> 
> wrote:
> 
> > HI,
> >
> > I'm a newbie so please bear with me. I have a function defined (got it 
> > from
> > one of your threads... thanks Joe Conway) which calculates the distance
> > between 2 zip code centeroids (in lat,long). This thing works great.
> > However, I want to sort my results by distance without incurring the
> > additional burden of executing the function twice. A simplified version 
> > of
> > my current SQL (written in a perl cgi)  that returns a set of zip codes
> > within a given radius is:
> >
> >
> > What I want to write is something like:
> >
> > $sql = "SELECT zipcode, distance from zipcodes where distance <= $dist 
> > order
> > by distance;";
> >
> > But I don't the magic SQL phrase to populate the distance variable using 
> > my
> > nifty function. Do I need to create an output type for distance?
> >
> > Thanks in advance!
> >
> > Bill
> >
> >
> >
> 
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings


Re: Getting the output of a function used in a where clause

From
"Bill Lawrence"
Date:
Thanks a bunch!

Looks pretty step-by-step at the site for the link you sent. I'll give it a
shot and see how it turns out.

Thanks again for all your help!

Bill
-----Original Message-----
From: PFC [mailto:lists@boutiquenumerique.com]
Sent: Tuesday, April 12, 2005 1:03 AM
To: Bill Lawrence
Subject: Re: [SQL] Getting the output of a function used in a where clause


> Boy I sure thought that would work... I received the following from
> postgres:
>
> ERROR:  Attribute "distance" not found.
>
> Started looking into gist.... Looks complex.
>
> Any other ideas?
       Complex ?

CREATE TABLE stuff (       ...       coords BOX NOT NULL,       ...
) WITHOUT OIDS;

CREATE INDEX cities_coords_idx ON geo.cities USING GIST ( coords
gist_box_ops );

For some reason you must use BOX instead ot POINT to use the index.


CREATE OR REPLACE FUNCTION boxpoint(FLOAT,FLOAT)        RETURNS BOX        RETURNS NULL ON NULL INPUT        LANGUAGE
plpgsql        AS
$$
DECLARE       p POINT;
BEGIN       p := point($1,$2);       IF $1=0 AND $2=0 THEN RETURN NULL; END IF;        RETURN box(p,p);
END;
$$;

now use boxpoint(x,y) to select a box :

INSERT INTO stuff (...,coords,...) VALUES (...,boxpoint(x,y),...)

Now to get all the records whose coords are inside a box using the index :

SELECT ... WHERE cords && '((xa,ya),(xb,yb))'::box

for all the details look there :

http://www.postgis.org/docs/ch04.html#id3530280

it's simple once you're into it. You'll need to install postgis.






















Re: Getting the output of a function used in a where clause

From
"Bill Lawrence"
Date:
Thanks,

Unfortunately, I think that solution requires the distance calculation to be
executed twice for each record in the table. There are ~70K records in the
table. Is the postgres query optimizer smart enough to only perform the
calculation once?

Bill



-----Original Message-----
From: Scott Marlowe [mailto:smarlowe@g2switchworks.com]
Sent: Tuesday, April 12, 2005 6:55 AM
To: Bill Lawrence
Cc: PFC; pgsql-sql@postgresql.org
Subject: Re: [SQL] Getting the output of a function used in a where clause

Why not just do:

SELECT zipcode, zipdist($lat1d,$lon1d,lat,long) as distance from
zipcodes where zipdist($lat1d,$lon1d,lat,long) <= $dist;";


On Mon, 2005-04-11 at 20:25, Bill Lawrence wrote:
> Boy I sure thought that would work... I received the following from
postgres:
>
> ERROR:  Attribute "distance" not found.
>
> Started looking into gist.... Looks complex.
>
> Any other ideas?
>
>
> -----Original Message-----
> From: PFC [mailto:lists@boutiquenumerique.com]
> Sent: Monday, April 11, 2005 1:51 AM
> To: Bill Lawrence; pgsql-sql@postgresql.org
> Subject: Re: [SQL] Getting the output of a function used in a where clause
>
>
> try:
>
> SELECT zipcode, zipdist($lat1d,$lon1d,lat,long) as distance from zipcodes
> where distance <= $dist;";
>
> OR you could use a gist index with a geometric datatype to get it a lot
> faster.
>
>
> On Sat, 09 Apr 2005 03:43:39 +0200, Bill Lawrence <bill.lawrence@cox.net>
> wrote:
>
> > HI,
> >
> > I'm a newbie so please bear with me. I have a function defined (got it
> > from
> > one of your threads... thanks Joe Conway) which calculates the distance
> > between 2 zip code centeroids (in lat,long). This thing works great.
> > However, I want to sort my results by distance without incurring the
> > additional burden of executing the function twice. A simplified version
> > of
> > my current SQL (written in a perl cgi)  that returns a set of zip codes
> > within a given radius is:
> >
> >
> > What I want to write is something like:
> >
> > $sql = "SELECT zipcode, distance from zipcodes where distance <= $dist
> > order
> > by distance;";
> >
> > But I don't the magic SQL phrase to populate the distance variable using
> > my
> > nifty function. Do I need to create an output type for distance?
> >
> > Thanks in advance!
> >
> > Bill
> >
> >
> >
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings




Re: Getting the output of a function used in a where clause

From
Rod Taylor
Date:
On Tue, 2005-04-12 at 23:08 -0700, Bill Lawrence wrote:
> Thanks,
> 
> Unfortunately, I think that solution requires the distance calculation to be
> executed twice for each record in the table. There are ~70K records in the
> table. Is the postgres query optimizer smart enough to only perform the
> calculation once?

It is in some places, but possibly not in that one.

You can force it with a subselect though:              SELECT * FROM (SELECT zipcode, zipdist($lat1d,$lon1d,lat,long)
   as distance       from zipcodes) AS tab where distance <= $dist;
 

> Bill
> 
> 
> 
> -----Original Message-----
> From: Scott Marlowe [mailto:smarlowe@g2switchworks.com]
> Sent: Tuesday, April 12, 2005 6:55 AM
> To: Bill Lawrence
> Cc: PFC; pgsql-sql@postgresql.org
> Subject: Re: [SQL] Getting the output of a function used in a where clause
> 
> Why not just do:
> 
> SELECT zipcode, zipdist($lat1d,$lon1d,lat,long) as distance from
> zipcodes where zipdist($lat1d,$lon1d,lat,long) <= $dist;";
> 
> 
> On Mon, 2005-04-11 at 20:25, Bill Lawrence wrote:
> > Boy I sure thought that would work... I received the following from
> postgres:
> >
> > ERROR:  Attribute "distance" not found.
> >
> > Started looking into gist.... Looks complex.
> >
> > Any other ideas?
> >
> >
> > -----Original Message-----
> > From: PFC [mailto:lists@boutiquenumerique.com]
> > Sent: Monday, April 11, 2005 1:51 AM
> > To: Bill Lawrence; pgsql-sql@postgresql.org
> > Subject: Re: [SQL] Getting the output of a function used in a where clause
> >
> >
> > try:
> >
> > SELECT zipcode, zipdist($lat1d,$lon1d,lat,long) as distance from zipcodes
> > where distance <= $dist;";
> >
> > OR you could use a gist index with a geometric datatype to get it a lot
> > faster.
> >
> >
> > On Sat, 09 Apr 2005 03:43:39 +0200, Bill Lawrence <bill.lawrence@cox.net>
> > wrote:
> >
> > > HI,
> > >
> > > I'm a newbie so please bear with me. I have a function defined (got it
> > > from
> > > one of your threads... thanks Joe Conway) which calculates the distance
> > > between 2 zip code centeroids (in lat,long). This thing works great.
> > > However, I want to sort my results by distance without incurring the
> > > additional burden of executing the function twice. A simplified version
> > > of
> > > my current SQL (written in a perl cgi)  that returns a set of zip codes
> > > within a given radius is:
> > >
> > >
> > > What I want to write is something like:
> > >
> > > $sql = "SELECT zipcode, distance from zipcodes where distance <= $dist
> > > order
> > > by distance;";
> > >
> > > But I don't the magic SQL phrase to populate the distance variable using
> > > my
> > > nifty function. Do I need to create an output type for distance?
> > >
> > > Thanks in advance!
> > >
> > > Bill
> > >
> > >
> > >
> >
> >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 7: don't forget to increase your free space map settings
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
> 
-- 



Re: Getting the output of a function used in a where clause

From
Tom Lane
Date:
Rod Taylor <pg@rbt.ca> writes:
> You can force it with a subselect though:       
>         SELECT * FROM (SELECT zipcode, zipdist($lat1d,$lon1d,lat,long)
>         as distance
>         from zipcodes) AS tab where distance <= $dist;

The above will *not* stop zipdist from being run twice, because the
planner will happily flatten the subquery into the outer query,
resulting in the same situation of zipdist() being present twice in
the text of the query.

You could force the issue with 
       SELECT * FROM (SELECT zipcode, zipdist($lat1d,$lon1d,lat,long)       as distancefrom zipcodes OFFSET 0) AS tab
wheredistance <= $dist;
 

since LIMIT/OFFSET clauses presently disable the flattening
optimization.  Keep in mind though that the OFFSET is an absolute
optimization fence: it will result in the subquery being evaluated
completely, even if there were outer conditions that might have
avoided the need to look at some rows.  For example, if the query is
       SELECT * FROM (SELECT zipcode, zipdist($lat1d,$lon1d,lat,long)       as distancefrom zipcodes OFFSET 0) AS tab
wheredistance <= $distAND some-other-conditions
 

then not letting the some-other-conditions migrate down below the
evaluation of zipdist could result in making the query be far slower,
not faster, than you would get if you weren't trying to outsmart the
planner.

In general the system operates on the assumption that function calls
are cheap relative to disk I/O.  If that's not true for you, you're
going to have some issues ...
        regards, tom lane


Re: Getting the output of a function used in a where clause

From
PFC
Date:
> Thanks Tom and Rod.
>
> There are indeed several additional conditions on the "real" query which
> prune the search space (I formulate a quick search box and filter on
> Lat/Lon's within the box). Since my user interface limits the search to  
> a 30
> mile radius, there are at most 81 results (in New York city, far fewer,  
> for
Why don't you use a GiST index which will index this bounding box search ?


Re: Getting the output of a function used in a where clause

From
"Bill Lawrence"
Date:
Thanks Tom and Rod.

There are indeed several additional conditions on the "real" query which
prune the search space (I formulate a quick search box and filter on
Lat/Lon's within the box). Since my user interface limits the search to a 30
mile radius, there are at most 81 results (in New York city, far fewer, for
other regions of the US), so I've elected to post process the results in my
script (calculating the distance for each returned record) and display only
the closest 20 results (I intended to use a LIMIT clause combined with an
ORDER BY in my SQL... LOL).

Again, Thanks for all the great advice!


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, April 18, 2005 8:00 PM
To: Rod Taylor
Cc: Bill Lawrence; Scott Marlowe; PFC; pgsql-sql@postgresql.org
Subject: Re: [SQL] Getting the output of a function used in a where clause

Rod Taylor <pg@rbt.ca> writes:
> You can force it with a subselect though:

>         SELECT * FROM (SELECT zipcode, zipdist($lat1d,$lon1d,lat,long)
>         as distance
>         from zipcodes) AS tab where distance <= $dist;

The above will *not* stop zipdist from being run twice, because the
planner will happily flatten the subquery into the outer query,
resulting in the same situation of zipdist() being present twice in
the text of the query.

You could force the issue with
       SELECT * FROM (SELECT zipcode, zipdist($lat1d,$lon1d,lat,long)       as distance       from zipcodes OFFSET 0)
AStab where distance <= $dist;
 

since LIMIT/OFFSET clauses presently disable the flattening
optimization.  Keep in mind though that the OFFSET is an absolute
optimization fence: it will result in the subquery being evaluated
completely, even if there were outer conditions that might have
avoided the need to look at some rows.  For example, if the query is
       SELECT * FROM (SELECT zipcode, zipdist($lat1d,$lon1d,lat,long)       as distance       from zipcodes OFFSET 0)
AStab where distance <= $dist       AND some-other-conditions
 

then not letting the some-other-conditions migrate down below the
evaluation of zipdist could result in making the query be far slower,
not faster, than you would get if you weren't trying to outsmart the
planner.

In general the system operates on the assumption that function calls
are cheap relative to disk I/O.  If that's not true for you, you're
going to have some issues ...
                       regards, tom lane