I have this QUERY (below) that PostgreSQL 7.3.X servers run MUCH slower
than
the 7.1.3 server does. It makes sense that both servers have to do a
sequential scan over the ZIPCODE column. There are over 7,500 rows in the
LOCATIONS table.
Does anyone know what changed in the planner or optimizer? Can I change
the
postgresql.conf file to improve 7.3.3 performance?
Situation
---------
Here is the situation...
PG 7.1.3 returns QUERY in about 4 seconds. The EXPLAIN plan says it uses
the index on country.
PG 7.3.3 simply does not return QUERY. I've waited over 3 minutes. With
the extra condition WHERE STATE = 'NJ' it takes almost 5 seconds. Other
states are much worse.
QUERY
-----
SELECT ZIPCODE
FROM LOCATIONS
WHERE
COUNTRY = 'USA' AND ZIP_DIST_MI('07306', ZIPCODE) <= 25;
The function is written in C, using SPI. Given two US ZIP codes, it
returns the distance in miles. For example, it is 78 miles from Jersey
City to Philadelphia:
db=> select ZIP_DIST_MI('07306', '19130'); zip_dist_mi
-----------------78.801595557406
(1 row)
ZIP_DIST_MI() uses a geo_zipdata table to get the latitude and longitude.
Using those, it can calculate the "great circle distance" between ZIPs
with C double arithmetic. It finds the ZIPs locations with a prepared
(and saved) SPI query, which uses an index:
"select latitude, longitude from geo_zipdata where zip = $1"
FUNCTION
--------
CREATE FUNCTION ZIP_DIST_MI(TEXT, TEXT)
RETURNS DOUBLE PRECISION...
ZIP DATA TABLE
--------------
CREATE TABLE GEO_ZIPDATA (ZIP VARCHAR(5) NOT NULL,STATE VARCHAR(2) NOT NULL,CITY VARCHAR(64) NOT
NULL,COUNTY VARCHAR(64) NOT NULL,LATITUDE FLOAT NOT NULL,LONGITUDE FLOAT NOT NULL,FIPS NUMERIC(10) NOT
NULL
);
CREATE INDEX GEO_ZIPDATA_ZIP_IDX ON GEO_ZIPDATA (ZIP);