Geometric bewilderment - Mailing list pgsql-hackers

From Paul Matthews
Subject Geometric bewilderment
Date
Msg-id 4A8BC607.6070306@netspace.net.au
Whole thread Raw
Responses Re: Geometric bewilderment  (David Fetter <david@fetter.org>)
List pgsql-hackers
Suspect that attaching large amounts of code is a breach of etiquette. So apologies in advance.

Needed to write a contains(polygon,point) function that gave a level of accuracy greater than the current 'poly@>point' operator. The new function works just fine. While at it, thought it would be nice to add in all the 'box op point' and 'point op box' routines that are not currently defined.

The code in contains.c implements these functions. In contains.sql you can find how these functions are defined, tided to operators, and then placed in an operator class. (Permission granted to copy any code in here you like, it's all trivial)

The perl script test1 tests all the operators. It compares the output of "box op point" with "box op box(point,point)". It is called with one parameter, the name of the database to use. It creates a table called points within that database. It also creates two files in the current directory a.out and b.out. tkdiff is called to show the differences. Running this shows that all the operators work fine.

The perl script test2 tests 1000 points against 1000 boxes. It is called with one parameter, the name of the database to use. It creates two tables called points and boxes within that database. It also creates two files in the current directory a.out and b.out. tkdiff is called to show the differences. Running this shows that the indexing works fine. (Explain says it uses the index)

Problem:

The following code works in production with 20,000,000 points against 10,000 polygons.
SELECT
      W.geocode,
      F.state_code,
      F.area_code,
      F.area_name
FROM
      work     as W,
      boundary as B,
      features as F
WHERE
      B.boundbox @> box(W.geocode,W.geocode)
  AND contains(B.boundary,W.geocode)
  AND B.boundout = 'T'
  AND (B.feature_id) NOT IN (
      SELECT feature_id
        FROM boundary
       WHERE boundbox @> box(W.geocode,W.geocode)
         AND contains(boundary,W.geocode)
         AND boundout = 'F' )
  AND B.feature_id = F.feature_id;
However the following does not work. It returns an empty set. Note that all that has changed is that "box @> box(point,point)" has been changed to "box @> point". From test1 we know that the operators work, and from test2 we know the indexing works. Confused and bewildered at this point.
SELECT
      W.geocode,
      F.state_code,
      F.area_code,
      F.area_name
FROM
      work     as W,
      boundary as B,
      features as F
WHERE
      B.boundbox @> W.geocode
  AND contains(B.boundary,W.geocode)
  AND B.boundout = 'T'
  AND (B.feature_id) NOT IN (
      SELECT feature_id
        FROM boundary
       WHERE boundbox @> W.geocode
         AND contains(boundary,W.geocode)
         AND boundout = 'F' )
  AND B.feature_id = F.feature_id;

pgsql-hackers by date:

Previous
From: Itagaki Takahiro
Date:
Subject: FDW-based dblink (WIP)
Next
From: Magnus Hagander
Date:
Subject: Re: [BUGS] BUG #4961: pg_standby.exe crashes with no args