Strange issue with GiST index scan taking far too long - Mailing list pgsql-hackers

From Mark Cave-Ayland
Subject Strange issue with GiST index scan taking far too long
Date
Msg-id 484D2DBA.9040508@siriusit.co.uk
Whole thread Raw
Responses Re: Strange issue with GiST index scan taking far too long  ("A. Kretschmer" <andreas.kretschmer@schollglas.com>)
Re: Strange issue with GiST index scan taking far too long  (Simon Riggs <simon@2ndquadrant.com>)
Re: Strange issue with GiST index scan taking far too long  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi there,

I'm currently looking at a bug report in PostGIS where we are getting 
extremely long index scan times querying an index in one case, but the 
same scan can take much less time if the input geometry is calculated as 
the result of another function.

First of all, I include the EXPLAIN ANALYZE of the basic query which 
looks like this:


postgis=# explain analyze select count(*) from geography where centroid 
&& (select the_geom from geography where id=69495);
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=7157.29..7157.30 rows=1 width=0) (actual 
 
time=2691.783..2691.784 rows=1 loops=1)   InitPlan     ->  Seq Scan on geography  (cost=0.00..7149.00 rows=1
width=4559)
 
(actual time=60.987..61.373 rows=1 loops=1)           Filter: (id = 69495::numeric)   ->  Index Scan using
geography_geom_centroid_idxon geography 
 
(cost=0.00..8.28 rows=1 width=0) (actual time=79.241..2645.722 
rows=32880 loops=1)         Index Cond: (centroid && $0)         Filter: (centroid && $0) Total runtime: 2692.288 ms
(8 rows)


The only real thing to know about the query is that the id field within 
the geography table is a primary key, and hence only a single geometry 
is being returned from within the subselect. Note that most of the time 
is disappearing into the index scan.

Where things start getting really strange is when we add an extra 
function called force_2d() into the mix. All this function does is scan 
through the single geometry returned from the subselect and remove any 
3rd dimension coordinates. Now the resulting EXPLAIN ANALYZE for this 
query looks like this:


postgis=# explain analyze select count(*) from geography where centroid 
&& (select force_2d(the_geom) from geography where id=69495);
QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=7157.29..7157.30 rows=1 width=0) (actual 
 
time=343.004..343.005 rows=1 loops=1)   InitPlan     ->  Seq Scan on geography  (cost=0.00..7149.00 rows=1 width=4559)

(actual time=48.714..49.016 rows=1 loops=1)           Filter: (id = 69495::numeric)   ->  Index Scan using
geography_geom_centroid_idxon geography 
 
(cost=0.00..8.28 rows=1 width=0) (actual time=49.367..235.296 rows=32880 
loops=1)         Index Cond: (centroid && $0)         Filter: (centroid && $0) Total runtime: 343.084 ms
(8 rows)


So by adding in an extra function around the subselect result, we have 
speeded up the index lookup by several orders of magnitude, and the 
speedup appears to be coming from somewhere within the index scan?! I've 
spent a little bit of time playing with this and it seems even writing a 
function to return a copy of the input function is enough. Here is my 
test function below:


Datum LWGEOM_mcatest(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(LWGEOM_mcatest);
Datum LWGEOM_mcatest(PG_FUNCTION_ARGS)
{        PG_LWGEOM *pgl = (PG_LWGEOM *)                PG_DETOAST_DATUM(PG_GETARG_DATUM(0));        void *mem;
        /* Copy somewhere else */        mem = palloc(VARSIZE(pgl));        memcpy(mem, pgl, VARSIZE(pgl)-VARHDRSZ);
        PG_RETURN_POINTER(mem);
}


CREATE OR REPLACE FUNCTION mcatest(geometry)        RETURNS geometry        AS '$libdir/lwpostgis','LWGEOM_mcatest'
  LANGUAGE 'C';
 


And then here is the resulting EXPLAIN ANALYZE:

postgis=# explain analyze select count(*) from geography where centroid 
&& (select mcatest(the_geom) from geography where id=69495);
QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=7157.29..7157.30 rows=1 width=0) (actual 
 
time=283.126..283.127 rows=1 loops=1)   InitPlan     ->  Seq Scan on geography  (cost=0.00..7149.00 rows=1 width=4559)

(actual time=48.712..49.040 rows=1 loops=1)           Filter: (id = 69495::numeric)   ->  Index Scan using
geography_geom_centroid_idxon geography 
 
(cost=0.00..8.28 rows=1 width=0) (actual time=49.321..215.524 rows=32880 
loops=1)         Index Cond: (centroid && $0)         Filter: (centroid && $0) Total runtime: 283.221 ms
(8 rows)


Unfortunately I can't seem to work out why the extra time is 
disappearing into the index scan when my extra mcatest() function is not 
present, especially as sprof doesn't seem to want to run at the moment 
:(  I'm wondering if it's related to either excess TOAST/palloc/pfree 
somewhere in the code, but I'd definitely appreciate any pointers.

All these tests were done using PostgreSQL 8.3.1 and the latest PostGIS SVN.


Many thanks,

Mark.

-- 
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063


pgsql-hackers by date:

Previous
From: Decibel!
Date:
Subject: Re: proposal: add window function to 8.4
Next
From: Andrew Dunstan
Date:
Subject: Re: pg_dump restore time and Foreign Keys