Re: Strange issue with GiST index scan taking far too long - Mailing list pgsql-hackers

From Mark Cave-Ayland
Subject Re: Strange issue with GiST index scan taking far too long
Date
Msg-id 484E4683.7030501@siriusit.co.uk
Whole thread Raw
In response to Re: Strange issue with GiST index scan taking far too long  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:

>> So you are saying it is de-toasted 32880 times, in this case? If not,
>> where are the repeated de-toastings happening?
> 
> Inside the index support functions.  I'm thinking we could fix this by
> forcibly detoasting values passed as index scan keys, but it's not quite
> clear where's the best place to do that.

Ouch. This is rapidly getting out of my sphere of knowledge, but I'd 
guess you'd want to do this either just before you start the index scan, 
or cache the results within the AM after the first deTOASTing.

In terms of PostGIS, we tend to do a lot of index queries against large 
geometries so we see cases like this frequently - so optimising them 
would be good.

I did think of another idea though: at the moment all members of the 
GiST opclass for geometry objects are declared using the geometry type 
(which contains the entire geometry), whereas individual entries are 
stored within the index as box2d objects representing just their 
bounding box.

Would it make sense to rework the GiST routines so that instead of 
accepting geometry <op> geometry, they accept box2d <op> box2d? Then 
surely if we add a CAST from geometry to box2d then the geometry would 
get converted to its bounding box (which would not require deTOASTing) 
before being used as an index scan key.


ATB,

Mark.

-- 
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063


pgsql-hackers by date:

Previous
From: Zeugswetter Andreas OSB sIT
Date:
Subject: Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics
Next
From: Gregory Stark
Date:
Subject: Re: Core team statement on replication in PostgreSQL