Re: GSoC proposal. Index-only scans for GIST - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: GSoC proposal. Index-only scans for GIST
Date
Msg-id 53287F6B.7040707@agliodbs.com
Whole thread Raw
In response to GSoC proposal. Index-only scans for GIST  (Anastasia Lubennikova <lubennikovaav@gmail.com>)
Responses Re: GSoC proposal. Index-only scans for GIST
List pgsql-hackers
Alexander,

Can you comment on the below proposal?  I'd like your opinion on how
difficult it will be.

On 03/18/2014 06:12 AM, Anastasia Lubennikova wrote:
> Hello!
> Here is the text of my proposal which I've applied to GSoC.
> (and link
> http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/lubennikovaav/5629499534213120)
> Any suggestions and comments are welcome.
> 
> *Project name*
> 
> Support for index-only scans for GIST index
> 
> 
> 
> *Brief review*
> 
> Currently GiST index don't have index-only scan functionality. Index-only
> scans are a major performance feature added to Postgres 9.2. They allow
> certain types of queries to be satisfied just by retrieving data from
> indexes, and not from tables. This feature for B-trees (implemented since
> PostgreSQL-9.2) allows doing certain types of queries significantly faster.
> This is achieved by reduction in the amount of I/O necessary to satisfy
> queries. I think it will be useful to implement this feature for user
> defined data types that use GiST index.
> 
> 
> 
> *Benefits to the PostgreSQL Community*
> 
> Faster GiST index search would be actual for many PostgreSQL applications
> (for example some GIS systems).
> 
> 
>  *Quantifiable results*
> 
> Acceleration of GiST index search.
> 
> 
> *Project details*
> 
> 1. The GiST is a balanced tree structure like a B-tree, containing <key,
> pointer> pairs. GiST key is a member of a user-defined class, and
> represents some property that is true of all data items reachable from the
> pointer associated with the key. The GiST provides a possibility to create
> custom data types with indexed access methods and extensible set of queries.
> 
> There are seven methods that an index operator class for GiST must provide,
> and an eighth that is optional.
> 
> -consistent
> 
> -union
> 
> -compress
> 
> -decompress
> 
> -penalty
> 
> -picksplit
> 
> -equal
> 
> -distance (optional)
> 
> I'm going to create new optional method fetch() in addition to existing.
> Thus user can create a method of retrieving data from the index without
> loss. It will be used when performing search queries to speed data
> retrieving.
> 
> 
> 
> 2. gistget algorithm (several parts omitted):
> 
> Check the key
> gistindex_keytest() - does this index tuple satisfy the scan key(s)?
> 
> Scan all items on the GiST index page and insert them into the queue (or
> directly to output areas)
> 
> plain scan
> 
> Heap tuple TIDs are returned into so->pageData[]
> 
> ordered scan
> 
> Heap tuple TIDs are pushed into individual search queue items
> 
> If the fetch() is specified by the developer, then using it, algorithm can
> retrieve the data directly to output areas at this stage, without reference
> to the heap.
> 
> 
> 3. Create function gistcanreturn to check whether fetch() is specified by
> user.
> 
> Amcanreturn - Function to check whether index supports index-only scans, or
> zero if none
> 
> There is the question of support index-only scans for multicolumn indexes.
> Probably it would require extend the interface to add separate columns
> checking.
> 
> To solve this question I need the community's help.
> 
> 
> 4. Add details for index only scans into gistcostestimate function
> 
> 
> 
>  *Links*
> 
> 1)     Hellerstein J. M., Naughton J. F., Pfeffer A. Generalized search
> trees for database systems. - September, 1995.
> 
> 2)     http://www.sai.msu.su/~megera/postgres/gist/
> 
> 3)     PostgreSQL 9.3.3 Documentation: chapters 11. Indexes, 55. GiST
> Indexes.
> 


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: json/jsonb/hstore operator precedence
Next
From: Tom Lane
Date:
Subject: Re: Portability issues in shm_mq