Re: GSoC proposal. Index-only scans for GIST - Mailing list pgsql-hackers
From | Josh Berkus |
---|---|
Subject | Re: GSoC proposal. Index-only scans for GIST |
Date | |
Msg-id | 53287F6B.7040707@agliodbs.com Whole thread Raw |
In response to | GSoC proposal. Index-only scans for GIST (Anastasia Lubennikova <lubennikovaav@gmail.com>) |
Responses |
Re: GSoC proposal. Index-only scans for GIST
|
List | pgsql-hackers |
Alexander, Can you comment on the below proposal? I'd like your opinion on how difficult it will be. On 03/18/2014 06:12 AM, Anastasia Lubennikova wrote: > Hello! > Here is the text of my proposal which I've applied to GSoC. > (and link > http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/lubennikovaav/5629499534213120) > Any suggestions and comments are welcome. > > *Project name* > > Support for index-only scans for GIST index > > > > *Brief review* > > Currently GiST index don't have index-only scan functionality. Index-only > scans are a major performance feature added to Postgres 9.2. They allow > certain types of queries to be satisfied just by retrieving data from > indexes, and not from tables. This feature for B-trees (implemented since > PostgreSQL-9.2) allows doing certain types of queries significantly faster. > This is achieved by reduction in the amount of I/O necessary to satisfy > queries. I think it will be useful to implement this feature for user > defined data types that use GiST index. > > > > *Benefits to the PostgreSQL Community* > > Faster GiST index search would be actual for many PostgreSQL applications > (for example some GIS systems). > > > *Quantifiable results* > > Acceleration of GiST index search. > > > *Project details* > > 1. The GiST is a balanced tree structure like a B-tree, containing <key, > pointer> pairs. GiST key is a member of a user-defined class, and > represents some property that is true of all data items reachable from the > pointer associated with the key. The GiST provides a possibility to create > custom data types with indexed access methods and extensible set of queries. > > There are seven methods that an index operator class for GiST must provide, > and an eighth that is optional. > > -consistent > > -union > > -compress > > -decompress > > -penalty > > -picksplit > > -equal > > -distance (optional) > > I'm going to create new optional method fetch() in addition to existing. > Thus user can create a method of retrieving data from the index without > loss. It will be used when performing search queries to speed data > retrieving. > > > > 2. gistget algorithm (several parts omitted): > > Check the key > gistindex_keytest() - does this index tuple satisfy the scan key(s)? > > Scan all items on the GiST index page and insert them into the queue (or > directly to output areas) > > plain scan > > Heap tuple TIDs are returned into so->pageData[] > > ordered scan > > Heap tuple TIDs are pushed into individual search queue items > > If the fetch() is specified by the developer, then using it, algorithm can > retrieve the data directly to output areas at this stage, without reference > to the heap. > > > 3. Create function gistcanreturn to check whether fetch() is specified by > user. > > Amcanreturn - Function to check whether index supports index-only scans, or > zero if none > > There is the question of support index-only scans for multicolumn indexes. > Probably it would require extend the interface to add separate columns > checking. > > To solve this question I need the community's help. > > > 4. Add details for index only scans into gistcostestimate function > > > > *Links* > > 1) Hellerstein J. M., Naughton J. F., Pfeffer A. Generalized search > trees for database systems. - September, 1995. > > 2) http://www.sai.msu.su/~megera/postgres/gist/ > > 3) PostgreSQL 9.3.3 Documentation: chapters 11. Indexes, 55. GiST > Indexes. > -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
pgsql-hackers by date: