Re: Index-only scans for GIST - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Index-only scans for GIST |
Date | |
Msg-id | 53E2906E.8080707@vmware.com Whole thread Raw |
In response to | Index-only scans for GIST (Anastasia Lubennikova <lubennikovaav@gmail.com>) |
Responses |
Re: Index-only scans for GIST
|
List | pgsql-hackers |
On 08/01/2014 10:58 AM, Anastasia Lubennikova wrote: > Hi, hackers! > I work on a GSoC project "Index-only scans for GIST" > https://wiki.postgresql.org/wiki/Support_for_Index-only_scans_for_GIST_GSoC_2014 > > Repository is > https://github.com/lubennikovaav/postgres/tree/indexonlygist2 > Patch is in attachments. Thanks! Some comments: * I got this compiler warning: gistget.c:556:5: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] ListCell *tmpPageData = so->curPageData; ^ * I'm getting two regression failures with this (opr_sanity and join). * After merging with master, build fails because of duplicate OIDs. * The regression test queries that use LIMIT are not guaranteed to always return the same rows, hence they're not very good regression test cases. I'd suggest using more restricting WHERE clauses, so that each query only returns a handful of rows. * What's the reason for turning GISTScanOpaqueData.pageData from an array to a List? * I think it's leaking memory, in GIST scan context. I tested this with a variant of the regression tests: insert into gist_tbl select box(point(0.05*i, 0.05*i), point(0.05*i, 0.05*i)), point(0.05*i, 0.05*i) FROM generate_series(0, 10000000) as i; CREATE INDEX gist_tbl_point_index ON gist_tbl USING gist (p); set enable_seqscan=off; set enable_bitmapscan=off; explain analyze select p from gist_tbl where p <@ box(point(0,0), point(9999999,9999999)) and length(p::text) < 10; while the final query runs, 'top' shows constantly increasing memory usage. > It includes index-only scans for multicolumn GIST and new regression test. > Fetch() method is realized for box and point opclasses. Can we have Fetch functions for all the datatypes in btree_gist contrib module, please? Do other contrib modules contain GiST opclasses that could have Fetch functions? > Documentation is not updated yet, but I'm going to do it till the end of > GSoC. > > I've got one question about query with OR condition. It is the last query > in regression test "gist_indexonly". It doesn't fail but it doensn't use > index-only scans. Could someone explain to me how it works? > It seems to depend on build_paths_for_OR > <http://doxygen.postgresql.org/indxpath_8c.html#ae660d2e886355e53ed3b9ec693e4afd2> > function. > But I couldn't understand how. The query is: select * from gist_tbl where b <@ box(point(5,5), point(6,6)) or p <@ box(point(0,0), point(100,100)) limit 10; It cannot use an index(-only) scan for this, because a single index scan can only return rows based on one key. In this case, you need to do two scans, and then return the rows returned by either scan, removing duplicates. A bitmap scan is possible, because it can remove the duplicates, but the planner can't produce a plain index scan plan that would do the same. A common trick when that happens in a real-world application is to re-write the query using UNION: select * from gist_tbl where b <@ box(point(5,5), point(6,6)) UNION select * from gist_tbl where p <@ box(point(0,0), point(100,100)) limit 10; Although that doesn't seem to actually work: ERROR: could not identify an equality operator for type box LINE 1: select * from gist_tbl ^ but that's not your patch's fault, the same happens with unpatched master. IOW, you don't need to worry about that case. - Heikki
pgsql-hackers by date: