Re: NULL's support in SP-GiST - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: NULL's support in SP-GiST |
Date | |
Msg-id | 7324.1331333072@sss.pgh.pa.us Whole thread Raw |
In response to | NULL's support in SP-GiST (Oleg Bartunov <oleg@sai.msu.su>) |
Responses |
Re: NULL's support in SP-GiST
|
List | pgsql-hackers |
Oleg Bartunov <oleg@sai.msu.su> writes: > attached patch introduces NULLs indexing for SP-GiST. With this patch > Sp-GiST supports IS NULL, IS NOT NULL clauses, as well as full index scan. I've looked at this patch a bit. I share Jaime's extreme discomfort with re-using GIN code to handle some pages of an SPGist index. Making that code serve two masters is going to bite us on the rear sooner or later, probably sooner. I must also object in the strongest terms to the proposed rearrangement of SPGiST page special space to make it sort-of-compatible with GIN special space. That will entirely break tools such as pg_filedump, which needs the special space to be visibly different from GIN, or it won't know how to print the page contents. The other aspect I don't particularly like is the proposed changes to the opclass interface API. Adding a "satisfyAll" field seems like just a kluge. Also, it does nothing to fix the complaints I had in http://archives.postgresql.org/pgsql-hackers/2011-12/msg00804.php about the search API being inherently inefficient for multiple scan keys, because it forces repeat reconstruction of the indexed value. I think a better fix for the opclass API would be to do what I suggested there: > * Perhaps it'd be a good idea to move the loop over scankeys to inside > the opclass consistent methods, ie call them just once to check all the > scankeys. Then we could meaningfully define zero scankeys as a full > index scan, and we would also get rid of redundant value reconstruction > work when there's more than one scankey. I'm less sure about what to do to store nulls, but one idea is to have a separate SPGiST tree storing only nulls and descending from its own root page, similar to the idea in this patch of having a separate root page for nulls. It'd be a tad less efficient than GIN-based storage for large numbers of nulls, but you probably don't want to use SPGiST to index columns with lots of nulls anyway. Normally, if I felt that a patch needed to be thrown away and rewritten, I'd just bounce it back to the author for rework. However, in this case we are under a time crunch, and I feel that it's critical that we try to get both the opclass API and the on-disk format right for 9.2. It will be much harder to change either thing once we release. So I'm willing to spend some time rewriting the patch according to these ideas, and will go off and do that if there are not objections. regards, tom lane
pgsql-hackers by date: