Thread: Documentation: GiST extension implementation
Hi, The following documentation page explains the GiST API to extension authors: http://www.postgresql.org/docs/current/static/gist-implementation.htm I think we should be a little more verbose, and at least explain the big picture better: same/consistent/union are responsible for the correctness of the index, while penalty and picksplit are responsible for its performance, which leaves compress/decompress, to be used when leaves and nodes are not of the same datatype. This leaf/node construct is explained in the last paragraph of the following page, but can also exist directly in the C module: http://www.postgresql.org/docs/current/static/xindex.html Consistent and union deserve a lot of attention, and exactly when your operators need RECHECK is still unclear to me. It's hard to give precise advice about consistent/union in a generic way, but I've been given the following general rule (thanks RhodiumToad): (z is consistent with x) implies (z is consistent with union(x,y)) Memory management is unclear too: when to palloc() and when to reuse the arguments given by the core GiST support functions. I know it was a game of trial and error to get it right, and while I know it's working now, I'd be in a bad position to explain how and why. Maybe reading the source code is the thing to do here, but a documentation page detailing what the API expects wouldn't hurt... Regards, -- dim I didn't propose a real doc patch mainly because English isn't my native language, and while you'll have to reword the content...
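A minimal sketch of the general rule quoted above, assuming a toy opclass whose keys are closed integer intervals and whose only operator is "overlaps". The function names mirror the GiST support functions, but this is plain Python rather than the PostgreSQL C API; the `Interval` type, `rule_holds`, and `broken_union` are hypothetical names introduced only for illustration.

```python
from itertools import product
from typing import Callable, Iterable, Tuple

# Hypothetical key type for this sketch: a closed integer interval (lo, hi).
Interval = Tuple[int, int]


def union(a: Interval, b: Interval) -> Interval:
    """Smallest interval covering both keys."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def consistent(key: Interval, query: Interval) -> bool:
    """True if entries summarized by `key` may match an "overlaps" query."""
    return key[0] <= query[1] and query[0] <= key[1]


def penalty(orig: Interval, new: Interval) -> int:
    """How far `orig` must widen to absorb `new` (tree quality, not correctness)."""
    grown = union(orig, new)
    return (grown[1] - grown[0]) - (orig[1] - orig[0])


def broken_union(a: Interval, b: Interval) -> Interval:
    """Deliberately wrong: forgets `a`, so searches could miss rows under it."""
    return b


def rule_holds(
    union_fn: Callable[[Interval, Interval], Interval],
    keys: Iterable[Interval],
    queries: Iterable[Interval],
) -> bool:
    """Check: (z consistent with x) implies (z consistent with union(x, y))."""
    keys, queries = list(keys), list(queries)
    return all(
        consistent(union_fn(x, y), z)
        for x, y in product(keys, repeat=2)
        for z in queries
        if consistent(x, z)
    )


# Every interval with endpoints in 0..4, used as both keys and queries.
KEYS = [(lo, hi) for lo in range(5) for hi in range(lo, 5)]
```

With these definitions, `rule_holds(union, KEYS, KEYS)` is true while `rule_holds(broken_union, KEYS, KEYS)` is false. In this toy model, a bad `penalty` would only make the tree worse, whereas a bad `union` or `consistent` would make searches return wrong results.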
On Wednesday 29 April 2009 16:43:44 Dimitri Fontaine wrote: > The following documentation page explains the GiST API to extensions > authors: > I think we should be a little more verbose, > I didn't propose a real doc patch mainly because English isn't my native > language, and while you'll have to reword the content... There aren't a lot of people who have the experience to write that documentation. So if you want to improve it, you will have to write it, or at least organize the outline. Others can help clean it up.
Hi, On 4 May 09 at 14:24, Peter Eisentraut wrote: > There aren't a lot of people who have the experience to write that > documentation. So if you want to improve it, you will have to write > it, or at least organize the outline. Others can help clean it up. For the record, here's the current version of the documentation patch, which still needs these items to be worked out:
- patch format
- proper English review
- consistent signature update for 8.4 (recheck arg)
- compress/decompress non-void example
- multi-column support in picksplit
I intend to try to work out those points, but they all concern areas I don't (yet) know about. Of course I won't complete the second point myself. Regards, -- dim
Attachment
Hi, I've been working some more on this documentation: On 20 May 09 at 18:10, Dimitri Fontaine wrote: > - consistent signature update for 8.4 (recheck arg) > - compress/decompress non void example This means the multi-column support part is still missing, as is a proper English review. But I think that should not stop us from considering the patch either for 8.4 (documentation patches are, I believe, accepted this late in the game) or for 8.5, in which case I'll put it on the CommitFest page. Even without touching the multi-column aspect of GiST custom support, I think it improves this part of the documentation a lot. <why care?> Please consider it: it's not coming from nowhere. All examples are based on real working code, prefix.c (to be found on pgfoundry, has already served more than 70 million GiST lookups, and is used in several telco companies) and gistproc.c, and I benefited from Teodor's and Andrew Gierth's advice and comments. </> Regards, -- dim
Attachment
Dimitri Fontaine <dfontaine@hi-media.com> writes: > I've been working some more on this documentation: Applied with some editorialization. It seems to me it could still do with a lot more detail to specify what API the functions are really expected to implement. regards, tom lane
Hi, On 12 June 09 at 21:49, Tom Lane wrote: > Applied with some editorialization. Thanks a lot :) > It seems to me it could still do > with a lot more detail to specify what API the functions are really > expected to implement. I'm sorry, I'm not following... I guess you're talking about a better high-level view of things? Like describing GiST itself, the way it's done at the following link, but condensed into one or two paragraphs? http://gist.cs.berkeley.edu/gist1.html I'll be happy to work on improving this documentation some more (but won't be around next week)... Regards, -- dim
Dimitri Fontaine <dfontaine@hi-media.com> writes: > On 12 June 09 at 21:49, Tom Lane wrote: >> It seems to me it could still do >> with a lot more detail to specify what API the functions are really >> expected to implement. > I'm sorry I'm not following... I guess you're talking about a better > high-level view of things? Like describing GiST itself, the way it's > done in the following link, but reduced in one or two paragraphs? > http://gist.cs.berkeley.edu/gist1.html No, we already have that level of detail (some of it word for word in fact); and it's not all that important for opclass authors to know how GiST works anyway. What's bothering me is the fuzziness of the API specifications for the support functions. It's not real clear for example what you have to do to have an index storage type different from the column datatype, and even less clear which type the same() function is comparing. Having some skeletons that execute magic bits of undocumented code is not a substitute for a specification. regards, tom lane
On 12 June 09 at 23:20, Tom Lane wrote: > Dimitri Fontaine <dfontaine@hi-media.com> writes: >> On 12 June 09 at 21:49, Tom Lane wrote: >>> It seems to me it could still do >>> with a lot more detail to specify what API the functions are really >>> expected to implement. > > What's bothering me is the fuzziness of the API > specifications for the support functions. It's not real clear for > example what you have to do to have an index storage type different > from the column datatype, and even less clear which type the same() > function is comparing. Having some skeletons that execute magic bits of > undocumented code is not a substitute for a specification. Oh yes, that wasn't easy to guess: I had to look at other implementations and then run some tests (trial and error) to work this out. Andrew Gierth has been really helpful here, and his ip4r module is a good example (though without varlena). I'll try to provide something here; what I'm trying to say is that I need some help and research (and core code reading) to reverse-engineer the specs. Regards, -- dim