Thread: Documentation: GiST extension implementation

Documentation: GiST extension implementation

From
Dimitri Fontaine
Date:
Hi,

The following documentation page explains the GiST API to extension authors:
http://www.postgresql.org/docs/current/static/gist-implementation.htm

I think we should be a little more verbose, and at least explain more of
the big picture: same/consistent/union are responsible for the correctness of
the index, while penalty and picksplit are responsible for its performance.
That leaves compress/decompress, to be used when leaves and nodes are not the
same datatype.

This leaf/node construct is explained in the last paragraph of the following page,
but can exist directly in the C module too: http://www.postgresql.org/docs/current/static/xindex.html

The consistent and union functions should get a lot of attention, and exactly
when your operators need RECHECK is still unclear to me. It's hard to give
precise advice about consistent/union in a generic way, but I've been given the
following general rule (thanks RhodiumToad): (z is consistent with x) implies (z is consistent with union(x,y))

What's also unclear is memory management: when to palloc() and when to reuse
arguments given by -core GiST support functions. I know it was a game of trial
and error to get it right, and while I know it's working now, I'd be in a bad
position to explain how and why. Maybe reading the source code is the thing to
do here, but a documentation page detailing what the API expects wouldn't
hurt...

Regards,
--
dim

I didn't propose a real doc patch mainly because English isn't my native
language, and you'll have to reword the content anyway...

Re: Documentation: GiST extension implementation

From
Peter Eisentraut
Date:
On Wednesday 29 April 2009 16:43:44 Dimitri Fontaine wrote:
> The following documentation page explains the GiST API to extensions
> authors:

> I think we should be a little more verbose,

> I didn't propose a real doc patch mainly because english isn't my native 
> language, and while you'll have to reword the content...

There aren't a lot of people who have the experience to write that 
documentation.  So if you want to improve it, you will have to write it, or at 
least organize the outline.  Others can help cleaning it up.



Re: Documentation: GiST extension implementation

From
Dimitri Fontaine
Date:
Hi,

On 4 May 09 at 14:24, Peter Eisentraut wrote:
> There aren't a lot of people who have the experience to write that
> documentation.  So if you want to improve it, you will have to write
> it, or at
> least organize the outline.  Others can help cleaning it up.

For the record, here's the current version of the documentation patch,
which still needs these items to be worked out:
  - patch format
  - proper English review
  - consistent signature update for 8.4 (recheck arg)
  - compress/decompress non void example
  - multi column support in picksplit

I intend to try and work out those points, but it's all about areas I
don't (yet) know about. Of course I won't complete the second point
myself.

Regards,
--
dim



Attachment

Re: Documentation: GiST extension implementation

From
Dimitri Fontaine
Date:
Hi,

I've been working some more on this documentation:

On 20 May 09 at 18:10, Dimitri Fontaine wrote:
> - consistent signature update for 8.4 (recheck arg)
> - compress/decompress non void example

This means the multi-column support part is still missing, as is a proper
English review.

But I think that should not stop us from considering the patch either for 8.4
(documentation patches, I believe, are accepted this late in the game) or
for 8.5, in which case I'll put it on the CommitFest page. Even
without touching the multi-column aspect of GiST custom support, I
think it improves this part of the documentation a lot.

<why care?>
Please consider it: it's not coming from nowhere. All examples are
based on real working code, prefix.c (to be found on pgfoundry; it has
already served more than 70 million GiST lookups and is used in several
telco companies) and gistproc.c, and I enjoyed Teodor's and Andrew Gierth's
advice and comments.
</>

Regards,
--
dim



Attachment

Re: Documentation: GiST extension implementation

From
Tom Lane
Date:
Dimitri Fontaine <dfontaine@hi-media.com> writes:
> I've been working some more on this documentation:

Applied with some editorialization.  It seems to me it could still do
with a lot more detail to specify what API the functions are really
expected to implement.
        regards, tom lane


Re: Documentation: GiST extension implementation

From
Dimitri Fontaine
Date:
Hi,

On 12 June 09 at 21:49, Tom Lane wrote:
> Applied with some editorialization.

Thanks a lot :)

>  It seems to me it could still do
> with a lot more detail to specify what API the functions are really
> expected to implement.

I'm sorry, I'm not following... I guess you're talking about a better
high-level view of things? Like describing GiST itself, the way it's
done at the following link, but condensed into one or two paragraphs? http://gist.cs.berkeley.edu/gist1.html

I'll be happy to work on improving this documentation some more (but
won't be there next week)...

Regards,
--
dim

Re: Documentation: GiST extension implementation

From
Tom Lane
Date:
Dimitri Fontaine <dfontaine@hi-media.com> writes:
> On 12 June 09 at 21:49, Tom Lane wrote:
>> It seems to me it could still do
>> with a lot more detail to specify what API the functions are really
>> expected to implement.

> I'm sorry I'm not following... I guess you're talking about a better  
> high-level view of things? Like describing GiST itself, the way it's  
> done in the following link, but reduced in one or two paragraphs?
>    http://gist.cs.berkeley.edu/gist1.html

No, we already have that level of detail (some of it word for word in
fact); and it's not all that important for opclass authors to know how
GIST works anyway.  What's bothering me is the fuzziness of the API
specifications for the support functions.  It's not real clear for
example what you have to do to have an index storage type different from
the column datatype, and even less clear which type the same() function
is comparing.  Having some skeletons that execute magic bits of
undocumented code is not a substitute for a specification.
        regards, tom lane


Re: Documentation: GiST extension implementation

From
Dimitri Fontaine
Date:
On 12 June 09 at 23:20, Tom Lane wrote:
> Dimitri Fontaine <dfontaine@hi-media.com> writes:
>> On 12 June 09 at 21:49, Tom Lane wrote:
>>> It seems to me it could still do
>>> with a lot more detail to specify what API the functions are really
>>> expected to implement.
>
> What's bothering me is the fuzziness of the API
> specifications for the support functions.  It's not real clear for
> example what you have to do to have an index storage type different
> from
> the column datatype, and even less clear which type the same()
> function
> is comparing.  Having some skeletons that execute magic bits of
> undocumented code is not a substitute for a specification.

Oh yes, that wasn't easy to guess: I had to look at other
implementations and then run some tests (trial and error) to determine this.
Andrew Gierth has been really helpful here, and his ip4r module is a good
example (but without varlena).
I'll try to provide something here; what I'm trying to say is that I
need some help and research (and core code reading) to reverse
engineer the specs.

Regards,
--
dim