Re: GIST and TOAST - Mailing list pgsql-hackers

From Teodor Sigaev
Subject Re: GIST and TOAST
Date
Msg-id 45ED838E.5000204@sigaev.ru
In response to Re: GIST and TOAST  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: GIST and TOAST
List pgsql-hackers
> The problem is that this is the only place in the code where we make wholesale
> assumptions that a datum that comes from a tuple (heap tuple or index tuple)
> isn't toasted. There are other places but they're always flagged with big
> comments explaining *why* the datum can't be toasted and they're minor
> localized instances, not a whole subsystem.
> 
> This was one of the assumptions that the packed varlena code depended on: that
> anyone looking at a datum from a tuple would always detoast it even if they
> had formed the tuple themselves and never passed it through the toaster. The
> *only* place this has come up as a problem is in GIST.

I'm afraid we have a misunderstanding here. The flow of operations on an
index tuple in GiST is:
- read the tuple
- get the n-th attribute with the help of index_getattr
- call the user-defined decompress method, which should, at the least, detoast
  the value
- pass the resulting value to the other user-defined methods

Any new value produced by a user-defined GiST method must be compressed by the
user-defined compress method before it is packed into a tuple. The compress
method should not toast the value; that is not its task.

New values are always passed through the compress method before insertion. See
gistinsert in gist.c and gistFormTuple in gistutil.c.

So index_form_tuple should toast the value, but by that point the value is
already compressed and lives in memory. Detoasting should be done by the
decompress method, whose result also lives in memory, and only after that can
the value be passed to the other user-defined methods.
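
To make the division of labour concrete, here is a minimal standalone sketch
of the write and read paths described above. It is a toy model, not
PostgreSQL code: ToyValue, the toy_* functions, the XOR "toasting" and the
16-byte threshold are all assumptions made up for illustration. The point is
only the ordering: compress never toasts, the tuple-forming step does, and
decompress hands back a plain in-memory value.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX 64
#define TOY_TOAST_THRESHOLD 16  /* hypothetical size limit, chosen for the demo */

/* Hypothetical stand-in for a varlena value; not a PostgreSQL type. */
typedef struct {
    bool          toasted;      /* true: data[] holds the encoded stored form */
    size_t        len;
    unsigned char data[TOY_MAX];
} ToyValue;

/* User-defined compress method: may change the representation, never toasts.
 * This toy version simply copies the value unchanged. */
ToyValue toy_compress(const ToyValue *in)
{
    ToyValue out = *in;
    assert(!in->toasted);
    return out;
}

/* Stand-in for the tuple-forming step, which is where toasting happens.
 * "Toasting" here is a byte-wise XOR applied to values over the threshold. */
ToyValue toy_form_tuple(const ToyValue *compressed)
{
    ToyValue stored = *compressed;
    if (compressed->len > TOY_TOAST_THRESHOLD) {
        for (size_t i = 0; i < stored.len; i++)
            stored.data[i] ^= 0x5A;
        stored.toasted = true;
    }
    return stored;
}

/* User-defined decompress method: detoasts into a plain in-memory copy. */
ToyValue toy_decompress(const ToyValue *stored)
{
    ToyValue out = *stored;
    if (stored->toasted) {
        for (size_t i = 0; i < out.len; i++)
            out.data[i] ^= 0x5A;
        out.toasted = false;
    }
    return out;
}

/* Build a plain (untoasted) value from a string. */
ToyValue toy_from_string(const char *s)
{
    ToyValue v;
    memset(&v, 0, sizeof v);
    v.len = strlen(s);
    assert(v.len <= TOY_MAX);
    memcpy(v.data, s, v.len);
    return v;
}
```

Feeding a long value through toy_compress, toy_form_tuple and toy_decompress
in that order returns the original bytes, and only the middle step ever sets
the toasted flag.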

As I understand it, packing and unpacking of the varlena header is done during
toasting/detoasting. So I don't understand what the problem is here.

What is more, the GiST API doesn't limit the type of the keys passed between
user-defined GiST methods. It only says that a new value should be of the type
on which the opclass was defined, and that the output of the compress method
should be of the type given by the STORAGE option in CREATE OPERATOR CLASS.
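
As an illustration of the two types involved, here is a small standalone
sketch in which the indexed type and the stored type differ. Everything in it
is hypothetical (ToyIntSet, ToyRange and the toy_* functions are not part of
PostgreSQL): compress takes the "opclass type" and returns the "STORAGE type",
and the other methods then work on the stored representation only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "opclass type": what the indexed column holds. */
typedef struct {
    const int *items;
    size_t     n;       /* must be >= 1 in this toy */
} ToyIntSet;

/* Hypothetical "STORAGE type": what compress outputs and the index stores. */
typedef struct {
    int lo;
    int hi;
} ToyRange;

/* compress: opclass type in, STORAGE type out (a lossy summary of the set). */
ToyRange toy_compress(const ToyIntSet *set)
{
    ToyRange r;
    assert(set->n >= 1);
    r.lo = r.hi = set->items[0];
    for (size_t i = 1; i < set->n; i++) {
        if (set->items[i] < r.lo) r.lo = set->items[i];
        if (set->items[i] > r.hi) r.hi = set->items[i];
    }
    return r;
}

/* Union of two keys: works purely on the stored representation. */
ToyRange toy_union(ToyRange a, ToyRange b)
{
    ToyRange r;
    r.lo = a.lo < b.lo ? a.lo : b.lo;
    r.hi = a.hi > b.hi ? a.hi : b.hi;
    return r;
}

/* consistent: could an entry under 'key' contain the value 'query'? */
bool toy_consistent(ToyRange key, int query)
{
    return query >= key.lo && query <= key.hi;
}
```

Because the summary is lossy, toy_consistent can only answer "maybe"; in this
toy a hit would still have to be rechecked against the original set.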

> There may be places that assume they won't leak detoasted copies of datums. If
> you could help point those places out they should just need PG_FREE_IF_COPY()

GiST code works in a separate memory context to prevent memory leaks. See
gistinsert/gistbuildCallback/gistfindnext.
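
The idea can be sketched with a toy arena, assuming nothing about the real
implementation: ToyContext, toy_alloc and toy_reset below are hypothetical
stand-ins, far simpler than PostgreSQL's MemoryContext machinery. A
user-defined method allocates a copy and never frees it; the caller resets the
scratch context afterwards, so nothing leaks even without PG_FREE_IF_COPY().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every allocation so the context can find it. */
typedef struct ToyChunk {
    struct ToyChunk *next;
} ToyChunk;

/* Hypothetical toy context: a linked list of chunks plus a live counter. */
typedef struct {
    ToyChunk *head;
    size_t    live;     /* number of chunks currently allocated */
} ToyContext;

/* Allocate from the context; callers never free individual chunks. */
void *toy_alloc(ToyContext *cxt, size_t size)
{
    ToyChunk *c = malloc(sizeof(ToyChunk) + size);
    if (c == NULL)
        abort();
    c->next = cxt->head;
    cxt->head = c;
    cxt->live++;
    return c + 1;       /* usable memory starts right after the header */
}

/* Release everything allocated since the last reset. */
void toy_reset(ToyContext *cxt)
{
    while (cxt->head != NULL) {
        ToyChunk *next = cxt->head->next;
        free(cxt->head);
        cxt->head = next;
    }
    cxt->live = 0;
}

/* A "user-defined method" that makes a copy and never frees it. */
size_t toy_leaky_method(ToyContext *cxt, const char *value)
{
    size_t len = strlen(value);
    char *copy = toy_alloc(cxt, len + 1);
    memcpy(copy, value, len + 1);
    return len;
}

/* The caller runs the method in a scratch context and resets it afterwards. */
size_t toy_call_in_scratch(ToyContext *scratch, const char *value)
{
    size_t result = toy_leaky_method(scratch, value);
    toy_reset(scratch);
    assert(scratch->live == 0);   /* nothing survives the call */
    return result;
}
```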

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 

