Re: Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)
Date
Msg-id 23146.1317337381@sss.pgh.pa.us
Whole thread Raw
In response to Re: Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)
List pgsql-hackers
Alexander Korotkov <aekorotkov@gmail.com> writes:
> On Fri, Sep 30, 2011 at 1:08 AM, Heikki Linnakangas <
> heikki.linnakangas@enterprisedb.com> wrote:
>> At every call, gtrgm_penalty() has to calculate the signature for newitem,
>> using makesign(). That's an enormous waste of effort, but there's currently
>> no way gtrgm_penalty() to avoid that. If we could call makesign() only on
>> the first call in the loop, and remember it for the subsequent calls, that
>> would eliminate the need for any micro-optimization in makesign() and make
>> inserting into a trigram index much faster (including building the index
>> from scratch)

> Isn't it possible to cache signature of newitem in gtrgm_penalty
> like gtrgm_consistent do this for query?

[ studies that code for awhile ... ]  Ick, what a kluge.

The main problem with that code is that the cache data gets leaked at
the conclusion of a scan.  Having just seen the consequences of leaking
the "giststate", I think this is something we need to fix not emulate.

I wonder whether it's worth having the GIST code create a special
scan-lifespan (or insert-lifespan) memory context that could be used
for cached data such as this?  It's already creating a couple of
contexts for its own purposes, so one more might not be a big problem.
We'd have to figure out a way to make that context available to GIST
support functions, though, as well as something cleaner than fn_extra
for them to keep pointers in.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg_upgrade - add config directory setting
Next
From: Michael Paquier
Date:
Subject: Re: pg_regress input/output directory option