Re: Fixing GIN for empty/null/full-scan cases - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Fixing GIN for empty/null/full-scan cases
Date
Msg-id AANLkTimdUq9iaDem_PnvQbe=CR2aPF2DpZxp9H+x4MND@mail.gmail.com
In response to Re: Fixing GIN for empty/null/full-scan cases  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Fixing GIN for empty/null/full-scan cases  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Fixing GIN for empty/null/full-scan cases  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Tue, Jan 4, 2011 at 4:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Jan 4, 2011 at 4:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> * Existing GIN indexes are upwards compatible so far as on-disk storage
>>> goes, but they will of course be missing entries for empty, null, or
>>> null-containing items.  Users who want to do searches that should find
>>> such items will need to reindex after updating to 9.1.
>
>> This is the only part of this proposal that bothers me a little bit.
>> It would be nice if the system could determine whether a GIN index is
>> "upgraded from 9.0 or earlier and thus doesn't contain these entries"
>> - and avoid trying to use the index for these sorts of queries in
>> cases where it might return wrong answers.
>
> I don't think it's really worth the trouble.  The GIN code has been
> broken for these types of queries since day one, and yet we've had only
> maybe half a dozen complaints about it.  Moreover there's no practical
> way to "avoid trying to use the index", since in many cases the fact
> that a query requires a full-index scan isn't determinable at plan time.
>
> The best we could really do is throw an error at indexscan start, and
> that doesn't seem all that helpful.  But it probably wouldn't take much
> code either, if you're satisfied with that answer.  (I'm envisioning
> adding a version ID to the GIN metapage and then checking that before
> proceeding with a full-index scan.)

I'd be satisfied with that answer.  It at least makes it a lot more
clear when you've got a problem.  If this were a more common scenario,
I'd probably advocate for a better solution, but the one you propose
seems adequate given the frequency of the problem as you describe it.
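
For concreteness, here is a rough sketch of the check described above: store a version number in the metapage and refuse a full-index scan when the index predates it. Everything here is hypothetical and for illustration only. The struct, the constant, the function names, and the error text are assumptions, not the actual GIN code, and the real implementation would report the problem through the normal error machinery rather than a return code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the GIN metapage; the real struct differs. */
typedef struct SketchGinMeta
{
    uint32_t version;   /* assumed: 0 means "built before the version field existed" */
} SketchGinMeta;

/* Assumed version number for indexes that store empty/null entries. */
#define SKETCH_GIN_CURRENT_VERSION 1

/* True if this index contains the entries a full-index scan depends on. */
static bool
sketch_full_scan_allowed(const SketchGinMeta *meta)
{
    return meta->version >= SKETCH_GIN_CURRENT_VERSION;
}

/*
 * Called at indexscan start.  Returns 0 if the scan may proceed, or -1
 * after reporting that the index is too old for a full-index scan.
 * Ordinary (non-full) scans are never rejected.
 */
static int
sketch_begin_scan(const SketchGinMeta *meta, bool needs_full_scan)
{
    if (needs_full_scan && !sketch_full_scan_allowed(meta))
    {
        fprintf(stderr,
                "GIN index predates empty/null entries; REINDEX before full-index scans\n");
        return -1;
    }
    return 0;
}
```

The check sits at scan start rather than in the planner because, as noted above, whether a query needs a full-index scan often isn't known at plan time.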

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

