Re: Fixing GIN for empty/null/full-scan cases - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Fixing GIN for empty/null/full-scan cases
Date
Msg-id 20044.1294416468@sss.pgh.pa.us
In response to Re: Fixing GIN for empty/null/full-scan cases  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> 2. Add another output bool parameter to extractQuery that it must set
> true (from a default false state) if the query could match with no check
> values set.  This would prompt the GIN code to search for EMPTY_ITEM
> placeholders, but they'd not be part of the check[] array.

On further reflection: if we're going to go this route, we really ought
to take one more step and allow the opclass to demand a full-index scan.
The reason for this is cases like tsvector's NOT operator:
    SELECT ... WHERE tsvectorcol @@ '! unwanted'::tsquery

Right now, this will do what it says on the tin if implemented as a
seqscan.  It will fail (silently, I think) if implemented as a GIN index
search.  We didn't use to have any way of making it behave sanely as
an index search, but the mechanisms I'm building now would support doing
this right.

So, instead of just a bool, I'm now proposing adding an int return
argument specified like this:
   searchMode is an output argument that allows extractQuery to specify
   details about how the search will be done. If *searchMode is set to
   GIN_SEARCH_MODE_DEFAULT (which is the value it is initialized to
   before call), only items that match at least one of the returned
   keys are considered candidate matches. If *searchMode is set to
   GIN_SEARCH_MODE_INCLUDE_EMPTY, then in addition to items containing
   at least one matching key, items that contain no keys at all are
   considered candidate matches. (This mode is useful for implementing
   is-subset-of operators, for example.) If *searchMode is set to
   GIN_SEARCH_MODE_ALL, then all non-null items in the index are
   considered candidate matches, whether they match any of the returned
   keys or not. (This mode is much slower than the other two choices,
   since it requires scanning essentially the entire index, but it may
   be necessary to implement corner cases correctly. An operator that
   needs this mode in most cases is probably not a good candidate for a
   GIN operator class.) The symbols to use for setting this mode are
   defined in access/gin.h.
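
To make the three modes concrete, here is a rough standalone model of the candidate-match rule described in the spec text. It is only a sketch, not GIN code: the enum is a local stand-in for the symbols in access/gin.h (the numeric values are assumed), and is_candidate() with its nItemKeys/nMatchingKeys parameters is a hypothetical helper invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Local stand-ins for the symbols the spec places in access/gin.h.
 * The numeric values are assumptions made for this sketch.
 */
enum
{
    GIN_SEARCH_MODE_DEFAULT = 0,
    GIN_SEARCH_MODE_INCLUDE_EMPTY = 1,
    GIN_SEARCH_MODE_ALL = 2
};

/*
 * Hypothetical helper: decide whether a non-null indexed item is a
 * candidate match.
 *   nItemKeys     - number of keys the indexed item contains
 *   nMatchingKeys - how many of the query's returned keys it matches
 */
static bool
is_candidate(int32_t searchMode, int nItemKeys, int nMatchingKeys)
{
    switch (searchMode)
    {
        case GIN_SEARCH_MODE_ALL:
            /* every non-null item qualifies, matching keys or not */
            return true;
        case GIN_SEARCH_MODE_INCLUDE_EMPTY:
            /* matching items, plus items that contain no keys at all */
            return nMatchingKeys > 0 || nItemKeys == 0;
        default:
            /* GIN_SEARCH_MODE_DEFAULT: at least one key must match */
            return nMatchingKeys > 0;
    }
}
```

In this model an empty item (nItemKeys == 0) is rejected in default mode and accepted in the other two, while an item with keys but no matching key is accepted only in GIN_SEARCH_MODE_ALL.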
 

The default mode is equivalent to what used to happen implicitly, so
this is still backwards-compatible with existing opclasses.
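
The compatibility argument can be sketched as follows, under the assumption that the caller initializes the mode before invoking the opclass function. Everything here is a toy model: the real extractQuery signature is different, the enum values are assumed stand-ins for access/gin.h, and legacy_extract_query, negation_aware_extract_query and call_extract_query are hypothetical names. The "leading '!' means negation" test is a crude placeholder for real tsquery parsing.

```c
#include <stdint.h>

/* Local stand-ins for the access/gin.h symbols; values are assumed. */
enum
{
    GIN_SEARCH_MODE_DEFAULT = 0,
    GIN_SEARCH_MODE_INCLUDE_EMPTY = 1,
    GIN_SEARCH_MODE_ALL = 2
};

/* Toy signature: returns the number of keys extracted from the query. */
typedef int32_t (*extract_query_fn) (const char *query, int32_t *searchMode);

/* Hypothetical pre-existing opclass: it never touches searchMode. */
static int32_t
legacy_extract_query(const char *query, int32_t *searchMode)
{
    (void) searchMode;          /* unaware of the new argument */
    return query[0] != '\0' ? 1 : 0;
}

/*
 * Hypothetical updated opclass: a query starting with '!' yields no
 * usable keys, so it asks for a full-index scan instead.
 */
static int32_t
negation_aware_extract_query(const char *query, int32_t *searchMode)
{
    if (query[0] == '!')
    {
        *searchMode = GIN_SEARCH_MODE_ALL;
        return 0;
    }
    return query[0] != '\0' ? 1 : 0;
}

/*
 * Hypothetical caller: presets the mode to the default before the call,
 * so an opclass that ignores the argument gets the old behavior.
 */
static int32_t
call_extract_query(extract_query_fn fn, const char *query, int32_t *nkeys)
{
    int32_t     searchMode = GIN_SEARCH_MODE_DEFAULT;

    *nkeys = fn(query, &searchMode);
    return searchMode;
}
```

With this arrangement the legacy function always ends up in GIN_SEARCH_MODE_DEFAULT, and only an opclass that explicitly writes to *searchMode changes how the scan is done.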

Don't have code to back up this spec yet, but I believe I see how to do
it.
        regards, tom lane

