Proposal for partial resolution of the GIN fullscan issue. - Mailing list pgsql-hackers

From Teodor Sigaev
Subject Proposal for partial resolution of the GIN fullscan issue.
Date
Msg-id 45BF8656.7050703@sigaev.ru
List pgsql-hackers
Small introduction: GIN currently doesn't support a full scan of the index,
because performance would be disastrous: a pointer to each heap tuple would be
returned several times. Also, if extractQuery doesn't return anything, GIN
raises the error 'GIN index does not support search with void query'. That is
because operations differ in their semantics: for some operations a void query
should return all tuples, for others none.

Currently, the support function extractQuery has this prototype (pseudocode):
Datum *extractQuery( Datum value, uint32 *nentry, StrategyNumber strategy)

Proposal:
Change extractQuery's prototype to:
Datum *extractQuery( Datum value, int32 *nentry, StrategyNumber strategy)
And add a convention for the meaning of nentry's value:
nentry > 0  - number of entries to search for
nentry = 0  - query requires a full scan
nentry < 0  - guarantees that no tuple can satisfy the query
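
To make the convention concrete, here is a minimal sketch of how a caller
might interpret the returned value. The enum and the function name are
hypothetical, chosen only for illustration; they are not existing GIN code.

```c
#include <stdint.h>

/* Hypothetical labels for the three cases of the proposed convention. */
typedef enum
{
    GIN_SCAN_ENTRIES,   /* nentry > 0: search for nentry entries */
    GIN_SCAN_FULL,      /* nentry == 0: query requires a full scan */
    GIN_SCAN_NOMATCH    /* nentry < 0: no tuple can satisfy the query */
} GinScanKind;

/* Map the value stored in *nentry by extractQuery to one of the cases. */
static GinScanKind
classify_nentry(int32_t nentry)
{
    if (nentry > 0)
        return GIN_SCAN_ENTRIES;
    if (nentry == 0)
        return GIN_SCAN_FULL;
    return GIN_SCAN_NOMATCH;
}
```

The signed type matters here: with the current uint32 the third case could
not be expressed at all.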

So, if GIN gets nentry < 0 for at least one of the index quals, then
gingettuple/gingetmulti can skip the actual search and just return false.
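
A rough sketch of that early exit, assuming the nentry values of all quals
have already been collected into an array (the array and the function name
are hypothetical, not the real scan-key structures):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true if at least one qual reported nentry < 0. The quals are
 * ANDed together, so one unsatisfiable qual makes the whole scan empty and
 * the caller can return false without touching the index.
 */
static bool
scan_is_unsatisfiable(const int32_t *nentries, size_t nquals)
{
    for (size_t i = 0; i < nquals; i++)
    {
        if (nentries[i] < 0)
            return true;
    }
    return false;
}
```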

Next, modify gincostestimate to call extractQuery to determine nentry for
each clause in indexQuals. If nentry == 0, gincostestimate should return
disable_cost as the cost estimate of the index search, to prevent index usage.
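
The cost rule could look roughly like the sketch below. It is standalone
and does not use the real planner structures: SKETCH_DISABLE_COST stands in
for the planner's disable_cost (its value here is an assumption), and
sketch_gin_cost and normal_cost are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the planner's disable_cost; the value is an assumption. */
#define SKETCH_DISABLE_COST 1.0e10

/*
 * If any clause needs a full scan (nentry == 0), return the prohibitive
 * cost so the planner avoids the index; otherwise keep the normal estimate.
 */
static double
sketch_gin_cost(const int32_t *nentries, size_t nquals, double normal_cost)
{
    for (size_t i = 0; i < nquals; i++)
    {
        if (nentries[i] == 0)
            return SKETCH_DISABLE_COST;
    }
    return normal_cost;
}
```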

Disadvantage of this proposal: gincostestimate can't work with queries whose
values are taken from a table or subselect, so the proposal doesn't resolve all
cases of the issue, but it eliminates the most frequent ones. A void tsquery
(from tsearch2) always means an empty result, which GIN can return quickly, so
tsearch2 users will no longer face the error 'GIN index does not support
search with void query'.

Comments, objections, suggestions?


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 