Re: Understanding GIN posting trees - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Understanding GIN posting trees
Date
Msg-id 4E1F5B28.1060506@enterprisedb.com
Whole thread Raw
In response to Re: Understanding GIN posting trees  (Teodor Sigaev <teodor@sigaev.ru>)
Responses Re: Understanding GIN posting trees
List pgsql-hackers
On 14.07.2011 17:28, Teodor Sigaev wrote:
>> Why is the posting tree a tree? AFAICS, we never search it using the
>> TID, it's always scanned in whole. It would be simpler to store the TIDs
>> in a posting list in no particular order. This could potentially make
>> insertions cheaper, as you could just append to the last posting list
>> page for the key, instead of traversing the posting tree to a particular
>> location. You could also pack the tids denser, as you wouldn't need to
>> reserve free space for additions in the middle.
> For consistentFn call we need to collect all data for current tid. We do
> that by scanning over logical ordered arrays of tids and trees allows to
> do that by scanning a leafs pages.

Oh, I see. You essentially do a merge join of all the posting trees of 
query keys.

Hmm, but we do need to scan all the posting trees of all the matched 
keys in whole anyway. We could collect all TIDs in the posting lists of 
all the keys into separate TIDBitmaps, and then combine the bitmaps, 
calling consistentFn for each TID that was present in at least one 
bitmap. I guess the performance characteristics of that would be 
somewhat different from what we have now, and you'd need to keep a lot 
of in-memory bitmaps if the query contains a lot of keys.


While we're at it, it just occurred to me that it if the number of query 
keys is limited, say <= 16, you could build a lookup table for each 
combination of keys either occurring or not. You could use then use that 
instead of calling consistentFn for each possible match. You could even 
use the table to detect common cases like "all/any keys must match", 
perhaps opening some optimization opportunities elsewhere.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Is there a committer in the house?
Next
From: Dave Page
Date:
Subject: Re: pg_class.relistemp