Re: BRIN indexes (was Re: Minmax indexes) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: BRIN indexes (was Re: Minmax indexes)
Date
Msg-id 540EE93D.7090301@vmware.com
Whole thread Raw
In response to BRIN indexes (was Re: Minmax indexes)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: BRIN indexes - TRAP: BadArgument
List pgsql-hackers
On 09/08/2014 07:02 PM, Alvaro Herrera wrote:
> Here's version 18.  I have renamed it: These are now BRIN indexes.
>
> I have fixed numerous race conditions and deadlocks.  In particular I
> fixed this problem you noted:
>
> Heikki Linnakangas wrote:
>> Another race condition:
>>
>> If a new tuple is inserted to the range while summarization runs,
>> it's possible that the new tuple isn't included in the tuple that
>> the summarization calculated, nor does the insertion itself udpate
>> it.
>
> I did it mostly in the way you outlined, i.e. by way of a placeholder
> tuple that gets updated by concurrent inserters and then the tuple
> resulting from the scan is unioned with the values in the updated
> placeholder tuple.  This required the introduction of one extra support
> proc for opclasses (pretty simple stuff anyhow).

Hmm. So the union support proc is only called if there is a race 
condition? That makes it very difficult to test, I'm afraid.

It would make more sense to pass BrinValues to the support functions, 
rather than DeformedBrTuple. An opclass'es support function should never 
need to access the values for other columns.

Does minmaxUnion handle NULLs correctly?

minmaxUnion pfrees the old values. Is that necessary? What memory 
context does the function run in? If the code runs in a short-lived 
memory context, you might as well just let them leak. If it runs in a 
long-lived context, well, perhaps it shouldn't. It's nicer to write 
functions that can leak freely. IIRC, GiST and GIN runs the support 
functions in a temporary context. In any case, it might be worth noting 
explicitly in the docs which functions may leak and which may not.

If you add a new datatype, and define b-tree operators for it, what is 
required to create a minmax opclass for it? Would it be possible to 
generalize the functions in brin_minmax.c so that they can be reused for 
any datatype (with b-tree operators) without writing any new C code? I 
think we're almost there; the only thing that differs between each data 
type is the opcinfo function. Let's pass the type OID as argument to the 
opcinfo function. You could then have just a single minmax_opcinfo 
function, instead of the macro to generate a separate function for each 
built-in datatype.

In general, this patch is in pretty good shape now, thanks!

- Heikki




pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: pgbench throttling latency limit
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: [TODO] Process pg_hba.conf keywords as case-insensitive