Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays |
Date | |
Msg-id | 557e8853-7cb8-9c01-6193-1d4109345401@iki.fi Whole thread Raw |
In response to | Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays (James Coleman <jtc331@gmail.com>) |
Responses |
Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
|
List | pgsql-hackers |
On 08/09/2020 22:25, James Coleman wrote: > On Wed, Aug 19, 2020 at 3:16 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> >> You could also apply the optimization for non-Const expressions, as long >> as they're stable (i.e. !contain_volatile_functions(expr)). > > Is that true? Don't we also have to worry about whether or not the > value is stable (i.e., know when a param has changed)? There have been > discussions about being able to cache stable subexpressions, and my > understanding was that we'd need to have that infrastructure (along > with the necessarily invalidation when the param changes) to be able > to use this for non-Const expressions. Yeah, you're right, you'd have to also check for PARAM_EXEC Params. And Vars. I think the conditions are the same as those checked in match_clause_to_partition_key() in partprune.c (it's a long function, search for "if (!IsA(rightop, Const))"). Not sure it's worth the trouble, then. But it would be nice to handle queries like "WHERE column = ANY ($1)" >> Deconstructing the array Datum into a simple C array on first call would >> be a win even for very small arrays and for AND semantics, even if you >> don't use a hash table. > > Because you wouldn't have to repeatedly detoast it? Or some other > reason I'm not thinking of? My intuition would have been that (aside > from detoasting if necessary) there wouldn't be much real overhead in, > for example, an array storing integers. Dealing with NULLs and different element sizes in the array is pretty complicated. Looping through the array currently looks like this: /* Loop over the array elements */ s = (char *) ARR_DATA_PTR(arr); bitmap = ARR_NULLBITMAP(arr); bitmask = 1; for (int i = 0; i < nitems; i++) { Datum elt; Datum thisresult; /* Get array element, checking for NULL */ if (bitmap && (*bitmap & bitmask) == 0) { fcinfo->args[1].value = (Datum) 0; fcinfo->args[1].isnull = true; } else { elt = fetch_att(s, typbyval, typlen); s = att_addlength_pointer(s, typlen, s); s = (char *) att_align_nominal(s, typalign); fcinfo->args[1].value = elt; fcinfo->args[1].isnull = false; } [do stuff with Datum/isnull] /* advance bitmap pointer if any */ if (bitmap) { bitmask <<= 1; if (bitmask == 0x100) { bitmap++; bitmask = 1; } } } Compared with just: for (int i = 0; i < nitems; i++) { Datum elt = datums[i]; [do stuff with the Datum] } I'm not sure how much difference that makes, but I presume it's not zero, and it seems like an easy win when you have the code to deal with the Datum array representation anyway. - Heikki
pgsql-hackers by date: