Home > mailing lists

Re: count_nulls(VARIADIC "any") - Mailing list pgsql-hackers

From	Pavel Stehule
Subject	Re: count_nulls(VARIADIC "any")
Date	January 4, 2016 07:24:42
Msg-id	CAFj8pRBkMU=DNdBamyeKqVeydJMhGOmL0N1q-aEfHqbd_JPkwA@mail.gmail.com Whole thread Raw
In response to	Re: count_nulls(VARIADIC "any") (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Responses	Re: count_nulls(VARIADIC "any") (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List	pgsql-hackers

Tree view

2016-01-03 22:49 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/3/16 2:37 PM, Pavel Stehule wrote:
+ /* num_nulls(VARIADIC NULL) is defined as NULL */
+ if (PG_ARGISNULL(0))
+ return false;

Could you add to the comment explaining why that's the desired behavior?

This case should be different than num_nulls(VARIADIC ARRAY[..]) - this situation is really equivalent of missing data and NULL is correct answer. It should not be too clean in num_nulls, but when it is cleaner for num_notnulls. And more, it is consistent with other variadic functions in Postgres: see concat_internal and text_format.

+ /*
+ * Non-null argument had better be an array. We assume that any call
+ * context that could let get_fn_expr_variadic return true will have
+ * checked that a VARIADIC-labeled parameter actually is an array. So
+ * it should be okay to just Assert that it's an array rather than
+ * doing a full-fledged error check.
+ */
+ Assert(OidIsValid(get_base_element_type(get_fn_expr_argtype(fcinfo->flinfo, 0))));

Erm... is that really the way to verify that what you have is an array? ISTM there should be a macro for that somewhere...

really, it is. It is used more time. Although I am not against some macro, I don't think so it is necessary. The macro should not be too shorter than this text.

+ /* OK, safe to fetch the array value */
+ arr = PG_GETARG_ARRAYTYPE_P(0);
+
+ ndims = ARR_NDIM(arr);
+ dims = ARR_DIMS(arr);
+ nitems = ArrayGetNItems(ndims, dims);
+
+ bitmap = ARR_NULLBITMAP(arr);
+ if (bitmap)
+ {
+ bitmask = 1;
+
+ for (i = 0; i < nitems; i++)
+ {
+ if ((*bitmap & bitmask) == 0)
+ count++;
+
+ bitmask <<= 1;
+ if (bitmask == 0x100)
+ {
+ bitmap++;
+ bitmask = 1;

For brevity and example sake it'd probably be better to just use the normal iterator, unless there's a serious speed difference?

The iterator does some memory allocations and some access to type cache. Almost all work of iterator is useless for this case. This code is developed by Marko, but I agree with this design. Using the iterator is big gun for this case. I didn't any performance checks, but it should be measurable for any varlena arrays.

Regards

Pavel

In the unit test, I'd personally prefer just building a table with the test cases and the expected NULL/NOT NULL results, at least for all the calls that would fit that paradigm. That should significantly reduce the size of the test. Not a huge deal though...

Also, I don't think anything is testing multiples of whatever value... how 'bout change the generate_series CASE statement to >40 instead of <>40?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

pgsql-hackers by date:

From: Jim Nasby
Date: 04 January 2016, 07:20:42
Subject: Re: 9.5 BLOCKER: regrole and regnamespace and quotes

From: Tom Lane
Date: 04 January 2016, 07:25:12
Subject: Re: 9.5 BLOCKER: regrole and regnamespace and quotes

Re: count_nulls(VARIADIC "any") - Mailing list pgsql-hackers

Previous

Next