Re: [PATCH] Introduce array_shuffle() and array_sample() - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: [PATCH] Introduce array_shuffle() and array_sample()
Date
Msg-id alpine.DEB.2.22.394.2207240929220.1359119@pseudo
Whole thread Raw
In response to Re: [PATCH] Introduce array_shuffle() and array_sample()  (Martin Kalcher <martin.kalcher@aboutsource.net>)
Responses Re: [PATCH] Introduce array_shuffle() and array_sample()
List pgsql-hackers
>> i came to the same conclusions and went with Option 1 (see patch). Mainly 
>> because most code in utils/adt is organized by type and this way it is 
>> clear, that this is a thin wrapper around pg_prng.
>> 
>
> Small patch update. I realized the new functions should live 
> array_userfuncs.c (rather than arrayfuncs.c), fixed some file headers and 
> added some comments to the code.

My 0,02€ about this patch:

Use (void) when declaring no parameters in headers or in functions.

Should the exchange be skipped when i == k?

I do not see the point of having *only* inline functions in a c file. The 
external functions should not be inlined?

The random and array shuffling functions share a common state. I'm 
wondering whether it should ot should not be so. It seems ok, but then 
ISTM that the documentation suggests implicitely that setseed applies to 
random() only, which is not the case anymore after the patch.

If more samples are required than the number of elements, it does not 
error out. I'm wondering whether it should.

Also, the sampling should not return its result in order when the number 
of elements required is the full array, ISTM that it should be shuffled 
there as well.

I must say that without significant knowledge of the array internal 
implementation, the swap code looks pretty strange. ISTM that the code 
would be clearer if pointers and array references style were not 
intermixed.

Maybe you could add a test with a 3D array? Some sample with NULLs?

Unrelated: I notice again that (postgre)SQL does not provide a way to 
generate random integers. I do not see why not. Should we provide one?

-- 
Fabien.

pgsql-hackers by date:

Previous
From: Zhang Mingli
Date:
Subject: Re: optimize lookups in snapshot [sub]xip arrays
Next
From: Dmitry Dolgov
Date:
Subject: Re: pg_stat_statements and "IN" conditions