Re: [PATCH] Introduce array_shuffle() and array_sample() - Mailing list pgsql-hackers

From: Tom Lane
Subject: Re: [PATCH] Introduce array_shuffle() and array_sample()
Msg-id: 847968.1658184212@sss.pgh.pa.us
In response to: Re: [PATCH] Introduce array_shuffle() and array_sample()  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses: Re: [PATCH] Introduce array_shuffle() and array_sample()
List: pgsql-hackers

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Mon, Jul 18, 2022 at 3:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Independently of the dimensionality question --- I'd imagined that
>> array_sample would select a random subset of the array elements
>> but keep their order intact.  If you want the behavior shown
>> above, you can do array_shuffle(array_sample(...)).  But if we
>> randomize it, and that's not what the user wanted, she has no
>> recourse.

> And for those that want to know in what order those elements were chosen
> they have no recourse in the other setup.

Um ... why is "the order in which the elements were chosen" a concept
we want to expose?  ISTM sample() is a black box in which notionally
the decisions could all be made at once.
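
To make the distinction concrete, here is a minimal Python sketch of the
order-preserving behavior described above. It is not taken from the patch:
the function names only mirror the proposed SQL functions, and the `rng`
parameter is an assumption added so the example can be seeded. The sample
picks all of its positions in one step and emits the elements in input
order; a caller who wants a random order composes it with the shuffle.

```python
import random


def array_sample(items, n, rng=random):
    """Return n randomly chosen elements of items, in their original order.

    Illustrative only; not the implementation from the patch.
    """
    if not 0 <= n <= len(items):
        raise ValueError("n must be between 0 and len(items)")
    # Choose the n positions in a single step, then sort the positions so
    # the result follows the input order rather than the selection order.
    positions = sorted(rng.sample(range(len(items)), n))
    return [items[i] for i in positions]


def array_shuffle(items, rng=random):
    """Return a new list holding the elements of items in random order."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    data = [10, 20, 30, 40, 50, 60]
    ordered = array_sample(data, 3)      # subset, input order kept
    randomized = array_shuffle(ordered)  # same subset, random order
    print(ordered, randomized)
```

With this split, both behaviors are reachable: the order-preserving subset
directly, and the randomized one via array_shuffle(array_sample(...)).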

> I really think this function needs to grow an algorithm argument that can
> be used to specify stuff like ordering, replacement/without-replacement,
> etc...just some enums separated by commas that can be added to the call.

I think you might run out of gold paint somewhere around here.  I'm
still not totally convinced we should bother with the sample() function
at all, let alone that it needs algorithm variants.  At some point we
say to the user "here's a PL, write what you want for yourself".

            regards, tom lane


