Re: Consolidate 'unique array values' logic into a reusable function? - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Consolidate 'unique array values' logic into a reusable function?
Msg-id CA+hUKGK_kwiS+3VCeMMGAKg=27T1v17ABzt+xDa1qeW7W7wruA@mail.gmail.com
In response to Re: Consolidate 'unique array values' logic into a reusable function?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Consolidate 'unique array values' logic into a reusable function?  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Hello,

I'm reviving a thread from 2016, because I wanted this thing again today.

Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
> > Here's a sketch patch that creates a function array_unique which takes
> > the same arguments as qsort or qsort_arg and returns the new length.
>
> Hmm ... I'd be against using this in backend/regex/, because I still
> have hopes of converting that to a standalone library someday (and
> in any case it needs to stay compatible with Tcl's copy of the code).
> But otherwise this seems like a reasonable proposal.
>
> As for the function name, maybe "qunique()" to go with "qsort()"?
> I'm not thrilled with "array_unique" because that sounds like it
> is meant for Postgres' array data types.

OK, here it is renamed to qunique() and qunique_arg().  It's a bit odd
because it has nothing to do with the quicksort algorithm, but it makes
some sense because it's always used with qsort().  I suppose we could
save a few more lines if there were a qsort_unique() function that
does both, since the arguments are identical.  I also moved it into a
new header lib/qunique.h.  Any better ideas for where it should live?
I removed the hunk under regex.
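
For readers without the attachment, here is a minimal sketch of what a
function with this shape might look like.  It is an illustration only and
is not claimed to match the attached patch line for line; int_cmp() and
sort_unique_ints() are hypothetical names made up for the example.  It
assumes the input is already sorted with the same comparator, so that
duplicates are adjacent.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Remove adjacent duplicates from a sorted array, in place.  Takes the
 * same arguments as qsort() and returns the new number of elements.
 */
static inline size_t
qunique(void *array, size_t elements, size_t width,
        int (*compare) (const void *, const void *))
{
    char       *bytes = (char *) array;
    size_t      i,
                j;

    if (elements <= 1)
        return elements;

    /* j is the index of the last element kept so far; i scans ahead. */
    for (i = 1, j = 0; i < elements; ++i)
    {
        /* Keep element i only if it differs from the last kept one. */
        if (compare(bytes + i * width, bytes + j * width) != 0 &&
            ++j != i)
            memcpy(bytes + j * width, bytes + i * width, width);
    }

    return j + 1;
}

/* Hypothetical comparator, for illustration only. */
static int
int_cmp(const void *a, const void *b)
{
    int         x = *(const int *) a;
    int         y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Hypothetical caller: the usual sort-then-deduplicate pattern. */
static size_t
sort_unique_ints(int *values, size_t n)
{
    qsort(values, n, sizeof(int), int_cmp);
    return qunique(values, n, sizeof(int), int_cmp);
}
```

The caller passes the same array, count, width and comparator to both
functions, which is why a combined qsort_unique() would save a line or
two per call site.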

One thing I checked is that on my system it is inlined along with the
comparator when that is visible, so no performance should be lost by
throwing away the open-coded versions.  This makes me think that, e.g.,
oid_cmp() should probably be defined in a header; clearly we're also
carrying a few functions that should be consolidated into a new
int32_cmp() function, somewhere, too.  (It might also be interesting
to use the pg_attribute_always_inline trick to instantiate some common
qsort() specialisations for a bit of speed-up, but that's another
topic.)
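
To make the header idea concrete, a consolidated comparator could look
roughly like the sketch below.  This is an assumption about its shape, not
code from the patch: int32_cmp() does not exist under that name yet, and
where it would live is still open.  Defining it static inline in a header
keeps the body visible at each call site, which is what lets the compiler
inline it into the qunique() loop.

```c
#include <stdint.h>

/*
 * Hypothetical header-defined comparator for int32 values, usable with
 * qsort() and qunique().
 */
static inline int
int32_cmp(const void *a, const void *b)
{
    int32_t     x = *(const int32_t *) a;
    int32_t     y = *(const int32_t *) b;

    /* Compare rather than subtract: x - y can overflow. */
    return (x > y) - (x < y);
}
```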

Adding to CF.

--
Thomas Munro
https://enterprisedb.com

