pgsql: Clean up some ad-hoc code for sorting and de-duplicating Lists. - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Clean up some ad-hoc code for sorting and de-duplicating Lists.
Date
Msg-id E1hnPw8-0003V5-Bf@gemulon.postgresql.org
List pgsql-committers
Clean up some ad-hoc code for sorting and de-duplicating Lists.

heap.c and relcache.c contained nearly identical copies of logic
to insert OIDs into an OID list while preserving the list's OID
ordering (and rejecting duplicates, in one case but not the other).

The comments argue that this is faster than qsort for small numbers
of OIDs, which is at best unproven, and seems even less likely to be
true now that lappend_cell_oid has to move data around.  In any case
it's ugly and hard-to-follow code, and if we do have a lot of OIDs
to consider, it's O(N^2).

Hence, replace that logic with simply lappend'ing OIDs to a List, then
list_sort'ing the completed List, then removing adjacent duplicates if
necessary.
This is demonstrably O(N log N) and it's much simpler for the
callers.  It's possible that this would be somewhat inefficient
if there were a very large number of duplicates, but that seems
unlikely in the existing usage.
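
As a rough illustration of the append / sort / remove-adjacent-duplicates
pattern described above, here is a standalone sketch on a plain C array.
It is not the code from this commit: the Oid typedef is a stand-in, the
names oid_cmp, dedup_sorted_oids and sort_and_dedup_oids are hypothetical,
and the C library's qsort is used in place of list_sort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned int Oid;		/* stand-in for PostgreSQL's Oid type */

/* qsort comparator: ascending order, written to avoid subtraction overflow */
static int
oid_cmp(const void *a, const void *b)
{
	Oid			v1 = *(const Oid *) a;
	Oid			v2 = *(const Oid *) b;

	if (v1 < v2)
		return -1;
	if (v1 > v2)
		return 1;
	return 0;
}

/*
 * Remove adjacent duplicates from an already-sorted array, in place.
 * Returns the number of entries kept.  One linear pass.
 */
static size_t
dedup_sorted_oids(Oid *oids, size_t n)
{
	size_t		kept;
	size_t		i;

	assert(oids != NULL || n == 0);
	if (n < 2)
		return n;

	kept = 1;					/* the first entry is always kept */
	for (i = 1; i < n; i++)
	{
		/* copy an entry forward only if it differs from the last kept one */
		if (oids[i] != oids[kept - 1])
			oids[kept++] = oids[i];
	}
	return kept;
}

/* Sort the collected OIDs, then drop duplicates; returns the new length. */
static size_t
sort_and_dedup_oids(Oid *oids, size_t n)
{
	if (n > 1)
		qsort(oids, n, sizeof(Oid), oid_cmp);
	return dedup_sorted_oids(oids, n);
}
```

A caller would collect OIDs in arbitrary order and call
sort_and_dedup_oids once at the end.  The sort dominates the cost, and the
de-duplication pass only needs to compare each entry with the last one
kept, because sorting places equal values next to each other.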

This adds list_deduplicate_oid and list_oid_cmp infrastructure
to list.c.  I didn't bother with equivalent functionality for
integer or pointer Lists, but such could always be added later
if we find a use for it.

Discussion: https://postgr.es/m/26193.1563228600@sss.pgh.pa.us

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/2f5b8eb5a28b4e6de9d20cc7d2c6028c6c7a8aa8

Modified Files
--------------
src/backend/catalog/heap.c         | 49 +++++---------------------------------
src/backend/nodes/list.c           | 44 ++++++++++++++++++++++++++++++++++
src/backend/utils/cache/relcache.c | 46 +++++++----------------------------
src/include/nodes/pg_list.h        |  4 ++++
4 files changed, 63 insertions(+), 80 deletions(-)

