On Mon, Jun 19, 2023, at 11:21, Tomas Vondra wrote:
> AFAICS the standard only defines arrays and multisets. Arrays are pretty
> much the thing we have, including the ARRAY[] constructor etc. Multisets
> are similar to hashset discussed here, except that it tracks the number
> of elements for each value (which would be trivial in hashset).
>
> So if we want to make this a built-in feature, maybe we should aim to do
> the multiset thing, with the standard SQL syntax? Extending the grammar
> should not be hard, I think. I'm not sure of the underlying code
> (ArrayType, ARRAY_SUBLINK stuff, etc.) we could reuse or if we'd need a
> lot of separate code doing that.
Multisets handle duplicates in their own way (they keep a count per value
instead of discarding repeats), which may bring unexpected issues. Sets and
multisets serve distinct purposes in C++, Rust, Java, etc. However, sets are
more fundamental and more prevalent in standard libraries than multisets.
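
To make the difference in duplicate handling concrete, here is a minimal
sketch in Python. It assumes Python's built-in set and collections.Counter
as stand-ins for a set and a multiset; these are illustrative choices only,
not the proposed hashset type or anything from the SQL standard.

```python
from collections import Counter

values = [1, 2, 2, 3, 3, 3]

# A set keeps each distinct value once; duplicates are discarded on insert.
as_set = set(values)

# Counter is used here as a multiset stand-in: it tracks how many times
# each value occurs instead of discarding the repeats.
as_multiset = Counter(values)

print(sorted(as_set))                  # the distinct values
print(dict(as_multiset))               # value -> number of occurrences
print(len(as_set))                     # cardinality of the set
print(sum(as_multiset.values()))       # cardinality of the multiset
```

The same input yields different cardinalities depending on the container,
which is the kind of surprise a single combined type could cause.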
Even though the SQL standard defines multisets, I would prefer a distinct
hashset type, since it helps users pick the appropriate data structure and
reduces misuse.
Beyond standards compliance, the need for multisets is unclear to me.
/Joel