Re: BUG #17158: Distinct ROW fails with Postgres 14 - Mailing list pgsql-bugs

From Peter Eisentraut
Subject Re: BUG #17158: Distinct ROW fails with Postgres 14
Date
Msg-id c182f5c8-fdf3-80a0-fa43-4ed7e87d4d47@enterprisedb.com
In response to Re: BUG #17158: Distinct ROW fails with Postgres 14  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #17158: Distinct ROW fails with Postgres 14  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On 25.08.21 00:16, Tom Lane wrote:
> Undoing that would lose v14's ability to select hashed duplicate
> elimination for RECORD columns, but that's still not a regression
> because we didn't have it before.  Moreover, anyone who's unhappy can
> work around the problem by explicitly casting the column to some
> suitable named composite type.  We can leave it for later to make the
> planner smarter about anonymous record types.  It clearly could be
> smarter, at least for the case of an explicit ROW construct at top
> level; but now is no time to be writing such code for v14.

This feature is a requirement for multicolumn path and cycle tracking in 
recursive queries, as well as the search/cycle syntax built on top of 
that, so there is a bit more depending on it than might be at first 
apparent.

I've been looking at ways to repair this with minimal impact.
Essentially, we'd need a way to ask the type cache to distinguish between
"do you have hash support if it's guaranteed to work" versus "hash
support is my only hope, so give it to me even if you're not completely
sure it will work".  Putting this directly into the type cache does not
seem feasible with the current structure.  But there aren't that many
callers of TYPECACHE_HASH_PROC*, so I looked at handling it there.
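
To make the distinction between the two kinds of request concrete, here is a toy model in C.  It is only a sketch: the enums and functions (HashSupport, HashRequest, use_hashing, choose_dedup_strategy) are made-up names for illustration and are not part of the PostgreSQL type cache or planner.

```c
#include <stdbool.h>

/* Hypothetical three-way answer; NOT the real type cache interface. */
typedef enum
{
    HASH_NONE,          /* no hash support at all */
    HASH_UNCERTAIN,     /* e.g. anonymous record: depends on column types */
    HASH_GUARANTEED     /* hashing is known to work */
} HashSupport;

/* The two kinds of question a caller might ask (also hypothetical). */
typedef enum
{
    HASH_IF_GUARANTEED, /* caller has an alternative, e.g. sorting */
    HASH_AS_LAST_RESORT /* hashing is the caller's only option */
} HashRequest;

typedef enum { STRATEGY_SORT, STRATEGY_HASH, STRATEGY_FAIL } Strategy;

/* Answer a request given what is known about the type. */
bool
use_hashing(HashSupport support, HashRequest request)
{
    switch (support)
    {
        case HASH_GUARANTEED:
            return true;
        case HASH_UNCERTAIN:
            /* Only gamble on hashing when there is nothing else to try. */
            return request == HASH_AS_LAST_RESORT;
        case HASH_NONE:
        default:
            return false;
    }
}

/* Toy duplicate-elimination choice built on top of use_hashing(). */
Strategy
choose_dedup_strategy(HashSupport hash, bool sortable, bool prefer_hash)
{
    if (sortable)
    {
        /* Sorting is available, so accept hashing only if it is safe. */
        if (prefer_hash && use_hashing(hash, HASH_IF_GUARANTEED))
            return STRATEGY_HASH;
        return STRATEGY_SORT;
    }

    /* No sort support: hashing is the only hope. */
    if (use_hashing(hash, HASH_AS_LAST_RESORT))
        return STRATEGY_HASH;
    return STRATEGY_FAIL;
}
```

In this model a sortable type with uncertain hash support falls back to sorting, while a type that cannot be sorted still gets the hashed strategy and may fail only at run time.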

Variant 1 is that we let the type cache *not* report hash support for 
the record type, and let callers fill it in.  In the attached patch I've 
only done this for hash_array(), because that's what's needed to get the 
tests to pass, but similar code would be possible for row types, range 
types, etc.

Variant 2 is that we let the type cache report hash support for the 
record type, like now, and then let callers override it if they have 
other options.  This is the second attached patch.

It's basically fifty-fifty in terms of how many places you need to touch 
in either case.
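
The difference between the two variants can be modelled in a few lines of C.  This is a toy sketch, not code from either patch: ToyType and all of the function names are invented for illustration, and the assumption that TOY_BIT has no hash support simply mirrors the bug report.

```c
#include <stdbool.h>

/* Toy type universe; TOY_BIT stands in for a type without hash support. */
typedef enum { TOY_INT, TOY_BIT, TOY_RECORD } ToyType;

/* Variant 1: the cache does NOT report hash support for record. */
bool
v1_cache_has_hash(ToyType t)
{
    return t == TOY_INT;
}

/* Variant 1: a caller that can only hash fills in record support itself. */
bool
v1_hash_only_caller_can_hash(ToyType t)
{
    return v1_cache_has_hash(t) || t == TOY_RECORD;
}

/* Variant 1: a caller with an alternative just trusts the cache. */
bool
v1_caller_with_alternative_picks_hash(ToyType t)
{
    return v1_cache_has_hash(t);
}

/* Variant 2: the cache reports hash support for record, as it does now. */
bool
v2_cache_has_hash(ToyType t)
{
    return t == TOY_INT || t == TOY_RECORD;
}

/* Variant 2: a caller that can only hash just trusts the cache. */
bool
v2_hash_only_caller_can_hash(ToyType t)
{
    return v2_cache_has_hash(t);
}

/* Variant 2: a caller with an alternative overrides the cache for record. */
bool
v2_caller_with_alternative_picks_hash(ToyType t)
{
    if (t == TOY_RECORD)
        return false;
    return v2_cache_has_hash(t);
}
```

Both variants give the same answers for every type in this model; they differ only in which kind of caller carries the special case for record.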

With either patch applied, the "union" regression test fails; it
includes a test case that is equivalent to the one from this bug report
(but using money instead of bit).  The "with" test still passes, and
that is the one that covers the feature I mentioned at the beginning.

Thoughts?

