Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Date
Msg-id facc1d5c-cf6a-46ad-a8a1-cc01617b5ddb@2ndquadrant.com
In response to Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

On 2020-08-18 22:09, Tom Lane wrote:
> Here's a full patch addressing this issue.  I decided that the best
> way to address the test-instability problem is to explicitly give
> collations to all the foreign-table columns for which it matters
> in the postgres_fdw test.  (For portability's sake, that has to be
> "C" or "POSIX"; I mostly used "C".)  Aside from ensuring that the
> test still passes with some other prevailing locale, this seems like
> a good idea since we'll then be testing the case we are encouraging
> users to use.

I have studied this patch and this functionality.  I don't think
collation differences between remote and local instances are handled
sufficiently.  This bug report and patch address one particular case,
where the database-wide collations of the remote and local instances
are different.  But they don't handle cases like the same collation
name doing different things, having different versions, or having
different attributes.  This probably works currently because the libc
collations don't have much functionality like that, but there is a
variety of work conceived (or, in the case of version tracking, already
done since the bug was first discussed) that would break that.
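
To illustrate why a sort-order mismatch matters for a merge join, here
is a minimal Python sketch.  It is not postgres_fdw code: merge_join,
local_sorted and remote_sorted are hypothetical names, plain string
comparison stands in for the local "C" collation, and str.lower is only
a crude stand-in for a remote locale that orders letters without regard
to case.

```python
def merge_join(left, right):
    """Textbook merge join over two lists of unique keys.

    Both inputs are ASSUMED to be sorted by the same ordering that the
    comparisons below use (plain code-point order, similar to "C").
    """
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1          # left key is smaller: advance left side
        elif left[i] > right[j]:
            j += 1          # right key is smaller: advance right side
        else:
            out.append((left[i], right[j]))
            i += 1
            j += 1
    return out


rows = ["a", "B", "c"]

# Local side sorts in code-point order: ['B', 'a', 'c'].
local_sorted = sorted(rows)

# Hypothetical remote side sorts case-insensitively: ['a', 'B', 'c'].
remote_sorted = sorted(rows, key=str.lower)

# Orderings agree: every row finds its match.
matching = merge_join(local_sorted, sorted(rows))

# Orderings disagree: the join silently drops the 'B' row.
mismatched = merge_join(local_sorted, remote_sorted)
```

In this sketch, matching contains all three rows, while mismatched
contains only the pairs for 'a' and 'c'.  The join raises no error; it
just returns fewer rows, which is why the local side has to know, and
not merely hope, that the remote ordering is the same.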

Taking a step back, I think there are only two ways this could really
work: Either the admin makes a promise that all the collations match on
all the instances, and then the planner can take advantage of that.  Or
there is no such promise, and then the planner can't.  I don't
understand what the currently implemented approach is.  It appears to be
something in the middle, where certain representations are made that
certain things might match, and then there is some nontrivial code that
analyzes expressions to check whether they conform to those rules.  As
you said, the description of the import_collate option is kind of
hand-wavy about all this.

-- 
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/


