Thread: Explanations not clear
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/16/collation.html
Description:

I created a collation specifying the ks-level3 setting and with deterministic set to false. But when I compare "a_b" to "a-b" with this collation, I get false. According to Table 24.1, it should yield true. Only after adding ka-shifted does this comparison become true. The interactions between the different options are not very clear. What is the difference at each level when using ks without ka and with ka? The explanations of the kc and kb options are also not very helpful. Also, the role of deterministic = false is not explained very well. It seems that the settings ks, ka and kc lose any meaning with this setting.

Thank you very much
On 06.05.24 19:59, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/16/collation.html
> Description:
>
> I created a collation specifying the ks-level3 setting and with
> deterministic set to false. But when I compare "a_b" to "a-b" with this
> collation I get false. According to the table 24.1 it should yield true.
> Only after adding ka-shifted this comparison becomes true. The interactions
> of the different options are not very clear.

I think table 24.1 is somewhat incorrect in the sense that punctuation is only level 4 if you use ka-shifted, otherwise it's level 1. This should perhaps be clarified.
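For reference, the behavior described above can be sketched as follows. This is a minimal illustration, assuming a PostgreSQL build with ICU support; the collation names level3_plain and level3_shifted are hypothetical and chosen only for this example. The results in the comments are the ones reported in this thread, not independently verified output.

```sql
-- ks-level3 only: punctuation still differs at level 1, so the strings
-- are not considered equal (the result the reporter found surprising).
CREATE COLLATION level3_plain (
    provider = icu,
    deterministic = false,
    locale = 'und-u-ks-level3'
);

-- ks-level3 plus ka-shifted: punctuation differences move to level 4,
-- which a level-3 comparison ignores.
CREATE COLLATION level3_shifted (
    provider = icu,
    deterministic = false,
    locale = 'und-u-ka-shifted-ks-level3'
);

SELECT 'a_b' = 'a-b' COLLATE level3_plain;    -- reported in this thread: false
SELECT 'a_b' = 'a-b' COLLATE level3_shifted;  -- reported in this thread: true
```

Both collations use deterministic = false, so the only difference between the two comparisons is the ka-shifted setting.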
On Wed, 2024-05-08 at 08:52 +0200, Peter Eisentraut wrote:
> > I created a collation specifying the ks-level3 setting and with
> > deterministic set to false. But when I compare "a_b" to "a-b" with
> > this collation I get false. According to the table 24.1 it should
> > yield true. Only after adding ka-shifted this comparison becomes
> > true. The interactions of the different options are not very clear.
>
> I think table 24.1 is somewhat incorrect in the sense that
> punctuation is only level 4 if you use ka-shifted, otherwise it's
> level 1. This should perhaps be clarified.

One option is to just include 3 levels (plus "identic") in the table, and then later document that ka-shifted creates a fourth level and moves punctuation character differences into that level. That explains the mechanism but detracts from the examples.

Another option is to say that all the examples in the table are using ka-shifted for illustration purposes. I like this option, but it's a bit awkward because it refers to something that hasn't been explained yet. It's also only relevant for the 'x-y' = 'x_y' example, which might be slightly confusing.

Thoughts?

Regards,
Jeff Davis
On 13.05.24 21:10, Jeff Davis wrote:
> On Wed, 2024-05-08 at 08:52 +0200, Peter Eisentraut wrote:
>>> I created a collation specifying the ks-level3 setting and with
>>> deterministic set to false. But when I compare "a_b" to "a-b" with
>>> this collation I get false. According to the table 24.1 it should
>>> yield true. Only after adding ka-shifted this comparison becomes
>>> true. The interactions of the different options are not very clear.
>>
>> I think table 24.1 is somewhat incorrect in the sense that
>> punctuation is only level 4 if you use ka-shifted, otherwise it's
>> level 1. This should perhaps be clarified.
>
> One option is to just include 3 levels (plus "identic") in the table,
> and then later document that ka-shifted creates a fourth level and
> moves punctuation character differences into that level. That explains
> the mechanism but detracts from the examples.
>
> Another option is to say that all the examples in the table are using
> ka-shifted for illustration purposes. I like this option, but it's a
> bit awkward because it refers to something that hasn't been explained
> yet. It's also only relevant for the 'x-y' = 'x_y' example, which might
> be slightly confusing.
>
> Thoughts?

Maybe with a forward reference in a footnote, like this:

diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index daf671e6205..10c12adf634 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -1320,7 +1320,7 @@ <title>ICU Collation Levels</title>
         </row>
         <row>
          <entry>level4</entry>
-         <entry>Punctuation</entry>
+         <entry>Punctuation<footnote><para>only with <literal>ka-shifted</literal>; see <xref linkend="icu-collation-settings-table"/></para></footnote></entry>
          <entry><literal>true</literal></entry>
          <entry><literal>true</literal></entry>
          <entry><literal>false</literal></entry>