Thread: Lower or Upper case for F.33. pg_trgm
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/14/pgtrgm.html
Description:

Hey guys,

I have a question regarding the trigram algorithm and I can not find any information about it in your documentation:
Do you distinguish between lower and uppercase? Or do you consider all words in lowercase?

Happy to get a short feedback from you,

Greetings, Marc
> On 16 Aug 2022, at 12:17, PG Doc comments form <noreply@postgresql.org> wrote:

> I have a question regarding the trigram algorithm and I can not find any
> information about it in your documentation:

Maybe we should add something about this?

> Do you distinguish between lower and uppercase? Or do you consider all words
> in lowercase?

There is support for compiling pg_trgm to be case-sensitive, but by default it is
case-insensitive.

    # SELECT word_similarity('word', 'WORD');
     word_similarity
    -----------------
                   1
    (1 row)

> Happy to get a short feedback from you,

I would recommend the pgsql-general mailing list, as that will be a safer way to get
general questions answered.

--
Daniel Gustafsson https://vmware.com/
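The default behaviour described in this reply can be modelled roughly as follows. This is a simplified Python sketch and not pg_trgm's C implementation. The function names and the `case_sensitive` flag are hypothetical stand-ins (the real switch is a compile-time option), Python's `str.lower()` stands in for the server's locale-aware lowercasing, and the sketch models the plain set-based `similarity()` measure rather than `word_similarity()`.

```python
import re


def trigrams(text, case_sensitive=False):
    """Return the set of trigrams of text, in the style pg_trgm documents.

    Assumption: words are runs of alphanumeric characters, and each word is
    padded with two leading spaces and one trailing space before splitting.
    """
    if not case_sensitive:
        # Default build: fold case before extracting trigrams.
        text = text.lower()
    grams = set()
    for word in re.findall(r"[^\W_]+", text):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b, case_sensitive=False):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta = trigrams(a, case_sensitive)
    tb = trigrams(b, case_sensitive)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(similarity("word", "WORD"))                       # 1.0
print(similarity("word", "WORD", case_sensitive=True))  # 0.0
```

With case folding, 'word' and 'WORD' produce the same trigram set, so the score is 1. Without it, the two sets share nothing and the score drops to 0.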
On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>> On 16 Aug 2022, at 12:17, PG Doc comments form <noreply@postgresql.org> wrote:
>
>> I have a question regarding the trigram algorithm and I can not find any
>> information about it in your documentation:
>
> Maybe we should add something about this?

Yeah, it's a bit strange that none of the following strings yield any info on that page: 'case', 'sensitiv', 'upper', 'lower', and that there is no mention of the ~ versus ~* difference.

Maybe worth giving (already in pgtrgm.html) the simple hint:
  ~ is case-sensitive
  ~* is case-insensitive

In any case a link to functions-matching.html seems indicated.

Erik Rijkers

>
>> Do you distinguish between lower and uppercase? Or do you consider all words
>> in lowercase?
>
> There is support for compiling pg_trgm to be case-sensitive, but by default it is
> case-insensitive.
>
>     # SELECT word_similarity('word', 'WORD');
>      word_similarity
>     -----------------
>                    1
>     (1 row)
>
>> Happy to get a short feedback from you,
>
> I would recommend the pgsql-general mailing list, as that will be a safer way to get
> general questions answered.
>
> --
> Daniel Gustafsson https://vmware.com/
> On 16 Aug 2022, at 12:54, Erik Rijkers <er@xs4all.nl> wrote:
>
> On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>>> On 16 Aug 2022, at 12:17, PG Doc comments form <noreply@postgresql.org> wrote:
>>> I have a question regarding the trigram algorithm and I can not find any
>>> information about it in your documentation:
>> Maybe we should add something about this?
>
> Yeah, it's a bit strange that none of the following strings yield any info on that page: 'case', 'sensitiv', 'upper', 'lower', and that there is no mention of the ~ versus ~* difference.
>
> Maybe worth giving (already in pgtrgm.html) the simple hint:
>   ~ is case-sensitive
>   ~* is case-insensitive
>
> In any case a link to functions-matching.html seems indicated.

Yeah, I think there is room for improvement here. Are you up for drafting a
patch for this?

--
Daniel Gustafsson https://vmware.com/
Thanks for your fast response.
Is this a question for me? I am fine with a short hint regarding the default.
A link to another documentation page is also fine.
On Tue, 16 Aug 2022 at 13:46, Daniel Gustafsson <daniel@yesql.se> wrote:
>> On 16 Aug 2022, at 12:54, Erik Rijkers <er@xs4all.nl> wrote:
>>
>> On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>>>> On 16 Aug 2022, at 12:17, PG Doc comments form <noreply@postgresql.org> wrote:
>>>> I have a question regarding the trigram algorithm and I can not find any
>>>> information about it in your documentation:
>>> Maybe we should add something about this?
>>
>> Yeah, it's a bit strange that none of the following strings yield any info on that page: 'case', 'sensitiv', 'upper', 'lower', and that there is no mention of the ~ versus ~* difference.
>>
>> Maybe worth giving (already in pgtrgm.html) the simple hint:
>>   ~ is case-sensitive
>>   ~* is case-insensitive
>>
>> In any case a link to functions-matching.html seems indicated.
> Yeah, I think there is room for improvement here. Are you up for drafting a
> patch for this?
> --
> Daniel Gustafsson https://vmware.com/
On 16-08-2022 at 13:46, Daniel Gustafsson wrote:
>> On 16 Aug 2022, at 12:54, Erik Rijkers <er@xs4all.nl> wrote:
>>
>> On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>>>> On 16 Aug 2022, at 12:17, PG Doc comments form <noreply@postgresql.org> wrote:
>>>> I have a question regarding the trigram algorithm and I can not find any
>>>> information about it in your documentation:
>>> Maybe we should add something about this?
>>
>> Yeah, it's a bit strange that none of the following strings yield any info on that page: 'case', 'sensitiv', 'upper', 'lower', and that there is no mention of the ~ versus ~* difference.
>>
>> Maybe worth giving (already in pgtrgm.html) the simple hint:
>>   ~ is case-sensitive
>>   ~* is case-insensitive
>>
>> In any case a link to functions-matching.html seems indicated.
>
> Yeah, I think there is room for improvement here. Are you up for drafting a
> patch for this?
>

How is this?

(bluntly stating 'similarity comparisons are case-insensitive' - although I'm not really sure..)

Erik

> --
> Daniel Gustafsson https://vmware.com/
Attachment
Erik Rijkers <er@xs4all.nl> writes:
> (bluntly stating 'similarity comparisons are case-insensitive' -
> although I'm not really sure..)

Perhaps like "similarity comparisons are case-insensitive in a
standard build of pg_trgm", if you want to nod to the existence
of a compile option without going into detail.

regards, tom lane
Sounds good to me.
On Tue, 16 Aug 2022 at 15:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Erik Rijkers <er@xs4all.nl> writes:
>> (bluntly stating 'similarity comparisons are case-insensitive' -
>> although I'm not really sure..)
> Perhaps like "similarity comparisons are case-insensitive in a
> standard build of pg_trgm", if you want to nod to the existence
> of a compile option without going into detail.
> regards, tom lane
> On 16 Aug 2022, at 15:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Erik Rijkers <er@xs4all.nl> writes:
>> (bluntly stating 'similarity comparisons are case-insensitive' -
>> although I'm not really sure..)
>
> Perhaps like "similarity comparisons are case-insensitive in a
> standard build of pg_trgm", if you want to nod to the existence
> of a compile option without going into detail.

Looking at this I'm leaning towards paring down the diff posted upthread with
pretty much this, I think that will provide value while avoiding
confusion.

As a related side note, there are four instances of "case insensitive{ly}" in
the docs with all other instances using "case-insensitive{ly}". I'm inclined
to fix those four to use a dash while at it to be consistent across all pages.

--
Daniel Gustafsson https://vmware.com/
Attachment
Daniel Gustafsson <daniel@yesql.se> writes:
> Looking at this I'm leaning towards paring down the diff posted upthread with
> pretty much this, I think that will provide value while avoiding
> confusion.

WFM.

> As a related side note, there are four instances of "case insensitive{ly}" in
> the docs with all other instances using "case-insensitive{ly}". I'm inclined
> to fix those four to use a dash while at it to be consistent across all pages.

+1

regards, tom lane