Thread: Fault with initcap

Fault with initcap

From
Shaozhong SHI
Date:
I tried initcap and found a major problem with it.

Initcap of notemachine is NoteMachine.

Initcap of Sainsbury's Bank is Sainsbury'S bank.

This is not expected.

Anyway to get around this problem?

Regards,

David

Aw: Fault with initcap

From
Karsten Hilbert
Date:
> I tried initcap and found a major problem with it.
> Initcap of notemachine is NoteMachine.
> Initcap of Sainsbury's Bank is Sainsbury'S bank.
> This is not expected.
>
> Anyway to get around this problem?
 
Sure. Step one: define "expected"

Karsten



Re: Fault with initcap

From
Rob Sargent
Date:

> On Oct 12, 2021, at 10:30 AM, Shaozhong SHI <shishaozhong@gmail.com> wrote:
> 
> I tried initcap and found a major problem with it.
> 
> Initcap of notemachine is NoteMachine.
My version 12 system gives Notemachine
> 
> Initcap of Sainsbury's Bank is Sainsbury'S bank.
> 
> This is not expected.
> 
> Anyway to get around this problem?
> 
> Regards,
> 
> David




Re: Fault with initcap

From
Steve Midgley
Date:


On Tue, Oct 12, 2021 at 9:54 AM Rob Sargent <robjsargent@gmail.com> wrote:


> On Oct 12, 2021, at 10:30 AM, Shaozhong SHI <shishaozhong@gmail.com> wrote:
>
> I tried initcap and found a major problem with it.
>
> Initcap of notemachine is NoteMachine.
My version 12 system gives Notemachine
>
> Initcap of Sainsbury's Bank is Sainsbury'S bank.
>
> This is not expected.
>
> Anyway to get around this problem?
>


Pg v.13 doesn't have a problem with notemachine either, but does seem to screw up Sainsbury's Bank as Sainsbury'S bank.


The only way I could think of fixing this would be to swap out the apostrophe with an alpha string not present in the string, and then swap back when you're done (see example above -- I'm not recommending using "xyzxyz" - that's just an example. And ideally you'd search for your placeholder string first to be sure it doesn't exist, or escape it).

Or just use a regexp to solve your casing problem -- maybe break up your string by whitespace into array elements, uppercase the first letter of each array and the concat the array elements back?

Steve
 

Re: Fault with initcap

From
Metin Ulusinan
Date:
I tried both on PostgreSQL 11.13 and yes "Sainsbury's bank" 's "S" is buggy, but notemachine is not.

Will anyone try @ 14?


nsversion
NotemachineSainsbury'S BankPostgreSQL 11.13 (Debian 11.13-0+deb10u1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit


On Tue, Oct 12, 2021 at 7:30 PM Shaozhong SHI <shishaozhong@gmail.com> wrote:
I tried initcap and found a major problem with it.

Initcap of notemachine is NoteMachine.

Initcap of Sainsbury's Bank is Sainsbury'S bank.

This is not expected.

Anyway to get around this problem?

Regards,

David

Re: Fault with initcap

From
Pavel Stehule
Date:


st 13. 10. 2021 v 8:34 odesílatel Metin Ulusinan <metin.ulusinan@ssicilian.net> napsal:
I tried both on PostgreSQL 11.13 and yes "Sainsbury's bank" 's "S" is buggy, but notemachine is not.

Will anyone try @ 14?

It cannot to work everywhere - the word is defined (for this function) as sequence of alpha numeric chars. "'" is not alpha numeric char.

Theoretically it can work if you use ICU collation provider, if it is supported by ICU.

Regards

Pavel

Re: Fault with initcap

From
Shaozhong SHI
Date:


On Wed, 13 Oct 2021 at 07:33, Metin Ulusinan <metin.ulusinan@ssicilian.net> wrote:
I tried both on PostgreSQL 11.13 and yes "Sainsbury's bank" 's "S" is buggy, but notemachine is not.

Will anyone try @ 14?


nsversion
NotemachineSainsbury'S BankPostgreSQL 11.13 (Debian 11.13-0+deb10u1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit


Hello, Metin Ulusinan,

I can confirm that you pin-out the issue.


Can this be adapted to capitalise the first letter of each work in a string?

Regards,

David 

Re: Fault with initcap

From
Metin Ulusinan
Date:
Hi,
Yes, this can be adaptable, and i did simple version of this. 
It just split text words with find spaces, capitalise each 
pieces(word) and merge together again.
This is a quick and simple work. You can develop over it about your needs.

Try that and tell us about result.


CREATE OR REPLACE FUNCTION initcap2(text)
 RETURNS text
 LANGUAGE plpgsql
AS $function$
DECLARE
sentence TEXT := '';
word_array TEXT[];
word TEXT;
word_out TEXT;
BEGIN
sentence := $1;

IF sentence is NULL THEN
RETURN NULL;
END IF;

word_array := regexp_split_to_array($1, E'\\s+');
FOREACH word IN ARRAY word_array
LOOP
word_out := upper(left(word, 1)) || lower(substring(word, 2));
sentence := regexp_replace(sentence, word, word_out);
END LOOP;

RETURN trim(sentence);
END;
$function$
;



On Wed, Oct 13, 2021 at 11:48 AM Shaozhong SHI <shishaozhong@gmail.com> wrote:


On Wed, 13 Oct 2021 at 07:33, Metin Ulusinan <metin.ulusinan@ssicilian.net> wrote:
I tried both on PostgreSQL 11.13 and yes "Sainsbury's bank" 's "S" is buggy, but notemachine is not.

Will anyone try @ 14?


nsversion
NotemachineSainsbury'S BankPostgreSQL 11.13 (Debian 11.13-0+deb10u1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit


Hello, Metin Ulusinan,

I can confirm that you pin-out the issue.


Can this be adapted to capitalise the first letter of each work in a string?

Regards,

David 

RE: Fault with initcap

From
"Jain, Ankit"
Date:

This was reported in PG 8.1.4 15 years ago - https://pgsql-bugs.postgresql.narkive.com/R7HmTKef/bug-2579-initcap-should-not-capitalize-letter-after-apostrophe

 

From: Metin Ulusinan <metin.ulusinan@ssicilian.net>
Sent: Wednesday, October 13, 2021 5:39 AM
To: Shaozhong SHI <shishaozhong@gmail.com>
Cc: pgsql-sql <pgsql-sql@lists.postgresql.org>
Subject: Re: Fault with initcap

 

CAUTION: This email originated from outside of Snap-on. Do not click on links or open attachments unless you have validated the sender, even if it is a known contact. Contact the sender by phone to validate the contents.

Hi,

Yes, this can be adaptable, and i did simple version of this. 

It just split text words with find spaces, capitalise each 

pieces(word) and merge together again.

This is a quick and simple work. You can develop over it about your needs.

 

Try that and tell us about result.

 

 

CREATE OR REPLACE FUNCTION initcap2(text)
 RETURNS text
 LANGUAGE plpgsql
AS $function$
DECLARE
sentence TEXT := '';
word_array TEXT[];
word TEXT;
word_out TEXT;
BEGIN
sentence := $1;

IF sentence is NULL THEN
RETURN NULL;
END IF;

word_array := regexp_split_to_array($1, E'\\s+');
FOREACH word IN ARRAY word_array
LOOP
word_out := upper(left(word, 1)) || lower(substring(word, 2));
sentence := regexp_replace(sentence, word, word_out);
END LOOP;

RETURN trim(sentence);
END;
$function$
;

 

 

 

On Wed, Oct 13, 2021 at 11:48 AM Shaozhong SHI <shishaozhong@gmail.com> wrote:

 

 

On Wed, 13 Oct 2021 at 07:33, Metin Ulusinan <metin.ulusinan@ssicilian.net> wrote:

I tried both on PostgreSQL 11.13 and yes "Sainsbury's bank" 's "S" is buggy, but notemachine is not.

 

Will anyone try @ 14?

 

 

n

s

version

Notemachine

Sainsbury'S Bank

PostgreSQL 11.13 (Debian 11.13-0+deb10u1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit

 

 

Hello, Metin Ulusinan,

 

I can confirm that you pin-out the issue.

 

 

Can this be adapted to capitalise the first letter of each work in a string?

 

Regards,

 

David 

Re: Fault with initcap

From
Shaozhong SHI
Date:


On Wed, 13 Oct 2021 at 10:39, Metin Ulusinan <metin.ulusinan@ssicilian.net> wrote:
Hi,
Yes, this can be adaptable, and i did simple version of this. 
It just split text words with find spaces, capitalise each 
pieces(word) and merge together again.
This is a quick and simple work. You can develop over it about your needs.

Try that and tell us about result.


CREATE OR REPLACE FUNCTION initcap2(text)
 RETURNS text
 LANGUAGE plpgsql
AS $function$
DECLARE
sentence TEXT := '';
word_array TEXT[];
word TEXT;
word_out TEXT;
BEGIN
sentence := $1;

IF sentence is NULL THEN
RETURN NULL;
END IF;

word_array := regexp_split_to_array($1, E'\\s+');
FOREACH word IN ARRAY word_array
LOOP
word_out := upper(left(word, 1)) || lower(substring(word, 2));
sentence := regexp_replace(sentence, word, word_out);
END LOOP;

RETURN trim(sentence);
END;
$function$
;

Hello, Metin,

See the following testing response.

ERROR: invalid regular expression: parentheses () not balanced CONTEXT: PL/pgSQL function testinitcap2(text) line 18 at assignment 
SQL state: 2201B

Regards,

David 

Re: Fault with initcap

From
Metin Ulusinan
Date:
Problem is about with parantesheses in text 
and real problem is non alphanumeric chars.
Different samples will cause different problems. 
We don't know about your data and samples.

But i changed the code again from original one. 
I deleted  " ' " char in regexp function code.
It seems better than before.



CREATE OR REPLACE FUNCTION public.initcap2(text)
 RETURNS text
 LANGUAGE plpgsql
AS $function$
DECLARE
sentence TEXT := '';
word_array TEXT[];
word TEXT;
word_out TEXT;
BEGIN
sentence := $1;

IF sentence is NULL THEN
RETURN NULL;
END IF;

word_array := regexp_split_to_array($1, E'[\\[\\]\^\$\.\|\?\*\+\(\)\\~`\!@#%&\\-\\_+={}"<>:;, ]');
FOREACH word IN ARRAY word_array
LOOP
word_out := upper(left(word, 1)) || lower(substring(word, 2));
sentence := replace(sentence, word, word_out);
END LOOP;

RETURN trim(sentence);
END;
$function$
;


On Thu, Oct 14, 2021 at 11:55 AM Shaozhong SHI <shishaozhong@gmail.com> wrote:


On Wed, 13 Oct 2021 at 10:39, Metin Ulusinan <metin.ulusinan@ssicilian.net> wrote:
Hi,
Yes, this can be adaptable, and i did simple version of this. 
It just split text words with find spaces, capitalise each 
pieces(word) and merge together again.
This is a quick and simple work. You can develop over it about your needs.

Try that and tell us about result.


CREATE OR REPLACE FUNCTION initcap2(text)
 RETURNS text
 LANGUAGE plpgsql
AS $function$
DECLARE
sentence TEXT := '';
word_array TEXT[];
word TEXT;
word_out TEXT;
BEGIN
sentence := $1;

IF sentence is NULL THEN
RETURN NULL;
END IF;

word_array := regexp_split_to_array($1, E'\\s+');
FOREACH word IN ARRAY word_array
LOOP
word_out := upper(left(word, 1)) || lower(substring(word, 2));
sentence := regexp_replace(sentence, word, word_out);
END LOOP;

RETURN trim(sentence);
END;
$function$
;

Hello, Metin,

See the following testing response.

ERROR: invalid regular expression: parentheses () not balanced CONTEXT: PL/pgSQL function testinitcap2(text) line 18 at assignment 
SQL state: 2201B

Regards,

David 

Re: Fault with initcap

From
Shaozhong SHI
Date:
It is better to reveal the content of initcap.

But, \sf+ initcap and etc. did not reveal.

Regards,

David

On Thu, 14 Oct 2021 at 14:42, Metin Ulusinan <metin.ulusinan@ssicilian.net> wrote:
Problem is about with parantesheses in text 
and real problem is non alphanumeric chars.
Different samples will cause different problems. 
We don't know about your data and samples.

But i changed the code again from original one. 
I deleted  " ' " char in regexp function code.
It seems better than before.



CREATE OR REPLACE FUNCTION public.initcap2(text)
 RETURNS text
 LANGUAGE plpgsql
AS $function$
DECLARE
sentence TEXT := '';
word_array TEXT[];
word TEXT;
word_out TEXT;
BEGIN
sentence := $1;

IF sentence is NULL THEN
RETURN NULL;
END IF;

word_array := regexp_split_to_array($1, E'[\\[\\]\^\$\.\|\?\*\+\(\)\\~`\!@#%&\\-\\_+={}"<>:;, ]');
FOREACH word IN ARRAY word_array
LOOP
word_out := upper(left(word, 1)) || lower(substring(word, 2));
sentence := replace(sentence, word, word_out);
END LOOP;

RETURN trim(sentence);
END;
$function$
;


On Thu, Oct 14, 2021 at 11:55 AM Shaozhong SHI <shishaozhong@gmail.com> wrote:


On Wed, 13 Oct 2021 at 10:39, Metin Ulusinan <metin.ulusinan@ssicilian.net> wrote:
Hi,
Yes, this can be adaptable, and i did simple version of this. 
It just split text words with find spaces, capitalise each 
pieces(word) and merge together again.
This is a quick and simple work. You can develop over it about your needs.

Try that and tell us about result.


CREATE OR REPLACE FUNCTION initcap2(text)
 RETURNS text
 LANGUAGE plpgsql
AS $function$
DECLARE
sentence TEXT := '';
word_array TEXT[];
word TEXT;
word_out TEXT;
BEGIN
sentence := $1;

IF sentence is NULL THEN
RETURN NULL;
END IF;

word_array := regexp_split_to_array($1, E'\\s+');
FOREACH word IN ARRAY word_array
LOOP
word_out := upper(left(word, 1)) || lower(substring(word, 2));
sentence := regexp_replace(sentence, word, word_out);
END LOOP;

RETURN trim(sentence);
END;
$function$
;

Hello, Metin,

See the following testing response.

ERROR: invalid regular expression: parentheses () not balanced CONTEXT: PL/pgSQL function testinitcap2(text) line 18 at assignment 
SQL state: 2201B

Regards,

David