Re: Counting the number of repeated phrases in a column - Mailing list pgsql-general

From Rob Sargent
Subject Re: Counting the number of repeated phrases in a column
Msg-id 7f204b3c-3224-f8cb-a841-879f57ebf120@gmail.com
In response to Re: Counting the number of repeated phrases in a column  (Shaozhong SHI <shishaozhong@gmail.com>)
List pgsql-general
On 1/26/22 13:35, Shaozhong SHI wrote:


On Tue, 25 Jan 2022 at 17:10, Shaozhong SHI <shishaozhong@gmail.com> wrote:
Standard Postgres lacks a function to do the following:

It is easy to count the number of occurrences of words, but it is rather difficult to count the number of occurrences of phrases.

For instance:

A cell of value: 'Hello World' means 1 occurrence of a phrase.

A cell of value: 'Hello World World Hello' means no occurrence of any repeated phrase.

But a cell of value: 'Hello World World Hello Hello World' means 2 occurrences of 'Hello World'.

'The City of London, London' also has no occurrences of any repeated phrase.

Does anyone have such a function to count the number of occurrences of any repeated phrases?

Regards,

David
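
The examples above can be read as "count every run of two or more consecutive words that appears more than once in the cell". A minimal sketch of that reading, in Python rather than SQL, just to pin down the semantics. The function name `repeated_phrases` and the `min_words` parameter are made up for illustration, and splitting on whitespace (so 'London,' and 'London' are different words) is an assumption, not something stated in the message.

```python
from collections import Counter

def repeated_phrases(text, min_words=2):
    """Return {phrase: count} for every run of at least `min_words`
    consecutive words that appears more than once in `text`."""
    words = text.split()  # assumption: words are whitespace-separated
    counts = Counter()
    # Count every word n-gram, from min_words up to the full length.
    for n in range(min_words, len(words) + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    # Keep only the phrases seen at least twice.
    return {phrase: c for phrase, c in counts.items() if c > 1}

print(repeated_phrases("Hello World"))                          # {}
print(repeated_phrases("Hello World World Hello"))              # {}
print(repeated_phrases("Hello World World Hello Hello World"))  # {'Hello World': 2}
print(repeated_phrases("The City of London, London"))           # {}
```

Under this reading a lone 'Hello World' yields an empty result, since the phrase occurs only once. Overlapping occurrences are counted separately, which may or may not be what is wanted.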

Hi, All Friends,

Whatever. Can we try to build a regex for 'The City of London London Great London UK '?

It could be something like '[\w\s]+[\s-]+[a-z]+[\s-][\s\w]+'. The [\s-]+[a-z]+[\s-] part caters for people who write 'City of London' as 'City-of-London'.

Regards,

David
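
To see what the proposed pattern accepts, here is a small check using Python's `re` module as a stand-in for the Postgres regex engine (an assumption; the two engines agree on `\w`, `\s` and bracket expressions for these inputs as far as I know). The helper name `matches` is hypothetical.

```python
import re

# Pattern copied verbatim from the message above.
PATTERN = re.compile(r"[\w\s]+[\s-]+[a-z]+[\s-][\s\w]+")

def matches(text):
    """True if the pattern matches the whole of `text`."""
    return PATTERN.fullmatch(text) is not None

print(matches("The City of London London Great London UK "))   # True
print(matches("The City-of-London London Great London UK "))   # True
print(matches("The City of Liverpool"))                         # True
print(matches("London London"))                                 # False
```

The pattern accepts any string that has a lowercase word set off by spaces or hyphens, whether or not anything repeats, so on its own it tests the shape of the text and does not count phrases.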
Do you really want "The City of", by itself, to be one of the detected phrases? For example, in 'The City of London London Great London UK The City of Liverpool'.
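
A quick way to see the issue: a rough Python sketch that counts repeated word runs and keeps only the longest ones. The name `maximal_repeated_phrases`, the whitespace splitting and the "drop anything contained in a longer repeated phrase" rule are all assumptions made for illustration, not a proposed definition.

```python
from collections import Counter

def maximal_repeated_phrases(text, min_words=2):
    """Repeated word runs that are not part of a longer repeated run."""
    words = text.split()  # assumption: whitespace-separated words
    counts = Counter(
        " ".join(words[i:i + n])
        for n in range(min_words, len(words) + 1)
        for i in range(len(words) - n + 1)
    )
    repeated = {p: c for p, c in counts.items() if c > 1}

    def inside(short, long):
        # Pad with spaces so only whole-word containment counts.
        return short != long and f" {short} " in f" {long} "

    return {p: c for p, c in repeated.items()
            if not any(inside(p, q) for q in repeated)}

text = "The City of London London Great London UK The City of Liverpool"
print(maximal_repeated_phrases(text))  # {'The City of': 2}
```

So a purely positional definition of "phrase" does report 'The City of' here, and any rule that excludes it has to be stated explicitly.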
