Thread: Regular Expressions

Regular Expressions

From
"Mark Williams"
Date:

If I wanted to search for whole words in a field I would use something like:

 

Select * from mytable where myfield ~* ‘(\mtext1\M) | (\mtext2\M)’

 

This would find all instances of myfield containing either “text1” or “text2”.

 

I can’t figure out how to search myfield for all instances which contain “text1” AND “text2”.

 

In other words | is the OR operator. What is the AND operator. Tried + and whilst that executes, it doesn’t return matching fields.

 

Thanks

 

Mark

 

__

 

Re: Regular Expressions

From
"A. Sasaki"
Date:

‘(*\mtext1\M*\mtext2\M)|(*\mtext2\M*\mtext1\M)


Thanks,

-Andrew-

On Nov 4, 2018, at 9:10 AM, Mark Williams <markwillimas@gmail.com> wrote:

If I wanted to search for whole words in a field I would use something like:

 

Select * from mytable where myfield ~* ‘(\mtext1\M) | (\mtext2\M)’

 

This would find all instances of myfield containing either “text1” or “text2”.

 

I can’t figure out how to search myfield for all instances which contain “text1” AND “text2”.

 

In other words | is the OR operator. What is the AND operator. Tried + and whilst that executes, it doesn’t return matching fields.

 

Thanks

 

Mark

 

__

 

Re: Regular Expressions

From
"David G. Johnston"
Date:
On Sunday, November 4, 2018, Mark Williams <markwillimas@gmail.com> wrote: 

I can’t figure out how to search myfield for all instances which contain “text1” AND “text2”.

In other words | is the OR operator. What is the AND operator. Tried + and whilst that executes, it doesn’t return matching fields.

=====================

Myfield ~* ‘text1’ AND myfield ~* ‘text2’

There is no convenient concept of AND in the sense you want it in Regular Expressions and “+” has its own meaning of “one or more of the previous item”.   

David J.


RE: Regular Expressions

From
"Mark Williams"
Date:

Hi Andrew,

 

Thanks for the reply.

 

I tried the query, but it produced an error “invalid regular expression: quantifier operand invalid”.

 

Also, what would be the regular expression if you want to check whether all the words were in the field where you had say 10 words/phrases you wanted to check for?

 

 

__

 

From: A. Sasaki <asasaki@gmail.com>
Sent: 04 November 2018 19:30
To: Mark Williams <markwillimas@gmail.com>
Subject: Re: Regular Expressions

 

‘(*\mtext1\M*\mtext2\M)|(*\mtext2\M*\mtext1\M)’

Thanks,

 

-Andrew-


On Nov 4, 2018, at 9:10 AM, Mark Williams <markwillimas@gmail.com> wrote:

If I wanted to search for whole words in a field I would use something like:

 

Select * from mytable where myfield ~* ‘(\mtext1\M) | (\mtext2\M)’

 

This would find all instances of myfield containing either “text1” or “text2”.

 

I can’t figure out how to search myfield for all instances which contain “text1” AND “text2”.

 

In other words | is the OR operator. What is the AND operator. Tried + and whilst that executes, it doesn’t return matching fields.

 

Thanks

 

Mark

 

__

 

Re: Regular Expressions

From
"David G. Johnston"
Date:
On Sunday, November 4, 2018, Mark Williams <markwillimas@gmail.com> wrote: 

Also, what would be the regular expression if you want to check whether all the words were in the field where you had say 10 words/phrases you wanted to check for?

==============

Consider full text search instead.

Maybe “split_to_array” on spaces and then do something like:

ARRAY[‘term1’, ‘term2’]::text[] && split_to_array(field, ‘ ‘)

David J.


Re: Regular Expressions

From
Tom Lane
Date:
"Mark Williams" <markwillimas@gmail.com> writes:
> Also, what would be the regular expression if you want to check whether all the words were in the field where you had
say10 words/phrases you wanted to check for? 

As David said, regular expressions aren't really designed to do that.
Personally I'd do the AND at the SQL level, ie

myfield ~* '\mtext1\M' AND myfield ~* '\mtext2\M' AND ...

You might also take a look at the full text search machinery, which
is probably better suited to this task.

            regards, tom lane


RE: Regular Expressions

From
"Mark Williams"
Date:
Thanks all. Have implemented by way of "AND" for time being. Full text
search on my list of todos!

__

-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us> 
Sent: 04 November 2018 19:55
To: Mark Williams <markwillimas@gmail.com>
Cc: 'A. Sasaki' <asasaki@gmail.com>; pgsql-sql@lists.postgresql.org
Subject: Re: Regular Expressions

"Mark Williams" <markwillimas@gmail.com> writes:
> Also, what would be the regular expression if you want to check whether
all the words were in the field where you had say 10 words/phrases you
wanted to check for?

As David said, regular expressions aren't really designed to do that.
Personally I'd do the AND at the SQL level, ie

myfield ~* '\mtext1\M' AND myfield ~* '\mtext2\M' AND ...

You might also take a look at the full text search machinery, which is
probably better suited to this task.

            regards, tom lane



Re: Regular Expressions

From
"A. Sasaki"
Date:
I didn’t test it in a DB, sorry ’bout that.

I think others are on the right track here, using SQL for the AND functionality:

‘Select *
FROM [TABLE]
WHERE [regex search 1]
AND [regex search 2]
AND [regex search 3]
....’


Thanks,

-Andrew-

On Nov 4, 2018, at 9:43 AM, Mark Williams <markwillimas@gmail.com> wrote:

Hi Andrew,

 

Thanks for the reply.

 

I tried the query, but it produced an error “invalid regular expression: quantifier operand invalid”.

 

Also, what would be the regular expression if you want to check whether all the words were in the field where you had say 10 words/phrases you wanted to check for?

 

 

__

 

From: A. Sasaki <asasaki@gmail.com>
Sent: 04 November 2018 19:30
To: Mark Williams <markwillimas@gmail.com>
Subject: Re: Regular Expressions

 

‘(*\mtext1\M*\mtext2\M)|(*\mtext2\M*\mtext1\M)’

Thanks,

 

-Andrew-


On Nov 4, 2018, at 9:10 AM, Mark Williams <markwillimas@gmail.com> wrote:

If I wanted to search for whole words in a field I would use something like:

 

Select * from mytable where myfield ~* ‘(\mtext1\M) | (\mtext2\M)’

 

This would find all instances of myfield containing either “text1” or “text2”.

 

I can’t figure out how to search myfield for all instances which contain “text1” AND “text2”.

 

In other words | is the OR operator. What is the AND operator. Tried + and whilst that executes, it doesn’t return matching fields.

 

Thanks

 

Mark

 

__

 

RE: Regular Expressions

From
"Mark Williams"
Date:

No problem. That’s how I have now gone about it.

 

__

 

From: A. Sasaki <asasaki@gmail.com>
Sent: 04 November 2018 22:45
To: Mark Williams <markwillimas@gmail.com>
Cc: pgsql-sql@lists.postgresql.org
Subject: Re: Regular Expressions

 

I didn’t test it in a DB, sorry ’bout that.

 

I think others are on the right track here, using SQL for the AND functionality:

 

‘Select *

FROM [TABLE]

WHERE [regex search 1]

AND [regex search 2]

AND [regex search 3]
....’

Thanks,

 

-Andrew-


On Nov 4, 2018, at 9:43 AM, Mark Williams <markwillimas@gmail.com> wrote:

Hi Andrew,

 

Thanks for the reply.

 

I tried the query, but it produced an error “invalid regular expression: quantifier operand invalid”.

 

Also, what would be the regular expression if you want to check whether all the words were in the field where you had say 10 words/phrases you wanted to check for?

 

 

__

 

From: A. Sasaki <asasaki@gmail.com>
Sent: 04 November 2018 19:30
To: Mark Williams <markwillimas@gmail.com>
Subject: Re: Regular Expressions

 

‘(*\mtext1\M*\mtext2\M)|(*\mtext2\M*\mtext1\M)’

Thanks,

 

-Andrew-


On Nov 4, 2018, at 9:10 AM, Mark Williams <markwillimas@gmail.com> wrote:

If I wanted to search for whole words in a field I would use something like:

 

Select * from mytable where myfield ~* ‘(\mtext1\M) | (\mtext2\M)’

 

This would find all instances of myfield containing either “text1” or “text2”.

 

I can’t figure out how to search myfield for all instances which contain “text1” AND “text2”.

 

In other words | is the OR operator. What is the AND operator. Tried + and whilst that executes, it doesn’t return matching fields.

 

Thanks

 

Mark

 

__

 

Re: Regular Expressions

From
Andrew Gierth
Date:
>>>>> "Mark" == Mark Williams <markwillimas@gmail.com> writes:

 Mark> I can't figure out how to search myfield for all instances which
 Mark> contain "text1" AND "text2".

I should start by pointing out that (as mentioned by other people) using
regexps is not necessarily the best way to do this, especially not when
dealing with actual words which is what FTS exists for.

But a solution does exist (at least in pg and other regexp engines that
support lookahead assertions):

myfield ~* '^(?=.*\mtext1\M)(?=.*\mtext2\M)'

What this says is: match at the start of the string if (and only if)
both the lookahead assertions succeed; since neither assertion advances
the match, they will find the two specified words regardless of the
order in which they appear. (The trick of using | to search for both
possible orders works for 2 words, but gets unwieldy very quickly with
more; with the assertion method you can handle any number of words.)

-- 
Andrew (irc:RhodiumToad)