Re: Question about POSIX Regular Expressions performance on large dataset. - Mailing list pgsql-sql

From Scott Marlowe
Subject Re: Question about POSIX Regular Expressions performance on large dataset.
Date
Msg-id AANLkTimaiPKsy8xgyseSxD1qn-en7ciDP=U6kniNeFwH@mail.gmail.com
In response to Question about POSIX Regular Expressions performance on large dataset.  (Jose Ildefonso Camargo Tolosa <ildefonso.camargo@gmail.com>)
List pgsql-sql
On Tue, Aug 17, 2010 at 8:21 PM, Jose Ildefonso Camargo Tolosa
<ildefonso.camargo@gmail.com> wrote:
> Hi!
>
> I'm analyzing the possibility of using PostgreSQL to store a huge
> amount of data (around 1000M records or so), and even though the
> records are short (each has just a timestamp and a string of fewer
> than 128 characters), the strings will be matched against POSIX
> Regular Expressions (different regexps, and maybe complex ones).
>
> Because I don't have a system large enough to test this here, I have
> to ask you (I may borrow a medium-size server, but it would take a
> week or more, so I decided to ask here first).  How is the performance
> of Regexp matching in PostgreSQL?  Can it use indexes? My guess is:
> no, because I don't see a way of generally indexing to match regexp :(
> , so, tablescans for this huge dataset.....
>
> What do you think of this?

Yes, it can index such things, but it has to index them in a fixed
way, i.e. you can create functional indexes with pre-built regexes.
But for ones where the values change each time, you're correct, no
indexes will be used.
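
A rough sketch of what an index on a pre-built regex might look like.
The table name, column names, index names and the pattern itself are
all made up for illustration; only the CREATE INDEX forms are standard
PostgreSQL. Whether the planner actually picks such an index depends on
how selective the pattern is, so treat this as a starting point rather
than a recipe:

```sql
-- Hypothetical table matching the description: a timestamp plus a
-- short string.
CREATE TABLE log_entries (
    ts  timestamptz  NOT NULL,
    msg varchar(128) NOT NULL
);

-- Expression ("functional") index on the boolean result of one fixed
-- regex. The pattern is an assumed example.
CREATE INDEX log_entries_is_error_idx
    ON log_entries ((msg ~ '^ERROR [0-9]+'));

-- Alternative: a partial index that contains only the matching rows,
-- ordered by timestamp.
CREATE INDEX log_entries_error_ts_idx
    ON log_entries (ts)
    WHERE msg ~ '^ERROR [0-9]+';

-- The planner can consider these indexes only when the query repeats
-- the same expression that was indexed.
SELECT ts, msg
FROM log_entries
WHERE msg ~ '^ERROR [0-9]+'
  AND ts >= now() - interval '1 day';

-- A different pattern matches neither index above, so every row has
-- to be checked against the regex.
SELECT count(*)
FROM log_entries
WHERE msg ~ 'timeout after [0-9]+ms';
```

Each fixed pattern needs its own index, so this only pays off when a
small set of regexes is known in advance.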

Could full text searching be used instead?

