Re: Indexing and regular expressions - Mailing list pgsql-hackers

From Kjartan Бsюуrsson
Subject Re: Indexing and regular expressions
Date
Msg-id 148154092062.20020407134908@student.his.se
Whole thread Raw
In response to Re: Indexing and regular expressions  (Oleg Bartunov <oleg@sai.msu.su>)
Responses Re: Indexing and regular expressions  (Neil Conway <nconway@klamath.dyndns.org>)
List pgsql-hackers
Thank you for your reply.

No, I am not looking for a fuzzy match. I am simply wondering if there
are some methods available that can speed up joining of tables when
the join is done with a regular expression operator (one table
contains regular expression patterns, and the other strings that
should be matched against the pattern).

If no indexes are used, a full table scan of the string table is
needed if we want to select all strings that matches a given pattern.

My question is if there are any indexing methods available that can
be used to prone the search space in the string table when using
the regular expression operator.

I hope I made myself more clear now. Best regards,
Kjartan


Sunday, April 7, 2002, 12:54:27 PM, you wrote:

> On Sun, 7 Apr 2002, [ISO-8859-1] Kjartan аsЧСrsson wrote:

>> Is there any indexing technique available I can use when joining tables
>> with a regular expression pattern in pgsql?
>>
>> I know one method for indexing strings that will be matched with regular
>> expression patterns, and that is using so called k-gram indexes.
>> Indexing the string "kjartan" with k-gram index where k = 3 would
>> create "kja", "jar", "art", "rta", "tan" as an index. Ofcourse it is hard to

> Usually, k-grams technique is used to match patterns with errors and
> 3-grams produce "__k", "_kj", "kja", "jar", "art", "rta", "ta_", "a__"
> where leading and trailing spaces are used to compensate 'boundary' effect.

> But I dont' quite understand your question. Are you looking for fuzzy match ?
> If so, take a look on contrib modules.

>> decide the size of k and I'm sure in many cases mulitple k values might
>> be needed, depending on the situation.
>>
>> I have not done any major survey of available techniques, but I was
>> hoping I could get some pointers here.
>>
>> I assume pgsql always uses nested loop join when joining relations which are
>> joined with regular expression pattern?
>>

>         Regards,
>                 Oleg
> _____________________________________________________________
> Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> Sternberg Astronomical Institute, Moscow University (Russia)
> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(095)939-16-83, +007(095)939-23-83


> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org




pgsql-hackers by date:

Previous
From: "Hiroshi Inoue"
Date:
Subject: Re: What's the CURRENT schema ?
Next
From: Gavin Sherry
Date:
Subject: Re: Suggestion for optimization