Home > mailing lists

Re: looking for a faster way to do that - Mailing list pgsql-general

From	Eduardo Morras
Subject	Re: looking for a faster way to do that
Date	September 26, 2011 14:02:12
Msg-id	4E5EF2B0006A4305@ Whole thread
In response to	Re: looking for a faster way to do that (hamann.w@t-online.de)
List	pgsql-general

Tree view

At 18:18 26/09/2011, you wrote:

>Eduardo Morras <nec556@retena.com> wrote:
>
> >>
> >> At 08:04 25/09/2011, hamann.w@t-online.de wrote:
> >>
> >> > >> select * from items where regexp_matches(items.code,'(ABC) (DE1)
> >> > >> (any_substring)')<>{};
> >> > >>
> >> >
> >> >Hi Eduardo,
> >> >
> >> >it is clear that scanning the table once with a list of matches will
> >> >outperform
> >> >rescanning the table for every string wanted. Now, my problem is
> >> >that the patterns are
> >> >dynamic as well. So if I could translate a table with one
> >> >column  and a few thousand rows
> >> >into something like
> >> >regexp_matches(code,'string1|string2|.....string2781')
> >> >would ideally be a performant query. Unfortunately I have no idea
> >> >how I could achieve this
> >> >transformation inside the database. Doing it externally fails,
> >> >because any single query cannot
> >> >be more than so few characters.
> >>
> >> You can create a plsql function and pass a setof text that do it.
> >> Sorry but instead saying you What Must You Type, WMYT(c), i prefer
> >> the How Should You Do way, HSYD(c). Note that you can get the same
> >> results using other approachs (f.ex. using FTS described in chapter 12)
> >>
> >> Check this topics:
> >>
> >> Function
> >>
> Creation  http://www.postgresql.org/docs/9.0/static/sql-createfunction.html
> >>
> >> Tutorial about Function
> >> Creation  http://www.adderpit.com/practical-postgresql/x10374.htm
> >>
>
>Hi,
>
>I tried the pl/sql approach to convert the contents of that patterns
>table into a regex.
>Results: 40 seconds runtime for 9500 candidates and 815 patterns
>718 seconds for the same set of 9500 candidates, but using 4000
>patterns instead.
>So it seems that I am reaching limits of pattern match

Perhaps calling the function twice with half the values go faster.
How do you call the function? EXECUTE or SELECT? If you use EXECUTE
then the prepared plan in a previous call is ignored and is usually
faster. Don't know if in your case it run faster but you can try it.

>As for the fulltext index (and the underlying tsquery): this is an
>exact match rather than prefix
>match, so I would need to know match patterns in advance in order to
>build the index
>
>I am thinking about that anyway (because ABC1234 likely should not
>match ABC123 pattern
>in my context), but I would sort of prefer a system where I can
>state the rules when I
>see the data set, rather than having to pre-create an index.
>
>Thanks for the tutorial link :)

It's the one i used time ago. A bit old but very good one.

>It seems that the responses on my post give all sorts of input that
>will help me on other
>tasks

>Regards
>Wolfgang Hamann

pgsql-general by date:

From: "Edson Carlos Ericksson Richter"
Date: 26 September 2011, 14:00:57
Subject: RES: Does postgresql 9.0.4 use index on PREFIX%SUFFIX queries?

From: Alan Hodgson
Date: 26 September 2011, 14:07:53
Subject: Re: : PostgreSQL Online Backup

Re: looking for a faster way to do that - Mailing list pgsql-general

Previous

Next