Re: Regex performance issue - Mailing list pgsql-performance

From Alexandru Coseru
Subject Re: Regex performance issue
Date
Msg-id 022c01c7165f$17f49e80$3cb16956@alex
Whole thread Raw
In response to Re: Regex performance issue  ("Dave Dutcher" <dave@tridecap.com>)
Responses Re: Regex performance issue  ("Heikki Linnakangas" <heikki@enterprisedb.com>)
List pgsql-performance
Hello..

I cannot use the first advice , because i'm not aware of the prefix length
in the database...
This is why i'm ordering after length(prefix)..


On the 2nd one , i'm not sure that i can follow you..

Regards
    Alex
----- Original Message -----
From: "Heikki Linnakangas" <heikki@enterprisedb.com>
To: "Alexandru Coseru" <alexandru.coseru@totaltelecom.ro>
Cc: "Dave Dutcher" <dave@tridecap.com>; <pgsql-performance@postgresql.org>
Sent: Sunday, December 03, 2006 12:04 AM
Subject: Re: [PERFORM] Regex performance issue


> Alexandru Coseru wrote:
>> I cannot use LIKE , because the order of the match is reversed.
>> The prefix column is containing telephone destinations.
>> IE:    ^001  - US  , ^0039 Italy , etc..
>
> Maybe you could create a functional index on substr(<minimum length of
> prefix>)? It might restrict the result set prior to applying the regex
> just enough to make the performance acceptable.
>
>> asterisk=> select * from destlist LIMIT 10;
>> id | id_ent | dir |   prefix   |   country   |      network       | tip
>> ----+--------+-----+------------+-------------+--------------------+-----
>>  1 |     -1 |   0 | (^0093)    | AFGHANISTAN | AFGHANISTAN        |   6
>>  2 |     -1 |   0 | (^00937)   | AFGHANISTAN | AFGHANISTAN Mobile |   5
>>  3 |     -1 |   0 | (^00355)   | ALBANIA     | ALBANIA            |   6
>>  4 |     -1 |   0 | (^0035538) | ALBANIA     | ALBANIA Mobile     |   5
>>  5 |     -1 |   0 | (^0035568) | ALBANIA     | ALBANIA Mobile     |   5
>>  6 |     -1 |   0 | (^0035569) | ALBANIA     | ALBANIA Mobile     |   5
>>  7 |     -1 |   0 | (^00213)   | ALGERIA     | ALGERIA            |   6
>>  8 |     -1 |   0 | (^0021361) | ALGERIA     | ALGERIA Mobile     |   5
>>  9 |     -1 |   0 | (^0021362) | ALGERIA     | ALGERIA Mobile     |   5
>> 10 |     -1 |   0 | (^0021363) | ALGERIA     | ALGERIA Mobile     |   5
>>
>> Now , I have to match a dialednumber   (let's say   00213618833) and find
>> it's destination...(It's algeria mobile).
>> I tried to make with a query of using LIKE , but i was not able to get
>> something..
>
> Another idea would be to add some extra rows so that you could use normal
> inequality searches. For example, let's take the Albanian rows:
>
>   3 |     -1 |   0 | 00355
>   4 |     -1 |   0 | 0035538
> * 3 |     -1 |   0 | 0035539
>   5 |     -1 |   0 | 0035568
>   6 |     -1 |   0 | 0035569
> * 3 |     -1 |   0 | 0035570
>
> Now you can do "SELECT * FROM destlist WHERE ? >= prefix ORDER BY prefix
> LIMIT 1".
>
> --
>   Heikki Linnakangas
>   EnterpriseDB   http://www.enterprisedb.com
>
>
>
> --
> No virus found in this incoming message.
> Checked by AVG Free Edition.
> Version: 7.1.409 / Virus Database: 268.15.4/563 - Release Date: 12/2/2006
>
>


pgsql-performance by date:

Previous
From: "Heikki Linnakangas"
Date:
Subject: Re: Regex performance issue
Next
From: "Heikki Linnakangas"
Date:
Subject: Re: Regex performance issue