Re: Pattern matching operators a index - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Pattern matching operators a index
Date
Msg-id 1381343689.77303.YahooMailNeo@web162902.mail.bf1.yahoo.com
Whole thread Raw
In response to Pattern matching operators a index  (Soroosh Sardari <soroosh.sardari@gmail.com>)
List pgsql-hackers
Soroosh Sardari <soroosh.sardari@gmail.com> wrote:

> I'm developing a new type for character string, like varchar. I
> wrote operators for btree and so forth.
>
> I wonder how pattern matching operators using btree index,
> because btree operator class ony knows about >, >=, <=, and =
> operators, but operators for pattern matching, such as LIKE, are
> not known for btree access method.

In addition to Heikki's answer, which more directly answers your
question about what btree can do for you, you might want to look at
the pg_trgm extension and the gist_trgm_ops and gin_trgm_ops
operator classes, to see what other index types can do for you.
Specifically, while a btree index can only hlep much if the pattern
is anchored at the left, the regular expression searches need not
be.

test=# \d war_and_peace
                            Table "public.war_and_peace"
  Column  |  Type   |                           Modifiers
----------+---------+----------------------------------------------------------------
 lineno   | integer | not null default nextval('war_and_peace_lineno_seq'::regclass)
 linetext | text    | not null
Indexes:
    "war_and_peace_pkey" PRIMARY KEY, btree (lineno)

test=# explain analyze select * from war_and_peace where linetext ~ 'gentlemen';
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Seq Scan on war_and_peace  (cost=0.00..947.79 rows=283 width=76) (actual time=4.697..62.065 rows=67 loops=1)
   Filter: (linetext ~ 'gentlemen'::text)
   Rows Removed by Filter: 36636
 Total runtime: 62.101 ms
(4 rows)

test=# create index war_and_peace_linetext_gist on war_and_peace using gist (linetext gist_trgm_ops);
CREATE INDEX
test=# analyze;
ANALYZE
test=# explain analyze select * from war_and_peace where linetext ~ 'gentlemen';
                                                              QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on war_and_peace  (cost=4.30..15.63 rows=3 width=76) (actual time=23.231..24.436 rows=67 loops=1)
   Recheck Cond: (linetext ~ 'gentlemen'::text)
   Rows Removed by Index Recheck: 22
   ->  Bitmap Index Scan on war_and_peace_linetext_gist  (cost=0.00..4.30 rows=3 width=0) (actual time=23.200..23.200
rows=89loops=1) 
         Index Cond: (linetext ~ 'gentlemen'::text)
 Total runtime: 24.483 ms
(6 rows)

test=# drop index war_and_peace_linetext_gist;
DROP INDEX
test=# create index war_and_peace_linetext_gin on war_and_peace using gin (linetext gin_trgm_ops);
CREATE INDEX
test=# analyze;
ANALYZE
test=# explain analyze select * from war_and_peace where linetext ~ 'gentlemen';
                                                             QUERY
PLAN                                                              

-------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on war_and_peace  (cost=68.02..79.35 rows=3 width=76) (actual time=2.393..5.206 rows=67 loops=1)
   Recheck Cond: (linetext ~ 'gentlemen'::text)
   Rows Removed by Index Recheck: 22
   ->  Bitmap Index Scan on war_and_peace_linetext_gin  (cost=0.00..68.02 rows=3 width=0) (actual time=2.360..2.360
rows=89loops=1) 
         Index Cond: (linetext ~ 'gentlemen'::text)
 Total runtime: 5.263 ms
(6 rows)

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Auto-tuning work_mem and maintenance_work_mem
Next
From: Robert Haas
Date:
Subject: Re: pg_system_identifier()