Thread: Pattern matching ints
Hi, Is there a more efficient way to pattern match integer columns other than something like : where cast(mynumber as text) ~ '.*123.*' I also seem to recall you can't create indexes on casts either ? Thx Tim
Tim Smith wrote: > Is there a more efficient way to pattern match integer columns other > than something like : > > where cast(mynumber as text) ~ '.*123.*' > > > I also seem to recall you can't create indexes on casts either ? I don't think you can do this without converting the column to a string. I guess that you can create an index if the cast function is immutable; at any rate you can create an index on intcolumn::text. But such an index would not help you with a query like the one you show above. The only thing that could speed up such a query would be a trigram index on the string representation of the value. Yours, Laurenz Albe
On 26/01/15 20:32, Tim Smith wrote: > Hi, > > Is there a more efficient way to pattern match integer columns other > than something like : > > where cast(mynumber as text) ~ '.*123.*' > > > I also seem to recall you can't create indexes on casts either ? This is perfectly possible: postgres=# CREATE TABLE foo (id INT); CREATE TABLE postgres=# CREATE INDEX idx ON foo (CAST(id AS TEXT) text_pattern_ops); CREATE INDEX postgres=# INSERT INTO foo values(generate_series(1,1000000)); INSERT 0 1000000 but not necessarily useful... postgres=# explain analyze SELECT * from foo where id::text ~ '.*12345.*'; QUERY PLAN ------------------------------------------------------------------------------------------------------- Seq Scan on foo (cost=0.00..21925.00 rows=100 width=4) (actual time=17.331..961.384 rows=20 loops=1) Filter: ((id)::text ~ '.*12345.*'::text) Rows Removed by Filter: 999980 Planning time: 0.296 ms Execution time: 961.411 ms (5 rows) However you might find the pg_trgm extension [1] useful: postgres=# CREATE TABLE foo (id INT); CREATE TABLE postgres=# CREATE INDEX trgm_idx ON foo using gist(cast(id as text) gist_trgm_ops); CREATE INDEX postgres=# INSERT INTO foo values(generate_series(1,1000000)); INSERT 0 1000000 postgres=# explain analyze SELECT * from foo where id::text ~ '.*12345.*'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on foo (cost=200.47..4938.11 rows=5184 width=4) (actual time=61.163..61.211 rows=20 loops=1) Recheck Cond: ((id)::text ~ '.*12345.*'::text) Heap Blocks: exact=11 -> Bitmap Index Scan on trgm_idx (cost=0.00..199.17 rows=5184 width=0) (actual time=61.140..61.140 rows=20 loops=1) Index Cond: ((id)::text ~ '.*12345.*'::text) Planning time: 0.241 ms Execution time: 61.257 ms (7 rows) [1] http://www.postgresql.org/docs/current/interactive/pgtrgm.html Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Ian, Re: However you might find the pg_trgm extension [1] useful: Indeed... pretty awesome. Thanks ! On 26 January 2015 at 12:55, Ian Barwick <ian@2ndquadrant.com> wrote: > On 26/01/15 20:32, Tim Smith wrote: >> Hi, >> >> Is there a more efficient way to pattern match integer columns other >> than something like : >> >> where cast(mynumber as text) ~ '.*123.*' >> >> >> I also seem to recall you can't create indexes on casts either ? > > This is perfectly possible: > > postgres=# CREATE TABLE foo (id INT); > CREATE TABLE > postgres=# CREATE INDEX idx ON foo (CAST(id AS TEXT) text_pattern_ops); > CREATE INDEX > postgres=# INSERT INTO foo values(generate_series(1,1000000)); > INSERT 0 1000000 > > but not necessarily useful... > > postgres=# explain analyze SELECT * from foo where id::text ~ '.*12345.*'; > QUERY PLAN > ------------------------------------------------------------------------------------------------------- > Seq Scan on foo (cost=0.00..21925.00 rows=100 width=4) (actual time=17.331..961.384 rows=20 loops=1) > Filter: ((id)::text ~ '.*12345.*'::text) > Rows Removed by Filter: 999980 > Planning time: 0.296 ms > Execution time: 961.411 ms > (5 rows) > > However you might find the pg_trgm extension [1] useful: > > postgres=# CREATE TABLE foo (id INT); > CREATE TABLE > postgres=# CREATE INDEX trgm_idx ON foo using gist(cast(id as text) gist_trgm_ops); > CREATE INDEX > postgres=# INSERT INTO foo values(generate_series(1,1000000)); > INSERT 0 1000000 > postgres=# explain analyze SELECT * from foo where id::text ~ '.*12345.*'; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on foo (cost=200.47..4938.11 rows=5184 width=4) (actual time=61.163..61.211 rows=20 loops=1) > Recheck Cond: ((id)::text ~ '.*12345.*'::text) > Heap Blocks: exact=11 > -> Bitmap Index Scan on trgm_idx (cost=0.00..199.17 rows=5184 width=0) (actual time=61.140..61.140 rows=20 loops=1) > Index Cond: ((id)::text ~ '.*12345.*'::text) > Planning time: 0.241 ms > Execution time: 61.257 ms > (7 rows) > > > [1] http://www.postgresql.org/docs/current/interactive/pgtrgm.html > > > Regards > > > Ian Barwick > > -- > Ian Barwick http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services