Thread: is it normal behavior of index?
Hi list, Simple test table w/ a varchar(32): if I create an index with 'varchar_pattern_ops' explain says it is NOT used in my query; if I create it without it is used (?!) JY -- Knocked, you weren't in. -- Opportunity
Jean-Yves F. Barbier <12ukwn@gmail.com> wrote: > Hi list, > > Simple test table w/ a varchar(32): if I create an index with > 'varchar_pattern_ops' explain says it is NOT used in my query; > if I create it without it is used (?!) Please show more details and/or an example. I have one for you: test=# create table words ( w text); CREATE TABLE Time: 5,829 ms test=*# copy words from '/usr/share/dict/ngerman'; COPY 320577 Time: 458,902 ms test=*# create index idx_words on words (w varchar_pattern_ops); CREATE INDEX Time: 1270,534 ms test=*# explain select * from words where w = 'foo'; QUERY PLAN ---------------------------------------------------------------------------- Bitmap Heap Scan on words (cost=36.72..1781.30 rows=1603 width=32) Recheck Cond: (w = 'foo'::text) -> Bitmap Index Scan on idx_words (cost=0.00..36.32 rows=1603 width=0) Index Cond: (w = 'foo'::text) (4 rows) Time: 0,467 ms As you can see, there is an index WITH varchar_pattern_ops, and the index used for the query. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On Sun, 1 Jan 2012 12:52:04 +0100 Andreas Kretschmer <akretschmer@spamfence.net> wrote: > > Simple test table w/ a varchar(32): if I create an index with > > 'varchar_pattern_ops' explain says it is NOT used in my query; > > if I create it without it is used (?!) > > Please show more details and/or an example. I have one for you: > > test=# create table words ( w text); > CREATE TABLE > Time: 5,829 ms > test=*# copy words from '/usr/share/dict/ngerman'; > COPY 320577 > Time: 458,902 ms > test=*# create index idx_words on words (w varchar_pattern_ops); > CREATE INDEX > Time: 1270,534 ms > test=*# explain select * from words where w = 'foo'; > QUERY PLAN > ---------------------------------------------------------------------------- > Bitmap Heap Scan on words (cost=36.72..1781.30 rows=1603 width=32) > Recheck Cond: (w = 'foo'::text) > -> Bitmap Index Scan on idx_words (cost=0.00..36.32 rows=1603 width=0) > Index Cond: (w = 'foo'::text) > (4 rows) > > Time: 0,467 ms > > > > As you can see, there is an index WITH varchar_pattern_ops, and the > index used for the query. Yep, I HAD this behavior with my prior test version (nothing has changed: I just droped/recreated it) but not anymore: Table "public.tst1m" Column | Type | Modifiers | Storage | Description --------+-----------------------+----------------------------------------------------+----------+------------- id | integer | not null default nextval('tst1m_id_seq'::regclass) | plain | name | character varying(32) | not null | extended | note | character varying(64) | not null | extended | Indexes: "tst1m_pkey" PRIMARY KEY, btree (id) "tst1m_name_ix" UNIQUE, btree (name) "tst1m_name_lmetaphone_ix" btree (lower(metaphone(name::text, 16)) varchar_pattern_ops) "tst1m_name_lu_ix" btree (lower(jyunaccent(name)::text) text_pattern_ops) "tst1m_note_ix" btree (note varchar_pattern_ops) "tst1m_note_lu_ix" btree (note) Has OIDs: no CREATE INDEX tst1m_name_regular_ix ON tst1m(name varchar_pattern_ops); EXPLAIN SELECT * FROM tst1m ORDER BY name; QUERY PLAN --------------------------------------------------------------------- Sort (cost=25402.82..25652.82 rows=100000 width=138) Sort Key: name -> Seq Scan on tst1m (cost=0.00..3084.00 rows=100000 width=138) (3 rows) JY -- Kiss me, Kate, we will be married o' Sunday. -- William Shakespeare, "The Taming of the Shrew"
Jean-Yves F. Barbier <12ukwn@gmail.com> wrote: > Yep, I HAD this behavior with my prior test version (nothing has > changed: I just droped/recreated it) but not anymore: > > Table "public.tst1m" > Column | Type | Modifiers | Storage | Description > --------+-----------------------+----------------------------------------------------+----------+------------- > id | integer | not null default nextval('tst1m_id_seq'::regclass) | plain | > name | character varying(32) | not null | extended | > note | character varying(64) | not null | extended | > Indexes: > "tst1m_pkey" PRIMARY KEY, btree (id) > "tst1m_name_ix" UNIQUE, btree (name) > "tst1m_name_lmetaphone_ix" btree (lower(metaphone(name::text, 16)) varchar_pattern_ops) > "tst1m_name_lu_ix" btree (lower(jyunaccent(name)::text) text_pattern_ops) > "tst1m_note_ix" btree (note varchar_pattern_ops) > "tst1m_note_lu_ix" btree (note) > Has OIDs: no > > CREATE INDEX tst1m_name_regular_ix ON tst1m(name varchar_pattern_ops); > > EXPLAIN SELECT * FROM tst1m ORDER BY name; > QUERY PLAN > --------------------------------------------------------------------- > Sort (cost=25402.82..25652.82 rows=100000 width=138) > Sort Key: name > -> Seq Scan on tst1m (cost=0.00..3084.00 rows=100000 width=138) > (3 rows) select *, without a WHERE-condition. In this case an index is useless, the whole table is the result and a seq-scan the fastest way. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On Sun, 1 Jan 2012 14:11:11 +0100 Andreas Kretschmer <akretschmer@spamfence.net> wrote: > > select *, without a WHERE-condition. In this case an index is useless, > the whole table is the result and a seq-scan the fastest way. I *need* it to work to present ordered lists! And I don't agree, essentially because of that: Column | Type | Modifiers | Storage | Description --------+-----------------------+----------------------------------------------------+----------+------------- id | integer | not null default nextval('tst1m_id_seq'::regclass) | plain | name | character varying(32) | not null | extended | note | character varying(64) | not null | extended | Indexes: "tst1m_pkey" PRIMARY KEY, btree (id) "tst1m_name_lmetaphone_ix" btree (lower(metaphone(name::text, 16)) varchar_pattern_ops) "tst1m_name_lu_ix" btree (lower(jyunaccent(name)::text) text_pattern_ops) "tst1m_name_regular_ix" btree (name varchar_pattern_ops) "tst1m_note_ix" btree (note varchar_pattern_ops) Has OIDs: no EXPLAIN ANALYZE SELECT * FROM tst1m ORDER BY note; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Sort (cost=25402.82..25652.82 rows=100000 width=138) (actual time=9429.292..12468.852 rows=100000 loops=1) Sort Key: note Sort Method: external merge Disk: 14576kB -> Seq Scan on tst1m (cost=0.00..3084.00 rows=100000 width=138) (actual time=0.020..97.160 rows=100000 loops=1) Total runtime: 12516.256 ms (5 rows) DROP INDEX tst1m_note_ix ; CREATE INDEX tst1m_note_ix ON tst1m(note); EXPLAIN ANALYZE SELECT * FROM tst1m ORDER BY note; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------- Index Scan using tst1m_note_ix on tst1m (cost=0.00..15635.95 rows=100000 width=138) (actual time=0.142..372.800 rows=100000loops=1) Total runtime: 415.164 ms (2 rows) I know the planner's "intelligent", and with 100k rows is is *very* strange that it don't use index - remember: note strings are [32-64] length filled with random characters from any CE languages. Even @ creation, I had a unique index on name ([16-32] chars) and not any ONE doublon... What I don't understand is it was working with the former version which appear to be exactly the same - AND why the varchar.... version isn't used at all when the regular version is used and brings a 30x acceleration. JY -- Absent, adj.: Exposed to the attacks of friends and acquaintances; defamed; slandered.
Jean-Yves F. Barbier <12ukwn@gmail.com> wrote: > On Sun, 1 Jan 2012 14:11:11 +0100 > Andreas Kretschmer <akretschmer@spamfence.net> wrote: > > > > > select *, without a WHERE-condition. In this case an index is useless, > > the whole table is the result and a seq-scan the fastest way. > > I *need* it to work to present ordered lists! > And I don't agree, essentially because of that: I think, in this case you needs 2 indexes, one with and one without the opclass - option. Copy&Paste from http://www.postgresql.org/docs/9.1/static/indexes-opclass.html: The operator classes text_pattern_ops, varchar_pattern_ops, and bpchar_pattern_ops support B-tree indexes on the types text, varchar, and char respectively. The difference from the default operator classes is that the values are compared strictly character by character rather than according to the locale-specific collation rules. This makes these operator classes suitable for use by queries involving pattern matching expressions (LIKE or POSIX regular expressions) when the database does not use the standard "C" locale. As an example, you might index a varchar column like this: CREATE INDEX test_index ON test_table (col varchar_pattern_ops); Note that you should also create an index with the default operator class if you want queries involving ordinary <, <=, >, or >= comparisons to use an index. Such queries cannot use the xxx_pattern_ops operator classes. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On Sun, 1 Jan 2012 16:12:10 +0100 Andreas Kretschmer <akretschmer@spamfence.net> wrote: > > I think, in this case you needs 2 indexes, one with and one without the > opclass - option. > ...... > Note that you should also create an index with the default operator > class if you want queries involving ordinary <, <=, >, or >= comparisons > to use an index. Such queries cannot use the xxx_pattern_ops operator > classes. Aïe I missed this one; that was my conclusion but I wanted to make absolutely sure. Thanks Andreas. JY -- Alimony is the curse of the writing classes. -- Norman Mailer