Thread: is it normal behavior of index?

is it normal behavior of index?

From
"Jean-Yves F. Barbier"
Date:
Hi list,

Simple test table w/ a varchar(32): if I create an index with
'varchar_pattern_ops' explain says it is NOT used in my query;
if I create it without it is used (?!)

JY
--
Knocked, you weren't in.
        -- Opportunity

Re: is it normal behavior of index?

From
Andreas Kretschmer
Date:
Jean-Yves F. Barbier <12ukwn@gmail.com> wrote:

> Hi list,
>
> Simple test table w/ a varchar(32): if I create an index with
> 'varchar_pattern_ops' explain says it is NOT used in my query;
> if I create it without it is used (?!)

Please show more details and/or an example. I have one for you:

test=# create table words ( w text);
CREATE TABLE
Time: 5,829 ms
test=*# copy words from '/usr/share/dict/ngerman';
COPY 320577
Time: 458,902 ms
test=*# create index idx_words on words (w varchar_pattern_ops);
CREATE INDEX
Time: 1270,534 ms
test=*# explain select * from words where w = 'foo';
                                 QUERY PLAN
----------------------------------------------------------------------------
 Bitmap Heap Scan on words  (cost=36.72..1781.30 rows=1603 width=32)
   Recheck Cond: (w = 'foo'::text)
   ->  Bitmap Index Scan on idx_words  (cost=0.00..36.32 rows=1603 width=0)
         Index Cond: (w = 'foo'::text)
(4 rows)

Time: 0,467 ms



As you can see, there is an index WITH varchar_pattern_ops, and the
index used for the query.



Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: is it normal behavior of index?

From
"Jean-Yves F. Barbier"
Date:
On Sun, 1 Jan 2012 12:52:04 +0100
Andreas Kretschmer <akretschmer@spamfence.net> wrote:

> > Simple test table w/ a varchar(32): if I create an index with
> > 'varchar_pattern_ops' explain says it is NOT used in my query;
> > if I create it without it is used (?!)
>
> Please show more details and/or an example. I have one for you:
>
> test=# create table words ( w text);
> CREATE TABLE
> Time: 5,829 ms
> test=*# copy words from '/usr/share/dict/ngerman';
> COPY 320577
> Time: 458,902 ms
> test=*# create index idx_words on words (w varchar_pattern_ops);
> CREATE INDEX
> Time: 1270,534 ms
> test=*# explain select * from words where w = 'foo';
>                                  QUERY PLAN
> ----------------------------------------------------------------------------
>  Bitmap Heap Scan on words  (cost=36.72..1781.30 rows=1603 width=32)
>    Recheck Cond: (w = 'foo'::text)
>    ->  Bitmap Index Scan on idx_words  (cost=0.00..36.32 rows=1603 width=0)
>          Index Cond: (w = 'foo'::text)
> (4 rows)
>
> Time: 0,467 ms
>
>
>
> As you can see, there is an index WITH varchar_pattern_ops, and the
> index used for the query.

Yep, I HAD this behavior with my prior test version (nothing has
changed: I just droped/recreated it) but not anymore:

                                             Table "public.tst1m"
 Column |         Type          |                     Modifiers                      | Storage  | Description
--------+-----------------------+----------------------------------------------------+----------+-------------
 id     | integer               | not null default nextval('tst1m_id_seq'::regclass) | plain    |
 name   | character varying(32) | not null                                           | extended |
 note   | character varying(64) | not null                                           | extended |
Indexes:
    "tst1m_pkey" PRIMARY KEY, btree (id)
    "tst1m_name_ix" UNIQUE, btree (name)
    "tst1m_name_lmetaphone_ix" btree (lower(metaphone(name::text, 16)) varchar_pattern_ops)
    "tst1m_name_lu_ix" btree (lower(jyunaccent(name)::text) text_pattern_ops)
    "tst1m_note_ix" btree (note varchar_pattern_ops)
    "tst1m_note_lu_ix" btree (note)
Has OIDs: no

CREATE INDEX tst1m_name_regular_ix ON tst1m(name varchar_pattern_ops);

EXPLAIN  SELECT * FROM tst1m ORDER BY name;
                             QUERY PLAN
---------------------------------------------------------------------
 Sort  (cost=25402.82..25652.82 rows=100000 width=138)
   Sort Key: name
   ->  Seq Scan on tst1m  (cost=0.00..3084.00 rows=100000 width=138)
(3 rows)

JY
--
Kiss me, Kate, we will be married o' Sunday.
        -- William Shakespeare, "The Taming of the Shrew"

Re: is it normal behavior of index?

From
Andreas Kretschmer
Date:
Jean-Yves F. Barbier <12ukwn@gmail.com> wrote:

> Yep, I HAD this behavior with my prior test version (nothing has
> changed: I just droped/recreated it) but not anymore:
>
>                                              Table "public.tst1m"
>  Column |         Type          |                     Modifiers                      | Storage  | Description
> --------+-----------------------+----------------------------------------------------+----------+-------------
>  id     | integer               | not null default nextval('tst1m_id_seq'::regclass) | plain    |
>  name   | character varying(32) | not null                                           | extended |
>  note   | character varying(64) | not null                                           | extended |
> Indexes:
>     "tst1m_pkey" PRIMARY KEY, btree (id)
>     "tst1m_name_ix" UNIQUE, btree (name)
>     "tst1m_name_lmetaphone_ix" btree (lower(metaphone(name::text, 16)) varchar_pattern_ops)
>     "tst1m_name_lu_ix" btree (lower(jyunaccent(name)::text) text_pattern_ops)
>     "tst1m_note_ix" btree (note varchar_pattern_ops)
>     "tst1m_note_lu_ix" btree (note)
> Has OIDs: no
>
> CREATE INDEX tst1m_name_regular_ix ON tst1m(name varchar_pattern_ops);
>
> EXPLAIN  SELECT * FROM tst1m ORDER BY name;
>                              QUERY PLAN
> ---------------------------------------------------------------------
>  Sort  (cost=25402.82..25652.82 rows=100000 width=138)
>    Sort Key: name
>    ->  Seq Scan on tst1m  (cost=0.00..3084.00 rows=100000 width=138)
> (3 rows)

select *, without a WHERE-condition. In this case an index is useless,
the whole table is the result and a seq-scan the fastest way.


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: is it normal behavior of index?

From
"Jean-Yves F. Barbier"
Date:
On Sun, 1 Jan 2012 14:11:11 +0100
Andreas Kretschmer <akretschmer@spamfence.net> wrote:

>
> select *, without a WHERE-condition. In this case an index is useless,
> the whole table is the result and a seq-scan the fastest way.

I *need* it to work to present ordered lists!
And I don't agree, essentially because of that:

 Column |         Type          |                     Modifiers                      | Storage  | Description
--------+-----------------------+----------------------------------------------------+----------+-------------
 id     | integer               | not null default nextval('tst1m_id_seq'::regclass) | plain    |
 name   | character varying(32) | not null                                           | extended |
 note   | character varying(64) | not null                                           | extended |
Indexes:
    "tst1m_pkey" PRIMARY KEY, btree (id)
    "tst1m_name_lmetaphone_ix" btree (lower(metaphone(name::text, 16)) varchar_pattern_ops)
    "tst1m_name_lu_ix" btree (lower(jyunaccent(name)::text) text_pattern_ops)
    "tst1m_name_regular_ix" btree (name varchar_pattern_ops)
    "tst1m_note_ix" btree (note varchar_pattern_ops)
Has OIDs: no

EXPLAIN ANALYZE SELECT * FROM tst1m ORDER BY note;
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Sort  (cost=25402.82..25652.82 rows=100000 width=138) (actual time=9429.292..12468.852 rows=100000 loops=1)
   Sort Key: note
   Sort Method: external merge  Disk: 14576kB
   ->  Seq Scan on tst1m  (cost=0.00..3084.00 rows=100000 width=138) (actual time=0.020..97.160 rows=100000 loops=1)
 Total runtime: 12516.256 ms
(5 rows)

DROP INDEX tst1m_note_ix ;

CREATE INDEX tst1m_note_ix ON tst1m(note);

EXPLAIN ANALYZE SELECT * FROM tst1m ORDER BY note;
                                                              QUERY PLAN
               

---------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using tst1m_note_ix on tst1m  (cost=0.00..15635.95 rows=100000 width=138) (actual time=0.142..372.800
rows=100000loops=1) 
 Total runtime: 415.164 ms
(2 rows)

I know the planner's "intelligent", and with 100k rows is is *very*
strange that it don't use index - remember: note strings are [32-64]
length filled with random characters from any CE languages.
Even @ creation, I had a unique index on name ([16-32] chars) and
not any ONE doublon...

What I don't understand is it was working with the former version
which appear to be exactly the same - AND why the varchar.... version
isn't used at all when the regular version is used and brings a 30x
acceleration.

JY
--
Absent, adj.:
    Exposed to the attacks of friends and acquaintances;
defamed; slandered.

Re: is it normal behavior of index?

From
Andreas Kretschmer
Date:
Jean-Yves F. Barbier <12ukwn@gmail.com> wrote:

> On Sun, 1 Jan 2012 14:11:11 +0100
> Andreas Kretschmer <akretschmer@spamfence.net> wrote:
>
> >
> > select *, without a WHERE-condition. In this case an index is useless,
> > the whole table is the result and a seq-scan the fastest way.
>
> I *need* it to work to present ordered lists!
> And I don't agree, essentially because of that:

I think, in this case you needs 2 indexes, one with and one without the
opclass - option.

Copy&Paste from
http://www.postgresql.org/docs/9.1/static/indexes-opclass.html:

The operator classes text_pattern_ops, varchar_pattern_ops, and
bpchar_pattern_ops support B-tree indexes on the types text, varchar,
and char respectively. The difference from the default operator classes
is that the values are compared strictly character by character rather
than according to the locale-specific collation rules. This makes these
operator classes suitable for use by queries involving pattern matching
expressions (LIKE or POSIX regular expressions) when the database does
not use the standard "C" locale. As an example, you might index a
varchar column like this:

CREATE INDEX test_index ON test_table (col varchar_pattern_ops);

Note that you should also create an index with the default operator
class if you want queries involving ordinary <, <=, >, or >= comparisons
to use an index. Such queries cannot use the xxx_pattern_ops operator
classes.



Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: is it normal behavior of index?

From
"Jean-Yves F. Barbier"
Date:
On Sun, 1 Jan 2012 16:12:10 +0100
Andreas Kretschmer <akretschmer@spamfence.net> wrote:

>
> I think, in this case you needs 2 indexes, one with and one without the
> opclass - option.
>
......
> Note that you should also create an index with the default operator
> class if you want queries involving ordinary <, <=, >, or >= comparisons
> to use an index. Such queries cannot use the xxx_pattern_ops operator
> classes.

Aïe I missed this one; that was my conclusion but I wanted to make
absolutely sure.
Thanks Andreas.
JY
--
Alimony is the curse of the writing classes.
        -- Norman Mailer