TEXT column and indexing - Mailing list pgsql-performance

From Ian Barwick
Subject TEXT column and indexing
Date
Msg-id 200311191018.18492.barwick@gmx.net
Whole thread Raw
Responses Re: TEXT column and indexing  (Stephan Szabo <sszabo@megazone.bigpanda.com>)
Re: TEXT column and indexing  (Manfred Koizar <mkoi-pg@aon.at>)
List pgsql-performance
I have this table:

db=> \d object_property_value
                       Table "db.object_property_value"
        Column         |          Type          |              Modifiers
-----------------------+------------------------+--------------------
 obj_property_value_id | integer                | not null default nextval(...
 obj_property_id       | integer                | not null
 value                 | text                   |
Indexes:
    "object_property_value_pkey" primary key, btree (obj_property_value_id)
    "opv_obj_property_id_ix" btree (obj_property_id)
    "opv_v_ix" btree (substr(value, 1, 128))
Foreign-key constraints:
    "object_property_fkey" FOREIGN KEY (obj_property_id)
           REFERENCES object_property(obj_property_id)
          ON UPDATE CASCADE ON DELETE CASCADE

(long lines edited for readability).

The table contains about 250,000 records and will grow at regular intervals.
The 'value' column contains text of various lengths. The table is VACUUMed
and ANALYZEd regularly and waxed on Sunday mornings. Database encoding is
Unicode. Server is 7.4RC1 or 7.4RC2 and will be 7.4 ASAP.

I want to query this table to match a specific value along
the lines of:

SELECT obj_property_id
  FROM object_property_value opv
 WHERE opv.value = 'foo'

There will only be a few (at the moment 2 or 3) rows exactly matching
'foo'. This query will only be performed with values containing less
than around 100 characters, which account for ca. 10% of all rows in the
table.

The performance is of course lousy:

db=> EXPLAIN
db-> SELECT obj_property_id
db->   FROM object_property_value opv
db->  WHERE opv.value = 'foo';
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Seq Scan on object_property_value opv  (cost=0.00..12258.26 rows=2 width=4)
   Filter: (value = 'foo'::text)
(2 rows)

However, if I create a VARCHAR field containing the first 128 characters of
the text field and index that, an index scan is used:

db=> EXPLAIN
db-> SELECT obj_property_id
db->   FROM object_property_value opv
db->  WHERE opv.opv_vc = 'foo';
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Index Scan using opv_vc_ix on object_property_value opv  (cost=0.00..6.84
rows=2 width=4)
   Index Cond: ((opv_vc)::text = 'foo'::text)

The question is therefore: can I get an index to work on the TEXT column? It
is currently indexed with:
 "opv_v_ix" btree (substr(value, 1, 128))

which doesn't appear to have any effect. I am probably missing something
obvious though. I can live with maintaining an extra VARCHAR column but
would like to keep the table as simple as possible.

(For anyone wondering: yes, I can access the data using tsearch2 - via
a different table in this case - but this is not always appropriate).


Thanks for any hints.


Ian Barwick
barwick@gmx.net


pgsql-performance by date:

Previous
From: Shridhar Daithankar
Date:
Subject: Re: Join on incompatible types
Next
From: Laurent Martelli
Date:
Subject: Re: Join on incompatible types