Thread: TEXT column and indexing

TEXT column and indexing

From
Ian Barwick
Date:
I have this table:

db=> \d object_property_value
                       Table "db.object_property_value"
        Column         |          Type          |              Modifiers
-----------------------+------------------------+--------------------
 obj_property_value_id | integer                | not null default nextval(...
 obj_property_id       | integer                | not null
 value                 | text                   |
Indexes:
    "object_property_value_pkey" primary key, btree (obj_property_value_id)
    "opv_obj_property_id_ix" btree (obj_property_id)
    "opv_v_ix" btree (substr(value, 1, 128))
Foreign-key constraints:
    "object_property_fkey" FOREIGN KEY (obj_property_id)
           REFERENCES object_property(obj_property_id)
          ON UPDATE CASCADE ON DELETE CASCADE

(long lines edited for readability).

The table contains about 250,000 records and will grow at regular intervals.
The 'value' column contains text of various lengths. The table is VACUUMed
and ANALYZEd regularly and waxed on Sunday mornings. Database encoding is
Unicode. Server is 7.4RC1 or 7.4RC2 and will be 7.4 ASAP.

I want to query this table to match a specific value along
the lines of:

SELECT obj_property_id
  FROM object_property_value opv
 WHERE opv.value = 'foo'

There will only be a few (at the moment 2 or 3) rows exactly matching
'foo'. This query will only be performed with values containing less
than around 100 characters, which account for ca. 10% of all rows in the
table.

The performance is of course lousy:

db=> EXPLAIN
db-> SELECT obj_property_id
db->   FROM object_property_value opv
db->  WHERE opv.value = 'foo';
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Seq Scan on object_property_value opv  (cost=0.00..12258.26 rows=2 width=4)
   Filter: (value = 'foo'::text)
(2 rows)

However, if I create a VARCHAR field containing the first 128 characters of
the text field and index that, an index scan is used:

db=> EXPLAIN
db-> SELECT obj_property_id
db->   FROM object_property_value opv
db->  WHERE opv.opv_vc = 'foo';
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Index Scan using opv_vc_ix on object_property_value opv  (cost=0.00..6.84
rows=2 width=4)
   Index Cond: ((opv_vc)::text = 'foo'::text)

The question is therefore: can I get an index to work on the TEXT column? It
is currently indexed with:
 "opv_v_ix" btree (substr(value, 1, 128))

which doesn't appear to have any effect. I am probably missing something
obvious though. I can live with maintaining an extra VARCHAR column but
would like to keep the table as simple as possible.

(For anyone wondering: yes, I can access the data using tsearch2 - via
a different table in this case - but this is not always appropriate).


Thanks for any hints.


Ian Barwick
barwick@gmx.net


Re: TEXT column and indexing

From
Stephan Szabo
Date:
On Wed, 19 Nov 2003, Ian Barwick wrote:

>
> I have this table:
>
> db=> \d object_property_value
>                        Table "db.object_property_value"
>         Column         |          Type          |              Modifiers
> -----------------------+------------------------+--------------------
>  obj_property_value_id | integer                | not null default nextval(...
>  obj_property_id       | integer                | not null
>  value                 | text                   |
> Indexes:
>     "object_property_value_pkey" primary key, btree (obj_property_value_id)
>     "opv_obj_property_id_ix" btree (obj_property_id)
>     "opv_v_ix" btree (substr(value, 1, 128))
> Foreign-key constraints:
>     "object_property_fkey" FOREIGN KEY (obj_property_id)
>            REFERENCES object_property(obj_property_id)
>           ON UPDATE CASCADE ON DELETE CASCADE
> I want to query this table to match a specific value along
> the lines of:
>
> SELECT obj_property_id
>   FROM object_property_value opv
>  WHERE opv.value = 'foo'
>
> The question is therefore: can I get an index to work on the TEXT column? It
> is currently indexed with:
>  "opv_v_ix" btree (substr(value, 1, 128))
>
> which doesn't appear to have any effect. I am probably missing something
> obvious though. I can live with maintaining an extra VARCHAR column but

You probably need to be querying like:
WHERE substr(value,1,128)='foo';
in order to use that index.

While substr(txtcol, 1,128) happens to have the property that it would be
probably be useful in a search against a short constant string, that's an
internal property of that function.

Re: TEXT column and indexing

From
Manfred Koizar
Date:
On Wed, 19 Nov 2003 10:18:18 +0100, Ian Barwick <barwick@gmx.net>
wrote:
>Indexes:
>    [...]
>    "opv_v_ix" btree (substr(value, 1, 128))

>SELECT obj_property_id
>  FROM object_property_value opv
> WHERE opv.value = 'foo'

Try
    ... WHERE substr(opv.value, 1, 128) = 'foo'

HTH.
Servus
 Manfred

Re: TEXT column and indexing

From
Ian Barwick
Date:
On Wednesday 19 November 2003 17:35, Stephan Szabo wrote:
> On Wed, 19 Nov 2003, Ian Barwick wrote:
> > I have this table:
(...)
>
> You probably need to be querying like:
> WHERE substr(value,1,128)='foo';
> in order to use that index.
>
> While substr(txtcol, 1,128) happens to have the property that it would be
> probably be useful in a search against a short constant string, that's an
> internal property of that function.

That's the one :-). Thanks!

Ian Barwick
barwick@gmx.net


Re: TEXT column and indexing

From
Ian Barwick
Date:
On Wednesday 19 November 2003 17:26, you wrote:
> On Wed, 19 Nov 2003 10:18:18 +0100, Ian Barwick <barwick@gmx.net>
>
> wrote:
> >Indexes:
> >    [...]
> >    "opv_v_ix" btree (substr(value, 1, 128))
> >
> >SELECT obj_property_id
> >  FROM object_property_value opv
> > WHERE opv.value = 'foo'
>
> Try
>     ... WHERE substr(opv.value, 1, 128) = 'foo'
>
> HTH.

Yup:
db=> explain
db-> SELECT obj_property_id
db->   FROM object_property_value opv
db->  WHERE substr(opv.value,1,128) = 'foo';
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Index Scan using opv_v_ix on object_property_value opv  (cost=0.00..4185.78
rows=1101 width=4)
   Index Cond: (substr(value, 1, 128) = 'foo'::text)
(2 rows)

Many thanks

Ian Barwick
barwick@gmx.net