Thread: indexing just a part of a string

indexing just a part of a string

From
Christoph Pingel
Date:
I'm new to PostgreSQL, and it has been a pleasure so far.

There's just one thing I'm trying to do and I didn't find any hints
in the manual: I want to index just a part of a string in a column.

The situation: I have roughly 300.000 rows, and in the column we're
looking at, most of the entries have less than 200 chars. However,
there are some (very few) that have more than 3000 chars, and
postmaster relplies that this is too many for the index (b-tree).

So I would like to say 'index only the first 200 chars of the
column', which will result in a full index of 99.9 % of my entries. I
did this in MySQL, but I didn't find it in the pg manual.

How do I proceed?

best regards, and TIA
Christoph Pingel


Re: indexing just a part of a string

From
Alban Hertroys
Date:
Christoph Pingel wrote:
> So I would like to say 'index only the first 200 chars of the column',
> which will result in a full index of 99.9 % of my entries. I did this in
> MySQL, but I didn't find it in the pg manual.
>
> How do I proceed?

You could do:

CREATE INDEX <index name>
ON <table name> (SUBSTRING(<column name>, 1, 200))

But that may cause the index to be used only if you query for results
using SUBSTRING(). I don't know; You can test if it uses an index scan
using EXPLAIN.

You could also use separate indices for the short and the long string
variants, or maybe you could use a column that's better suited to the
task (for example, a column with an MD5 hash of the text or an integer
based on a sequence).
You could also try a different type of index, an ltree (contrib) for
example.

It all pretty much depends on what you're trying to do. In any case, you
should take a look at the documentation for CREATE INDEX, there are
possibilities.

Out of general curiosity: I mentioned using a hashed column as a
possible solution. Would that be equivalent to using a hash index? Or is
searching a hash value in a btree index actually faster than in a hash
index?

--
Alban Hertroys
MAG Productions

T: +31(0)53 4346874
F: +31(0)53 4346876
E: alban@magproductions.nl
W: http://www.magproductions.nl

Re: indexing just a part of a string

From
"Ian Harding"
Date:
You can use a functional index.  Something like

CREATE INDEX foo ON bar (substring(blah,1,200))

Should work I think.

Ian Harding
Programmer/Analyst II
Tacoma-Pierce County Health Department
iharding@tpchd.org
Phone: (253) 798-3549
Pager: (253) 754-0002

>>> Christoph Pingel <ch.pingel@web.de> 02/08/05 2:50 AM >>>
I'm new to PostgreSQL, and it has been a pleasure so far.

There's just one thing I'm trying to do and I didn't find any hints
in the manual: I want to index just a part of a string in a column.

The situation: I have roughly 300.000 rows, and in the column we're
looking at, most of the entries have less than 200 chars. However,
there are some (very few) that have more than 3000 chars, and
postmaster relplies that this is too many for the index (b-tree).

So I would like to say 'index only the first 200 chars of the
column', which will result in a full index of 99.9 % of my entries. I
did this in MySQL, but I didn't find it in the pg manual.

How do I proceed?

best regards, and TIA
Christoph Pingel


---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org


Re: indexing just a part of a string

From
Christopher Browne
Date:
In the last exciting episode, ch.pingel@web.de (Christoph Pingel) wrote:
> I'm new to PostgreSQL, and it has been a pleasure so far.
>
> There's just one thing I'm trying to do and I didn't find any hints in
> the manual: I want to index just a part of a string in a column.
>
> The situation: I have roughly 300.000 rows, and in the column we're
> looking at, most of the entries have less than 200 chars. However,
> there are some (very few) that have more than 3000 chars, and
> postmaster relplies that this is too many for the index (b-tree).
>
> So I would like to say 'index only the first 200 chars of the column',
> which will result in a full index of 99.9 % of my entries. I did this
> in MySQL, but I didn't find it in the pg manual.
>
> How do I proceed?

You might look at using a functional index:

<http://www.postgresql.org/docs/7.3/interactive/indexes-functional.html>

You'd therefore create a "first_200_chars()" function, and then create
an index using that function.
--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','gmail.com').
http://cbbrowne.com/info/postgresql.html
Signs  of  a Klingon  Programmer  #10:  "You cannot  really appreciate
Dilbert unless you've read it in the original Klingon."