Thread: indexing just a part of a string
I'm new to PostgreSQL, and it has been a pleasure so far. There's just one thing I'm trying to do and I didn't find any hints in the manual: I want to index just a part of a string in a column. The situation: I have roughly 300.000 rows, and in the column we're looking at, most of the entries have less than 200 chars. However, there are some (very few) that have more than 3000 chars, and postmaster relplies that this is too many for the index (b-tree). So I would like to say 'index only the first 200 chars of the column', which will result in a full index of 99.9 % of my entries. I did this in MySQL, but I didn't find it in the pg manual. How do I proceed? best regards, and TIA Christoph Pingel
Christoph Pingel wrote: > So I would like to say 'index only the first 200 chars of the column', > which will result in a full index of 99.9 % of my entries. I did this in > MySQL, but I didn't find it in the pg manual. > > How do I proceed? You could do: CREATE INDEX <index name> ON <table name> (SUBSTRING(<column name>, 1, 200)) But that may cause the index to be used only if you query for results using SUBSTRING(). I don't know; You can test if it uses an index scan using EXPLAIN. You could also use separate indices for the short and the long string variants, or maybe you could use a column that's better suited to the task (for example, a column with an MD5 hash of the text or an integer based on a sequence). You could also try a different type of index, an ltree (contrib) for example. It all pretty much depends on what you're trying to do. In any case, you should take a look at the documentation for CREATE INDEX, there are possibilities. Out of general curiosity: I mentioned using a hashed column as a possible solution. Would that be equivalent to using a hash index? Or is searching a hash value in a btree index actually faster than in a hash index? -- Alban Hertroys MAG Productions T: +31(0)53 4346874 F: +31(0)53 4346876 E: alban@magproductions.nl W: http://www.magproductions.nl
You can use a functional index. Something like CREATE INDEX foo ON bar (substring(blah,1,200)) Should work I think. Ian Harding Programmer/Analyst II Tacoma-Pierce County Health Department iharding@tpchd.org Phone: (253) 798-3549 Pager: (253) 754-0002 >>> Christoph Pingel <ch.pingel@web.de> 02/08/05 2:50 AM >>> I'm new to PostgreSQL, and it has been a pleasure so far. There's just one thing I'm trying to do and I didn't find any hints in the manual: I want to index just a part of a string in a column. The situation: I have roughly 300.000 rows, and in the column we're looking at, most of the entries have less than 200 chars. However, there are some (very few) that have more than 3000 chars, and postmaster relplies that this is too many for the index (b-tree). So I would like to say 'index only the first 200 chars of the column', which will result in a full index of 99.9 % of my entries. I did this in MySQL, but I didn't find it in the pg manual. How do I proceed? best regards, and TIA Christoph Pingel ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
In the last exciting episode, ch.pingel@web.de (Christoph Pingel) wrote: > I'm new to PostgreSQL, and it has been a pleasure so far. > > There's just one thing I'm trying to do and I didn't find any hints in > the manual: I want to index just a part of a string in a column. > > The situation: I have roughly 300.000 rows, and in the column we're > looking at, most of the entries have less than 200 chars. However, > there are some (very few) that have more than 3000 chars, and > postmaster relplies that this is too many for the index (b-tree). > > So I would like to say 'index only the first 200 chars of the column', > which will result in a full index of 99.9 % of my entries. I did this > in MySQL, but I didn't find it in the pg manual. > > How do I proceed? You might look at using a functional index: <http://www.postgresql.org/docs/7.3/interactive/indexes-functional.html> You'd therefore create a "first_200_chars()" function, and then create an index using that function. -- wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','gmail.com'). http://cbbrowne.com/info/postgresql.html Signs of a Klingon Programmer #10: "You cannot really appreciate Dilbert unless you've read it in the original Klingon."