Thread: Fulltext index
Hello again, my previous post wasn't answered. I think i told my question wrong :-). I have a table like a | b -------------------- 'de' | Hallo welt 'en' | Hello world How can I create a full text index over b? CREATE INDEX idx ON table USING gin( to_tsvector( case when a = 'de' then 'german' when a = 'en' then 'english' else 'english' end ), b); This doesn't work. Error Message in german: FEHLER: Zugriffsmethode »gin« unterstützt keine mehrspaltigen Indexe SQL state: 0A000 means, gin doesn't accept multicolumn indexes. Any ideas? Greetings Andreas -- Kraftl EDV - Dienstleistungen Linux, Linuxschulungen, Webprogrammierung Autofabrikstraße 16/6 1230 Wien
On Sat, 08 Nov 2008 09:44:17 +0100 Andreas Kraftl <andreas.kraftl@kraftl.at> wrote: > Hello again, > my previous post wasn't answered. I think i told my question > wrong :-). > > I have a table like > a | b > -------------------- > 'de' | Hallo welt > 'en' | Hello world > > How can I create a full text index over b? > CREATE INDEX idx ON table USING gin( > to_tsvector( > case > when a = 'de' then 'german' > when a = 'en' then 'english' > else 'english' > end > ), b); > > This doesn't work. Error Message in german: > FEHLER: Zugriffsmethode »gin« unterstützt keine mehrspaltigen > Indexe SQL state: 0A000 > means, gin doesn't accept multicolumn indexes. > > Any ideas? Multicolumn indexes should get into 8.4. You may add a column tsvector and compute it with a trigger that chose the correct language when generating the tsvector. Then you'll have to pick up the correct language when you generate the tsquery in your search. http://www.sigaev.ru/gin/fastinsert_and_multicolumn_GIN.pdf -- Ivan Sergio Borgonovo http://www.webthatworks.it
On Sat, 8 Nov 2008, Andreas Kraftl wrote: > Hello again, > > my previous post wasn't answered. I think i told my question wrong :-). > > I have a table like > a | b > -------------------- > 'de' | Hallo welt > 'en' | Hello world > > How can I create a full text index over b? use concatenation operator: tsvector('de',b_de) || tsvector('en',b_en) > CREATE INDEX idx ON table USING gin( > to_tsvector( > case > when a = 'de' then 'german' > when a = 'en' then 'english' > else 'english' > end > ), b); > > This doesn't work. Error Message in german: > FEHLER: Zugriffsmethode ЪЪginЪЪ unterstЪЪtzt keine mehrspaltigen Indexe > SQL state: 0A000 > means, gin doesn't accept multicolumn indexes. > > Any ideas? > > Greetings Andreas > Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Andreas Kraftl <andreas.kraftl@kraftl.at> writes: > CREATE INDEX idx ON table USING gin( > to_tsvector( > case > when a = 'de' then 'german' > when a = 'en' then 'english' > else 'english' > end > ), b); > This doesn't work. Error Message in german: > FEHLER: Zugriffsmethode »gin« unterstützt keine mehrspaltigen Indexe > SQL state: 0A000 > means, gin doesn't accept multicolumn indexes. You've got the parentheses in the wrong place. The way I'd suggest doing this is regression=# create table tab (a regconfig, b text); CREATE TABLE regression=# create index idx on tab using gin(to_tsvector(a,b)); CREATE INDEX regression=# explain select * from tab where to_tsvector(a,b) @@ to_tsquery('english','foo'); QUERY PLAN ---------------------------------------------------------------- Index Scan using idx on tab (cost=0.00..8.27 rows=1 width=36) Index Cond: (to_tsvector(a, b) @@ '''foo'''::tsquery) (2 rows) If you want to use abbreviations like 'en' and 'de', create text search configurations named that way instead of inserting a run-time conversion. regards, tom lane
Am Sat, 08 Nov 2008 09:44:17 +0100 schrieb Andreas Kraftl: > How can I create a full text index over b? Hello thanks for the answers. But nothing matches my problem. I read the manual again and decide me for an other way. I change my table that it looks like: lang | text ---------------------- german | hallo welt english | hello world Then i follow the manual which says: http://www.postgresql.org/docs/8.3/interactive/textsearch- tables.html#TEXTSEARCH-TABLES-INDEX It is possible to set up more complex expression indexes wherein the configuration name is specified by another column, e.g.: CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector(config_name, body)); It is possible to set up more complex expression indexes wherein the configuration name is specified by another column, e.g.: I did a CREATE INDEX idx ON test.test1 USING gin(to_tsvector(lang, "text")); and CREATE INDEX idx ON test.test1 USING gin(to_tsvector(lang::regconfig, "text")); For both I get this error in german: FEHLER: Funktionen im Indexausdruck müssen als IMMUTABLE markiert sein SQL state: 42P17 means in english functions in index expression must be marked IMMUTABLE Now I have no idea. My experience with databases and postgresql are too less to decide if this is a bug, or myself is the bug ;-). Thanks again for help Andreas
On Mon, Nov 10, 2008 at 09:14:21AM +0100, Andreas Kraftl wrote: > Am Sat, 08 Nov 2008 09:44:17 +0100 schrieb Andreas Kraftl: > > How can I create a full text index over b? > > thanks for the answers. But nothing matches my problem. I'm not sure what's wrong with Oleg's suggestion--he's the guy who wrote most of the code so his suggestions should be reasonable! I'm just learning about this stuff myself, so it may be somewhat sub-optimal. That said, I got things working when doing the following: CREATE TABLE test ( lang TEXT, text TEXT ); INSERT INTO test VALUES ('german', 'hallo welt'), ('english', 'hello world'); CREATE INDEX idx ON test USING gin(tsvector_concat( to_tsvector('german', CASE lang WHEN 'german' THEN text ELSE '' END), to_tsvector('english', CASE lang WHEN 'english' THEN text ELSE '' END))); "text" as a column name gets a bit confusing to read, but I'm trying to follow your names. Also my version of PG didn't seem to know that the '||' operator knows how to concat tsvectors, so I had to spell out tsvector_concat in full. Querying is a bit awkward, but works: SELECT * FROM test WHERE tsvector_concat( to_tsvector('english', CASE lang WHEN 'english' THEN text ELSE '' END), to_tsvector('german', CASE lang WHEN 'german' THEN text ELSE '' END)) @@ to_tsquery('english', 'hello'); Putting most of the above into a query would work, as would having PG automatically maintaining a column of type TSVECTOR. > I read the manual again and decide me for an other way. > I change my table that it looks like: > > lang | text > ---------------------- > german | hallo welt > english | hello world What types do these columns have? if "lang" is of type REGCONFIG all works for me: CREATE TABLE test ( lang REGCONFIG, text TEXT ); INSERT INTO test VALUES ('german', 'hallo welt'), ('english', 'hello world'); CREATE INDEX idx ON test USING gin(to_tsvector(lang, text)); SELECT * FROM test WHERE to_tsvector(lang, text) @@ to_tsquery('english', 'hello'); This all seems much easier than having "lang" as a TEXT column. > Now I have no idea. My experience with databases and postgresql are too > less to decide if this is a bug, or myself is the bug ;-). I think the awkward thing is that text/strings are visually indistin- guishable from arbitrary literals in SQL. The 'english' that's going into the to_tsquery() call above is actually of type REGCONFIG, but it looks like a string literal. I think that could be why you were getting confused before. Hope that all makes sense and helps a bit! Sam
Before waking up properly, I wrote: > Querying is a bit awkward, but works: > > SELECT * > FROM test > WHERE tsvector_concat( > to_tsvector('english', CASE lang WHEN 'english' THEN text ELSE '' END), > to_tsvector('german', CASE lang WHEN 'german' THEN text ELSE '' END)) > @@ to_tsquery('english', 'hello'); > > Putting most of the above into a query would work That should be "Putting most of the above into a VIEW would work"! The example would be: CREATE VIEW test_v AS SELECT lang, text, tsvector_concat( to_tsvector('english', CASE lang WHEN 'english' THEN text ELSE '' END), to_tsvector('german', CASE lang WHEN 'german' THEN text ELSE '' END)) AS tsvec FROM test; Allowing you to do: SELECT lang, text FROM test_v WHERE tsvec @@ to_tsquery('english', 'hello'); Sam