Thread: Index trouble with 8.3b4
I decided to play a bit with 8.3-b4. I did a fresh install from source, fresh initdb, and created a single test table (about 700K rows) to play with in-core FTS: > Welcome to psql 8.3beta4, the PostgreSQL interactive terminal. > > hannes=> \d fts > Table "public.fts" > Column | Type | Modifiers > --------+----------+-------------------------------------------------- > id | integer | not null default nextval('fts_id_seq'::regclass) > text | text | not null > tsv | tsvector | > Indexes: > "pk_fts" PRIMARY KEY, btree (id) > > > hannes=> CREATE INDEX CONCURRENTLY "ts_fts_tsv" ON "public"."fts" USING gin ("tsv"); > ERROR: item pointer (0,1) alreadt exists I was able to reproduce that error a few times, but not always. It seems it only happens with CONCURRENTLY. After creating a GIST index instead of GIN I tried to cluster on that: > hannes=> CLUSTER fts USING ts_fts_tsv; > ERROR: could not create unique index "pk_fts" > DETAIL: Table contains duplicated values. So duplicate values in my PK column? > hannes=> SET enable_indexscan = off; > SET > hannes=> > hannes=> SELECT > hannes-> f.id, > hannes-> COUNT(f.id) > hannes-> FROM > hannes-> public.fts f > hannes-> GROUP BY > hannes-> f.id > hannes-> HAVING > hannes-> COUNT(f.id) > 1; > id | count > ----+------- > (0 rows) Where are they? I'm a bit short of time and will probably not find time to debug this further until next weekend, sorry. -- Best regards, Hannes Dorbath
Hannes Dorbath <light@theendofthetunnel.de> writes: > hannes=> CREATE INDEX CONCURRENTLY "ts_fts_tsv" ON "public"."fts" USING gin ("tsv"); > ERROR: item pointer (0,1) alreadt exists > I was able to reproduce that error a few times, but not always. It seems > it only happens with CONCURRENTLY. > After creating a GIST index instead of GIN I tried to cluster on that: > hannes=> CLUSTER fts USING ts_fts_tsv; > ERROR: could not create unique index "pk_fts" > DETAIL: Table contains duplicated values. > So duplicate values in my PK column? I didn't have any luck reproducing either of these behaviors --- maybe it's data-dependent. Can you extract a test case? regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Hannes Dorbath <light@theendofthetunnel.de> writes: >> hannes=> CREATE INDEX CONCURRENTLY "ts_fts_tsv" ON "public"."fts" USING gin ("tsv"); >> ERROR: item pointer (0,1) alreadt exists > >> I was able to reproduce that error a few times, but not always. It seems >> it only happens with CONCURRENTLY. >... > I didn't have any luck reproducing either of these behaviors --- maybe > it's data-dependent. Can you extract a test case? I the GIN problem kicks in with the "posting tree" representation. That would require more than about 85 records with the same matching lexeme-key. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > I didn't have any luck reproducing either of these behaviors --- maybe > it's data-dependent. Can you extract a test case? I haven't been able to reproduce this either but I produced an entirely different problem: postgres=# create index concurrently dg5 on doc using gin (to_tsvector('english',d)); ERROR: deadlock detected DETAIL: Process 7076 waits for ShareLock on unrecognized locktag type 5; blocked by process 10497. Process 10497 waits for ShareUpdateExclusiveLock on relation 24656 of database 11511; blocked by process 7076. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Tom Lane wrote: > I didn't have any luck reproducing either of these behaviors --- maybe > it's data-dependent. Can you extract a test case? I will try to come up with a test case, but it might take until next weekend, sorry. The test data was an IRC log file, containing lots of color and formating codes as well as other garbage. So anything else than a well formed text document. I cleaned it a bit with -f UTF-8 -t UTF-8 -c to make it import at all. -- Best regards, Hannes Dorbath