Re: [HACKERS] index fix report - Mailing list pgsql-hackers

From Thomas G. Lockhart
Subject Re: [HACKERS] index fix report
Date
Msg-id 35F00252.E7B6BA5D@alumni.caltech.edu
Whole thread Raw
In response to Re: [HACKERS] index fix report  (Bruce Momjian <maillist@candle.pha.pa.us>)
Responses Re: [HACKERS] index fix report  (Bruce Momjian <maillist@candle.pha.pa.us>)
List pgsql-hackers
> It would help to know if it is the cache, or an index problem.  It is
> sometimes hard to determine because the cache often uses the indexes
> to load the cache.
> Can someone step through a bad entry, and tell me where it is failing?
> If it is in the executor, it probably is an index.  EXPLAIN does show
> what indexes are involved.  Are several indexes failing, or just one?

I'm not sure how to "step through a bad entry" for this case. Just
haven't done it before, and have never used gdb on the backend. That may
explain why I've got so many debugging print statements :)

I believe that in at least some cases the index itself is damaged. If it
were just problems _updating_ the cache, then stopping and restarting
all frontends and backends might fix the problem, at least for the first
query. That doesn't eliminate the possibility that it is a problem with
the cache as it is first built though.

regression=> select oid, relname from pg_class where relname =
'primary_tbl';
oid|relname
---+-------
(0 rows)

regression=> explain select relname
regression->  from pg_class where relname = 'primary_key';
NOTICE:  QUERY PLAN:

Index Scan using pg_class_relname_index on pg_class
  (cost=2.03 size=2 width=32)

EXPLAIN

regression=> select oid, relname from pg_class
regression->  where relname like '%primary%';
  oid|relname
-----+----------------
19947|primary_tbl
19957|primary_tbl_pkey
(2 rows)

regression=> explain select oid, relname from pg_class
regression->  where oid = 19947;
NOTICE:  QUERY PLAN:

Index Scan using pg_class_oid_index on pg_class
  (cost=2.03 size=2 width=36)

EXPLAIN

So these indices appear damaged. Now here is another symptom from my
regression tests, which appears to illustrate cache damage, though since
it is after the fact perhaps a damaged index has just been changed
enough in the meantime to uncover the right nodes...

The regression result shows trouble finding a relation on which to
create the index, and once it has trouble it never finds the relation
_during the same session_:

QUERY: CREATE INDEX onek_unique1 ON onek USING btree(unique1 int4_ops);
QUERY: CREATE INDEX onek_unique2 ON onek USING btree(unique2 int4_ops);
ERROR:  DefineIndex: onek relation not found
QUERY: CREATE INDEX onek_hundred ON onek USING btree(hundred int4_ops);
ERROR:  DefineIndex: onek relation not found
QUERY: CREATE INDEX onek_stringu1 ON onek USING btree(stringu1
name_ops);
ERROR:  DefineIndex: onek relation not found
QUERY: CREATE INDEX tenk1_unique1 ON tenk1 USING btree(unique1
int4_ops);
QUERY: CREATE INDEX tenk1_unique2 ON tenk1 USING btree(unique2
int4_ops);
QUERY: CREATE INDEX tenk1_hundred ON tenk1 USING btree(hundred
int4_ops);
QUERY: CREATE INDEX tenk2_unique1 ON tenk2 USING btree(unique1
int4_ops);
QUERY: CREATE INDEX tenk2_unique2 ON tenk2 USING btree(unique2
int4_ops);
ERROR:  DefineIndex: tenk2 relation not found
QUERY: CREATE INDEX tenk2_hundred ON tenk2 USING btree(hundred
int4_ops);
ERROR:  DefineIndex: tenk2 relation not found
<snip other tables' indices successfully created>

However, if I go back in after the regression test is over, the table is
found and the index created:

regression=> CREATE INDEX onek_unique2 ON onek
regression->  USING btree(unique2 int4_ops);
CREATE

btw, my linux box is not quite as sensitive to the problem(s) as David's
AIX box; his simpler test case does not fail on my machine :(

                       - Tom

pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] index fix report
Next
From: "Thomas G. Lockhart"
Date:
Subject: Re: [HACKERS] index fix report