Re: [GENERAL] cache lookup of relation 165058647 failed - Mailing list pgsql-bugs
From | Sean Chittenden |
---|---|
Subject | Re: [GENERAL] cache lookup of relation 165058647 failed |
Date | |
Msg-id | 91B18B5E-9D8A-11D8-8912-000A95C705DC@chittenden.org |
List | pgsql-bugs |
> I'v find out that this error occurs in:
> dependency.c file
>
> 2004-04-26 11:09:34 ERROR: dependency.c 1621: cache lookup of relation
> 149064743 failed
> 2004-04-26 11:09:34 ERROR: Relation "tmp_table1" does not exist
> 2004-04-26 11:09:34 ERROR: Relation "tmp_table1" does not exist
>
> in getRelationDescription(StringInfo buffer, Oid relid) function.
>
> Any ideas what can cause this errors.

<aol>Me too.</aol>

But I suspect it's a race condition with the new background writer code. I've started testing a new database design and was able to reproduce this on my laptop nearly 90% of the time, but only about 10% of the time on my production databases, until I figured out what the difference was: fsync. fsync was causing enough of a slowdown that SearchSysCache() was finding the tuple, whereas with fsync = false it wasn't able to find it.

To prove that it wasn't fsync itself (I use fsync = false on my laptop to save my poor drive), I threw a sleep in between my tests, and with the sleep in place things work 100% of the time.

The following fails 90% of the time with fsync = false and only 10% of the time with fsync = true:

% psql -f test-begin.sql template1 && psql -f test_enterprise_class.sql && psql -f test-end1.sql template1 && psql -f test-end2.sql template1

But if I change the command to:

% psql -f test-begin.sql template1 && psql -f test_enterprise_class.sql && psql -f test-end1.sql template1 && sleep 1 && psql -f test-end2.sql template1

I have no problems with cache relation misses. As for what happens in those commands, I'm:

-- 1) Dropping the test database and re-creating it
-- 2) In a different connection, loading a rather large schema as the dba
-- 3) Connecting again and creating a temp table
-- 4) Connecting a second time and checking to see if the temp table exists

The sleep comes at step 3.5 in the above sequence of operations.
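For anyone who wants to count failures rather than eyeball them, here is a minimal sketch of a loop around the two command lines above. The helper names run_trials and repro are hypothetical, and the -v ON_ERROR_STOP=1 flag is an addition that is not in my original commands: without it, psql -f exits with status 0 even when a statement in the script raises an ERROR, so a loop keyed on the exit status would never see the failure.

```shell
#!/bin/sh
# Sketch only: run a command N times and print how many runs failed.
# run_trials is a hypothetical helper, not part of psql or PostgreSQL.
run_trials() {
    trials=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$trials" ]; do
        # Discard output; only the exit status of the command matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# The four-step sequence from this message. The argument is the number of
# seconds to sleep before the last step (0 matches the failing command line).
# ON_ERROR_STOP=1 is an assumption added so that SQL errors produce a
# nonzero exit status.
repro() {
    pause=$1
    psql -v ON_ERROR_STOP=1 -f test-begin.sql template1 &&
    psql -v ON_ERROR_STOP=1 -f test_enterprise_class.sql &&
    psql -v ON_ERROR_STOP=1 -f test-end1.sql template1 &&
    sleep "$pause" &&
    psql -v ON_ERROR_STOP=1 -f test-end2.sql template1
}
```

Running "run_trials 100 repro 0" and then "run_trials 100 repro 1" should print the number of failed runs without and with the one-second sleep. It counts any nonzero exit, so a connection failure would be tallied the same as a cache lookup error.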
*boom*

Here's a snippet of my terminal (the first thing I do after BEGINning a transaction is create a temp table if it doesn't exist):

## BEGIN ##
[snip]
[...]
COMMIT
You are now connected to database "test" as user "usr".
BEGIN
psql:test-end2.sql:3: ERROR: cache lookup failed for relation 398033
CONTEXT: SQL query "SELECT TRUE FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE c.relname = 'tmptbl'::TEXT AND c.relkind = 'r'::TEXT AND pg_catalog.pg_table_is_visible(c.oid)"
PL/pgSQL function "create_tmptbl" line 2 at perform
PL/pgSQL function "check_or_populate_func" line 8 at assignment
PL/pgSQL function "setuid_wrapper_func" line 5 at return
## END ##

What's really bothering me is that I can push the up arrow on the console, run the exact same thing (including dropping the database), and it'll work sometimes. Very disturbing. As I said, I'm *very* suspicious of the background writer goo that Jan added, simply because I can't think of anything else that'd have this problem.

I've run each of those commands 100 times now, with and without the sleep 1. With the sleep 1, it's worked 100% of the time.

Jan, any bit of code that comes to mind? All of my bgwriter_* settings are set to their defaults.

-sc

--
Sean Chittenden