Thread: catalog corruption bug
We have encountered a very nasty but apparently rare bug which appears to result in catalog corruption. I have not been able to pin down an exact sequence of events which causes this problem; it appears to be a race condition of some sort. This is what I have been able to figure out so far:

* It appears to be related to temp table creation/deletion.
* It requires at least 2 clients be connected simultaneously.
* It seems to be related to the autovacuum (not sure, just a theory).

I will attempt to explain the circumstances leading up to the problem, and then show the symptoms.

We are working on a project which uses postgresql to store data which has been decomposed into a large number of rows in stored procedures (plpgsql/plperl). The implementation we have been working with recently uses temp tables to store intermediate stages of this decomposition, so that we can run multiple queries over the set in the course of adding it to our final tables without having to regenerate it each time.

We were running a timing test for a load of data which would result in tens of millions of rows. This load creates temp tables with "on commit drop" and also explicitly drops them. It appears to do so at a rate of approximately 10 per second (transactions are also being created/committed at that same rate); a rough sketch of this pattern follows below. This works fine.

While this load was running, we were working on some testing code to determine whether it might be better to create the temp table with "on commit delete rows" instead, and use a plpgsql function to create the temp table with an EXCEPTION duplicate_table block to handle the case where the table has already been created for this connection. We wrote the function at first on a postgres 8.0 box which was not running the load, but when we were attempting to determine what error code was thrown, we noticed that the SQLSTATE variable was not available in 8.0, so we copied the function onto the 8.1 box (which was running the load) to try it out. We ran this function a couple of times to get the error, and then had it catch the duplicate_table exception.

We got the function working, and when we looked at the status of our load we found that it had died with a message saying:

ERROR: pg_class entry for relid 7502381 vanished during vacuuming

We found this interesting, figuring it was a bug in postgres. Googling the non-variable pieces of that message turned up nothing relevant, so we set about trying to reproduce it. During the course of doing so, we restarted our load several times and called the function. We later put the calling of the function into a loop in bash calling psql (so we could disconnect/reconnect) to speed up the finding of the problem.
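In rough outline, each load transaction's temp-table usage amounts to the following. This is only a minimal sketch; "stage", "final_a", and "final_b" are illustrative names standing in for our real schema, and the real queries are far more involved:

-- placeholder target tables standing in for the real schema
CREATE TABLE final_a (a integer, b integer);
CREATE TABLE final_b (a integer, b integer);

BEGIN;
CREATE TEMP TABLE stage (a integer, b integer) ON COMMIT DROP;
-- build the intermediate set once ...
INSERT INTO stage SELECT g, g * 2 FROM generate_series(1, 1000) g;
-- ... then run several queries over it while filling the final tables
INSERT INTO final_a SELECT a, b FROM stage WHERE a % 2 = 0;
INSERT INTO final_b SELECT a, b FROM stage WHERE a % 2 = 1;
DROP TABLE stage;  -- dropped explicitly, in addition to ON COMMIT DROP
COMMIT;

Something shaped like this runs roughly 10 times per second while the load is going.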
These are some of the interesting errors which we got while doing this (all from the server log):

ERROR: duplicate key violates unique constraint "pg_class_relname_nsp_index"
CONTEXT: SQL statement "CREATE TEMP TABLE foo (a integer, b integer) ON COMMIT DELETE ROWS"
PL/pgSQL function "temp_table_test" line 2 at SQL statement
ERROR: relation "foo" does not exist
ERROR: duplicate key violates unique constraint "pg_class_relname_nsp_index"
CONTEXT: SQL statement "CREATE TEMP TABLE foo (a integer, b integer) ON COMMIT DELETE ROWS"
PL/pgSQL function "temp_table_test" line 2 at SQL statement
ERROR: relation "foo" does not exist
FATAL: cache lookup failed for relation 7600066
LOG: server process (PID 20942) exited with exit code 1
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.

We also managed to get an error which was more bothersome than the mysterious disappearing/reappearing temp tables:

ERROR: relation "windowpos" does not exist
ERROR: type "windowpos" already exists
ERROR: cache lookup failed for relation 794218

Later:

ERROR: relation "windowpos" already exists
ERROR: catalog is missing 14 attribute(s) for relid 7577269

(that last error repeated eight times)

Here is the temp table function we were testing:

CREATE OR REPLACE FUNCTION temp_table_test() RETURNS boolean AS $$
BEGIN
    CREATE TEMP TABLE foo (a integer, b integer) ON COMMIT DELETE ROWS;
    RETURN true;
EXCEPTION WHEN duplicate_table THEN
    RETURN false;
END;
$$ LANGUAGE plpgsql;

And our bash command line for stressing:

for i in `seq 1 10000`; do
    echo -e 'select temp_table_test();\n insert into foo select a, b from generate_series(1,200) a, generate_series(1,200) b;\n' | psql -o /dev/null cruft
    if (($? != 0)); then break; fi
done

We may have reproduced the issue using another bash script instead of our timing test, but we are not sure whether this was due to the problem, or due to the catalog already being corrupt in the database. In this test, we did this initially in psql:

CREATE TABLE junk (a integer, b integer);
CREATE INDEX junk_a_idx ON junk(a);
CREATE INDEX junk_b_idx ON junk(b);
CREATE INDEX junk_a_b_idx ON junk(a, b);
INSERT INTO junk SELECT a, b FROM generate_series(1,200) a, generate_series(1,200) b;
TRUNCATE junk;
TRUNCATE junk;

Then immediately in bash we ran:

for i in `seq 1 10000`; do
    echo -e 'insert into junk SELECT a, b FROM generate_series(1,200) a, generate_series(1,200) b;\nCREATE TEMP TABLE goo (a integer, b integer) ON COMMIT DROP;\n' | psql -o /dev/null cruft
    if (($? != 0)); then break; fi
done

Here is my postgres version:

PostgreSQL 8.1.0 on x86_64-pc-linux-gnu, compiled by GCC x86_64-pc-linux-gnu-gcc (GCC) 3.4.4 (Gentoo 3.4.4-r1, ssp-3.4.4-1.0, pie-8.7.8)

And here is the CONFIGURE line from pg_config:

CONFIGURE = '--prefix=/usr' '--mandir=/usr/share/man' '--host=x86_64-pc-linux-gnu' '--with-docdir=/usr/share/doc/postgresql-8.1.0' '--libdir=/usr/lib64' '--enable-depend' '--with-tcl' '--with-python' '--with-perl' '--with-openssl' '--enable-nls' '--with-CXX' '--with-pam' 'CFLAGS=-O2 -pipe -march=nocona -mtune=nocona' 'host_alias=x86_64-pc-linux-gnu'

This was built from the gentoo ebuild version 8.1.0.

My kernel version is:

Linux samir 2.6.13-gentoo-r5 #1 SMP Thu Nov 17 13:26:09 PST 2005 x86_64 Intel(R) Xeon(TM) CPU 3.00GHz GenuineIntel GNU/Linux

We also saw a similar error about a type not being found in the cache on another box which has the same postgres version and platform, but it was an Opteron vs the EM64T Xeon, and was running a different project which also used temp tables created with on commit drop.

I am sorry I could not provide an exact reproduction case, but this is just one of those things that is too timing dependent for that. Let me know if you need any additional information and I will attempt to provide it.

-- 
Half the world is composed of people who have something to say and can't, and the other half who have nothing to say and keep on saying it.
Jeremy Drake <pgsql@jdrake.com> writes:
> We have encountered a very nasty but apparently rare bug which appears to
> result in catalog corruption.

How much of this can you reproduce on 8.1.1? We've fixed a few issues already.

> This was built from the gentoo ebuild version 8.1.0

I'd be even more interested if you can reproduce it on a non-gentoo machine. Gentoo is not noted for stability.

            regards, tom lane
On Wed, 21 Dec 2005, Tom Lane wrote:
> Jeremy Drake <pgsql@jdrake.com> writes:
> > We have encountered a very nasty but apparently rare bug which appears to
> > result in catalog corruption.
>
> How much of this can you reproduce on 8.1.1? We've fixed a few issues
> already.

We did not see this problem for a while. I upgraded the second gentoo box that shows this problem to 8.1.1 basically as soon as the ebuild for it was out. It just started acting up today (but we have not stressed it for a while). It appears to be acting similarly (although corruption which persisted into other backends has not appeared). Here is the error message I currently get on 8.1.1 (names have been changed):

DBD::Pg::st execute failed: ERROR: type "push_temp" already exists
CONTEXT: SQL statement "CREATE TEMPORARY TABLE push_temp (val text) ON COMMIT DROP"
PL/pgSQL function "push_func" line 6 at SQL statement
DBD::Pg::st execute failed: ERROR: type "push_temp" already exists
CONTEXT: SQL statement "CREATE TEMPORARY TABLE push_temp (val text) ON COMMIT DROP"
PL/pgSQL function "push_func" line 6 at SQL statement

postgres=# select version();
                                                           version
------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.1.1 on x86_64-pc-linux-gnu, compiled by GCC x86_64-pc-linux-gnu-gcc (GCC) 3.4.4 (Gentoo 3.4.4-r1, ssp-3.4.4-1.0, pie-8.7.8)
(1 row)

> > This was built from the gentoo ebuild version 8.1.0
>
> I'd be even more interested if you can reproduce it on a non-gentoo
> machine. Gentoo is not noted for stability.

This one was also on Gentoo, this time ebuild version 8.1.1. It looks like they are applying a couple of patches: one appears to just shuffle some makefile stuff around, and the other appears to add support for the SH platform in s_lock.h. Unfortunately, I don't have any non-gentoo boxes around which are on a par with these two hardware-wise.

Also, I think the test cases I tried to come up with were most likely wrong. The code which is currently croaking basically amounts to 9 processes calling functions which do SELECT, INSERT, SELECT FOR UPDATE, DELETE, and UPDATE, as well as CREATE TEMP TABLE ... ON COMMIT DROP. ON COMMIT DROP is the only kind of temp table that this code uses. I could probably rearrange the code in such a way that I can send it, if that would be helpful (although I wouldn't want to waste the effort if it wouldn't be).

Also, what do you figure are the chances of that plperl locale problem causing this? I would guess pretty slim, seeing as I am only using ASCII for my schemas and all of my databases are SQL_ASCII. I am calling plperl functions in both of the projects which are breaking... Also, if I run the command 'locale', everything it prints out is either empty or "POSIX".

-- 
Take your dying with some seriousness, however. Laughing on the way to your execution is not generally understood by less advanced life forms, and they'll call you crazy.
            -- "Messiah's Handbook: Reminders for the Advanced Soul"
Jeremy Drake <pgsql@jdrake.com> writes:
>>> We have encountered a very nasty but apparently rare bug which appears to
>>> result in catalog corruption.

I've been fooling around with this report today. In several hours of trying, I've been able to get one Assert failure from running Jeremy's example on CVS tip. (I would've given up long ago, except the Assert happened very soon after I started trying...) The assert was from this line in hio.c:

    Assert(PageIsNew((PageHeader) pageHeader));

which we've seen before in connection with the vacuum-vs-relation-extension race condition found last May. It seems we still have an issue of that sort :-(.

While fruitlessly waiting for the test to fail again, I've been combing through the code looking for possible failure paths, and I've found something that might explain it. I think this is definitely a bug even if it isn't what's happening in Jeremy's test:

mdread() is defined to not fail, but silently return a page of zeroes, if asked to read a page that's at or beyond the end of the table file. (As noted therein, this seems like pretty sucky design, but there are various reasons that make it hard to change the behavior.) Therefore, if for some reason a process tries to read the page just at EOF, it will leave behind a buffer pool entry that is marked BM_VALID but contains zeroes.

There are a number of scenarios that could cause this, but all seem rather low-probability. One way is if a process' rd_targblock field for a relation is pointing at the last page of the file and then VACUUM comes along and truncates off that page because it's empty. The next insertion attempt by the process will try to fetch that page, obtain all-zeroes, decide the page has no free space (PageGetFreeSpace is carefully coded to be sure that happens), and go looking for free space elsewhere.

Now suppose someone tries to obtain a new page in the relation by calling ReadBuffer(rel, P_NEW). The location of the new page is determined by asking lseek() how long the file is. ReadBuffer then obtains a buffer for that file offset --- and it is going to get a hit on the all-zeroes buffer previously left behind. Since the buffer is marked BM_VALID, the test "if it was already in the buffer pool, we're done" succeeds and the buffer is returned as-is. This is fine as far as the caller knows: it's expecting to get back an all-zero page, so it goes merrily along.

The problem is that if that code path is taken, we *have not extended the file on disk*. That means, until we get around to writing the dirty buffer to disk (eg via checkpoint), the kernel thinks the file doesn't contain that block yet. So if someone else comes along and again does ReadBuffer(rel, P_NEW), the lseek computation will return the same offset as before, and we'll wind up handing back the very same buffer as before. Now we get the above-mentioned Assert, if we are lucky enough to be running an assert-enabled build. Otherwise the code in hio.c will just wipe and reinitialize the page, leading to loss of whatever rows were previously placed in it.

Based on this analysis, the logic in ReadBuffer is wrong: if it finds an existing buffer in the P_NEW case, it still has to zero the page and do smgrextend() to be sure that the kernel thinks the page has been added to the file. I'm also thinking that the test for empty page in hio.c ought to be an actual test and elog, not just an Assert.
Now that we've seen two different bugs allowing the "can't happen" case to happen, I'm no longer satisfied with not having any check there in a non-assert build. The consequences of not detecting an overwrite are too severe.

            regards, tom lane
Here is some additional information that I have managed to gather today regarding this. It is not really what causes it, so much as what does not.

I removed all plperl from the loading processes. I did a VACUUM FULL ANALYZE, and then I reindexed everything in the database (including starting the backend in standalone mode and running REINDEX SYSTEM dbname). They still failed. So it is apparently not that plperl issue which was being discussed earlier.

Also, what I said about the corruption not having persisted into other backends was not quite correct. It was leaving behind types in pg_type which were in some of the pg_temp* schemas and which corresponded to some of the temp tables. But I took those out and still had issues (slightly different).

Here is some interesting stuff too. I just stopped my processes so I could start up a batch again and copy the error message I get now, but before doing so I was running a VACUUM FULL ANALYZE VERBOSE so I could hopefully start from a relatively clean state. I got a few warnings I don't remember seeing before:

INFO: vacuuming "pg_catalog.pg_shdepend"
INFO: "pg_shdepend": found 108 removable, 440 nonremovable row versions in 15 pages
DETAIL: 0 dead row versions cannot be removed yet.
Nonremovable row versions range from 53 to 53 bytes long.
There were 1492 unused item pointers.
Total free space (including removable row versions) is 89780 bytes.
7 pages are or will become empty, including 0 at the end of the table.
12 pages containing 89744 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "pg_shdepend_depender_index" now contains 448 row versions in 33 pages
DETAIL: 108 index row versions were removed.
23 index pages have been deleted, 23 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.10 sec.
WARNING: index "pg_shdepend_depender_index" contains 448 row versions, but table contains 440 row versions
HINT: Rebuild the index with REINDEX.
INFO: index "pg_shdepend_reference_index" now contains 448 row versions in 12 pages
DETAIL: 108 index row versions were removed.
3 index pages have been deleted, 3 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
WARNING: index "pg_shdepend_reference_index" contains 448 row versions, but table contains 440 row versions
HINT: Rebuild the index with REINDEX.
INFO: "pg_shdepend": moved 4 row versions, truncated 15 to 4 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "pg_shdepend_depender_index" now contains 448 row versions in 33 pages
DETAIL: 4 index row versions were removed.
23 index pages have been deleted, 23 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
WARNING: index "pg_shdepend_depender_index" contains 448 row versions, but table contains 440 row versions
HINT: Rebuild the index with REINDEX.
INFO: index "pg_shdepend_reference_index" now contains 448 row versions in 12 pages
DETAIL: 4 index row versions were removed.
4 index pages have been deleted, 4 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
WARNING: index "pg_shdepend_reference_index" contains 448 row versions, but table contains 440 row versions
HINT: Rebuild the index with REINDEX.
INFO: analyzing "pg_catalog.pg_shdepend"
INFO: "pg_shdepend": scanned 4 of 4 pages, containing 440 live rows and 0 dead rows; 440 rows in sample, 440 estimated total rows

The situation is similar for pg_type (248 index row versions vs 244 row versions in the table), pg_attribute (1631 vs 1619), and pg_index (95 vs 94).

Looks like it may be time to start a standalone backend and REINDEX again (the procedure is sketched below)...

-- 
Don't smoke the next cigarette. Repeat.
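For reference, the reindex procedure amounts to roughly the following. The data directory path and database name are placeholders for the real ones here; the -P option makes the standalone backend ignore the system indexes while reading the catalogs, so the damaged indexes can be rebuilt:

-- started from the shell as:  postgres -D data -P dbname
-- then, in the standalone backend:
REINDEX SYSTEM dbname;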
Jeremy Drake <pgsql@jdrake.com> writes:
> Here is some additional information that I have managed to gather today
> regarding this. It is not really what causes it, so much as what does
> not.
> ...
> Similar for pg_type, there being 248 index row versions vs 244 row
> versions in the table.

The ReadBuffer bug I just fixed could result in disappearance of catalog rows, so this observation is consistent with the theory that that's what's biting you. It's not proof though...

            regards, tom lane
On Thu, 5 Jan 2006, Tom Lane wrote:
> The ReadBuffer bug I just fixed could result in disappearance of catalog
> rows, so this observation is consistent with the theory that that's
> what's biting you. It's not proof though...

Well, I applied that patch that you sent me the link to (the bufmgr.c one), and rebuilt (PORTDIR_OVERLAY is cool...). I ran my nine processes which hammer things overnight, and in the morning one of them was dead:

DBD::Pg::st execute failed: ERROR: duplicate key violates unique constraint "pg_type_typname_nsp_index"
CONTEXT: SQL statement "CREATE TEMPORARY TABLE push_tmp (val text) ON COMMIT DROP"
PL/pgSQL function "push_func" line 6 at SQL statement
DBD::Pg::st execute failed: ERROR: duplicate key violates unique constraint "pg_type_typname_nsp_index"
CONTEXT: SQL statement "CREATE TEMPORARY TABLE push_tmp (val text) ON COMMIT DROP"
PL/pgSQL function "push_func" line 6 at SQL statement

I also write out the time as my processes progress, so I know roughly when it happened too. It happened at 1136534029 (the result of the perl time() function), which when run through localtime() yields Thu Jan 5 23:53:49 2006. It looks like I started the processes at about 18:30, so they lasted a while at least.

Let me know if there is anything else I can try to help debug this (asserts on?).
Jeremy Drake <pgsql@jdrake.com> writes:
> Well, I applied that patch that you sent me the link to (the bufmgr.c
> one), and rebuilt (PORTDIR_OVERLAY is cool...)

> I ran my nine processes which hammer things overnight, and in the
> morning one of them was dead.

> DBD::Pg::st execute failed: ERROR: duplicate key violates unique
> constraint "pg_type_typname_nsp_index"

Hm, did you REINDEX things beforehand? This could be leftover corruption...

            regards, tom lane
On Fri, 6 Jan 2006, Tom Lane wrote:
> Jeremy Drake <pgsql@jdrake.com> writes:
> > Well, I applied that patch that you sent me the link to (the bufmgr.c
> > one), and rebuilt (PORTDIR_OVERLAY is cool...)
> >
> > I ran my nine processes which hammer things overnight, and in the
> > morning one of them was dead.
> >
> > DBD::Pg::st execute failed: ERROR: duplicate key violates unique
> > constraint "pg_type_typname_nsp_index"
>
> Hm, did you REINDEX things beforehand? This could be leftover corruption...

Yes. I ran that VACUUM FULL ANALYZE VERBOSE which I emailed part of the excerpt from, and then I started a standalone backend (postgres -D data -P) and ran REINDEX SYSTEM dbname on the database in question.
Jeremy Drake <pgsql@jdrake.com> writes:
>>> DBD::Pg::st execute failed: ERROR: duplicate key violates unique
>>> constraint "pg_type_typname_nsp_index"
>>
>> Hm, did you REINDEX things beforehand? This could be leftover corruption...

> Yes. I ran that VACUUM FULL ANALYZE VERBOSE which I emailed part of the
> excerpt from, and then I started a standalone backend (postgres -D data
> -P) and ran REINDEX SYSTEM dbname on the database in question.

OK, this must be a different issue then. I think we have seen reports like this one before, but not been able to reproduce it.

Could you rebuild with Asserts enabled and see if any asserts trigger? Also, at this point it'd be worth trying to boil it down to a test case you can give to other people.

            regards, tom lane
On Fri, 6 Jan 2006, Tom Lane wrote:
> OK, this must be a different issue then. I think we have seen reports
> like this one before, but not been able to reproduce it.
>
> Could you rebuild with Asserts enabled and see if any asserts trigger?

I got an assert to fail. I'm not entirely sure if this is helpful, but I managed to get a core dump with --enable-debug and --enable-cassert (with optimizations still on). Let me know if there is anything else that would be useful to get out of this core file...

(gdb) bt
#0  0x00002aaaab8a0cf9 in kill () from /usr/lib64/libc.so.6
#1  0x00002aaaab8a0a3d in raise () from /usr/lib64/libc.so.6
#2  0x00002aaaab8a1c82 in abort () from /usr/lib64/libc.so.6
#3  0x00000000005f9838 in ExceptionalCondition (
    conditionName=0x1abc <Address 0x1abc out of bounds>,
    errorType=0x6 <Address 0x6 out of bounds>, fileName=0x0, lineNumber=-1)
    at assert.c:51
#4  0x00000000005eec5d in AtEOXact_CatCache (isCommit=-68 '') at catcache.c:575
#5  0x000000000047464f in CommitTransaction () at xact.c:1573
#6  0x0000000000474d07 in CommitTransactionCommand () at xact.c:2184
#7  0x00000000005581da in AutoVacMain (argc=6844, argv=0x6) at autovacuum.c:688
#8  0x0000000000558907 in autovac_start () at autovacuum.c:170
#9  0x000000000055e66b in ServerLoop () at postmaster.c:1269
#10 0x000000000055f9b9 in PostmasterMain (argc=3, argv=0x8832e0) at postmaster.c:943
#11 0x000000000051fb43 in main (argc=3, argv=0x8832e0) at main.c:256

(gdb) frame 4
#4  0x00000000005eec5d in AtEOXact_CatCache (isCommit=-68 '') at catcache.c:575
575                     Assert(!ct->dead);

(gdb) l
570             {
571                     CatCTup *ct = (CatCTup *) DLE_VAL(elt);
572
573                     Assert(ct->ct_magic == CT_MAGIC);
574                     Assert(ct->refcount == 0);
575                     Assert(!ct->dead);
576             }
577         }
578     #endif
579     }

(gdb) set print pretty
(gdb) p *ct
$1 = {
  ct_magic = 1462113538,
  my_cache = 0x2aaaaaac3060,
  lrulist_elem = {
    dle_next = 0x0,
    dle_prev = 0x939ab0,
    dle_val = 0x2aaaaab19e18,
    dle_list = 0x93b1a8
  },
  cache_elem = {
    dle_next = 0x0,
    dle_prev = 0x934b58,
    dle_val = 0x2aaaaab19e18,
    dle_list = 0x2aaaaaac36c8
  },
  c_list = 0x0,
  refcount = 0,
  dead = 1 '\001',
  negative = 0 '\0',
  hash_value = 15438,
  tuple = {
    t_len = 48,
    t_self = {
      ip_blkid = {
        bi_hi = 0,
        bi_lo = 0
      },
      ip_posid = 70
    },
    t_tableOid = 2602,
    t_datamcxt = 0x914998,
    t_data = 0x2aaaaab19f30
  }
}
Jeremy Drake <pgsql@jdrake.com> writes:
> I got an assert to fail. I'm not entirely sure if this is helpful, but I
> managed to get a core dump with --enable-debug and --enable-cassert (with
> optimizations still on). Let me know if there is anything else that would
> be useful to get out of this core file...

Fascinating --- that's not anywhere near where I thought your problem was. Which cache is this tuple in? (Print *ct->my_cache)

The tableOid implies it's one of the caches on pg_amop, which makes the whole thing stranger yet. pg_amop doesn't change during normal operation so there's no reason for one of its tuples to become dead. You aren't creating/deleting operator classes in this database are you?

            regards, tom lane
On Sat, 7 Jan 2006, Tom Lane wrote:
> Fascinating --- that's not anywhere near where I thought your problem
> was. Which cache is this tuple in? (Print *ct->my_cache)

$2 = {
  id = 3,
  cc_next = 0x2aaaaaac1048,
  cc_relname = 0x2aaaaab19df8 "pg_amop",
  cc_reloid = 2602,
  cc_indexoid = 2654,
  cc_relisshared = 0 '\0',
  cc_tupdesc = 0x2aaaaab199e0,
  cc_reloidattr = 0,
  cc_ntup = 3,
  cc_nbuckets = 256,
  cc_nkeys = 2,
  cc_key = {5, 1, 0, 0},
  cc_hashfunc = {0x44e1a0 <hashoid>, 0x44e1a0 <hashoid>, 0, 0},
  cc_skey = {{
      sk_flags = 0,
      sk_attno = 5,
      sk_strategy = 3,
      sk_subtype = 0,
      sk_func = {
        fn_addr = 0x5bb8c0 <oideq>,
        fn_oid = 184,
        fn_nargs = 2,
        fn_strict = 1 '\001',
        fn_retset = 0 '\0',
        fn_extra = 0x0,
        fn_mcxt = 0x914998,
        fn_expr = 0x0
      },
      sk_argument = 0
    }, {
      sk_flags = 0,
      sk_attno = 1,
      sk_strategy = 3,
      sk_subtype = 0,
      sk_func = {
        fn_addr = 0x5bb8c0 <oideq>,
        fn_oid = 184,
        fn_nargs = 2,
        fn_strict = 1 '\001',
        fn_retset = 0 '\0',
        fn_extra = 0x0,
        fn_mcxt = 0x914998,
        fn_expr = 0x0
      },
      sk_argument = 0
    }, {
      sk_flags = 0,
      sk_attno = 0,
      sk_strategy = 0,
      sk_subtype = 0,
      sk_func = {
        fn_addr = 0,
        fn_oid = 0,
        fn_nargs = 0,
        fn_strict = 0 '\0',
        fn_retset = 0 '\0',
        fn_extra = 0x0,
        fn_mcxt = 0x0,
        fn_expr = 0x0
      },
      sk_argument = 0
    }, {
      sk_flags = 0,
      sk_attno = 0,
      sk_strategy = 0,
      sk_subtype = 0,
      sk_func = {
        fn_addr = 0,
        fn_oid = 0,
        fn_nargs = 0,
        fn_strict = 0 '\0',
        fn_retset = 0 '\0',
        fn_extra = 0x0,
        fn_mcxt = 0x0,
        fn_expr = 0x0
      },
      sk_argument = 0
    }},
  cc_isname = "\000\000\000",
  cc_lists = {
    dll_head = 0x935018,
    dll_tail = 0x934c50
  },
  cc_bucket = {{
      dll_head = 0x0,
      dll_tail = 0x0
    }}
}

Am I correct in interpreting this as the hash opclass for Oid? That's really bizarre. Definitely didn't change that.

> The tableOid implies it's one of the caches on pg_amop, which makes
> the whole thing stranger yet. pg_amop doesn't change during normal
> operation so there's no reason for one of its tuples to become dead.
> You aren't creating/deleting operator classes in this database are
> you?

Nope. As a matter of fact, I never created any custom operator classes in this database.

-- 
Given my druthers, I'd druther not.
Jeremy Drake <pgsql@jdrake.com> writes:
> Am I correct in interpreting this as the hash opclass for Oid?

No, it's the AMOPOPID catalog cache (containing rows from pg_amop indexed by amopopr/amopclaid).

After digging around for a bit I noticed that catalog caches get flushed if someone vacuums the associated catalog. I have a theory now: somebody vacuumed pg_amop while this process had an open CatCList of entries from this catcache. ResetCatalogCache would mark both the CatCList and its member entries as "dead", because they can't be released while the CatCList has a positive refcount. When ReleaseCatCacheList is called, it would in turn call CatCacheRemoveCList ... which would remove the list, but it doesn't remove the member entries. That's a bug, because the entries should go away if they're dead and no longer have any reference counts keeping them in the "zombie" state.

However, AFAICS the only consequence of this bug is to trigger that Assert failure if you've got Asserts enabled. Dead catcache entries aren't actually harmful except for wasting some space. So I don't think this is related to your pg_type duplicate key problem.

One weak spot in this theory is the assumption that somebody was vacuuming pg_amop. It seems unlikely that autovacuum would do so since the table never changes (unless you had reached the point where an anti-XID-wraparound vacuum was needed, which is unlikely in itself). Do you have any background processes that do full-database VACUUMs?

I'll go fix CatCacheRemoveCList, but I think this is not the bug we're looking for.

            regards, tom lane
On Sat, 7 Jan 2006, Tom Lane wrote:
> Jeremy Drake <pgsql@jdrake.com> writes:
> > Am I correct in interpreting this as the hash opclass for Oid?
>
> However, AFAICS the only consequence of this bug is to trigger
> that Assert failure if you've got Asserts enabled. Dead catcache
> entries aren't actually harmful except for wasting some space.
> So I don't think this is related to your pg_type duplicate key
> problem.
>
> One weak spot in this theory is the assumption that somebody was
> vacuuming pg_amop. It seems unlikely that autovacuum would do so
> since the table never changes (unless you had reached the point
> where an anti-XID-wraparound vacuum was needed, which is unlikely
> in itself). Do you have any background processes that do full-database
> VACUUMs?

No. Just the autovacuum, which is actually the process which had the assert failure. This appears to give the current xid:

(gdb) p *s
$10 = {
  transactionId = 13568516,
  subTransactionId = 1,
  name = 0x0,
  savepointLevel = 0,
  state = TRANS_COMMIT,
  blockState = TBLOCK_STARTED,
  nestingLevel = 1,
  curTransactionContext = 0x9529c0,
  curTransactionOwner = 0x92eb40,
  childXids = 0x0,
  currentUser = 0,
  prevXactReadOnly = 0 '\0',
  parent = 0x0
}

> I'll go fix CatCacheRemoveCList, but I think this is not the bug
> we're looking for.

Incidentally, one of my processes did get that error at the same time. All of the other processes had an error:

DBD::Pg::st execute failed: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

But this one had:

DBD::Pg::st execute failed: ERROR: duplicate key violates unique constraint "pg_type_typname_nsp_index"

It looks like my kernel did not have the option to append the pid to core files, so perhaps they both croaked at the same time but only this one got to write a core file? I will enable this and try again, and see if I can't get it to make 2 cores.

BTW, nothing of any interest made it into the backend log regarding what assert(s) failed.
Jeremy Drake <pgsql@jdrake.com> writes:
> On Sat, 7 Jan 2006, Tom Lane wrote:
>> I'll go fix CatCacheRemoveCList, but I think this is not the bug
>> we're looking for.

> Incidentally, one of my processes did get that error at the same time.
> All of the other processes had an error
> DBD::Pg::st execute failed: server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> But this one had the DBD::Pg::st execute failed: ERROR: duplicate key
> violates unique constraint "pg_type_typname_nsp_index"

Oh, that's interesting ... maybe there is some relation after all? Hard to see what ...

> It looks like my kernel did not have the option to append the pid to core
> files, so perhaps they both croaked at the same time but only this one got
> to write a core file?

Yeah, they'd all be dumping into the same directory. It's reasonable to suppose that the corefile you have is from the one that aborted last. That would suggest that this is effect not cause ... hmmm ...

A bit of a leap in the dark, but: maybe the triggering event for this situation is not a "VACUUM pg_amop" but a global cache reset due to sinval message buffer overrun. It's fairly clear how that would lead to the CatCacheRemoveCList bug. The duplicate-key failure could be an unrelated bug triggered by the same condition. I have no idea yet what the mechanism could be, but cache reset is a sufficiently seldom-exercised code path that it's entirely plausible that there are bugs lurking in it.

If this is correct then we could vastly increase the probability of seeing the bug by setting up something to force cache resets at a high rate. If you're interested I could put together a code patch for that.

> BTW, nothing of any interest made it into the backend log regarding what
> assert(s) failed.

What you'd be looking for is a line starting "TRAP:".

            regards, tom lane
On Sat, 7 Jan 2006, Tom Lane wrote:
> Jeremy Drake <pgsql@jdrake.com> writes:
> > On Sat, 7 Jan 2006, Tom Lane wrote:
> >> I'll go fix CatCacheRemoveCList, but I think this is not the bug
> >> we're looking for.
>
> A bit of a leap in the dark, but: maybe the triggering event for this
> situation is not a "VACUUM pg_amop" but a global cache reset due to
> sinval message buffer overrun. It's fairly clear how that would lead
> to the CatCacheRemoveCList bug. The duplicate-key failure could be an
> unrelated bug triggered by the same condition. I have no idea yet what
> the mechanism could be, but cache reset is a sufficiently seldom-exercised
> code path that it's entirely plausible that there are bugs lurking in it.
>
> If this is correct then we could vastly increase the probability of
> seeing the bug by setting up something to force cache resets at a high
> rate. If you're interested I could put together a code patch for that.

I tried that function you sent, while running my other code. It died, but not the same way. None of my processes had the unique constraint error, but two had failed during commit. Both of them died in that same place as the last one, on pg_amop.

I think I am going to just run without the function running this time and see if it does the duplicate type error and if it will generate two cores.

-- 
To kick or not to kick...
            -- Somewhere on IRC, inspired by Shakespeare
Jeremy Drake <pgsql@jdrake.com> writes:
> On Sat, 7 Jan 2006, Tom Lane wrote:
>> A bit of a leap in the dark, but: maybe the triggering event for this
>> situation is not a "VACUUM pg_amop" but a global cache reset due to
>> sinval message buffer overrun.

> I tried that function you sent, while running my other code. It died, but
> not the same way. None of my processes had the unique constraint error,
> but two had failed during commit. Both of them died in that same place as
> the last one, on pg_amop.

Yeah, that's not very surprising. Running the forced-cache-resets function will definitely expose that catcache bug pretty quickly. You'd need to apply the patches I put in yesterday to have a system that has any chance of withstanding that treatment for any length of time.

> I think I am going to just run without the function running this time and
> see if it does the duplicate type error and if it will generate two cores.

Please also look at putting together a test case so others can poke at this.

            regards, tom lane
On Sun, 8 Jan 2006, Tom Lane wrote:
> Yeah, that's not very surprising. Running the forced-cache-resets
> function will definitely expose that catcache bug pretty quickly.
> You'd need to apply the patches I put in yesterday to have a system
> that has any chance of withstanding that treatment for any length of time.
>
> > I think I am going to just run without the function running this time and
> > see if it does the duplicate type error and if it will generate two cores.

I ran without that function you made, and it got the error, but not a crash. I stuck an Assert(false) right before the ereport for that particular error, and I did end up with a core there, but I don't see anything out of the ordinary (what little I know of the ordinary ;)

#0  0x00002aaaab8a0cf9 in kill () from /usr/lib64/libc.so.6
#1  0x00002aaaab8a0a3d in raise () from /usr/lib64/libc.so.6
#2  0x00002aaaab8a1c82 in abort () from /usr/lib64/libc.so.6
#3  0x00000000005f9878 in ExceptionalCondition (
    conditionName=0x2c53 <Address 0x2c53 out of bounds>,
    errorType=0x6 <Address 0x6 out of bounds>, fileName=0x0, lineNumber=-1)
    at assert.c:51
#4  0x0000000000460967 in _bt_doinsert (rel=0x2aaaaab05568, btitem=0xbec2c0,
    index_is_unique=1 '\001', heapRel=0x8bf0f0) at nbtinsert.c:247
#5  0x0000000000463773 in btinsert (fcinfo=0x2c53) at nbtree.c:228
#6  0x00000000005fe869 in FunctionCall6 (flinfo=0x8, arg1=6, arg2=0,
    arg3=18446744073709551615, arg4=0, arg5=0, arg6=0) at fmgr.c:1267
#7  0x000000000045bf4f in index_insert (indexRelation=0x2aaaaab05568,
    values=0x7fffffdfde20, isnull=0x7fffffdfde00 "", heap_t_ctid=0xbebeac,
    heapRelation=0x8bf0f0, check_uniqueness=1 '\001') at indexam.c:215
#8  0x000000000048f8fa in CatalogIndexInsert (indstate=0x2c53,
    heapTuple=0xbebb88) at indexing.c:124
#9  0x000000000048f994 in CatalogUpdateIndexes (heapRel=0x2c53,
    heapTuple=0xbebea8) at indexing.c:149
#10 0x000000000049bc67 in TypeCreate (typeName=0x7fffffdfe3e0 "push_tmp",
    typeNamespace=11057063, relationOid=12171371, relationKind=114 'r',
    internalSize=-16728, typeType=99 'c', typDelim=44 ',',
    inputProcedure=2290, outputProcedure=2291, receiveProcedure=2402,
    sendProcedure=2403, analyzeProcedure=0, elementType=0, baseType=0,
    defaultTypeValue=0x0, defaultTypeBin=0x0, passedByValue=-16 '',
    alignment=100 'd', storage=120 'x', typeMod=-1, typNDims=0,
    typeNotNull=0 '\0') at pg_type.c:316
#11 0x000000000048c361 in heap_create_with_catalog (
    relname=0x7fffffdfe3e0 "push_tmp", relnamespace=11057063,
    reltablespace=0, relid=12171371, ownerid=16384, tupdesc=0xbeb8e8,
    relkind=114 'r', shared_relation=0 '\0', oidislocal=0 '\0',
    oidinhcount=0, oncommit=ONCOMMIT_DROP,
    allow_system_table_mods=0 '\0') at heap.c:634
#12 0x00000000004de220 in DefineRelation (stmt=0x93fc30, relkind=114 'r')
    at tablecmds.c:423
#13 0x000000000058bfd0 in ProcessUtility (parsetree=0x93fc30, params=0x0,
    dest=0x814b40, completionTag=0x0) at utility.c:497
#14 0x0000000000515cb5 in _SPI_execute_plan (plan=0x93f9a8, Values=0x9c5798,
    Nulls=0x9c57b8 "~", '\177' <repeats 199 times>..., snapshot=0x0,
    crosscheck_snapshot=0x0, read_only=0 '\0', tcount=0) at spi.c:1449
#15 0x00000000005165fc in SPI_execute_plan (plan=0x93f9a8, Values=0x9c5798,
    Nulls=0x9c57b8 "~", '\177' <repeats 199 times>..., read_only=0 '\0',
    tcount=0) at spi.c:336
#16 0x00002aaaac95d8a4 in exec_stmts (estate=0x7fffffdfe950, stmts=0x6)
    at pl_exec.c:2280
#17 0x00002aaaac95ebc2 in exec_stmt_block (estate=0x7fffffdfe950,
    block=0x8f2c70) at pl_exec.c:936
#18 0x00002aaaac95f5ab in plpgsql_exec_function (func=0x913bc8,
    fcinfo=0x7fffffdfea90) at pl_exec.c:286
#19 0x00002aaaac9573f5 in plpgsql_call_handler (fcinfo=0x7fffffdfea90)
    at pl_handler.c:123
#20 0x0000000000501a74 in ExecMakeFunctionResult (fcache=0x90a7f0,
    econtext=0x90a6c0, isNull=0x90ae38 "\177~\177\177\177\177\177\177!\006",
    isDone=0x90aef0) at execQual.c:1095
#21 0x0000000000505543 in ExecProject (projInfo=0x90ae58,
    isDone=0x7fffffdfeef4) at execQual.c:3669
#22 0x000000000050ff5a in ExecResult (node=0x90a5a8) at nodeResult.c:157
#23 0x000000000050034d in ExecProcNode (node=0x90a5a8) at execProcnode.c:306
#24 0x00000000004ff5ea in ExecutorRun (queryDesc=0x90a5a8,
    direction=ForwardScanDirection, count=0) at execMain.c:1110
#25 0x000000000058a5de in PortalRunSelect (portal=0x8e6c68, forward=1 '\001',
    count=0, dest=0x8dad30) at pquery.c:794
#26 0x000000000058abdf in PortalRun (portal=0x8e6c68,
    count=9223372036854775807, dest=0x8dad30, altdest=0x8dad30,
    completionTag=0x7fffffdff320 "") at pquery.c:646
#27 0x0000000000588fcb in PostgresMain (argc=9333864, argv=0x8dac18,
    username=0x8853f0 "jeremyd") at postgres.c:1754
#28 0x000000000055e20a in ServerLoop () at postmaster.c:2853
#29 0x000000000055f9f9 in PostmasterMain (argc=3, argv=0x8832e0)
    at postmaster.c:943
#30 0x000000000051fb83 in main (argc=3, argv=0x8832e0) at main.c:256

> Please also look at putting together a test case so others can poke at
> this.

I sent some emails around which should hopefully set this in motion.
Jeremy Drake <pgsql@jdrake.com> writes:
> I ran without that function you made, and it got the error, but not a
> crash. I stuck an Assert(false) right before the ereport for that
> particular error, and I did end up with a core there, but I don't see
> anything out of the ordinary (what little I know of the ordinary ;)

Yeah, that's just the CREATE TEMP TABLE doing what it's supposed to do. The problem is presumably that a prior DROP operation failed to remove the pg_type row associated with a previous temp table of the same name ... but why that would happen is still really unclear.

Does your application drop these temp tables explicitly, or leave them to be dropped automatically during commit? It might be interesting to see whether changing that makes any difference. I'm also curious whether the transaction that makes the temp table is ever rolled back instead of committed.

            regards, tom lane
On Mon, 9 Jan 2006, Tom Lane wrote:
> Does your application drop these temp tables explicitly, or leave them
> to be dropped automatically during commit? It might be interesting to
> see whether changing that makes any difference.

I drop them explicitly at the end of the function (a sketch of the pattern is below).

> I'm also curious
> whether the transaction that makes the temp table is ever rolled back
> instead of committed.

Not intentionally/explicitly. The only time it should roll back is if it gets an error (which tends to be this error). I do sometimes hit ^C on the perl scripts to tweak something, which would roll it back if it is in this particular code, but I don't think I did that on the last run, at least.

-- 
Every little picofarad has a nanohenry all its own.
            -- Don Vonada
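For reference, the functions doing this are shaped roughly like the following. This is a simplified sketch rather than the actual code: push_tmp and its val column come from the error messages earlier in the thread, while the function name and the work done between the CREATE and the DROP are only illustrative.

CREATE OR REPLACE FUNCTION push_func_sketch(v text) RETURNS void AS $$
BEGIN
    -- created at the start of each call; would also be dropped at commit
    CREATE TEMPORARY TABLE push_tmp (val text) ON COMMIT DROP;
    INSERT INTO push_tmp (val) VALUES (v);
    -- ... the real function runs further queries against push_tmp here ...
    -- dropped explicitly at the end of the function as well
    DROP TABLE push_tmp;
    RETURN;
END;
$$ LANGUAGE plpgsql;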