Thread: failed to re-find parent key
I have hacked analyze.c to automatically create a unique index on the
oid when a table is created and I am getting the failed to re-find
parent key in pg_attribute_relid_attnam_index every 8 attempts to do the
following

    select * from foo into temp a;
    drop table a;

Currently analyze does not create the oid index on the select into.

I realize this is beyond the realm of supported code, but can anyone
tell me what's going on or a better way to fix it.

BTW, the real problem is that select * from foo where oid=? doesn't use
an index scan.

Dave
--
Dave Cramer
519 939 0336
ICQ # 1467551
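For the index-scan problem mentioned at the end of this message, one alternative to patching analyze.c is to create the oid index by hand. This is only a sketch: it assumes the table was created with oids (the default in the 7.4 era; later releases dropped oid columns), and the index names are made up for illustration. Whether the planner actually chooses an index scan depends on table size, statistics, and the type of the value being compared.

```sql
-- Hypothetical index names; "foo" and "a" are the tables from the message above.
CREATE UNIQUE INDEX foo_oid_idx ON foo (oid);
ANALYZE foo;

-- Inspect the plan; 12345 is a placeholder oid value.
EXPLAIN SELECT * FROM foo WHERE oid = 12345;

-- A table built with SELECT INTO gets no indexes, so add one afterwards.
SELECT * INTO TEMP a FROM foo;
CREATE UNIQUE INDEX a_oid_idx ON a (oid);
```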
Sorry folks, this is a red-herring, my hack didn't have anything to do
with it. It occurs with or without the index.

Question is was there a bug between 7.4 and 7.4.1 which may have caused
this?

Dave

On Tue, 2004-01-13 at 16:30, Dave Cramer wrote:
> I have hacked analyze.c to automatically create a unique index on the
> oid when a table is created and I am getting the failed to re-find
> parent key in pg_attribute_relid_attnam_index every 8 attempts to do the
> following
>
> select * from foo into temp a;
> drop table a;
>
> Currently analyze does not create the oid index on the select into.
>
> I realize this is beyond the realm of supported code, but can anyone
> tell me what's going on or a better way to fix it.
>
> BTW, the real problem is that select * from foo where oid=? doesn't use
> an index scan.
>
> Dave
--
Dave Cramer
519 939 0336
ICQ # 1467551
Dave Cramer <pg@fastcrypt.com> writes:
> Sorry folks, this is a red-herring, my hack didn't have anything to do
> with it.

In that case you could provide a self-contained example? I tried this
a few times in the regression database:

    select * into temp a from tenk1; drop table a;

and saw no problem in either CVS tip or 7.4.*.

regards, tom lane
I can't recreate it either, it is only happening on my customers
machines, which are using an older version of redhat (7.2) and gcc 2.96

Is it possible these versions are relevant to the issue?

Dave

On Tue, 2004-01-13 at 19:05, Tom Lane wrote:
> Dave Cramer <pg@fastcrypt.com> writes:
> > Sorry folks, this is a red-herring, my hack didn't have anything to do
> > with it.
>
> In that case you could provide a self-contained example? I tried this
> a few times in the regression database:
>
> select * into temp a from tenk1; drop table a;
>
> and saw no problem in either CVS tip or 7.4.*.
>
> regards, tom lane
--
Dave Cramer
519 939 0336
ICQ # 1467551
Dave Cramer <pg@fastcrypt.com> writes:
> I can't recreate it either, it is only happening on my customers
> machines, which are using an older version of redhat (7.2) and gcc 2.96
> Is it possible these versions are relevant to the issue?

Hmm. Compiler bug maybe? I can't recall if gcc 2.96 had a good
reputation or not. You might try backing off the optimization level
and see if the behavior changes. Also see if there are any errata
available for that compiler package.

Also, is this just one machine or several? If only one, I'd try
reindexing that index and see if that helps.

regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Hmm. Compiler bug maybe? I can't recall if gcc 2.96 had a good
> reputation or not.

2.96's reputation is of being *THE* canonical bad version of gcc. mplayer
for example has a configure check that specifically makes it difficult to
compile with 2.96 without reading large warnings because they've had too
many bug reports get tracked down to it:

    Note for gcc 2.96 users: Some versions of this compiler are known to
    miscompile mplayer and lame (which is used for mencoder). If you get
    compile errors, first upgrade to the latest 2.96 release (minimum
    2.96-85) and try again. If the problem still exists, try with gcc 3.x
    (or 2.95.x) *BEFORE* reporting bugs!

    GCC 2.96 IS NOT AND WILL NOT BE SUPPORTED BY US !

--
greg
On Tue, 2004-01-13 at 21:16, Tom Lane wrote:
> Dave Cramer <pg@fastcrypt.com> writes:
> > I can't recreate it either, it is only happening on my customers
> > machines, which are using an older version of redhat (7.2) and gcc 2.96
>
> > Is it possible these versions are relevant to the issue?
>
> Hmm. Compiler bug maybe? I can't recall if gcc 2.96 had a good
> reputation or not. You might try backing off the optimization level
> and see if the behavior changes. Also see if there are any errata
> available for that compiler package.

Thanks for the advice, I'm going to do more tests to try to isolate it

> Also, is this just one machine or several? If only one, I'd try
> reindexing that index and see if that helps.

Actually the hack checks for oids, and doesn't make the index, if there
isn't an oid in the table, so I tried it with a table without oids, and
it still occurs.

Thanks for the replies; I'll post if I find something relevant.

> regards, tom lane
--
Dave Cramer
519 939 0336
ICQ # 1467551
Dave Cramer <pg@fastcrypt.com> writes:
> Actually the hack checks for oids, and doesn't make the index, if there
> isn't an oid in the table, so I tried it with a table without oids, and
> it still occurs.

My thought was that at this point the indexes on pg_attribute are very
possibly corrupt, and so just removing whatever initially caused that
corruption won't necessarily cause the error messages to stop. You
should reindex pg_attribute to get back into a good state.

regards, tom lane
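The reindex step suggested above might look roughly like the following. This is a sketch, assuming a superuser session connected to the affected database. On 7.4-era servers, rebuilding indexes on system catalogs may need extra precautions such as a standalone backend, so check the REINDEX documentation for the version in use before running it.

```sql
-- Rebuild every index on the pg_attribute system catalog.
REINDEX TABLE pg_attribute;

-- Or rebuild only the index named in the error message.
REINDEX INDEX pg_attribute_relid_attnam_index;
```

Either form rebuilds the index from the table data. It repairs the existing damage but does not explain or remove whatever caused the corruption.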
Ok, that seems to have fixed it, so my next question is how did it get
corrupt?

Dave

On Tue, 2004-01-13 at 23:07, Tom Lane wrote:
> Dave Cramer <pg@fastcrypt.com> writes:
> > Actually the hack checks for oids, and doesn't make the index, if there
> > isn't an oid in the table, so I tried it with a table without oids, and
> > it still occurs.
>
> My thought was that at this point the indexes on pg_attribute are very
> possibly corrupt, and so just removing whatever initially caused that
> corruption won't necessarily cause the error messages to stop. You
> should reindex pg_attribute to get back into a good state.
>
> regards, tom lane
--
Dave Cramer
519 939 0336
ICQ # 1467551
Dave Cramer <pg@fastcrypt.com> writes:
> Ok, that seems to have fixed it, so my next question is how did it get
> corrupt?

The nearest culprit seems to be whatever you did in analyze.c ;-).
It's not obvious to me how analyze.c would manage to mess up an index,
since it's nowhere near the index-handling code, but I'd sure want to
examine your patch before looking further afield for an explanation.

regards, tom lane