Thread: Strange error related to temporary tables
Postgres version: 8.1.3 I have a code similar to this pseudo-code: try { create temp table temp_report ...; ... do some processing using this table... drop table temp_report; commit; } catch all errors { rollback; } This code occasionally triggered the following error (couldn't reliably reproduce): SQL state: 42710 ERROR: type "temp_report" already exists The code is possible to be executed in parallel by multiple threads, on different connections, or in sequence on the same connection. I would expect it in both cases to work correctly... in manual tests I was able to create in parallel temporary tables named identically in different connections, and on the same connection after rolling back the transaction which created the first table... so I don't know what to try to trigger this. Any ideas what's the problem ? TIA, Csaba.
Csaba Nagy wrote: > The code is possible to be executed in parallel by multiple threads, on > different connections, or in sequence on the same connection. I would > expect it in both cases to work correctly... in manual tests I was able > to create in parallel temporary tables named identically in different > connections, and on the same connection after rolling back the > transaction which created the first table... so I don't know what to try > to trigger this. > > Any ideas what's the problem ? Are you sure that you're not re-using an existing connection from a pool? -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
On Thu, 2006-08-31 at 17:25, Alban Hertroys wrote: > Csaba Nagy wrote: > > The code is possible to be executed in parallel by multiple threads, on > > different connections, or in sequence on the same connection. I would > > expect it in both cases to work correctly... in manual tests I was able > > to create in parallel temporary tables named identically in different > > connections, and on the same connection after rolling back the > > transaction which created the first table... so I don't know what to try > > to trigger this. > > > > Any ideas what's the problem ? > > Are you sure that you're not re-using an existing connection from a pool? Of course I'm re-using a connection from a pool... but if you look at the pseudo-code from the OP, the temporary table is either dropped or the transaction which creates it is rolled back, before the connection is returned to the pool. When another thread is reusing the connection, the table should be dropped either way, and manual tests show that it works both ways, so I guess it is some race condition somewhere else. In any case, the error message is strange in itself, as if I try to create the temporary table when it exists, the error I get in manual trial is: ERROR: relation "test_temp_table" already exists ^^^^^^^^ compared to the error from the OP which I cite here for reference: ERROR: type "temp_report" already exists ^^^^ So it was complaining about an already existing type, not relation. Any ideas ? I'm still clueless. Cheers, Csaba.
Csaba Nagy wrote: > In any case, the error message is strange in itself, as if I try to > create the temporary table when it exists, the error I get in manual > trial is: > > ERROR: relation "test_temp_table" already exists > ^^^^^^^^ > > compared to the error from the OP which I cite here for reference: > > ERROR: type "temp_report" already exists > ^^^^ That's strange indeed. Some googling revealed this thread: http://archives.postgresql.org/pgsql-general/2005-09/msg01100.php Seems like around this time last year someone had the same problem. Unfortunately, the cause and it's possible solutions aren't mentioned (the OP found a workaround though). They may have never been found... I suppose if you'd check pg_type there is a record containing 'temp_report'? -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
Csaba Nagy <nagy@ecircle-ag.com> writes: > In any case, the error message is strange in itself, as if I try to > create the temporary table when it exists, the error I get in manual > trial is: > ERROR: relation "test_temp_table" already exists > ^^^^^^^^ > compared to the error from the OP which I cite here for reference: > ERROR: type "temp_report" already exists > ^^^^ This is not too surprising given the way that heap_create_with_catalog works --- it happens to be easier to insert the pg_type row before the pg_class row, so if you have two sessions trying to create the same table at about the same time, that's where the unique index constraint will kick in. The initial check for a duplicate pg_class row doesn't catch the conflict because the guy who's just slightly ahead won't have committed yet when the second guy looks. The question is how is this scenario managing to occur, given that the tables in question are temp tables? It seems like this must indicate two backends trying to use the same pg_temp_NNN schema; but that should surely be impossible --- it's driven off MyBackendId, and if that's not unique to a live session then we've got *major* problems. IIRC we've seen prior reports of similar issues, so I believe there is something there, but without a test case it's gonna be hard to track down. regards, tom lane
Csaba Nagy <nagy@ecircle-ag.com> writes: > create temp table temp_report ...; > ... do some processing using this table... > drop table temp_report; > commit; > This code occasionally triggered the following error (couldn't reliably > reproduce): > ERROR: type "temp_report" already exists BTW, when this happens, does the error persist? If it's a race condition you'd expect not (because the guy who successfully created the temp table would soon drop it again). When we saw this problem before, we were speculating that a temp table's pg_type row had somehow not gotten dropped during table drop, which'd lead to a persistent failure. Note that a "persistent" failure could still only manifest occasionally, if the unwanted pg_type row were in a high-numbered pg_temp_NNN schema that doesn't get used often. So i guess the correct thing to do is "select oid, xmin from pg_type where typname = 'temp_report'" and see if there are any long-lived entries (xmin far away from the others would be a tipoff). regards, tom lane
I executed what you suggest below, see the results below. Short conclusion: the type is there in pg_type, the relation is not there in pg_class. Is there anything I should look for more ? Cheers, Csaba. > BTW, when this happens, does the error persist? If it's a race > condition you'd expect not (because the guy who successfully created the > temp table would soon drop it again). When we saw this problem before, > we were speculating that a temp table's pg_type row had somehow not > gotten dropped during table drop, which'd lead to a persistent failure. > > Note that a "persistent" failure could still only manifest occasionally, > if the unwanted pg_type row were in a high-numbered pg_temp_NNN schema > that doesn't get used often. So i guess the correct thing to do is > "select oid, xmin from pg_type where typname = 'temp_report'" and see > if there are any long-lived entries (xmin far away from the others would > be a tipoff). test03=> select oid, xmin from pg_type where typname = 'temp_report'; oid | xmin ----------+----------- 58293995 | 220215039 (1 row) test03=> select * from pg_type where typname = 'temp_report'; typname | typnamespace | typowner | typlen | typbyval | typtype | typisdefined | typdelim | typrelid | typelem | typinput | typoutput | typreceive | typsend | typanalyze | typalign | typstorage | typnotnull | typbasetype | typtypmod | typndims | typdefaultbin | typdefault -------------+--------------+----------+--------+----------+---------+--------------+----------+----------+---------+-----------+------------+-------------+-------------+------------+----------+------------+------------+-------------+-----------+----------+---------------+------------ temp_report | 58006858 | 16386 | -1 | f | c | t | , | 58293994 | 0 | record_in | record_out | record_recv | record_send | - | d | x | f | 0 | -1 | 0 | | (1 row) test03=> select * from pg_namespace where oid=58006858; nspname | nspowner | nspacl -----------+----------+-------- pg_temp_8 | 10 | (1 row) test03=> select count(*) from pg_class where relname='temp_report'; count ------- 0 (1 row)
On Fri, 2006-09-01 at 11:25, Csaba Nagy wrote: > I executed what you suggest below [snip] Ok... it looks like the postgres version is not what I expected and reported in the first mail, but: test03=> select version(); version -------------------------------------------------------------------------------------------------------------- PostgreSQL 8.1.0 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.0.3 20051023 (prerelease) (Debian 4.0.2-3) I'm not sure how that got on that box... but I wonder if the version is relevant ? Cheers, Csaba.
Csaba Nagy <nagy@ecircle-ag.com> writes: > I executed what you suggest below, see the results below. Short > conclusion: the type is there in pg_type, the relation is not there in > pg_class. Is there anything I should look for more ? > test03=> select oid, xmin from pg_type where typname = 'temp_report'; > oid | xmin > ----------+----------- > 58293995 | 220215039 > (1 row) Is this xmin quite a bit older than what you get for a freshly-created temp table? Can you find any entries in pg_depend that show the above OID as either objid or refobjid? Also, it's fairly likely (not certain) that the table associated with this pg_type entry had OID one less, ie 58293994. Can you find any trace of that OID in pg_depend, or in pg_class for that matter? regards, tom lane
> Is this xmin quite a bit older than what you get for a freshly-created > temp table? I would say yes, this is a test system which is highly stressed from time to time, but does not get continuous load. test03=> select oid, xmin from pg_type where typname = 'temp_report'; oid | xmin ----------+----------- 58293995 | 220215039 (1 row) test03=> create temp table test_new_temp_table (a text); CREATE TABLE test03=> select xmin from pg_type where typname = 'test_new_temp_table'; xmin ----------- 236080536 (1 row) > Can you find any entries in pg_depend that show the above > OID as either objid or refobjid? No: test03=> select * from pg_depend where objid = 58293995 or refobjid = 58293995; classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype ---------+-------+----------+------------+----------+-------------+--------- (0 rows) > Also, it's fairly likely (not certain) that the table associated with > this pg_type entry had OID one less, ie 58293994. Can you find any > trace of that OID in pg_depend, No trace. I executed: test03=> select count(*) from pg_depend where classid between 58293995 - 100 and 58293995 + 100; count ------- 0 (1 row) ... and then with each of (objid objsubid refclassid refobjid refobjsubid deptype) instead of classid, with the same result. > ... or in pg_class for that matter? test03=> select count(*) from pg_class where oid between 58293995 - 100 and 58293995 + 100; count ------- 0 (1 row) Any other place to check ? Cheers, Csaba.
Csaba Nagy <nagy@ecircle-ag.com> writes: >> Is this xmin quite a bit older than what you get for a freshly-created >> temp table? > I would say yes, this is a test system which is highly stressed from > time to time, but does not get continuous load. OK, so it seems we have an old pg_type entry that for some reason was not dropped when the owning table was dropped. Weird. As you know, we've seen one report of this before. There's probably not much more we can learn at this point --- given that the entry is old, any other evidence is probably long gone. I'd suggest just doing a manual DELETE of that pg_type row to get out of the problem. And yes, you ought to update that 8.1.0 installation ;-). But I have no idea if this problem is related to any solved bugs. regards, tom lane
> There's probably not much more we can learn at this point --- given > that the entry is old, any other evidence is probably long gone. I'm afraid I'll never be able to get soon enough to the evidence even if it happens again, unless I manage to reproduce it myself, which I tried but didn't succeed (100 threads competing for ~10 connections creating/dropping the table 1000 times, with ~10 percent of the cases rolling back before the drop). > I'd suggest just doing a manual DELETE of that pg_type row to get out > of the problem. OK. This was not a huge problem, it's a test system, but enough ground for the QA to scream ;-) > And yes, you ought to update that 8.1.0 installation ;-). But I have > no idea if this problem is related to any solved bugs. The QA installed the right version now. Thanks for the assistance. Cheers, Csaba.