Thread: bug at build_dummy_tuple
People,

This is a weird bug.  In a freshly initialized database, or just after
deleting the pg_internal.init relcache file,

SELECT 16854::regclass;

crashes the backend.  (Apparently any Oid not belonging to a regclass
does the trick.)  The following assertion fails:

TRAP: FailedAssertion(«!(((ntp)->t_data)->t_infomask & 0x0010)», Archivo: «/home/alvherre/CVS/pgsql/source/00orig/src/backend/utils/cache/catcache.c», Línea: 1729)

The problem is that build_dummy_tuple() is calling HeapTupleSetOid(),
which complains, apparently because it believes the pg_class relation
does not have Oids.  This is wrong, but it doesn't know.  In fact, next
time through, when it has the relcache built, all is well.

I can attest that the cache has wrong info, because gdb shows

(gdb) print *cache->cc_tupdesc
$15 = {natts = 25, attrs = 0x839d5d4, constr = 0x839cb3c, tdtypeid = 2249,
  tdtypmod = -1, tdhasoid = 0 '\0'}
(gdb) print cache->cc_relname
$16 = 0x8266f7a "pg_class"

I don't know what the fix for this should look like ...

This doesn't seem to happen on 7.4.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"The wise man speaks because he has something to say; the fool,
because he has to say something" (Plato).
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> This is a weird bug.  In a freshly initialized database, or just after
> deleting the pg_internal.init relcache file,
> SELECT 16854::regclass;
> crashes the backend.

Ah, I see it.  Looks like I was a bit too cute here:
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/cache/relcache.c.diff?r1=1.200;r2=1.201;f=h
in particular the change within formrdesc at about line 1316.

I was thinking that formrdesc needn't get the relcache's tuple
descriptor (rel->rd_att) completely right, since it would get fixed up
during RelationCacheInitializePhase2.  However, that routine uses
SearchSysCache(RELOID), which means catcache.c will have to initialize
that catalog cache on first call, and when it does so, it copies the
not-yet-fully-valid tupdesc for pg_class from the relcache into the
catcache entry.  Any subsequent code path that looks at the tdtypeid or
tdhasoid fields of the RELOID catcache's tupdesc will see wrong data.

The reason it didn't crash in 7.4 was that the 7.4 coding forces the
hasoids bit true rather than false, which is no more "correct" than CVS
tip, but it happens to be right for pg_class, which is the only case
that presently will be examined before RelationCacheInitializePhase2
fixes everything.  I saw that the code was not setting the bit correctly
and misassumed that it was therefore a don't-care :-(

The proper solution is to make sure that formrdesc can fill the tupdesc
completely correctly; that's just a matter of adding a couple more
parameters to it, since it's only used for a small set of nailed
relations.  Will fix.

			regards, tom lane
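
For readers following the thread, the ordering problem described above can be sketched in a few lines of standalone C.  This is only a toy model, not PostgreSQL source: every Toy*/toy_* name is hypothetical, the structs keep just the one field under discussion, and the extra hasoids argument merely stands in for the "couple more parameters" proposed for formrdesc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a tuple descriptor; only the fields discussed here. */
typedef struct {
    int  natts;
    bool tdhasoid;
} ToyTupleDesc;

/* Toy relcache entry; att plays the role of rel->rd_att. */
typedef struct {
    const char  *relname;
    ToyTupleDesc att;
} ToyRelation;

/* Toy catalog cache; tupdesc is a private copy, like cc_tupdesc. */
typedef struct {
    const char  *relname;
    bool         initialized;
    ToyTupleDesc tupdesc;
} ToyCatCache;

/* Bootstrap a nailed relation; the caller states whether it has OIDs. */
static ToyRelation toy_formrdesc(const char *relname, int natts, bool hasoids)
{
    ToyRelation rel = { relname, { natts, hasoids } };
    return rel;
}

/* First use of the cache copies whatever the relcache holds right now. */
static void toy_catcache_init(ToyCatCache *cache, const ToyRelation *rel)
{
    assert(cache != NULL && rel != NULL);
    if (!cache->initialized) {
        cache->relname = rel->relname;
        cache->tupdesc = rel->att;      /* copy by value, taken once */
        cache->initialized = true;
    }
}

/* Later fix-up corrects the relcache entry but not the earlier copy. */
static void toy_phase2_fixup(ToyRelation *rel, bool real_hasoids)
{
    rel->att.tdhasoid = real_hasoids;
}

/* Replays the startup order and returns the tdhasoid the cache ends up with. */
static bool toy_cached_hasoid(bool formrdesc_hasoids)
{
    ToyRelation rel = toy_formrdesc("pg_class", 25, formrdesc_hasoids);
    ToyCatCache cache = { 0 };

    toy_catcache_init(&cache, &rel);    /* happens before the fix-up */
    toy_phase2_fixup(&rel, true);       /* pg_class really has OIDs */
    return cache.tupdesc.tdhasoid;
}
```

In this model toy_cached_hasoid(false) returns false, i.e. the cache keeps the stale "no OIDs" answer even after the fix-up, which is the state the gdb output shows.  toy_cached_hasoid(true) returns true, which is why a hard-wired "true" happened to work for pg_class and why passing the right value in at bootstrap time avoids the problem.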