ohp@pyrenet.fr wrote:
> On Tue, 2 Dec 2008, Heikki Linnakangas wrote:
>
>> Date: Tue, 02 Dec 2008 20:47:19 +0200
>> From: Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>
>> To: ohp@pyrenet.fr
>> Cc: Zdenek Kotala <Zdenek.Kotala@Sun.COM>,
>> pgsql-hackers list <pgsql-hackers@postgresql.org>
>> Subject: Re: [HACKERS] cvs head initdb hangs on unixware
>>
>> ohp@pyrenet.fr wrote:
>>> Stack trace for p1, program postmaster
>>> *[0] fsm_rebuild_page( presumed: 0xbd9731a0, 0, 0xbd9731a0) [0x81e6a97]
>>> [1] fsm_search_avail( presumed: 0x2, 0x6, 0x1) [0x81e68d9]
>>> [2] fsm_set_and_search(0x84b2250, 0, 0, 0x2e, 0x5, 0x6, 0x2e,
>>> 0x8047416, 0xb4) [0x81e6385]
>>> [3] RecordAndGetPageWithFreeSpace(0x84b2250, 0x2e, 0xa0, 0xb4)
>>> [0x81e5a00]
>>> [4] RelationGetBufferForTuple( presumed: 0x84b2250, 0xb4, 0) [0x8099b59]
>>> [5] heap_insert(0x84b2250, 0x853a338, 0, 0, 0) [0x8097042]
>>> [6] simple_heap_insert( presumed: 0x84b2250, 0x853a338, 0x853a310)
>>> [0x8097297]
>>> [7] InsertOneTuple( presumed: 0xb80, 0x84057b0, 0x8452fb8) [0x80cb210]
>>> [8] boot_yyparse( presumed: 0xffffffff, 0x3, 0x8047ab8) [0x80c822b]
>>> [9] BootstrapModeMain( presumed: 0x66, 0x8454600, 0x4) [0x80ca233]
>>> [10] AuxiliaryProcessMain(0x4, 0x8047ab4) [0x80cab3b]
>>> [11] main(0x4, 0x8047ab4, 0x8047ac8) [0x8177dce]
>>> [12] _start() [0x807ff96]
>>>
>>> seems interesting!
>>>
>>> We've had problems already with unixware optimizer, hope this one is
>>> fixable!
>>
>> Looking at fsm_rebuild_page, I wonder if the compiler is treating
>> "int" as an unsigned integer? That would cause an infinite loop.
>>
> No, a simple printf of nodeno shows it starting at 4096 all the way
> down to 0, starting back at 4096...

Hmm, it's probably looping in fsm_search_avail then. In a fresh cluster,
there shouldn't be any broken FSM pages that need rebuilding.

I'd like to see what the FSM page in question looks like. Could you try
to run initdb with the "-d -n" options? I bet you'll get an infinite number
of lines like:

DEBUG: fixing corrupt FSM block 1, relation 123/456/789
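
To make the suspected loop concrete, here is a deliberately simplified
model of a search that rebuilds the page and restarts whenever it finds
the tree inconsistent. It is a sketch under stated assumptions, not the
PostgreSQL source: the 4-leaf page, the model_* names, the pluggable
rebuild callback and the restart cap are all invented for illustration.
Each restart corresponds to one "fixing corrupt FSM block" line. If the
rebuild repairs the page, the search restarts once and finishes. If the
page is still inconsistent after the rebuild, for whatever reason, the
search would restart forever without the cap.

```c
#define LEAVES   4                  /* hypothetical tiny page: 4 leaf slots */
#define NONLEAF  (LEAVES - 1)       /* inner nodes of a complete binary tree */

typedef void (*rebuild_fn) (unsigned char *nodes);

/* Recompute every inner node as the max of its two children, bottom-up. */
void model_rebuild(unsigned char *nodes)
{
    int nodeno;

    for (nodeno = NONLEAF - 1; nodeno >= 0; nodeno--)
    {
        unsigned char l = nodes[2 * nodeno + 1];
        unsigned char r = nodes[2 * nodeno + 2];

        nodes[nodeno] = (l > r) ? l : r;
    }
}

/* A rebuild that repairs nothing, to model a page that stays broken. */
void model_rebuild_noop(unsigned char *nodes)
{
    (void) nodes;
}

/* Return a leaf slot whose value is >= minvalue, -1 if the root says
 * there is none, or -2 if the restart cap is reached. *restarts counts
 * how often the page was "fixed" and the search started over. */
int model_search(unsigned char *nodes, unsigned char minvalue,
                 rebuild_fn rebuild, int max_restarts, int *restarts)
{
    *restarts = 0;
    for (;;)
    {
        int nodeno = 0;
        int corrupt = 0;

        if (nodes[0] < minvalue)
            return -1;
        while (nodeno < NONLEAF)
        {
            int left = 2 * nodeno + 1;

            if (nodes[left] >= minvalue)
                nodeno = left;
            else if (nodes[left + 1] >= minvalue)
                nodeno = left + 1;
            else
            {
                corrupt = 1;    /* parent promised more than its children have */
                break;
            }
        }
        if (!corrupt)
            return nodeno - NONLEAF;
        if (*restarts >= max_restarts)
            return -2;
        rebuild(nodes);         /* "fixing corrupt FSM block ..." */
        (*restarts)++;
    }
}
```

With a root that claims 9 over leaves {1, 5, 0, 3}, a search for 7 using
model_rebuild restarts once and then reports that nothing fits. The same
search using model_rebuild_noop keeps restarting until the cap stops it.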

Could you zip up the FSM file of that relation (a file called e.g.
"789_fsm") and send it over? Or the whole data directory; it shouldn't
be that big.
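
If it helps to pick the right file out of the debug output, the name can
be derived mechanically from the message. The helper below is a small
sketch with a made-up name. It assumes the message has exactly the shape
shown above and that the last of the three numbers is the one used in
the file name, as the "789_fsm" example suggests. It does not try to
work out which directory the file lives in.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the last number from a line containing
 * "relation A/B/C" and write "C_fsm" into out.
 * Returns 0 on success, -1 if the line does not match or out is too small. */
int fsm_file_from_debug_line(const char *line, char *out, size_t outlen)
{
    const char  *p = strstr(line, "relation ");
    unsigned int first, second, rel;
    int          n;

    if (p == NULL)
        return -1;
    if (sscanf(p, "relation %u/%u/%u", &first, &second, &rel) != 3)
        return -1;

    n = snprintf(out, outlen, "%u_fsm", rel);
    if (n < 0 || (size_t) n >= outlen)
        return -1;
    return 0;
}
```

Given the example line above, the helper writes "789_fsm" into the
buffer. A line without a "relation" part is rejected.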

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com