>> As for fixing the problem we do understand: ISTM it's just an
>> awful idea for pgrename and pgunlink to be willing to loop
>> forever. I think they should time out and report the failure
>> after some reasonable period (say between 10 sec and a minute).
Is the main problem really in the rename/delete function? While I'm in no
position to actually know what's going on under the hood, my observations
in 10+ cases during this afternoon/evening revealed a pattern:
it is definitely the writer process that blocks the db. But in every case
the writer process seems to fail to rename the file because another
postgresql process is still holding a file handle to the very xlog file that
should be renamed. ProcessExplorer lets you force a close of the file handle -
as soon as you do this [which is a bad thing to do, I assume], the rename
succeeds and processing continues normally.
I can actually reproduce the error at will now - I just need to pump enough
data into the db (~200 MB seems sufficient) to make it lock up.
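
For reference, here is a rough sketch of what a bounded retry, as suggested in the quoted text, could look like. This is not PostgreSQL's actual pgrename code: the name rename_with_timeout, the 100 ms retry interval, and the choice to retry only on EACCES/EBUSY are my own assumptions, and it uses POSIX nanosleep rather than the Windows API. Note that it would only turn the hang into a reported error; it does not address whichever process is still holding the file handle.

```c
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() */

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper (not PostgreSQL's pgrename): retry rename() while it
 * fails with a "file is in use" style error, but give up after timeout_sec
 * seconds instead of looping forever.
 *
 * Returns 0 on success, -1 on failure with errno set. On timeout errno is
 * ETIMEDOUT. time() has one-second resolution, so the timeout is approximate.
 */
int
rename_with_timeout(const char *from, const char *to, double timeout_sec)
{
    time_t          start = time(NULL);
    struct timespec delay = {0, 100 * 1000 * 1000};    /* assumed: 100 ms */

    assert(from != NULL && to != NULL);

    for (;;)
    {
        if (rename(from, to) == 0)
            return 0;

        /* Assumption: only these errors are worth retrying. */
        if (errno != EACCES && errno != EBUSY)
            return -1;

        if (difftime(time(NULL), start) >= timeout_sec)
        {
            errno = ETIMEDOUT;
            return -1;
        }

        nanosleep(&delay, NULL);
    }
}
```

A caller would pass something like rename_with_timeout(oldpath, newpath, 30.0) and report the failure when it returns -1, which matches the "between 10 sec and a minute" range mentioned above.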
- thomas