On Wed, Dec 19, 2007 at 11:23:35AM -0300, Alvaro Herrera wrote:
> Magnus Hagander wrote:
> > On Sat, Dec 15, 2007 at 10:31:38PM -0500, Tom Lane wrote:
> > > Gregory Stark <stark@enterprisedb.com> writes:
> > > > "Andrew Dunstan" <andrew@dunslane.net> writes:
> > > >> Interesting. Maybe forever is going a bit too far, but retrying for <n>
> > > >> seconds or so.
> > >
> > > > I think looping forever is the right thing. Having a fixed timeout just means
> > > > Postgres will break sometimes instead of all the time. And it introduces
> > > > non-deterministic behaviour too.
> > >
> > > Looping forever would be considered broken by a very large fraction of
> > > the community.
> > >
> > > IIRC we have a 30-second timeout in rename() for Windows, and that seems
> > > to be working well enough, so I'd be inclined to copy the behavior for
> > > this case.
> >
> > Here's a patch that I think implements this ;) Alvaro - do you have a build
> > env so you can test it? I can't reproduce the problem in my environment...
>
> Thanks -- forwarded to the appropriate parties. :-)
Thanks. Let us know the results :-)
> > Also, it currently just silently loops. Would it be interesting to
> > ereport(WARNING) that it's looping on the open, to let the user know
> > there's a problem? (Naturally, only warning the first time it tries it on
> > each file, so we don't spam the log too hard)
>
> Yeah, I think it would be useful to log one message if after (say) 5
> seconds we still haven't been able to open the file.
Either that, or on the first failed attempt.
> Is the sleep time correct? If I'm reading it right, it sleeps 100 ms
> each time, 30 times, that totals 3 seconds ... ?
Uh, I copied that from pgunlink() and pgrename(), but dropped a zero from
the loop count. It's supposed to loop 300 times, which at 100 ms per
iteration gives the 30 seconds.
> (Are we OK with the idea of sleeping 1 second each time?)
I think not. 0.1 seconds is better. We don't want to delay a full second if
it's just a transient thing.
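
To make the numbers concrete, here is a rough standalone sketch of the loop
as discussed above. It is not the patch itself: open_with_retry(), the two
callback types, fake_sleep() and fail_n_times() are names made up for this
illustration, and the 5-second warning point is just Alvaro's suggested
value. In the backend the sleep would presumably be pg_usleep() and the
message an ereport(WARNING) rather than fprintf().

```c
#include <stdbool.h>
#include <stdio.h>

#define RETRY_SLEEP_MS   100  /* 0.1 s between attempts */
#define RETRY_MAX_LOOPS  300  /* 300 * 100 ms = 30 s in total */
#define WARN_AFTER_LOOPS 50   /* 50 * 100 ms = 5 s (assumed threshold) */

/* Hypothetical callbacks so the loop can be exercised without real files. */
typedef bool (*try_open_fn) (void *arg);
typedef void (*sleep_ms_fn) (int ms);

/*
 * Try to open up to RETRY_MAX_LOOPS times, sleeping RETRY_SLEEP_MS after
 * each failure. Returns the number of failed attempts that preceded
 * success, or -1 if every attempt failed (about 30 s of sleeping).
 */
static int
open_with_retry(try_open_fn try_open, void *arg, sleep_ms_fn sleep_ms)
{
    for (int loops = 0; loops < RETRY_MAX_LOOPS; loops++)
    {
        if (try_open(arg))
            return loops;

        /* Warn exactly once, after roughly 5 s of failed attempts. */
        if (loops == WARN_AFTER_LOOPS)
            fprintf(stderr, "WARNING: still retrying open after %d ms\n",
                    loops * RETRY_SLEEP_MS);

        sleep_ms(RETRY_SLEEP_MS);
    }
    return -1;
}

/* Illustration-only stand-ins: count the sleep time instead of sleeping. */
static int slept_ms = 0;

static void
fake_sleep(int ms)
{
    slept_ms += ms;
}

/* Fails as many times as *arg says, then succeeds. */
static bool
fail_n_times(void *arg)
{
    int *remaining = arg;

    if (*remaining > 0)
    {
        (*remaining)--;
        return false;
    }
    return true;
}
```

With 30 loops instead of 300 the same code gives up after 3 seconds, which
is the mistake Alvaro spotted. A transient failure that clears after a few
attempts costs only a few hundred milliseconds, which is the argument for
100 ms steps over 1-second steps.
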
//Magnus