Re: EINTR in ftruncate() - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: EINTR in ftruncate() |
Date | |
Msg-id | 20220711174518.yldckniicknsxgzl@awork3.anarazel.de Whole thread Raw |
In response to | Re: EINTR in ftruncate() (Thomas Munro <thomas.munro@gmail.com>) |
Responses |
Re: EINTR in ftruncate()
|
List | pgsql-hackers |
Hi, On 2022-07-07 17:58:10 +1200, Thomas Munro wrote: > Even if we go with this approach now, I think it's plausible that we > might want to reconsider this yet again one day, perhaps allocating > memory with some future asynchronous infrastructure while still > processing interrupts. It's not very nice to hold up recovery or > ProcSignalBarrier for long operations. I think the next improvement would be to do the fallocate in smaller chunks, and accept interrupts inbetween. > I'm a little unclear about ftruncate() here. I don't expect it to > report EINTR in other places where we use it (ie to make a local file > on a non-"slow device" smaller), because I expect that to be like > read(), write() etc which we don't wrap in EINTR loops. Here you've > observed EINTR when messing around with a debugger*. It seems > inconsistent to put posix_fallocate() in an EINTR retry loop for the > benefit of debuggers, but not ftruncate(). But perhaps that's good > enough, on the theory that posix_fallocate(1GB) is a very large target > and you have a decent chance of hitting it. > *It's funny that ftruncate() apparently doesn't automatically restart > for ptrace SIGCONT on Linux according to your report, while poll() > does according to my experiments, even though the latter is one of the > never-restart functions (it doesn't on other OSes I hack on, and you > feel the difference when debugging missing wakeup type bugs...). > Random implementation details... My test was basically while (true); {if (!ftruncate()) bleat(); if (!fallocate()) bleat();} with a SIGALRM triggering regularly in the background. The ftruncate failed, rarely, when attaching / detaching strace -p. Rarely enough that I had already written you an IM saying that I couldn't make it fail... So it's hard to be confident this can't otherwise be hit. With that caveat: I didn't hit it with a "real" file on a "real" filesystem in a few minutes of trying. But unsurprisingly it triggers when putting the file on a tmpfs. > @@ -362,6 +355,14 @@ dsm_impl_posix_resize(int fd, off_t size) > { > int rc; > > + /* > + * Block all blockable signals, except SIGQUIT. posix_fallocate() can run > + * for quite a long time, and is an all-or-nothing operation. If we > + * allowed SIGUSR1 to interrupt us repeatedly (for example, due to recovery > + * conflicts), the retry loop might never succeed. > + */ > + PG_SETMASK(&BlockSig); > + > /* Truncate (or extend) the file to the requested size. */ > rc = ftruncate(fd, size); > Hm - given that we've observed ftruncate failing with strace, and that stracing to find problems isn't insane, shouldn't we retry the ftruncate too? It's kind of obsoleted by your next patch, but not really, because it's not unconceivable that other OSs behave similarly? And IIUC you're planning to not backpatch 0002? > + pgstat_report_wait_start(WAIT_EVENT_DSM_FILL_ZERO_WRITE); (not new in this patch, just moved around) - FILL_ZERO_WRITE is imo an odd description of what's happening... In my understanding this isn't doing any writing, just allocating. But whatever... Greetings, Andres Freund
pgsql-hackers by date: