Thread: [HACKERS] Reversed sync check in pg_receivewal
This bug seems to have snuck in there with the introduction of walmethods. AFAICT we are testing the result of sync() backwards, so whenever a partial segment exists for pg_receivewal, it will fail. It will then unlink the file, so when it retries 5 seconds later it works.
--
It also doesn't log the failure. Oops.
Attached patch reverses the check, and adds a failure message. I'd appreciate a quick review in case I have the logic backwards in my head...
Attachment
Magnus Hagander <magnus@hagander.net> writes: > Attached patch reverses the check, and adds a failure message. I'd > appreciate a quick review in case I have the logic backwards in my head... I think the patch is correct, but if there's any documentation of the walmethod APIs that would allow one to assert which side of the API got this wrong, I sure don't see it. Would it be unreasonable to insist on some documentation around that? regards, tom lane
On Tue, Apr 11, 2017 at 9:41 PM, Magnus Hagander <magnus@hagander.net> wrote: > This bug seems to have snuck in there with the introduction of walmethods. > AFAICT we are testing the result of sync() backwards, so whenever a partial > segment exists for pg_receivewal, it will fail. It will then unlink the > file, so when it retries 5 seconds later it works. > > It also doesn't log the failure. Oops. > > Attached patch reverses the check, and adds a failure message. I'd > appreciate a quick review in case I have the logic backwards in my head... This has been fat-fingered in 56c7d8d4, and looking around I am not seeing similar mistakes. Thanks for the fix. -- Michael
Hi Magnus, > Attached patch reverses the check, and adds a failure message. I'd > appreciate a quick review in case I have the logic backwards in my head... Well, I can state that `make check-world` passes on my laptop and that code seems to be right. However documentation to WalWriteMethod could probably be improved. Currently there is none. -- Best regards, Aleksander Alekseev
On Tue, Apr 11, 2017 at 3:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Would you say comments in the struct in walmethods.h is enough, or were you thinking actual sgml docs when you commented that?
Magnus Hagander <magnus@hagander.net> writes:
> Attached patch reverses the check, and adds a failure message. I'd
> appreciate a quick review in case I have the logic backwards in my head...
I think the patch is correct, but if there's any documentation of the
walmethod APIs that would allow one to assert which side of the API got
this wrong, I sure don't see it. Would it be unreasonable to insist
on some documentation around that?
Agreed.
Would you say comments in the struct in walmethods.h is enough, or were you thinking actual sgml docs when you commented that?
Magnus Hagander <magnus@hagander.net> writes: > On Tue, Apr 11, 2017 at 3:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think the patch is correct, but if there's any documentation of the >> walmethod APIs that would allow one to assert which side of the API got >> this wrong, I sure don't see it. Would it be unreasonable to insist >> on some documentation around that? > Would you say comments in the struct in walmethods.h is enough, or were you > thinking actual sgml docs when you commented that? This is just internal to pg_basebackup isn't it? I think comments in walmethods.h would be plenty. regards, tom lane
On Tue, Apr 11, 2017 at 3:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Magnus Hagander <magnus@hagander.net> writes:
> On Tue, Apr 11, 2017 at 3:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think the patch is correct, but if there's any documentation of the
>> walmethod APIs that would allow one to assert which side of the API got
>> this wrong, I sure don't see it. Would it be unreasonable to insist
>> on some documentation around that?
> Would you say comments in the struct in walmethods.h is enough, or were you
> thinking actual sgml docs when you commented that?
This is just internal to pg_basebackup isn't it? I think comments in
walmethods.h would be plenty.
Local to pg_basebackup and pg_receivewal, yes.
Something like the attached?
Attachment
Magnus Hagander <magnus@hagander.net> writes: > Something like the attached? Not sure about + * All methods that have a failure path will set errno on failure. Given that you've got a getlasterror method, I don't think that's really the API invariant is it? If it were, you'd just have the callers doing strerror(errno) for themselves. I think maybe a more accurate statement would be * All methods that have a failure return indicator will set state* allowing the getlasterror() method to return a suitablemessage.* Commonly, errno is this state (or part of it); so callers must take* care not to clobber errno betweena failed method call and use of* getlasterror() to retrieve the message. Also: * the arguments of open_for_write() could stand to be explained * what's the return value of finish()? * wouldn't it be better to declare getlasterror() as returning "const char *"? regards, tom lane
On Tue, Apr 11, 2017 at 6:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
-- Magnus Hagander <magnus@hagander.net> writes:
> Something like the attached?
Not sure about
+ * All methods that have a failure path will set errno on failure.
Given that you've got a getlasterror method, I don't think that's really
the API invariant is it? If it were, you'd just have the callers doing
strerror(errno) for themselves. I think maybe a more accurate statement
would be
Hmm. You're right, this is what I get for rushing a patch before plane departure :/
* All methods that have a failure return indicator will set state
* allowing the getlasterror() method to return a suitable message.
* Commonly, errno is this state (or part of it); so callers must take
* care not to clobber errno between a failed method call and use of
* getlasterror() to retrieve the message.
Yeah, that's much better.
Also:
* the arguments of open_for_write() could stand to be explained
* what's the return value of finish()?
Both fixed.
* wouldn't it be better to declare getlasterror() as returning
"const char *"?
Yeah, probably is. Will do and push.
Thanks!