Re: pgstat: remove delayed destroy / pipe: socket error fix - Mailing list pgsql-patches

From Magnus Hagander
Subject Re: pgstat: remove delayed destroy / pipe: socket error fix
Date
Msg-id 6BCB9D8A16AC4241919521715F4D8BCEA0F8DF@algol.sollentuna.se
Whole thread Raw
In response to pgstat: remove delayed destroy / pipe: socket error fix  ("Peter Brant" <Peter.Brant@wicourts.gov>)
Responses Re: pgstat: remove delayed destroy / pipe: socket
Re: pgstat: remove delayed destroy / pipe: socket
List pgsql-patches
> Hi all,
>
> Attached are two patches which in combination make
> pg_stat_activity work reliably for us on Windows.
>
> The mysterious socket error turned out to be WSAEWOULDBLOCK.
> Per
> http://msdn.microsoft.com/library/default.asp?url=/library/en-
> us/winsock/winsock/windows_sockets_error_codes_2.asp
> , it seems the thing to do is loop and try again.  pipe.patch
> does that.

Good catch. I always knew there was something afoot in that code, but
never thought it was that simple ;-) I kinda got snowed in on the idea
that it had to do with inheritance :-(

Also, might we want a CHECK_FOR_INTERRUPTS in the loop? Since it's a
potential stuck-forever place.

Oh, and checking the code go pgwin32_recv, I think I see where this is
coming from: pgwin32_recv calls pgwin32_waitforsinglesocket(). If that
one succeeds *and a signal is delivered while it's waiting*, we'll get
out og pgwin32_waitforsinglesocket() without clearing the WSA code. The
rest of the pg code uses errno which is properly set to EINTR, but the
pipe code uses WSAGetLastError() directly.

The fix for that is probably to add a WSASetLastError(WSAEINTR)  right
after the errno=EINTR in pgwin32_waitforsinglesocket().

Does this seem right? Can you try by adding this, and then backing out
your pgstats patch, and see if this alone is enough?


BTW, what's with the change to all the error msgs?

And finally, the error handling looks a bit off? We specifically *don't*
want it to log an error for the WSAECONNRESET state - it's a normal
state. Or am I reading the patch wrong?


> The one remaining problem is that there seems to be a race
> condition when installing the temporary stats file on
> Windows.  As we were monitoring pg_stat_activity during the
> test run, occasionally we'd get a response with zero rows.
> This may not be much of a problem during normal conditions
> (the server was completely overloaded and we were banging
> away with "Up Arrow", "Enter" watching pg_stat_activity).
>
> What's the best way to do an atomic rename on Windows?
> Alternatively, would it make sense to sleep and try again (up
> to some limit) when trying to open the stats file on Windows?

rename() should be atomic. We do a special version on win32 (see
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/port/dirmod.c?rev=1
.42), but I don't see how that would change the atomicity of it.

Do you get anything at all in the logs when this happens? Are you sure
the reason is that it picks up an empty file, or could it be something
else?

//Magnus

pgsql-patches by date:

Previous
From: "Peter Brant"
Date:
Subject: pgstat: remove delayed destroy / pipe: socket error fix
Next
From: "Magnus Hagander"
Date:
Subject: Re: pgstat: remove delayed destroy / pipe: socket error fix