Thread: Streaming Replication on win32
I'm trying to figure out why streaming replication doesn't work on win32. Here is what I have so far: It starts up fine, and outputs: LOG: starting archive recovery LOG: standby_mode = 'on' LOG: primary_conninfo = 'host=localhost port=5432' LOG: starting streaming recovery at 0/2000000 After this, *nothing* happens, and it never reaches a consistent state or anything. Looking at stacktraces, I notice two things: walreceiver process is in: ntdll!ZwWaitForSingleObject+0xa mswsock+0x4f65 WS2_32!select+0x105 LIBPQ!pqSocketPoll(int sock = 4936, int forRead = 1, int forWrite = 0, int64 end_time = -1)+0x2bb LIBPQ!pqSocketCheck(struct pg_conn * conn = 0x00000000`00830160, int forRead = 1, int forWrite = 0, int64 end_time = -1)+0xa1 LIBPQ!pqWaitTimed(int forRead = 1, int forWrite = 0, struct pg_conn * conn = 0x00000000`00830160, int64 finish_time = -1)+0x2e LIBPQ!pqWait(int forRead = 1, int forWrite = 0, struct pg_conn * conn = 0x00000000`00830160)+0x2a LIBPQ!PQgetResult(struct pg_conn * conn = 0x00000000`00830160)+0x82 LIBPQ!PQexecFinish(struct pg_conn * conn = 0x00000000`00830160)+0x1c LIBPQ!PQexec(struct pg_conn * conn = 0x00000000`00830160, char * query = 0x00000000`0042f600 "START_REPLICATION 0/2000000")+0x44 walreceiver!WalRcvConnect(void)+0x457 walreceiver!WalReceiverMain(struct FunctionCallInfoData * fcinfo = 0x00000000`00000000)+0x20e postgres!AuxiliaryProcessMain(int argc = 2, char ** argv = 0x00000000`0081f080)+0x600 postgres!SubPostmasterMain(int argc = 4, char ** argv = 0x00000000`0081f070)+0x2d7 postgres!main(int argc = 4, char ** argv = 0x00000000`0081f070)+0x1e4 postgres!__tmainCRTStartup(void)+0x192 postgres!mainCRTStartup(void)+0xe kernel32!BaseProcessStart+0x2c Which shows one potentially big problem - since we're calling select() from inside libpq, it's not calling our "signal emulation layer compatible select()". This means that at this point, walreceiver is not interruptible. Which also shows itself if I shut down the system - the walreceiver stays around, and won't terminate properly. Do we need to invent a way for libpq to call back into backend code to do this select? We certainly can't have libipq use our version directly - since that would break all non-postmaster/postgres processes. The second thing I note is that the walsender is in: ntdll!ZwWaitForMultipleObjects+0xa kernel32!ReleaseSemaphore+0x6b postgres!pgwin32_waitforsinglesocket(unsigned int64 s = 0x13fc, int what = 41, int timeout = -1)+0x275 postgres!pgwin32_recv(unsigned int64 s = 0x13fc, char * buf = 0x00000000`0042f990 "???", int len = 1, int f = 0)+0xf5 postgres!secure_read(struct Port * port = 0x00000000`0042fcf0, void * ptr = 0x00000000`0042f990, unsigned int64 len = 1)+0x32 postgres!pq_getbyte_if_available(unsigned char * c = 0x00000000`0042f990 "???")+0x106 postgres!CheckClosedConnection(void)+0x10 postgres!WalSndLoop(void)+0xdf postgres!WalSenderMain(void)+0xb9 postgres!PostgresMain(int argc = 2, char ** argv = 0x00000000`0084d520, char * username = 0x00000000`0082e218 "Administrator")+0x3b5 postgres!BackendRun(struct Port * port = 0x00000000`0042fcf0)+0x235 postgres!SubPostmasterMain(int argc = 3, char ** argv = 0x00000000`0081f080)+0x278 postgres!main(int argc = 3, char ** argv = 0x00000000`0081f080)+0x1e4 postgres!__tmainCRTStartup(void)+0x192 postgres!mainCRTStartup(void)+0xe kernel32!BaseProcessStart+0x2c From what I can tell, this indicates that pq_getbyte_if_available() is not working - because it's supposed to never block, right? This could be because the win32 socket emulation layer simply wasn't designed to deal with non-blocking sockets. Specifically, it actually *always* sets the socket to non-blocking mode, and then uses that to properly emulate how sockets work under unix. Oh, and the walsender process says: \Sessions\1\BaseNamedObjects\pgident(2196): postgres: wal sender process Administrator 127.0.0.1(1398) startup the walreceiver says: \Sessions\1\BaseNamedObjects\pgident(2264): postgres: wal receiver process and the startup process says: \Sessions\1\BaseNamedObjects\pgident(2764): postgres: startup processwaiting for 000000010000000000000002 -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
Magnus Hagander wrote: > Which shows one potentially big problem - since we're calling select() > from inside libpq, it's not calling our "signal emulation layer > compatible select()". This means that at this point, walreceiver is > not interruptible. Which also shows itself if I shut down the system - > the walreceiver stays around, and won't terminate properly. Do we need > to invent a way for libpq to call back into backend code to do this > select? We certainly can't have libipq use our version directly - > since that would break all non-postmaster/postgres processes. Hmm, contrib/dblink probably has the same issue then. We could replace the blocking PQexec() calls with PQsendQuery(), and usethe emulated version of select() to wait. > From what I can tell, this indicates that pq_getbyte_if_available() is > not working - because it's supposed to never block, right? Right, it's not supposed to block. > This could be because the win32 socket emulation layer simply wasn't > designed to deal with non-blocking sockets. Specifically, it actually > *always* sets the socket to non-blocking mode, and then uses that to > properly emulate how sockets work under unix. I presume the win32 emulation layer can be taught about non-blocking sockets? Or maybe pq_getbyte_if_available() can be implemented using some other simpler method on Windows. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Magnus Hagander <magnus@hagander.net> writes: > Which shows one potentially big problem - since we're calling select() > from inside libpq, it's not calling our "signal emulation layer > compatible select()". This means that at this point, walreceiver is > not interruptible. Ugh. > Which also shows itself if I shut down the system - > the walreceiver stays around, and won't terminate properly. Do we need > to invent a way for libpq to call back into backend code to do this > select? We certainly can't have libipq use our version directly - > since that would break all non-postmaster/postgres processes. I think that on some platforms, it's possible for the call to select() from a shlib such as libpq to be captured by a select() provided by the executable it's loaded into. Dunno about the linking rules on Windows, but is there any chance of a workaround that way? regards, tom lane
On Mon, Jan 18, 2010 at 5:22 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: >> This could be because the win32 socket emulation layer simply wasn't >> designed to deal with non-blocking sockets. Specifically, it actually >> *always* sets the socket to non-blocking mode, and then uses that to >> properly emulate how sockets work under unix. > > I presume the win32 emulation layer can be taught about non-blocking > sockets? Or maybe pq_getbyte_if_available() can be implemented using > some other simpler method on Windows. How about checking the socket by using select/poll before calling pq_getbyte_if_available()? This would prevent pgwin32_recv() from being blocked because a message is guaranteed to have already arrived. When the renegotiation happens, SSL_read (instead of pqwin32_recv()) is called with non-blocking socket, so it's not blocked. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Mon, Jan 18, 2010 at 10:30, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Jan 18, 2010 at 5:22 AM, Heikki Linnakangas > <heikki.linnakangas@enterprisedb.com> wrote: >>> This could be because the win32 socket emulation layer simply wasn't >>> designed to deal with non-blocking sockets. Specifically, it actually >>> *always* sets the socket to non-blocking mode, and then uses that to >>> properly emulate how sockets work under unix. >> >> I presume the win32 emulation layer can be taught about non-blocking >> sockets? Or maybe pq_getbyte_if_available() can be implemented using >> some other simpler method on Windows. > > How about checking the socket by using select/poll before calling > pq_getbyte_if_available()? This would prevent pgwin32_recv() from > being blocked because a message is guaranteed to have already arrived. > When the renegotiation happens, SSL_read (instead of pqwin32_recv()) > is called with non-blocking socket, so it's not blocked. SSL_read calls into pqwin32_recv(), so you have the same problem. (see my_sock_read() and my_sock_write() in be-secure.c) -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
On Mon, Jan 18, 2010 at 6:40 PM, Magnus Hagander <magnus@hagander.net> wrote: > SSL_read calls into pqwin32_recv(), so you have the same problem. (see > my_sock_read() and my_sock_write() in be-secure.c) Oh, I confirmed that. Thanks! Can we prevent SSL_read from being blocked in the renegotiation case if we use poll/select in my_sock_read() even if the socket is blocking mode? If Yes, ISTM that we can work around the problem by using the special BIO function which checks the socket, as BIO. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
2010/1/18 Tom Lane <tgl@sss.pgh.pa.us>: > Magnus Hagander <magnus@hagander.net> writes: >> Which shows one potentially big problem - since we're calling select() >> from inside libpq, it's not calling our "signal emulation layer >> compatible select()". This means that at this point, walreceiver is >> not interruptible. > > Ugh. Indeed. >> Which also shows itself if I shut down the system - >> the walreceiver stays around, and won't terminate properly. Do we need >> to invent a way for libpq to call back into backend code to do this >> select? We certainly can't have libipq use our version directly - >> since that would break all non-postmaster/postgres processes. > > I think that on some platforms, it's possible for the call to select() > from a shlib such as libpq to be captured by a select() provided by the > executable it's loaded into. Dunno about the linking rules on Windows, > but is there any chance of a workaround that way? AFAIK, no. We can somehow control the link order when we link with the import library, but that would require us to do it at link time, meaning libpq would be linked to postgres.exe --> FAIL. Another option is to load the select() function on runtime, by having libpq examine the list of loaded modules for postgres.exe.. But that seems a lot uglier than providing some kind of callback for it. -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
2010/1/17 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>: > Magnus Hagander wrote: >> Which shows one potentially big problem - since we're calling select() >> from inside libpq, it's not calling our "signal emulation layer >> compatible select()". This means that at this point, walreceiver is >> not interruptible. Which also shows itself if I shut down the system - >> the walreceiver stays around, and won't terminate properly. Do we need >> to invent a way for libpq to call back into backend code to do this >> select? We certainly can't have libipq use our version directly - >> since that would break all non-postmaster/postgres processes. > > Hmm, contrib/dblink probably has the same issue then. Yes. > We could replace the blocking PQexec() calls with PQsendQuery(), and use > the emulated version of select() to wait. Hmm. That would at least theoretically work, but aren't there still places we may end up blocking further down? Or are those ok? >> From what I can tell, this indicates that pq_getbyte_if_available() is >> not working - because it's supposed to never block, right? > > Right, it's not supposed to block. > >> This could be because the win32 socket emulation layer simply wasn't >> designed to deal with non-blocking sockets. Specifically, it actually >> *always* sets the socket to non-blocking mode, and then uses that to >> properly emulate how sockets work under unix. > > I presume the win32 emulation layer can be taught about non-blocking > sockets? Or maybe pq_getbyte_if_available() can be implemented using > some other simpler method on Windows. It could be taught that, but it would probably be a lot easier to put platform specific code in pq_getbyte_if_available(). That's a bit breaking an abstraction layer though, but that's maybe Ok in this case... -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
Magnus Hagander wrote: > 2010/1/17 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>: >> We could replace the blocking PQexec() calls with PQsendQuery(), and use >> the emulated version of select() to wait. > > Hmm. That would at least theoretically work, but aren't there still > places we may end up blocking further down? Or are those ok? There's also PQconnect that needs similar treatment (using PQconnectStart/Poll()), but that's it. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > Magnus Hagander wrote: >> 2010/1/17 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>: >>> We could replace the blocking PQexec() calls with PQsendQuery(), and use >>> the emulated version of select() to wait. >> Hmm. That would at least theoretically work, but aren't there still >> places we may end up blocking further down? Or are those ok? > > There's also PQconnect that needs similar treatment (using > PQconnectStart/Poll()), but that's it. So here's a patch implementing that for contrib/dblink. Walreceiver needs the same treatment. The implementation should be shared between the two, but I'm not sure how. We can't just put the wrapper functions to a module in src/backend/port/, because the wrapper functions depend on libpq. Maybe put them in a new header file as static functions, and include that in contrib/dblink/dblink.c and src/backend/replication/libpqwalreceiver.c. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c index 2c1d7a2..fa11709 100644 --- a/contrib/dblink/dblink.c +++ b/contrib/dblink/dblink.c @@ -34,6 +34,14 @@ #include <limits.h> +#ifdef WIN23 +/* These are needed by the interruptible libpq function replacements */ +#include <time.h> +#include <unistd.h> +#include <sys/time.h> +#include <sys/types.h> +#endif + #include "libpq-fe.h" #include "fmgr.h" #include "funcapi.h" @@ -101,6 +109,193 @@ static void dblink_res_error(const char *conname, PGresult *res, const char *dbl static char *get_connect_string(const char *servername); static char *escape_param_str(const char *from); +#ifdef WIN23 +/* + * Replacement functions for blocking libpq functions, for Windows. + * + * On Windows, the vanilla select() function doesn't react to our emulated + * signals. PQexec() and PQconnectdb() use select(), so they're + * uninterruptible. These replacement functions use the corresponding + * asynchronous libpq functions and backend version of select() to implement + * the same functionality, but in a way that's interrupted by signals. + * + * These work on other platforms as well, but presumably it's more efficient + * to let libpq block. + */ + +static PGresult * +dblink_PQexec(PGconn *conn, const char *command) +{ + int sock; + PGresult *result, + *lastResult; + + /* Send query. This can block too, but we ignore that for now. */ + if (PQsendQuery(conn, command) == 0) + return NULL; + + /* Wait for response */ + sock = PQsocket(conn); + + while(PQisBusy(conn)) + { + fd_set input_mask; + + FD_ZERO(&input_mask); + + FD_SET (sock, &input_mask); + + /* + * Note that we don't check the return code. We assume that + * PQconsumeInput() will get the same error, and set the result + * as failed. + */ + select(sock + 1, &input_mask, NULL, NULL, NULL); + PQconsumeInput(conn); + } + + /* + * Emulate PQexec()'s behavior of returning the *last* result, if + * there's many. dblink doesn't normally issue statements that return + * multiple results, but the user-supplied SQL statement passed to + * dblink() might. You'll only get the last result back, so it's not a + * very sensible thing to do, but we must still handle it gracefully. + * + * We don't try to concatenate error messages like PQexec() does. + * Doesn't seem worth the effort. + */ + lastResult = NULL; + while((result = PQgetResult(conn)) != NULL) + { + if (lastResult != NULL) + { + if (PQresultStatus(lastResult) != PGRES_COMMAND_OK && + PQresultStatus(lastResult) != PGRES_TUPLES_OK) + { + PQclear(result); + result = lastResult; + } + else + PQclear(lastResult); + } + lastResult = result; + } + + return lastResult; +} + +static PGconn * +dblink_PQconnectdb(const char *conninfo) +{ + PGconn *conn; + PostgresPollingStatusType status; + PQconninfoOption *options; + int timeout_secs = 0; + time_t end_time; + int sock; + + conn = PQconnectStart(conninfo); + if (conn == NULL) + return NULL; + + if (PQstatus(conn) == CONNECTION_BAD) + return conn; + + /* Extract timeout from the connection string */ + options = PQconninfoParse(conninfo, NULL); + if (options) + { + PQconninfoOption *option; + for (option = options; option->keyword != NULL; option++) + { + if (strcmp(option->keyword, "connect_timeout") == 0) + { + if (option->val != NULL && option->val[0] != '\0') + { + timeout_secs = atoi(option->val); + break; + } + } + } + PQconninfoFree(options); + } + if (timeout_secs > 0) + end_time = time(NULL) + timeout_secs; + + sock = PQsocket(conn); + + /* Wait for connection to be established */ + for (;;) + { + fd_set input_mask; + fd_set output_mask; + time_t now; + struct timeval timeout; + struct timeval *timeout_ptr; + + FD_ZERO(&input_mask); + FD_ZERO(&output_mask); + + status = PQconnectPoll(conn); + switch(status) + { + case PGRES_POLLING_OK: + case PGRES_POLLING_FAILED: + return conn; + + case PGRES_POLLING_READING: + FD_SET(sock, &input_mask); + break; + + case PGRES_POLLING_WRITING: + FD_SET(sock, &output_mask); + break; + + default: + elog(ERROR, "unknown PQconnectPoll() return value: %d", status); + } + + if (timeout_secs > 0) + { + now = time(NULL); + timeout.tv_sec = (now > end_time) ? 0 : (end_time - now); + timeout.tv_usec = 0; + timeout_ptr = &timeout; + } + else + timeout_ptr = NULL; + + /* + * Note that we don't check an error code. We assume that + * PQconnectPoll() will get the same error, and return failure. + */ + if (select(sock + 1, &input_mask, &output_mask, NULL, timeout_ptr) == 0) + { + /* Timeout */ + PQfinish(conn); + + /* + * This message is subtly different from the one from the message + * you get on other platforms, where PQconnectdb() handles the + * timeout. The "timeout expired" message here gets translated + * using the backend .po file, while the message emitted by + * PQconnectdb() is translated using libpq .po file. I hope that + * makes no difference in practice. + */ + ereport(ERROR, + (errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION), + errmsg("could not establish connection"), + errdetail("timeout expired"))); + } + } + return NULL; /* not reached, keep compiler quiet */ +} + +#define PQexec(conn, command) dblink_PQexec(conn, command) +#define PQconnectdb(conninfo) dblink_PQconnectdb(conninfo) + +#endif + /* Global */ static remoteConn *pconn = NULL; static HTAB *remoteConnHash = NULL;
On 01/21/2010 04:46 AM, Heikki Linnakangas wrote: > Heikki Linnakangas wrote: >> Magnus Hagander wrote: >>> 2010/1/17 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>: >>>> We could replace the blocking PQexec() calls with PQsendQuery(), and use >>>> the emulated version of select() to wait. >>> Hmm. That would at least theoretically work, but aren't there still >>> places we may end up blocking further down? Or are those ok? >> >> There's also PQconnect that needs similar treatment (using >> PQconnectStart/Poll()), but that's it. > > So here's a patch implementing that for contrib/dblink. Walreceiver > needs the same treatment. > > The implementation should be shared between the two, but I'm not sure > how. We can't just put the wrapper functions to a module in > src/backend/port/, because the wrapper functions depend on libpq. Maybe > put them in a new header file as static functions, and include that in > contrib/dblink/dblink.c and src/backend/replication/libpqwalreceiver.c. +#ifdef WIN23 ^^^^^ I assume you meant WIN32 here ;-) +#define PQexec(conn, command) dblink_PQexec(conn, command) +#define PQconnectdb(conninfo) dblink_PQconnectdb(conninfo) I have not been really following this thread, but why can't we put the "#ifdef WIN32" and special definition of these functions into libpq. I don't understand why we need special treatment for dblink. Joe
Joe Conway wrote: > +#ifdef WIN23 > ^^^^^ > I assume you meant WIN32 here ;-) Yeah. I admit I haven't tested this on Windows, I just commented out those #ifdef's and tested on Linux. Will need to verify that this actually solves the problem on Windows before committing. > +#define PQexec(conn, command) dblink_PQexec(conn, command) > +#define PQconnectdb(conninfo) dblink_PQconnectdb(conninfo) > > I have not been really following this thread, but why can't we put the > "#ifdef WIN32" and special definition of these functions into libpq. I > don't understand why we need special treatment for dblink. The problem is that select() function on Windows isn't interrupted by signals. That's because Unix-style signals don't exist on Windows, but we've emulated them in the server with pipes. The select() function doesn't know about that hack, so in the backend, we've replaced it with pgwin32_select() that does, using a #define. libpq doesn't use that #define and pgwin32_select(), because that's a backend function. It won't work in regular client applications. If we just moved those dblink_PQexec/PQconnectdb() functions to libpq, they wouldn't use pgwin32_select() and would thus be useless. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On 01/21/2010 10:33 PM, Heikki Linnakangas wrote: > Joe Conway wrote: >> I have not been really following this thread, but why can't we put the >> "#ifdef WIN32" and special definition of these functions into libpq. I >> don't understand why we need special treatment for dblink. > > The problem is that select() function on Windows isn't interrupted by > signals. That's because Unix-style signals don't exist on Windows, but > we've emulated them in the server with pipes. The select() function > doesn't know about that hack, so in the backend, we've replaced it with > pgwin32_select() that does, using a #define. Ah, thanks for the synopsis. > libpq doesn't use that #define and pgwin32_select(), because that's a > backend function. It won't work in regular client applications. > > If we just moved those dblink_PQexec/PQconnectdb() functions to libpq, > they wouldn't use pgwin32_select() and would thus be useless. OK, so now I see why we want this fixed for dblink and walreceiver, but doesn't this approach leave every other WIN32 libpq client out in the cold? Is there nothing that can be done for the general case, or is it a SMOP? Joe
Joe Conway wrote: > OK, so now I see why we want this fixed for dblink and walreceiver, but > doesn't this approach leave every other WIN32 libpq client out in the > cold? Is there nothing that can be done for the general case, or is it a > SMOP? The problem only applies to libpq calls from the backend. Client apps are not affected, only backend modules. If there's any other modules out there that use libpq, then yes, those have a problem too. A generic fix would be nice. Putting the wrapper functions in a new header file that's included in all backend modules that want to use libpq would work. Maybe the new header file could even be #included from libpq-fe.h, when compiling backend code (#ifndef FRONTEND), so you wouldn't need to modify dblink, walreceiver, or any 3rd party modules that use libpq at all. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > Joe Conway wrote: >> OK, so now I see why we want this fixed for dblink and walreceiver, but >> doesn't this approach leave every other WIN32 libpq client out in the >> cold? Is there nothing that can be done for the general case, or is it a >> SMOP? > > The problem only applies to libpq calls from the backend. Client apps > are not affected, only backend modules. If there's any other modules out > there that use libpq, then yes, those have a problem too. plproxy comes to mind. -- dim
On 1/22/10, Dimitri Fontaine <dfontaine@hi-media.com> wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > > > Joe Conway wrote: > >> OK, so now I see why we want this fixed for dblink and walreceiver, but > >> doesn't this approach leave every other WIN32 libpq client out in the > >> cold? Is there nothing that can be done for the general case, or is it a > >> SMOP? > > > > The problem only applies to libpq calls from the backend. Client apps > > are not affected, only backend modules. If there's any other modules out > > there that use libpq, then yes, those have a problem too. > > > plproxy comes to mind. Thats interesting. PL/Proxy deos not use PQexec, it uses async execution and waits on sockets with plain select() called from code compiled with backend headers. So it seems to be already using pgwin32_select(). Or not? -- marko
Marko Kreen wrote: > On 1/22/10, Dimitri Fontaine <dfontaine@hi-media.com> wrote: >> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: >> > The problem only applies to libpq calls from the backend. Client apps >> > are not affected, only backend modules. If there's any other modules out >> > there that use libpq, then yes, those have a problem too. >> >> >> plproxy comes to mind. > > Thats interesting. PL/Proxy deos not use PQexec, it uses async > execution and waits on sockets with plain select() called > from code compiled with backend headers. > > So it seems to be already using pgwin32_select(). Or not? Yes. I just grepped plproxy source code and there's indeed no blocking libpq calls there. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On 01/21/2010 11:19 PM, Heikki Linnakangas wrote: > Joe Conway wrote: >> OK, so now I see why we want this fixed for dblink and walreceiver, but >> doesn't this approach leave every other WIN32 libpq client out in the >> cold? Is there nothing that can be done for the general case, or is it a >> SMOP? > > The problem only applies to libpq calls from the backend. Client apps > are not affected, only backend modules. If there's any other modules out > there that use libpq, then yes, those have a problem too. Sorry for being thick -- I'm still missing something. I don't understand why any user program using libpq/PQexec running on Windows does not have the same issue. Or to put it another way, why does this only apply to libpq calls from backend modules? > A generic fix would be nice. Putting the wrapper functions in a new > header file that's included in all backend modules that want to use > libpq would work. Maybe the new header file could even be #included from > libpq-fe.h, when compiling backend code (#ifndef FRONTEND), so you > wouldn't need to modify dblink, walreceiver, or any 3rd party modules > that use libpq at all. I'll go ahead and do this if there is consensus for it. Joe
2010/1/24 Joe Conway <mail@joeconway.com>: > On 01/21/2010 11:19 PM, Heikki Linnakangas wrote: >> Joe Conway wrote: >>> OK, so now I see why we want this fixed for dblink and walreceiver, but >>> doesn't this approach leave every other WIN32 libpq client out in the >>> cold? Is there nothing that can be done for the general case, or is it a >>> SMOP? >> >> The problem only applies to libpq calls from the backend. Client apps >> are not affected, only backend modules. If there's any other modules out >> there that use libpq, then yes, those have a problem too. > > Sorry for being thick -- I'm still missing something. I don't understand > why any user program using libpq/PQexec running on Windows does not have > the same issue. Or to put it another way, why does this only apply to > libpq calls from backend modules? Because Windows programs in general don't rely on working Unix signal semantics - since Win32 doesn't *have* them. We faked them for PostgreSQL so we didn't have to rewrite large parts of how the backend deals with those things. I don't know any other software that does - and especially not client side software. So yeah, you could say they are affected insofar that their calls will be blocked as well, but if they are Windows apps they are probably designed to deal with it. It's the common way for such calls to behave on the platform. -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > 2010/1/24 Joe Conway <mail@joeconway.com>: >> Sorry for being thick -- I'm still missing something. I don't understand >> why any user program using libpq/PQexec running on Windows does not have >> the same issue. Or to put it another way, why does this only apply to >> libpq calls from backend modules? > Because Windows programs in general don't rely on working Unix signal > semantics - since Win32 doesn't *have* them. The real question is why is it so critical for our emulated signals to be able to interrupt these operations. If you're trying to tell me that Hot Standby is too fragile to work unless it can interrupt them, then HS has got a serious problem, and it is not libpq's fault. There is an enormous amount of code in the backend that can run for long periods of time without noticing signals, and not all of that is fixable. Consider for example a plperl, plpython, or pltcl function that goes off and does a long computation. So I'm thinking that proposing to kluge up libpq in this area isn't solving the real problem anyway. regards, tom lane
Tom Lane wrote: > Magnus Hagander <magnus@hagander.net> writes: >> 2010/1/24 Joe Conway <mail@joeconway.com>: >>> Sorry for being thick -- I'm still missing something. I don't understand >>> why any user program using libpq/PQexec running on Windows does not have >>> the same issue. Or to put it another way, why does this only apply to >>> libpq calls from backend modules? > >> Because Windows programs in general don't rely on working Unix signal >> semantics - since Win32 doesn't *have* them. > > The real question is why is it so critical for our emulated signals to > be able to interrupt these operations. The process won't react to a shutdown request otherwise, while it's waiting for the response to PQexec(). It's not such a big deal for walreceiver, actually, because it already uses select() (with emulation) in the main loop, but it's theoretically possible for the connection to be silently lost during the initial handshake, after sending the Query message, before receiving a response. dblink/contrib has the same issue, it might wait for a response forever. Hmm, maybe we should just TCP_KEEPALIVE (if available) on the connection. We should really be doing that in walreceiver anyway, walreceiver won't otherwise notice if the connection is silently lost, and won't know to reconnect. > If you're trying to tell me that Hot Standby is too fragile to work > unless it can interrupt them, then HS has got a serious problem, and > it is not libpq's fault. There is an enormous amount of code in the > backend that can run for long periods of time without noticing signals, > and not all of that is fixable. Consider for example a plperl, > plpython, or pltcl function that goes off and does a long computation. No, it's not related to Hot Standby. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Mon, Jan 18, 2010 at 11:46 PM, Magnus Hagander <magnus@hagander.net> wrote: >>> From what I can tell, this indicates that pq_getbyte_if_available() is >>> not working - because it's supposed to never block, right? >> >> Right, it's not supposed to block. >> >>> This could be because the win32 socket emulation layer simply wasn't >>> designed to deal with non-blocking sockets. Specifically, it actually >>> *always* sets the socket to non-blocking mode, and then uses that to >>> properly emulate how sockets work under unix. >> >> I presume the win32 emulation layer can be taught about non-blocking >> sockets? Or maybe pq_getbyte_if_available() can be implemented using >> some other simpler method on Windows. > > It could be taught that, but it would probably be a lot easier to put > platform specific code in pq_getbyte_if_available(). Umm.. in this case, for SSL on win32 case, we also need to create new function like my_sock_read_if_available() that receives data from non-blocking socket, and reassign it to the SSL BIO function. Right? If so, it seems easier for me to tell the win32 layer about non-blocking. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
2010/2/8 Fujii Masao <masao.fujii@gmail.com>: > On Mon, Jan 18, 2010 at 11:46 PM, Magnus Hagander <magnus@hagander.net> wrote: >>>> From what I can tell, this indicates that pq_getbyte_if_available() is >>>> not working - because it's supposed to never block, right? >>> >>> Right, it's not supposed to block. >>> >>>> This could be because the win32 socket emulation layer simply wasn't >>>> designed to deal with non-blocking sockets. Specifically, it actually >>>> *always* sets the socket to non-blocking mode, and then uses that to >>>> properly emulate how sockets work under unix. >>> >>> I presume the win32 emulation layer can be taught about non-blocking >>> sockets? Or maybe pq_getbyte_if_available() can be implemented using >>> some other simpler method on Windows. >> >> It could be taught that, but it would probably be a lot easier to put >> platform specific code in pq_getbyte_if_available(). > > Umm.. in this case, for SSL on win32 case, we also need to create > new function like my_sock_read_if_available() that receives data > from non-blocking socket, and reassign it to the SSL BIO function. > Right? If so, it seems easier for me to tell the win32 layer about > non-blocking. Sorry about the delay in responding to this. Remember that the win32 code *always* puts the socket in non-blocking mode. So we can't just "teach the layer about it". We need some way to pass the information down that this is actually something we want to be non-blocking, and it can't be the normal flag on the socket. I don't really have an idea of where else we'd put it though :( It's in the port structure, but not beyond it. What we could do, is have an ugly global flag specifically for the use-case we have here. Assuming we do create a plataform specific pq_getbyte_if_available(), the code-path that would have trouble now would be when we call pq_getbyte_if_available(), and it in turns asks the socket if there is data, there is, but we end up calling back into the SSL code to fetch the data, and it gets an incomplete packet. Correct? So the path is basically: pq_getbyte_if_available() -> secure_read() -> SSL_read() -> my_sock_read() -> pgwin32_recv() Given that we know we are working on a single socket here, we could use a global flag to tell pgwin32_recv() to become nonblocking. We could set this flag directly in the win32-specific version of pq_getbyte_if_available(), and make sure it's cleared as soon as we exit. It will obviously fail if we do anything on a *different* socket during this time, so it has to be set for a very short time. But that seems doable. And we don't call any socket stuff from signal handlers so that shouldn't cause issues. -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
On Sun, Feb 14, 2010 at 11:52 PM, Magnus Hagander <magnus@hagander.net> wrote: > Sorry about the delay in responding to this. Thanks for the response. > Remember that the win32 code *always* puts the socket in non-blocking > mode. So we can't just "teach the layer about it". We need some way to > pass the information down that this is actually something we want to > be non-blocking, and it can't be the normal flag on the socket. I > don't really have an idea of where else we'd put it though :( It's in > the port structure, but not beyond it. Right. BTW, pq_getbyte_if_available() always changes the socket to non-blocking and blocking mode before and after calling secure_read(), respectively. This code seems wrong on win32. Because, as you said, the socket is always in non-blocking mode on win32. We should change pq_getbyte_if_available() so as not to change the socket mode only in win32? > What we could do, is have an ugly global flag specifically for the > use-case we have here. Assuming we do create a plataform specific > pq_getbyte_if_available(), the code-path that would have trouble now > would be when we call pq_getbyte_if_available(), and it in turns asks > the socket if there is data, there is, but we end up calling back into > the SSL code to fetch the data, and it gets an incomplete packet. > Correct? So the path is basically: > > pq_getbyte_if_available() -> secure_read() -> SSL_read() -> > my_sock_read() -> pgwin32_recv() > > Given that we know we are working on a single socket here, we could > use a global flag to tell pgwin32_recv() to become nonblocking. We > could set this flag directly in the win32-specific version of > pq_getbyte_if_available(), and make sure it's cleared as soon as we > exit. > > It will obviously fail if we do anything on a *different* socket > during this time, so it has to be set for a very short time. But that > seems doable. And we don't call any socket stuff from signal handlers > so that shouldn't cause issues. Agreed. Here is the patch which does that (including the above-mentioned change). I haven't tested it yet because I failed in creating the build environment for the MSVC :( I'll try to create that again, and test it. Though I'm not sure how long it takes. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Attachment
2010/2/15 Fujii Masao <masao.fujii@gmail.com>: > On Sun, Feb 14, 2010 at 11:52 PM, Magnus Hagander <magnus@hagander.net> wrote: >> Remember that the win32 code *always* puts the socket in non-blocking >> mode. So we can't just "teach the layer about it". We need some way to >> pass the information down that this is actually something we want to >> be non-blocking, and it can't be the normal flag on the socket. I >> don't really have an idea of where else we'd put it though :( It's in >> the port structure, but not beyond it. > > Right. > > BTW, pq_getbyte_if_available() always changes the socket to non-blocking > and blocking mode before and after calling secure_read(), respectively. > This code seems wrong on win32. Because, as you said, the socket is always > in non-blocking mode on win32. We should change pq_getbyte_if_available() > so as not to change the socket mode only in win32? Yes. >> What we could do, is have an ugly global flag specifically for the >> use-case we have here. Assuming we do create a plataform specific >> pq_getbyte_if_available(), the code-path that would have trouble now >> would be when we call pq_getbyte_if_available(), and it in turns asks >> the socket if there is data, there is, but we end up calling back into >> the SSL code to fetch the data, and it gets an incomplete packet. >> Correct? So the path is basically: >> >> pq_getbyte_if_available() -> secure_read() -> SSL_read() -> >> my_sock_read() -> pgwin32_recv() >> >> Given that we know we are working on a single socket here, we could >> use a global flag to tell pgwin32_recv() to become nonblocking. We >> could set this flag directly in the win32-specific version of >> pq_getbyte_if_available(), and make sure it's cleared as soon as we >> exit. >> >> It will obviously fail if we do anything on a *different* socket >> during this time, so it has to be set for a very short time. But that >> seems doable. And we don't call any socket stuff from signal handlers >> so that shouldn't cause issues. > > Agreed. Here is the patch which does that (including the above-mentioned > change). I haven't tested it yet because I failed in creating the build > environment for the MSVC :( I'll try to create that again, and test it. > Though I'm not sure how long it takes. I changed your patch to this, because I find it a lot simpler. The change is in the checking in pgwin32_recv - there is no need to ever call waitforsinglesocket, we can just exit out early. Do you see any issue with that? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Attachment
Magnus Hagander <magnus@hagander.net> writes: > I changed your patch to this, because I find it a lot simpler. The > change is in the checking in pgwin32_recv - there is no need to ever > call waitforsinglesocket, we can just exit out early. > Do you see any issue with that? This definitely looks cleaner, but is there a reason not to use bool instead of int here? regards, tom lane
2010/2/15 Tom Lane <tgl@sss.pgh.pa.us>: > Magnus Hagander <magnus@hagander.net> writes: >> I changed your patch to this, because I find it a lot simpler. The >> change is in the checking in pgwin32_recv - there is no need to ever >> call waitforsinglesocket, we can just exit out early. > >> Do you see any issue with that? > > This definitely looks cleaner, but is there a reason not to use bool > instead of int here? No. -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
On Tue, Feb 16, 2010 at 1:33 AM, Magnus Hagander <magnus@hagander.net> wrote: > 2010/2/15 Tom Lane <tgl@sss.pgh.pa.us>: >> Magnus Hagander <magnus@hagander.net> writes: >>> I changed your patch to this, because I find it a lot simpler. The >>> change is in the checking in pgwin32_recv - there is no need to ever >>> call waitforsinglesocket, we can just exit out early. Thanks a lot, Magnus! >>> Do you see any issue with that? >> >> This definitely looks cleaner, but is there a reason not to use bool >> instead of int here? > > No. Can include/port/win32.h refer to bool type? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
2010/2/16 Fujii Masao <masao.fujii@gmail.com>: > On Tue, Feb 16, 2010 at 1:33 AM, Magnus Hagander <magnus@hagander.net> wrote: >> 2010/2/15 Tom Lane <tgl@sss.pgh.pa.us>: >>> Magnus Hagander <magnus@hagander.net> writes: >>>> I changed your patch to this, because I find it a lot simpler. The >>>> change is in the checking in pgwin32_recv - there is no need to ever >>>> call waitforsinglesocket, we can just exit out early. > > Thanks a lot, Magnus! > >>>> Do you see any issue with that? >>> >>> This definitely looks cleaner, but is there a reason not to use bool >>> instead of int here? >> >> No. > > Can include/port/win32.h refer to bool type? Nope, you're correct, it can't. Committed without that. -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/