Thread: SO_SNDBUF size is small on win32?
Hi,

I see a performance issue on win32. This problem is caused by the
behavior described at the following URL.

http://support.microsoft.com/kb/823764/EN-US/

On win32, the default SO_SNDBUF value is 8192 bytes. And libpq's buffer
is 8192 too.

pqcomm.c:117
  #define PQ_BUFFER_SIZE 8192

send() may take as long as 200ms. So, I think we should increase
SO_SNDBUF to more than 8192. I attach the patch.

Regards,
--
Yoshiyuki Asaba
y-asaba@sraoss.co.jp

Index: pqcomm.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/libpq/pqcomm.c,v
retrieving revision 1.184
diff -c -r1.184 pqcomm.c
*** pqcomm.c    5 Mar 2006 15:58:27 -0000    1.184
--- pqcomm.c    27 Jun 2006 15:17:18 -0000
***************
*** 593,598 ****
--- 593,608 ----
          return STATUS_ERROR;
      }

+ #ifdef WIN32
+     on = PQ_BUFFER_SIZE * 2;
+     if (setsockopt(port->sock, SOL_SOCKET, SO_SNDBUF,
+                    (char *) &on, sizeof(on)) < 0)
+     {
+         elog(LOG, "setsockopt(SO_SNDBUF) failed: %m");
+         return STATUS_ERROR;
+     }
+ #endif
+
      /*
       * Also apply the current keepalive parameters.  If we fail to set a
       * parameter, don't error out, because these aren't universally

Yoshiyuki Asaba <y-asaba@sraoss.co.jp> writes:
> send() may take as long as 200ms. So, I think we should increase
> SO_SNDBUF to more than 8192. I attach the patch.

Why would that help? We won't be sending more than 8K at a time.

regards, tom lane

On Wed, Jun 28, 2006 at 12:23:13AM +0900, Yoshiyuki Asaba wrote:
> Hi,
>
> I see a performance issue on win32. This problem is caused by the
> behavior described at the following URL.
>
> http://support.microsoft.com/kb/823764/EN-US/
>
> On win32, the default SO_SNDBUF value is 8192 bytes. And libpq's buffer
> is 8192 too.

Ok, so there's a deficiency in the Windows TCP code. Do you have any
benchmarks to show this actually makes a difference? According to the
URL you give, the problem occurs if the libpq buffer is *bigger* than
the socket buffer, which it isn't...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

From: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: [HACKERS] SO_SNDBUF size is small on win32?
Date: Tue, 27 Jun 2006 11:30:56 -0400

> Yoshiyuki Asaba <y-asaba@sraoss.co.jp> writes:
> > send() may take as long as 200ms. So, I think we should increase
> > SO_SNDBUF to more than 8192. I attach the patch.
>
> Why would that help? We won't be sending more than 8K at a time.

MSDN says:

  Method 2: Make the Socket Send Buffer Size Larger Than the Program
  Send Buffer Size
  ....
  Modify the send call or the WSASend call to specify a buffer size at
  least 1 byte smaller than the SO_SNDBUF value.

--
Yoshiyuki Asaba
y-asaba@sraoss.co.jp

Martijn van Oosterhout wrote:
> On Wed, Jun 28, 2006 at 12:23:13AM +0900, Yoshiyuki Asaba wrote:
>> Hi,
>>
>> I see a performance issue on win32. This problem is caused by the
>> behavior described at the following URL.
>>
>> http://support.microsoft.com/kb/823764/EN-US/
>>
>> On win32, the default SO_SNDBUF value is 8192 bytes. And libpq's buffer
>> is 8192 too.
>
> Ok, so there's a deficiency in the Windows TCP code. Do you have any
> benchmarks to show this actually makes a difference? According to the
> URL you give, the problem occurs if the libpq buffer is *bigger* than
> the socket buffer, which it isn't...

No, it says it occurs if this condition is met: "A single *send* call or
*WSASend* call fills the whole underlying socket send buffer."

This will surely be true if the buffer sizes are the same. They
recommend making the socket buffer at least 1 byte bigger.

cheers

andrew

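The condition quoted above (one send call filling the whole socket send buffer) suggests a simple check: compare the socket's SO_SNDBUF with the largest chunk the program hands to a single send call, and enlarge the socket buffer if it is not strictly bigger. The following is only a minimal sketch of that idea, not the patch from this thread. The helper name `ensure_sndbuf_exceeds` is hypothetical, and the code uses the POSIX socket headers; a Windows build would presumably include `winsock2.h` instead and use `int` for the option length.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of the proposed patch): make sure the
 * socket send buffer of `sock` is strictly larger than `app_buf_size`,
 * the largest chunk the program passes to a single send() call.
 * Returns 0 on success, -1 if getsockopt() or setsockopt() fails.
 */
int
ensure_sndbuf_exceeds(int sock, int app_buf_size)
{
    int         cur = 0;
    socklen_t   len = sizeof(cur);

    if (getsockopt(sock, SOL_SOCKET, SO_SNDBUF, (char *) &cur, &len) < 0)
        return -1;

    if (cur > app_buf_size)
        return 0;               /* already larger than one send() call */

    /* Assumption: use the same 2x factor as the patch in this thread. */
    cur = app_buf_size * 2;
    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, (char *) &cur, sizeof(cur)) < 0)
        return -1;

    return 0;
}
```

A caller would invoke it once after the connection is accepted, for example `ensure_sndbuf_exceeds(sock, 8192)` for an 8192-byte application buffer. The operating system may round or cap the requested size, so reading the value back with getsockopt() is the only way to know what was actually applied.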
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of
> Martijn van Oosterhout
>
> On Wed, Jun 28, 2006 at 12:23:13AM +0900, Yoshiyuki Asaba wrote:
> > Hi,
> >
> > I see a performance issue on win32. This problem is caused by the
> > behavior described at the following URL.
> >
> > http://support.microsoft.com/kb/823764/EN-US/
> >
> > On win32, the default SO_SNDBUF value is 8192 bytes. And libpq's
> > buffer is 8192 too.
>
> Ok, so there's a deficiency in the Windows TCP code. Do you have any
> benchmarks to show this actually makes a difference? According to the
> URL you give, the problem occurs if the libpq buffer is *bigger* than
> the socket buffer, which it isn't...

The article also says there is a problem if they are the same size.

-rocco

On Tue, Jun 27, 2006 at 11:45:53AM -0400, Andrew Dunstan wrote:
> No, it says it occurs if this condition is met: "A single *send* call or
> *WSASend* call fills the whole underlying socket send buffer."
>
> This will surely be true if the buffer sizes are the same. They
> recommend making the socket buffer at least 1 byte bigger.

Ok, but even then, are there any benchmarks to show it makes a
difference? The article suggests there should be, but it would be nice
to see how much difference it makes...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Andrew Dunstan <andrew@dunslane.net> writes:
> Martijn van Oosterhout wrote:
>> On Wed, Jun 28, 2006 at 12:23:13AM +0900, Yoshiyuki Asaba wrote:
>>> http://support.microsoft.com/kb/823764/EN-US/

> No, it says it occurs if this condition is met: "A single *send* call or
> *WSASend* call fills the whole underlying socket send buffer."

It also says that the condition only occurs if the program uses
non-blocking sockets ... which the backend does not. So this page
offers no support for the proposed patch.

regards, tom lane

From: Martijn van Oosterhout <kleptog@svana.org>
Subject: Re: [HACKERS] SO_SNDBUF size is small on win32?
Date: Tue, 27 Jun 2006 18:13:18 +0200

> On Tue, Jun 27, 2006 at 11:45:53AM -0400, Andrew Dunstan wrote:
> > No, it says it occurs if this condition is met: "A single *send* call or
> > *WSASend* call fills the whole underlying socket send buffer."
> >
> > This will surely be true if the buffer sizes are the same. They
> > recommend making the socket buffer at least 1 byte bigger.
>
> Ok, but even then, are there any benchmarks to show it makes a
> difference? The article suggests there should be, but it would be nice
> to see how much difference it makes...

I see the problem in this environment.

* client
  - Windows XP
  - using ODBC driver
* server
  - Windows XP
  - 8.1.4
* query time
  - original -> about 12 sec.
  - patched version -> about 3 sec.

However, the problem did not occur when I switched to a different client
machine...

Regards,
--
Yoshiyuki Asaba
y-asaba@sraoss.co.jp

From: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: [HACKERS] SO_SNDBUF size is small on win32?
Date: Tue, 27 Jun 2006 12:28:35 -0400

> Andrew Dunstan <andrew@dunslane.net> writes:
> > Martijn van Oosterhout wrote:
> >> On Wed, Jun 28, 2006 at 12:23:13AM +0900, Yoshiyuki Asaba wrote:
> >>> http://support.microsoft.com/kb/823764/EN-US/
>
> > No, it says it occurs if this condition is met: "A single *send* call or
> > *WSASend* call fills the whole underlying socket send buffer."
>
> It also says that the condition only occurs if the program uses
> non-blocking sockets ... which the backend does not. So this page
> offers no support for the proposed patch.

WSAEventSelect() sets a socket to nonblocking mode.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/wcecomm5/html/wce50lrfWSAEventSelect.asp

pgwin32_send() calls pgwin32_waitforsinglesocket() before WSASend().
And pgwin32_waitforsinglesocket() calls WSAEventSelect().

Regards,
--
Yoshiyuki Asaba
y-asaba@sraoss.co.jp

Yoshiyuki Asaba <y-asaba@sraoss.co.jp> writes:
> From: Tom Lane <tgl@sss.pgh.pa.us>
>> It also says that the condition only occurs if the program uses
>> non-blocking sockets ... which the backend does not. So this page
>> offers no support for the proposed patch.

> WSAEventSelect() sets a socket to nonblocking mode.

Yeah, but that socket is only used for inter-backend signaling with
small (1 byte, I think) messages. The socket used for communication
with the frontend is not in nonblocking mode, unless I'm totally
confused.

Have you actually measured any performance benefit from this patch, and
if so what was the test case? I'm not opposed to the patch if it does
something useful, but the info currently available does not suggest that
it will help.

What I would think might help is a patch on the libpq side (because it
*does* use a nonblocking socket) to avoid sending more than 8K per
WSASend call. The effect would just be to break a long send into a
series of shorter sends, which wouldn't really do anything useful on a
well-designed TCP stack, but then this is Windows we're talking about...

regards, tom lane

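The idea in the last paragraph above, breaking one long send into a series of sends of at most 8K each, could look roughly like the sketch below. This is an illustration only, not libpq code. The helper name `send_in_chunks` and the `MAX_CHUNK` constant are hypothetical, it uses POSIX send() rather than WSASend(), and it assumes a blocking socket (a nonblocking caller would additionally have to wait and retry when send() reports that it would block).

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Assumed cap per call, matching the 8K figure mentioned in the thread. */
#define MAX_CHUNK ((size_t) 8192)

/*
 * Hypothetical helper: send `len` bytes from `buf`, never handing more
 * than MAX_CHUNK bytes to a single send() call.  Returns 0 once all
 * bytes have been accepted by the kernel, or -1 on the first send()
 * failure (errno is left as set by send()).
 */
int
send_in_chunks(int sock, const char *buf, size_t len)
{
    while (len > 0)
    {
        size_t      want = (len < MAX_CHUNK) ? len : MAX_CHUNK;
        ssize_t     sent = send(sock, buf, want, 0);

        if (sent < 0)
            return -1;

        /* send() may accept fewer bytes than requested; advance by that. */
        buf += sent;
        len -= (size_t) sent;
    }
    return 0;
}
```

Following the KB article's wording, the cap would need to stay at least 1 byte below the socket's SO_SNDBUF value for this to avoid the delay, so the two settings have to be chosen together.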
> > From: Tom Lane <tgl@sss.pgh.pa.us>
> >> It also says that the condition only occurs if the program uses
> >> non-blocking sockets ... which the backend does not. So this page
> >> offers no support for the proposed patch.
> >
> > WSAEventSelect() sets a socket to nonblocking mode.
>
> Yeah, but that socket is only used for inter-backend
> signaling with small (1 byte, I think) messages. The socket
> used for communication with the frontend is not in
> nonblocking mode, unless I'm totally confused.

For once, I believe you are :-)

We use non-blocking sockets in backend/port/win32/socket.c so we are
able to deliver our "faked signals" while waiting for I/O on the socket.
We specifically set it in pgwin32_socket().

Given that, it might be a good idea to actually put the code there
instead, to localise it. With a comment and a reference to that Q
article.

> Have you actually measured any performance benefit from this
> patch, and if so what was the test case? I'm not opposed to
> the patch if it does something useful, but the info currently
> available does not suggest that it will help.

We have definitely seen weird timing issues sometimes when both client
and server were on Windows, but have been unable to pin them on anything
specific. From Yoshiyuki's other mail it looks like this could possibly
be it, since he did experience a speedup in the range we've been looking
for in those cases.

> What I would think might help is a patch on the libpq side (because it
> *does* use a nonblocking socket) to avoid sending more than
> 8K per WSASend call. The effect would just be to break a
> long send into a series of shorter sends, which wouldn't
> really do anything useful on a well-designed TCP stack, but
> then this is Windows we're talking about...

It could definitely be a good idea to have a patch there *as well*, but
I think they'd both be affected.

//Magnus

I would set SO_SNDBUF to 32768.

> On win32, the default SO_SNDBUF value is 8192 bytes. And libpq's buffer
> is 8192 too.
>
> pqcomm.c:117
>   #define PQ_BUFFER_SIZE 8192
>
> send() may take as long as 200ms. So, I think we should increase
> SO_SNDBUF to more than 8192. I attach the patch.

"Magnus Hagander" <mha@sollentuna.net> writes: > We use non-blocking sockets in backend/port/win32/socket.c so we are > able to deliver our "faked signals" while waiting for I/O on the socket. > We specifically set it in pgwin32_socket(). Hm, that seems a bit grotty, but anyway I stand corrected. > Given that, it might be a good idea to actually put the code there > instead, to localise it. With a comment and a reference to that Q > article. No, I think the patch has it in the right place, because pgwin32_socket would have no defensible way of knowing what the max send size would be. (Indeed, with a slightly different implementation in pqcomm.c, there would not *be* any hard upper limit; the current code wastes cycles copying data around, when with a large message it probably should just send() directly from the message buffer...) I agree it needs a comment though. >> What I would think might help is a patch on the libpq side (because it >> *does* use a nonblocking socket) to avoid sending more than >> 8K per WSASend call. > It could definitly be a good idea to have a patch there *as well*, but I > think they'd both be affected. On the libpq side, sending large messages is probably rare except for COPY IN mode. Has anyone noticed performance issues specifically with COPY IN? regards, tom lane
> We have definitely seen weird timing issues sometimes when both client
> and server were on Windows, but have been unable to pin them on anything
> specific. From Yoshiyuki's other mail it looks like this could possibly
> be it, since he did experience a speedup in the range we've been looking
> for in those cases.
>
>
>> What I would think might help is a patch on the libpq side (because it
>> *does* use a nonblocking socket) to avoid sending more than
>> 8K per WSASend call. The effect would just be to break a
>> long send into a series of shorter sends, which wouldn't
>> really do anything useful on a well-designed TCP stack, but
>> then this is Windows we're talking about...
>
> It could definitely be a good idea to have a patch there *as well*, but
> I think they'd both be affected.

As I said earlier, I would boost the socket buffer to something larger
than merely 2x the packet size. I'd try for 32K (32768); that way we
have some space for additional buffers before we hit the problem. The
socket buffer should presumably be large enough to hold at least the
amount of data we expect to send while waiting for the deferred ACK.

Hi,

From: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: [HACKERS] SO_SNDBUF size is small on win32?
Date: Tue, 27 Jun 2006 14:43:57 -0400

> >> What I would think might help is a patch on the libpq side (because it
> >> *does* use a nonblocking socket) to avoid sending more than
> >> 8K per WSASend call.
>
> > It could definitely be a good idea to have a patch there *as well*, but
> > I think they'd both be affected.
>
> On the libpq side, sending large messages is probably rare except for
> COPY IN mode. Has anyone noticed performance issues specifically with
> COPY IN?

I think the libpq interface does not use a non-blocking socket, because
the 'FRONTEND' symbol is defined there.

src/include/port/win32.h
  #ifndef FRONTEND
  #define socket(af, type, protocol) pgwin32_socket(af, type, protocol)
  #define accept(s, addr, addrlen) pgwin32_accept(s, addr, addrlen)
  #define connect(s, name, namelen) pgwin32_connect(s, name, namelen)
  #define select(n, r, w, e, timeout) pgwin32_select(n, r, w, e, timeout)
  #define recv(s, buf, len, flags) pgwin32_recv(s, buf, len, flags)
  #define send(s, buf, len, flags) pgwin32_send(s, buf, len, flags)

I think this is only a server-side problem. Is this right?

Regards,
--
Yoshiyuki Asaba
y-asaba@sraoss.co.jp

Yoshiyuki Asaba <y-asaba@sraoss.co.jp> writes:
> I think the libpq interface does not use a non-blocking socket.

Not unless the Windows port has also disabled pg_set_noblock ...

regards, tom lane

From: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: [HACKERS] SO_SNDBUF size is small on win32?
Date: Wed, 28 Jun 2006 09:54:21 -0400

> Yoshiyuki Asaba <y-asaba@sraoss.co.jp> writes:
> > I think the libpq interface does not use a non-blocking socket.
>
> Not unless the Windows port has also disabled pg_set_noblock ...

Sorry, I misunderstood.

I tried to reproduce this issue on msys.

% cat test.sh
export PGHOST=xxx
export PGPORT=5432
export PGDATABASE=test
dropdb $PGDATABASE
createdb
psql -c 'CREATE TABLE t1(a int, b text)'
i=0
while [ $i -lt 50 ]; do
  psql -c "insert into t1 values ($i, repeat('x', 10000))"
  i=`expr $i + 1`
done
pg_dump -a > dump
time psql -f dump

% sh test.sh

But I could not reproduce the issue this way... Can anyone else
reproduce it?

--
Yoshiyuki Asaba
y-asaba@sraoss.co.jp