Thread: Got no response last time on setsockopt post, so I thought I would reiterate.

These two calls make our remote queries via libpq about twice as fast on
average.  It seems to me like it might be a nice addition to the core
product’s libpq (I poked it into the spot where the Nagle algorithm is
turned off, but another place would be fine too).  Can anyone give me a
reason why it is a bad idea to add this in?  If it were made a parameter
with a default of 64K, that would be even better.  Then it could be tuned
to particular systems for maximum throughput.

    on = 65535;
    if (setsockopt(conn->sock, SOL_SOCKET, SO_RCVBUF, (char *) &on, sizeof(on)) < 0)
    {
        char        sebuf[256];

        printfPQExpBuffer(&conn->errorMessage,
                          libpq_gettext("could not set socket SO_RCVBUF window size: %s\n"),
                          SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
        return 0;
    }

    on = 65535;
    if (setsockopt(conn->sock, SOL_SOCKET, SO_SNDBUF, (char *) &on, sizeof(on)) < 0)
    {
        char        sebuf[256];

        printfPQExpBuffer(&conn->errorMessage,
                          libpq_gettext("could not set socket SO_SNDBUF window size: %s\n"),
                          SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
        return 0;
    }

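The "make it a parameter" idea above could look roughly like the sketch below. It is not libpq code: request_sock_buffer and tune_socket_buffers are hypothetical names, plain POSIX sockets are assumed (Windows would need Winsock equivalents), and the choice to treat a size of 0 as "leave the platform default alone" and a refused request as non-fatal is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper, not part of libpq: ask the kernel for a socket
 * buffer size, but only when the caller opted in (bytes > 0).
 * Returns 0 if the request was made or skipped, -1 if the kernel refused.
 */
static int
request_sock_buffer(int sock, int optname, int bytes)
{
    assert(optname == SO_RCVBUF || optname == SO_SNDBUF);

    if (bytes <= 0)
        return 0;               /* leave the platform default in place */

    if (setsockopt(sock, SOL_SOCKET, optname,
                   (const char *) &bytes, sizeof(bytes)) < 0)
    {
        /* Report the failure; the caller decides whether it is fatal. */
        fprintf(stderr, "could not set socket buffer size to %d: %s\n",
                bytes, strerror(errno));
        return -1;
    }
    return 0;
}

/*
 * Apply both directions. Returns the number of refused requests, so a
 * caller can keep the connection and simply run with default buffers.
 */
static int
tune_socket_buffers(int sock, int rcvbuf_bytes, int sndbuf_bytes)
{
    int         refused = 0;

    if (request_sock_buffer(sock, SO_RCVBUF, rcvbuf_bytes) < 0)
        refused++;
    if (request_sock_buffer(sock, SO_SNDBUF, sndbuf_bytes) < 0)
        refused++;
    return refused;
}
```

A caller would pass the configured sizes, for example tune_socket_buffers(sock, 65535, 65535), and could pass 0 for either direction to skip it entirely.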
"Dann Corbit" <DCorbit@connx.com> writes:
> These two calls make our remote queries via libpq about twice as fast on
> average.

And, perhaps, cause even greater factors of degradation in other
scenarios (not to mention the possibility of complete failure on some
platforms).  You haven't provided nearly enough evidence that this is
a safe change to make.
        regards, tom lane


> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Monday, June 11, 2007 3:41 PM
> To: Dann Corbit
> Cc: pgsql-hackers@postgresql.org; Larry McGhaw
> Subject: Re: [HACKERS] Got no response last time on setsockopt post, so I
> thought I would reiterate.
>
> "Dann Corbit" <DCorbit@connx.com> writes:
> > These two calls make our remote queries via libpq about twice as fast on
> > average.
>
> And, perhaps, cause even greater factors of degradation in other
> scenarios (not to mention the possibility of complete failure on some
> platforms).  You haven't provided nearly enough evidence that this is
> a safe change to make.

May I suggest:
http://www-didc.lbl.gov/TCP-tuning/setsockopt.html
http://www.ncsa.uiuc.edu/People/vwelch/net_perf/tcp_windows.html

We test against dozens of operating systems and we have never had a
problem.  (Generally, we use our own TCP/IP network objects for
communication; we only recently figured out why PostgreSQL was lagging
so far behind and patched libpq ourselves.)  It will be about 2 weeks
before our full regressions have run against PostgreSQL on all of our
platforms, but we do adjust the TCP/IP window on all of our clients and
servers and have yet to find one that is unable to either negotiate a
decent size or, at worst, ignore our request.

However, I won't twist your arm.  I just wanted to be sure that those at
the PostgreSQL organization were aware of this simple trick.  Our
products run on:
Aix
BeOS
Hpux
Linux (everywhere, including mainframe zLinux)
MVS
SunOS
Solaris
OpenVMS Alpha
OpenVMS VAX
OpenVMS Itanium
Windows

And several others



On Mon, 11 Jun 2007, Dann Corbit wrote:

> These two calls make our remote queries via libpq about twice as fast on
> average.

Can you comment a bit on what your remote queries are typically doing? 
You'll need to provide at least an idea what type of test case you're 
seeing the improvement on for others to try and replicate it.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD


> -----Original Message-----
> From: Greg Smith [mailto:gsmith@gregsmith.com]
> Sent: Monday, June 11, 2007 4:09 PM
> To: Dann Corbit
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Got no response last time on setsockopt post, so I
> thought I would reiterate.
>
> On Mon, 11 Jun 2007, Dann Corbit wrote:
>
> > These two calls make our remote queries via libpq about twice as fast on
> > average.
>
> Can you comment a bit on what your remote queries are typically doing?
> You'll need to provide at least an idea what type of test case you're
> seeing the improvement on for others to try and replicate it.

We have literally thousands (maybe hundreds of thousands -- I'm not
totally sure exactly how many there are because I am in development and
not in testing) of queries, that take dozens of machines over a week to
run.

Our queries include inserts, updates, deletes, joins, views, um... You
name it.

Our product is database middleware and so we have to test against
anything that is a legal SQL query against every sort of database and
operating system combination (PostgreSQL is one of many environments
that we support).

If you have seen the NIST SQL verification suite, that is part of our
test suite.  We also found the PostgreSQL suite useful (though the
PostgreSQL specific things we only run against PostgreSQL).  We also
have our own collection of regression tests that we have gathered over
the past 20 years or so.

I can't be specific because we run every sort of query.  Most of our
hardware is fairly high end (generally 1 Gb Ethernet, but we do have some
machines that only have 100 Mb network cards in them).

I guess that our usage is atypical for general business use but fairly
typical for those companies that produce middleware tool sets.  However,
many of our regressions came from customer feedback and so we do test
lots and lots of valid customer requirements.

I have a simple suggestion:
Put the setsockopt calls in (with the necessary fluff to make it robust)
and then perform the OSDB test.  I guess that unless the OSDB houses the
clients and the server on the same physical hardware you will see a very
large bonus for a very simple change.

> --
> * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD




Dann Corbit wrote:
> However, I won't twist your arm.  I just wanted to be sure that those at
> the PostgreSQL organization were aware of this simple trick.  Our
> products run on:
> Aix
> BeOS
> Hpux
> Linux (everywhere, including mainframe zLinux)
> MVS
> SunOS
> Solaris
> OpenVMS Alpha
> OpenVMS VAX
> OpenVMS Itanium
> Windows
>
> And several others

We already set the SNDBUF on Windows for reasons documented in the code.

I think if you were to quantify the alleged improvement by platform it 
might allay suspicion.

cheers

andrew


> -----Original Message-----
> From: Andrew Dunstan [mailto:andrew@dunslane.net]
> Sent: Monday, June 11, 2007 4:35 PM
> To: Dann Corbit
> Cc: Tom Lane; pgsql-hackers@postgresql.org; Larry McGhaw
> Subject: Re: [HACKERS] Got no response last time on setsockopt post, so I
> thought I would reiterate.
>
> Dann Corbit wrote:
> > However, I won't twist your arm.  I just wanted to be sure that those at
> > the PostgreSQL organization were aware of this simple trick.  Our
> > products run on:
> > Aix
> > BeOS
> > Hpux
> > Linux (everywhere, including mainframe zLinux)
> > MVS
> > SunOS
> > Solaris
> > OpenVMS Alpha
> > OpenVMS VAX
> > OpenVMS Itanium
> > Windows
> >
> > And several others
>
> We already set the SNDBUF on Windows for reasons documented in the code.

The only place I see it is for Windows *only* in pqcomm.c (to 32K).  Did
I miss it somewhere else?

> I think if you were to quantify the alleged improvement by platform it
> might allay suspicion.

I do not know if you will see the same results as we do.  We support
ancient and modern operating systems, on ancient and modern hardware (we
have OpenVMS 6.1 running Rdb as old as 4.2, for instance -- 1980's
technology).

The only way for you to see if your environments have the same sort of
benefits that we see is to test it yourselves.

The TCP/IP window size is such a well known optimization setting (in
fact the dominant one) that I am kind of surprised to be catching anyone
unawares.
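Why the window size matters, and why the benefit depends on the network, follows from the bandwidth-delay product: to keep a link busy, roughly bandwidth times round-trip time bytes must be in flight. The sketch below just does that arithmetic. The function names are made up for illustration, and the round-trip times are illustrative assumptions, not measurements from anyone in this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Bytes that must be in flight to keep a link busy: bandwidth x RTT. */
static double
bdp_bytes(double link_bits_per_sec, double rtt_seconds)
{
    assert(link_bits_per_sec >= 0 && rtt_seconds >= 0);
    return link_bits_per_sec / 8.0 * rtt_seconds;
}

/* Upper bound on throughput (bits/s) when at most window_bytes are unacknowledged. */
static double
window_limited_bits_per_sec(double window_bytes, double rtt_seconds)
{
    assert(rtt_seconds > 0);
    return window_bytes * 8.0 / rtt_seconds;
}

/* Print the in-flight bytes needed for a few link speed / RTT combinations. */
static void
print_bdp_table(void)
{
    const double links[] = {100e6, 1e9};            /* 100 Mb/s and 1 Gb/s */
    const double rtts[] = {0.0002, 0.001, 0.050};   /* assumed RTTs in seconds */
    size_t      i,
                j;

    for (i = 0; i < sizeof(links) / sizeof(links[0]); i++)
        for (j = 0; j < sizeof(rtts) / sizeof(rtts[0]); j++)
            printf("%6.0f Mb/s, RTT %5.1f ms: %9.0f bytes in flight\n",
                   links[i] / 1e6, rtts[j] * 1e3,
                   bdp_bytes(links[i], rtts[j]));
}
```

With these assumed numbers, a 1 Gb/s path at 1 ms RTT needs about 125,000 bytes in flight, roughly twice 64K, while the same link at 0.2 ms needs about 25,000 bytes. A 64K buffer could therefore help on one network and make no difference on another, which is one reason per-platform and per-network measurements are worth having.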



"Dann Corbit" <DCorbit@connx.com> writes:
> May I suggest:
> http://www-didc.lbl.gov/TCP-tuning/setsockopt.html
> http://www.ncsa.uiuc.edu/People/vwelch/net_perf/tcp_windows.html

I poked around on those pages and almost immediately came across
http://www.psc.edu/networking/projects/tcptune/
which appears more up-to-date than the other pages, and it specifically
recommends *against* setting SO_SNDBUF or SO_RCVBUF on modern Linuxen.
So that's one fairly large category where we probably do not want this.

You have not even made it clear whether you were increasing the sizes in
the server-to-client or client-to-server direction, and your handwaving
about the test conditions makes it even harder to know what you are
measuring.  I would think for instance that local vs remote connections
make a big difference and might need different tuning.

BTW, if we look at this issue we ought to also look at whether the
send/recv quantum in libpq and the backend should be changed.  It's been
8K for about ten years now ...
        regards, tom lane


> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Monday, June 11, 2007 5:12 PM
> To: Dann Corbit
> Cc: pgsql-hackers@postgresql.org; Larry McGhaw
> Subject: Re: [HACKERS] Got no response last time on setsockopt post, so I
> thought I would reiterate.
>
> "Dann Corbit" <DCorbit@connx.com> writes:
> > May I suggest:
> > http://www-didc.lbl.gov/TCP-tuning/setsockopt.html
> > http://www.ncsa.uiuc.edu/People/vwelch/net_perf/tcp_windows.html
>
> I poked around on those pages and almost immediately came across
> http://www.psc.edu/networking/projects/tcptune/
> which appears more up-to-date than the other pages, and it specifically
> recommends *against* setting SO_SNDBUF or SO_RCVBUF on modern Linuxen.
> So that's one fairly large category where we probably do not want this.

It can still be a good idea to set it:
http://datatag.web.cern.ch/datatag/howto/tcp.html
64K was just an example.  Like I said before, it should be configurable.

> You have not even made it clear whether you were increasing the sizes in
> the server-to-client or client-to-server direction, and your handwaving
> about the test conditions makes it even harder to know what you are
> measuring.  I would think for instance that local vs remote connections
> make a big difference and might need different tuning.

The configuration is a negotiation between client and server.  You may
or may not get what you ask for.  I suggest that it is simple to
implement and worthwhile to test.  But it was only a suggestion.

> BTW, if we look at this issue we ought to also look at whether the
> send/recv quantum in libpq and the backend should be changed.  It's been
> 8K for about ten years now ...

I suspect that TCP/IP packetizing will moderate the effects of changes
to this parameter.
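The "you may or may not get what you ask for" point can be checked directly by reading the option back after setting it. The following is a minimal sketch assuming POSIX sockets; set_and_read_back is a hypothetical helper, not something in libpq or the backend.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: request a buffer size, then ask the kernel what
 * it actually assigned. Returns the effective size in bytes, or -1 if
 * the value could not be read.
 */
static int
set_and_read_back(int sock, int optname, int requested)
{
    int         granted = 0;
    socklen_t   len = sizeof(granted);

    assert(optname == SO_RCVBUF || optname == SO_SNDBUF);

    /* A refused request is not fatal here; we still report the current size. */
    (void) setsockopt(sock, SOL_SOCKET, optname,
                      (const char *) &requested, sizeof(requested));

    if (getsockopt(sock, SOL_SOCKET, optname, (char *) &granted, &len) < 0)
        return -1;
    return granted;
}
```

The value returned need not equal the value requested. Linux, for example, documents that it doubles the requested size to leave room for bookkeeping and clamps it to a system-wide maximum, so logging the read-back value per platform would show what each system really does with a 64K request.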