Thread: Re: Low throughput of binary inserts from windows to linux
Hi,

I have written my own 'large object'-like feature using the following table:

----
CREATE TABLE blob (
  id     bigint  NOT NULL,
  pageno integer NOT NULL,
  data   bytea,
  CONSTRAINT blob_pkey PRIMARY KEY (id, pageno)
) WITHOUT OIDS;
ALTER TABLE blob ALTER COLUMN data SET STORAGE EXTERNAL;
CREATE SEQUENCE seq_key_blob;
----

One blob consists of many rows, each containing one 'page'. I insert
pages with PQexecPrepared with the format set to binary. This works
quite well for the following setups:

client  -> server
-----------------
linux   -> linux
linux   -> windows
windows -> windows

but pretty badly (meaning about 10 times slower) for this setup:

windows -> linux

The PostgreSQL version is 8.1.5 on both operating systems. A (sort of)
minimal code sample exposing this problem is attached to this e-mail.

Any ideas?

Thanks,
Axel
Attachment: my_lo.c
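For readers without the attachment: the insert loop in question boils
down to a prepared INSERT executed with binary parameters. A minimal
sketch, assuming the statement name, helper signature, and byte-swapping
details (these are illustrative choices, not a copy of my_lo.c):

----
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>    /* htonl; on Windows use winsock2.h instead */
#include <libpq-fe.h>

/* Assumes the statement was prepared once per connection:
 *   PQprepare(conn, "ins_page",
 *             "INSERT INTO blob (id, pageno, data) VALUES ($1, $2, $3)",
 *             3, NULL);
 */
static int
insert_page(PGconn *conn, int64_t id, int32_t pageno,
            const char *page, int page_len)
{
    /* Binary int8/int4 parameters travel in network byte order. */
    uint32_t      hi = htonl((uint32_t) (id >> 32));
    uint32_t      lo = htonl((uint32_t) id);
    unsigned char id_be[8];
    uint32_t      pageno_be = htonl((uint32_t) pageno);

    memcpy(id_be, &hi, 4);
    memcpy(id_be + 4, &lo, 4);

    const char *values[3]  = { (const char *) id_be,
                               (const char *) &pageno_be,
                               page };
    int         lengths[3] = { 8, 4, page_len };
    int         formats[3] = { 1, 1, 1 };   /* all three binary */

    PGresult *res = PQexecPrepared(conn, "ins_page", 3, values,
                                   lengths, formats, 0 /* text result */);
    int ok = (PQresultStatus(res) == PGRES_COMMAND_OK);

    PQclear(res);
    return ok;
}
----

With a 64 KB page size, page_len would be 65536 per call, which is the
configuration at issue in the rest of the thread.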
"Axel Waggershauser" <awagger@web.de> writes: > ... This works quite well for the > following setups: > client -> server > ----------------- > linux -> linux > linux -> windows > windows -> windows > but pretty bad (meaning about 10 times slower) for this setup > windows -> linux This has to be a network-level problem. IIRC, there are some threads in our archives discussing possibly-related performance issues seen with Windows' TCP stack. Don't recall details, but I think in some cases the problem was traced to third-party add-ons to the Windows stack. You might want to check just what you're running there. regards, tom lane
On 12/9/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> This has to be a network-level problem. IIRC, there are some threads in
> our archives discussing possibly-related performance issues seen with
> Windows' TCP stack. Don't recall details, but I think in some cases
> the problem was traced to third-party add-ons to the Windows stack.
> You might want to check just what you're running there.

I searched the archives but found nothing really enlightening regarding
my problem. One large thread regarding win32 concerned a psql problem
with multiple open handles; other mails referred to a "QoS" patch, but
I could not find more specific information.

I thought about firewall or virus-scanning software myself, but I can't
really see why such software should distinguish between a Windows
remote host and a Linux remote host. Furthermore, "downloading" is fast
on all setups; it's just uploading from Windows to Linux that is slow.

I repeated my test with a vanilla Windows 2000 machine (incl. tons of
Microsoft hot-fixes) and it exhibits the same problem.

I'm out of ideas here; maybe someone could try to reproduce this
behavior or could point me to the thread containing the relevant
information (sorry, maybe I'm just too dumb :-/)

Thanks,
Axel
> I'm out of ideas here; maybe someone could try to reproduce this
> behavior or could point me to the thread containing the relevant
> information (sorry, maybe I'm just too dumb :-/)

Please specify how you're transferring the data from windows -> linux.
Are you using ODBC? If yes, what driver? Are you using an FQDN server
name or a plain IP address? Etc., etc.

- thomas
On 12/11/06, Thomas H. <me@alternize.com> wrote:
> Please specify how you're transferring the data from windows -> linux.
> Are you using ODBC? If yes, what driver? Are you using an FQDN server
> name or a plain IP address? Etc., etc.

You may take a look at my first mail (starting this thread); there
you'll find my_lo.c attached, containing the complete code. I use
libpq. The connection is established like this:

    conn = PQsetdbLogin(argv[1], "5432", NULL, NULL,
                        argv[2], argv[3], argv[4]);

I called the test program with the plain IP address of the server
machine.

Axel
"Axel Waggershauser" <awagger@web.de> writes: > On 12/9/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> This has to be a network-level problem. IIRC, there are some threads in >> our archives discussing possibly-related performance issues seen with >> Windows' TCP stack. > I searched the archives but found nothing really enlightening > regarding my problem. One large thread regarding win32 was related to > a psql problem related to multiple open handles, other mails referred > to a "QoS" patch but I could not find more specific information. Yeah, that's what I couldn't think of the other day. The principal report was here: http://archives.postgresql.org/pgsql-general/2005-01/msg01231.php By default, Windows XP installs the QoS Packet Scheduler service. It is not installed by default on Windows 2000. After I installed QoS Packet Scheduler on the Windows 2000 machine, the latency problem vanished. Now he was talking about a local connection not remote, but it's still something worth trying. regards, tom lane
>>> On Mon, Dec 11, 2006 at 8:58 AM, in message <5e66c6e90612110658r3c0918f6v4fd3682363db5c15@mail.gmail.com>, "Axel Waggershauser" <awagger@web.de> wrote: > > I'm out of ideas here, maybe someone could try to reproduce this > behavior or could point me to the thread containing relevant > information No guarantees that this is the problem, but I have seen similar issues in other software because of delays introduced in the TCP stack by the Nagle algorithm. Turning on TCP_NODELAY has solved such problems. I don't know if PostgreSQL is vulnerable to this, or how it would be fixed in a PostgreSQL environment, but it might give you another avenue to search. -Kevin
On 12/11/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, that's what I couldn't think of the other day. The principal
> report was here:
> http://archives.postgresql.org/pgsql-general/2005-01/msg01231.php
>
>     By default, Windows XP installs the QoS Packet Scheduler service.
>     It is not installed by default on Windows 2000. After I installed
>     QoS Packet Scheduler on the Windows 2000 machine, the latency
>     problem vanished.

I found a QoS RSVP service (not sure about the last four characters,
and I'm sitting at my mac at home now...) on one of the WinXP test
boxes, started it, and immediately lost the network connection :-(.
Since I share Lincoln Yeoh's skepticism (expressed in a follow-up to
the above,
http://archives.postgresql.org/pgsql-general/2005-01/msg01243.php)
about a QoS packet scheduler helping with a raw-throughput problem, I
didn't investigate this further.

And regarding the TCP_NODELAY hint from Kevin Grittner: if I'm not
misreading fe-connect.c, libpq already deals with it
(fe-connect.c:connectNoDelay).

But this made me think about the 'page' size I use in my blob table...
I tested different sizes on linux some time ago and found that 64KB was
optimal. But playing with different sizes again revealed that my
windows->linux problem seems to be solved if I use _any_ other
(reasonable - meaning something between 4K and 512K) power of two?!?

Does this make sense to anyone?

Thanks,
Axel
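One way to reproduce the measurement Axel describes is to sweep the
page size while keeping the total transferred volume fixed. A rough
sketch, reusing the hypothetical insert_page() helper sketched earlier
(the sizes, volume, and output format are arbitrary choices, not taken
from my_lo.c):

----
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <sys/time.h>
#include <libpq-fe.h>

/* As sketched earlier in the thread. */
extern int insert_page(PGconn *conn, int64_t id, int32_t pageno,
                       const char *page, int page_len);

#define TOTAL_BYTES (16 * 1024 * 1024)   /* 16 MB per page size */

static double
now_ms(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000.0 + tv.tv_usec / 1000.0;
}

static void
sweep_page_sizes(PGconn *conn)
{
    static const int sizes[] = { 4*1024, 16*1024, 48*1024, 64*1024,
                                 80*1024, 128*1024, 512*1024 };
    char *page = malloc(512 * 1024);

    memset(page, 'x', 512 * 1024);
    for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
        int    pages = TOTAL_BYTES / sizes[i];
        double t0    = now_ms();

        for (int p = 0; p < pages; p++)
            insert_page(conn, (int64_t) i + 1, p, page, sizes[i]);
        printf("%7d bytes/page: %8.1f ms\n", sizes[i], now_ms() - t0);
    }
    free(page);
}
----

On the behavior reported above, the 64*1024 run should stand out by
roughly an order of magnitude on the windows -> linux path.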
"Axel Waggershauser" <awagger@web.de> writes: > I tested different sizes on linux some time ago and found that 64KB > was optimal. But playing with different sizes again revealed that my > windows->linux problem seems to be solved if I use _any_ other > (reasonable - meaning something between 4K and 512K) power of two ?!? I think this almost certainly indicates a Nagle/delayed-ACK interaction. I googled and found a nice description of the issue: http://www.stuartcheshire.org/papers/NagleDelayedAck/ Note that there are no TCP connections in which message payloads are exact powers of two (and no, I don't know why they didn't try to make it so). You are probably looking at a situation where this particular transfer size results in an odd number of messages where the other sizes do not, with the different overheads between Windows and everybody else accounting for the fact that it's only seen with a Windows sender. If you don't like that theory, another line of reasoning has to do with the fact that the maximum advertiseable window size in TCP is 65535 --- there could be some edge-case behaviors in the Windows and Linux stacks that don't play nicely together for 64K transfer sizes. regards, tom lane
* Tom Lane:

> If you don't like that theory, another line of reasoning has to do with
> the fact that the maximum advertisable window size in TCP is 65535 ---
> there could be some edge-case behaviors in the Windows and Linux stacks
> that don't play nicely together for 64K transfer sizes.

Linux enables window scaling, so the actual window size can be more
than 64K. Windows should cope with it, but some PIX firewalls and other
historic boxes won't.

-- 
Florian Weimer <fweimer@bfk.de>
BFK edv-consulting GmbH          http://www.bfk.de/
Kriegsstraße 100                 tel: +49-721-96201-1
D-76133 Karlsruhe                fax: +49-721-96201-99
On 12/12/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think this almost certainly indicates a Nagle/delayed-ACK
> interaction. I googled and found a nice description of the issue:
> http://www.stuartcheshire.org/papers/NagleDelayedAck/

But that means I must have misinterpreted fe-connect.c, right? Meaning
that on the standard Windows build the

    setsockopt(conn->sock, IPPROTO_TCP, TCP_NODELAY,
               (char *) &on, sizeof(on))

line never gets called (either because TCP_NODELAY is not defined, or
because IS_AF_UNIX(addr_cur->ai_family) in PQconnectPoll evaluates to
true).

In case I was mistaken, this explanation makes perfect sense to me. But
then again it would indicate a 'bug' in libpq, in the sense that it
(apparently) sets TCP_NODELAY on linux but not on windows.

Axel
"Axel Waggershauser" <awagger@web.de> writes: > On 12/12/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think this almost certainly indicates a Nagle/delayed-ACK >> interaction. I googled and found a nice description of the issue: >> http://www.stuartcheshire.org/papers/NagleDelayedAck/ > In case I was mistaken, this explanation makes perfectly sens to me. > But then again it would indicate a 'bug' in libpq, in the sense that > it (apparently) sets TCP_NODELAY on linux but not on windows. No, it would mean a bug in Windows in that it fails to honor TCP_NODELAY. Again, given that you only see the behavior at one specific message length, I suspect this is a corner case rather than a generic "it doesn't work" issue. We're pretty much guessing though. Have you tried tracing the traffic with a packet sniffer to see what's really happening at different message sizes? regards, tom lane
Tom Lane wrote:
>> In case I was mistaken, this explanation makes perfect sense to me.
>> But then again it would indicate a 'bug' in libpq, in the sense that
>> it (apparently) sets TCP_NODELAY on linux but not on windows.
>
> No, it would mean a bug in Windows in that it fails to honor
> TCP_NODELAY.

Last time I did battle with nagle/delayed-ack interaction in windows
(the other end has to be another stack implementation -- windows to
itself I don't think has the problem), it _did_ honor TCP_NODELAY. That
was a while ago (1997), but I'd be surprised if things have changed
much since then.

Basically, nagle has to be turned off for protocols like this
(request/response interaction over TCP), otherwise you'll sometimes end
up with stalls waiting for the delayed ack before sending, which in
turn results in very low throughput per connection. As I remember, a
Windows client talking to a Solaris server had the problem, but various
other permutations of client and server stack implementations did not.