Thread: Re: Low throughput of binary inserts from windows to linux
Hi,

I have written my own 'large object'-like feature using the following table:

----
CREATE TABLE blob (
  id     bigint  NOT NULL,
  pageno integer NOT NULL,
  data   bytea,
  CONSTRAINT blob_pkey PRIMARY KEY (id, pageno)
) WITHOUT OIDS;
ALTER TABLE blob ALTER COLUMN data SET STORAGE EXTERNAL;
CREATE SEQUENCE seq_key_blob;
----

One blob consists of many rows, each containing one 'page'. I insert
pages with PQexecPrepared with the format set to binary. This works
quite well for the following setups:

client  -> server
-----------------
linux   -> linux
linux   -> windows
windows -> windows

but pretty badly (meaning about 10 times slower) for this setup:

windows -> linux

The PostgreSQL version is 8.1.5 on both operating systems. A (sort of)
minimal code sample exposing this problem is attached to this e-mail.

Any ideas?

Thanks,
Axel
Attachment: my_lo.c
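For readers without the attachment: the insert loop in question boils
down to a prepared INSERT executed with binary parameters. A minimal
sketch, assuming the statement name, helper signature, and byte-swapping
details (these are illustrative choices, not a copy of my_lo.c):

----
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>    /* htonl; on Windows use winsock2.h instead */
#include <libpq-fe.h>

/* Assumes the statement was prepared once per connection:
 *   PQprepare(conn, "ins_page",
 *             "INSERT INTO blob (id, pageno, data) VALUES ($1, $2, $3)",
 *             3, NULL);
 */
static int
insert_page(PGconn *conn, int64_t id, int32_t pageno,
            const char *page, int page_len)
{
    /* Binary int8/int4 parameters travel in network byte order. */
    uint32_t      hi = htonl((uint32_t) (id >> 32));
    uint32_t      lo = htonl((uint32_t) id);
    unsigned char id_be[8];
    uint32_t      pageno_be = htonl((uint32_t) pageno);

    memcpy(id_be, &hi, 4);
    memcpy(id_be + 4, &lo, 4);

    const char *values[3]  = { (const char *) id_be,
                               (const char *) &pageno_be,
                               page };
    int         lengths[3] = { 8, 4, page_len };
    int         formats[3] = { 1, 1, 1 };   /* all three binary */

    PGresult *res = PQexecPrepared(conn, "ins_page", 3, values,
                                   lengths, formats, 0 /* text result */);
    int ok = (PQresultStatus(res) == PGRES_COMMAND_OK);

    PQclear(res);
    return ok;
}
----

With a 64 KB page size, page_len would be 65536 per call, which is the
configuration at issue in the rest of the thread.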
"Axel Waggershauser" <awagger@web.de> writes: > ... This works quite well for the > following setups: > client -> server > ----------------- > linux -> linux > linux -> windows > windows -> windows > but pretty bad (meaning about 10 times slower) for this setup > windows -> linux This has to be a network-level problem. IIRC, there are some threads in our archives discussing possibly-related performance issues seen with Windows' TCP stack. Don't recall details, but I think in some cases the problem was traced to third-party add-ons to the Windows stack. You might want to check just what you're running there. regards, tom lane
On 12/9/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> This has to be a network-level problem. IIRC, there are some threads in
> our archives discussing possibly-related performance issues seen with
> Windows' TCP stack. Don't recall details, but I think in some cases
> the problem was traced to third-party add-ons to the Windows stack.
> You might want to check just what you're running there.

I searched the archives but found nothing really enlightening regarding
my problem. One large thread regarding win32 concerned a psql problem
with multiple open handles; other mails referred to a "QoS" patch, but
I could not find more specific information.

I thought about firewall or virus-scanning software myself, but I can't
really see why such software should distinguish between a Windows
remote host and a Linux remote host. Furthermore, "downloading" is fast
on all setups; it's just uploading from Windows to Linux that is slow.

I repeated my test with a vanilla Windows 2000 machine (incl. tons of
Microsoft hot-fixes) and it exhibits the same problem.

I'm out of ideas here; maybe someone could try to reproduce this
behavior or could point me to the thread containing the relevant
information (sorry, maybe I'm just too dumb :-/)

Thanks,
Axel
> I'm out of ideas here; maybe someone could try to reproduce this
> behavior or could point me to the thread containing the relevant
> information (sorry, maybe I'm just too dumb :-/)

Please specify how you're transferring the data from windows -> linux.
Are you using ODBC? If yes, what driver? Are you using an FQDN server
name or a plain IP address? Etc., etc.

- thomas
On 12/11/06, Thomas H. <me@alternize.com> wrote:
> Please specify how you're transferring the data from windows -> linux.
> Are you using ODBC? If yes, what driver? Are you using an FQDN server
> name or a plain IP address? Etc., etc.

You may take a look at my first mail (starting this thread); there
you'll find my_lo.c attached, containing the complete code. I use
libpq. The connection is established like this:

    conn = PQsetdbLogin(argv[1], "5432", NULL, NULL,
                        argv[2], argv[3], argv[4]);

I called the test program with the plain IP address of the server
machine.

Axel
"Axel Waggershauser" <awagger@web.de> writes: > On 12/9/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> This has to be a network-level problem. IIRC, there are some threads in >> our archives discussing possibly-related performance issues seen with >> Windows' TCP stack. > I searched the archives but found nothing really enlightening > regarding my problem. One large thread regarding win32 was related to > a psql problem related to multiple open handles, other mails referred > to a "QoS" patch but I could not find more specific information. Yeah, that's what I couldn't think of the other day. The principal report was here: http://archives.postgresql.org/pgsql-general/2005-01/msg01231.php By default, Windows XP installs the QoS Packet Scheduler service. It is not installed by default on Windows 2000. After I installed QoS Packet Scheduler on the Windows 2000 machine, the latency problem vanished. Now he was talking about a local connection not remote, but it's still something worth trying. regards, tom lane
>>> On Mon, Dec 11, 2006 at 8:58 AM, in message <5e66c6e90612110658r3c0918f6v4fd3682363db5c15@mail.gmail.com>, "Axel Waggershauser" <awagger@web.de> wrote: > > I'm out of ideas here, maybe someone could try to reproduce this > behavior or could point me to the thread containing relevant > information No guarantees that this is the problem, but I have seen similar issues in other software because of delays introduced in the TCP stack by the Nagle algorithm. Turning on TCP_NODELAY has solved such problems. I don't know if PostgreSQL is vulnerable to this, or how it would be fixed in a PostgreSQL environment, but it might give you another avenue to search. -Kevin
On 12/11/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, that's what I couldn't think of the other day. The principal
> report was here:
> http://archives.postgresql.org/pgsql-general/2005-01/msg01231.php
>
>     By default, Windows XP installs the QoS Packet Scheduler service.
>     It is not installed by default on Windows 2000. After I installed
>     QoS Packet Scheduler on the Windows 2000 machine, the latency
>     problem vanished.

I found a QoS RSVP service (not sure about the last four characters,
and I'm sitting at my mac at home now...) on one of the WinXP test
boxes, started it, and immediately lost the network connection :-(.
Since I share Lincoln Yeoh's skepticism (expressed in a follow-up to
the above,
http://archives.postgresql.org/pgsql-general/2005-01/msg01243.php)
about a QoS packet scheduler helping with a raw-throughput problem, I
didn't investigate this further.

And regarding the TCP_NODELAY hint from Kevin Grittner: if I'm not
misreading fe-connect.c, libpq already deals with it
(fe-connect.c:connectNoDelay).

But this made me think about the 'page' size I use in my blob table...
I tested different sizes on linux some time ago and found that 64KB was
optimal. But playing with different sizes again revealed that my
windows->linux problem seems to be solved if I use _any_ other
(reasonable - meaning something between 4K and 512K) power of two?!?

Does this make sense to anyone?

Thanks,
Axel
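One way to reproduce the measurement Axel describes is to sweep the
page size while keeping the total transferred volume fixed. A rough
sketch, reusing the hypothetical insert_page() helper sketched earlier
(the sizes, volume, and output format are arbitrary choices, not taken
from my_lo.c):

----
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <sys/time.h>
#include <libpq-fe.h>

/* As sketched earlier in the thread. */
extern int insert_page(PGconn *conn, int64_t id, int32_t pageno,
                       const char *page, int page_len);

#define TOTAL_BYTES (16 * 1024 * 1024)   /* 16 MB per page size */

static double
now_ms(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000.0 + tv.tv_usec / 1000.0;
}

static void
sweep_page_sizes(PGconn *conn)
{
    static const int sizes[] = { 4*1024, 16*1024, 48*1024, 64*1024,
                                 80*1024, 128*1024, 512*1024 };
    char *page = malloc(512 * 1024);

    memset(page, 'x', 512 * 1024);
    for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
        int    pages = TOTAL_BYTES / sizes[i];
        double t0    = now_ms();

        for (int p = 0; p < pages; p++)
            insert_page(conn, (int64_t) i + 1, p, page, sizes[i]);
        printf("%7d bytes/page: %8.1f ms\n", sizes[i], now_ms() - t0);
    }
    free(page);
}
----

On the behavior reported above, the 64*1024 run should stand out by
roughly an order of magnitude on the windows -> linux path.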
"Axel Waggershauser" <awagger@web.de> writes: > I tested different sizes on linux some time ago and found that 64KB > was optimal. But playing with different sizes again revealed that my > windows->linux problem seems to be solved if I use _any_ other > (reasonable - meaning something between 4K and 512K) power of two ?!? I think this almost certainly indicates a Nagle/delayed-ACK interaction. I googled and found a nice description of the issue: http://www.stuartcheshire.org/papers/NagleDelayedAck/ Note that there are no TCP connections in which message payloads are exact powers of two (and no, I don't know why they didn't try to make it so). You are probably looking at a situation where this particular transfer size results in an odd number of messages where the other sizes do not, with the different overheads between Windows and everybody else accounting for the fact that it's only seen with a Windows sender. If you don't like that theory, another line of reasoning has to do with the fact that the maximum advertiseable window size in TCP is 65535 --- there could be some edge-case behaviors in the Windows and Linux stacks that don't play nicely together for 64K transfer sizes. regards, tom lane
* Tom Lane:

> If you don't like that theory, another line of reasoning has to do with
> the fact that the maximum advertisable window size in TCP is 65535 ---
> there could be some edge-case behaviors in the Windows and Linux stacks
> that don't play nicely together for 64K transfer sizes.

Linux enables window scaling, so the actual window size can be more
than 64K. Windows should cope with it, but some PIX firewalls and other
historic boxes won't.

-- 
Florian Weimer <fweimer@bfk.de>
BFK edv-consulting GmbH          http://www.bfk.de/
Kriegsstraße 100                 tel: +49-721-96201-1
D-76133 Karlsruhe                fax: +49-721-96201-99
On 12/12/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think this almost certainly indicates a Nagle/delayed-ACK
> interaction. I googled and found a nice description of the issue:
> http://www.stuartcheshire.org/papers/NagleDelayedAck/

But that means I must have misinterpreted fe-connect.c, right? Meaning
that on the standard Windows build the

    setsockopt(conn->sock, IPPROTO_TCP, TCP_NODELAY,
               (char *) &on, sizeof(on))

line never gets called (either because TCP_NODELAY is not defined, or
because IS_AF_UNIX(addr_cur->ai_family) in PQconnectPoll evaluates to
true).

In case I was mistaken, this explanation makes perfect sense to me. But
then again it would indicate a 'bug' in libpq, in the sense that it
(apparently) sets TCP_NODELAY on linux but not on windows.

Axel
"Axel Waggershauser" <awagger@web.de> writes: > On 12/12/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think this almost certainly indicates a Nagle/delayed-ACK >> interaction. I googled and found a nice description of the issue: >> http://www.stuartcheshire.org/papers/NagleDelayedAck/ > In case I was mistaken, this explanation makes perfectly sens to me. > But then again it would indicate a 'bug' in libpq, in the sense that > it (apparently) sets TCP_NODELAY on linux but not on windows. No, it would mean a bug in Windows in that it fails to honor TCP_NODELAY. Again, given that you only see the behavior at one specific message length, I suspect this is a corner case rather than a generic "it doesn't work" issue. We're pretty much guessing though. Have you tried tracing the traffic with a packet sniffer to see what's really happening at different message sizes? regards, tom lane
Tom Lane wrote:
>> In case I was mistaken, this explanation makes perfect sense to me.
>> But then again it would indicate a 'bug' in libpq, in the sense that
>> it (apparently) sets TCP_NODELAY on linux but not on windows.
>
> No, it would mean a bug in Windows in that it fails to honor
> TCP_NODELAY.

Last time I did battle with nagle/delayed-ack interaction in windows
(the other end has to be another stack implementation -- windows to
itself I don't think has the problem), it _did_ honor TCP_NODELAY. That
was a while ago (1997), but I'd be surprised if things have changed
much since then.

Basically, nagle has to be turned off for protocols like this
(request/response interaction over TCP), otherwise you'll sometimes end
up with stalls waiting for the delayed ack before sending, which in
turn results in very low throughput per connection. As I remember, a
Windows client talking to a Solaris server had the problem, but various
other permutations of client and server stack implementations did not.