Re: TCP network cost - Mailing list pgsql-performance

From Ross J. Reedstrom
Subject Re: TCP network cost
Date
Msg-id 20090223194212.GA14802@cooker
Whole thread Raw
In response to Re: TCP network cost  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: TCP network cost
List pgsql-performance
[note: sending a message that's been sitting in 'drafts' since last week]

Summary: C client and large-object API python both send bits in
reasonable time, but I suspect there's still room for improvement in
libpq over TCP: I'm suspicious of the 6x difference. Detailed analysis
will probably find it's all down to memory allocation and extra copying
of bits around (client side)

Ross

On Wed, Feb 18, 2009 at 01:44:23PM +0000, Gregory Stark wrote:
>
> There's not much Postgres can do to mess up TCP/IP. The only things that come
> to mind are a) making lots of short-lived connections and b) getting caught by
> Nagle when doing lots of short operations and blocking waiting on results.

The hint re: Nagle sent to off hunting. It looks like libpq _should_ be
setting NODELAY on both sides of the socket. However, tcptrace output
does show (what I understand to be) the stereotypical
every-other-packet-acked stairstep of a delayed-ack/Nagle interaction.
(as described here: http://www.stuartcheshire.org/papers/NagleDelayedAck/ )

Walking through the libpq code, though, it sets NODELAY, so Nagle should
be out of the picture.  This may be a red herring, though. See below.

> What libpq (or other interface) operations are you doing exactly?

I'm using psycopg from python. My cut down test case is:

con=psycopg.connect('dbname=mydb user=myuser port=5433 host=myhost')
cur=con.cursor()
start=DateTime.now()
cur.execute("""select file from files where fileid=1""")
data = cur.fetchone()[0]
end=DateTime.now()
f=open('/dev/null','w')
f.write(data)
f.close()
cur.close()
print "tcp socket: %s" % str(end - start)

I've since written a minimal C app, and it's doing much better, down to
about 7 sec for a local TCP connection (either localhost or hostname)

So, I get to blame the psycopg wrapper for ~ 30 sec of delay. I'm
suspicous of memory allocation, myself.

The tcp traces (tcpdump + tcptrace + xplot are cool set of tools, btw)
indicate that the backend's taking ~ 0.35 sec to process the query and
start sending bits, and using a domain socket w/ that code gets the file
in 1.3 - 1.4 sec, so I'm still seeing a 6-fold slowdown for going via
TCP (6 sec. vs. 1 sec.)  Sending the raw file via apache (localhost)
takes ~200 ms.

Moving to a large-object based implementation would seem to confirm
that: psycopg2 (snapshot of svn head) manages to pull a lo version of
the file in times equivalent to the C client (7 sec local)

I'll probably move the system to use that, since there's really almost
no use-case for access to the insides of these files from SQL.

> [also, your Mail-Followup-To has a bogus email address in it. Please don't do
> that]

Hmm, not on purpose. I'll take a look.

pgsql-performance by date:

Previous
From: Kouber Saparev
Date:
Subject: Re: LIMIT confuses the planner
Next
From: "Ross J. Reedstrom"
Date:
Subject: Re: TCP network cost