Re: strange performance problem (SOLVED) - Mailing list pgsql-general

From Linos
Subject Re: strange performance problem (SOLVED)
Date
Msg-id 49AADBB2.4060500@linos.es
Whole thread Raw
In response to Re: strange performance problem (SOLVED)  (Linos <info@linos.es>)
List pgsql-general
Linos escribió:
> Linos escribió:
>> Linos escribió:
>>> Richard Huxton escribió:
>>>> Linos wrote:
>>>>> Richard Huxton escribió:
>>>>>> Linos wrote:
>>>>>>> 2009-02-27 13:51:15 CET 127.0.0.1LOG:  duración: 4231.045 ms
>>>>>>> sentencia:
>>>>>>> SELECT "nombre", "subfamilia_id", "id_familia", "hasta", "foto",
>>>>>>> "id_seccion", "id_categoria" FROM "modelo_subfamilia"
>>>>>>> PSQL with \timing:
>>>>>>> -development: Time: 72,441 ms
>>>>>>> -server: Time: 78,762 ms
>>>>>>> but if i load it from QT or from pgadmin i get more than 4
>>>>>>> seconds in
>>>>>>> server and ~100ms in develoment machime, if i try the query
>>>>>>> without the
>>>>>>> "foto" column i get 2ms in development and 30ms in server
>>>>>> OK, so:
>>>>>> 1. No "foto"     - both quick
>>>>>> 2. psql + "foto" - both slow
>>>>>> 3. QT + "foto"   - slow only on server
>>>>> 1.No "foto"                     -both quick but still a noticeable
>>>>> difference between them 2ms develoment - 30ms server
>>>>> 2. psql + "foto"                -both quick really, they are about
>>>>> 70ms,
>>>>> not bad giving that foto are bytea with small png images.
>>>>
>>>> Ah, sorry - I read the psql timings as 72 thousand ms (72 seconds) - of
>>>> course you're using European decimal marks.
>>>>
>>>>> 3. QT or WXWindows + "foto"       -slow only one server yes.
>>>>>
>>>>>> The bit that puzzles me is why both are slow in #2 and not in #3.
>>>>
>>>> OK - well, the fact that both psql are fast means there's nothing too
>>>> wrong with your setup. It must be something to do with the application
>>>> libraries.
>>>>
>>>>> After the vacuum full verbose and reindex still the same problem (i
>>>>> had
>>>>> tried the vacuum before).
>>>>
>>>> OK. Worth ruling it out.
>>>>
>>>>> 1- The same in the two machines, tcp/ip with localhost.
>>>>
>>>> Hmm...
>>>>
>>>>> 2- I am exactly the same code in the two machines and the same
>>>>> pgadmin3
>>>>> version too.
>>>>
>>>> Good. We can rule that out.
>>>>
>>>>> 3- Ever the entire result set.
>>>>
>>>> Good.
>>>>
>>>>> 4- I am using es_ES.UTF8 in the two machines
>>>>
>>>> Good.
>>>>
>>>>> What can be using wxwindows and QT to access postgresql that psql
>>>>> it is
>>>>> not using, libpq?
>>>>
>>>> Well, I'm pretty sure that pgadmin will be using libpq at some level,
>>>> even if there is other code above it.
>>>>
>>>> Either:
>>>>
>>>> 1. One machine (the fast one) is actually using unix sockets and not
>>>> tcp/ip+localhost like you think.
>>>> 2. The networking setup is different on each.
>>>> 3. Something your code is doing with the bytea data is slower on one
>>>> machine than another. I seem to remember that pgadmin used to be quite
>>>> slow at displaying large amounts of data. They did some work on that,
>>>> but it might be that your use-case still suffers from it.
>>>>
>>>> For #1 try the psql test again, but with "-h localhost" and "-h /tmp"
>>>> (or whatever directory your unix socket is in - might be
>>>> /var/run/postgresql or similar too).
>>>
>>> Ok, thanks for the trick now i know where to search, after trying
>>> with -h localhost psql it is slow too in the server from 80,361 with
>>> \timing to 4259,453 using -h localhost. Any ideas what can be the
>>> problem here? i am going to make what you suggest and capture analyze
>>> the traffic, after find the hole i have tried in other debian server
>>> with the same kernel 2.6.26 and i have the same problem (my
>>> development machine it is Arch Linux with 2.6.28).
>>>
>>> Regards,
>>> Miguel Angel.
>>>
>>>> For #2, you can always try timing "psql -h localhost ... >
>>>> /dev/null" on
>>>> both machines. If you capture port 5432 with something like "tcpdump -w
>>>> ip.dump host localhost and port 5432" you can then use wireshark to see
>>>> exactly why it's slow.
>>>>
>>>> For #3, I guess you'd need to reduce your code to just fetching the
>>>> data
>>>> and time that. You may have already done this of course.
>>>>
>>>> HTH
>>>>
>>
>> I have been testing with tcpdump but i dont see the problem in the
>> traffic (aside from the fact that it gives big jumps in ms between
>> packets of data, but i dont know why), i have tested on other debian
>> machines with the same result, i have upgraded kernel to 2.6.28 and
>> postgresql to 8.3.6 (equal versions of my Arch Linux Development
>> machine), but still have the same problem:
>>
>> -query with \timing with "psql -d database" ~110ms
>> -query with \timing with "psql -d database -h localhost" ~4400ms
>>
>> Using tcp the cpu of postgresql spike to the max it can borrow within
>> the query. I have attached the tcpdump logs of a debian machine and
>> the Arch too (maybe anyone can see anything in them that i can not).
>> How i can test pure speed in the loopback interface? i have tried
>> iperf but seems to be cpu bound so maybe the results are misleading.
>>
>
> Okay, i have found the problem, in postgresql.conf the parameter "ssl =
> true" seems to slow the clear tcp connections (not ssl enabled) very
> very much, but this does not affect my arch Linux machine, only debian
> ones so i will contact debian package maintainer so they can investigate
> it, thanks for the help.
>
> Regards,
> Miguel Angel.
>

The difference here between Arch and Debian it is that debian ships with a ssl
cert (in the pgsql-performance mailing list Magnus have pointed me this info) so
the local psql connections are ever encrypted by default if you not change
PGSSLMODE environment variable to "disable",  which gives the difference in the
query, after disable ssl all it is normal other time :).

Regards,
Miguel Angel.


pgsql-general by date:

Previous
From: Linos
Date:
Subject: Re: strange performance problem (SOLVED)
Next
From: Siddharth Shah
Date:
Subject: Shared Buffers