Thread: Slow query through ODBC

From
Arnaud Lesauvage
Date:
Hi all!

I have a query that runs very fast when executed through pgAdmin
(~2 seconds), but very slowly through the ODBC driver (~30
seconds). Both queries are executed from the same workstation on
the same server.

My workstation is a WinXP Pro, and I use the latest psqlODBC
driver available as an MSI package (version 8.01.0102).
The server is a PostgreSQL 8.1.2, running on a Win2000 server.

The query is very simple: it returns about 1500 rows made of
only one text field, but this field can be very large (it is the
WKT representation of a spatial object from PostGIS).

I am connecting to the server from a VBA project with a simple
connection string using only default options (I have not found a
document explaining how to fine-tune the ODBC driver).
My connection string is:
"DRIVER={PostgreSQL
Unicode};SERVER=myserver;PORT=5432;DATABASE=mydatabase;UID=myuser;PWD=mypass"

I assume that the problem is with the very large field size that
is returned, but what should I change in my connection string to
make this query run faster?

Thanks for your help!

--
Arnaud



Re: Slow query through ODBC

From
Ludek Finstrle
Date:
> My workstation is a WinXP Pro, and I use the latest psqlODBC
> driver available as an MSI package (version 8.01.0102).

Are you sure you are getting all of the data back correctly? There
is a bug in the psqlodbc driver when reading data from a text field larger
than roughly 2048 bytes (in the Unicode driver). The problem was fixed
in the 08.01.0106 development snapshot.

> The server is a PostgreSQL 8.1.2, running on a Win2000 server.
>
> The query is very simple: it returns about 1500 rows made of

Are you sure that pgAdmin returns all the rows? (I'm not.) Maybe Dave
can give us the answer.

> only one text field, but this field can be very large (it is the
> WKT representation of a spatial object from PostGIS).
>
> I am connecting to the server from a VBA project with a simple
> connection string using only default options (I have not found a
> document explaining how to fine-tune the ODBC driver).
> My connection string is:
> "DRIVER={PostgreSQL
> Unicode};SERVER=myserver;PORT=5432;DATABASE=mydatabase;UID=myuser;PWD=mypass"
>
> I assume that the problem is with the very large field size that
> is returned, but what should I change in my connection string to
> make this query run faster?

I don't know exactly. Could you experiment with parameters
such as Use Declare/Fetch or Server side prepare?
It would also help if you could send me the mylog output (enable mylog
output under the Global and DataSource buttons).

I hope the data isn't confidential (it is included in the mylog output).

Regards,

Luf

Re: Slow query through ODBC

From
Arnaud Lesauvage
Date:
Ludek Finstrle wrote:
>> My workstation is a WinXP Pro, and I use the latest psqlODBC
>> driver available as an MSI package (version 8.01.0102).
>
> Are you sure you are getting all of the data back correctly? There
> is a bug in the psqlodbc driver when reading data from a text field larger
> than roughly 2048 bytes (in the Unicode driver). The problem was fixed
> in the 08.01.0106 development snapshot.

No, the data looks good. I am drawing the spatial objects with the
query results, and my code would throw an error if some data was
incorrect.

>> The server is a PostgreSQL 8.1.2, running on a Win2000 server.
>> The query is very simple: it returns about 1500 rows made of
>
> Are you sure that pgAdmin returns all the rows? (I'm not.) Maybe Dave
> can give us the answer.

Yes. Actually, pgAdmin asks me if I want all the rows (the default
is to return 100 rows). I answer yes, and the result is almost
immediate. If I ask for output to a text file, the query also runs
in ~2 seconds.

>> only one text field, but this field can be very large (it is the
>> WKT representation of a spatial object from PostGIS).
>>
>> I am connecting to the server from a VBA project with a simple
>> connection string using only default options (I have not found a
>> document explaining how to fine-tune the ODBC driver).
>> My connection string is:
>> "DRIVER={PostgreSQL
>> Unicode};SERVER=myserver;PORT=5432;DATABASE=mydatabase;UID=myuser;PWD=mypass"
>>
>> I assume that the problem is with the very large field size that
>> is returned, but what should I change in my connection string to
>> make this query run faster?
>
> I don't know exactly. Could you experiment with parameters
> such as Use Declare/Fetch or Server side prepare?
> It would also help if you could send me the mylog output (enable mylog
> output under the Global and DataSource buttons).

I will try these parameters, but how do I add them to the
connection string?
Should I just append something like ";UseDeclareFetch=1" at the
end of it? I can't find any documentation on this (maybe I am not
looking in the right place...).

> I hope the data isn't confidental (they are listed in mylog output).

Unfortunately, the data is confidential.
If we can't find a simple solution, could I just remove the data
from the log file?


Thanks for helping me on this!


--
Arnaud


Re: Slow query through ODBC

From
Ludek Finstrle
Date:
> >Are you sure you are getting all of the data back correctly? There
> >is a bug in the psqlodbc driver when reading data from a text field larger
> >than roughly 2048 bytes (in the Unicode driver). The problem was fixed
> >in the 08.01.0106 development snapshot.
>
> No, the data looks good. I am drawing the spatial objects with the
> query results, and my code would throw an error if some data was
> incorrect.

OK, let me describe the problem better. The problem was that the last
part of each row returned from psqlodbc was random data from memory.
If the data for one column (and row) was larger than N bytes (some
users hit the problem with N = 2048 bytes), the bug was triggered:
roughly the last 1..N bytes of each row were filled with random
data from memory.
You may not have run into the problem if the data is fetched in one
call to SQLGetData. I only wanted to mention it; it could help you in
the future ;-)

> >I don't know exactly. Could you experiment with parameters
> >such as Use Declare/Fetch or Server side prepare?
> >It would also help if you could send me the mylog output (enable mylog
> >output under the Global and DataSource buttons).
>
> I will try these parameters, but how do I add them to the
> connection string?
> Should I just append something like ";UseDeclareFetch=1" at the
> end of it? I can't find any documentation on this (maybe I am not
> looking in the right place...).

I'm sorry, I don't know exactly (I use the ODBC manager).
Maybe you could configure it through the ODBC manager...

I searched the source code, and it should be (full name and
shorthand):
;UseDeclareFetch=1
;B6=1

;UseServerSidePrepare=1
;C8=1

I believe it is case insensitive, but I don't know for sure.
Don't use both the full name and the shorthand at the same time; you
would only be wasting your time.
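For illustration, appending the first option to the connection string from the original message would give something like this (a sketch only, assuming the driver accepts the keyword verbatim in the connection string):

```
DRIVER={PostgreSQL Unicode};SERVER=myserver;PORT=5432;DATABASE=mydatabase;UID=myuser;PWD=mypass;UseDeclareFetch=1
```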

> >I hope the data isn't confidential (it is included in the mylog output).
>
> Unfortunately, the data is confidential.
> If we can't find a simple solution, could I just remove the data
> from the log file?

You could remove the data from the log (just keep a note there that
the data was fetched; the data length would be useful too).
You want to search for CC_mapping and PGAPI_GetData in the log.
There is at least one line per row. It might be easier to
replace your data with 'x' or something like that.

If I can't find the problem in the mylog output, could you create
example data?

Regards,

Luf

Re: Slow query through ODBC

From
Arnaud Lesauvage
Date:
Ludek Finstrle wrote:
>> >Are you sure you are getting all of the data back correctly? There
>> >is a bug in the psqlodbc driver when reading data from a text field larger
>> >than roughly 2048 bytes (in the Unicode driver). The problem was fixed
>> >in the 08.01.0106 development snapshot.
>>
>> No, the data looks good. I am drawing the spatial objects with the
>> query results, and my code would throw an error if some data was
>> incorrect.
>
> OK, let me describe the problem better. The problem was that the last
> part of each row returned from psqlodbc was random data from memory.
> If the data for one column (and row) was larger than N bytes (some
> users hit the problem with N = 2048 bytes), the bug was triggered:
> roughly the last 1..N bytes of each row were filled with random
> data from memory.
> You may not have run into the problem if the data is fetched in one
> call of SQLGetData. I only wanted to mention it; it could help you in
> the future ;-)

Good point, Ludek. I double-checked my data, and it is definitely
good in the output.
FYI, the largest field returned is 23581 characters long. The
dataset is ~1.6MB (so an average of about 1kB per row).

>> I will try these parameters, but how do I add them to the
>> connection string?
>> Should I just append something like ";UseDeclareFetch=1" at the
>> end of it? I can't find any documentation on this (maybe I am not
>> looking in the right place...).
>
> I'm sorry, I don't know exactly (I use the ODBC manager).
> Maybe you could configure it through the ODBC manager...
>
> I searched the source code, and it should be (full name and
> shorthand):
> ;UseDeclareFetch=1
> ;B6=1
>
> ;UseServerSidePrepare=1
> ;C8=1
>
> I believe it is case insensitive, but I don't know for sure.
> Don't use both the full name and the shorthand at the same time.

I tried both parameters (they seem to be accepted, since ADO did not
throw an error), but they did not help.
Maybe I should add that the query is stored in an ADO recordset
which is read-only and forward-only?
It is the .Open method of this recordset that actually takes 30
seconds to run.

> You could remove the data from the log (just keep a note there that
> the data was fetched; the data length would be useful too).
> You want to search for CC_mapping and PGAPI_GetData in the log.
> There is at least one line per row. It might be easier to
> replace your data with 'x' or something like that.
>
> If I can't find the problem in the mylog output, could you create
> example data?

I certainly could!
I'll try some other parameters first (I will use the File DSN
panel to generate connection strings; I should have thought of
that before!), and I'll let you know whether I find something or not.


--
Arnaud


Re: Slow query through ODBC

From
Ludek Finstrle
Date:
> I tried both parameters (they seem to be accepted, since ADO did not
> throw an error), but they did not help.

I thought so, but it was worth testing first :-)

> Maybe I should add that the query is stored in an ADO recordset
> which is read-only and forward-only?

This is the best case for the psqlODBC driver.

> It is the .Open method of this recordset that actually takes 30
> seconds to run.

ADO's .Open issues a lot of SQL* ODBC calls internally.
I need the mylog output or (better still) example data to reproduce it
here.

> >If I can't find the problem in the mylog output, could you create
> >example data?
>
> I certainly could!

Great.

> I'll try some other parameters first (I will use the File DSN
> panel to generate connection strings; I should have thought of
> that before!), and I'll let you know whether I find something or not.

OK, I'll be looking for your note.
If you can't find a way, please send me the cleaned log or
example data with the SELECT.

Regards,

Luf

Re: Slow query through ODBC

From
Ludek Finstrle
Date:
Tue, Jan 24, 2006 at 04:47:04PM +0100, Arnaud Lesauvage wrote:
> I went on with my investigations.
>
> I now believe that the problem lies in the network dialogue
> between my client and my server.
> If I transfer this table to my workstation (I have a local
> PostgreSQL installation too), the query runs blindingly fast (less
> than 1 sec).
> I don't understand what the problem could be though.
>
> This thread in the mailing list archives looks like the problem
> mentioned there is the same as mine, but they don't really mention a fix:
> http://archives.postgresql.org/pgsql-performance/2005-06/msg00593.php
>
> My server is on the LAN, so bandwidth cannot be a problem.
> Furthermore, I just migrated from a MySQL 5.0 server. The exact
> same query on that MySQL server, through MyODBC, also took less than
> 1 second, so my network is not the problem.
>
> Are there specific client/server protocol tuning parameters?

I don't know of anything like that.

> Have you ever heard of such issues?

No, I haven't heard of anything like that with 08.01 yet.

Regards,

Luf

P.S. I'm replying mainly to Cc: pgsql-odbc - more heads, more knowledge ;-)

Re: Slow query through ODBC

From
"Dave Page"
Date:

> -----Original Message-----
> From: Ludek Finstrle [mailto:luf@pzkagis.cz]
> Sent: 24 January 2006 22:35
> To: Arnaud Lesauvage
> Cc: Dave Page
> Subject: Re: [ODBC] Slow query through ODBC
>
> It seems to me I have to kick Dave into taking a look at this :-)
> Dave is the pgAdmin developer, so he could shed some light on the problem.
>
> Dave: Could you take a look at this problem?

Arnaud,

When you are testing, please make sure that the psqlODBC's mylog and
commlog are switched off, and that SQL Tracing is turned off in the ODBC
control panel applet.

Also, please note that pgAdmin has very lightweight data access classes
written over libpq - much more lightweight than the ODBC driver. Most of
the overhead in pgAdmin comes from rendering the data in the grid
control.

Regards, Dave.

Re: Slow query through ODBC

From
Arnaud Lesauvage
Date:
Dave Page wrote:
>> From: Ludek Finstrle [mailto:luf@pzkagis.cz]
>> It seems to me I have to kick Dave into taking a look at this :-)
>> Dave is the pgAdmin developer, so he could shed some light on the problem.
>>
>> Dave: Could you take a look at this problem?
>
> Arnaud,
>
> When you are testing, please make sure that the psqlODBC's mylog and
> commlog are switched off, and that SQL Tracing is turned off in the ODBC
> control panel applet.
>
> Also, please note that pgAdmin has very lightweight data access classes
> written over libpq - much more lightweight than the ODBC driver. Most of
> the overhead in pgAdmin comes from rendering the data in the grid
> control.
>
> Regards, Dave.

Dave,

All logging is off when I run tests. I only turned it on at
Ludek's request for debugging purposes. Tracing is always off in
the ODBC panel.

What looks strange to me is that while the query is running
(during the .Open method of the recordset), processor usage is 0%
on both the server and the workstation. Network usage is very low:
~50kB on a 100Mbps connection during the entire query.

In pgAdmin, the query is immediate and there is just a spike in the
network traffic when the data is received. That's what I was
expecting to see through psqlODBC as well...

Regards
--
Arnaud


Re: Slow query through ODBC

From
Arnaud Lesauvage
Date:
Hi list!

I have some news about this problem!
Just to remind you of the facts:
A query runs very slowly when executed through the psqlODBC
driver, but very fast when executed through psql or pgAdmin.
The query returns very long rows containing only one field, but
this field can be as long as 40,000 characters (it is a text field).

I posted a TCP/IP dump of the dialogue between my workstation and
the PostgreSQL server to a TCP/IP newsgroup.
I fixed the problem by changing a parameter in the TCP/IP
configuration of my workstation (not of the server):
I set the TcpAckFrequency registry key to 1, as suggested by someone
on the comp.protocols.tcp-ip newsgroup. The procedure is described in:
http://support.microsoft.com/?scid=kb%3Ben-us%3B328890&x=15&y=10
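For reference, the change from that KB article amounts to a per-interface registry value (the interface GUID below is a placeholder for the GUID of the actual network adapter, and a reboot may be required):

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{your-interface-GUID}]
"TcpAckFrequency"=dword:00000001
```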

*But* the interesting thing is that some people suggested that the
problem lay in the way the server and the client talk to each
other.

Let me first post the explanation from Bill Meier on the
Ethereal-users mailing list, then the one from Chris Marget on the
comp.protocols.tcp-ip newsgroup.

Bill Meier wrote:
> The short answer:
>
> "TCP_NODELAY" should be enabled on the TCP connection to
> the database on both the client and server side of the connection.
> (This is also known as "disabling the Nagle Algorithm").
> (Based upon your capture I can almost guarantee that TCP_NODELAY is
> *not* enabled on your Database TCP connection on the server side).
>
> The use of TCP_NODELAY is a database option for a different database server with
> which I'm familiar. I suggest you consult your DBA with respect to PostgreSQL.
> (Also Google "database tcp_nodelay" & etc for information).
>
> The longer answer:
>
> Delays may be observed in client/server query/response types
> of applications over a TCP connection with the Nagle Algorithm enabled.
>
> The presence or absence of a delay is dependent upon the length of
> the query and/or the query response; this causes much confusion when trying
> to analyze delay problems because the delays will seem to "come and go"
> depending upon the exact length of the query and/or response.
>
> There is also much confusion as to the exact nature of the problem.
>
> I believe the current edition of Stevens' "TCPIP Illustrated" gives a pretty clear
> explanation of the issue.
>
>
> The details in your case:
>
> I believe that due to the Nagle Algorithm, for the specific query response
> shown in the capture, each time your server sends a "short" packet,
> the server waits for an ACK from the client before sending the next packet.
> In your case, the server application is sending the response in chunks of
> 8192 data bytes, which results in a "short" packet every 6 packets.
>
> (I would suspect that a capture of a query/response from one of the
> workstations on which the query runs "very fast" may show that the query response
> is of a different length).


Chris Marget wrote:
>> > The developer of the psqlodbc driver told me that he had used the
>> > same library that is used in pgAdmin (libpq.dll) to talk to the
>> > server !
>
> Are these two applications hitting different socket code on the server
> end?
>
> I have an area for you to explore:  There's a method of socket
> programming in windows called "io completion ports" (or somesuch).
> Rather than streaming data into the socket UNIX style, the application
> code allocates a buffer somewhere, fills it and then tells the stack:
> "The buffer is here, please send it, then let me know when I can have
> the buffer back."
>
> Some see an advantage to this type of coding because it eliminates a
> copy of data from the application memory area into stack memory area.
>
> The problem comes from the fact that the buffer cannot be reused by the
> application until the receiving TCP has ACKed all the data.
>
> Here's what I think is happening:
>
> Your server is using one of these zero-copy stack methods.  It needs to
> send 44078 bytes to the client.  The buffer I mentioned is only 8KB, so
> it must be reused 6 times in order to send all 44078 bytes.
>
> So, 8KB is copied in and the stack is asked to send.
> Bad luck causes it to take ~200ms to receive ACKs for all that data
> because of your stack tuning and timing issues.
>
> The buffer becomes available again and another 8KB is copied in.
> ~200ms again.
>
> This cycle repeats 6 times until all 44078 bytes are sent and ACKed.
> Nearly 1 second has elapsed.
>
> A smarter application on the server would have used a bigger buffer.
>
> An even smarter application would have used many buffers.  8KB buffers
> are okay if you fill buffer A, notify the stack, fill buffer B, notify
> the stack, etc...
>
> Something else entirely may be going on, but this is my guess.


Both of these explanations are far beyond my knowledge!
I am quite confident, though, that the developers of libpq.dll
will understand what this is all about.

Was this information of any help to you?
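For anyone curious, the option Bill Meier mentions is just a per-socket flag. A minimal Python sketch of setting it (purely illustrative; in practice this would have to be done inside libpq or the server on their own sockets, not in application code):

```python
import socket

# Illustrative only: disable the Nagle algorithm (TCP_NODELAY) on a
# freshly created TCP socket, as suggested in Bill Meier's explanation.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Reading the option back returns a non-zero value once Nagle is disabled.
print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
s.close()
```

With Nagle disabled, small writes are sent immediately instead of being held back waiting for outstanding ACKs, which is exactly the interaction with delayed ACKs described above.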

Best regards,
--
Arnaud