Re: ODBC driver for Windows & future... - Mailing list pgsql-odbc
From: markw@mohawksoft.com
Subject: Re: ODBC driver for Windows & future...
Date:
Msg-id: 16619.24.91.171.78.1103639240.squirrel@mail.mohawksoft.com
In response to: Re: ODBC driver for Windows & future... (Marko Ristola <marko.ristola@kolumbus.fi>)
List: pgsql-odbc
> Hi,
>
> I have investigated the ODBC driver behaviour when the query result
> has very many rows.
>
> It seems that the whole query result is stored as such in a memory
> buffer before any further processing. That buffer is reallocated when
> needed: if the buffer is, for example, 50 Mbytes, after reallocation
> its size will be 100 Mbytes. The malloc() or realloc() takes a very
> long time. I investigated this bottleneck on the Linux ODBC driver.

There is actually a setting that makes the ODBC driver use a cursor to
grab only a chunk at a time. It's in the docs somewhere, as I have used
it in the past (see the sketch after this message).

> The procedure without an ODBC cursor is as follows:
> 1. Read all query result data from the backend into the huge buffer
>    (and maybe sometimes restructure the buffer, if some column size
>    in the buffer is exceeded). This seems to be the bottleneck, with
>    the large malloc() operation.
> 2. Read (and convert) the requested results from the buffer for the
>    given row.
>
> Allocating huge buffers is inefficient. The Linux operating system
> handles allocating big files much better than allocating big memory
> areas. It would be more efficient to use a temporary file: sequential
> file scans are rather fast.
>
> One way to solve the problem:
>
> Maybe the key to solving the bottleneck is to tune the operating
> system to free enough memory beforehand: if the operating system has
> 100 Mb of unused memory, it is a lot faster than if it has only 2 Mb
> of unused memory ready for fast memory allocations.
>
> A good way to solve the problem:
>
> The bottleneck can be avoided on the program side by using an ODBC
> cursor. With an ODBC cursor one can fetch, for example, 1000 rows in
> one batch from the database server; you get the next 1000 rows with a
> new fetch. This way there is no limit on the number of rows fetched
> on any database.
>
> On large result sets there is always a memory limit on 32-bit
> systems. On 64-bit systems this limit goes away, but the slowdown on
> non-local CPU memory won't go away even on high-end machines. (NUMA
> machines have about 2 Gbytes of memory near each CPU; other memory is
> behind a slower bus.) So memory allocations over 2 Gb are not good
> for speed.
>
> So, the ODBC cursor scales well for any huge query result, on any
> client operating system. It works even in Java, where memory is
> extremely limited.
>
> Other ways to solve the problem?
> ODBC code: how about allocating memory in 4 Mb chunks? The operating
> system handles small memory allocations more easily and frees more
> memory to be available in the background while the ODBC driver fills
> the allocated chunk.
>
> Marko Ristola
>
> Shachar Shemesh wrote:
>
> Also of interest is that this very same client is also interested in
> the ODBC driver for a different project. We have already done some
> porting of their application, and have spotted a serious performance
> issue with ODBC when long query results are retrieved. It is possible
> (though it would be best not to count on it) that we will do some
> work in that direction on ODBC in the foreseeable future. The reason
> we did not step forward and offer ourselves as full maintainers of
> the code is that we don't feel we have the resources for that. It is
> good to know, however, that the facilities for sending patches and
> having them committed exist.
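For reference, the chunk-at-a-time behaviour is usually switched on
through the psqlODBC UseDeclareFetch and Fetch options. A minimal
odbc.ini sketch (the DSN name, database, and row count are
illustrative; check the driver documentation for the exact keywords in
your version):

    [BigResultDSN]
    Description     = DSN that fetches via a server-side cursor
    Driver          = PostgreSQL
    Servername      = localhost
    Database        = mydb
    Port            = 5432
    # Drive the query through DECLARE/FETCH instead of buffering
    # the whole result set in driver memory.
    UseDeclareFetch = 1
    # Rows pulled from the cursor per round trip.
    Fetch           = 1000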
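On the application side nothing has to change: the driver hides the
batching behind the ordinary fetch loop. A minimal C sketch, untested
and with error handling omitted; the DSN, credentials, and query are
placeholders:

    #include <stdio.h>
    #include <sql.h>
    #include <sqlext.h>

    int main(void)
    {
        SQLHENV  env;
        SQLHDBC  dbc;
        SQLHSTMT stmt;
        SQLCHAR  name[256];
        SQLLEN   ind;

        SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);
        SQLSetEnvAttr(env, SQL_ATTR_ODBC_VERSION,
                      (SQLPOINTER)SQL_OV_ODBC3, 0);
        SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc);
        SQLConnect(dbc, (SQLCHAR *)"BigResultDSN", SQL_NTS,
                   (SQLCHAR *)"user", SQL_NTS,
                   (SQLCHAR *)"secret", SQL_NTS);
        SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt);

        SQLExecDirect(stmt, (SQLCHAR *)"SELECT name FROM big_table",
                      SQL_NTS);

        /* With UseDeclareFetch the driver refills its batch from the
           server-side cursor behind this loop, so memory use stays
           flat no matter how many rows the query returns. */
        while (SQL_SUCCEEDED(SQLFetch(stmt))) {
            SQLGetData(stmt, 1, SQL_C_CHAR, name, sizeof(name), &ind);
            printf("%s\n",
                   ind == SQL_NULL_DATA ? "(null)" : (char *)name);
        }

        SQLFreeHandle(SQL_HANDLE_STMT, stmt);
        SQLDisconnect(dbc);
        SQLFreeHandle(SQL_HANDLE_DBC, dbc);
        SQLFreeHandle(SQL_HANDLE_ENV, env);
        return 0;
    }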
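As for Marko's 4 Mb-chunk idea, here is a rough sketch of what that
could look like: a linked list of fixed-size blocks, so growing the
store allocates one new block instead of realloc()ing (and possibly
copying) the whole buffer. Everything below is hypothetical, not from
the driver source:

    #include <stdlib.h>
    #include <string.h>

    #define CHUNK_SIZE (4 * 1024 * 1024)    /* 4 Mb per block */

    /* Hypothetical tuple store: a chain of 4 Mb blocks. Growth costs
       one malloc(); data already stored is never moved or recopied,
       unlike a realloc() that doubles a 50 Mb buffer to 100 Mb. */
    struct chunk {
        struct chunk *next;
        size_t        used;
        char          data[CHUNK_SIZE];
    };

    static struct chunk *chunk_new(void)
    {
        struct chunk *c = malloc(sizeof(*c));
        if (c) { c->next = NULL; c->used = 0; }
        return c;
    }

    /* Append one row (assumed smaller than CHUNK_SIZE), starting a
       new block when the current one is full. Returns the new tail,
       or NULL on out-of-memory. */
    static struct chunk *chunk_append(struct chunk *tail,
                                      const void *row, size_t len)
    {
        if (tail->used + len > CHUNK_SIZE) {
            if ((tail->next = chunk_new()) == NULL)
                return NULL;
            tail = tail->next;
        }
        memcpy(tail->data + tail->used, row, len);
        tail->used += len;
        return tail;
    }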