Re: Best way to reduce server rounds getting big BLOBs - Mailing list pgsql-general

From Jorge Arévalo
Subject Re: Best way to reduce server rounds getting big BLOBs
Msg-id B94F2D0562D94FFBA94470CD13FFD3EB@libregis.org
In response to Re: Best way to reduce server rounds getting big BLOBs  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-general
Hello,

First of all, sorry for the duplicate. I accidentally sent the same message twice, and I got responses to both of them.
I am including here the person who responded to the other message.


On Wednesday, May 15, 2013 at 20:43, Merlin Moncure wrote:

> On Wed, May 15, 2013 at 11:31 AM, Jorge Arévalo
> <jorgearevalo@libregis.org> wrote:
> > Hello,
> >
> > I'd like to know what's the best way to reduce the number of server rounds in a libpq C app that fetches BLOBs
> > from a remote PostgreSQL server.
> >
> > About 75% of my app's time is spent querying the database. I basically get binary objects (images). I have to
> > fetch all the images from a table. This table can be really big (in number of rows) and each image can be big too.
>
> #1 thing to make sure of when getting big blobs is that you are
> fetching data in binary. If you are not, do so before changing
> anything else (I wrote a library to help do that, libpqtypes).

Yes, I get data in binary. But thanks for the reference to the library.

>
> > I guess I should go for cursors. If I understood the concept of "cursor", basically the query is executed, a
> > ResultSet is generated inside the database server, and the client receives a "pointer" to this ResultSet. You can
> > get all the rows by moving this pointer over the ResultSet, calling the right functions. But you still have to go
> > to the database for each chunk of data. Am I right?
>
> cursors are a way to page through a query result without fetching all
> the data at once. this would be most useful if you are processing one
> row at a time on the client side. but if the client needs all the
> data held in memory, cursors will only help in terms of reducing the
> temporary memory demands while doing the transfer. So it's hard to
> say if it's worth using them until you describe the client side
> requirements a little better.
>
Ok, more context here. The images stored in the database are actually PostGIS Raster images. So, they are stored in
this format [1] and are read in this other one [2], as binary objects (you can get a string representation of the
format too, but it is slower).

My aim is to make the GDAL PostGIS Raster driver [3] faster, and my approach is: "instead of one server round per
user data request, try to get the whole raster data with the fewest possible queries".

For tiled raster formats, the GDAL library issues one read operation per chunk of data. PostGIS Raster is a
(special) tiled raster format, but one read operation per data chunk means one query (= one trip to the server), and
that's expensive. Even the simplest app (transforming one raster table into a GeoTIFF file) spends 75% of its time in
database queries because of this architecture.

Plus, each data read request requires an intersection between the desired area and the data stored in the database
(using ST_Intersects [4]). An intersection in the database is more expensive than one in memory.
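
To illustrate the in-memory alternative, here is a minimal C sketch. It assumes the tile extents have already been
fetched and cached on the client, and it only compares axis-aligned bounding boxes, which is a simplification of what
ST_Intersects does (no rotation, no nodata handling). The BBox type and the bbox_intersects and find_tiles names are
hypothetical; they are not part of GDAL or PostGIS.

```c
#include <stdbool.h>
#include <stddef.h>

/* Axis-aligned extent of a cached tile or of a requested area (hypothetical type). */
typedef struct {
    double min_x, min_y, max_x, max_y;
} BBox;

/* True when the two boxes share at least one point (touching edges count). */
bool bbox_intersects(const BBox *a, const BBox *b)
{
    return a->min_x <= b->max_x && b->min_x <= a->max_x &&
           a->min_y <= b->max_y && b->min_y <= a->max_y;
}

/*
 * Scan the cached tile extents and store the indexes of the tiles that
 * overlap `area` in `out` (at most `out_cap` of them).
 * Returns the total number of overlapping tiles, which may exceed out_cap.
 */
size_t find_tiles(const BBox *tiles, size_t n_tiles, const BBox *area,
                  size_t *out, size_t out_cap)
{
    size_t found = 0;
    for (size_t i = 0; i < n_tiles; i++) {
        if (bbox_intersects(&tiles[i], area)) {
            if (found < out_cap)
                out[found] = i;
            found++;
        }
    }
    return found;
}
```

The linear scan is only meant to show the idea; with many tiles, a regular grid lookup or a spatial index on the
client would probably be a better fit.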

So, my goal is to save server rounds. My thought was: using a cursor, I still have to go to the database, but my data
is stored in a ResultSet, and I just need to seek over it instead of issuing expensive queries. Am I wrong? Is there
any better approach?
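
To make the cost of the cursor approach concrete, here is a small C sketch of the arithmetic. It assumes the client
drains the cursor with repeated FETCH FORWARD commands of a fixed batch size and stops at the first FETCH that returns
fewer rows than requested. The function names and the cursor name used with them are hypothetical, and the count
ignores the extra commands for DECLARE and CLOSE.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Number of FETCH commands (one server round each) needed to drain a cursor
 * holding total_rows rows when every FETCH asks for batch_rows rows.
 * The final FETCH is the one that comes back short (possibly empty).
 * Returns 0 for an invalid batch size of 0.
 */
size_t cursor_fetch_count(size_t total_rows, size_t batch_rows)
{
    if (batch_rows == 0)
        return 0;
    return total_rows / batch_rows + 1;
}

/*
 * Write "FETCH FORWARD <batch_rows> FROM <cursor_name>" into buf.
 * Returns 0 on success, -1 if the command does not fit in buf.
 * cursor_name is assumed to be a trusted identifier, not user input.
 */
int build_fetch_command(char *buf, size_t buf_len,
                        const char *cursor_name, size_t batch_rows)
{
    int n = snprintf(buf, buf_len, "FETCH FORWARD %zu FROM %s",
                     batch_rows, cursor_name);
    if (n < 0 || (size_t)n >= buf_len)
        return -1;
    return 0;
}
```

Under these assumptions, a table of 1000 tiles read 100 rows at a time costs 11 FETCH rounds, while reading it one row
at a time costs 1001, so the batch size matters more than the cursor itself.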

Eduardo, many thanks for your response. I think BLOB data compression doesn't apply here. And out-db storage is an
option of the PostGIS Raster format, but it's not mandatory.


[1] http://trac.osgeo.org/postgis/browser/trunk/raster/doc/RFC1-SerializedFormat
[2] http://trac.osgeo.org/postgis/browser/trunk/raster/doc/RFC2-WellKnownBinaryFormat
[3] http://trac.osgeo.org/gdal/wiki/frmts_wtkraster.html
[4] http://postgis.net/docs/manual-2.0/ST_Intersects.html


Best regards,
Jorge


> merlin



