Re: Best way to reduce server rounds getting big BLOBs - Mailing list pgsql-general
From | Jorge Arévalo
Subject | Re: Best way to reduce server rounds getting big BLOBs
Date |
Msg-id | B94F2D0562D94FFBA94470CD13FFD3EB@libregis.org
In response to | Re: Best way to reduce server rounds getting big BLOBs (Merlin Moncure <mmoncure@gmail.com>)
List | pgsql-general
Hello,

First of all, sorry for the duplicate. I accidentally sent the same message twice, and I got responses to both of them. I include here the person who responded to the other message.

On Wednesday, May 15, 2013 at 20:43, Merlin Moncure wrote:

> On Wed, May 15, 2013 at 11:31 AM, Jorge Arévalo
> <jorgearevalo@libregis.org> wrote:
> > Hello,
> >
> > I'd like to know what's the best way to reduce the number of server rounds in a libpq C app that fetches BLOBs from a remote PostgreSQL server.
> >
> > About 75% of the time my app uses is spent querying the database. I basically get binary objects (images). I have to fetch all the images from a table. This table can be really big (in number of rows), and each image can be big too.
>
> #1 thing to make sure of when getting big blobs is that you are
> fetching data in binary. If you are not, do so before changing
> anything else (I wrote a library to help do that, libpqtypes).

Yes, I get data in binary. But thanks for the reference to the library.

> > I guess I should go for cursors. If I understood the concept of "cursor", basically the query is executed, a ResultSet is generated inside the database server, and the client receives a "pointer" to this ResultSet. You can get all the rows by moving this pointer over the ResultSet, calling the right functions. But you still have to go to the database for each chunk of data. Am I right?
>
> cursors are a way to page through a query result without fetching all
> the data at once. this would be most useful if you are processing one
> row at a time on the client side. but if the client needs all the
> data held in memory, cursors will only help in terms of reducing the
> temporary memory demands while doing the transfer. So it's hard to
> say if it's worth using them until you describe the client side
> requirements a little better.

OK, more context here. The images stored in the database are really PostGIS Raster images. So, they are stored in this format [1], and read in this other one [2], as binary objects (you can get a string representation of the format too, but it is slower).

My point is to make the GDAL PostGIS Raster driver [3] faster, and my approach is: instead of one server round per each user's data request, try to get the whole raster data with the fewest number of queries.

The GDAL library raises a read operation for each chunk of data, in the case of tiled raster formats. PostGIS Raster is a (special) tiled raster format, but one read operation per data chunk means one query (= one trip to the server), and that's expensive. Even the simplest app (transforming one raster table into a GeoTIFF file) spends 75% of its time in database queries, because of this architecture.

Plus, each data read request requires an intersection between the desired area and the data stored in the database (using ST_Intersects [4]). An intersection in the database is more expensive than one in memory.

So, my goal is to save server rounds. My thought was: using a cursor, I still have to go to the database, but my data is stored in a ResultSet, and I just need to seek over it, instead of raising expensive queries. Am I wrong? Is there any better approach?

Eduardo, about your response, many thanks. I think blob data compression doesn't apply here. And out-db storage is an option of the PostGIS Raster format, but it's not mandatory.

[1] http://trac.osgeo.org/postgis/browser/trunk/raster/doc/RFC1-SerializedFormat
[2] http://trac.osgeo.org/postgis/browser/trunk/raster/doc/RFC2-WellKnownBinaryFormat
[3] http://trac.osgeo.org/gdal/wiki/frmts_wtkraster.html
[4] http://postgis.net/docs/manual-2.0/ST_Intersects.html

Best regards,
Jorge

> merlin
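For reference, the batching idea discussed above can be sketched in libpq C roughly as follows: declare a binary cursor once, then pull tiles in large FETCH batches, so each round trip carries many rows instead of one. This is only a sketch, not the driver's actual code; the table and column names ("rasters", "rid", "rast"), the connection string, and the batch size of 256 are assumptions for illustration, and it cannot run without a live PostgreSQL server.

```c
/*
 * Sketch: batching tile reads through a cursor to cut round trips.
 * Assumed for illustration: table rasters(rid, rast), DSN "dbname=gisdb".
 * Build (paths vary): cc fetch_tiles.c -o fetch_tiles -lpq
 */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

static void check(PGconn *conn, PGresult *res, ExecStatusType expected)
{
    if (PQresultStatus(res) != expected) {
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
        PQclear(res);
        PQfinish(conn);
        exit(1);
    }
}

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=gisdb");  /* assumed DSN */
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    PGresult *res = PQexec(conn, "BEGIN");  /* cursors need a transaction */
    check(conn, res, PGRES_COMMAND_OK);
    PQclear(res);

    /* One round trip to set up the cursor; BINARY makes every FETCH
     * return binary-format tuples, no text conversion. */
    res = PQexec(conn,
                 "DECLARE tiles BINARY CURSOR FOR "
                 "SELECT rid, rast FROM rasters");
    check(conn, res, PGRES_COMMAND_OK);
    PQclear(res);

    /* Each FETCH is still one round trip, but it carries up to 256
     * tiles instead of one, so the trip count drops by that factor. */
    for (;;) {
        res = PQexec(conn, "FETCH 256 FROM tiles");
        check(conn, res, PGRES_TUPLES_OK);
        int rows = PQntuples(res);
        if (rows == 0) { PQclear(res); break; }  /* cursor exhausted */
        for (int i = 0; i < rows; i++) {
            const char *tile = PQgetvalue(res, i, 1);  /* raw bytes */
            int len = PQgetlength(res, i, 1);
            /* ...decode the serialized raster tile from tile[0..len)... */
            (void)tile; (void)len;
        }
        PQclear(res);
    }

    res = PQexec(conn, "CLOSE tiles");
    check(conn, res, PGRES_COMMAND_OK);
    PQclear(res);
    res = PQexec(conn, "COMMIT");
    check(conn, res, PGRES_COMMAND_OK);
    PQclear(res);
    PQfinish(conn);
    return 0;
}
```

Note this doesn't eliminate round trips, it amortizes them: for N tiles you pay roughly N/256 trips instead of N, while keeping client memory bounded by one batch. It also does not avoid the per-request ST_Intersects cost, which is a separate question.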