Thread: Memory exhaustion on large query
Hello,
I am using postgresql 13, on debian server. I use libpq to interface my DB. On some queries, my program is using so much memory that the application crashes.
I use libpq in asynchronous mode to avoid this problem, but the memory is consumed in the PQsendQueryPrepared call, before I invoke the first time PQgetResult
I copy-pasted the query that causes the trouble here under. My table has an index that should be usable for the query 'including the "ORDER BY" clause:
CREATE INDEX "ConfigurableWindowTableDataBuffer_last_modified_index" ON "ConfigurableWindowTableDataBufferInternal" USING btree ("DbSyncLastModifiedBackupVersion", "DbSyncLastModifiedTimeStamp")
And an EXPLAIN on the query shows that the planner uses this index:
Index Scan using "ConfigurableWindowTableDataBuffer_last_modified_index" on "ConfigurableWindowTableDataBufferInternal" (cost=0.54..2870.78 rows=14059 width=510) |
Index Cond: (("DbSyncLastModifiedBackupVersion" = "DbSyncGetBackupVersion"()) AND ("DbSyncLastModifiedTimeStamp" > 0)) |
The content of the table is so huge that, if PQsendQueryPrepared retrieves all data, or if postgresql engine is creating temp file with all data, the query cannot succeed (I have no enough RAM or disk space to copy the whole data). But I was expecting that using PQsendQueryPrepared and PQgetResult would avoid this by returning one result at a time.
Any idea of what is going wrong ? Or on how I could correct my query to avoid this ?
Note that basically, what I want to do is retrieve each record at a time, and send its content on an open TCP connection. I am hoping to be allowed to do that without requiring to have RAM or disk space to temporarily store all data that needs to be sent.
Many thanks,
Brice
SELECT "DbSyncID","DbSyncInsertedBackupVersion","DbSyncInsertedClientUniqueId","DbSyncInsertedClientEntryId","DbSyncDeleted","DbSyncRemovedFromServer","DbSyncLastModifiedBackupVersion","DbSyncLastModifiedTimeStamp","DbSyncFKBackupVersion","DbSyncFKClientUniqueId","DbSyncFKClientEntryId","TableDataID_BV","TableDataID_CID","TableDataID_CEID","TableId","SubTableId","BufferEntries_0" FROM "ConfigurableWindowTableDataBufferSync" WHERE "DbSyncLastModifiedBackupVersion"="DbSyncGetBackupVersion"() AND "DbSyncLastModifiedTimeStamp" > 0 ORDER BY "DbSyncLastModifiedTimeStamp" ASC
At first glance, the issue is the order by, which will use a temporary table to sort the result. If you remove this, does the memory issue go away?
On Sun, Dec 12, 2021 at 1:26 PM Brice André <brice@famille-andre.be> wrote:
Hello,I am using postgresql 13, on debian server. I use libpq to interface my DB. On some queries, my program is using so much memory that the application crashes.I use libpq in asynchronous mode to avoid this problem, but the memory is consumed in the PQsendQueryPrepared call, before I invoke the first time PQgetResultI copy-pasted the query that causes the trouble here under. My table has an index that should be usable for the query 'including the "ORDER BY" clause:CREATE INDEX "ConfigurableWindowTableDataBuffer_last_modified_index" ON "ConfigurableWindowTableDataBufferInternal" USING btree ("DbSyncLastModifiedBackupVersion", "DbSyncLastModifiedTimeStamp")And an EXPLAIN on the query shows that the planner uses this index:
Index Scan using "ConfigurableWindowTableDataBuffer_last_modified_index" on "ConfigurableWindowTableDataBufferInternal" (cost=0.54..2870.78 rows=14059 width=510) Index Cond: (("DbSyncLastModifiedBackupVersion" = "DbSyncGetBackupVersion"()) AND ("DbSyncLastModifiedTimeStamp" > 0))The content of the table is so huge that, if PQsendQueryPrepared retrieves all data, or if postgresql engine is creating temp file with all data, the query cannot succeed (I have no enough RAM or disk space to copy the whole data). But I was expecting that using PQsendQueryPrepared and PQgetResult would avoid this by returning one result at a time.Any idea of what is going wrong ? Or on how I could correct my query to avoid this ?Note that basically, what I want to do is retrieve each record at a time, and send its content on an open TCP connection. I am hoping to be allowed to do that without requiring to have RAM or disk space to temporarily store all data that needs to be sent.Many thanks,BriceSELECT "DbSyncID","DbSyncInsertedBackupVersion","DbSyncInsertedClientUniqueId","DbSyncInsertedClientEntryId","DbSyncDeleted","DbSyncRemovedFromServer","DbSyncLastModifiedBackupVersion","DbSyncLastModifiedTimeStamp","DbSyncFKBackupVersion","DbSyncFKClientUniqueId","DbSyncFKClientEntryId","TableDataID_BV","TableDataID_CID","TableDataID_CEID","TableId","SubTableId","BufferEntries_0" FROM "ConfigurableWindowTableDataBufferSync" WHERE "DbSyncLastModifiedBackupVersion"="DbSyncGetBackupVersion"() AND "DbSyncLastModifiedTimeStamp" > 0 ORDER BY "DbSyncLastModifiedTimeStamp" ASC
=?UTF-8?B?QnJpY2UgQW5kcsOp?= <brice@famille-andre.be> writes: > The content of the table is so huge that, if PQsendQueryPrepared retrieves > all data, or if postgresql engine is creating temp file with all data, the > query cannot succeed (I have no enough RAM or disk space to copy the whole > data). But I was expecting that using PQsendQueryPrepared and PQgetResult > would avoid this by returning one result at a time. You're confusing asynchronous mode with single-row mode. Async mode, per se, doesn't change memory consumption; it just lets you do something else while waiting for the query result. You are (I suppose) missing a call to PQsetSingleRowMode --- see https://www.postgresql.org/docs/current/libpq-single-row-mode.html regards, tom lane
Many thanks for your help.
A call to PQsetSingleRowMode just after PQsendQueryPrepared solved my issue.
Regards,
Brice
Le dim. 12 déc. 2021 à 19:43, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
Brice André <brice@famille-andre.be> writes:
> The content of the table is so huge that, if PQsendQueryPrepared retrieves
> all data, or if postgresql engine is creating temp file with all data, the
> query cannot succeed (I have no enough RAM or disk space to copy the whole
> data). But I was expecting that using PQsendQueryPrepared and PQgetResult
> would avoid this by returning one result at a time.
You're confusing asynchronous mode with single-row mode. Async mode,
per se, doesn't change memory consumption; it just lets you do something
else while waiting for the query result. You are (I suppose) missing
a call to PQsetSingleRowMode --- see
https://www.postgresql.org/docs/current/libpq-single-row-mode.html
regards, tom lane