Thread: Memory exhaustion on large query

Memory exhaustion on large query

From
Brice André
Date:
Hello,

I am using postgresql 13, on debian server. I use libpq to interface my DB. On some queries, my program is using so much memory that the application crashes.

I use libpq in asynchronous mode to avoid this problem, but the memory is consumed in the PQsendQueryPrepared call, before I invoke the first time PQgetResult

I copy-pasted the query that causes the trouble here under. My table has an index that should be usable for the query 'including the "ORDER BY" clause:
CREATE INDEX "ConfigurableWindowTableDataBuffer_last_modified_index" ON "ConfigurableWindowTableDataBufferInternal" USING btree ("DbSyncLastModifiedBackupVersion", "DbSyncLastModifiedTimeStamp")

And an EXPLAIN on the query shows that the planner uses this index:
Index Scan using "ConfigurableWindowTableDataBuffer_last_modified_index" on "ConfigurableWindowTableDataBufferInternal"  (cost=0.54..2870.78 rows=14059 width=510)
  Index Cond: (("DbSyncLastModifiedBackupVersion" = "DbSyncGetBackupVersion"()) AND ("DbSyncLastModifiedTimeStamp" > 0))

The content of the table is so huge that, if PQsendQueryPrepared retrieves all data, or if postgresql engine is creating temp file with all data, the query cannot succeed (I have no enough RAM or disk space to copy the whole data). But I was expecting that using PQsendQueryPrepared and PQgetResult would avoid this by returning one result at a time.

Any idea of what is going wrong ? Or on how I could correct my query to avoid this ?

Note that basically, what I want to do is retrieve each record at a time, and send its content on an open TCP connection. I am hoping to be allowed to do that without requiring to have RAM or disk space to temporarily store all data that needs to be sent.

Many thanks,
Brice

SELECT "DbSyncID","DbSyncInsertedBackupVersion","DbSyncInsertedClientUniqueId","DbSyncInsertedClientEntryId","DbSyncDeleted","DbSyncRemovedFromServer","DbSyncLastModifiedBackupVersion","DbSyncLastModifiedTimeStamp","DbSyncFKBackupVersion","DbSyncFKClientUniqueId","DbSyncFKClientEntryId","TableDataID_BV","TableDataID_CID","TableDataID_CEID","TableId","SubTableId","BufferEntries_0" FROM "ConfigurableWindowTableDataBufferSync" WHERE "DbSyncLastModifiedBackupVersion"="DbSyncGetBackupVersion"() AND "DbSyncLastModifiedTimeStamp" > 0 ORDER BY "DbSyncLastModifiedTimeStamp" ASC

Re: Memory exhaustion on large query

From
Erik Brandsberg
Date:
At first glance, the issue is the order by, which will use a temporary table to sort the result.  If you remove this, does the memory issue go away?

On Sun, Dec 12, 2021 at 1:26 PM Brice André <brice@famille-andre.be> wrote:
Hello,

I am using postgresql 13, on debian server. I use libpq to interface my DB. On some queries, my program is using so much memory that the application crashes.

I use libpq in asynchronous mode to avoid this problem, but the memory is consumed in the PQsendQueryPrepared call, before I invoke the first time PQgetResult

I copy-pasted the query that causes the trouble here under. My table has an index that should be usable for the query 'including the "ORDER BY" clause:
CREATE INDEX "ConfigurableWindowTableDataBuffer_last_modified_index" ON "ConfigurableWindowTableDataBufferInternal" USING btree ("DbSyncLastModifiedBackupVersion", "DbSyncLastModifiedTimeStamp")

And an EXPLAIN on the query shows that the planner uses this index:
Index Scan using "ConfigurableWindowTableDataBuffer_last_modified_index" on "ConfigurableWindowTableDataBufferInternal"  (cost=0.54..2870.78 rows=14059 width=510)
  Index Cond: (("DbSyncLastModifiedBackupVersion" = "DbSyncGetBackupVersion"()) AND ("DbSyncLastModifiedTimeStamp" > 0))

The content of the table is so huge that, if PQsendQueryPrepared retrieves all data, or if postgresql engine is creating temp file with all data, the query cannot succeed (I have no enough RAM or disk space to copy the whole data). But I was expecting that using PQsendQueryPrepared and PQgetResult would avoid this by returning one result at a time.

Any idea of what is going wrong ? Or on how I could correct my query to avoid this ?

Note that basically, what I want to do is retrieve each record at a time, and send its content on an open TCP connection. I am hoping to be allowed to do that without requiring to have RAM or disk space to temporarily store all data that needs to be sent.

Many thanks,
Brice

SELECT "DbSyncID","DbSyncInsertedBackupVersion","DbSyncInsertedClientUniqueId","DbSyncInsertedClientEntryId","DbSyncDeleted","DbSyncRemovedFromServer","DbSyncLastModifiedBackupVersion","DbSyncLastModifiedTimeStamp","DbSyncFKBackupVersion","DbSyncFKClientUniqueId","DbSyncFKClientEntryId","TableDataID_BV","TableDataID_CID","TableDataID_CEID","TableId","SubTableId","BufferEntries_0" FROM "ConfigurableWindowTableDataBufferSync" WHERE "DbSyncLastModifiedBackupVersion"="DbSyncGetBackupVersion"() AND "DbSyncLastModifiedTimeStamp" > 0 ORDER BY "DbSyncLastModifiedTimeStamp" ASC

Re: Memory exhaustion on large query

From
Tom Lane
Date:
=?UTF-8?B?QnJpY2UgQW5kcsOp?= <brice@famille-andre.be> writes:
> The content of the table is so huge that, if PQsendQueryPrepared retrieves
> all data, or if postgresql engine is creating temp file with all data, the
> query cannot succeed (I have no enough RAM or disk space to copy the whole
> data). But I was expecting that using PQsendQueryPrepared and PQgetResult
> would avoid this by returning one result at a time.

You're confusing asynchronous mode with single-row mode.  Async mode,
per se, doesn't change memory consumption; it just lets you do something
else while waiting for the query result.  You are (I suppose) missing
a call to PQsetSingleRowMode --- see

https://www.postgresql.org/docs/current/libpq-single-row-mode.html

            regards, tom lane



Re: Memory exhaustion on large query

From
Brice André
Date:
Many thanks for your help.

A call to PQsetSingleRowMode just after PQsendQueryPrepared solved my issue.

Regards,
Brice

Le dim. 12 déc. 2021 à 19:43, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
Brice André <brice@famille-andre.be> writes:
> The content of the table is so huge that, if PQsendQueryPrepared retrieves
> all data, or if postgresql engine is creating temp file with all data, the
> query cannot succeed (I have no enough RAM or disk space to copy the whole
> data). But I was expecting that using PQsendQueryPrepared and PQgetResult
> would avoid this by returning one result at a time.

You're confusing asynchronous mode with single-row mode.  Async mode,
per se, doesn't change memory consumption; it just lets you do something
else while waiting for the query result.  You are (I suppose) missing
a call to PQsetSingleRowMode --- see

https://www.postgresql.org/docs/current/libpq-single-row-mode.html

                        regards, tom lane