Thread: SPI_cursor_fetch Memory overrun
Hi all,
I’m creating a C extension to write data from the Postgres to a text file . Since the table can be very large, I only want to process one chunk of data per time to keep the memory stable. To achieve this, I use SPI cursor(https://www.postgresql.org/docs/current/static/spi-spi-cursor-fetch.html) to fetch row chunks from the table. Following is my program:
char* command;
uint64 proc = 1;
// Prepare a statement without executing it yet
SPI_connect();
command = psprintf("SELECT * FROM %s", tableName);
SPIPlanPtr SPIplan = SPI_prepare_cursor(command, 0, NULL, 0);
pfree(command);
// Set up a cursor using a statement created with SPI_prepare_cursor
Portal cursor= SPI_cursor_open(NULL, SPIplan, NULL, NULL, true);
if (SPIplan == NULL){
elog(ERROR,"couldn't create cursor via SPI");
}
// Writing to local Desktop
clock_t begin = clock();
int count = 0;
FILE *fp = fopen("/home/ubuntu/test.txt", "w");
if(fp == NULL){
return psprintf("Error when open file");
}
while(proc > 0){
// Fetch 200000 rows from a cursor
SPI_cursor_fetch(cursor, true, 200000);
proc = SPI_processed; // number of row returned
if (proc != 0){
TupleDesc tupdesc = SPI_tuptable->tupdesc;
uint64 j;
for (j = 0; j < proc; j++){
HeapTuple tuple = tuptable->vals[j];
int i;
count++;
for (i = 1; i <= tupdesc->natts; i++){ //natts -> number of columns
fprintf(fp, (i == tupdesc->natts) ? "%s\n" : "%s," , SPI_getvalue(tuple, tupdesc, i));
}
}
}else{
break;
}
SPI_freetuptable(SPI_tuptable);
elog(INFO, “freed tuptable”);
sleep(5);
}
From my understanding, cursor points at the entire result rows on heap. After fetching certain number of rows, SPI_tuptable points to the allocated row set. After finishing writing the row set to text file, I freed the memory of it by calling SPI_freetuptable( ) before next fetch. I expected the memory to stay around a constant value through out the program, However, the actual memory kept increasing when I run the program. Also, I only observed memory drop after SPI_freetuptable() of first chunk. After that memory kept going up even SPI_freetuptable() is executed. Any clue of why is it happening?
Thanks in advance!
Best,
Ivy
Wu Ivy <ivywuyzl@gmail.com> writes: > fprintf(fp, (i == tupdesc->natts) ? "%s\n" : "%s," , SPI_getvalue(tuple, tupdesc, i)); That (specifically the SPI_getvalue call) is what's leaking memory. You could improve matters by pfree'ing the result string after each such call. Just to add insult to injury, it's also looking up the column datatype's output function afresh on every call. There could perhaps be some leakage involved in those lookups too, though I'd bet the main problem is the result strings. regards, tom lane