Thread: SPI_cursor_fetch Memory overrun

SPI_cursor_fetch Memory overrun

From
Wu Ivy
Date:
Hi all,

I’m creating a C extension to write data from the Postgres  to a text file . Since the table can be very large, I only want to process one chunk of data per time to keep the memory stable. To achieve this, I use SPI  cursor(https://www.postgresql.org/docs/current/static/spi-spi-cursor-fetch.html) to fetch row chunks from the table. Following is my program:

    char* command;
    uint64 proc = 1;

    // Prepare a statement without executing it yet
    SPI_connect();
    command = psprintf("SELECT * FROM %s", tableName);
    SPIPlanPtr SPIplan = SPI_prepare_cursor(command, 0, NULL, 0);
    pfree(command);

    // Set up a cursor using a statement created with SPI_prepare_cursor
    Portal cursor= SPI_cursor_open(NULL, SPIplan, NULL, NULL, true);
    if (SPIplan == NULL){
        elog(ERROR,"couldn't create cursor via SPI");
    }

    // Writing to local Desktop
    clock_t begin = clock();
    int count = 0;
    FILE *fp = fopen("/home/ubuntu/test.txt", "w");
    if(fp == NULL){
        return psprintf("Error when open file");
    }

    while(proc > 0){
        // Fetch 200000 rows from a cursor
        SPI_cursor_fetch(cursor, true, 200000);
        proc = SPI_processed; // number of row returned

        if (proc != 0){
            TupleDesc tupdesc = SPI_tuptable->tupdesc;
            uint64 j;

            for (j = 0; j < proc; j++){
                HeapTuple tuple = tuptable->vals[j];
                int i;
                count++;
                for (i = 1; i <= tupdesc->natts; i++){ //natts -> number of columns
                     fprintf(fp, (i == tupdesc->natts) ? "%s\n" : "%s," , SPI_getvalue(tuple, tupdesc, i)); 
                }
            }
        }else{
break;
}

        SPI_freetuptable(SPI_tuptable);
elog(INFO, “freed tuptable”);
sleep(5);
}


From my understanding, cursor points at the entire result rows on heap. After fetching certain number of rows, SPI_tuptable points to the allocated row set. After finishing writing the row set to text file,  I freed the memory of it by calling SPI_freetuptable( ) before next fetch.  I expected the memory to stay around a constant value through out the program, However, the actual memory kept increasing when I run the program.  Also, I only observed memory drop after SPI_freetuptable() of first chunk. After that memory kept going up even  SPI_freetuptable()  is executed.  Any clue of why is it happening?  

Thanks in advance!

Best,
Ivy

Re: SPI_cursor_fetch Memory overrun

From
Tom Lane
Date:
Wu Ivy <ivywuyzl@gmail.com> writes:
>                      fprintf(fp, (i == tupdesc->natts) ? "%s\n" : "%s," , SPI_getvalue(tuple, tupdesc, i));

That (specifically the SPI_getvalue call) is what's leaking memory.
You could improve matters by pfree'ing the result string after each
such call.

Just to add insult to injury, it's also looking up the column datatype's
output function afresh on every call.  There could perhaps be some leakage
involved in those lookups too, though I'd bet the main problem is the
result strings.

            regards, tom lane