Thread: Seg fault when processing large SPI cursor (PG 9.1.3)

Seg fault when processing large SPI cursor (PG 9.1.3)

From
"Fields, Zachary J. (MU-Student)"
Date:
I'm working on PostgreSQL 9.1.3 (waiting for admin to push upgrades next week); in the meanwhile, I was curious if there are any known bugs regarding large cursor fetches, or if I am to blame.

My cursor has 400 million records, and I'm fetching in blocks of 2^17 (approx. 130K). When I fetch the next block after processing the 48,889,856th record, the DB seg faults. It should be noted that I have processed tables with 23 million+ records several times and everything appears to work great.

I have watched top, and the system memory usage gets up to 97.6% (from approx. 30 million records onward - then sways up and down), but ultimately crashes when I try to get past the 48,889,856th record. I have tried odd and various block sizes; anything greater than 2^17 crashes at the fetch that would have it surpass 48,889,856 records, 2^16 hits the same sweet spot, and anything less than 2^16 actually crashes slightly earlier (noted in comments in the code below).

To me, it appears to be an obvious memory leak; the question is who caused it. I would typically assume I am to blame (and I may be), but the code is so simple (shown below) that I can't see how it could be me - unless I am misusing pg-sql (which is totally possible).

Here is the code segment that is crashing...

    // Cursor variables
    const char *cursor_name = NULL;  // Postgres will self-assign a name
    const int arg_count = 0;  // No arguments will be passed
    Oid *arg_types = NULL;  // n/a
    Datum *arg_values = NULL;  // n/a
    const char *null_args = NULL;  // n/a
    bool read_only = true;  // read_only allows for optimization
    const int cursor_opts = CURSOR_OPT_NO_SCROLL;  // default cursor options
    bool forward = true;
    //const long fetch_count = FETCH_ALL;
    //const long fetch_count = 1048576;  // 2^20 - last processed = 48,234,496
    //const long fetch_count = 524288;   // 2^19 - last processed = 48,758,784
    //const long fetch_count = 262144;   // 2^18 - last processed = 48,758,784
    const long fetch_count = 131072;     // 2^17 - last processed = 48,889,856
    //const long fetch_count = 65536;    // 2^16 - last processed = 48,889,856
    //const long fetch_count = 32768;    // 2^15 - last processed = 48,857,088
    //const long fetch_count = 16384;    // 2^14 - last processed = 48,791,552
    //const long fetch_count = 8192;     // 2^13 - last processed = 48,660,480
    //const long fetch_count = 4096;     // 2^12 - last processed = 48,398,336
    //const long fetch_count = 2048;     // 2^11
    //const long fetch_count = 1024;     // 2^10
    //const long fetch_count = 512;      // 2^9
    //const long fetch_count = 256;      // 2^8
    //const long fetch_count = 128;      // 2^7
    //const long fetch_count = 64;       // 2^6
    //const long fetch_count = 32;       // 2^5
    //const long fetch_count = 16;       // 2^4
    //const long fetch_count = 8;        // 2^3
    //const long fetch_count = 4;        // 2^2
    //const long fetch_count = 2;        // 2^1
    //const long fetch_count = 1;        // 2^0

    unsigned int i, j, end, stored;
    unsigned int result_counter = 0;
    float8 l1_norm;
    bool is_null = true;
    bool nulls[4];
    Datum result_tuple_datum[4];
    HeapTuple new_tuple;
    MemoryContext function_context;

    ResultCandidate *candidate, **candidates, *top, *free_candidate = NULL;
    KSieve<ResultCandidate *> sieve(result_cnt_);

    /*********************/
    /** Init SPI_cursor **/
    /*********************/

    // Connect to SPI
    if ( SPI_connect() != SPI_OK_CONNECT ) { return; }

    // Prepare and open SPI cursor
    Portal signature_cursor = SPI_cursor_open_with_args(cursor_name, sql_stmt_, arg_count, arg_types, arg_values, null_args, read_only, cursor_opts);

    do {
        // Fetch rows for processing (populates SPI_processed and SPI_tuptable)
        SPI_cursor_fetch(signature_cursor, forward, fetch_count);

        /************************/
        /** Process SPI_cursor **/
        /************************/

        // Iterate cursor and perform calculations
        for ( i = 0 ; i < SPI_processed ; ++i ) {
            // Transfer columns to work array
            for ( j = 1 ; j < 4 ; ++j ) {
                result_tuple_datum[j-1] = SPI_getbinval(SPI_tuptable->vals[i], SPI_tuptable->tupdesc, j, &is_null);
                nulls[j-1] = is_null;
            }

            // Special handling for final column
            Datum raw_double_array = SPI_getbinval(SPI_tuptable->vals[i], SPI_tuptable->tupdesc, 4, &is_null);
            nulls[3] = is_null;
            if ( is_null ) {
                l1_norm = FLT_MAX;
                result_tuple_datum[3] = PointerGetDatum(NULL);
            } else {
                // Transform binary into double array
                ArrayType *pg_double_array = DatumGetArrayTypeP(raw_double_array);
                l1_norm = meanAbsoluteError(signature_, (double *)ARR_DATA_PTR(pg_double_array), (ARR_DIMS(pg_double_array))[0], 0);
                result_tuple_datum[3] = Float8GetDatum(l1_norm);
            }

            // Create and test candidate
            if ( free_candidate ) {
                candidate = free_candidate;
                free_candidate = NULL;
            } else {
                candidate = (ResultCandidate *)palloc(sizeof(ResultCandidate));
            }
            (*candidate).lat = DatumGetFloat8(result_tuple_datum[0]);
            (*candidate).null_lat = nulls[0];
            (*candidate).lon = DatumGetFloat8(result_tuple_datum[1]);
            (*candidate).null_lon = nulls[1];
            (*candidate).orientation = DatumGetFloat8(result_tuple_datum[2]);
            (*candidate).null_orientation = nulls[2];
            (*candidate).rank = l1_norm;
            (*candidate).null_rank = nulls[3];

            // Run candidate through sieve
            top = sieve.top();
            if ( !sieve.siftItem(candidate) ) {
                // Free non-filtered candidates
                free_candidate = candidate;
            } else if ( sieve.size() == result_cnt_ ) {
                // Free non-filtered candidates
                free_candidate = top;
            }
        }
        result_counter += i;
    } while ( SPI_processed );

    SPI_finish();

Is there an obvious error I'm overlooking, or is there a known bug (PG 9.1.3) for large fetch sizes?

Thanks,
Zak

P.S. KSieve is a POD encapsulating an array that has been allocated with palloc().

Re: Seg fault when processing large SPI cursor (PG 9.1.3)

From
Tom Lane
Date:
"Fields, Zachary J. (MU-Student)" <zjfe58@mail.missouri.edu> writes:
> I'm working on PostgreSQL 9.1.3 (waiting for admin to push upgrades next week); in the meanwhile, I was curious if there are any known bugs regarding large cursor fetches, or if I am to blame.

> My cursor has 400 million records, and I'm fetching in blocks of 2^17 (approx. 130K). When I fetch the next block after processing the 48,889,856th record, the DB seg faults. It should be noted that I have processed tables with 23 million+ records several times and everything appears to work great.

> I have watched top, and the system memory usage gets up to 97.6% (from approx. 30 million records onward - then sways up and down), but ultimately crashes when I try to get past the 48,889,856th record. I have tried odd and various block sizes; anything greater than 2^17 crashes at the fetch that would have it surpass 48,889,856 records, 2^16 hits the same sweet spot, and anything less than 2^16 actually crashes slightly earlier (noted in comments in the code below).

> To me, it appears to be an obvious memory leak,

Well, you're leaking the SPITupleTables (you should be doing
SPI_freetuptable when done with each one), so running out of memory is
not exactly surprising.  I suspect what is happening is that an
out-of-memory error is getting thrown and recovery from that is messed
up somehow.  Have you tried getting a stack trace from the crash?

I note that you're apparently using C++.  C++ in the backend is rather
dangerous, and one of the main reasons is that C++ error handling
doesn't play nice with elog/ereport error handling.  It's possible to
make it work safely but it takes a lot of attention and extra code,
which you don't seem to have here.
        regards, tom lane
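
[Editor's note: a minimal sketch of the fix Tom describes, applied to the fetch loop from the original post. Per-row work is elided; `SPI_freetuptable` is the standard SPI call for releasing a tuple table, and the local `tuptable` variable is ours.]

```cpp
// Each SPI_cursor_fetch allocates a fresh SPITupleTable in the SPI Proc
// context; freeing it before the next fetch keeps only one ~130K-row
// batch in memory at a time instead of accumulating all of them.
do {
    SPI_cursor_fetch(signature_cursor, forward, fetch_count);
    SPITupleTable *tuptable = SPI_tuptable;  // save: row code may call SPI

    for ( i = 0 ; i < SPI_processed ; ++i ) {
        // ... process tuptable->vals[i] as before ...
    }
    result_counter += i;

    SPI_freetuptable(tuptable);  // release this batch before fetching more
} while ( SPI_processed );
```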



Re: Seg fault when processing large SPI cursor (PG 9.1.3)

From
Merlin Moncure
Date:
On Mon, Mar 4, 2013 at 10:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Fields, Zachary J. (MU-Student)" <zjfe58@mail.missouri.edu> writes:
>> I'm working on PostgreSQL 9.1.3 (waiting for admin to push upgrades next week); in the meanwhile, I was curious if there are any known bugs regarding large cursor fetches, or if I am to blame.
>> My cursor has 400 million records, and I'm fetching in blocks of 2^17 (approx. 130K). When I fetch the next block after processing the 48,889,856th record, the DB seg faults. It should be noted that I have processed tables with 23 million+ records several times and everything appears to work great.
>
>> I have watched top, and the system memory usage gets up to 97.6% (from approx. 30 million records onward - then sways up and down), but ultimately crashes when I try to get past the 48,889,856th record. I have tried odd and various block sizes; anything greater than 2^17 crashes at the fetch that would have it surpass 48,889,856 records, 2^16 hits the same sweet spot, and anything less than 2^16 actually crashes slightly earlier (noted in comments in the code below).
>
>> To me, it appears to be an obvious memory leak,
>
> Well, you're leaking the SPITupleTables (you should be doing
> SPI_freetuptable when done with each one), so running out of memory is
> not exactly surprising.  I suspect what is happening is that an
> out-of-memory error is getting thrown and recovery from that is messed
> up somehow.  Have you tried getting a stack trace from the crash?
>
> I note that you're apparently using C++.  C++ in the backend is rather
> dangerous, and one of the main reasons is that C++ error handling
> doesn't play nice with elog/ereport error handling.  It's possible to
> make it work safely but it takes a lot of attention and extra code,
> which you don't seem to have here.

Could be C++ is throwing an exception. If you haven't already, try
disabling exception handling completely in the compiler.

merlin



Re: Seg fault when processing large SPI cursor (PG 9.1.3)

From
Tom Lane
Date:
"Fields, Zachary J. (MU-Student)" <zjfe58@mail.missouri.edu> writes:
> Thanks for getting back to me! I had already discovered freeing the SPI_tuptable each time, and you are correct, it made a big difference. However, I still was only able to achieve 140+ million before it crashed.

> My current working implementation is to reset the "current" memory context after X number of iterations, which keeps memory in check. This seems like a big hammer for the job, and I'm sure it is not optimal. Speed is very important to my application, so I would prefer to use a scalpel instead of a hatchet. If I am already freeing the SPI_tuptable created by the cursor, where else should I be looking for memory leaks?

There are a lot of places that could be leaking memory; for instance, if
the arrays you're working with are large enough then they could be
toasted, and DatumGetArrayTypeP would involve making a working copy.
I'm not too sure that you're not ever leaking "candidate" structs,
either.
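
[Editor's note: for the detoast case Tom mentions, the conventional per-allocation fix looks like the sketch below, grafted onto the array-handling branch from the original post. The pointer comparison detects whether `DatumGetArrayTypeP` actually made a copy.]

```cpp
// DatumGetArrayTypeP returns a freshly palloc'd copy when the stored value
// was toasted or compressed; otherwise it returns the original pointer.
ArrayType *pg_double_array = DatumGetArrayTypeP(raw_double_array);
l1_norm = meanAbsoluteError(signature_, (double *)ARR_DATA_PTR(pg_double_array),
                            (ARR_DIMS(pg_double_array))[0], 0);
result_tuple_datum[3] = Float8GetDatum(l1_norm);
if ( (Pointer) pg_double_array != DatumGetPointer(raw_double_array) )
    pfree(pg_double_array);  // a detoasted copy was made; it is ours to free
```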

The usual theory in Postgres is that a memory context reset is cheaper,
as well as much less leak-prone, than trying to make sure you've pfree'd
each individual allocation.  So we tend to work with short-lived
contexts that can be reset at the end of each tuple cycle --- or in this
example, probably once per cursor fetch would be good.  The main problem
I'd have with what you're doing is that it's not very safe for a
function to reset the whole SPI Proc context: you might be clobbering
some storage that's still in use, eg related to the cursor you're using.
Instead create a context that's a child of that context, switch into
that to do your processing, and reset it every so often.
        regards, tom lane
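
[Editor's note: a sketch of the child-context pattern Tom describes, applied to the fetch loop from the original post. The context name and variables are ours; the five-argument `AllocSetContextCreate` signature is the 9.1-era one.]

```cpp
// Do per-batch work in a child of the current context and reset it after
// every fetch, rather than resetting the SPI Proc context itself.
MemoryContext work_ctx = AllocSetContextCreate(CurrentMemoryContext,
                                               "cursor batch work",
                                               ALLOCSET_DEFAULT_MINSIZE,
                                               ALLOCSET_DEFAULT_INITSIZE,
                                               ALLOCSET_DEFAULT_MAXSIZE);
do {
    SPI_cursor_fetch(signature_cursor, forward, fetch_count);

    MemoryContext old_ctx = MemoryContextSwitchTo(work_ctx);
    for ( i = 0 ; i < SPI_processed ; ++i ) {
        // ... per-row work: detoasted arrays, scratch pallocs, etc.
        // land in work_ctx. NB: anything that must outlive the reset
        // (e.g. candidates retained by the sieve) must be allocated
        // in a longer-lived context instead.
    }
    MemoryContextSwitchTo(old_ctx);

    SPI_freetuptable(SPI_tuptable);
    MemoryContextReset(work_ctx);  // frees all per-batch allocations at once
    result_counter += i;
} while ( SPI_processed );

MemoryContextDelete(work_ctx);
```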