Seg fault when processing large SPI cursor (PG9.13)

Started by Fields, Zachary J. (MU-Student) · almost 13 years ago · 4 messages
#1 Fields, Zachary J. (MU-Student)
zjfe58@mail.missouri.edu

I'm working on PostgreSQL 9.13 (waiting for the admin to push upgrades next week). In the meantime, I was curious whether there are any known bugs regarding large cursor fetches, or if I am to blame.

My cursor has 400 million records, and I'm fetching in blocks of 2^17 (approx. 130K). When I fetch the next block after processing the 48,889,856th record, the DB seg faults. It should be noted that I have processed tables with 23+ million records several times and everything appears to work great.

I have watched top, and the system memory usage gets up to 97.6% (from approx. 30 million records onward - then sways up and down), but ultimately crashes when I try to get past the 48,889,856th record. I have tried odd and various block sizes; anything greater than 2^17 crashes at the fetch that would have it surpass 48,889,856 records, 2^16 hits the same sweet spot, and anything less than 2^16 actually crashes slightly earlier (noted in comments in the code below).

To me, it appears to be an obvious memory leak; the question is who caused it. I would typically assume I am to blame (and I may be), but the code is so simple (shown below) that I can't see how it could be me - unless I am misusing pg-sql (which is totally possible).

Here is the code segment that is crashing...

<code>
// Cursor variables
const char *cursor_name = NULL; // Postgres will self-assign a name
const int arg_count = 0; // No arguments will be passed
Oid *arg_types = NULL; // n/a
Datum *arg_values = NULL; // n/a
const char *null_args = NULL; // n/a
bool read_only = true; // read_only allows for optimization
const int cursor_opts = CURSOR_OPT_NO_SCROLL; // default cursor options
bool forward = true;
//const long fetch_count = FETCH_ALL;
//const long fetch_count = 1048576; // 2^20 - last processed = 48,234,496
//const long fetch_count = 524288; // 2^19 - last processed = 48,758,784
//const long fetch_count = 262144; // 2^18 - last processed = 48,758,784
const long fetch_count = 131072; // 2^17 - last processed = 48,889,856
//const long fetch_count = 65536; // 2^16 - last processed = 48,889,856
//const long fetch_count = 32768; // 2^15 - last processed = 48,857,088
//const long fetch_count = 16384; // 2^14 - last processed = 48,791,552
//const long fetch_count = 8192; // 2^13 - last processed = 48,660,480
//const long fetch_count = 4096; // 2^12 - last processed = 48,398,336
//const long fetch_count = 2048; // 2^11
//const long fetch_count = 1024; // 2^10
//const long fetch_count = 512; // 2^9
//const long fetch_count = 256; // 2^8
//const long fetch_count = 128; // 2^7
//const long fetch_count = 64; // 2^6
//const long fetch_count = 32; // 2^5
//const long fetch_count = 16; // 2^4
//const long fetch_count = 8; // 2^3
//const long fetch_count = 4; // 2^2
//const long fetch_count = 2; // 2^1
//const long fetch_count = 1; // 2^0

unsigned int i, j, end, stored;
unsigned int result_counter = 0;
float8 l1_norm;
bool is_null = true;
bool nulls[4];
Datum result_tuple_datum[4];
HeapTuple new_tuple;
MemoryContext function_context;

ResultCandidate *candidate, **candidates, *top, *free_candidate = NULL;
KSieve<ResultCandidate *> sieve(result_cnt_);

/*********************/
/** Init SPI_cursor **/
/*********************/

// Connect to SPI
if ( SPI_connect() != SPI_OK_CONNECT ) { return; }

// Prepare and open SPI cursor
Portal signature_cursor = SPI_cursor_open_with_args(cursor_name, sql_stmt_, arg_count, arg_types, arg_values, null_args, read_only, cursor_opts);

do {
    // Fetch rows for processing (Populates SPI_processed and SPI_tuptable)
    SPI_cursor_fetch(signature_cursor, forward, fetch_count);

    /************************/
    /** Process SPI_cursor **/
    /************************/

    // Iterate cursor and perform calculations
    for ( i = 0 ; i < SPI_processed ; ++i ) {
        // Transfer columns to work array
        for ( j = 1 ; j < 4 ; ++j ) {
            result_tuple_datum[j-1] = SPI_getbinval(SPI_tuptable->vals[i], SPI_tuptable->tupdesc, j, &is_null);
            nulls[j-1] = is_null;
        }

        // Special Handling for final column
        Datum raw_double_array = SPI_getbinval(SPI_tuptable->vals[i], SPI_tuptable->tupdesc, 4, &is_null);
        nulls[3] = is_null;
        if ( is_null ) {
            l1_norm = FLT_MAX;
            result_tuple_datum[3] = PointerGetDatum(NULL);
        } else {
            // Transform binary into double array
            ArrayType *pg_double_array = DatumGetArrayTypeP(raw_double_array);
            l1_norm = meanAbsoluteError(signature_, (double *)ARR_DATA_PTR(pg_double_array), (ARR_DIMS(pg_double_array))[0], 0);
            result_tuple_datum[3] = Float8GetDatum(l1_norm);
        }

        // Create and test candidate
        if ( free_candidate ) {
            candidate = free_candidate;
            free_candidate = NULL;
        } else {
            candidate = (ResultCandidate *)palloc(sizeof(ResultCandidate));
        }
        (*candidate).lat = DatumGetFloat8(result_tuple_datum[0]);
        (*candidate).null_lat = nulls[0];
        (*candidate).lon = DatumGetFloat8(result_tuple_datum[1]);
        (*candidate).null_lon = nulls[1];
        (*candidate).orientation = DatumGetFloat8(result_tuple_datum[2]);
        (*candidate).null_orientation = nulls[2];
        (*candidate).rank = l1_norm;
        (*candidate).null_rank = nulls[3];

        // Run candidate through sieve
        top = sieve.top();
        if ( !sieve.siftItem(candidate) ) {
            // Free non-filtered candidates
            free_candidate = candidate;
        } else if ( sieve.size() == result_cnt_ ) {
            // Free non-filtered candidates
            free_candidate = top;
        }
    }
    result_counter += i;
} while ( SPI_processed );

SPI_finish();
</code>

Is there an obvious error I'm overlooking, or is there a known bug (PG9.13) for large fetch sizes?

Thanks,
Zak

P.S. KSieve is a POD type encapsulating an array that has been allocated with palloc().

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fields, Zachary J. (MU-Student) (#1)
Re: Seg fault when processing large SPI cursor (PG9.13)

"Fields, Zachary J. (MU-Student)" <zjfe58@mail.missouri.edu> writes:

I'm working on PostgreSQL 9.13 (waiting for the admin to push upgrades next week). In the meantime, I was curious whether there are any known bugs regarding large cursor fetches, or if I am to blame.
My cursor has 400 million records, and I'm fetching in blocks of 2^17 (approx. 130K). When I fetch the next block after processing the 48,889,856th record, the DB seg faults. It should be noted that I have processed tables with 23+ million records several times and everything appears to work great.

I have watched top, and the system memory usage gets up to 97.6% (from approx. 30 million records onward - then sways up and down), but ultimately crashes when I try to get past the 48,889,856th record. I have tried odd and various block sizes; anything greater than 2^17 crashes at the fetch that would have it surpass 48,889,856 records, 2^16 hits the same sweet spot, and anything less than 2^16 actually crashes slightly earlier (noted in comments in the code below).

To me, it appears to be an obvious memory leak,

Well, you're leaking the SPITupleTables (you should be doing
SPI_freetuptable when done with each one), so running out of memory is
not exactly surprising. I suspect what is happening is that an
out-of-memory error is getting thrown and recovery from that is messed
up somehow. Have you tried getting a stack trace from the crash?
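
For illustration, a minimal sketch of the fetch loop with each tuple table
released before the next fetch (per-row work elided; variable names taken
from the code above):

<code>
do {
    // Fetch rows for processing (populates SPI_processed and SPI_tuptable)
    SPI_cursor_fetch(signature_cursor, forward, fetch_count);

    for ( i = 0 ; i < SPI_processed ; ++i ) {
        // ... per-row processing exactly as in the original loop ...
    }
    result_counter += i;

    // Release this block's tuple table; otherwise every fetched block
    // stays allocated until SPI_finish().
    SPI_freetuptable(SPI_tuptable);
} while ( SPI_processed );
</code>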

I note that you're apparently using C++. C++ in the backend is rather
dangerous, and one of the main reasons is that C++ error handling
doesn't play nice with elog/ereport error handling. It's possible to
make it work safely but it takes a lot of attention and extra code,
which you don't seem to have here.
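
One pattern that can work (a sketch only; the helper name is made up) is to
keep a try/catch at the C/C++ boundary and convert anything thrown into an
ereport, so a C++ exception never unwinds through backend frames:

<code>
extern "C" {
#include "postgres.h"
}
#include <exception>

extern void process_candidate_block(void);   /* hypothetical C++ helper that may throw */

static void
call_cpp_safely(void)
{
    try
    {
        process_candidate_block();
    }
    catch (const std::exception &e)
    {
        // Convert to a backend error so normal elog/ereport cleanup runs
        ereport(ERROR,
                (errcode(ERRCODE_INTERNAL_ERROR),
                 errmsg("C++ exception: %s", e.what())));
    }
    catch (...)
    {
        ereport(ERROR,
                (errcode(ERRCODE_INTERNAL_ERROR),
                 errmsg("unrecognized C++ exception")));
    }
}
</code>

The reverse direction needs the same care: an elog/ereport error raised from
inside C++ code longjmps past any pending C++ destructors, which is much of
the extra work alluded to above.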

regards, tom lane


#3 Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#2)
Re: Seg fault when processing large SPI cursor (PG9.13)

On Mon, Mar 4, 2013 at 10:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Fields, Zachary J. (MU-Student)" <zjfe58@mail.missouri.edu> writes:

I'm working on PostgreSQL 9.13 (waiting for the admin to push upgrades next week). In the meantime, I was curious whether there are any known bugs regarding large cursor fetches, or if I am to blame.
My cursor has 400 million records, and I'm fetching in blocks of 2^17 (approx. 130K). When I fetch the next block after processing the 48,889,856th record, the DB seg faults. It should be noted that I have processed tables with 23+ million records several times and everything appears to work great.

I have watched top, and the system memory usage gets up to 97.6% (from approx. 30 million records onward - then sways up and down), but ultimately crashes when I try to get past the 48,889,856th record. I have tried odd and various block sizes; anything greater than 2^17 crashes at the fetch that would have it surpass 48,889,856 records, 2^16 hits the same sweet spot, and anything less than 2^16 actually crashes slightly earlier (noted in comments in the code below).

To me, it appears to be an obvious memory leak,

Well, you're leaking the SPITupleTables (you should be doing
SPI_freetuptable when done with each one), so running out of memory is
not exactly surprising. I suspect what is happening is that an
out-of-memory error is getting thrown and recovery from that is messed
up somehow. Have you tried getting a stack trace from the crash?

I note that you're apparently using C++. C++ in the backend is rather
dangerous, and one of the main reasons is that C++ error handling
doesn't play nice with elog/ereport error handling. It's possible to
make it work safely but it takes a lot of attention and extra code,
which you don't seem to have here.

Could be C++ is throwing an exception. If you haven't already, try
disabling exception handling completely in the compiler.

merlin


#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fields, Zachary J. (MU-Student) (#1)
Re: Seg fault when processing large SPI cursor (PG9.13)

"Fields, Zachary J. (MU-Student)" <zjfe58@mail.missouri.edu> writes:

Thanks for getting back to me! I had already discovered that I needed to free the SPI_tuptable each time, and you are correct, it made a big difference. However, I was still only able to reach 140+ million records before it crashed.

My current working implementation is to reset the "current" memory context after X number of iterations, which keeps memory in check. This seems like a big hammer for the job, and I'm sure it is not optimal. Speed is very important to my application, so I would prefer to use a scalpel instead of a hatchet. If I am already freeing the SPI_tuptable created by the cursor, where else should I be looking for memory leaks?

There are a lot of places that could be leaking memory; for instance, if
the arrays you're working with are large enough then they could be
toasted, and DatumGetArrayTypeP would involve making a working copy.
I'm not too sure that you're not ever leaking "candidate" structs,
either.
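
For instance (a sketch against the loop shown earlier, not a drop-in fix),
the detoasted copy can be released as soon as the distance has been computed,
since DatumGetArrayTypeP only allocates a new copy when the stored value was
actually toasted:

<code>
ArrayType *pg_double_array = DatumGetArrayTypeP(raw_double_array);

l1_norm = meanAbsoluteError(signature_,
                            (double *) ARR_DATA_PTR(pg_double_array),
                            (ARR_DIMS(pg_double_array))[0], 0);
result_tuple_datum[3] = Float8GetDatum(l1_norm);

// pfree only the detoasted working copy, never the original datum
if ( (Pointer) pg_double_array != DatumGetPointer(raw_double_array) )
    pfree(pg_double_array);
</code>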

The usual theory in Postgres is that a memory context reset is cheaper,
as well as much less leak-prone, than trying to make sure you've pfree'd
each individual allocation. So we tend to work with short-lived
contexts that can be reset at the end of each tuple cycle --- or in this
example, probably once per cursor fetch would be good. The main problem
I'd have with what you're doing is that it's not very safe for a
function to reset the whole SPI Proc context: you might be clobbering
some storage that's still in use, eg related to the cursor you're using.
Instead create a context that's a child of that context, switch into
that to do your processing, and reset it every so often.
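
A sketch of that shape, using the allocation-set API as it existed in the 9.x
series (context name and placement are illustrative, and the per-row work is
elided; note that anything that must outlive a batch, such as the sieve's
surviving candidates, still has to be allocated in the parent context):

<code>
MemoryContext parent_ctx = CurrentMemoryContext;   // after SPI_connect, this is the SPI Proc context
MemoryContext batch_ctx = AllocSetContextCreate(parent_ctx,
                                                "cursor batch context",
                                                ALLOCSET_DEFAULT_MINSIZE,
                                                ALLOCSET_DEFAULT_INITSIZE,
                                                ALLOCSET_DEFAULT_MAXSIZE);

do {
    MemoryContext old_ctx;

    SPI_cursor_fetch(signature_cursor, forward, fetch_count);

    // Do the per-batch scratch work (detoasted arrays, etc.) in the child context
    old_ctx = MemoryContextSwitchTo(batch_ctx);
    // ... per-row processing ...
    MemoryContextSwitchTo(old_ctx);

    SPI_freetuptable(SPI_tuptable);
    MemoryContextReset(batch_ctx);     // drop everything the batch allocated
} while ( SPI_processed );

MemoryContextDelete(batch_ctx);
</code>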

regards, tom lane
