BUG #8573: int4range memory consumption

Started by Nonameabout 12 years ago12 messages
#1Noname
g.vanluffelen@qipc.com

The following bug has been logged on the website:

Bug reference: 8573
Logged by: Godfried Vanluffelen
Email address: g.vanluffelen@qipc.com
PostgreSQL version: 9.3.1
Operating system: Windows 7 64 bit
Description:

int4range ( and any other range function) consumes much memory when used in
a select statement on a big table.

steps to reproduce:
-generate or use a table with 10M+ rows with 2 columns that are usable to
create a range. For example 1 column with negative numbers, and one with
positive numbers.

-invoke a simple select statement where you create a range of 2 columns. For
example select int4range(col1,col2) from table.

-memory builds up until the query is finished, or when out of memory error
occurs.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#1)
Re: [BUGS] BUG #8573: int4range memory consumption

g.vanluffelen@qipc.com writes:

int4range ( and any other range function) consumes much memory when used in
a select statement on a big table.

The problem is that range_out leaks memory, as a consequence of creating a
number of intermediate strings that it doesn't bother to free. I don't
believe it's the only output function that leaks memory, but it does
so with particular vim: now that we've increased the initial size of
StringInfo buffers, it's probably wasting upwards of 2K per call.

While we could doubtless hack range_out to release those strings again,
it seems to me that that's just sticking a finger in the dike. I'm
inclined to think that we really ought to solve this class of problems
once and for all by fixing printtup.c to run the output functions in a
temporary memory context, which we could reset once per row to recover all
memory used. That would also let us get rid of the inadequate kluges that
printtup currently uses to try to minimize leakage, namely forced
detoasting of varlena fields and forced pfree'ing of the strings returned
by output functions. (There is no other context in which we imagine that
it's safe to pfree a function's result, and there are a number of
datatypes for which it'd make sense to return constant strings, were it
not for this kluge.)

It's possible that this would result in some net slowdown in tuple output;
but it's also possible that eliminating the retail pfree's in favor of a
single context reset per tuple would make for a net savings. In any case,
we're already using a reset-per-row approach to memory management of
output function calls in COPY OUT, and I know for a fact that we've
squeezed that code path as hard as we could.

Thoughts?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Jim Nasby
decibel@decibel.org
In reply to: Tom Lane (#2)
Re: [BUGS] BUG #8573: int4range memory consumption

On Nov 1, 2013, at 2:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

g.vanluffelen@qipc.com writes:

int4range ( and any other range function) consumes much memory when used in
a select statement on a big table.

The problem is that range_out leaks memory, as a consequence of creating a
number of intermediate strings that it doesn't bother to free. I don't
believe it's the only output function that leaks memory, but it does
so with particular vim: now that we've increased the initial size of
StringInfo buffers, it's probably wasting upwards of 2K per call.

While we could doubtless hack range_out to release those strings again,
it seems to me that that's just sticking a finger in the dike. I'm
inclined to think that we really ought to solve this class of problems
once and for all by fixing printtup.c to run the output functions in a
temporary memory context,

...

we're already using a reset-per-row approach to memory management of
output function calls in COPY OUT, and I know for a fact that we've
squeezed that code path as hard as we could.

+1. COPY is actually the case I was worried about… if you're dealing with large amounts of data in other clients ISTM that other things will bottleneck before the extra memory context.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: [BUGS] BUG #8573: int4range memory consumption

I wrote:

It's possible that this would result in some net slowdown in tuple output;
but it's also possible that eliminating the retail pfree's in favor of a
single context reset per tuple would make for a net savings. In any case,
we're already using a reset-per-row approach to memory management of
output function calls in COPY OUT, and I know for a fact that we've
squeezed that code path as hard as we could.

It appears that indeed, the reset-per-row approach is marginally faster
than the existing code. It's just barely faster with a couple of columns
of output, for instance I get about 660 vs 665 msec for
select x,x from generate_series(1,1000000) x;
but the advantage grows for more columns, which is expected since we're
getting rid of more pfree's. With ten integer columns I see 1650 vs
1710 msec, for example.

So I see no downside to fixing it like this, and will work on a complete
patch.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Nasby (#3)
1 attachment(s)
Re: [BUGS] BUG #8573: int4range memory consumption

Jim Nasby <decibel@decibel.org> writes:

On Nov 1, 2013, at 2:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

While we could doubtless hack range_out to release those strings again,
it seems to me that that's just sticking a finger in the dike. I'm
inclined to think that we really ought to solve this class of problems
once and for all by fixing printtup.c to run the output functions in a
temporary memory context,
...
we're already using a reset-per-row approach to memory management of
output function calls in COPY OUT, and I know for a fact that we've
squeezed that code path as hard as we could.

+1. COPY is actually the case I was worried about� if you're dealing with large amounts of data in other clients ISTM that other things will bottleneck before the extra memory context.

Attached is a proposed patch for this. It fixes most of the functions
in printtup.c to use a per-row memory context. (I did not bother to
fix debugtup(), which is used only in standalone mode. If you're doing
queries large enough for mem leaks to be problematic in standalone mode,
you're nuts.) I also simplified some other places that had copied the
notion of forced detoasting before an output function call, as that seems
dubious at best, and wasn't being done uniformly anyway.

My original thought had been to get rid of all pfree's of output function
results, so as to make the world safe for returning constant strings when
such functions find it appropriate. However, I ran into a showstopper
problem: SPI_getvalue(), which is little more than a wrapper around the
appropriate type output function, is documented as returning a pfree'able
string. Some though not all of the callers in core and contrib take the
hint and pfree the result, and I'm sure there are more in third-party
extensions. So if we want to allow output functions to return
non-palloc'd strings, it seems we have to either change SPI_getvalue()'s
API contract or insert a pstrdup() call in it. Neither of these is
attractive, mainly because the vast majority of output function results
would still be palloc'd even if we made aggressive use of the option not
to. That means that if we did the former, light testing wouldn't
necessarily show a problem if someone forgot to remove a pfree() after a
SPI_getvalue() call, and it also means that if we did the latter, the
pstrdup() would usually be a waste of cycles and space.

So I've abandoned that idea for now, and just recommend applying the
attached patch as far back as 9.2, where range_out was added.

Comments?

regards, tom lane

Attachments:

printtup-temp-context.patchtext/x-diff; charset=us-ascii; name=printtup-temp-context.patchDownload
diff --git a/src/backend/access/common/printtup.c b/src/backend/access/common/printtup.c
index 8daac9e..0e8c947 100644
*** a/src/backend/access/common/printtup.c
--- b/src/backend/access/common/printtup.c
***************
*** 21,26 ****
--- 21,27 ----
  #include "tcop/pquery.h"
  #include "utils/lsyscache.h"
  #include "utils/memdebug.h"
+ #include "utils/memutils.h"
  
  
  static void printtup_startup(DestReceiver *self, int operation,
*************** typedef struct
*** 61,66 ****
--- 62,68 ----
  	TupleDesc	attrinfo;		/* The attr info we are set up for */
  	int			nattrs;
  	PrinttupAttrInfo *myinfo;	/* Cached info about each attr */
+ 	MemoryContext tmpcontext;	/* Memory context for per-row workspace */
  } DR_printtup;
  
  /* ----------------
*************** printtup_create_DR(CommandDest dest)
*** 87,92 ****
--- 89,95 ----
  	self->attrinfo = NULL;
  	self->nattrs = 0;
  	self->myinfo = NULL;
+ 	self->tmpcontext = NULL;
  
  	return (DestReceiver *) self;
  }
*************** printtup_startup(DestReceiver *self, int
*** 124,129 ****
--- 127,144 ----
  	DR_printtup *myState = (DR_printtup *) self;
  	Portal		portal = myState->portal;
  
+ 	/*
+ 	 * Create a temporary memory context that we can reset once per row to
+ 	 * recover palloc'd memory.  This avoids any problems with leaks inside
+ 	 * datatype output routines, and should be faster than retail pfree's
+ 	 * anyway.
+ 	 */
+ 	myState->tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ 												"printtup",
+ 												ALLOCSET_DEFAULT_MINSIZE,
+ 												ALLOCSET_DEFAULT_INITSIZE,
+ 												ALLOCSET_DEFAULT_MAXSIZE);
+ 
  	if (PG_PROTOCOL_MAJOR(FrontendProtocol) < 3)
  	{
  		/*
*************** printtup(TupleTableSlot *slot, DestRecei
*** 289,294 ****
--- 304,310 ----
  {
  	TupleDesc	typeinfo = slot->tts_tupleDescriptor;
  	DR_printtup *myState = (DR_printtup *) self;
+ 	MemoryContext oldcontext;
  	StringInfoData buf;
  	int			natts = typeinfo->natts;
  	int			i;
*************** printtup(TupleTableSlot *slot, DestRecei
*** 300,307 ****
  	/* Make sure the tuple is fully deconstructed */
  	slot_getallattrs(slot);
  
  	/*
! 	 * Prepare a DataRow message
  	 */
  	pq_beginmessage(&buf, 'D');
  
--- 316,326 ----
  	/* Make sure the tuple is fully deconstructed */
  	slot_getallattrs(slot);
  
+ 	/* Switch into per-row context so we can recover memory below */
+ 	oldcontext = MemoryContextSwitchTo(myState->tmpcontext);
+ 
  	/*
! 	 * Prepare a DataRow message (note buffer is in per-row context)
  	 */
  	pq_beginmessage(&buf, 'D');
  
*************** printtup(TupleTableSlot *slot, DestRecei
*** 313,320 ****
  	for (i = 0; i < natts; ++i)
  	{
  		PrinttupAttrInfo *thisState = myState->myinfo + i;
! 		Datum		origattr = slot->tts_values[i],
! 					attr;
  
  		if (slot->tts_isnull[i])
  		{
--- 332,338 ----
  	for (i = 0; i < natts; ++i)
  	{
  		PrinttupAttrInfo *thisState = myState->myinfo + i;
! 		Datum		attr = slot->tts_values[i];
  
  		if (slot->tts_isnull[i])
  		{
*************** printtup(TupleTableSlot *slot, DestRecei
*** 323,352 ****
  		}
  
  		/*
! 		 * If we have a toasted datum, forcibly detoast it here to avoid
! 		 * memory leakage inside the type's output routine.
! 		 *
! 		 * Here we catch undefined bytes in tuples that are returned to the
  		 * client without hitting disk; see comments at the related check in
! 		 * PageAddItem().  Whether to test before or after detoast is somewhat
! 		 * arbitrary, as is whether to test external/compressed data at all.
! 		 * Undefined bytes in the pre-toast datum will have triggered Valgrind
! 		 * errors in the compressor or toaster; any error detected here for
! 		 * such datums would indicate an (unlikely) bug in a type-independent
! 		 * facility.  Therefore, this test is most useful for uncompressed,
! 		 * non-external datums.
! 		 *
! 		 * We don't presently bother checking non-varlena datums for undefined
! 		 * data.  PageAddItem() does check them.
  		 */
  		if (thisState->typisvarlena)
! 		{
! 			VALGRIND_CHECK_MEM_IS_DEFINED(origattr, VARSIZE_ANY(origattr));
! 
! 			attr = PointerGetDatum(PG_DETOAST_DATUM(origattr));
! 		}
! 		else
! 			attr = origattr;
  
  		if (thisState->format == 0)
  		{
--- 341,355 ----
  		}
  
  		/*
! 		 * Here we catch undefined bytes in datums that are returned to the
  		 * client without hitting disk; see comments at the related check in
! 		 * PageAddItem().  This test is most useful for uncompressed,
! 		 * non-external datums, but we're quite likely to see such here when
! 		 * testing new C functions.
  		 */
  		if (thisState->typisvarlena)
! 			VALGRIND_CHECK_MEM_IS_DEFINED(DatumGetPointer(attr),
! 										  VARSIZE_ANY(attr));
  
  		if (thisState->format == 0)
  		{
*************** printtup(TupleTableSlot *slot, DestRecei
*** 355,361 ****
  
  			outputstr = OutputFunctionCall(&thisState->finfo, attr);
  			pq_sendcountedtext(&buf, outputstr, strlen(outputstr), false);
- 			pfree(outputstr);
  		}
  		else
  		{
--- 358,363 ----
*************** printtup(TupleTableSlot *slot, DestRecei
*** 366,380 ****
  			pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
  			pq_sendbytes(&buf, VARDATA(outputbytes),
  						 VARSIZE(outputbytes) - VARHDRSZ);
- 			pfree(outputbytes);
  		}
- 
- 		/* Clean up detoasted copy, if any */
- 		if (DatumGetPointer(attr) != DatumGetPointer(origattr))
- 			pfree(DatumGetPointer(attr));
  	}
  
  	pq_endmessage(&buf);
  }
  
  /* ----------------
--- 368,381 ----
  			pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
  			pq_sendbytes(&buf, VARDATA(outputbytes),
  						 VARSIZE(outputbytes) - VARHDRSZ);
  		}
  	}
  
  	pq_endmessage(&buf);
+ 
+ 	/* Return to caller's context, and flush row's temporary memory */
+ 	MemoryContextSwitchTo(oldcontext);
+ 	MemoryContextReset(myState->tmpcontext);
  }
  
  /* ----------------
*************** printtup_20(TupleTableSlot *slot, DestRe
*** 386,391 ****
--- 387,393 ----
  {
  	TupleDesc	typeinfo = slot->tts_tupleDescriptor;
  	DR_printtup *myState = (DR_printtup *) self;
+ 	MemoryContext oldcontext;
  	StringInfoData buf;
  	int			natts = typeinfo->natts;
  	int			i,
*************** printtup_20(TupleTableSlot *slot, DestRe
*** 399,404 ****
--- 401,409 ----
  	/* Make sure the tuple is fully deconstructed */
  	slot_getallattrs(slot);
  
+ 	/* Switch into per-row context so we can recover memory below */
+ 	oldcontext = MemoryContextSwitchTo(myState->tmpcontext);
+ 
  	/*
  	 * tell the frontend to expect new tuple data (in ASCII style)
  	 */
*************** printtup_20(TupleTableSlot *slot, DestRe
*** 430,437 ****
  	for (i = 0; i < natts; ++i)
  	{
  		PrinttupAttrInfo *thisState = myState->myinfo + i;
! 		Datum		origattr = slot->tts_values[i],
! 					attr;
  		char	   *outputstr;
  
  		if (slot->tts_isnull[i])
--- 435,441 ----
  	for (i = 0; i < natts; ++i)
  	{
  		PrinttupAttrInfo *thisState = myState->myinfo + i;
! 		Datum		attr = slot->tts_values[i];
  		char	   *outputstr;
  
  		if (slot->tts_isnull[i])
*************** printtup_20(TupleTableSlot *slot, DestRe
*** 439,463 ****
  
  		Assert(thisState->format == 0);
  
- 		/*
- 		 * If we have a toasted datum, forcibly detoast it here to avoid
- 		 * memory leakage inside the type's output routine.
- 		 */
- 		if (thisState->typisvarlena)
- 			attr = PointerGetDatum(PG_DETOAST_DATUM(origattr));
- 		else
- 			attr = origattr;
- 
  		outputstr = OutputFunctionCall(&thisState->finfo, attr);
  		pq_sendcountedtext(&buf, outputstr, strlen(outputstr), true);
- 		pfree(outputstr);
- 
- 		/* Clean up detoasted copy, if any */
- 		if (DatumGetPointer(attr) != DatumGetPointer(origattr))
- 			pfree(DatumGetPointer(attr));
  	}
  
  	pq_endmessage(&buf);
  }
  
  /* ----------------
--- 443,457 ----
  
  		Assert(thisState->format == 0);
  
  		outputstr = OutputFunctionCall(&thisState->finfo, attr);
  		pq_sendcountedtext(&buf, outputstr, strlen(outputstr), true);
  	}
  
  	pq_endmessage(&buf);
+ 
+ 	/* Return to caller's context, and flush row's temporary memory */
+ 	MemoryContextSwitchTo(oldcontext);
+ 	MemoryContextReset(myState->tmpcontext);
  }
  
  /* ----------------
*************** printtup_shutdown(DestReceiver *self)
*** 474,479 ****
--- 468,477 ----
  	myState->myinfo = NULL;
  
  	myState->attrinfo = NULL;
+ 
+ 	if (myState->tmpcontext)
+ 		MemoryContextDelete(myState->tmpcontext);
+ 	myState->tmpcontext = NULL;
  }
  
  /* ----------------
*************** debugtup(TupleTableSlot *slot, DestRecei
*** 536,543 ****
  	TupleDesc	typeinfo = slot->tts_tupleDescriptor;
  	int			natts = typeinfo->natts;
  	int			i;
! 	Datum		origattr,
! 				attr;
  	char	   *value;
  	bool		isnull;
  	Oid			typoutput;
--- 534,540 ----
  	TupleDesc	typeinfo = slot->tts_tupleDescriptor;
  	int			natts = typeinfo->natts;
  	int			i;
! 	Datum		attr;
  	char	   *value;
  	bool		isnull;
  	Oid			typoutput;
*************** debugtup(TupleTableSlot *slot, DestRecei
*** 545,574 ****
  
  	for (i = 0; i < natts; ++i)
  	{
! 		origattr = slot_getattr(slot, i + 1, &isnull);
  		if (isnull)
  			continue;
  		getTypeOutputInfo(typeinfo->attrs[i]->atttypid,
  						  &typoutput, &typisvarlena);
  
- 		/*
- 		 * If we have a toasted datum, forcibly detoast it here to avoid
- 		 * memory leakage inside the type's output routine.
- 		 */
- 		if (typisvarlena)
- 			attr = PointerGetDatum(PG_DETOAST_DATUM(origattr));
- 		else
- 			attr = origattr;
- 
  		value = OidOutputFunctionCall(typoutput, attr);
  
  		printatt((unsigned) i + 1, typeinfo->attrs[i], value);
- 
- 		pfree(value);
- 
- 		/* Clean up detoasted copy, if any */
- 		if (DatumGetPointer(attr) != DatumGetPointer(origattr))
- 			pfree(DatumGetPointer(attr));
  	}
  	printf("\t----\n");
  }
--- 542,556 ----
  
  	for (i = 0; i < natts; ++i)
  	{
! 		attr = slot_getattr(slot, i + 1, &isnull);
  		if (isnull)
  			continue;
  		getTypeOutputInfo(typeinfo->attrs[i]->atttypid,
  						  &typoutput, &typisvarlena);
  
  		value = OidOutputFunctionCall(typoutput, attr);
  
  		printatt((unsigned) i + 1, typeinfo->attrs[i], value);
  	}
  	printf("\t----\n");
  }
*************** printtup_internal_20(TupleTableSlot *slo
*** 587,592 ****
--- 569,575 ----
  {
  	TupleDesc	typeinfo = slot->tts_tupleDescriptor;
  	DR_printtup *myState = (DR_printtup *) self;
+ 	MemoryContext oldcontext;
  	StringInfoData buf;
  	int			natts = typeinfo->natts;
  	int			i,
*************** printtup_internal_20(TupleTableSlot *slo
*** 600,605 ****
--- 583,591 ----
  	/* Make sure the tuple is fully deconstructed */
  	slot_getallattrs(slot);
  
+ 	/* Switch into per-row context so we can recover memory below */
+ 	oldcontext = MemoryContextSwitchTo(myState->tmpcontext);
+ 
  	/*
  	 * tell the frontend to expect new tuple data (in binary style)
  	 */
*************** printtup_internal_20(TupleTableSlot *slo
*** 631,638 ****
  	for (i = 0; i < natts; ++i)
  	{
  		PrinttupAttrInfo *thisState = myState->myinfo + i;
! 		Datum		origattr = slot->tts_values[i],
! 					attr;
  		bytea	   *outputbytes;
  
  		if (slot->tts_isnull[i])
--- 617,623 ----
  	for (i = 0; i < natts; ++i)
  	{
  		PrinttupAttrInfo *thisState = myState->myinfo + i;
! 		Datum		attr = slot->tts_values[i];
  		bytea	   *outputbytes;
  
  		if (slot->tts_isnull[i])
*************** printtup_internal_20(TupleTableSlot *slo
*** 640,665 ****
  
  		Assert(thisState->format == 1);
  
- 		/*
- 		 * If we have a toasted datum, forcibly detoast it here to avoid
- 		 * memory leakage inside the type's output routine.
- 		 */
- 		if (thisState->typisvarlena)
- 			attr = PointerGetDatum(PG_DETOAST_DATUM(origattr));
- 		else
- 			attr = origattr;
- 
  		outputbytes = SendFunctionCall(&thisState->finfo, attr);
- 		/* We assume the result will not have been toasted */
  		pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
  		pq_sendbytes(&buf, VARDATA(outputbytes),
  					 VARSIZE(outputbytes) - VARHDRSZ);
- 		pfree(outputbytes);
- 
- 		/* Clean up detoasted copy, if any */
- 		if (DatumGetPointer(attr) != DatumGetPointer(origattr))
- 			pfree(DatumGetPointer(attr));
  	}
  
  	pq_endmessage(&buf);
  }
--- 625,639 ----
  
  		Assert(thisState->format == 1);
  
  		outputbytes = SendFunctionCall(&thisState->finfo, attr);
  		pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
  		pq_sendbytes(&buf, VARDATA(outputbytes),
  					 VARSIZE(outputbytes) - VARHDRSZ);
  	}
  
  	pq_endmessage(&buf);
+ 
+ 	/* Return to caller's context, and flush row's temporary memory */
+ 	MemoryContextSwitchTo(oldcontext);
+ 	MemoryContextReset(myState->tmpcontext);
  }
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 2a1af63..62f4c18 100644
*** a/src/backend/bootstrap/bootstrap.c
--- b/src/backend/bootstrap/bootstrap.c
*************** InsertOneValue(char *value, int i)
*** 835,841 ****
  	Oid			typioparam;
  	Oid			typinput;
  	Oid			typoutput;
- 	char	   *prt;
  
  	AssertArg(i >= 0 && i < MAXATTR);
  
--- 835,840 ----
*************** InsertOneValue(char *value, int i)
*** 849,857 ****
  						  &typinput, &typoutput);
  
  	values[i] = OidInputFunctionCall(typinput, value, typioparam, -1);
! 	prt = OidOutputFunctionCall(typoutput, values[i]);
! 	elog(DEBUG4, "inserted -> %s", prt);
! 	pfree(prt);
  }
  
  /* ----------------
--- 848,861 ----
  						  &typinput, &typoutput);
  
  	values[i] = OidInputFunctionCall(typinput, value, typioparam, -1);
! 
! 	/*
! 	 * We use ereport not elog here so that parameters aren't evaluated unless
! 	 * the message is going to be printed, which generally it isn't
! 	 */
! 	ereport(DEBUG4,
! 			(errmsg_internal("inserted -> %s",
! 							 OidOutputFunctionCall(typoutput, values[i]))));
  }
  
  /* ----------------
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index d91a663..dd35212 100644
*** a/src/backend/executor/spi.c
--- b/src/backend/executor/spi.c
*************** SPI_fname(TupleDesc tupdesc, int fnumber
*** 869,877 ****
  char *
  SPI_getvalue(HeapTuple tuple, TupleDesc tupdesc, int fnumber)
  {
! 	char	   *result;
! 	Datum		origval,
! 				val;
  	bool		isnull;
  	Oid			typoid,
  				foutoid;
--- 869,875 ----
  char *
  SPI_getvalue(HeapTuple tuple, TupleDesc tupdesc, int fnumber)
  {
! 	Datum		val;
  	bool		isnull;
  	Oid			typoid,
  				foutoid;
*************** SPI_getvalue(HeapTuple tuple, TupleDesc 
*** 886,892 ****
  		return NULL;
  	}
  
! 	origval = heap_getattr(tuple, fnumber, tupdesc, &isnull);
  	if (isnull)
  		return NULL;
  
--- 884,890 ----
  		return NULL;
  	}
  
! 	val = heap_getattr(tuple, fnumber, tupdesc, &isnull);
  	if (isnull)
  		return NULL;
  
*************** SPI_getvalue(HeapTuple tuple, TupleDesc 
*** 897,918 ****
  
  	getTypeOutputInfo(typoid, &foutoid, &typisvarlena);
  
! 	/*
! 	 * If we have a toasted datum, forcibly detoast it here to avoid memory
! 	 * leakage inside the type's output routine.
! 	 */
! 	if (typisvarlena)
! 		val = PointerGetDatum(PG_DETOAST_DATUM(origval));
! 	else
! 		val = origval;
! 
! 	result = OidOutputFunctionCall(foutoid, val);
! 
! 	/* Clean up detoasted copy, if any */
! 	if (val != origval)
! 		pfree(DatumGetPointer(val));
! 
! 	return result;
  }
  
  Datum
--- 895,901 ----
  
  	getTypeOutputInfo(typoid, &foutoid, &typisvarlena);
  
! 	return OidOutputFunctionCall(foutoid, val);
  }
  
  Datum
diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c
index cb04a72..33647f7 100644
*** a/src/backend/utils/adt/rowtypes.c
--- b/src/backend/utils/adt/rowtypes.c
*************** record_out(PG_FUNCTION_ARGS)
*** 398,412 ****
  			column_info->column_type = column_type;
  		}
  
! 		/*
! 		 * If we have a toasted datum, forcibly detoast it here to avoid
! 		 * memory leakage inside the type's output routine.
! 		 */
! 		if (column_info->typisvarlena)
! 			attr = PointerGetDatum(PG_DETOAST_DATUM(values[i]));
! 		else
! 			attr = values[i];
! 
  		value = OutputFunctionCall(&column_info->proc, attr);
  
  		/* Detect whether we need double quotes for this value */
--- 398,404 ----
  			column_info->column_type = column_type;
  		}
  
! 		attr = values[i];
  		value = OutputFunctionCall(&column_info->proc, attr);
  
  		/* Detect whether we need double quotes for this value */
*************** record_out(PG_FUNCTION_ARGS)
*** 437,448 ****
  		}
  		if (nq)
  			appendStringInfoCharMacro(&buf, '"');
- 
- 		pfree(value);
- 
- 		/* Clean up detoasted copy, if any */
- 		if (DatumGetPointer(attr) != DatumGetPointer(values[i]))
- 			pfree(DatumGetPointer(attr));
  	}
  
  	appendStringInfoChar(&buf, ')');
--- 429,434 ----
*************** record_send(PG_FUNCTION_ARGS)
*** 759,785 ****
  			column_info->column_type = column_type;
  		}
  
! 		/*
! 		 * If we have a toasted datum, forcibly detoast it here to avoid
! 		 * memory leakage inside the type's output routine.
! 		 */
! 		if (column_info->typisvarlena)
! 			attr = PointerGetDatum(PG_DETOAST_DATUM(values[i]));
! 		else
! 			attr = values[i];
! 
  		outputbytes = SendFunctionCall(&column_info->proc, attr);
- 
- 		/* We assume the result will not have been toasted */
  		pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
  		pq_sendbytes(&buf, VARDATA(outputbytes),
  					 VARSIZE(outputbytes) - VARHDRSZ);
- 
- 		pfree(outputbytes);
- 
- 		/* Clean up detoasted copy, if any */
- 		if (DatumGetPointer(attr) != DatumGetPointer(values[i]))
- 			pfree(DatumGetPointer(attr));
  	}
  
  	pfree(values);
--- 745,755 ----
  			column_info->column_type = column_type;
  		}
  
! 		attr = values[i];
  		outputbytes = SendFunctionCall(&column_info->proc, attr);
  		pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
  		pq_sendbytes(&buf, VARDATA(outputbytes),
  					 VARSIZE(outputbytes) - VARHDRSZ);
  	}
  
  	pfree(values);
#6Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#5)
Re: [BUGS] BUG #8573: int4range memory consumption

On 2013-11-02 15:29:36 -0400, Tom Lane wrote:

Attached is a proposed patch for this. It fixes most of the functions
in printtup.c to use a per-row memory context. (I did not bother to
fix debugtup(), which is used only in standalone mode. If you're doing
queries large enough for mem leaks to be problematic in standalone mode,
you're nuts.)

FWIW, by that definition I have been nuts several time in the past -
without much choice since I was recovering data from a corrupted cluster
and the database couldn't be started normally. This now has gotten worse
(even in the backbranches) since debugtup won't clean up detoasted data
anymore.
But I guess the solution for that is to use COPY in those situations
which shouldn't have that problem and should work. Also, easier to parse ;)

My original thought had been to get rid of all pfree's of output function
results, so as to make the world safe for returning constant strings when
such functions find it appropriate. However, I ran into a showstopper
problem: SPI_getvalue(), which is little more than a wrapper around the
appropriate type output function, is documented as returning a pfree'able
string. Some though not all of the callers in core and contrib take the
hint and pfree the result, and I'm sure there are more in third-party
extensions. So if we want to allow output functions to return
non-palloc'd strings, it seems we have to either change SPI_getvalue()'s
API contract or insert a pstrdup() call in it. Neither of these is
attractive, mainly because the vast majority of output function results
would still be palloc'd even if we made aggressive use of the option not
to. That means that if we did the former, light testing wouldn't
necessarily show a problem if someone forgot to remove a pfree() after a
SPI_getvalue() call, and it also means that if we did the latter, the
pstrdup() would usually be a waste of cycles and space.

I guess one option for the future is to to pstrdup() in SPI_getvalue()
and adding a new function that tells whether the result can be freed or
not. Then callers that care can move over. Although I'd guess that many
of those caring will already use SPI_getbinval() manually.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#6)
Re: [BUGS] BUG #8573: int4range memory consumption

Andres Freund <andres@2ndquadrant.com> writes:

On 2013-11-02 15:29:36 -0400, Tom Lane wrote:

Attached is a proposed patch for this. It fixes most of the functions
in printtup.c to use a per-row memory context. (I did not bother to
fix debugtup(), which is used only in standalone mode. If you're doing
queries large enough for mem leaks to be problematic in standalone mode,
you're nuts.)

FWIW, by that definition I have been nuts several time in the past -
without much choice since I was recovering data from a corrupted cluster
and the database couldn't be started normally. This now has gotten worse
(even in the backbranches) since debugtup won't clean up detoasted data
anymore.
But I guess the solution for that is to use COPY in those situations
which shouldn't have that problem and should work. Also, easier to parse ;)

Considering the output from debugtup goes to stdout where it will be
intermixed with prompts etc, I'd have to think that COPY is a way better
solution for dealing with bulk data.

Really I'd like to see standalone mode, in its current form, go away
completely. I had a prototype patch for allowing psql and other clients
to interface to a standalone backend. I think getting that finished would
be a way more productive use of time than improving debugtup.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#7)
Re: [BUGS] BUG #8573: int4range memory consumption

On 2013-11-04 09:45:22 -0500, Tom Lane wrote:

Really I'd like to see standalone mode, in its current form, go away
completely. I had a prototype patch for allowing psql and other clients
to interface to a standalone backend. I think getting that finished would
be a way more productive use of time than improving debugtup.

+many

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#7)
Re: [BUGS] BUG #8573: int4range memory consumption

On Mon, Nov 4, 2013 at 8:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2013-11-02 15:29:36 -0400, Tom Lane wrote:

Attached is a proposed patch for this. It fixes most of the functions
in printtup.c to use a per-row memory context. (I did not bother to
fix debugtup(), which is used only in standalone mode. If you're doing
queries large enough for mem leaks to be problematic in standalone mode,
you're nuts.)

FWIW, by that definition I have been nuts several time in the past -
without much choice since I was recovering data from a corrupted cluster
and the database couldn't be started normally. This now has gotten worse
(even in the backbranches) since debugtup won't clean up detoasted data
anymore.
But I guess the solution for that is to use COPY in those situations
which shouldn't have that problem and should work. Also, easier to parse ;)

Considering the output from debugtup goes to stdout where it will be
intermixed with prompts etc, I'd have to think that COPY is a way better
solution for dealing with bulk data.

Really I'd like to see standalone mode, in its current form, go away
completely. I had a prototype patch for allowing psql and other clients
to interface to a standalone backend. I think getting that finished would
be a way more productive use of time than improving debugtup.

The last patch including Windows implementation was posted at below
link sometime back.
/messages/by-id/6C0B27F7206C9E4CA54AE035729E9C382853263C@szxeml509-mbs

If this is the right thing, I can rebase the patch and make it ready.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Kapila (#9)
Re: [BUGS] BUG #8573: int4range memory consumption

Amit Kapila <amit.kapila16@gmail.com> writes:

On Mon, Nov 4, 2013 at 8:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Really I'd like to see standalone mode, in its current form, go away
completely. I had a prototype patch for allowing psql and other clients
to interface to a standalone backend. I think getting that finished would
be a way more productive use of time than improving debugtup.

The last patch including Windows implementation was posted at below
link sometime back.
/messages/by-id/6C0B27F7206C9E4CA54AE035729E9C382853263C@szxeml509-mbs

If this is the right thing, I can rebase the patch and make it ready.

IIRC, the discussion stalled out because people had security concerns
and/or there wasn't consensus about how it should look at the user level.
There's probably not much point in polishing a patch until we have more
agreement about the high-level design.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#10)
Re: [BUGS] BUG #8573: int4range memory consumption

On Tue, Nov 5, 2013 at 12:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Kapila <amit.kapila16@gmail.com> writes:

On Mon, Nov 4, 2013 at 8:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Really I'd like to see standalone mode, in its current form, go away
completely. I had a prototype patch for allowing psql and other clients
to interface to a standalone backend. I think getting that finished would
be a way more productive use of time than improving debugtup.

The last patch including Windows implementation was posted at below
link sometime back.
/messages/by-id/6C0B27F7206C9E4CA54AE035729E9C382853263C@szxeml509-mbs

If this is the right thing, I can rebase the patch and make it ready.

IIRC, the discussion stalled out because people had security concerns
and/or there wasn't consensus about how it should look at the user level.
There's probably not much point in polishing a patch until we have more
agreement about the high-level design.

Okay. However +1 for working on that idea as lot of people have shown
interest in it.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#11)
Re: [BUGS] BUG #8573: int4range memory consumption

On Mon, Nov 4, 2013 at 10:51 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Nov 5, 2013 at 12:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Kapila <amit.kapila16@gmail.com> writes:

On Mon, Nov 4, 2013 at 8:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Really I'd like to see standalone mode, in its current form, go away
completely. I had a prototype patch for allowing psql and other clients
to interface to a standalone backend. I think getting that finished would
be a way more productive use of time than improving debugtup.

The last patch including Windows implementation was posted at below
link sometime back.
/messages/by-id/6C0B27F7206C9E4CA54AE035729E9C382853263C@szxeml509-mbs

If this is the right thing, I can rebase the patch and make it ready.

IIRC, the discussion stalled out because people had security concerns
and/or there wasn't consensus about how it should look at the user level.
There's probably not much point in polishing a patch until we have more
agreement about the high-level design.

Okay. However +1 for working on that idea as lot of people have shown
interest in it.

Agreed. I remember the sticking point as being mostly, like, whether
we should just make it work, or whether we had to do a huge amount of
additional work to turn it into something quite different from what
you had in mind. That question answers itself.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers