[PATCH] dtrace probes for memory manager

Started by Zdenek Kotalaabout 16 years ago38 messages
#1Zdenek Kotala
Zdenek.Kotala@Sun.COM
1 attachment(s)

Attached patch contains new dtrace probes for memory management. Main
purpose is to analyze memory footprint - for example how many memory
needs transaction, peak memory per context, when memory block is reused
or when it is allocate by malloc and so on.

There are three groups of probes:

1) general memory context operation:

mcxt-alloc
mcxt-create
mcxt-delete
mcxt-free
mcxt-realloc
mcxt-reset

2) AllocSet operations (called from mcxt)

aset-alloc
aset-delete
aset-free
aset-realloc
aset-reset

3) AllocSet Block operations.

aset-block-free
aset-block-new
aset-block-realloc
aset-block-reset

Zdenek

Attachments:

dtrace_mem.patchtext/x-patch; CHARSET=US-ASCII; name=dtrace_mem.patchDownload
diff -r 68b8827f4738 src/backend/utils/mmgr/aset.c
--- a/src/backend/utils/mmgr/aset.c	Fri Nov 13 11:17:04 2009 +0000
+++ b/src/backend/utils/mmgr/aset.c	Fri Nov 13 21:24:56 2009 +0100
@@ -64,6 +64,7 @@
 
 #include "postgres.h"
 
+#include "pg_trace.h"
 #include "utils/memutils.h"
 
 /* Define this to detail debug alloc information */
@@ -463,6 +464,8 @@
 
 	AssertArg(AllocSetIsValid(set));
 
+	TRACE_POSTGRESQL_ASET_RESET(context); 
+
 	/* Nothing to do if no pallocs since startup or last reset */
 	if (set->isReset)
 		return;
@@ -495,6 +498,8 @@
 #endif
 			block->freeptr = datastart;
 			block->next = NULL;
+			TRACE_POSTGRESQL_ASET_BLOCK_RESET(context, block,  
+				block->endptr - ((char *) block));
 		}
 		else
 		{
@@ -503,6 +508,8 @@
 			/* Wipe freed memory for debugging purposes */
 			memset(block, 0x7F, block->freeptr - ((char *) block));
 #endif
+			TRACE_POSTGRESQL_ASET_BLOCK_FREE(context, block,  
+				block->endptr - ((char *) block));
 			free(block);
 		}
 		block = next;
@@ -529,6 +536,7 @@
 	AllocBlock	block = set->blocks;
 
 	AssertArg(AllocSetIsValid(set));
+	TRACE_POSTGRESQL_ASET_DELETE(context); 
 
 #ifdef MEMORY_CONTEXT_CHECKING
 	/* Check for corruption and leaks before freeing */
@@ -548,6 +556,8 @@
 		/* Wipe freed memory for debugging purposes */
 		memset(block, 0x7F, block->freeptr - ((char *) block));
 #endif
+		TRACE_POSTGRESQL_ASET_BLOCK_FREE(context, block,  
+			block->endptr - ((char *) block));
 		free(block);
 		block = next;
 	}
@@ -570,6 +580,7 @@
 
 	AssertArg(AllocSetIsValid(set));
 
+	TRACE_POSTGRESQL_ASET_ALLOC(context, size);
 	/*
 	 * If requested size exceeds maximum for chunks, allocate an entire block
 	 * for this request.
@@ -623,6 +634,7 @@
 		set->isReset = false;
 
 		AllocAllocInfo(set, chunk);
+		TRACE_POSTGRESQL_ASET_BLOCK_NEW(context, block, blksize);
 		return AllocChunkGetPointer(chunk);
 	}
 
@@ -771,6 +783,7 @@
 					 errdetail("Failed on request of size %lu.",
 							   (unsigned long) size)));
 		}
+		TRACE_POSTGRESQL_ASET_BLOCK_NEW(context, block, blksize);
 
 		block->aset = set;
 		block->freeptr = ((char *) block) + ALLOC_BLOCKHDRSZ;
@@ -830,6 +843,7 @@
 	AllocChunk	chunk = AllocPointerGetChunk(pointer);
 
 	AllocFreeInfo(set, chunk);
+	TRACE_POSTGRESQL_ASET_FREE(context, pointer);
 
 #ifdef MEMORY_CONTEXT_CHECKING
 	/* Test for someone scribbling on unused space in chunk */
@@ -866,6 +880,9 @@
 			set->blocks = block->next;
 		else
 			prevblock->next = block->next;
+
+		TRACE_POSTGRESQL_ASET_BLOCK_FREE(context, block, 
+			block->endptr - ((char *) block));
 #ifdef CLOBBER_FREED_MEMORY
 		/* Wipe freed memory for debugging purposes */
 		memset(block, 0x7F, block->freeptr - ((char *) block));
@@ -905,6 +922,8 @@
 	AllocChunk	chunk = AllocPointerGetChunk(pointer);
 	Size		oldsize = chunk->size;
 
+	TRACE_POSTGRESQL_ASET_REALLOC(context, pointer, size, oldsize);
+
 #ifdef MEMORY_CONTEXT_CHECKING
 	/* Test for someone scribbling on unused space in chunk */
 	if (chunk->requested_size < oldsize)
@@ -948,6 +967,7 @@
 		 */
 		AllocBlock	block = set->blocks;
 		AllocBlock	prevblock = NULL;
+		AllocBlock  newblock;
 		Size		chksize;
 		Size		blksize;
 
@@ -967,7 +987,11 @@
 		/* Do the realloc */
 		chksize = MAXALIGN(size);
 		blksize = chksize + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ;
-		block = (AllocBlock) realloc(block, blksize);
+		newblock = (AllocBlock) realloc(block, blksize);
+
+		TRACE_POSTGRESQL_ASET_BLOCK_REALLOC(context, block, newblock, blksize);
+		block = newblock;
+
 		if (block == NULL)
 		{
 			MemoryContextStats(TopMemoryContext);
diff -r 68b8827f4738 src/backend/utils/mmgr/mcxt.c
--- a/src/backend/utils/mmgr/mcxt.c	Fri Nov 13 11:17:04 2009 +0000
+++ b/src/backend/utils/mmgr/mcxt.c	Fri Nov 13 21:24:56 2009 +0100
@@ -22,7 +22,7 @@
 #include "postgres.h"
 
 #include "utils/memutils.h"
-
+#include "pg_trace.h"
 
 /*****************************************************************************
  *	  GLOBAL MEMORY															 *
@@ -123,6 +123,8 @@
 {
 	AssertArg(MemoryContextIsValid(context));
 
+	TRACE_POSTGRESQL_MCXT_RESET(context->name, context);	
+
 	/* save a function call in common case where there are no children */
 	if (context->firstchild != NULL)
 		MemoryContextResetChildren(context);
@@ -168,6 +170,8 @@
 
 	MemoryContextDeleteChildren(context);
 
+	TRACE_POSTGRESQL_MCXT_DELETE(context->name, context, context->parent);	
+
 	/*
 	 * We delink the context from its parent before deleting it, so that if
 	 * there's an error we won't have deleted/busted contexts still attached
@@ -490,6 +494,8 @@
 		parent->firstchild = node;
 	}
 
+	TRACE_POSTGRESQL_MCXT_CREATE(node->name, node, parent);
+
 	/* Return to type-specific creation routine to finish up */
 	return node;
 }
@@ -504,13 +510,24 @@
 void *
 MemoryContextAlloc(MemoryContext context, Size size)
 {
+	void	   *ret;
+	StandardChunkHeader *header;
+
 	AssertArg(MemoryContextIsValid(context));
 
 	if (!AllocSizeIsValid(size))
 		elog(ERROR, "invalid memory alloc request size %lu",
 			 (unsigned long) size);
+	
+	ret = (*context->methods->alloc) (context, size);
+	header = (StandardChunkHeader *)
+		((char *) ret - STANDARDCHUNKHEADERSIZE);
 
-	return (*context->methods->alloc) (context, size);
+	Assert(header->context == context);
+	Assert(header->size >= size);
+	TRACE_POSTGRESQL_MCXT_ALLOC(context->name, context, size, header->size, false);
+
+	return ret;
 }
 
 /*
@@ -524,6 +541,7 @@
 MemoryContextAllocZero(MemoryContext context, Size size)
 {
 	void	   *ret;
+	StandardChunkHeader *header;
 
 	AssertArg(MemoryContextIsValid(context));
 
@@ -531,7 +549,12 @@
 		elog(ERROR, "invalid memory alloc request size %lu",
 			 (unsigned long) size);
 
+
 	ret = (*context->methods->alloc) (context, size);
+	header = (StandardChunkHeader *)
+		((char *) ret - STANDARDCHUNKHEADERSIZE);
+
+	TRACE_POSTGRESQL_MCXT_ALLOC(context->name, context, size, header->size, true);
 
 	MemSetAligned(ret, 0, size);
 
@@ -549,6 +572,7 @@
 MemoryContextAllocZeroAligned(MemoryContext context, Size size)
 {
 	void	   *ret;
+	StandardChunkHeader *header;
 
 	AssertArg(MemoryContextIsValid(context));
 
@@ -558,6 +582,11 @@
 
 	ret = (*context->methods->alloc) (context, size);
 
+	header = (StandardChunkHeader *)
+		((char *) ret - STANDARDCHUNKHEADERSIZE);
+
+	TRACE_POSTGRESQL_MCXT_ALLOC(context->name, context, size, header->size, true);
+
 	MemSetLoop(ret, 0, size);
 
 	return ret;
@@ -588,6 +617,8 @@
 
 	AssertArg(MemoryContextIsValid(header->context));
 
+	TRACE_POSTGRESQL_MCXT_FREE(header->context->name, header->context, header->size);
+
 	(*header->context->methods->free_p) (header->context, pointer);
 }
 
@@ -620,6 +651,8 @@
 		elog(ERROR, "invalid memory alloc request size %lu",
 			 (unsigned long) size);
 
+	TRACE_POSTGRESQL_MCXT_REALLOC(header->context->name, header->context, header->size, size);
+
 	return (*header->context->methods->realloc) (header->context,
 												 pointer, size);
 }
diff -r 68b8827f4738 src/backend/utils/probes.d
--- a/src/backend/utils/probes.d	Fri Nov 13 11:17:04 2009 +0000
+++ b/src/backend/utils/probes.d	Fri Nov 13 21:24:56 2009 +0100
@@ -90,4 +90,21 @@
 	probe xlog__switch();
 	probe wal__buffer__write__dirty__start();
 	probe wal__buffer__write__dirty__done();
+
+	probe mcxt__create(char *, void *, void *);
+	probe mcxt__delete(char *, void *, void *);
+	probe mcxt__alloc(char *, void *, size_t, size_t, bool);
+	probe mcxt__reset(char *, void *);
+	probe mcxt__realloc(char *, void *, size_t, size_t);
+	probe mcxt__free(char *, void *, size_t);
+
+	probe aset__delete(void *);
+	probe aset__reset(void *);
+	probe aset__alloc(void *, size_t);
+	probe aset__realloc(void *, void *, size_t, size_t);
+	probe aset__free(void *, void *);
+	probe aset__block__reset(void *, void *, size_t);
+	probe aset__block__free(void *, void *, size_t);
+	probe aset__block__new(void *, void *, size_t);
+	probe aset__block__realloc(void *, void *, void *, size_t);
 };
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zdenek Kotala (#1)
Re: [PATCH] dtrace probes for memory manager

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Attached patch contains new dtrace probes for memory management.

This is a bad idea and I want to reject it outright. No ordinary user
is really going to care about those details, and palloc is a
sufficiently hot hot-spot that even the allegedly negligible overhead
of an inactive dtrace probe is going to cost us.

If this goes in, I will disable dtrace support in Red Hat's builds,
and I rather imagine that other packagers will react similarly.

regards, tom lane

#3Joshua D. Drake
jd@commandprompt.com
In reply to: Tom Lane (#2)
Re: [PATCH] dtrace probes for memory manager

On Fri, 2009-11-13 at 16:06 -0500, Tom Lane wrote:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Attached patch contains new dtrace probes for memory management.

This is a bad idea and I want to reject it outright. No ordinary user
is really going to care about those details, and palloc is a
sufficiently hot hot-spot that even the allegedly negligible overhead
of an inactive dtrace probe is going to cost us.

No ordinary user is going to use dtrace at all.

If this goes in, I will disable dtrace support in Red Hat's builds,
and I rather imagine that other packagers will react similarly.

Is it possible to have a set of probes that would only be enabled with
say, --enable-debug compile time option? I could certainly see the
benefit to these probes for profiling but that is such as specific use
that it seems to need a specific flag anyway.

Sincerely,

Joshua D. Drake

regards, tom lane

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
If the world pushes look it in the eye and GRR. Then push back harder. - Salamander

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joshua D. Drake (#3)
Re: [PATCH] dtrace probes for memory manager

"Joshua D. Drake" <jd@commandprompt.com> writes:

On Fri, 2009-11-13 at 16:06 -0500, Tom Lane wrote:

This is a bad idea and I want to reject it outright. No ordinary user
is really going to care about those details, and palloc is a
sufficiently hot hot-spot that even the allegedly negligible overhead
of an inactive dtrace probe is going to cost us.

No ordinary user is going to use dtrace at all.

Right, but *those probes are going to cost him performance anyway*
if he's using a dtrace-enabled build. Probes associated with I/O
calls might be negligible, probes in palloc are not going to be.

regards, tom lane

#5Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Joshua D. Drake (#3)
Re: [PATCH] dtrace probes for memory manager

"Joshua D. Drake" <jd@commandprompt.com> wrote:

Is it possible to have a set of probes that would only be enabled
with say, --enable-debug compile time option?

But we routinely build with that for normal production use so that if
we get a core dump or need to backtrace a problem process, we will get
meaningful results. Please don't create a performance penalty for
compiling with debug info. (We'd probably wind up disabling it....)

-Kevin

#6Joshua D. Drake
jd@commandprompt.com
In reply to: Kevin Grittner (#5)
Re: [PATCH] dtrace probes for memory manager

On Fri, 2009-11-13 at 15:22 -0600, Kevin Grittner wrote:

"Joshua D. Drake" <jd@commandprompt.com> wrote:

Is it possible to have a set of probes that would only be enabled
with say, --enable-debug compile time option?

But we routinely build with that for normal production use so that if
we get a core dump or need to backtrace a problem process, we will get
meaningful results. Please don't create a performance penalty for
compiling with debug info. (We'd probably wind up disabling it....)

Good point but I still say that the probes are useful. I am not sure we
have a good compile time flag for it right now but it seems that would
be the way to do it.

Joshua D. Drake

-Kevin

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
If the world pushes look it in the eye and GRR. Then push back harder. - Salamander

#7Alvaro Herrera
alvherre@commandprompt.com
In reply to: Zdenek Kotala (#1)
Re: [PATCH] dtrace probes for memory manager

Zdenek Kotala wrote:

Attached patch contains new dtrace probes for memory management. Main
purpose is to analyze memory footprint - for example how many memory
needs transaction, peak memory per context, when memory block is reused
or when it is allocate by malloc and so on.

Having had to instrument these to figure out some problems, I'd give
this patch a +1. However, the performance argument is compelling. As a
compromise, maybe we could have a #define that needs to be turned on at
compile time to enable these probes; so a regular dtrace-enabled build
would not have them, but if you really needed to analyze memory
allocations, you could recompile to turn them on.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#8Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Tom Lane (#2)
Re: [PATCH] dtrace probes for memory manager

Tom Lane píše v pá 13. 11. 2009 v 16:06 -0500:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Attached patch contains new dtrace probes for memory management.

This is a bad idea and I want to reject it outright. No ordinary user
is really going to care about those details, and palloc is a
sufficiently hot hot-spot that even the allegedly negligible overhead
of an inactive dtrace probe is going to cost us.

I don't think that impact is too big here. User space DTrace probes are
implemented like function call. When probe is inactive call is replaced
by nop. Only what stay in code is arguments preparations.

there is asm code 32bit intel from palloc:

MemoryContextAlloc+0xae: pushl $0x0
MemoryContextAlloc+0xb0: pushl -0x8(%ebx)
MemoryContextAlloc+0xb3: pushl 0xc(%ebp)
MemoryContextAlloc+0xb6: movl 0x8(%ebp),%eax
MemoryContextAlloc+0xb9: pushl %eax
MemoryContextAlloc+0xba: pushl 0x14(%eax)
MemoryContextAlloc+0xbd: nop
MemoryContextAlloc+0xbe: nop
MemoryContextAlloc+0xbf: nop
MemoryContextAlloc+0xc0: nop
MemoryContextAlloc+0xc1: nop
MemoryContextAlloc+0xc2: addl $0x20,%esp

You can see is that overhead depends on number of argument. I used five
arguments now but two can be enough. Only dtrace script will be
complicated in some cases after that.

By my opinion if you compare whole palloc code path, Dtrace probes
overhead is minimal.

Zdenek

#9Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Alvaro Herrera (#7)
Re: [PATCH] dtrace probes for memory manager

Alvaro Herrera píše v pá 13. 11. 2009 v 18:34 -0300:

Zdenek Kotala wrote:

Attached patch contains new dtrace probes for memory management. Main
purpose is to analyze memory footprint - for example how many memory
needs transaction, peak memory per context, when memory block is reused
or when it is allocate by malloc and so on.

Having had to instrument these to figure out some problems, I'd give
this patch a +1. However, the performance argument is compelling. As a
compromise, maybe we could have a #define that needs to be turned on at
compile time to enable these probes; so a regular dtrace-enabled build
would not have them, but if you really needed to analyze memory
allocations, you could recompile to turn them on.

But point of dtrace probes is that they are here without
recompilation :(. Do we have any test which we could use for performance
penalty testing? I don't think that overhead is significant.

Zdenek

#10Robert Haas
robertmhaas@gmail.com
In reply to: Zdenek Kotala (#9)
Re: [PATCH] dtrace probes for memory manager

On Fri, Nov 13, 2009 at 4:41 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:

Alvaro Herrera píše v pá 13. 11. 2009 v 18:34 -0300:

Zdenek Kotala wrote:

Attached patch contains new dtrace probes for memory management. Main
purpose is to analyze memory footprint - for example how many memory
needs transaction, peak memory per context, when memory block is reused
or when it is allocate by malloc and so on.

Having had to instrument these to figure out some problems, I'd give
this patch a +1.  However, the performance argument is compelling.  As a
compromise, maybe we could have a #define that needs to be turned on at
compile time to enable these probes; so a regular dtrace-enabled build
would not have them, but if you really needed to analyze memory
allocations, you could recompile to turn them on.

But point of dtrace probes is that they are here without
recompilation :(. Do we have any test which we could use for performance
penalty testing? I don't think that overhead is significant.

Don't think. Benchmark. :-)

(If you can measure it at all, it's too much, at least IMHO.)

...Robert

#11Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Alvaro Herrera (#7)
Re: [PATCH] dtrace probes for memory manager

Alvaro Herrera píše v pá 13. 11. 2009 v 18:34 -0300:

Zdenek Kotala wrote:

Attached patch contains new dtrace probes for memory management. Main
purpose is to analyze memory footprint - for example how many memory
needs transaction, peak memory per context, when memory block is reused
or when it is allocate by malloc and so on.

Having had to instrument these to figure out some problems, I'd give
this patch a +1. However, the performance argument is compelling. As a
compromise, maybe we could have a #define that needs to be turned on at
compile time to enable these probes; so a regular dtrace-enabled build
would not have them, but if you really needed to analyze memory
allocations, you could recompile to turn them on.

The another idea is to have two AllocSet functions set. One without and
one with dtrace probes. And switch it by GUC flag for example. From mcxt
I create and delete context is important, rest can be taken from alloc
set probes.

Zdenek

#12Robert Haas
robertmhaas@gmail.com
In reply to: Zdenek Kotala (#11)
Re: [PATCH] dtrace probes for memory manager

On Fri, Nov 13, 2009 at 5:20 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:

Alvaro Herrera píše v pá 13. 11. 2009 v 18:34 -0300:

Zdenek Kotala wrote:

Attached patch contains new dtrace probes for memory management. Main
purpose is to analyze memory footprint - for example how many memory
needs transaction, peak memory per context, when memory block is reused
or when it is allocate by malloc and so on.

Having had to instrument these to figure out some problems, I'd give
this patch a +1.  However, the performance argument is compelling.  As a
compromise, maybe we could have a #define that needs to be turned on at
compile time to enable these probes; so a regular dtrace-enabled build
would not have them, but if you really needed to analyze memory
allocations, you could recompile to turn them on.

The another idea is to have two AllocSet functions set. One without and
one with dtrace probes. And switch it by GUC flag for example. From mcxt
I create and delete context is important, rest can be taken from alloc
set probes.

I've always thought it was odd that we had a function-dispatch
interface in such a performance-critical path. But if we really want
to keep it around this might not be a bad thing to use it for.

...Robert

#13Bernd Helmle
mailings@oopsware.de
In reply to: Robert Haas (#10)
Re: [PATCH] dtrace probes for memory manager

--On 13. November 2009 17:16:15 -0500 Robert Haas <robertmhaas@gmail.com>
wrote:

Don't think. Benchmark. :-)

(If you can measure it at all, it's too much, at least IMHO.)

I've tried to benchmark this now on my (fairly slow compared to server
hardware) MacBook and see some negative trend for those memory probes in
pgbench. Running dozens of rounds with pgbench (scale 150, 10 clients /
1000 transactions) gives the following numbers (untuned PostgreSQL config):

AVG(tps) with dtrace memory probes: 31.62 tps
AVG(tps) without dtrace memory probes: 38.24 tps

So there seems to be a minor slowdown at least on *my* machine. However, it
would be fine if someone can prove these numbers..

What do you guys think, what other tests/parameters can be invoked to test
for an impact ?

--
Thanks

Bernd

#14Greg Smith
greg@2ndquadrant.com
In reply to: Bernd Helmle (#13)
Re: [PATCH] dtrace probes for memory manager

Bernd Helmle wrote:

I've tried to benchmark this now on my (fairly slow compared to server
hardware) MacBook and see some negative trend for those memory probes
in pgbench. Running dozens of rounds with pgbench (scale 150, 10
clients / 1000 transactions)

That makes for a 5.5 minute test, which is unfortunately close to the
default checkpoint period. You're going to want a pgbench configuration
that's doing thousands of operations per second to measure this overhead
I think, and let it run a bit longer. The difference you're seeing
could easily be just that that the "with probes" result had more
checkpoints happen during testing than the other one--if it got even a
single checkpoint more, that could be enough to throw results off using
the default test and such low TPS results.

Try this instead, which will give you a test where checkpoints have a
minimal impact, but lots of memory will be thrown around:

pgbench -i -s 10 <db>
pgbench -S -c 10 -T 600 <db>

That will do just SELECT statements against a much smaller database
(about 160MB) and will run for 10 minutes each time.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com

#15Bernd Helmle
mailings@oopsware.de
In reply to: Greg Smith (#14)
Re: [PATCH] dtrace probes for memory manager

--On 8. Dezember 2009 15:51:52 -0500 Greg Smith <greg@2ndquadrant.com>
wrote:

Try this instead, which will give you a test where checkpoints have a
minimal impact, but lots of memory will be thrown around:

pgbench -i -s 10 <db>
pgbench -S -c 10 -T 600 <db>

Thanks for the input, will try....

--
Thanks

Bernd

#16Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Bernd Helmle (#15)
2 attachment(s)
Re: [PATCH] dtrace probes for memory manager

Bernd Helmle píše v út 08. 12. 2009 v 22:06 +0100:

--On 8. Dezember 2009 15:51:52 -0500 Greg Smith <greg@2ndquadrant.com>
wrote:

Try this instead, which will give you a test where checkpoints have a
minimal impact, but lots of memory will be thrown around:

pgbench -i -s 10 <db>
pgbench -S -c 10 -T 600 <db>

Thanks for the input, will try....

I modified probes to reduce overhead. Prototype patch is attached. Main
point is to remove mcxt_alloc probe and keep only aset_alloc. I did also
some testing with interesting results. At first I prepare special C
store function (attached) which do only allocation and deallocation and
I measured how long it takes:

On 32bit the memory allocation is slow down 8.4% and on 64bit it is
only 4.6%. Good to mention that I call palloc and pfree but in standard
behavior pfree is not much used and memory is freed when context is
destroyed. It means that we should think about 4.2% and 2.3% instead.

But in normal situation database does also other thing and palloc is
only one part of code path. It is why I run second test and use sun
studio profiling tools (collect/analyzer) to determine how much CPU
ticks cost the probes during pg_bench run. And results are much better.
AllocSet alloc function takes about 4-5% and probes assembler code takes
0.1-0.2% on 64bit. I did not test 32bit but my expectation is that it
should be about 0.3-0.4%.

Zdenek

Attachments:

dtrace_test.patchtext/x-patch; CHARSET=US-ASCII; name=dtrace_test.patchDownload
diff -r 68b8827f4738 src/backend/utils/mmgr/aset.c
--- a/src/backend/utils/mmgr/aset.c	Fri Nov 13 11:17:04 2009 +0000
+++ b/src/backend/utils/mmgr/aset.c	Wed Dec 09 14:41:34 2009 +0100
@@ -64,6 +64,7 @@
 
 #include "postgres.h"
 
+#include "pg_trace.h"
 #include "utils/memutils.h"
 
 /* Define this to detail debug alloc information */
@@ -463,6 +464,8 @@
 
 	AssertArg(AllocSetIsValid(set));
 
+	TRACE_POSTGRESQL_ASET_RESET(context); 
+
 	/* Nothing to do if no pallocs since startup or last reset */
 	if (set->isReset)
 		return;
@@ -495,6 +498,8 @@
 #endif
 			block->freeptr = datastart;
 			block->next = NULL;
+			TRACE_POSTGRESQL_ASET_BLOCK_RESET(context, block,  
+				block->endptr - ((char *) block));
 		}
 		else
 		{
@@ -503,6 +508,8 @@
 			/* Wipe freed memory for debugging purposes */
 			memset(block, 0x7F, block->freeptr - ((char *) block));
 #endif
+			TRACE_POSTGRESQL_ASET_BLOCK_FREE(context, block,  
+				block->endptr - ((char *) block));
 			free(block);
 		}
 		block = next;
@@ -529,6 +536,7 @@
 	AllocBlock	block = set->blocks;
 
 	AssertArg(AllocSetIsValid(set));
+	TRACE_POSTGRESQL_ASET_DELETE(context); 
 
 #ifdef MEMORY_CONTEXT_CHECKING
 	/* Check for corruption and leaks before freeing */
@@ -548,6 +556,8 @@
 		/* Wipe freed memory for debugging purposes */
 		memset(block, 0x7F, block->freeptr - ((char *) block));
 #endif
+		TRACE_POSTGRESQL_ASET_BLOCK_FREE(context, block,  
+			block->endptr - ((char *) block));
 		free(block);
 		block = next;
 	}
@@ -570,6 +580,7 @@
 
 	AssertArg(AllocSetIsValid(set));
 
+	TRACE_POSTGRESQL_ASET_ALLOC(context, size);
 	/*
 	 * If requested size exceeds maximum for chunks, allocate an entire block
 	 * for this request.
@@ -623,6 +634,7 @@
 		set->isReset = false;
 
 		AllocAllocInfo(set, chunk);
+		TRACE_POSTGRESQL_ASET_BLOCK_NEW(context, block, blksize);
 		return AllocChunkGetPointer(chunk);
 	}
 
@@ -771,6 +783,7 @@
 					 errdetail("Failed on request of size %lu.",
 							   (unsigned long) size)));
 		}
+		TRACE_POSTGRESQL_ASET_BLOCK_NEW(context, block, blksize);
 
 		block->aset = set;
 		block->freeptr = ((char *) block) + ALLOC_BLOCKHDRSZ;
@@ -830,6 +843,7 @@
 	AllocChunk	chunk = AllocPointerGetChunk(pointer);
 
 	AllocFreeInfo(set, chunk);
+	TRACE_POSTGRESQL_ASET_FREE(context, pointer);
 
 #ifdef MEMORY_CONTEXT_CHECKING
 	/* Test for someone scribbling on unused space in chunk */
@@ -866,6 +880,9 @@
 			set->blocks = block->next;
 		else
 			prevblock->next = block->next;
+
+		TRACE_POSTGRESQL_ASET_BLOCK_FREE(context, block, 
+			block->endptr - ((char *) block));
 #ifdef CLOBBER_FREED_MEMORY
 		/* Wipe freed memory for debugging purposes */
 		memset(block, 0x7F, block->freeptr - ((char *) block));
@@ -905,6 +922,8 @@
 	AllocChunk	chunk = AllocPointerGetChunk(pointer);
 	Size		oldsize = chunk->size;
 
+	TRACE_POSTGRESQL_ASET_REALLOC(context, pointer, size, oldsize);
+
 #ifdef MEMORY_CONTEXT_CHECKING
 	/* Test for someone scribbling on unused space in chunk */
 	if (chunk->requested_size < oldsize)
@@ -948,6 +967,7 @@
 		 */
 		AllocBlock	block = set->blocks;
 		AllocBlock	prevblock = NULL;
+		AllocBlock  newblock;
 		Size		chksize;
 		Size		blksize;
 
@@ -967,7 +987,11 @@
 		/* Do the realloc */
 		chksize = MAXALIGN(size);
 		blksize = chksize + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ;
-		block = (AllocBlock) realloc(block, blksize);
+		newblock = (AllocBlock) realloc(block, blksize);
+
+		TRACE_POSTGRESQL_ASET_BLOCK_REALLOC(context, block, newblock, blksize);
+		block = newblock;
+
 		if (block == NULL)
 		{
 			MemoryContextStats(TopMemoryContext);
diff -r 68b8827f4738 src/backend/utils/mmgr/mcxt.c
--- a/src/backend/utils/mmgr/mcxt.c	Fri Nov 13 11:17:04 2009 +0000
+++ b/src/backend/utils/mmgr/mcxt.c	Wed Dec 09 14:41:34 2009 +0100
@@ -22,7 +22,7 @@
 #include "postgres.h"
 
 #include "utils/memutils.h"
-
+#include "pg_trace.h"
 
 /*****************************************************************************
  *	  GLOBAL MEMORY															 *
@@ -123,6 +123,8 @@
 {
 	AssertArg(MemoryContextIsValid(context));
 
+	TRACE_POSTGRESQL_MCXT_RESET(context->name, context);	
+
 	/* save a function call in common case where there are no children */
 	if (context->firstchild != NULL)
 		MemoryContextResetChildren(context);
@@ -168,6 +170,8 @@
 
 	MemoryContextDeleteChildren(context);
 
+	TRACE_POSTGRESQL_MCXT_DELETE(context->name, context, context->parent);	
+
 	/*
 	 * We delink the context from its parent before deleting it, so that if
 	 * there's an error we won't have deleted/busted contexts still attached
@@ -490,6 +494,8 @@
 		parent->firstchild = node;
 	}
 
+	TRACE_POSTGRESQL_MCXT_CREATE(node->name, node, parent);
+
 	/* Return to type-specific creation routine to finish up */
 	return node;
 }
@@ -510,7 +516,8 @@
 		elog(ERROR, "invalid memory alloc request size %lu",
 			 (unsigned long) size);
 
-	return (*context->methods->alloc) (context, size);
+//	TRACE_POSTGRESQL_MCXT_ALLOC(context, size);
+	return ((*context->methods->alloc) (context, size));
 }
 
 /*
@@ -524,6 +531,7 @@
 MemoryContextAllocZero(MemoryContext context, Size size)
 {
 	void	   *ret;
+	StandardChunkHeader *header;
 
 	AssertArg(MemoryContextIsValid(context));
 
@@ -531,7 +539,12 @@
 		elog(ERROR, "invalid memory alloc request size %lu",
 			 (unsigned long) size);
 
+
 	ret = (*context->methods->alloc) (context, size);
+	header = (StandardChunkHeader *)
+		((char *) ret - STANDARDCHUNKHEADERSIZE);
+
+//	TRACE_POSTGRESQL_MCXT_ALLOC(context->name, context, size, header->size, true);
 
 	MemSetAligned(ret, 0, size);
 
@@ -549,6 +562,7 @@
 MemoryContextAllocZeroAligned(MemoryContext context, Size size)
 {
 	void	   *ret;
+	StandardChunkHeader *header;
 
 	AssertArg(MemoryContextIsValid(context));
 
@@ -558,6 +572,11 @@
 
 	ret = (*context->methods->alloc) (context, size);
 
+	header = (StandardChunkHeader *)
+		((char *) ret - STANDARDCHUNKHEADERSIZE);
+
+//	TRACE_POSTGRESQL_MCXT_ALLOC(context->name, context, size, header->size, true);
+
 	MemSetLoop(ret, 0, size);
 
 	return ret;
@@ -588,6 +607,8 @@
 
 	AssertArg(MemoryContextIsValid(header->context));
 
+//	TRACE_POSTGRESQL_MCXT_FREE(header->context, pointer);
+
 	(*header->context->methods->free_p) (header->context, pointer);
 }
 
@@ -620,6 +641,8 @@
 		elog(ERROR, "invalid memory alloc request size %lu",
 			 (unsigned long) size);
 
+	TRACE_POSTGRESQL_MCXT_REALLOC(header->context->name, header->context, header->size, size);
+
 	return (*header->context->methods->realloc) (header->context,
 												 pointer, size);
 }
diff -r 68b8827f4738 src/backend/utils/probes.d
--- a/src/backend/utils/probes.d	Fri Nov 13 11:17:04 2009 +0000
+++ b/src/backend/utils/probes.d	Wed Dec 09 14:41:34 2009 +0100
@@ -90,4 +90,21 @@
 	probe xlog__switch();
 	probe wal__buffer__write__dirty__start();
 	probe wal__buffer__write__dirty__done();
+
+	probe mcxt__create(char *, void *, void *);
+	probe mcxt__delete(char *, void *, void *);
+/*	probe mcxt__alloc(void *, size_t);
+	probe mcxt__free(void *, void *); */
+	probe mcxt__reset(char *, void *);
+	probe mcxt__realloc(char *, void *, size_t, size_t);
+
+	probe aset__delete(void *);
+	probe aset__reset(void *);
+	probe aset__alloc(void *, size_t);
+	probe aset__realloc(void *, void *, size_t, size_t);
+	probe aset__free(void *, void *);
+	probe aset__block__reset(void *, void *, size_t);
+	probe aset__block__free(void *, void *, size_t);
+	probe aset__block__new(void *, void *, size_t);
+	probe aset__block__realloc(void *, void *, void *, size_t);
 };
memtest.ctext/x-csrc; CHARSET=US-ASCII; name=memtest.cDownload
#17Noname
fche@redhat.com
In reply to: Zdenek Kotala (#16)
Re: [PATCH] dtrace probes for memory manager

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

[...]
+	header = (StandardChunkHeader *)
+		((char *) ret - STANDARDCHUNKHEADERSIZE);
+
+//	TRACE_POSTGRESQL_MCXT_ALLOC(context->name, context, size, header->size, true);
+
[...]

If the dormant overhead of these probes is measured or suspected to be
excessive, consider using the dtrace-generated per-probe foo_ENABLED()
conditional, or a postgres configuration global thusly:

if (__builtin_expect(TRACE_POSTGRESQL_MCXT_ALLOC_ENABLED(), 0))
TRACE_POSTGRESQL_MCXT_ALLOC(...);

so that the whole instrumentation parameter setup/call can be placed
out of the hot line with gcc -freorder-blocks.

- FChE

#18Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Noname (#17)
Re: [PATCH] dtrace probes for memory manager

Frank Ch. Eigler píše v čt 10. 12. 2009 v 14:11 -0500:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

[...]
+	header = (StandardChunkHeader *)
+		((char *) ret - STANDARDCHUNKHEADERSIZE);
+
+//	TRACE_POSTGRESQL_MCXT_ALLOC(context->name, context, size, header->size, true);
+
[...]

If the dormant overhead of these probes is measured or suspected to be
excessive, consider using the dtrace-generated per-probe foo_ENABLED()
conditional, or a postgres configuration global thusly:

TRACE_POSTGRESQL_MCXT_ALLOC and TRACE_POSTGRESQL_ASET_ALLOC are
duplicated probes. Have them both make sense but from performance point
of view to have one of them is acceptable.

foo_enable() is good to use when number of argument and their evaluation
cost too much. In this case it does no seem to be much useful. See ASM
code:

AllocSetAlloc+0x17: xorq %rax,%rax
AllocSetAlloc+0x1a: nop
AllocSetAlloc+0x1b: nop
AllocSetAlloc+0x1c: testl %eax,%eax
AllocSetAlloc+0x1e: je +0xb <AllocSetAlloc+0x2b>
AllocSetAlloc+0x20: movq %r13,%rdi
AllocSetAlloc+0x23: movq %r14,%rsi
AllocSetAlloc+0x26: nop
AllocSetAlloc+0x27: nop
AllocSetAlloc+0x28: nop
AllocSetAlloc+0x29: nop
AllocSetAlloc+0x2a: nop

if (__builtin_expect(TRACE_POSTGRESQL_MCXT_ALLOC_ENABLED(), 0))
TRACE_POSTGRESQL_MCXT_ALLOC(...);

so that the whole instrumentation parameter setup/call can be placed
out of the hot line with gcc -freorder-blocks.

compiler specific construct is not good way. Do not forget that also
other compiler exists.

Zdenek

#19Frank Ch. Eigler
fche@redhat.com
In reply to: Zdenek Kotala (#18)
Re: [PATCH] dtrace probes for memory manager

Hi -

On Thu, Dec 10, 2009 at 09:33:28PM +0100, Zdenek Kotala wrote:

[...]

If the dormant overhead of these probes is measured or suspected to be
excessive, consider using the dtrace-generated per-probe foo_ENABLED()
conditional, or a postgres configuration global thusly:

[...] foo_enable() is good to use when number of argument and their
evaluation cost too much. In this case it does no seem to be much
useful. [...]

Right, I just wanted to make the others aware of the option.

if (__builtin_expect(TRACE_POSTGRESQL_MCXT_ALLOC_ENABLED(), 0))
TRACE_POSTGRESQL_MCXT_ALLOC(...);

so that the whole instrumentation parameter setup/call can be placed
out of the hot line with gcc -freorder-blocks.

compiler specific construct is not good way. Do not forget that also
other compiler exists.

Certainly. Many projects -- but apparently not postgresql -- wrap
such branch prediction hints in macros such as likely() and
unlikely(), which are easily no-op'd for compilers that don't support
this sort of thing.

- FChE

#20Robert Haas
robertmhaas@gmail.com
In reply to: Zdenek Kotala (#16)
Re: [PATCH] dtrace probes for memory manager

On Wed, Dec 9, 2009 at 9:04 AM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:

Bernd Helmle píše v út 08. 12. 2009 v 22:06 +0100:

--On 8. Dezember 2009 15:51:52 -0500 Greg Smith <greg@2ndquadrant.com>
wrote:

Try this instead, which will give you a test where checkpoints have a
minimal impact, but lots of memory will be thrown around:

pgbench -i -s 10 <db>
pgbench -S -c 10 -T 600 <db>

Thanks for the input, will try....

I modified probes to reduce overhead. Prototype patch is attached. Main
point is to remove mcxt_alloc probe and keep only aset_alloc. I did also
some testing with interesting results. At first I prepare special C
store function (attached) which do only allocation and deallocation and
I measured how long it takes:

On 32bit the memory allocation is slow down 8.4%  and on 64bit it is
only 4.6%. Good to mention that I call palloc and pfree but in standard
behavior pfree is not much used and memory is freed when context is
destroyed. It means that we should think about 4.2% and 2.3% instead.

But in normal situation database does also other thing and palloc is
only one part of code path. It is why I run second test and use sun
studio profiling tools (collect/analyzer) to determine how much CPU
ticks cost the probes during pg_bench run. And results are much better.
AllocSet alloc function takes about 4-5% and probes assembler code takes
0.1-0.2% on 64bit. I did not test 32bit but my expectation is that it
should be about 0.3-0.4%.

There's not really enough detail here to determine what you tested and
what the results were, and I don't think this patch has any chance at
all of getting committed without that. Please clarify.

If there's some real-world test where this probe costs 0.3%-0.4%, I
think that is sufficient grounds for rejecting this patch. I
understand the desire of people to be able to use dtrace, but our
performance is too hard-won for me to want to give any measurable of
it up for tracing and instrumentation hooks that will only be used by
a small number of users in a small number of situations.

...Robert

#21Bernd Helmle
mailings@oopsware.de
In reply to: Robert Haas (#20)
Re: [PATCH] dtrace probes for memory manager

--On 10. Dezember 2009 23:55:49 -0500 Robert Haas <robertmhaas@gmail.com>
wrote:

If there's some real-world test where this probe costs 0.3%-0.4%, I
think that is sufficient grounds for rejecting this patch. I
understand the desire of people to be able to use dtrace, but our
performance is too hard-won for me to want to give any measurable of
it up for tracing and instrumentation hooks that will only be used by
a small number of users in a small number of situations.

I repeated the pgbench runs per Greg's advice (see upthread) and it seems
there is actually a small slowdown which supports this argument,
unfortunately. After repeating the pgbench runs with and without the new
probes (note: i've used the new version of the patch, too), the numbers are
going to stabilize as follows:

without compiled probes: AVG(2531.68)
with compiled probes: AVG(2511.97)

I can repeat that tests over and over, but this doesn't seem to change the
whole picture (so there seems some real argument for a 0.4 - 0.6% cost, at
least on *my* machine here with pgbench).

--
Thanks

Bernd

#22Alvaro Herrera
alvherre@commandprompt.com
In reply to: Bernd Helmle (#21)
Re: [PATCH] dtrace probes for memory manager

Bernd Helmle escribi�:

I repeated the pgbench runs per Greg's advice (see upthread) and it
seems there is actually a small slowdown which supports this
argument, unfortunately. After repeating the pgbench runs with and
without the new probes (note: i've used the new version of the
patch, too), the numbers are going to stabilize as follows:

without compiled probes: AVG(2531.68)
with compiled probes: AVG(2511.97)

Were the probes enabled?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#23Bernd Helmle
mailings@oopsware.de
In reply to: Alvaro Herrera (#22)
Re: [PATCH] dtrace probes for memory manager

--On 11. Dezember 2009 11:28:54 -0300 Alvaro Herrera
<alvherre@commandprompt.com> wrote:

without compiled probes: AVG(2531.68)
with compiled probes: AVG(2511.97)

Were the probes enabled?

Hmm, they were just compiled in, i didn't anything to instrument them with
dtrace.

I've just started a pgbench/dtrace run with instrumented probes aset_alloc,
aset_free and aset_realloc which just counts the calls to them during
pgbench, the first run gives me

tps = 1035.045523 (excluding connections establishing)

Ideas?

--
Thanks

Bernd

#24Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Bernd Helmle (#23)
Re: [PATCH] dtrace probes for memory manager

Bernd Helmle píše v pá 11. 12. 2009 v 17:13 +0100:

--On 11. Dezember 2009 11:28:54 -0300 Alvaro Herrera
<alvherre@commandprompt.com> wrote:

without compiled probes: AVG(2531.68)
with compiled probes: AVG(2511.97)

Were the probes enabled?

Hmm, they were just compiled in, i didn't anything to instrument them with
dtrace.

I've just started a pgbench/dtrace run with instrumented probes aset_alloc,
aset_free and aset_realloc which just counts the calls to them during
pgbench, the first run gives me

tps = 1035.045523 (excluding connections establishing)

Ideas?

When probes are activated they have performance impact. It is normal.
Important is that you can use it when you need it on production system.
No recompilation, no extra binary, no downtime and it is safe.
Performance impact depends on Dscript

Zdenek

#25Robert Haas
robertmhaas@gmail.com
In reply to: Zdenek Kotala (#24)
Re: [PATCH] dtrace probes for memory manager

On Fri, Dec 11, 2009 at 12:55 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:

Bernd Helmle píše v pá 11. 12. 2009 v 17:13 +0100:

--On 11. Dezember 2009 11:28:54 -0300 Alvaro Herrera
<alvherre@commandprompt.com> wrote:

without compiled probes: AVG(2531.68)
with compiled probes: AVG(2511.97)

Were the probes enabled?

Hmm, they were just compiled in, i didn't anything to instrument them with
dtrace.

I've just started a pgbench/dtrace run with instrumented probes aset_alloc,
aset_free and aset_realloc which just counts the calls to them during
pgbench, the first run gives me

tps = 1035.045523 (excluding connections establishing)

Ideas?

When probes are activated they have performance impact. It is normal.
Important is that you can use it when you need it on production system.
No recompilation, no extra binary, no downtime and it is safe.
Performance impact depends on Dscript

Yeah. The problem here is the impact when the probes aren't enabled.

I thought we had an idea of using the AllocSet dispatch mechanism to
make this zero-overhead in the case where the probes are not enabled.
What happened to that notion?

...Robert

#26Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Robert Haas (#20)
Re: [PATCH] dtrace probes for memory manager

Robert Haas píše v čt 10. 12. 2009 v 23:55 -0500:

On Wed, Dec 9, 2009 at 9:04 AM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:

But in normal situation database does also other thing and palloc is
only one part of code path. It is why I run second test and use sun
studio profiling tools (collect/analyzer) to determine how much CPU
ticks cost the probes during pg_bench run. And results are much better.
AllocSet alloc function takes about 4-5% and probes assembler code takes
0.1-0.2% on 64bit. I did not test 32bit but my expectation is that it
should be about 0.3-0.4%.

There's not really enough detail here to determine what you tested and
what the results were, and I don't think this patch has any chance at
all of getting committed without that. Please clarify.

If there's some real-world test where this probe costs 0.3%-0.4%, I
think that is sufficient grounds for rejecting this patch. I
understand the desire of people to be able to use dtrace, but our
performance is too hard-won for me to want to give any measurable of
it up for tracing and instrumentation hooks that will only be used by
a small number of users in a small number of situations.

As I mentioned I run pg_bench -c10 -t1000 and collect data from
backends. collect and analyzer is similar tool to gprof.

Zdenek

#27Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Robert Haas (#25)
Re: [PATCH] dtrace probes for memory manager

Robert Haas píše v pá 11. 12. 2009 v 12:55 -0500:

On Fri, Dec 11, 2009 at 12:55 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:

Bernd Helmle píše v pá 11. 12. 2009 v 17:13 +0100:

--On 11. Dezember 2009 11:28:54 -0300 Alvaro Herrera
<alvherre@commandprompt.com> wrote:

without compiled probes: AVG(2531.68)
with compiled probes: AVG(2511.97)

Were the probes enabled?

Hmm, they were just compiled in, i didn't anything to instrument them with
dtrace.

I've just started a pgbench/dtrace run with instrumented probes aset_alloc,
aset_free and aset_realloc which just counts the calls to them during
pgbench, the first run gives me

tps = 1035.045523 (excluding connections establishing)

Ideas?

When probes are activated they have performance impact. It is normal.
Important is that you can use it when you need it on production system.
No recompilation, no extra binary, no downtime and it is safe.
Performance impact depends on Dscript

Yeah. The problem here is the impact when the probes aren't enabled.

I thought we had an idea of using the AllocSet dispatch mechanism to
make this zero-overhead in the case where the probes are not enabled.
What happened to that notion?

We know that performance impact is less then 1% probably less then 0.6%.
The question is if it is acceptable or not. I personally think that it
is acceptable. However if not, I will start work on backup solution with
dtraced AllocSet and some switching mechanism. But it needs little
discussion about design. And first we need decision about current
performance impact.

Zdenek

#28Robert Haas
robertmhaas@gmail.com
In reply to: Zdenek Kotala (#27)
Re: [PATCH] dtrace probes for memory manager

On Fri, Dec 11, 2009 at 1:04 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:

We know that performance impact is less then 1% probably less then 0.6%.
The question is if it is acceptable or not. I personally think that it
is acceptable. However if not, I will start work on backup solution with
dtraced AllocSet and some switching mechanism. But it needs little
discussion about design. And first we need decision about current
performance impact.

As far as I am concerned that is way too much, particularly
considering that your test case isn't designed to be particularly
memory-allocation intensive, and if it is up to me I will reject this.
Even a quarter-percent slowdown for a feature that will be used only
by a small fraction of users only a small fraction of time time seems
totally unacceptable to me.

...Robert

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#28)
Re: [PATCH] dtrace probes for memory manager

Robert Haas <robertmhaas@gmail.com> writes:

As far as I am concerned that is way too much, particularly
considering that your test case isn't designed to be particularly
memory-allocation intensive, and if it is up to me I will reject this.
Even a quarter-percent slowdown for a feature that will be used only
by a small fraction of users only a small fraction of time time seems
totally unacceptable to me.

It seems to me that anyone who really needs this can instrument the
alloc functions anyway --- isn't one of the features of DTrace supposed
to be that you can monitor calls to a particular function without any
prearranged code support? Or is that one of the things like "zero
overhead" that turns out to be more marketing-speak than reality?

Anyway I concur with Robert's opinion that the use-case is far too small
to justify incurring a measurable overhead for everybody. There might
be some small argument for putting these in under an extra #ifdef, but
they wouldn't get into any regular production build.

regards, tom lane

#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#25)
Re: [PATCH] dtrace probes for memory manager

Robert Haas <robertmhaas@gmail.com> writes:

I thought we had an idea of using the AllocSet dispatch mechanism to
make this zero-overhead in the case where the probes are not enabled.
What happened to that notion?

I must have missed that discussion, but +1 --- should be possible to get
to zero-overhead-when-off that way. The trick is to figure out
what/where enables the alternate implementation. The current design
assumes that the callers of FooContextCreate choose the implementation,
but we don't want that here.

regards, tom lane

#31Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Tom Lane (#29)
Re: [PATCH] dtrace probes for memory manager

Tom Lane píše v pá 11. 12. 2009 v 13:56 -0500:

Robert Haas <robertmhaas@gmail.com> writes:

As far as I am concerned that is way too much, particularly
considering that your test case isn't designed to be particularly
memory-allocation intensive, and if it is up to me I will reject this.
Even a quarter-percent slowdown for a feature that will be used only
by a small fraction of users only a small fraction of time time seems
totally unacceptable to me.

It seems to me that anyone who really needs this can instrument the
alloc functions anyway --- isn't one of the features of DTrace supposed
to be that you can monitor calls to a particular function without any
prearranged code support? Or is that one of the things like "zero
overhead" that turns out to be more marketing-speak than reality?

There are several types of probes. For example for PID provider probes
you can monitor all entry and return point from global function. And
also you can put probe on each asm instruction in the function. These
probes have zero overhead, because dtrace understand ABI and know where
args are. Unfortunately user defined probes has small overhead which is
price for universal solution which works with all compilers and linkers.

Anyway I concur with Robert's opinion that the use-case is far too small
to justify incurring a measurable overhead for everybody.

OK.

There might
be some small argument for putting these in under an extra #ifdef, but
they wouldn't get into any regular production build.

unfortunately #ifdef solution kills main dtrace goal - without
recompilation :(.

Zdenek

#32Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Tom Lane (#30)
Re: [PATCH] dtrace probes for memory manager

Tom Lane píše v pá 11. 12. 2009 v 14:38 -0500:

Robert Haas <robertmhaas@gmail.com> writes:

I thought we had an idea of using the AllocSet dispatch mechanism to
make this zero-overhead in the case where the probes are not enabled.
What happened to that notion?

I must have missed that discussion, but +1 --- should be possible to get
to zero-overhead-when-off that way. The trick is to figure out
what/where enables the alternate implementation. The current design
assumes that the callers of FooContextCreate choose the implementation,
but we don't want that here.

I thought about it. I think we can use GUC variable (e.g. dtraced_alloc)
and hook switch pointers to dtraced AsetFunctions. The problem is how to
distribute to all backend.

Zdenek

#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zdenek Kotala (#32)
Re: [PATCH] dtrace probes for memory manager

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

I thought about it. I think we can use GUC variable (e.g. dtraced_alloc)
and hook switch pointers to dtraced AsetFunctions. The problem is how to
distribute to all backend.

You set the GUC in postgresql.conf. No big deal.

If we go this route it would be nice to think about making a facility
that has some usefulness for non-DTrace platforms too.

regards, tom lane

#34Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Tom Lane (#33)
Re: [PATCH] dtrace probes for memory manager

Tom Lane píše v pá 11. 12. 2009 v 15:11 -0500:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

I thought about it. I think we can use GUC variable (e.g. dtraced_alloc)
and hook switch pointers to dtraced AsetFunctions. The problem is how to
distribute to all backend.

You set the GUC in postgresql.conf. No big deal.

If we go this route it would be nice to think about making a facility
that has some usefulness for non-DTrace platforms too.

Do you mean general facility for switching memory allocator?

Zdenek

#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zdenek Kotala (#34)
Re: [PATCH] dtrace probes for memory manager

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Tom Lane píše v pá 11. 12. 2009 v 15:11 -0500:

If we go this route it would be nice to think about making a facility
that has some usefulness for non-DTrace platforms too.

Do you mean general facility for switching memory allocator?

No, I was thinking of some sort of memory allocation stats collection
that doesn't depend on DTrace. It's amazing to me that we've never
gone back and improved on the original quick-and-dirty
MemoryContextStats mechanism. I certainly find myself using that a
lot for issues like tracking down memory leaks. While palloc has a lot
of advantages, the fact that you can't easily plug in a debug-friendly
substitute malloc package is not one of them :-(

regards, tom lane

#36Greg Smith
greg@2ndquadrant.com
In reply to: Tom Lane (#35)
Re: [PATCH] dtrace probes for memory manager

Tom Lane wrote:

It's amazing to me that we've never
gone back and improved on the original quick-and-dirty
MemoryContextStats mechanism. I certainly find myself using that a
lot for issues like tracking down memory leaks.

That code hasn't really gone anywhere since Neil tweaked the indentation
two years ago. What sorts of improvements were you looking for?

I started on trying to improve this area at one point but didn't get
far. My first step was going to be just wrapping the call into a UDF to
make it easier to reach without having to set loose gdb. I thought that
might expand the possible uses for MemoryContextStats to "help find a
memory leak safely on the production box", and therefore attact more
attention to improving it. People really don't like running gdb on
production, but a UDF dumping the same data wouldn't seem so risky. As
Daniel Farina pointed out to me one day in a "duh" moment, that idea is
quite obviously doomed by the fact that people want dumps from processes
that you won't be talking to in a UDF context. You won't be able to
find a leak in the background writer by dumping the context the UDF can
see. There would need to be some sort of IPC/signaling mechanism in
each process if you wanted it to work everywhere, and once that
realization hit I realized this was a deeper problem than it appeared.

If you want to make it easier to export into user space, it seems to me
the memory context stats data either needs to get pushed somewhere it's
easier to get to (the way the regular stats are), or there needs to be
some way of polling any random PG process to find it--presumably by
passing the request the pid you want to instrument. Neither of which
may really be a reasonable idea at all, particularly since that in most
cases, you're using a debugger to track your leak down only after
reproducing a test case for it.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com

#37Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#35)
Re: [PATCH] dtrace probes for memory manager

On Fri, Dec 11, 2009 at 3:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Tom Lane píše v pá 11. 12. 2009 v 15:11 -0500:

If we go this route it would be nice to think about making a facility
that has some usefulness for non-DTrace platforms too.

Do you mean general facility for switching memory allocator?

No, I was thinking of some sort of memory allocation stats collection
that doesn't depend on DTrace.  It's amazing to me that we've never
gone back and improved on the original quick-and-dirty
MemoryContextStats mechanism.  I certainly find myself using that a
lot for issues like tracking down memory leaks.  While palloc has a lot
of advantages, the fact that you can't easily plug in a debug-friendly
substitute malloc package is not one of them :-(

This might be nice to have, but I don't think it's necessarily a
prerequisite for a committable patch. For that, I think we just need
something that uses the dispatch mechanism to avoid the overhead in
the normal case.

However, as this is a substantial redesign and there are 4 days left
in the CommitFest, I am going to mark this patch Returned with
Feedback for now.

...Robert

#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Smith (#36)
Re: [PATCH] dtrace probes for memory manager

Greg Smith <greg@2ndquadrant.com> writes:

Tom Lane wrote:

It's amazing to me that we've never
gone back and improved on the original quick-and-dirty
MemoryContextStats mechanism.

That code hasn't really gone anywhere since Neil tweaked the indentation
two years ago. What sorts of improvements were you looking for?

Not anything like what you were thinking about, apparently ;-). What
I wish for the most often is a way to track palloc usage down to the
particular file/line of the alloc call. This is obviously not something
that we'd pay the overhead for in a standard build, but I would love to
be able to build a debug version that could do it when I needed. There
are malloc substitutes that can do this, but the palloc layer means that
they're not especially useful for debugging the PG backend.

Of course something like that doesn't have to have anything to do with
what Zdenek was wishing for, but it just struck me that creating a
stats-oriented shim layer in the memory context stuff is playing on
pretty much the same turf.

I started on trying to improve this area at one point but didn't get
far. My first step was going to be just wrapping the call into a UDF to
make it easier to reach without having to set loose gdb. I thought that
might expand the possible uses for MemoryContextStats to "help find a
memory leak safely on the production box", and therefore attact more
attention to improving it.

Just for context: the reason MemoryContextStats is designed as it is
is so that it has a reasonable chance of telling you something useful
about an out-of-memory failure. When you're down to your last ten
bytes, it's not going to help if your reporting mechanism wants to
allocate a data structure to put its results in. So it writes to stderr
and nothing else. It may well be that that constraint means we can
never do anything really creative with MemoryContextStats per se.
But I see the use-case for other mechanisms as being to get some info
before we reach the point of having our backs to the wall.

regards, tom lane