memory allocation and powers of two

Started by David Schultzover 22 years ago5 messages
#1David Schultz
dschultz@uclink.Berkeley.EDU

While looking into a block size mismatch problem between
Postgresql and FreeBSD's FFS, I noticed that postgresql is making
some rather odd-sized requests to malloc(3): 0x2034, 0x2020,
0x4018, 0x8018, etc. Most malloc(3) implementations round large
allocations up to a multiple of a large power of 2---often the
hardware page size, so this is a pathological case for those
allocators. For example, on a machine with 4K pages, the database
may use ~50% more memory for the 0x2034 byte allocations.

Browsing through the source, it looks like the allocation set
implementation and the buffered file implementation both use
inline tags. The latter, at least, could be easily fixed by
allocating the buffer separately, but that would only be
worthwhile if the former were also modified.

Thoughts on this particular design decision?

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Schultz (#1)
Re: memory allocation and powers of two

David Schultz <dschultz@uclink.Berkeley.EDU> writes:

While looking into a block size mismatch problem between
Postgresql and FreeBSD's FFS, I noticed that postgresql is making
some rather odd-sized requests to malloc(3): 0x2034, 0x2020,
0x4018, 0x8018, etc.

AFAICT the operative word there is "some" --- the heavily used paths
should make power-of-two requests to malloc, because aset block sizes
will normally be powers of two. Can you put your finger on paths that
generate a significant number of non-power-of-two requests?

regards, tom lane

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Schultz (#1)
Re: memory allocation and powers of two

David Schultz <dschultz@uclink.Berkeley.EDU> writes:

While looking into a block size mismatch problem between
Postgresql and FreeBSD's FFS, I noticed that postgresql is making
some rather odd-sized requests to malloc(3): 0x2034, 0x2020,
0x4018, 0x8018, etc. Most malloc(3) implementations round large
allocations up to a multiple of a large power of 2---often the
hardware page size, so this is a pathological case for those
allocators.

After looking more closely I saw that there were some corner cases where
aset.c would unnecessarily switch from the intended power-of-two block
sizes to non-power-of-two sizes. I've applied the attached fix.

regards, tom lane

*** src/backend/utils/mmgr/aset.c.orig	Sun Aug  3 23:01:13 2003
--- src/backend/utils/mmgr/aset.c	Sat Sep 13 18:20:48 2003
***************
*** 650,681 ****
  		}
  		else
  		{
- 			/* Get size of prior block */
- 			blksize = set->blocks->endptr - ((char *) set->blocks);
- 
  			/*
! 			 * Special case: if very first allocation was for a large
! 			 * chunk (or we have a small "keeper" block), could have an
! 			 * undersized top block.  Do something reasonable.
  			 */
! 			if (blksize < set->initBlockSize)
! 				blksize = set->initBlockSize;
! 			else
! 			{
! 				/* Crank it up, but not past max */
  				blksize <<= 1;
! 				if (blksize > set->maxBlockSize)
! 					blksize = set->maxBlockSize;
! 			}
  		}

/*
* If initBlockSize is less than ALLOC_CHUNK_LIMIT, we could need
! * more space...
*/
required_size = chunk_size + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ;
! if (blksize < required_size)
! blksize = required_size;

  		/* Try to allocate it */
  		block = (AllocBlock) malloc(blksize);
--- 650,678 ----
  		}
  		else
  		{
  			/*
! 			 * Use first power of 2 that is larger than previous block,
! 			 * but not more than the allowed limit.  (We don't simply double
! 			 * the prior block size, because in some cases this could be a
! 			 * funny size, eg if very first allocation was for an odd-sized
! 			 * large chunk.)
  			 */
! 			Size	pblksize = set->blocks->endptr - ((char *) set->blocks);
! 
! 			blksize = set->initBlockSize;
! 			while (blksize <= pblksize)
  				blksize <<= 1;
! 			if (blksize > set->maxBlockSize)
! 				blksize = set->maxBlockSize;
  		}

/*
* If initBlockSize is less than ALLOC_CHUNK_LIMIT, we could need
! * more space... but try to keep it a power of 2.
*/
required_size = chunk_size + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ;
! while (blksize < required_size)
! blksize <<= 1;

/* Try to allocate it */
block = (AllocBlock) malloc(blksize);

#4David Schultz
dschultz@uclink.Berkeley.EDU
In reply to: Tom Lane (#3)
Re: memory allocation and powers of two

On Sat, Sep 13, 2003, Tom Lane wrote:

David Schultz <dschultz@uclink.Berkeley.EDU> writes:

While looking into a block size mismatch problem between
Postgresql and FreeBSD's FFS, I noticed that postgresql is making
some rather odd-sized requests to malloc(3): 0x2034, 0x2020,
0x4018, 0x8018, etc. Most malloc(3) implementations round large
allocations up to a multiple of a large power of 2---often the
hardware page size, so this is a pathological case for those
allocators.

After looking more closely I saw that there were some corner cases where
aset.c would unnecessarily switch from the intended power-of-two block
sizes to non-power-of-two sizes. I've applied the attached fix.

Thanks for looking into this. I had meant to follow up sooner
myself, but I've been under time pressure lately. Your patch
looks good to me. I assume that pgsql will be able to use the
slack allocated in each chunk. I'll try my original test again
next weekend and let you know if this change clears up most of the
power-of-two-plus-epsilon issues I was seeing.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Schultz (#4)
Re: memory allocation and powers of two

David Schultz <dschultz@uclink.Berkeley.EDU> writes:

... I assume that pgsql will be able to use the
slack allocated in each chunk.

That's the theory, anyway. Undoubtedly the extra allocation will go to
waste in some scenarios, but we're no worse off than we were before.

regards, tom lane