Thread: memory allocation and powers of two
While looking into a block size mismatch problem between Postgresql and FreeBSD's FFS, I noticed that postgresql is making some rather odd-sized requests to malloc(3): 0x2034, 0x2020, 0x4018, 0x8018, etc. Most malloc(3) implementations round large allocations up to a multiple of a large power of 2---often the hardware page size, so this is a pathological case for those allocators. For example, on a machine with 4K pages, the database may use ~50% more memory for the 0x2034 byte allocations. Browsing through the source, it looks like the allocation set implementation and the buffered file implementation both use inline tags. The latter, at least, could be easily fixed by allocating the buffer separately, but that would only be worthwhile if the former were also modified. Thoughts on this particular design decision?
David Schultz <dschultz@uclink.Berkeley.EDU> writes: > While looking into a block size mismatch problem between > Postgresql and FreeBSD's FFS, I noticed that postgresql is making > some rather odd-sized requests to malloc(3): 0x2034, 0x2020, > 0x4018, 0x8018, etc. AFAICT the operative word there is "some" --- the heavily used paths should make power-of-two requests to malloc, because aset block sizes will normally be powers of two. Can you put your finger on paths that generate a significant number of non-power-of-two requests? regards, tom lane
David Schultz <dschultz@uclink.Berkeley.EDU> writes: > While looking into a block size mismatch problem between > Postgresql and FreeBSD's FFS, I noticed that postgresql is making > some rather odd-sized requests to malloc(3): 0x2034, 0x2020, > 0x4018, 0x8018, etc. Most malloc(3) implementations round large > allocations up to a multiple of a large power of 2---often the > hardware page size, so this is a pathological case for those > allocators. After looking more closely I saw that there were some corner cases where aset.c would unnecessarily switch from the intended power-of-two block sizes to non-power-of-two sizes. I've applied the attached fix. regards, tom lane *** src/backend/utils/mmgr/aset.c.orig Sun Aug 3 23:01:13 2003 --- src/backend/utils/mmgr/aset.c Sat Sep 13 18:20:48 2003 *************** *** 650,681 **** } else { - /* Get size of prior block */ - blksize = set->blocks->endptr - ((char *) set->blocks); - /* ! * Special case: if very first allocation was for a large ! * chunk (or we have a small "keeper" block), could have an ! * undersized top block. Do something reasonable. */ ! if (blksize < set->initBlockSize) ! blksize = set->initBlockSize; ! else ! { ! /* Crank it up, but not past max */ blksize <<= 1; ! if (blksize > set->maxBlockSize) ! blksize = set->maxBlockSize; ! } } /* * If initBlockSize is less than ALLOC_CHUNK_LIMIT, we could need ! * more space... */ required_size = chunk_size + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ; ! if (blksize < required_size) ! blksize = required_size; /* Try to allocate it */ block = (AllocBlock) malloc(blksize); --- 650,678 ---- } else { /* ! * Use first power of 2 that is larger than previous block, ! * but not more than the allowed limit. (We don't simply double ! * the prior block size, because in some cases this could be a ! * funny size, eg if very first allocation was for an odd-sized ! * large chunk.) */ ! Size pblksize = set->blocks->endptr - ((char *) set->blocks); ! ! blksize = set->initBlockSize; ! while (blksize <= pblksize) blksize <<= 1; ! if (blksize > set->maxBlockSize) ! blksize = set->maxBlockSize; } /* * If initBlockSize is less than ALLOC_CHUNK_LIMIT,we could need ! * more space... but try to keep it a power of 2. */ required_size = chunk_size + ALLOC_BLOCKHDRSZ+ ALLOC_CHUNKHDRSZ; ! while (blksize < required_size) ! blksize <<= 1; /* Try to allocate it */ block = (AllocBlock) malloc(blksize);
David Schultz <dschultz@uclink.Berkeley.EDU> writes: > ... I assume that pgsql will be able to use the > slack allocated in each chunk. That's the theory, anyway. Undoubtedly the extra allocation will go to waste in some scenarios, but we're no worse off than we were before. regards, tom lane
On Sat, Sep 13, 2003, Tom Lane wrote: > David Schultz <dschultz@uclink.Berkeley.EDU> writes: > > While looking into a block size mismatch problem between > > Postgresql and FreeBSD's FFS, I noticed that postgresql is making > > some rather odd-sized requests to malloc(3): 0x2034, 0x2020, > > 0x4018, 0x8018, etc. Most malloc(3) implementations round large > > allocations up to a multiple of a large power of 2---often the > > hardware page size, so this is a pathological case for those > > allocators. > > After looking more closely I saw that there were some corner cases where > aset.c would unnecessarily switch from the intended power-of-two block > sizes to non-power-of-two sizes. I've applied the attached fix. Thanks for looking into this. I had meant to follow up sooner myself, but I've been under time pressure lately. Your patch looks good to me. I assume that pgsql will be able to use the slack allocated in each chunk. I'll try my original test again next weekend and let you know if this change clears up most of the power-of-two-plus-epsilon issues I was seeing.