Attached is a patch that aligns large shared memory allocations beyond
MAXIMUM_ALIGNOF. The reason is that Intel CPUs have a fast path for
bulk memory copies that only works with aligned addresses. It's
possible that other CPUs have similar restrictions.
With 7.3.4, it achieves a 5% performance gain with pgbench. It has no
effect with 7.3.3, because the buffers there are already aligned by
chance. I haven't properly tested 7.4cvs yet.
One problem is the "32": it's arbitrary, and it probably belongs in an
arch-dependent header file. But where?
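To make the rounding concrete, here is a rough standalone sketch of what the patch does to the start offset. None of these names are in the patch: align_up() is a stand-in for TYPEALIGN, BLOCK_SIZE stands in for BLCKSZ (assumed to be the default 8192), and SHMEM_BUFFER_ALIGN is a hypothetical name for the hard-coded 32. It also assumes the segment base itself is at least 32-byte aligned, since only the offset is rounded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not part of the patch. */
#define SHMEM_BUFFER_ALIGN 32	/* the arbitrary "32" from the patch */
#define BLOCK_SIZE 8192			/* stand-in for BLCKSZ (default value) */

/*
 * Round offset up to the next multiple of alignval.
 * alignval must be a nonzero power of two for the mask trick to work.
 */
static uint32_t
align_up(uint32_t alignval, uint32_t offset)
{
	assert(alignval != 0 && (alignval & (alignval - 1)) == 0);
	return (offset + (alignval - 1)) & ~(alignval - 1);
}

/*
 * Compute where an allocation of the given size would start:
 * block-sized or larger requests get the extra alignment,
 * smaller ones start at the current free offset unchanged.
 */
static uint32_t
alloc_start(uint32_t freeoffset, size_t size)
{
	if (size >= BLOCK_SIZE)
		return align_up(SHMEM_BUFFER_ALIGN, freeoffset);
	return freeoffset;
}
```

For example, with a free offset of 40, a request of 8192 bytes would start at offset 64, while a 100-byte request would still start at 40. The padding is at most 31 bytes per large allocation, which is why the cost is small.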
--
Manfred
diff -u pgsql.orig/src/backend/storage/ipc/shmem.c pgsql/src/backend/storage/ipc/shmem.c
--- pgsql.orig/src/backend/storage/ipc/shmem.c 2003-09-20 20:17:08.000000000 +0200
+++ pgsql/src/backend/storage/ipc/shmem.c 2003-09-20 20:34:21.000000000 +0200
@@ -131,6 +131,7 @@
void *
ShmemAlloc(Size size)
{
+ uint32 newStart;
uint32 newFree;
void *newSpace;
@@ -146,10 +147,21 @@
SpinLockAcquire(ShmemLock);
- newFree = shmemseghdr->freeoffset + size;
+ newStart = shmemseghdr->freeoffset;
+ if (size >= BLCKSZ)
+ {
+ /* Align BLCKSZ sized buffers even further:
+ * - the costs are small
+ * - some cpus (most notably Intel Pentium III)
+ * prefer well-aligned addresses for memory copies
+ */
+ newStart = TYPEALIGN(32, newStart);
+ }
+
+ newFree = newStart + size;
if (newFree <= shmemseghdr->totalsize)
{
- newSpace = (void *) MAKE_PTR(shmemseghdr->freeoffset);
+ newSpace = (void *) MAKE_PTR(newStart);
shmemseghdr->freeoffset = newFree;
}
else