Re: [HACKERS] Postgres Speed or lack thereof - Mailing list pgsql-hackers
From | Vadim Mikheev |
---|---|
Subject | Re: [HACKERS] Postgres Speed or lack thereof |
Date | |
Msg-id | 36AAAFCF.CBB9DF4A@krs.ru |
In response to | Re: [HACKERS] Postgres Speed or lack thereof (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] Postgres Speed or lack thereof
Re: [HACKERS] Postgres Speed or lack thereof |
List | pgsql-hackers |
Tom Lane wrote:
>
> Having an idle hour this evening, I thought it'd be interesting to build
> a backend with profiling enabled, so as to confirm or deny the above
> guess. It seems that indeed a lot of time is being wasted, but where
> it's being wasted might surprise you!
...
> In other words, we're spending a third of our time mallocing and freeing
> memory. A tad high, what?
>
> Actually, it's worse than that, because AllocSetAlloc,
> PortalHeapMemoryAlloc, AllocSetFree, and all of the OrderedElemXXX
> routines represent our own bookkeeping layer atop malloc/free.
> That's another 18.66 seconds spent in these top routines, which means
> that we are real close to expending half the backend's runtime on
> memory bookkeeping. This needs work.
>

Yes, it's surprising!

I added some debug code to palloc/pfree, and it shows that for INSERT:

1. 80% of allocations are for <= 32 bytes.
2. pfree is called for only 25% of them (the others are freed when the
   statement/transaction is done).

Note that our mmgr adds 16 bytes to each allocation (+ some bytes in
malloc) - a great overhead, yes?

I added code that allocates a few big (16K-64K) blocks of memory for
these small allocations, to speed up palloc by skipping
AllocSetAlloc/malloc. The new code doesn't free the memory it allocates
(to keep bookkeeping fast), but keeping in mind 2. above and the memory
overhead, this seems an appropriate thing to do. This code also speeds up
freeing when the statement/transaction is done, because only a few blocks
have to be freed now.

I did 5000 INSERTs (into tables with 3 ints and 33 ints) with BEGIN/END,
-F and -B 512 (I ran postgres directly, without postmaster). User times:

                         old          new
-----------------------------------------
table with  3 ints      9.7 sec      7.6 sec
table with 33 ints     59.5 sec     39.9 sec

So the new code is 20%-30% faster. Process sizes are the same.

Tom, could you run the new code under profiling?

There are still some things to do:

1. SELECT/UPDATE/DELETE often palloc/pfree tuples (sizes are > 32 bytes),
   but pfree now requires an additional lookup to see whether memory was
   allocated by AllocSetAlloc or by the new code. We can avoid this.

2. Index scans palloc/pfree an IndexResult for each tuple returned by the
   index. This one has been annoying me for a long time. IndexResult
   should be part of the IndexScan structure...

3. psort uses the leftist structure (16 bytes) when disk is used for
   sorting. Obviously, big block allocation should be used by the lselect
   code itself.

4. Actually, the new mode shouldn't be used by the Executor in some cases.

I'll address this in a few days...

BTW, look at memutils.h: the new code is like the "tunable" mode
described there.

>
> In other words, essentially *all* of the CPU time spent in
> CommitTransaction is spent freeing memory. That's probably why
> ganging the transactions doesn't help --- it's the same number of
> memory blocks getting allocated and freed.

It shows that we should get rid of system malloc/free and do all things
in mmgr itself - this would allow us to free memory contexts much faster
at statement/transaction end.

Vadim
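
For readers who want to see the big-block idea from the message in concrete form, here is a minimal sketch in C. It is not the actual patch: every name (Arena, Block, arena_alloc, arena_reset) and every constant is hypothetical, the 32-byte limit and 16K block size are simply taken from the numbers quoted above, and the fallback to the regular allocator for larger requests is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names and constants; this is not the PostgreSQL mmgr API. */
#define SMALL_LIMIT 32            /* requests up to this size use the arena */
#define BLOCK_SIZE  (16 * 1024)   /* low end of the 16K-64K range */
#define ALIGNMENT   8             /* chunk sizes are rounded up to this */

typedef struct Block {
    struct Block *next;           /* previously filled block, or NULL */
    size_t        used;           /* bytes already handed out from data[] */
    char          data[BLOCK_SIZE];
} Block;

typedef struct Arena {
    Block  *head;                 /* block currently being filled */
    size_t  nblocks;              /* blocks obtained from malloc so far */
} Arena;

void arena_init(Arena *a)
{
    a->head = NULL;
    a->nblocks = 0;
}

/*
 * Hand out a small chunk by bumping an offset inside the current block.
 * There is no per-chunk header and no per-chunk free.  Returns NULL when
 * the request is not "small" (the caller would fall back to the general
 * allocator) or when malloc fails.
 */
void *arena_alloc(Arena *a, size_t size)
{
    void *p;

    if (size == 0 || size > SMALL_LIMIT)
        return NULL;

    /* Round up so every chunk starts on an ALIGNMENT boundary. */
    size = (size + ALIGNMENT - 1) & ~(size_t) (ALIGNMENT - 1);

    /* Start a new big block when the current one cannot fit the chunk. */
    if (a->head == NULL || a->head->used + size > BLOCK_SIZE) {
        Block *b = malloc(sizeof(Block));

        if (b == NULL)
            return NULL;
        b->next = a->head;
        b->used = 0;
        a->head = b;
        a->nblocks++;
    }

    p = a->head->data + a->head->used;
    a->head->used += size;
    return p;
}

/*
 * Release everything at statement/transaction end: one free() per big
 * block instead of one per chunk.  Returns the number of blocks freed.
 */
size_t arena_reset(Arena *a)
{
    size_t freed = 0;

    while (a->head != NULL) {
        Block *next = a->head->next;

        free(a->head);
        a->head = next;
        freed++;
    }
    a->nblocks = 0;
    return freed;
}
```

In this sketch a 16K block holds 512 chunks of 32 bytes, so 5000 such requests need 10 malloc calls and 10 free calls rather than 5000 of each. The price is the one named in the message: a chunk cannot be returned individually, so memory stays in use until the whole arena is reset.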