Re: [HACKERS] Postgres Speed or lack thereof - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [HACKERS] Postgres Speed or lack thereof |
Date | |
Msg-id | 199903151428.JAA12005@candle.pha.pa.us Whole thread Raw |
In response to | Re: [HACKERS] Postgres Speed or lack thereof (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] Postgres Speed or lack thereof
|
List | pgsql-hackers |
Tom, I assume this has all been corrected, right? > I wrote: > > As for where the time is going: "top" showed that the time for the > > INSERT ops was almost all going into backend CPU time. My guess is > > that most of the cost is in parsing and planning the INSERT > > statements. > > Having an idle hour this evening, I thought it'd be interesting to build > a backend with profiling enabled, so as to confirm or deny the above > guess. It seems that indeed a lot of time is being wasted, but where > it's being wasted might surprise you! > > I profiled a backend built from this morning's current CVS sources, > being driven by a single psql run reading a series of exactly 5000 > INSERT commands. The first few are: > > INSERT INTO "fehistory_1" values (64,14,252,'D','-','-','s',1,4027,NULL,'S',1,4088,NULL,32,'Thu Jan 07 15:15:00 1999 EST','FriJan 08 08:30:38 1999 EST',4027,32,NULL,NULL,46,NULL); > INSERT INTO "fehistory_1" values (65,14,252,'P','-','-','S',1,4086,NULL,NULL,NULL,NULL,NULL,32,'Fri Jan 08 16:00:00 1999EST',NULL,NULL,NULL,NULL,NULL,45,NULL); > INSERT INTO "fehistory_1" values (66,14,135,'P','-','-','S',1,13619,NULL,'s',1,12967,NULL,100,'Fri Jan 08 02:00:00 1999EST',NULL,NULL,NULL,NULL,NULL,44,NULL); > > and it goes on for a long while in that vein :-). Total runtime was > 278.09 CPU seconds in the backend, 307 seconds elapsed realtime. The > table being inserted into has no indexes and was initially empty. > > The profiler is GNU gprof, which provides both flat and structured > profile data. The top few entries in the flat profile are: > > % cumulative self self total > time seconds seconds calls ms/call ms/call name > 34.96 97.21 97.21 _mcount > 7.18 117.19 19.98 8047816 0.00 0.00 tree_cut > 6.56 135.43 18.24 962187 0.02 0.02 recv > 3.55 145.31 9.88 8234278 0.00 0.00 tree_insert > 3.49 155.02 9.71 4292947 0.00 0.01 malloc > 3.31 164.22 9.20 1965056 0.00 0.01 SearchSysCache > 3.24 173.23 9.01 4291775 0.00 0.01 free > 2.44 180.01 6.78 6171667 0.00 0.00 tree_concatenate > 2.17 186.04 6.03 8234069 0.00 0.00 tree_delete > 1.07 189.02 2.98 10250 0.29 0.29 write > 1.05 191.94 2.92 _psort_cmp > 0.95 194.57 2.63 2983301 0.00 0.00 newNode > 0.92 197.14 2.57 4251762 0.00 0.00 OrderedElemPop > 0.88 199.60 2.46 2746172 0.00 0.00 comphash > 0.82 201.88 2.28 4252495 0.00 0.01 AllocSetAlloc > 0.77 204.02 2.14 1965056 0.00 0.00 SearchSysCacheTuple > 0.73 206.05 2.03 265000 0.01 0.01 yylex > 0.69 207.96 1.91 $$remU > 0.68 209.86 1.90 4252500 0.00 0.00 OrderedElemPushHead > 0.62 211.58 1.72 4247473 0.00 0.00 palloc > 0.60 213.26 1.68 25004 0.07 0.07 send > 0.56 214.81 1.55 1965056 0.00 0.00 CatalogCacheComputeHashIndex > 0.55 216.33 1.52 5000 0.30 1.07 yyparse > 0.54 217.82 1.49 4251762 0.00 0.01 AllocSetFree > 0.53 219.29 1.47 4252494 0.00 0.00 MemoryContextAlloc > 0.51 220.71 1.42 4252500 0.00 0.00 OrderedElemPushInto > 0.49 222.07 1.36 3430844 0.00 0.00 strncmp > 0.48 223.41 1.34 4205327 0.00 0.00 OrderedSetGetHead > 0.47 224.73 1.32 elog > 0.45 225.99 1.26 1964976 0.00 0.00 DLRemove > 0.43 227.19 1.20 2510857 0.00 0.00 strcmp > 0.43 228.38 1.19 4200327 0.00 0.00 OrderedElemGetBase > 0.42 229.56 1.18 4245740 0.00 0.01 PortalHeapMemoryAlloc > 0.38 230.62 1.06 4252500 0.00 0.00 OrderedElemPush > 0.37 231.66 1.04 4205327 0.00 0.00 AllocSetGetFirst > 0.36 232.66 1.00 1965145 0.00 0.00 DLAddHead > > _mcount is part of the profiling apparatus and can be ignored (although > the fact that it's so high up implies that we're executing an awful lot > of short routine calls, because mcount runs once per entry and exit of > profilable routines). tree_cut, tree_insert, tree_concatenate, and > tree_delete are subroutines of malloc/free. Altogether, malloc/free > and friends accounted for 61.39 seconds out of the 180.88 non-overhead > CPU seconds in this run. > > In other words, we're spending a third of our time mallocing and freeing > memory. A tad high, what? > > Actually, it's worse than that, because AllocSetAlloc, > PortalHeapMemoryAlloc, AllocSetFree, and all of the OrderedElemXXX > routines represent our own bookkeeping layer atop malloc/free. > That's another 18.66 seconds spent in these top routines, which means > that we are real close to expending half the backend's runtime on > memory bookkeeping. This needs work. > > The other thing that jumps out here is the unreasonably high position of > recv(), which is called 962187 times. The script being read by psql was > only 957186 characters. Evidently we're invoking a kernel recv() call > once per character read from the frontend. I suspect this is an > inefficiency introduced by Magnus Hagander's recent rewrite of backend > libpq (see, I told you there was a reason for using stdio ;-)). We're > gonna have to do something about that, though it's not as critical as > the memory-allocation issue. It also appears that send() is now being > invoked multiple times per backend reply, which is going to create > inefficiencies outside the backend (ie, multiple packets per reply). > On a test case with a lot of SELECTs that would show up more than it > does here. > > > Moving on to the dynamic profile, the most interesting items are: > > index % time self children called name > ----------------------------------------------- > 0.05 118.06 1/1 DoBackend [7] > [8] 65.3 0.05 118.06 1 PostgresMain [8] > 0.04 57.64 5000/5000 pg_exec_query [9] > 0.01 39.83 5000/5000 CommitTransactionCommand [11] > 0.02 18.86 5001/5001 ReadCommand [27] > 0.01 1.18 5000/5000 StartTransactionCommand [118] > 0.01 0.36 5001/5001 ReadyForQuery [190] > 0.04 0.00 5001/5001 EnableNotifyInterrupt [385] > 0.04 0.00 5000/5000 strspn [388] > 0.00 0.01 1/1 InitPostgres [495] > 0.00 0.00 5000/548573 strlen [172] > 0.00 0.00 5001/93349 memset [366] > 0.00 0.00 1/1 read_pg_options [601] > 0.00 0.00 2/2 pq_putint [637] > 0.00 0.00 1/15002 pq_putnchar [123] > 0.00 0.00 2/10013 getenv [241] > 0.00 0.00 1/1 SetPgUserName [683] > 0.00 0.00 1/1 pq_init [687] > 0.00 0.00 1/1 proc_exit [704] > 0.00 0.00 1/1 pq_close [721] > 0.00 0.00 1/5002 getpgusername [521] > 0.00 0.00 1/2 FindExec [722] > 0.00 0.00 5/8 getopt [728] > 0.00 0.00 5001/5001 DisableNotifyInterrupt [761] > 0.00 0.00 9/34 pqsignal [806] > 0.00 0.00 2/5 atoi [837] > 0.00 0.00 1/2 SetProcessingMode [868] > 0.00 0.00 1/10002 TransactionFlushEnabled [749] > 0.00 0.00 1/1 sigsetjmp [934] > > ("self" is the amount of time (CPU sec) spent directly in the named function, > while "children" is the amount of time spent in its callees, pro-rated > by the number of calls. For example, strlen was actually measured to > use 0.47 sec in the whole run, but since PostgresMain called it less > than 1% of all the times it was called, PostgresMain gets blamed for > less than 0.01 sec of that total.) > > The interesting thing here is the relative times indicated for > pg_exec_query, CommitTransactionCommand, and ReadCommand. We already > found out why ReadCommand is so slow (recv() per character) ... but why > is CommitTransactionCommand so high, when I already discovered that > doing only one commit doesn't help much? Patience, we'll get there. > > ----------------------------------------------- > 0.03 57.61 5000/5000 pg_exec_query [9] > [10] 31.9 0.03 57.61 5000 pg_exec_query_dest [10] > 0.09 39.12 5000/5000 pg_parse_and_plan [13] > 0.00 17.37 5000/5000 ProcessQuery [36] > 0.02 0.60 5000/5000 SetQuerySnapshot [159] > 0.01 0.34 5000/5000 CommandCounterIncrement [196] > 0.02 0.05 10000/4291775 free [22] > ----------------------------------------------- > 0.09 39.12 5000/5000 pg_exec_query_dest [10] > [13] 21.7 0.09 39.12 5000 pg_parse_and_plan [13] > 0.02 19.28 5000/5000 parser [26] > 0.03 17.63 5000/5000 planner [34] > 0.00 1.98 5000/5000 QueryRewrite [90] > 0.02 0.05 10000/4292947 malloc [21] > 0.02 0.05 10000/4291775 free [22] > 0.00 0.02 5000/497779 lappend [91] > 0.01 0.00 5000/5000 IsAbortedTransactionBlockState [529] > 0.00 0.00 5000/72779 length [342] > > In other words, parsing, planning, and executing an INSERT each take > about the same amount of time. > > ----------------------------------------------- > 0.01 39.83 5000/5000 PostgresMain [8] > [11] 22.0 0.01 39.83 5000 CommitTransactionCommand [11] > 0.02 39.81 5000/5000 CommitTransaction [12] > ----------------------------------------------- > 0.02 39.81 5000/5000 CommitTransactionCommand [11] > [12] 22.0 0.02 39.81 5000 CommitTransaction [12] > 0.00 38.12 5000/5000 AtCommit_Memory [16] > 0.01 1.10 5000/5000 RecordTransactionCommit [122] > 0.00 0.30 5000/5000 AtCommit_Locks [211] > 0.02 0.10 5000/5000 AtEOXact_portals [286] > 0.03 0.07 5000/5000 DestroyTempRels [305] > 0.03 0.00 5000/5000 _lo_commit [425] > 0.01 0.00 5000/5000 AtCommit_Notify [522] > 0.01 0.00 5000/5000 CloseSequences [523] > 0.01 0.00 5000/5000 RelationPurgeLocalRelation [531] > 0.00 0.00 5000/10000 AtCommit_Cache [750] > ----------------------------------------------- > 0.00 38.12 5000/5000 CommitTransaction [12] > [16] 21.1 0.00 38.12 5000 AtCommit_Memory [16] > 0.01 38.11 5000/5000 EndPortalAllocMode [17] > 0.00 0.00 5000/10552 MemoryContextSwitchTo [747] > ----------------------------------------------- > 0.01 38.11 5000/5000 AtCommit_Memory [16] > [17] 21.1 0.01 38.11 5000 EndPortalAllocMode [17] > 0.54 37.55 5000/5000 AllocSetReset [18] > 0.01 0.00 5000/10000 PortalHeapMemoryGetVariableMemory [449] > 0.01 0.00 5000/5000 FixedStackPop [528] > 0.00 0.00 5000/51434 MemoryContextFree [367] > ----------------------------------------------- > 0.54 37.55 5000/5000 EndPortalAllocMode [17] > [18] 21.1 0.54 37.55 5000 AllocSetReset [18] > 1.47 32.51 4200327/4251762 AllocSetFree [20] > 1.04 2.53 4205327/4205327 AllocSetGetFirst [64] > > In other words, essentially *all* of the CPU time spent in > CommitTransaction is spent freeing memory. That's probably why > ganging the transactions doesn't help --- it's the same number of > memory blocks getting allocated and freed. > > 0.02 19.28 5000/5000 pg_parse_and_plan [13] > [26] 10.7 0.02 19.28 5000 parser [26] > 0.02 13.88 5000/5000 parse_analyze [41] > 1.52 3.81 5000/5000 yyparse [54] > 0.01 0.01 5000/5000 init_io [474] > 0.00 0.01 5000/420496 pstrdup [124] > 0.01 0.00 5000/5000 parser_init [533] > > Thomas might know why parse_analyze is taking so much time compared to > the rest of the parsing machinery... > > > I won't bother the list with the rest of the profile, although I'll be > happy to send it to anyone who wants to see all of it. Our immediate > performance problems seem pretty clear, however: time to rethink memory > management. > > regards, tom lane > > -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
pgsql-hackers by date: