Re: Berkeley DB... - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Berkeley DB...
Date
Msg-id 2113.959355596@sss.pgh.pa.us
In response to Re: Berkeley DB...  (Hannu Krosing <hannu@tm.ee>)
Responses Re: Berkeley DB...  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
Hannu Krosing <hannu@tm.ee> writes:
>> As Vadim points out in his comparison
>> of COPY vs. INSERT, something is *wrong* with the time it takes
>> for PostgreSQL to parse, plan, rewrite, and optimize.

We might have part of the story in the recently noticed fact that
each insert/update query begins by doing a seqscan of pg_index.

I have done profiles of INSERT in the past and not found any really
spectacular bottlenecks (but I was looking at a test table with no
indexes, so I failed to see the pg_index problem :-().  Last time
I did it, I had these top profile entries for inserting 100,000 rows
of 30 columns apiece:
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
30.08    290.79   290.79                             _mcount
 6.48    353.46    62.67 30702766     0.00     0.00  AllocSetAlloc
 5.27    404.36    50.90   205660     0.25     0.25  write
 3.06    433.97    29.61 30702765     0.00     0.00  MemoryContextAlloc
 2.74    460.45    26.48   100001     0.26     0.74  yyparse
 2.63    485.86    25.41 24300077     0.00     0.00  newNode
 2.22    507.33    21.47  3900054     0.01     0.01  yylex
 1.63    523.04    15.71 30500751     0.00     0.00  PortalHeapMemoryAlloc
 1.31    535.68    12.64  5419526     0.00     0.00  hash_search
 1.18    547.11    11.43  9900000     0.00     0.00  expression_tree_walker
 1.01    556.90     9.79  3526752     0.00     0.00  SpinRelease

While the time spent in memory allocation is annoying, that's only about
ten mallocs per parsed data expression, so it's unlikely that we will be
able to improve on it very much.  (We could maybe avoid having *three*
levels of subroutine call to do an alloc, though ;-).)  Unless you are
smarter than the flex and bison guys you are not going to be able to
improve on the lex/parse times either.  The planner isn't even showing
up for a simple INSERT.  Not much left, unless you can figure out how
to write and commit a tuple with less than two disk writes.

But, as I said, this was a case with no indexes to update.

I intend to do something about caching pg_index info ASAP in the 7.1
cycle, and then we can see how much of a difference that makes...
        regards, tom lane

