Re: can we optimize STACK_DEPTH_SLOP - Mailing list pgsql-hackers

From Tom Lane
Subject Re: can we optimize STACK_DEPTH_SLOP
Date
Msg-id 20233.1467910511@sss.pgh.pa.us
In response to Re: can we optimize STACK_DEPTH_SLOP  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: can we optimize STACK_DEPTH_SLOP  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I found out that pmap can give much more fine-grained results than I was
getting before, if you give it the -x flag and then pay attention to the
"dirty" column rather than the "nominal size" column.  That gives a
reliable indication of how much stack space the process ever actually
touched, with resolution apparently 4KB on my machine.
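As a rough illustration of reading the "dirty" column, here is a minimal Python sketch. It assumes the `pmap -x` column order is Address, Kbytes, RSS, Dirty, Mode, Mapping; the header line in the sample is that assumption, not something taken from the log, and `stack_dirty_kb` is a hypothetical helper name.

```python
from typing import Optional


def stack_dirty_kb(pmap_x_output: str) -> Optional[int]:
    """Return the Dirty column (in KB) of the stack mapping, or None if absent."""
    for line in pmap_x_output.splitlines():
        # Only the stack mapping is of interest; skip everything else.
        if "[" not in line or "stack" not in line:
            continue
        fields = line.split()
        # Assumed layout: Address Kbytes RSS Dirty Mode Mapping...
        if len(fields) >= 5 and fields[3].isdigit():
            return int(fields[3])
    return None


# The header is an assumed layout; the data line is one stack entry in the
# shape reported for the regression run.
sample = """\
Address           Kbytes     RSS   Dirty Mode  Mapping
00007fff0f615000      84      36      36 rw---   [ stack ]
"""
print(stack_dirty_kb(sample))
```

Reading the fourth field rather than the second is the whole point: the second field is only the nominal size of the mapping, while the fourth reflects pages the process actually wrote to.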

I redid my measurements with commit 62c8421e8 applied, and now get results
like this for one run of the standard regression tests:

$ grep '\[ stack \]' postmaster.log | sort -k 4n | uniq -c
    137 00007fff0f615000      84      36      36 rw---    [ stack ]
     21 00007fff0f615000      84      40      40 rw---    [ stack ]
      4 00007fff0f615000      84      44      44 rw---    [ stack ]
     20 00007fff0f615000      84      48      48 rw---    [ stack ]
      8 00007fff0f615000      84      52      52 rw---    [ stack ]
      2 00007fff0f615000      84      56      56 rw---    [ stack ]
     10 00007fff0f615000      84      60      60 rw---    [ stack ]
      3 00007fff0f615000      84      64      64 rw---    [ stack ]
      3 00007fff0f615000      84      68      68 rw---    [ stack ]
      2 00007fff0f615000      84      72      72 rw---    [ stack ]
      1 00007fff0f612000      96      76      76 rw---    [ stack ]
      2 00007fff0f60e000     112     112     112 rw---    [ stack ]
      1 00007fff0f5e0000     296     296     296 rw---    [ stack ]
      1 00007fff0f427000    2060    2060    2060 rw---    [ stack ]
 

The rightmost numeric column is the "dirty KB in region" column, and 36KB
is the floor established by the postmaster.  (It looks like selecting
timezone is still the largest stack-space hog in that, but it's no longer
enough to make me want to do something about it.)  So now we're seeing
some cases that exceed that floor, which is good.  regex and errors are
still the outliers, as expected.
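To make the shape of that distribution easier to see, the tallies can be summarized with a short Python sketch. The (count, dirty KB) pairs are copied from the `uniq -c` output above; the `summarize` helper and its field names are only illustrative.

```python
# (sample count, dirty KB) pairs copied from the uniq -c output above.
tallies = [
    (137, 36), (21, 40), (4, 44), (20, 48), (8, 52), (2, 56), (10, 60),
    (3, 64), (3, 68), (2, 72), (1, 76), (2, 112), (1, 296), (1, 2060),
]


def summarize(pairs):
    """Summarize how many samples sit at the floor and how far the tail goes."""
    total = sum(count for count, _ in pairs)
    floor_kb = min(kb for _, kb in pairs)
    at_floor = sum(count for count, kb in pairs if kb == floor_kb)
    return {
        "samples": total,
        "floor_kb": floor_kb,
        "at_floor": at_floor,
        "above_floor": total - at_floor,
        "max_kb": max(kb for _, kb in pairs),
    }


print(summarize(tallies))
```

On these numbers, 137 of the 215 samples never leave the 36KB floor, and only the last few rows (112KB and up) stand apart from the rest.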

Also, I found that on OS X "vmmap -dirty" could produce results comparable
to pmap, so here are the numbers for the same test case on current OS X:

    154 Stack                             8192K      36K        2
      5 Stack                             8192K      40K        2
     11 Stack                             8192K      44K        2
      6 Stack                             8192K      48K        2
     11 Stack                             8192K      52K        2
      7 Stack                             8192K      56K        2
      8 Stack                             8192K      60K        2
      2 Stack                             8192K      64K        2
      2 Stack                             8192K      68K        2
      4 Stack                             8192K      72K        2
      1 Stack                             8192K      76K        2
      2 Stack                             8192K     108K        2
      1 Stack                             8192K     384K        2
      1 Stack                             8192K    2056K        2
 

(The "virtual" stack size seems to always be the same as ulimit -s,
i.e., 8MB by default, on this platform.)  This is good confirmation
that the actual stack consumption is pretty stable across different
compilers, though it looks like OS X's version of clang is a bit
more stack-wasteful for the regex recursion.

Based on these numbers, I'd have no fear of reducing STACK_DEPTH_SLOP
to 256KB on x86_64.  It would sure be good to check things on some
other architectures, though ...
        regards, tom lane


