Thread: Proof-of-concept for initdb-time shared_buffers selection

Proof-of-concept for initdb-time shared_buffers selection

From
Tom Lane
Date: Fri, 04 Jul 2003 15:29:37 -0400
The attached patch shows how initdb can dynamically determine reasonable
shared_buffers and max_connections settings that will work on the
current machine.  It consists of two trivial adjustments: the first rips
out the "PrivateMemory" code, so that a standalone backend allocates a
shared memory segment the same way a postmaster would; the second adds a
simple test loop in initdb that sees how large a setting will still allow
the backend to start.

The patch isn't quite complete since I didn't bother adding the few
lines of sed hacking needed to actually insert the selected values into
the installed postgresql.conf file, but that's just another few minutes'
work.  Adjusting the documentation to match would take a bit longer.
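
A minimal sketch of that sed step might look like the following; the
commented "#shared_buffers = 64"-style defaults in postgresql.conf.sample
are an assumption here, not something the patch establishes:

    # Hypothetical sketch only, not part of the posted patch: splice
    # the probed values into the sample config as it is installed.
    sed -e "s/^#shared_buffers = .*/shared_buffers = $nbuffers/" \
        -e "s/^#max_connections = .*/max_connections = $nconns/" \
        postgresql.conf.sample > "$PGDATA"/postgresql.conf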

We might also want to tweak initdb to print a warning message if it's
forced to select very small values, but I didn't do that yet.

Questions for the list:

1. Does this approach seem like a reasonable solution to our problem
of some machines having unrealistically small kernel limits on shared
memory?  (A sketch of inspecting those kernel limits follows this list.)

2. If so, can I get away with applying this post-feature-freeze?  I can
argue that it's a bug fix, but perhaps some will disagree.

3. What should be the set of tested values?  I have it as
   buffers: first to work of 1000 900 800 700 600 500 400 300 200 100 50
   connections: first to work of 100 50 40 30 20 10
but we could certainly argue for different rules.
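
As background for question 1: the kernel limits at issue are the SysV
SHMMAX/SHMALL ceilings.  A minimal way to inspect and raise them on a
Linux machine of this era (the /proc paths are Linux-specific; other
platforms use different knobs) is:

    # Inspect the SysV shared memory ceilings via Linux's /proc:
    cat /proc/sys/kernel/shmmax    # maximum bytes per shared memory segment
    cat /proc/sys/kernel/shmall    # maximum total shared memory, in pages
    # Raise SHMMAX to 128MB until the next reboot (needs root):
    echo 134217728 > /proc/sys/kernel/shmmax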

            regards, tom lane


*** src/backend/port/sysv_shmem.c.orig    Thu May  8 15:17:07 2003
--- src/backend/port/sysv_shmem.c    Fri Jul  4 14:47:51 2003
***************
*** 45,52 ****
  static void *InternalIpcMemoryCreate(IpcMemoryKey memKey, uint32 size);
  static void IpcMemoryDetach(int status, Datum shmaddr);
  static void IpcMemoryDelete(int status, Datum shmId);
- static void *PrivateMemoryCreate(uint32 size);
- static void PrivateMemoryDelete(int status, Datum memaddr);
  static PGShmemHeader *PGSharedMemoryAttach(IpcMemoryKey key,
                                          IpcMemoryId *shmid, void *addr);

--- 45,50 ----
***************
*** 243,283 ****
  }


- /* ----------------------------------------------------------------
-  *                        private memory support
-  *
-  * Rather than allocating shmem segments with IPC_PRIVATE key, we
-  * just malloc() the requested amount of space.  This code emulates
-  * the needed shmem functions.
-  * ----------------------------------------------------------------
-  */
-
- static void *
- PrivateMemoryCreate(uint32 size)
- {
-     void       *memAddress;
-
-     memAddress = malloc(size);
-     if (!memAddress)
-     {
-         fprintf(stderr, "PrivateMemoryCreate: malloc(%u) failed\n", size);
-         proc_exit(1);
-     }
-     MemSet(memAddress, 0, size);    /* keep Purify quiet */
-
-     /* Register on-exit routine to release storage */
-     on_shmem_exit(PrivateMemoryDelete, PointerGetDatum(memAddress));
-
-     return memAddress;
- }
-
- static void
- PrivateMemoryDelete(int status, Datum memaddr)
- {
-     free(DatumGetPointer(memaddr));
- }
-
-
  /*
   * PGSharedMemoryCreate
   *
--- 241,246 ----
***************
*** 289,294 ****
--- 252,260 ----
   * collision with non-Postgres shmem segments.    The idea here is to detect and
   * re-use keys that may have been assigned by a crashed postmaster or backend.
   *
+  * makePrivate means to always create a new segment, rather than attach to
+  * or recycle any existing segment.
+  *
   * The port number is passed for possible use as a key (for SysV, we use
   * it to generate the starting shmem key).    In a standalone backend,
   * zero will be passed.
***************
*** 323,342 ****

      for (;;NextShmemSegID++)
      {
-         /* Special case if creating a private segment --- just malloc() it */
-         if (makePrivate)
-         {
-             memAddress = PrivateMemoryCreate(size);
-             break;
-         }
-
          /* Try to create new segment */
          memAddress = InternalIpcMemoryCreate(NextShmemSegID, size);
          if (memAddress)
              break;                /* successful create and attach */

          /* Check shared memory and possibly remove and recreate */
!
          if ((hdr = (PGShmemHeader *) memAddress = PGSharedMemoryAttach(
                          NextShmemSegID, &shmid, UsedShmemSegAddr)) == NULL)
              continue;            /* can't attach, not one of mine */
--- 289,304 ----

      for (;;NextShmemSegID++)
      {
          /* Try to create new segment */
          memAddress = InternalIpcMemoryCreate(NextShmemSegID, size);
          if (memAddress)
              break;                /* successful create and attach */

          /* Check shared memory and possibly remove and recreate */
!
!         if (makePrivate)        /* a standalone backend shouldn't do this */
!             continue;
!
          if ((hdr = (PGShmemHeader *) memAddress = PGSharedMemoryAttach(
                          NextShmemSegID, &shmid, UsedShmemSegAddr)) == NULL)
              continue;            /* can't attach, not one of mine */
*** src/backend/utils/init/postinit.c.orig    Fri Jun 27 10:45:30 2003
--- src/backend/utils/init/postinit.c    Fri Jul  4 14:47:43 2003
***************
*** 176,187 ****
      {
          /*
           * We're running a postgres bootstrap process or a standalone backend.
!          * Create private "shmem" and semaphores.  Force MaxBackends to 1 so
!          * that we don't allocate more resources than necessary.
           */
-         SetConfigOption("max_connections", "1",
-                         PGC_POSTMASTER, PGC_S_OVERRIDE);
-
          CreateSharedMemoryAndSemaphores(true, MaxBackends, 0);
      }
  }
--- 176,183 ----
      {
          /*
           * We're running a postgres bootstrap process or a standalone backend.
!          * Create private "shmem" and semaphores.
           */
          CreateSharedMemoryAndSemaphores(true, MaxBackends, 0);
      }
  }
*** src/bin/initdb/initdb.sh.orig    Fri Jul  4 12:41:21 2003
--- src/bin/initdb/initdb.sh    Fri Jul  4 15:19:11 2003
***************
*** 579,584 ****
--- 579,618 ----

  ##########################################################################
  #
+ # DETERMINE PLATFORM-SPECIFIC CONFIG SETTINGS
+ #
+ # Use reasonable values if kernel will let us, else scale back
+
+ cp /dev/null "$PGDATA"/postgresql.conf
+
+ $ECHO_N "selecting default shared_buffers... "$ECHO_C
+
+ for nbuffers in 1000 900 800 700 600 500 400 300 200 100 50
+ do
+     TEST_OPT="$PGSQL_OPT -c shared_buffers=$nbuffers -c max_connections=5"
+     if "$PGPATH"/postgres $TEST_OPT template1 </dev/null >/dev/null 2>&1
+     then
+     break
+     fi
+ done
+
+ echo "$nbuffers"
+
+ $ECHO_N "selecting default max_connections... "$ECHO_C
+
+ for nconns in 100 50 40 30 20 10
+ do
+     TEST_OPT="$PGSQL_OPT -c shared_buffers=$nbuffers -c max_connections=$nconns"
+     if "$PGPATH"/postgres $TEST_OPT template1 </dev/null >/dev/null 2>&1
+     then
+     break
+     fi
+ done
+
+ echo "$nconns"
+
+ ##########################################################################
+ #
  # CREATE CONFIG FILES

  $ECHO_N "creating configuration files... "$ECHO_C

Re: Proof-of-concept for initdb-time shared_buffers selection

From
Michael Meskes
Date:
On Fri, Jul 04, 2003 at 03:29:37PM -0400, Tom Lane wrote:
> 2. If so, can I get away with applying this post-feature-freeze?  I can
> argue that it's a bug fix, but perhaps some will disagree.

I'd say it is a bug fix.

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De
ICQ: 179140304, AIM: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!


Re: Proof-of-concept for initdb-time shared_buffers selection

From
Darcy Buskermolen
Date:
On Friday 04 July 2003 13:31, Michael Meskes wrote:
> On Fri, Jul 04, 2003 at 03:29:37PM -0400, Tom Lane wrote:
> > 2. If so, can I get away with applying this post-feature-freeze?  I can
> > argue that it's a bug fix, but perhaps some will disagree.
>
> I'd say it is a bug fix.
>
> Michael

I'm with you, Michael/Tom, on this one as well.  Let's at least get this
framework in place; we can always experiment with what values we settle on.


--
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx:  250.763.1759
http://www.wavefire.com

Re: [PATCHES] Proof-of-concept for initdb-time shared_buffers selection

From
Joe Conway
Date:
Tom Lane wrote:
> 1. Does this approach seem like a reasonable solution to our problem
> of some machines having unrealistically small kernel limits on shared
> memory?

Yes, it does to me.

> 2. If so, can I get away with applying this post-feature-freeze?  I can
> argue that it's a bug fix, but perhaps some will disagree.

I'd go with calling it a bug fix, or rather plugging a known deficiency.

> 3. What should be the set of tested values?  I have it as
>    buffers: first to work of 1000 900 800 700 600 500 400 300 200 100 50
>    connections: first to work of 100 50 40 30 20 10
> but we could certainly argue for different rules.

These seem reasonable.  Even if the highest values fly, we might want
to output a message that tuning is recommended for best performance.

Joe


Re: [PATCHES] Proof-of-concept for initdb-time shared_buffers selection

From
Bruno Wolff III
Date:
On Fri, Jul 04, 2003 at 15:29:37 -0400,
  Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> 3. What should be the set of tested values?  I have it as
>    buffers: first to work of 1000 900 800 700 600 500 400 300 200 100 50
>    connections: first to work of 100 50 40 30 20 10
> but we could certainly argue for different rules.

Should the default max number of connections first try something greater
than what Apache sets by default (256 for prefork, 400 for worker)?

Re: [PATCHES] Proof-of-concept for initdb-time shared_buffers selection

From
Tom Lane
Date:
Bruno Wolff III <bruno@wolff.to> writes:
> Should the default max number of connections first try something greater
> than what Apache sets by default (256 for prefork, 400 for worker)?

We could do that.  I'm a little worried about setting default values
that are likely to cause problems with exhausting the kernel's fd table
(nfiles limit).  If anyone actually tries to run 256 or 400 backends
without having increased nfiles and/or twiddled our
max_files_per_process setting, they're likely to have serious problems.
(There could be some objection even to max_connections 100 on this
ground.)
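
To put rough numbers on that worry (the figures below are illustrative
assumptions, not measurements): max_files_per_process defaulted to 1000,
so the worst case grows multiplicatively with max_connections.

    # Back-of-envelope fd check; all values illustrative.
    max_connections=400
    max_files_per_process=1000
    echo $((max_connections * max_files_per_process))   # up to 400000 fds
    cat /proc/sys/fs/file-max    # kernel-wide fd limit on Linux, to compare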

We could imagine having initdb reduce max_files_per_process to prevent
such problems, but then you'd be talking about giving up performance to
accommodate a limit that the user might not ever approach in practice.
You really don't want the thing selecting parameters on the basis of
unrealistic estimates of what max_connections needs to be.

Ultimately there's no substitute for some user input about what you're
planning to do with the database, and possibly adjustment of kernel
settings along with PG settings, if you're planning to run serious
applications.  initdb can't be expected to do this unless you want to
make it interactive, which would certainly make the RPM guys really
unhappy.

I'd rather see such considerations pushed off to a separate tool,
some kind of "configuration wizard" perhaps.

            regards, tom lane

Re: Proof-of-concept for initdb-time shared_buffers selection

From
Manfred Koizar
Date:
On Fri, 04 Jul 2003 15:29:37 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>The attached patch shows how initdb can dynamically determine reasonable
>shared_buffers and max_connections settings that will work on the
>current machine.

Can't this be done on postmaster startup?  I'm thinking of two GUC
variables where there is only one today: min_shared_buffers and
max_shared_buffers.  If allocation at the max_ values fails, the
numbers are decreased in a loop of, say, 10 steps until allocation
succeeds, or finally fails even at the min_ values.

The actual values chosen are reported as a NOTICE and can be inspected
as readonly GUC variables.
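
Manfred's loop would live in C inside the postmaster itself; purely as
an illustration of the same back-off idea, an external wrapper might
look like this (the value ladder and the pg_ctl invocation are
assumptions, not his proposal):

    # Illustration of the back-off idea as a wrapper script, not the
    # in-postmaster implementation Manfred describes.
    for nbuffers in 2000 1500 1000 500 250 125 64
    do
        if pg_ctl start -w -D "$PGDATA" -o "-c shared_buffers=$nbuffers"
        then
            echo "postmaster started with shared_buffers=$nbuffers"
            break
        fi
    done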

This would make life easier for the folks trying to come up with
default .conf files, e.g.
  min_shared_buffers = 64
  max_shared_buffers = 2000
could cover a fairly large range of low level to mid level machines.

A paranoid dba, who doesn't want the postmaster to do unpredictable
things on startup, can always set min_xxx == max_xxx to get the
current behaviour.

Servus
 Manfred

Re: Proof-of-concept for initdb-time shared_buffers selection

From
Tom Lane
Date:
Manfred Koizar <mkoi-pg@aon.at> writes:
> On Fri, 04 Jul 2003 15:29:37 -0400, Tom Lane <tgl@sss.pgh.pa.us>
> wrote:
>> The attached patch shows how initdb can dynamically determine reasonable
>> shared_buffers and max_connections settings that will work on the
>> current machine.

> Can't this be done on postmaster startup?

Why would that be a good idea?  Seems to me it just offers a fresh
opportunity to do the wrong thing at every startup.  We've had troubles
enough with problems that appear only when the postmaster is started by
hand rather than by boot script, or vice versa; this would just add
another unknown to the equation.

> This would make life easier for the folks trying to come up with
> default .conf files, e.g.
>   min_shared_buffers = 64
>   max_shared_buffers = 2000
> could cover a fairly large range of low level to mid level machines.

Not unless their notion of a default .conf file includes a preinstalled
$PGDATA directory.  Under ordinary circumstances, initdb will get run
locally on the target machine, and should come up with a valid value.

            regards, tom lane

Re: Proof-of-concept for initdb-time shared_buffers selection

From
Josh Berkus
Date:
Manfred,

> Can't this be done on postmaster startup?  I think of two GUC
> variables where there is only one today: min_shared_buffers and
> max_shared_buffers.  If allocation for the max_ values fails, the
> numbers are decreased in a loop of, say, 10 steps until allocation
> succeeds, or even fails at the min_ values.

I think the archives are back up.  Take a look at this thread; we already had 
this discussion at some length, and decided that a max of 1000 was reasonable 
in advance of user tuning.  And, I believe, Tom has already written the code.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco