Modeling consumed shmem sizes, and some thorns - Mailing list pgsql-hackers

From Daniel Farina
Subject Modeling consumed shmem sizes, and some thorns
Date
Msg-id CAAZKuFZrsYpA85c3uBwsrVVdAYxMJ_K1SkxKPp+M9KxomCq8SQ@mail.gmail.com
Responses Re: Modeling consumed shmem sizes, and some thorns
List pgsql-hackers
Hello List,

I'd like to share with you some experiences we've had while
investigating what we'd have to do to make very-very tiny databases.

First, the formulae at
http://www.postgresql.org/docs/9.1/static/kernel-resources.html#SHARED-MEMORY-PARAMETERS
(17-2) seem misleading, particularly with regard to the overhead of
supporting a large number of connections: they undercount by rather a
lot.  The estimate of 270 bytes per lock seems far too low on
Linux-amd64.  In addition, these numbers appear undercounted by 50% or
more because of the addition of predicate locks in 9.1.

Presuming the documented base cost of 1800 bytes per connection is
still right, experimentally the right numbers seem closer to 700 bytes
per max_locks_per_transaction slot and 650 bytes per
max_pred_locks_per_transaction slot, although there appears to be some
non-linear behavior that makes this a hazy projection.
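To make those rough figures concrete, here is a back-of-the-envelope
estimate of lock-related shmem per connection.  The constants are the
hazy experimental measurements above (not authoritative), and the
500-connection example is hypothetical:

```python
# Rough lock-related shared memory estimate, using the hazy
# experimental per-slot figures from this message (bytes, approximate).
BASE_PER_CONNECTION = 1800  # documented base cost per connection
PER_HEAVYWEIGHT_LOCK = 700  # per max_locks_per_transaction slot
PER_PREDICATE_LOCK = 650    # per max_pred_locks_per_transaction slot

def lock_shmem_estimate(max_connections,
                        max_locks_per_transaction=64,
                        max_pred_locks_per_transaction=64):
    """Estimated lock-related shared memory, in bytes."""
    per_conn = (BASE_PER_CONNECTION
                + PER_HEAVYWEIGHT_LOCK * max_locks_per_transaction
                + PER_PREDICATE_LOCK * max_pred_locks_per_transaction)
    return max_connections * per_conn

# e.g. 500 connections at the default lock settings:
print(lock_shmem_estimate(500))  # 44100000, i.e. roughly 44 MB
```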

Besides accuracy, there is a thornier problem here that has to do with
hot standby (although the use case is replication more generally) when
one has heterogeneously sized database resources.  As-is, the settings
that size the locking-related structures -- max_connections,
max_prepared_xacts, and max_locks_per_xact (but not the predicate-lock
settings; is that an oversight?) -- must be at least as large on a
standby as on its primary.

In a heterogeneous environment using WAL-based replication, that means
that to obtain unity and full compatibility among different-sized
systems one must configure every database for the largest number of
connections supported by any member of the fleet.  On a small database,
those connection slots can occupy a large fraction of the overall
memory: on a database intended to receive only 100MB of shmem, the
lock-related consumption can approach nearly 50% of the total, which is
rather unfortunate.  I can see why that would be hard to fix (maybe a
more logical replication layer is a better investment of time), but I
thought it an interesting consideration worth discussing.
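Plugging the rough per-connection figure above into the 100MB case
shows how that fraction arises.  The 500-connection fleet maximum here
is a hypothetical illustration, not a measured configuration:

```python
# Fraction of a small shmem budget consumed by lock-related structures
# when max_connections must match the largest member of the fleet.
# Constants are the rough experimental figures from this message.
PER_CONNECTION_BYTES = 1800 + 700 * 64 + 650 * 64  # default lock GUCs

shmem_budget = 100 * 1024 * 1024  # a small, 100MB database
fleet_max_connections = 500       # forced by the largest system

lock_bytes = fleet_max_connections * PER_CONNECTION_BYTES
print(f"{lock_bytes / shmem_budget:.0%} of shmem")  # about 42%
```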

-- 
fdr

