Thread: shared memory release following failed lock acquirement.

shared memory release following failed lock acquirement.

From: "Merlin Moncure"
Tom,

I noticed your recent corrections to lock.c regarding the release of
locks in an out-of-shared-memory condition.  This may or may not be
relevant, but when I deliberately use up all the lock space with user
locks, the server runs out of shared memory and stays out until it is
restarted (the space is not reclaimed when the backend shuts down, as it
is supposed to be).

In other words, after doing a

    select user_write_lock_oid(t.oid) from big_table t;

it's server restart time.

What's really interesting about this is that the pg_locks view (after
the offending disconnects) reports nothing out of the ordinary even
though no backends can acquire locks after that point.

Merlin





Re: shared memory release following failed lock acquirement.

From: Tom Lane
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> In other words, after doing a select user_write_lock_oid(t.oid) from
> big_table t;
> It's server restart time.

User locks are not released at transaction failure.  Quitting that
backend should have got you out of it, however.

> What's really interesting about this is that the pg_locks view (after
> the offending disconnects) reports nothing out of the ordinary even
> though no backends can acquire locks after that point.

User locks are not shown in pg_locks, either.

There is a secondary issue here, which is that we don't have provision
to recycle hash table entries back into the general shared memory pool
(mainly because there *is* no "shared memory pool", only never-yet-
allocated space).  So when you do release these locks, the freed space
only goes back to the lock hash table's freelist.  That means there
won't be any space for expansion of the buffer hash table, nor any other
shared data structures.  This could lead to problems if you hadn't been
running the server long enough to expand the buffer table to full size.

I don't think it's practical to introduce a real shared memory
allocator, but maybe we could alleviate the worst risks by forcing the
buffer hash table up to full size immediately at startup.  I'll look at
this.
        regards, tom lane


Re: shared memory release following failed lock acquirement.

From: "Merlin Moncure"
> "Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> > In other words, after doing a select user_write_lock_oid(t.oid) from
> > big_table t;
> > It's server restart time.
>
> User locks are not released at transaction failure.  Quitting that
> backend should have got you out of it, however.

Right, my point being, it doesn't.
> > What's really interesting about this is that the pg_locks view (after
> > the offending disconnects) reports nothing out of the ordinary even
> > though no backends can acquire locks after that point.
>
> User locks are not shown in pg_locks, either.

Well, actually, they are.  The lock tag values are not shown, but they
do show up as mostly blank entries in the view.
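
For reference, all I'm looking at is the plain view (the exact column set
varies a bit by version):

    select * from pg_locks;

The rows for the user locks come back with most of the identifying columns
blank, which is why nothing looked out of the ordinary at first glance.
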
> There is a secondary issue here, which is that we don't have provision
> to recycle hash table entries back into the general shared memory pool
> (mainly because there *is* no "shared memory pool", only never-yet-
> allocated space).  So when you do release these locks, the freed space
> only goes back to the lock hash table's freelist.  That means there
> won't be any space for expansion of the buffer hash table, nor any other
> shared data structures.  This could lead to problems if you hadn't been
> running the server long enough to expand the buffer table to full size.

OK, this perhaps explains it.  You are saying then that I am running the
server out of shared memory, not necessarily space in the lock table.  I
jumped to the conclusion that the memory associated with the locks might
not have been getting freed.
> I don't think it's practical to introduce a real shared memory
> allocator, but maybe we could alleviate the worst risks by forcing the
> buffer hash table up to full size immediately at startup.  I'll look at
> this.

This still doesn't fix the problem (albeit a low-priority problem,
currently confined to a contrib module) of user locks eating up all the
space in the lock table.  There are a couple of different ways to look at
fixing this.  My first thought is to bump the error level for running out
of lock table space up to 'fatal'.

Merlin


Re: shared memory release following failed lock acquirement.

From: "Merlin Moncure"
tgl wrote:
> There is a secondary issue here, which is that we don't have provision
> to recycle hash table entries back into the general shared memory pool
> (mainly because there *is* no "shared memory pool", only never-yet-
> allocated space).  So when you do release these locks, the freed space
> only goes back to the lock hash table's freelist.  That means there
> won't be any space for expansion of the buffer hash table, nor any other
> shared data structures.  This could lead to problems if you hadn't been
> running the server long enough to expand the buffer table to full size.

Ok, I confirmed that I'm running the server out of shared memory space,
not necessarily the lock table.  My server settings were:
max_connections: 100
shared_buffers: 8192 buffers
max_locks_per_transaction: 64 (stock)

According to postgresql.conf, using these settings the lock table eats
64*260*100 bytes = < 2M.  Well, if it's running my server out of shared
memory, it's eating much, much more shmem than previously thought.
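
Spelling that estimate out (straight arithmetic from the numbers in
postgresql.conf):

    select 64 * 260 * 100;  -- max_locks * bytes per lock * max_connections
                            -- = 1,664,000 bytes, a bit under 2 MB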

Also, I was able to acquire around 10k locks before the server borked.
This is obviously a lot more than 64*100.  However, I set the
max_locks down to 10 and this did affect how many locks could be
acquired (and in this case, a server restart was not required).

Doubling shared buffers to 16k bumped my limit to over 20k locks, but
less than 25k.  As I see it, this means the user-locks (and perhaps all
locks...?) eat around ~ 6k bytes memory each.

This is not really a big deal; 10k locks is way more than a lock-heavy
application would be expected to use.  I'll look into this a bit more...

Merlin




Re: shared memory release following failed lock acquirement.

From
Tom Lane
Date:
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> According to postgresql.conf, using these settings the lock table eats
> 64*260*100 bytes = < 2M.  Well, if it's running my server out of shared
> memory, it's eating much, much more shmem than previously thought.

Hmm, the 260 is out of date I think.  I was seeing about 184 bytes/lock
in my tests just now.

> Also, I was able to acquire around 10k locks before the server borked.
> This is obviously a lot more than 64*100.

Sure, because there's about 100K of deliberate slop in the shared memory
size allocation, and you are probably also testing a scenario where the
buffer and FSM hash tables haven't ramped to full size yet, so the lock
table is able to eat more than the nominal amount of space.

> As I see it, this means the user-locks (and perhaps all
> locks...?) eat around ~ 6k bytes memory each.

They're allocated in groups of 32, which would work out to close to 6k;
maybe you were measuring the incremental cost of allocating the first one?
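
To put a number on that, using the ~184 bytes/lock I was seeing:

    select 32 * 184;   -- 5888 bytes for one 32-entry allocation group

which is where the "close to 6k" comes from.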

I did some digging, and as far as I can see the only shared memory
allocations that occur after postmaster startup are for the four shmem
hash tables: buffers, FSM relations, locks, and proclocks.  Of these,
the buffer and FSM hashtables have predetermined maximum sizes.  So
arranging for the space in those tables to be fully preallocated should
prevent any continuing problems from lock table overflow.  I've
committed a fix that does this.  I verified that after running the thing
out of shared memory via creating a lot of user locks and then releasing
same, I could run the regression tests.
        regards, tom lane


Re: shared memory release following failed lock acquirement.

From: "Merlin Moncure"
Tgl wrote:
> > As I see it, this means the user-locks (and perhaps all
> > locks...?) eat around ~ 6k bytes memory each.
>
> They're allocated in groups of 32, which would work out to close to 6k;
> maybe you were measuring the incremental cost of allocating the first one?

I got my 6k figure by dividing 10000 into 64M, 10000 being the value
that crashed the server.  That's reasonable because doubling shared
buffers slightly more than doubled the crash value.
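
In numbers (8192 buffers at 8K apiece = 64 MB of shared buffers):

    select (8192 * 8192) / 10000;  -- roughly 6.7K per lock, by that reasoning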

I was wondering how ~ 10k locks ran me out of shared memory when each
lock takes ~ 260b (half that, as you say) and I am running 8k buffers =
64M.

260 * 100 backends * 64 maxlocks = 1.7 M.  Sure, the hash table and
other stuff adds some...but this is nowhere near what it should take to
run me out.

Am I just totally misunderstanding how to estimate locks memory
consumption?

Merlin


Re: shared memory release following failed lock acquirement.

From: Tom Lane
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> I was wondering how ~ 10k locks ran me out of shared memory when each
> lock takes ~ 260b (half that, as you say) and I am running 8k buffers =
> 64M.

The number of buffers you have doesn't have anything to do with this.
The question is how much shared memory space is there for the lock
table, above and beyond what's used for everything else (such as
buffers).

I just went through and corrected some minor errors in the calculation
of shared memory block size (mostly stuff where the estimation code had
gotten out of sync with the actual work over time).  I now find that
with all-default configuration parameters I can create 7808 locks before
running out of shared memory, rather than the promised 6400.  (YMMV due
to platform-specific differences in MAXALIGN, sizeof(pointer), etc.)
This is coming from two places: LockShmemSize deliberately adds on 10%
slop factor to its calculation of the lock table size, and then
CreateSharedMemoryAndSemaphores adds on 100KB for safety margin.  Both
of those numbers are kinda pulled from the air, but I don't see a strong
reason to change them.  The other space calculations seem to be pretty
nearly dead-on.
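
Rough arithmetic, for what it's worth (the exact count will shift with
alignment and with how much of the margin the other tables leave unused):

    select 64 * 100;        -- 6400 promised lock entries
    select 64 * 100 * 1.1;  -- ~7040 after LockShmemSize's 10% slop
    -- the rest of the way to 7808 comes out of the 100KB safety margin
    -- and whatever slop the other size estimates don't use up
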
        regards, tom lane


Re: shared memory release following failed lock acquirement.

From: "Simon Riggs"
>Tom Lane
> "Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> > According to postgresql.conf, using these settings the lock table eats
> > 64*260*100 bytes = < 2M.  Well, if it's running my server out of shared
> > memory, it's eating much, much more shmem than previously thought.
>
> Hmm, the 260 is out of date I think.  I was seeing about 184 bytes/lock
> in my tests just now.
>
> > Also, I was able to acquire around 10k locks before the server borked.
> > This is obviously a lot more than 64*100.
>
> Sure, because there's about 100K of deliberate slop in the shared memory
> size allocation, and you are probably also testing a scenario where the
> buffer and FSM hash tables haven't ramped to full size yet, so the lock
> table is able to eat more than the nominal amount of space.
>
> > As I see it, this means the user-locks (and perhaps all
> > locks...?) eat around ~ 6k bytes memory each.
>
> They're allocated in groups of 32, which would work out to close to 6k;
> maybe you were measuring the incremental cost of allocating the first one?
>
> I did some digging, and as far as I can see the only shared memory
> allocations that occur after postmaster startup are for the four shmem
> hash tables: buffers, FSM relations, locks, and proclocks.  Of these,
> the buffer and FSM hashtables have predetermined maximum sizes.  So
> arranging for the space in those tables to be fully preallocated should
> prevent any continuing problems from lock table overflow.  I've
> committed a fix that does this.  I verified that after running the thing
> out of shared memory via creating a lot of user locks and then releasing
> same, I could run the regression tests.
>

A few questions:

Is that fix in 8.0?

Does this mean that the parameter max_locks_per_transaction isn't honoured
at all, it is just used to size the lock table - which itself can expand
beyond that max limit in various circumstances? (Though with the bug fix,
not THAT much more than the max limit)
Should we rename and redocument the parameter? If that is so, the current
name is so far away from its real meaning as to constitute a bug in
itself....

Best Regards, Simon Riggs




Re: shared memory release following failed lock acquirement.

From: Tom Lane
"Simon Riggs" <simon@2ndquadrant.com> writes:
> Does this mean that the parameter max_locks_per_transaction isn't honoured
> at all, it is just used to size the lock table

Yes, and that's how it's documented.
        regards, tom lane


Re: shared memory release following failed lock acquirement.

From: "Simon Riggs"
>Tom Lane
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > Does this mean that the parameter max_locks_per_transaction isn't honoured
> > at all, it is just used to size the lock table
>
> Yes, and that's how it's documented.
>

The name max_locks_per_transaction indicates a limit of some kind. The
documentation doesn't mention anything about whether that limit is enforced
or not.

I suggest the additional wording:
"This parameter is not a hard limit: No limit is enforced on the number of
locks in each transaction. System-wide, the total number of locks is limited
by the size of the lock table."

The recent patch stops the system from crashing with an out-of-memory
condition, though it probably slightly hastens the point at which no locks
are available. It would be good to clarify what behaviour the system
exhibits when we run out of locks.

I'm not sure myself now what that behaviour is: my understanding is that we
do not perform lock escalation (as DB2 does), so presumably we just grind to
a halt? I take it that there is no automated way of getting out of this
situation, i.e. the deadlock detector doesn't start killing transactions
that hold lots of locks to free up space? So we would basically just start
to build up lots of people waiting on locks - though without any mechanism
for diagnosing that this is happening? What does happen, and where does it
end (now)?

Best Regards, Simon Riggs



Re: shared memory release following failed lock acquirement.

From: "Merlin Moncure"
> The name max_locks_per_transaction indicates a limit of some kind. The
> documentation doesn't mention anything about whether that limit is enforced
> or not.
>
> I suggest the additional wording:
> "This parameter is not a hard limit: No limit is enforced on the number of
> locks in each transaction. System-wide, the total number of locks is
> limited by the size of the lock table."


I think it's worse than that.  First of all, user locks persist outside
of transactions, but they still count against this limit.  A more
appropriate name for the GUC variable would be
'estimated_lock_table_size_per_backend', or something like that.  I've
been putting some thought into reworking the userlock contrib module into
something acceptable for inclusion in the main project, a substantial part
of that being documentation changes.
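
For example, with the contrib module as it stands, the only ways to get
the space back are to disconnect or to release the locks explicitly -
something along these lines, assuming the unlock counterparts follow the
same naming as the lock functions:

    select user_write_unlock_oid(t.oid) from big_table t;
    -- or, to drop every user lock this backend holds:
    select user_unlock_all();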

Merlin


Re: shared memory release following failed lock acquirement.

From: "Simon Riggs"
> Merlin Moncure
> > The name max_locks_per_transaction indicates a limit of some kind. The
> > documentation doesn't mention anything about whether that limit is enforced
> > or not.
> >
> > I suggest the additional wording:
> > "This parameter is not a hard limit: No limit is enforced on the number of
> > locks in each transaction. System-wide, the total number of locks is
> > limited by the size of the lock table."
>
>
> I think it's worse than that.  First of all, user locks persist outside
> of transactions, but they still count against this limit.

I was really thinking of the standard locking case. Yes, user locks make it
worse.

> A more appropriate name
> for the GUC variable would be 'estimated_lock_table_size_per_backend',
> or something like that.  I've been putting some thought into reworking
> the userlock contrib module into something acceptable for inclusion in
> the main project, a substantial part of that being documentation changes.
>

I agree a renamed parameter would be more appropriate, though I suspect a
more accurate name will be about 5 yards long.

A documentation change would be worthwhile here... but I'll wait for your
changes before doing anything there,

Best Regards, Simon Riggs