Thread: shared memory release following failed lock acquisition
Tom, I noticed your recent corrections to lock.c regarding the release of locks in an out-of-shared-memory condition. This may or may not be relevant, but when I deliberately use up all the lock space with user locks, the server runs out of shared memory and stays out until it is restarted (not until the backend shuts down, as it is supposed to). In other words, after doing a

    select user_write_lock_oid(t.oid) from big_table t;

it's server restart time.

What's really interesting about this is that the pg_locks view (after the offending backend disconnects) reports nothing out of the ordinary, even though no backend can acquire locks after that point.

Merlin
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> In other words, after doing a
>     select user_write_lock_oid(t.oid) from big_table t;
> it's server restart time.

User locks are not released at transaction failure. Quitting that backend should have got you out of it, however.

> What's really interesting about this is that the pg_locks view (after
> the offending disconnects) reports nothing out of the ordinary even
> though no backends can acquire locks after that point.

User locks are not shown in pg_locks, either.

There is a secondary issue here, which is that we don't have provision to recycle hash table entries back into the general shared memory pool (mainly because there *is* no "shared memory pool", only never-yet-allocated space). So when you do release these locks, the freed space only goes back to the lock hash table's freelist. That means there won't be any space for expansion of the buffer hash table, nor any other shared data structures. This could lead to problems if you hadn't been running the server long enough to expand the buffer table to full size.

I don't think it's practical to introduce a real shared memory allocator, but maybe we could alleviate the worst risks by forcing the buffer hash table up to full size immediately at startup. I'll look at this.

			regards, tom lane
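[Editor's note: the no-shared-pool behavior described above can be illustrated with a toy model. This is a hypothetical Python sketch, not PostgreSQL code: the arena hands out space from a fixed region and has no way to take it back, so entries freed by one hash table return only to that table's private freelist and can never fund another table's growth.]

```python
class SharedArena:
    """Toy model of the shared memory region: a fixed pool with
    no deallocation -- only never-yet-allocated space."""
    def __init__(self, total):
        self.total = total
        self.used = 0

    def alloc(self, n):
        if self.used + n > self.total:
            return False          # out of shared memory
        self.used += n
        return True

class HashTableSim:
    """Toy shmem hash table: freed entries go onto a private
    freelist, never back to the arena."""
    def __init__(self, arena, entry_size):
        self.arena = arena
        self.entry_size = entry_size
        self.freelist = 0

    def acquire(self):
        if self.freelist:
            self.freelist -= 1    # reuse a previously freed entry
            return True
        return self.arena.alloc(self.entry_size)

    def release(self):
        self.freelist += 1        # space stays with this table

arena = SharedArena(total=1000)
locks = HashTableSim(arena, entry_size=10)
buffers = HashTableSim(arena, entry_size=10)

while locks.acquire():            # lock table eats the whole arena...
    pass
for _ in range(100):              # ...then releases everything
    locks.release()

print(locks.acquire())    # True: lock table reuses its own freelist
print(buffers.acquire())  # False: buffer table still cannot expand
```

This is why Tom's fix preallocates the buffer and FSM tables at startup: once those are at full size, the lock table monopolizing the leftover space no longer starves them.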
> "Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> > In other words, after doing a select user_write_lock_oid(t.oid) from
> > big_table t;
> > It's server restart time.
>
> User locks are not released at transaction failure. Quitting that
> backend should have got you out of it, however.

Right, my point being: it doesn't.

> > What's really interesting about this is that the pg_locks view (after
> > the offending disconnects) reports nothing out of the ordinary even
> > though no backends can acquire locks after that point.
>
> User locks are not shown in pg_locks, either.

Well, actually, they are. The lock tag values are not shown, but they do show up as mostly blank entries in the view.

> There is a secondary issue here, which is that we don't have provision
> to recycle hash table entries back into the general shared memory pool
> (mainly because there *is* no "shared memory pool", only never-yet-
> allocated space). So when you do release these locks, the freed space
> only goes back to the lock hash table's freelist. That means there
> won't be any space for expansion of the buffer hash table, nor any other
> shared data structures. This could lead to problems if you hadn't been
> running the server long enough to expand the buffer table to full size.

OK, this perhaps explains it. You are saying, then, that I am running the server out of shared memory, not necessarily out of space in the lock table. I had jumped to the conclusion that the memory associated with the locks was not getting freed.

> I don't think it's practical to introduce a real shared memory
> allocator, but maybe we could alleviate the worst risks by forcing the
> buffer hash table up to full size immediately at startup. I'll look at
> this.

This still doesn't fix the problem (albeit a low-priority one, since user locks are currently just a contrib module) of user locks eating up all the space in the lock table. There are a couple of different ways to look at fixing this.
My first thought is to bump the error level of an out-of-lock-table-space condition up to FATAL.

Merlin
tgl wrote:
> There is a secondary issue here, which is that we don't have provision
> to recycle hash table entries back into the general shared memory pool
> (mainly because there *is* no "shared memory pool", only never-yet-
> allocated space). So when you do release these locks, the freed space
> only goes back to the lock hash table's freelist. That means there
> won't be any space for expansion of the buffer hash table, nor any other
> shared data structures. This could lead to problems if you hadn't been
> running the server long enough to expand the buffer table to full size.

OK, I confirmed that I'm running the server out of shared memory, not necessarily out of space in the lock table. My server settings were:

    max_connections = 100
    shared_buffers  = 8192
    max_locks       = 64 (stock)

According to postgresql.conf, with these settings the lock table eats 64 * 260 * 100 bytes, i.e. less than 2 MB. So if it's running my server out of shared memory, it's eating much, much more shmem than previously thought.

Also, I was able to acquire around 10k locks before the server borked. This is obviously a lot more than 64 * 100. However, when I set max_locks down to 10, this did affect how many locks could be acquired (and in this case, a server restart was not required). Doubling shared buffers to 16k bumped my limit to over 20k locks, but less than 25k. As I see it, this means the user locks (and perhaps all locks...?) eat around ~6 KB of memory each. This is not really a big deal; 10k locks is way more than even a lock-heavy application would be expected to use. I'll look into this a bit more...

Merlin
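[Editor's note: for reference, the back-of-envelope numbers in the message above work out as follows. This is just the arithmetic from the thread: 260 bytes/lock is the (stale, as Tom notes in his reply) postgresql.conf figure, and the apparent ~6 KB/lock comes from dividing the 64 MB of shared buffer space by the ~10k locks observed -- a division that, as the follow-ups show, conflates buffer memory with lock-table memory.]

```python
# Nominal lock table size per the postgresql.conf comment
max_locks = 64
max_connections = 100
bytes_per_lock = 260                 # stale figure; Tom measures ~184 later

nominal = max_locks * max_connections * bytes_per_lock
print(nominal)                       # 1664000 bytes, i.e. just under 2 MB

# Merlin's apparent per-lock cost: 8192 buffers * 8 KB each = 64 MB,
# divided by the ~10k locks acquired before the server borked
shared_buffers_bytes = 8192 * 8192
observed_locks = 10000
print(shared_buffers_bytes // observed_locks)   # 6710 bytes, the "~6 KB" figure
```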
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> According to postgresql.conf, using these settings the lock table eats
> 64*260*100 bytes = < 2M. Well, if it's running my server out of shared
> memory, it's eating much, much more shmem than previously thought.

Hmm, the 260 is out of date, I think. I was seeing about 184 bytes/lock in my tests just now.

> Also, I was able to acquire around 10k locks before the server borked.
> This is obviously a lot more than 64*100.

Sure, because there's about 100K of deliberate slop in the shared memory size allocation, and you are probably also testing a scenario where the buffer and FSM hash tables haven't ramped up to full size yet, so the lock table is able to eat more than the nominal amount of space.

> As I see it, this means the user-locks (and perhaps all
> locks...?) eat around ~ 6k bytes memory each.

They're allocated in groups of 32, which would work out to close to 6k; maybe you were measuring the incremental cost of allocating the first one?

I did some digging, and as far as I can see the only shared memory allocations that occur after postmaster startup are for the four shmem hash tables: buffers, FSM relations, locks, and proclocks. Of these, the buffer and FSM hash tables have predetermined maximum sizes. So arranging for the space in those tables to be fully preallocated should prevent any continuing problems from lock table overflow. I've committed a fix that does this. I verified that after running the thing out of shared memory by creating a lot of user locks and then releasing same, I could run the regression tests.

			regards, tom lane
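[Editor's note: Tom's groups-of-32 explanation matches Merlin's ~6 KB observation numerically. A quick check, using Tom's ~184 bytes/lock measurement (both numbers are approximate):]

```python
bytes_per_lock = 184   # Tom's measured per-entry cost (approximate)
group_size = 32        # lock entries are allocated a group at a time

# Cost of allocating the first lock of a fresh group: the whole group
# is carved out of shared memory at once.
print(bytes_per_lock * group_size)   # 5888 bytes -- close to the ~6k observed
```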
Tgl wrote:
> > As I see it, this means the user-locks (and perhaps all
> > locks...?) eat around ~ 6k bytes memory each.
>
> They're allocated in groups of 32, which would work out to close to 6k;
> maybe you were measuring the incremental cost of allocating the first one?

I got my 6k figure by dividing 10000 into 64M, 10000 being the value that crashed the server. That seemed reasonable because doubling shared buffers slightly more than doubled the crash value.

I was wondering how ~10k locks ran me out of shared memory when each lock takes ~260 bytes (half that, as you say) and I am running 8k buffers = 64M. 260 bytes * 100 backends * 64 max_locks = 1.7 MB. Sure, the hash table and other stuff add some... but this is nowhere near what it should take to run me out. Am I just totally misunderstanding how to estimate lock memory consumption?

Merlin
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> I was wondering how ~ 10k locks ran me out of shared memory when each
> lock takes ~ 260b (half that, as you say) and I am running 8k buffers =
> 64M.

The number of buffers you have doesn't have anything to do with this. The question is how much shared memory space there is for the lock table, above and beyond what's used for everything else (such as buffers).

I just went through and corrected some minor errors in the calculation of shared memory block size (mostly stuff where the estimation code had gotten out of sync with the actual work over time). I now find that with all-default configuration parameters I can create 7808 locks before running out of shared memory, rather than the promised 6400. (YMMV due to platform-specific differences in MAXALIGN, sizeof(pointer), etc.) This is coming from two places: LockShmemSize deliberately adds on a 10% slop factor to its calculation of the lock table size, and then CreateSharedMemoryAndSemaphores adds on 100KB for safety margin. Both of those numbers are kinda pulled from the air, but I don't see a strong reason to change them. The other space calculations seem to be pretty nearly dead-on.

			regards, tom lane
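[Editor's note: the 7808-versus-6400 gap is roughly accounted for by the two fudge factors Tom names. A sketch of the arithmetic, using the ~184 bytes/lock figure from earlier in the thread; the real numbers shift with MAXALIGN, pointer size, and per-group allocation:]

```python
promised = 64 * 100        # max_locks_per_transaction * max_connections
bytes_per_lock = 184       # approximate per-entry cost from the thread

slop_entries = promised // 10                    # LockShmemSize's 10% slop
safety_entries = (100 * 1024) // bytes_per_lock  # the 100KB safety margin
estimate = promised + slop_entries + safety_entries
print(estimate)            # 7596 -- in the ballpark of the observed 7808
```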
> Tom Lane wrote:
> "Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> > According to postgresql.conf, using these settings the lock table eats
> > 64*260*100 bytes = < 2M. Well, if it's running my server out of shared
> > memory, it's eating much, much more shmem than previously thought.
>
> Hmm, the 260 is out of date I think. I was seeing about 184 bytes/lock
> in my tests just now.
>
> > Also, I was able to acquire around 10k locks before the server borked.
> > This is obviously a lot more than 64*100.
>
> Sure, because there's about 100K of deliberate slop in the shared memory
> size allocation, and you are probably also testing a scenario where the
> buffer and FSM hash tables haven't ramped to full size yet, so the lock
> table is able to eat more than the nominal amount of space.
>
> > As I see it, this means the user-locks (and perhaps all
> > locks...?) eat around ~ 6k bytes memory each.
>
> They're allocated in groups of 32, which would work out to close to 6k;
> maybe you were measuring the incremental cost of allocating the first one?
>
> I did some digging, and as far as I can see the only shared memory
> allocations that occur after postmaster startup are for the four shmem
> hash tables: buffers, FSM relations, locks, and proclocks. Of these,
> the buffer and FSM hash tables have predetermined maximum sizes. So
> arranging for the space in those tables to be fully preallocated should
> prevent any continuing problems from lock table overflow. I've
> committed a fix that does this. I verified that after running the thing
> out of shared memory via creating a lot of user locks and then releasing
> same, I could run the regression tests.

A few questions:

Is that fix in 8.0?

Does this mean that the parameter max_locks_per_transaction isn't honoured at all, and is just used to size the lock table - which can itself expand beyond that max limit in various circumstances?
(Though with the bug fix, not THAT much more than the max limit.)

Should we rename and redocument the parameter? If that is so, the current name is so far away from its real meaning as to constitute a bug in itself...

Best Regards, Simon Riggs
"Simon Riggs" <simon@2ndquadrant.com> writes:
> Does this mean that the parameter max_locks_per_transaction isn't honoured
> at all, it is just used to size the lock table

Yes, and that's how it's documented.

			regards, tom lane
> Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > Does this mean that the parameter max_locks_per_transaction isn't
> > honoured at all, it is just used to size the lock table
>
> Yes, and that's how it's documented.

The name max_locks_per_transaction indicates a limit of some kind, and the documentation doesn't mention anything about whether that limit is enforced or not. I suggest the additional wording:

"This parameter is not a hard limit: no limit is enforced on the number of locks taken in each transaction. System-wide, the total number of locks is limited by the size of the lock table."

The recent patch stops the system from crashing with an out-of-memory condition, though it probably slightly hastens the point at which no locks are available. It would be good to clarify what behaviour the system exhibits when we run out of locks. I'm not sure myself what that behaviour is: my understanding is that we do not perform lock escalation (as DB2 does), so presumably we just grind to a halt? I take it there is no automated way of getting out of this situation, i.e. the deadlock detector doesn't start killing transactions that hold lots of locks to free up space? So we would basically just build up lots of people waiting on locks - though without any mechanism for diagnosing that this is happening? What does happen, and where does it end (now)?

Best Regards, Simon Riggs
> The name max_locks_per_transaction indicates a limit of some kind. The
> documentation doesn't mention anything about whether that limit is
> enforced or not.
>
> I suggest the additional wording:
> "This parameter is not a hard limit: No limit is enforced on the number of
> locks in each transaction. System-wide, the total number of locks is
> limited by the size of the lock table."

I think it's worse than that. For one thing, user locks persist outside of transactions, yet they still count against this limit. A more appropriate name for the GUC variable would be 'estimated_lock_table_size_per_backend', or something like that. I've been putting some thought into reworking the userlock contrib module into something acceptable for the main project, a substantial part of that being documentation changes.

Merlin
> Merlin Moncure wrote:
> > The name max_locks_per_transaction indicates a limit of some kind. The
> > documentation doesn't mention anything about whether that limit is
> > enforced or not.
> >
> > I suggest the additional wording:
> > "This parameter is not a hard limit: No limit is enforced on the number
> > of locks in each transaction. System-wide, the total number of locks is
> > limited by the size of the lock table."
>
> I think it's worse than that. First of all, user locks persist outside
> of transactions, but they apply to this limit.

I was really thinking of the standard locking case. Yes, user locks make it worse.

> A more appropriate name
> for the GUC variable would be 'estimated_lock_table_size_per_backend',
> or something like that. I've been putting some thought into reworking
> the userlock contrib module into something acceptable into the main
> project, a substantial part of that being documentation changes.

I agree a renamed parameter would be more appropriate, though I suspect a more accurate name would be about five yards long. A documentation change would be worthwhile here... but I'll wait for your changes before doing anything there.

Best Regards, Simon Riggs