Postgres is not able to handle more than 4k tables!? - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Postgres is not able to handle more than 4k tables!?
Msg-id da3205c4-5b07-a65c-6c26-a293c6464fdb@postgrespro.ru
List pgsql-hackers
I want to describe one bad situation we have encountered with one of our
customers.
There are ~5000 tables in their database, and what is worse, most of
them are actively used.
Several flaws in Postgres then brought their system almost to a standstill.

Autovacuum periodically processes all these 5k relations (because they
are actively updated).
And since most of these tables are small enough, autovacuum finishes
processing them at almost the same time.
As a result, autovacuum workers produce ~5k invalidation messages in a
short period of time.

There are several thousand clients, most of which are executing complex
queries.
So they are not able to keep up with all these invalidation messages, and
their invalidation message buffer overflows.
The size of this buffer is hardcoded (MAXNUMMESSAGES = 4096) and can not
be changed without recompiling Postgres.
This is problem N1.
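
For reference, this buffer is the shared invalidation ring in
src/backend/storage/ipc/sinvaladt.c; quoting the definition from memory
(check your source tree):

#define MAXNUMMESSAGES 4096

A backend that falls a whole buffer's worth of messages behind can no
longer catch up message by message and has to be reset.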

As a result, resetState is set to true, forcing backends to invalidate
their caches.
So most backends lose their cached metadata and have to access the
system catalog, trying to reload it.
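
The reset path, roughly (a simplified sketch of the consumer loop in
src/backend/storage/ipc/sinval.c, not the verbatim code):

/* ReceiveSharedInvalidMessages(), simplified */
getResult = SIGetDataEntries(messages, MAXINVALMSGS);
if (getResult < 0)
{
    /* We were reset: all cached catalog data is suspect, so relcache
     * and catcache entries are thrown away wholesale. */
    resetFunction();        /* InvalidateSystemCaches() */
}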
But then we come to the next showstopper: NUM_LOCK_PARTITIONS.
It is also hardcoded and can't be changed without recompilation:

#define LOG2_NUM_LOCK_PARTITIONS  4
#define NUM_LOCK_PARTITIONS  (1 << LOG2_NUM_LOCK_PARTITIONS)

Having just 16 LWLocks greatly increases the probability of conflicts
(taking into account that there are 5k tables and about 25k relations in
total).
It causes huge LWLock acquisition times in heap_open, and the planning
stage of some queries grows from milliseconds to several minutes!
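
To see why, recall how a lock tag is mapped to a partition; the macros in
src/include/storage/lock.h look roughly like this (from memory):

#define LockHashPartition(hashcode) \
    ((hashcode) % NUM_LOCK_PARTITIONS)
#define LockHashPartitionLock(hashcode) \
    (&MainLWLockArray[LOCK_MANAGER_LWLOCK_OFFSET + \
        LockHashPartition(hashcode)].lock)

So all lock manager traffic for ~25k relations is funneled through just
16 LWLocks.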

This is problem number 2. But there is one more flaw we have run into.
We increased LOG2_NUM_LOCK_PARTITIONS to 8, and ... clients started to
report a "too many LWLocks taken" error.
There is yet another hardcoded constant, MAX_SIMUL_LWLOCKS = 200, whose
relationship with NUM_LOCK_PARTITIONS is not mentioned anywhere.
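
That constant lives in src/backend/storage/lmgr/lwlock.c, where it sizes
the per-backend array of currently held LWLocks (again quoting from
memory):

#define MAX_SIMUL_LWLOCKS  200

static int  num_held_lwlocks = 0;
static LWLockHandle held_lwlocks[MAX_SIMUL_LWLOCKS];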

There are several places in Postgres where a backend tries to hold all
partition locks at once (for example, in the deadlock detector).
So if NUM_LOCK_PARTITIONS > MAX_SIMUL_LWLOCKS, we inevitably get this
error: with LOG2_NUM_LOCK_PARTITIONS = 8 there are 256 partitions, which
is more than the 200 locks a backend is allowed to hold.
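
A sketch of why this combination fails, based on lock.c and lwlock.c
(simplified, not verbatim):

/* The deadlock detector and GetLockStatusData() grab every
 * partition lock at once: */
for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
    LWLockAcquire(LockHashPartitionLockByIndex(i), LW_SHARED);

/* ...while LWLockAcquire() refuses to track more than
 * MAX_SIMUL_LWLOCKS held locks per backend: */
if (num_held_lwlocks >= MAX_SIMUL_LWLOCKS)
    elog(ERROR, "too many LWLocks taken");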

So it looks like the NUM_LOCK_PARTITIONS and MAXNUMMESSAGES constants
have to be replaced with GUCs.
To avoid division, we can specify the log2 of these values, so a shift
(or mask) can be used instead.
And MAX_SIMUL_LWLOCKS should be defined as NUM_LOCK_PARTITIONS +
NUM_INDIVIDUAL_LWLOCKS + NAMED_LWLOCK_RESERVE.
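
A minimal sketch of what this could look like (hypothetical names; the
GUC would have to be PGC_POSTMASTER since it determines shared memory
layout, and held_lwlocks would have to be allocated at backend start
instead of being a static array):

/* hypothetical GUC replacing LOG2_NUM_LOCK_PARTITIONS */
int     log2_num_lock_partitions = 4;

#define NUM_LOCK_PARTITIONS  (1 << log2_num_lock_partitions)
/* power of two, so a mask replaces the modulo */
#define LockHashPartition(hashcode) \
    ((hashcode) & (NUM_LOCK_PARTITIONS - 1))

/* derived rather than hardcoded, so it can never be smaller
 * than the number of partition locks */
#define MAX_SIMUL_LWLOCKS \
    (NUM_LOCK_PARTITIONS + NUM_INDIVIDUAL_LWLOCKS + NAMED_LWLOCK_RESERVE)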






