Re: Max connections reached without max connections reached - Mailing list pgsql-general

From Rob Sargent
Subject Re: Max connections reached without max connections reached
Date
Msg-id 0fba028d-68da-349c-0996-e83f707824b7@gmail.com
Whole thread Raw
In response to Re: Max connections reached without max connections reached  (James Sewell <james.sewell@jirotech.com>)
Responses Re: Max connections reached without max connections reached  (Michael Lewis <mlewis@entrata.com>)
List pgsql-general
On 12/1/21 3:29 PM, James Sewell wrote:

Looking again into the back trace[1], it appeared that the backend is
getting stuck while getting the subtransaction's parent information,
and that can only happen if the snapshot is getting marked as
overflowed.  So it seems that some of the scripts are using a lot of
sub-transaction (> 64) within a transaction and in such case, we can
not cache the sub-transaction information and the snapshot will be
marked overflowed.  After that, for checking the visibility of any
tuple we need to consult the SLRU to get TopMost transaction
information and if the subtransaction SLRU cache is full then it will
have to do the I/O.  So in this case it appeared that a lot of
parallel scans/new connections are trying to read the pg_subtrans and
due to limited cache size that is not even able to load the data in
the cache and getting frequently blocked.  So I assume that when you
say the startup is getting stuck forever, how long did you wait, maybe
it is moving but getting stuck again for the next tuple, so to confirm
that attach the process with gdb put breakpoint in XidInMVCCSnapshot()
and check whether it is hitting that breakpoint again or not.  Also,
check whether you are seeing "SubtransControlLock" in
pg_stat_activity.  There is one nice blog[2] explaining this issue so
you can refer to that as well.

Ah. Snap. This makes a lot of sense. The developers have a nasty habit of using LOOP over rows which has never been stamped out. I bet they have applied this to something with a lot of rows.

Is there a way of seeing which statements / backends / processes have greater than 64 stxns? I'd like to monitor this.

- James


The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.
Also be on the look-out for begin trans; begin trans; etc  I read Kumar's report as nested transactions.  If your gang is doing a transaction per row, they need a kick in the rear.  Anyone not closing those needs a pink slip. 

pgsql-general by date:

Previous
From: James Sewell
Date:
Subject: Re: Max connections reached without max connections reached
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: ssl_crl_file Certificate Revocation List doesn't work for postgresql 11