Hi Lukasz, thanks for following up.

On 2021-May-04, Lukasz Biegaj wrote:

> The problem is as described in
> https://www.postgresql.org/message-id/flat/8bf8785c-f47d-245c-b6af-80dc1eed40db%40unitygroup.com
>
> It does occur on two separate production clusters and one test cluster - all
> belonging to the same customer, although processing slightly different data
> (it's an e-commerce store with multiple languages and separate production
> databases for each language).

I think the best next move would be to confirm that the problem is
what we think it is, so that we can discuss whether Amit's commit is an
appropriate fix.  I would suggest doing that by running the problematic
workload on the test system under "perf record -g" and then getting a
report with "perf report -g", which should hopefully give enough of a
clue.  (Sometimes the reports are much better if you use a binary that
was compiled with -fno-omit-frame-pointer, so if you're in a position to
try that, it might be useful -- or apparently you could try "perf record
--call-graph dwarf" or "perf record --call-graph lbr", depending on what
your hardware and perf build support.)
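
As a rough sketch of what those invocations might look like when
attached to a single backend: the script below only prints the command
lines, it does not run perf.  The PID 12345, the 60-second sampling
window and the helper name perf_record_cmd are placeholders of mine, not
anything from your setup, and the configure line in the comment assumes
you build PostgreSQL from source.

```shell
#!/bin/sh
# Print (rather than run) perf invocations for profiling one backend.
# Placeholders: PID 12345, the 60-second window, the helper name.
#
# A source build with frame pointers might be configured as:
#   ./configure CFLAGS='-O2 -fno-omit-frame-pointer'

perf_record_cmd() {
    # $1 = backend PID, $2 = call-graph mode (fp, dwarf or lbr), $3 = seconds
    printf 'perf record --call-graph %s -p %s -- sleep %s\n' "$2" "$1" "$3"
}

# Frame-pointer unwinding; plain "-g" selects this mode by default.
perf_record_cmd 12345 fp 60
# Alternatives when the binary was built without frame pointers.
perf_record_cmd 12345 dwarf 60
perf_record_cmd 12345 lbr 60
# Afterwards, read the call graph from the resulting perf.data.
echo 'perf report -g'
```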

Also, I would be much more comfortable about proposing to backpatch such
an invasive change if you could confirm that in pg10 the same workload
does not cause the problem.  If pg10 is indeed free of it, then it'd be
clear we're talking about a regression.

--
Álvaro Herrera Valdivia, Chile
"I'm always right, but sometimes I'm more right than other times."
(Linus Torvalds)