On 12-07-2021 20:56, Vijaykumar Jain wrote:
The backtraces you captured look like the query is not "hung", it's
just computing away.
He mentioned earlier that the query was hung as 'active' for 8 hours and on.
incase this is due to bad plan,
`vacuumdb -a -v` from the terminal, each time before you run your test suite for some runs, do you still get the same issue?
I have a feeling repeated runs may have caused a lot of bloat on some tables which might have not been reclaimed by autovacuum runs.
I configured Jenkins to run that command prior to executing the tests. I got 5 successful runs, no hanging queries. Then I reverted and ran again. The first and second run were ok; the third run hung again. So your hunch might be right.
On 12-07-2021 19:46, Tom Lane wrote:
You might need
to investigate by altering your application to capture "EXPLAIN ..."
output just before the troublesome query, so you can see if it gets
a different plan in the slow cases.
Then I tried this. The query plans are indeed not consistent.
Most of the time the first line of the query plan is: Unique (cost=4892.35..4892.35 rows=1 width=64) [1]
I have seen other costs: 5818.30, 6350.85 and 6514.73. They all complete correctly. [2], [3], [4]
The plan that leaves the query hanging in the 'active' state starts with: Unique (cost=241.81..241.82 rows=1 width=64) [5]
That's clearly much lower than the rest. So I suspect the planner making a 'wrong' guess there, causing a bad plan, and a long time to execute. For reference, the executed query is [6].
Now, where to go from here? Is this considered a bug in PostgreSQL, or am I misusing the database engine by doing DROP DATABASE and CREATE DATABASE over and over again? I must say that I never saw this behavior on PostgreSQL 9.6, so in that regard it might be considered a bug.....?
What can I do to get to the bottom of this? Should I export the content of some metadata tables prior to executing the hanging query? Should I `vacuumdb -a -v` prior to logging the EXPLAIN for the hanging query?
With kind regards,
Jurrie
[1] https://jurr.org/PostgreSQL_13_hanging_query/normal.txt
[2] https://jurr.org/PostgreSQL_13_hanging_query/alt1.txt
[3] https://jurr.org/PostgreSQL_13_hanging_query/alt3.txt
[4] https://jurr.org/PostgreSQL_13_hanging_query/alt2.txt
[5] https://jurr.org/PostgreSQL_13_hanging_query/hang.txt
[6] https://jurr.org/PostgreSQL_13_hanging_query/query.txt