Re: weird issue with occasional stuck queries - Mailing list pgsql-general

From overland
Subject Re: weird issue with occasional stuck queries
Date
Msg-id 11a5dbb5c16cbf66e7db7a861ce139c3d33978b1.camel@recarea.com
Whole thread Raw
In response to weird issue with occasional stuck queries  (spiral <spiral@spiral.sh>)
List pgsql-general
I would look at optimizing the query to increase performance. SELECT *
has room for improvement. 

Also check out the book Mastering PostgreSQL 11 (or whatever version).
There are some good tips in there like using EXPLAIN to analyze the
query plan. 

Looking at/setting max_parallel_workers_per_gather might be of
interest. You can turn off parallelism as well as increase the number
of workers. I have no idea if that will help you but I found it useful.

Also walk through the server settings to optimize the performance like
work_mem, shared_buffers, etc. If you haven't gone through all that yet
understanding each one and setting it correctly is important to get the
best out of your machine. It takes time but with many options comes
great power.



-----Original Message-----
From: spiral <spiral@spiral.sh>
To: pgsql-general@postgresql.org
Subject: weird issue with occasional stuck queries
Date: Fri, 1 Apr 2022 03:06:46 -0400

Hey,

I'm having a weird issue where a few times a day, any query that hits a
specific index (specifically a `unique` column index) gets stuck for
anywhere between 1 and 15 minutes on a LWLock (mostly
MultiXactOffsetSLRU - not sure what that is, I couldn't find anything
about it except for a pgsql-hackers list thread that I didn't really
understand).
Checking netdata history, these stuck queries coincide with massive
disk read; we average ~2MiB/s disk read and it got to 40MiB/s earlier
today.

These queries used to get stuck for ~15 minutes at worst, but I turned
down the query timeout. I assume the numbers above would be worse if I
let the queries run for as long as they need, but I don't have any logs
from before that change and I don't really want to try that again as it
would impact production.

I asked on the IRC a few days ago and got the suggestion to increase
shared_buffers, but that doesn't seem to have helped at all. I also
tried deleting and recreating the index, but that seems to have changed
nothing as well.

Any suggestions are appreciated since I'm really not sure how to debug
this further. I'm also attaching a couple screenshots that might be
useful.

spiral





pgsql-general by date:

Previous
From: Adam Scott
Date:
Subject: Re: weird issue with occasional stuck queries
Next
From: Dave Ekhaus
Date:
Subject: PostgreSQL JSON