Re: weird issue with occasional stuck queries - Mailing list pgsql-general

From	Adam Scott
Subject	Re: weird issue with occasional stuck queries
Date	April 2, 2022 19:09:15
Msg-id	CA+s62-M3bCtos=ocwNoyo0s8rEx0Q_Pw+BiURNXf55hdCaq=PA@mail.gmail.com
In response to	Re: weird issue with occasional stuck queries (spiral <spiral@spiral.sh>)
List	pgsql-general

The logs were helpful. You may want to look at the log lines around the errors; they often carry more detail, such as the SQL statement associated with each error.

Deadlocks are an indicator that the client code needs to be examined for improvement. See https://www.cybertec-postgresql.com/en/postgresql-understanding-deadlocks/ about deadlocks. They slow things down and can cause SQL statements to queue up behind the blocked sessions, eventually bogging down the system.
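One common client-side fix is to have every transaction touch rows in the same order, so that two sessions cannot each hold a lock the other is waiting for. Here is a minimal Python sketch of that idea. The `accounts` table, its columns, and the `ordered_updates` helper are hypothetical and do not come from this thread:

```python
from typing import Dict, List, Tuple


def ordered_updates(changes: Dict[int, int]) -> List[Tuple[str, Tuple[int, int]]]:
    """Build UPDATE statements sorted by primary key.

    `changes` maps a row id to the amount to add. Sorting by id means
    every transaction acquires its row locks in the same order.
    The `accounts` table and its columns are hypothetical.
    """
    sql = "UPDATE accounts SET balance = balance + %s WHERE id = %s"
    return [(sql, (delta, row_id)) for row_id, delta in sorted(changes.items())]


# Two transfers that touch the same rows in opposite directions
# still produce statements in the same id order.
transfer_a = ordered_updates({7: -100, 3: 100})
transfer_b = ordered_updates({3: -50, 7: 50})
```

Each `(sql, params)` pair would then be executed inside a single transaction with whatever driver the client uses. This only helps if every code path that updates those rows follows the same ordering.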

It definitely looks like a locking issue, which is why you don't see high CPU. IIRC you might instead see high system CPU usage, as opposed to userspace CPU, if the kernel is getting overloaded. The `top` command will help to show that split.
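If you want to record the user/system split over time rather than watch `top`, the same counters are exposed on Linux in the first line of `/proc/stat`. Below is a rough Python sketch that compares two samples of that line. The sample values are made up for illustration and are not from your system:

```python
from typing import Dict

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> Dict[str, int]:
    """Parse the aggregate `cpu` line of /proc/stat into named tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, (int(p) for p in parts[1:1 + len(FIELDS)])))


def cpu_breakdown(before: str, after: str) -> Dict[str, float]:
    """Return the percentage of CPU time per category between two samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    delta = {k: b[k] - a[k] for k in a}
    total = sum(delta.values()) or 1  # avoid dividing by zero
    return {k: 100.0 * v / total for k, v in delta.items()}


# Made-up sample lines, not taken from the system discussed in this thread.
sample_before = "cpu 1000 0 500 8000 400 0 100 0 0 0"
sample_after = "cpu 1100 0 900 8300 500 0 100 0 0 0"
shares = cpu_breakdown(sample_before, sample_after)
```

In real use, the two lines would be read from `/proc/stat` a few seconds apart. A `system` share well above `user` during a stall would point at the kernel rather than at query execution.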

The disks could be saturated by the write-ahead log (WAL) handling of all the transactions. More about WAL here: https://www.postgresql.org/docs/10/wal-internals.html You could consider moving that directory somewhere else using a symbolic link (cf. the link).

Anyway, these are the things I would look at.

Adam

On Sat, Apr 2, 2022 at 5:23 AM spiral <spiral@spiral.sh> wrote:

Hey,

> That wait event according to documentation is "Waiting to access the
> multixact member SLRU cache." SLRU = segmented least recently used
> cache

I see, thanks!

> if you are low on memory, it can slow down the allocation of
> buffers. Do you have a query that is a "select for update" running
> somewhere? If your disk is low on space `df -h` that might explain
> the issue.

- There aren't any queries that are running for longer than the selects
shown earlier; definitely not "select for update" since I don't ever
use that in my code.
- Both disk and RAM utilization are relatively low.

> Is there an ERROR: multixact something in your postgres log?

There isn't, but while checking I saw some other concerning errors
including "deadlock detected", "could not map dynamic shared memory
segment" and "could not attach to dynamic shared area".
(full logs here: https://paste.sr.ht/blob/9ced99b119c3fce1ecfd71e8554946e7845a44dd )

> Another thing to look at is `iostat -x -y` and look at disk util %.
> This is an indicator, but not definitive, of how much disk access is
> going on. It may be your drives are just saturated although your
> IOWait looks ok in your attachment.

I didn't specifically look at that, but I did notice *very* high disk
utilization in at least one instance of the stuck queries, as I
mentioned previously. Why would the disks be getting saturated? The
query count isn't noticeably higher than average, and the database
is not autovacuuming, so not sure what could cause that.

spiral
