BUG #17710: Slow queries (100% CPU) after auto-vacuum - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #17710: Slow queries (100% CPU) after auto-vacuum |
Date | |
Msg-id | 17710-22ff6a35aac256ff@postgresql.org Whole thread Raw |
Responses |
Re: BUG #17710: Slow queries (100% CPU) after auto-vacuum
|
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 17710 Logged by: Jean-Francois Levesque Email address: jf.levesque@gmail.com PostgreSQL version: 11.18 Operating system: Docker postgres:11.18-bullseye on macOS 13.0.1 Description: Hi! I am having an issue that started with 11.0 and is still present with 11.18. The issue is not present with 10.23. The project is using Django with PostgreSQL as the backend DB. I have one pytest test that is doing a lot of things (~4500 SQL queries) that usually takes 17 seconds to run when running normally. The same test is running for years without problems. When I run this test in a loop (with a 10s sleep between each execution), after 10-20 runs, the test starts to take ~300 seconds. All subsequent executions are that long. At this point, I have one postgresql process that is taking 100% CPU utilization during the tests. When I have the issue, the SQL query [1] (see below) is taking a lot of time (between 25 and 50 seconds). During my troubleshooting, I realized that the slow queries start to happen after an auto-vacuum on the tables (see logs [2]). If I disable the auto vacuums (autovacuum=off), I am not able to reproduce the problem. The pytest execution is done inside a transaction that is rolled back after the test is completed. Can it be related? I am testing mostly on macOS 13.0.1 (Macbook Pro, M1 Pro) with Docker 4.12.0. I have been able to reproduce the same issue on Bitbucket Pipeline again using Docker containers. The official postgreSQL containers are being used without configuration changes. Is this a known issue? Is there something I need to tune? How can I troubleshoot this issue? Thank you for your help. Jean-Francois [1]: UPDATE "shared_models_session" SET "plan_disabled" = true WHERE "shared_models_session"."id" IN (SELECT V0."id" FROM "shared_models_session" V0 INNER JOIN "shared_models_sessionmonitoring" V1 ON (V0."id" = V1."session_id") WHERE V1."monitoring_id" IN (SELECT U0."id" FROM "shared_models_monitoring" U0 WHERE U0."owner_id" = 441) FOR UPDATE OF V0) [2] Auto vacuum logs: obkio_db | 2022-12-09 15:02:43.323 UTC [842] LOG: 00000: automatic vacuum of table "test.public.shared_models_sessionmonitoring": index scans: 1 obkio_db | pages: 0 removed, 15 remain, 0 skipped due to pins, 0 skipped frozen obkio_db | tuples: 750 removed, 0 remain, 0 are dead but not yet removable, oldest xmin: 1794 obkio_db | buffer usage: 337 hits, 1 misses, 3 dirtied obkio_db | avg read rate: 0.002 MB/s, avg write rate: 0.005 MB/s obkio_db | system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 5.17 s obkio_db | 2022-12-09 15:02:43.323 UTC [842] LOCATION: lazy_vacuum_rel, vacuumlazy.c:407 obkio_db | 2022-12-09 15:02:43.735 UTC [842] LOG: 00000: automatic vacuum of table "test.public.shared_models_session": index scans: 1 obkio_db | pages: 0 removed, 442 remain, 0 skipped due to pins, 0 skipped frozen obkio_db | tuples: 479 removed, 0 remain, 0 are dead but not yet removable, oldest xmin: 1794 obkio_db | buffer usage: 3357 hits, 1 misses, 2 dirtied obkio_db | avg read rate: 0.022 MB/s, avg write rate: 0.044 MB/s obkio_db | system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.35 s obkio_db | 2022-12-09 15:02:43.735 UTC [842] LOCATION: lazy_vacuum_rel, vacuumlazy.c:407
pgsql-bugs by date: