Thread: BUG #17330: EXPLAIN hangs and very long query plans

BUG #17330: EXPLAIN hangs and very long query plans

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17330
Logged by:          Strahinja Kustudic
Email address:      strahinjak@nordeus.com
PostgreSQL version: 10.19
Operating system:   CentOS 7
Description:

We had an issue with one of our production databases running Postgres 10.19
on CentOS 7. One of our most frequently executed queries started taking
3000ms+ to plan, while its execution time stayed at 1-3ms. When everything
is working normally, planning takes around 1ms or less. On the replica,
EXPLAIN (without ANALYZE, just plain EXPLAIN!) wouldn't even finish - it
would just hang forever. To be precise, we were running 10.10 at the time,
but upgrading to 10.19 didn't help. We tried running ANALYZE on the whole
database, but that didn't help either. In the end, what fixed it was running
pg_repack on the whole DB. This was strange to me because I thought the
query planner uses table statistics and the index schema to choose a plan;
it shouldn't need the actual table/index data. But I don't know PG
internals, so I might be wrong.

Because we have backups created by WAL-G, I restored the base backup plus
WALs from that time and managed to recreate the issue in Docker on my own
laptop. In Docker, EXPLAIN runs forever and never finishes, just like on the
replica. This means I have a 100% repro of the issue and I can do whatever
you tell me to do to find out what caused it. I already tried enabling
`log_error_verbosity = verbose`, but the server doesn't print anything while
EXPLAIN is running and it never finishes. I haven't tried leaving it to run
for hours (but I might).

Is this a known bug in PG 10.X? Or would you like me to share more details
about the table definitions, the exact query, or anything else you need?


Re: BUG #17330: EXPLAIN hangs and very long query plans

From
Jeff Janes
Date:
On Thu, Dec 9, 2021 at 10:28 AM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      17330
Logged by:          Strahinja Kustudic
Email address:      strahinjak@nordeus.com
PostgreSQL version: 10.19
Operating system:   CentOS 7
Description:       
 
In Docker, EXPLAIN runs forever and never finishes, just like on the
replica. This means I have a 100% repro of the issue and I can do whatever
you tell me to do to find out what caused it.

Is this 100% CPU bound?  'perf top -p <pid>' would probably be my first tool to use here.
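
For example, you could grab the PID of the backend that will run the slow EXPLAIN and point perf at it (the PID below is just a placeholder):

  -- in the psql session that will run the slow EXPLAIN:
  SELECT pg_backend_pid();
  -- then, from a shell on that host, attach perf to that backend:
  --   perf top -p 12345
  -- and run the EXPLAIN in the original session while perf is sampling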

Also, it would be nice to know if an upgrade to a newer major version fixed the problem.

Cheers,

Jeff

Re: BUG #17330: EXPLAIN hangs and very long query plans

From
Peter Geoghegan
Date:
On Thu, Dec 9, 2021 at 7:28 AM PG Bug reporting form
<noreply@postgresql.org> wrote:
> We had an issue with one of our production databases running Postgres 10.19
> on CentOS 7. One of our most frequently executed queries started taking
> 3000ms+ to plan, while its execution time stayed at 1-3ms. When everything
> is working normally, planning takes around 1ms or less. On the replica,
> EXPLAIN (without ANALYZE, just plain EXPLAIN!) wouldn't even finish - it
> would just hang forever. To be precise, we were running 10.10 at the time,
> but upgrading to 10.19 didn't help. We tried running ANALYZE on the whole
> database, but that didn't help either. In the end, what fixed it was running
> pg_repack on the whole DB. This was strange to me because I thought the
> query planner uses table statistics and the index schema to choose a plan;
> it shouldn't need the actual table/index data. But I don't know PG
> internals, so I might be wrong.

Sounds like a problem with get_actual_variable_range(), which can scan
indexes at plan time to determine minimum or maximum values.

This actually has been improved quite a bit since Postgres 10. So as
Jeff said, seems like you might benefit from upgrading to a newer
major version. v11 has improved things in this exact area.
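
If you want to see the effect in isolation, a rough sketch along these lines (made-up table, not your schema, and the timing relative to autovacuum matters) is to delete the tail of an indexed column and then ask the planner for an estimate near that tail:

  -- rough illustration only, made-up names:
  CREATE TABLE t (id bigint PRIMARY KEY, payload text);
  INSERT INTO t SELECT g, 'x' FROM generate_series(1, 10000000) g;
  ANALYZE t;
  -- delete the upper end of the key range, leaving dead entries at the
  -- right-hand end of the primary key index until vacuum removes them
  DELETE FROM t WHERE id > 9000000;
  -- a predicate in the last histogram bucket makes the planner probe the
  -- index for the actual maximum, wading through the dead entries:
  EXPLAIN SELECT * FROM t WHERE id > 9999999;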

-- 
Peter Geoghegan



Re: BUG #17330: EXPLAIN hangs and very long query plans

From
Strahinja Kustudić
Date:
On Fri, Dec 10, 2021 at 2:31 AM Jeff Janes <jeff.janes@gmail.com> wrote:
On Thu, Dec 9, 2021 at 10:28 AM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      17330
Logged by:          Strahinja Kustudic
Email address:      strahinjak@nordeus.com
PostgreSQL version: 10.19
Operating system:   CentOS 7
Description:       
 
In Docker, EXPLAIN runs forever and never finishes, just like on the
replica. This means I have a 100% repro of the issue and I can do whatever
you tell me to do to find out what caused it.

Is this 100% CPU bound?  'perf top -p <pid>' would probably be my first tool to use here.

I forgot to mention an important detail: when it happened on the production DB, it was CPU bound - about 70% user time and 30% system time. There was no iowait because most of the DB was in the page cache.

The attached screenshot shows the CPU usage on the production DB. Here is the timeline:
- 11:00 we started deleting a few million rows
- 11:34 the delete finished and the CPU instantly went to ~70% user and ~30% system time
- 12:55 we configured PgBouncer to allow only a few queries in parallel, after which system time went down, but user time was still at 100%

On my Docker instance, when I execute EXPLAIN it starts reading a lot of data. The indexes of the biggest table the query reads total 50GB, so my guess is that it is reading those indexes.

I also allowed EXPLAIN in Docker to finish and it took almost 500 seconds. Can index corruption cause this?
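
For reference, the 50GB figure comes from something like this (the real table name is replaced with a placeholder):

  -- index sizes of the biggest table
  SELECT indexrelname,
         pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
    FROM pg_stat_user_indexes
   WHERE relname = 'big_table';

  -- dead tuples and last (auto)vacuum times for the same table
  SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
   WHERE relname = 'big_table';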

Regards,
Strahinja
Attachment

Re: BUG #17330: EXPLAIN hangs and very long query plans

From
Strahinja Kustudić
Date:
On Fri, Dec 10, 2021 at 2:36 AM Peter Geoghegan <pg@bowt.ie> wrote:
Sounds like a problem with get_actual_variable_range(), which can scan
indexes at plan time to determine minimum or maximum values.

This actually has been improved quite a bit since Postgres 10. So as
Jeff said, seems like you might benefit from upgrading to a newer
major version. v11 has improved things in this exact area.

On my Docker instance, when I execute EXPLAIN it starts reading a lot of data. The indexes of the biggest table the query reads total 50GB, so my guess is that it is reading those indexes.

I allowed EXPLAIN in Docker to finish; it took almost 500 seconds and it was reading data the whole time. After I reindexed the biggest table, EXPLAIN finished instantly. Can index corruption cause this?

Note that this started happening in production after we deleted a few million rows from the biggest table.

Regards,
Strahinja

Re: BUG #17330: EXPLAIN hangs and very long query plans

From
Pavel Stehule
Date:
Hi

On Fri, Dec 10, 2021 at 10:25 AM Strahinja Kustudić <strahinjak@nordeus.com> wrote:
On Fri, Dec 10, 2021 at 2:36 AM Peter Geoghegan <pg@bowt.ie> wrote:
Sounds like a problem with get_actual_variable_range(), which can scan
indexes at plan time to determine minimum or maximum values.

This actually has been improved quite a bit since Postgres 10. So as
Jeff said, seems like you might benefit from upgrading to a newer
major version. v11 has improved things in this exact area.

On my Docker instance, when I execute EXPLAIN it starts reading a lot of data. The indexes of the biggest table the query reads total 50GB, so my guess is that it is reading those indexes.

I allowed EXPLAIN in Docker to finish; it took almost 500 seconds and it was reading data the whole time. After I reindexed the biggest table, EXPLAIN finished instantly. Can index corruption cause this?

The index can be bloated.
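
You can get a rough picture of the bloat with the pgstattuple extension, for example (index name is only an example):

  CREATE EXTENSION IF NOT EXISTS pgstattuple;
  -- low avg_leaf_density / high leaf_fragmentation suggest a bloated index
  SELECT avg_leaf_density, leaf_fragmentation
    FROM pgstatindex('big_table_pkey');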


Note that this started happening in production after we deleted a few million rows from the biggest table.

It may be the same issue.


Regards

Pavel


Regards,
Strahinja

Re: BUG #17330: EXPLAIN hangs and very long query plans

From
Strahinja Kustudić
Date:
On Fri, Dec 10, 2021 at 10:53 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

On Fri, Dec 10, 2021 at 10:25 AM Strahinja Kustudić <strahinjak@nordeus.com> wrote:
On Fri, Dec 10, 2021 at 2:36 AM Peter Geoghegan <pg@bowt.ie> wrote:
Sounds like a problem with get_actual_variable_range(), which can scan
indexes at plan time to determine minimum or maximum values.

This actually has been improved quite a bit since Postgres 10. So as
Jeff said, seems like you might benefit from upgrading to a newer
major version. v11 has improved things in this exact area.

On my Docker instance, when I execute EXPLAIN it starts reading a lot of data. The indexes of the biggest table the query reads total 50GB, so my guess is that it is reading those indexes.

I allowed EXPLAIN in Docker to finish; it took almost 500 seconds and it was reading data the whole time. After I reindexed the biggest table, EXPLAIN finished instantly. Can index corruption cause this?

The index can be bloated.

Is bloat alone enough to make it this slow, or does the index need to be corrupted?
 

Note that this started happening in production after we deleted a few million rows from the biggest table.

It may be the same issue.


It looks similar, because there were deletes after which planning times went up to a few seconds.

I just upgraded Postgres to 12 with pg_upgrade, ran ANALYZE on the whole DB, and the issue is similar, but EXPLAIN now finishes in 19 seconds instead of 500 seconds, which means the problem is still there. Here is the whole EXPLAIN ANALYZE (I added ANALYZE so that it displays the planning time):

Sort  (cost=1946.04..1946.16 rows=46 width=642) (actual time=10.929..10.932 rows=20 loops=1)
   Sort Key: m.delivery_start DESC
   Sort Method: quicksort  Memory: 44kB
   ->  Nested Loop Left Join  (cost=10.19..1944.77 rows=46 width=642) (actual time=2.305..10.401 rows=20 loops=1)
         Join Filter: ((c.game_name = csu.game_name) AND (um.user_id = csu.user_id) AND (c.id = csu.campaign_id))
         Filter: ((c.id IS NULL) OR (c.active AND (now() >= c.campaign_start) AND ((now() <= c.campaign_end) OR (c.campaign_end IS NULL)) AND ((c.segment_ids <@ '{5632,-2,2,5634,9731,5640,5641,9738,13,3599,5648,5655,13343,5664,13344,13345,13346,13347,13348,13350,13351,13352,5673,13353,13354,1067,13355,13356,13357,13358,13359,5680,13360,6705,13361,13362,13363,13364,13365,13366,5687,13367,13368,13369,20538,13370,13371,13372,13373,6718,13374,19007,19008,13378,5699,13379,13380,19013,3142,3143,19015,13385,5707,6731,20555,19019,3149,19021,3150,591,19023,592,19024,6737,19027,19028,5717,6743,17496,19033,13402,19035,6750,17502,19039,17504,19041,19043,19044,17508,6757,17510,19047,104,19048,17512,17514,7788,6764,621,19053,623,19055,7792,625,19059,7796,1141,19061,19063,7800,19064,5241,19067,19068,3713,19073,1667,19075,1668,19079,19081,19083,19084,5262,19087,19088,8338,8339,7828,19093,20631,19095,20632,20633,12955,19099,19101,19103,19104,19107,19108,19113,19115,8876,19119,3760,19121,19123,19124,3785,4814,16605,4324,16622,16623,21253,21254,4359,21258,5901,5915,4900,22832,22835,22836,9013,3895,3902,1865,1866,1379,1380,4978,7027,13685,14203,2428,2429,21887,21888,4485,22932,22933,21911,21417,432,434,20418,968,969,470,7642,991,992,10209,1512,1513,23024,9201,9207,23032,9208,9214}'::integer[]) OR (csu.user_id = '-XXXXXXXXXXXXXXXXXX'::bigint))))
         Rows Removed by Filter: 11
         ->  Nested Loop  (cost=10.03..1910.89 rows=163 width=696) (actual time=2.029..10.092 rows=31 loops=1)
               ->  Hash Left Join  (cost=9.46..695.80 rows=61 width=684) (actual time=0.729..5.991 rows=55 loops=1)
                     Hash Cond: ((m.game_name = c.game_name) AND (m.campaign_id = c.id))
                     ->  Index Scan using mail_pkey on mail m  (cost=0.27..686.29 rows=61 width=552) (actual time=0.297..5.524 rows=55 loops=1)
                           Filter: (active AND (game_name = 'topeleven'::game_name) AND (now() >= delivery_start) AND ((now() <= delivery_end) OR (delivery_end IS NULL)))
                           Rows Removed by Filter: 97
                     ->  Hash  (cost=7.83..7.83 rows=91 width=136) (actual time=0.411..0.412 rows=91 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 26kB
                           ->  Seq Scan on campaign c  (cost=0.00..7.83 rows=91 width=136) (actual time=0.265..0.371 rows=91 loops=1)
                                 Filter: (game_name = 'topeleven'::game_name)
                                 Rows Removed by Filter: 55
               ->  Index Scan using user_mail_pkey on user_mail um  (cost=0.57..19.88 rows=4 width=24) (actual time=0.074..0.074 rows=1 loops=55)
                     Index Cond: ((user_id = '-XXXXXXXXXXXXXXXXXX'::bigint) AND (game_name = 'topeleven'::game_name) AND (mail_id = m.id))
                     Filter: ((m.mail_type <> 'question'::mail_type) OR (mail_status <> 'completed'::mail_status))
         ->  Materialize  (cost=0.15..28.18 rows=1 width=16) (actual time=0.009..0.009 rows=0 loops=31)
               ->  Index Only Scan using campaign_static_users_pkey on campaign_static_users csu  (cost=0.15..28.17 rows=1 width=16) (actual time=0.266..0.266 rows=0 loops=1)
                     Index Cond: ((user_id = '-XXXXXXXXXXXXXXXXXX'::bigint) AND (game_name = 'topeleven'::game_name))
                     Heap Fetches: 0
 Planning Time: 19408.862 ms
 Execution Time: 10.978 ms
(27 rows)


Just to mention again, reindexing the biggest table fixes the issue, but I'm more curious why this happened and how we can prevent it from happening again. Or is it already fixed so that it cannot happen in the future?

Regards,
Strahinja

Re: BUG #17330: EXPLAIN hangs and very long query plans

From
Pavel Stehule
Date:


On Fri, Dec 10, 2021 at 4:52 PM Strahinja Kustudić <strahinjak@nordeus.com> wrote:
On Fri, Dec 10, 2021 at 10:53 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

On Fri, Dec 10, 2021 at 10:25 AM Strahinja Kustudić <strahinjak@nordeus.com> wrote:
On Fri, Dec 10, 2021 at 2:36 AM Peter Geoghegan <pg@bowt.ie> wrote:
Sounds like a problem with get_actual_variable_range(), which can scan
indexes at plan time to determine minimum or maximum values.

This actually has been improved quite a bit since Postgres 10. So as
Jeff said, seems like you might benefit from upgrading to a newer
major version. v11 has improved things in this exact area.

On my Docker instance, when I execute EXPLAIN it starts reading a lot of data. The indexes of the biggest table the query reads total 50GB, so my guess is that it is reading those indexes.

I allowed EXPLAIN in Docker to finish; it took almost 500 seconds and it was reading data the whole time. After I reindexed the biggest table, EXPLAIN finished instantly. Can index corruption cause this?

The index can be bloated.

Is bloat alone enough to make it this slow, or does the index need to be corrupted?

Corruption and bloat are two very different things. A bloated index is slower; a corrupted index returns wrong results.
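
If you want to rule out corruption, the amcheck extension (in contrib since 10) can verify the b-tree invariants, for example (index name is only an example):

  CREATE EXTENSION IF NOT EXISTS amcheck;
  -- returns nothing if the index is structurally sound,
  -- raises an error if it is corrupted
  SELECT bt_index_check('big_table_pkey');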

 

Note that this started happening in production after we deleted a few million rows from the biggest table.

It may be the same issue.


It looks similar, because there were deletes after which planning times went up to a few seconds.

I just upgraded Postgres to 12 with pg_upgrade, ran ANALYZE on the whole DB, and the issue is similar, but EXPLAIN now finishes in 19 seconds instead of 500 seconds, which means the problem is still there. Here is the whole EXPLAIN ANALYZE (I added ANALYZE so that it displays the planning time):

[EXPLAIN ANALYZE output snipped - identical to the plan quoted above]


Just to mention again, reindexing the biggest table fixes the issue, but I'm more curious why this happened and how we can prevent it from happening again. Or is it already fixed so that it cannot happen in the future?

Maybe VACUUM can solve this issue too.

A massive delete is a common issue - you can try partitioning, and then instead of a massive delete you just drop the old partition.
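
Roughly like this (names and ranges are only an example):

  CREATE TABLE events (
      id         bigint NOT NULL,
      created_at timestamptz NOT NULL
  ) PARTITION BY RANGE (created_at);

  CREATE TABLE events_2021_11 PARTITION OF events
      FOR VALUES FROM ('2021-11-01') TO ('2021-12-01');

  -- dropping old data is then instant and leaves no dead index entries behind
  ALTER TABLE events DETACH PARTITION events_2021_11;
  DROP TABLE events_2021_11;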

Regards

Pavel



Regards,
Strahinja

Re: BUG #17330: EXPLAIN hangs and very long query plans

From
Tom Lane
Date:
Strahinja Kustudić <strahinjak@nordeus.com> writes:
> On Fri, Dec 10, 2021 at 10:53 AM Pavel Stehule <pavel.stehule@gmail.com>
> wrote:
>> On Fri, Dec 10, 2021 at 10:25 AM Strahinja Kustudić <
>> strahinjak@nordeus.com> wrote:
>>> I allowed EXPLAIN in Docker to finish; it took almost 500 seconds and it
>>> was reading data the whole time. After I reindexed the biggest table,
>>> EXPLAIN finished instantly. Can index corruption cause this?

>> The index can be bloated.

> Is bloat alone enough to make it this slow, or does the index need to be corrupted?

bloat != corruption.  The problem you're dealing with here is a whole lot
of dead index entries at the very end of the index range (could be either
the lowest or highest entries).  The planner is wasting time scanning
through those to find the extremal non-dead value for a range estimate.
REINDEX got rid of the dead entries; a VACUUM would have done as well.
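
Either of these, on just the affected table/index, would do (names here are placeholders):

    VACUUM big_table;
    -- or rebuild only the affected index (CONCURRENTLY requires v12+)
    REINDEX INDEX CONCURRENTLY big_table_pkey;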

As noted upthread, more recent PG versions are a bit smarter about
such cases.

            regards, tom lane



Re: BUG #17330: EXPLAIN hangs and very long query plans

From
Strahinja Kustudić
Date:
On Fri, Dec 10, 2021 at 5:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
bloat != corruption.  The problem you're dealing with here is a whole lot
of dead index entries at the very end of the index range (could be either
the lowest or highest entries).  The planner is wasting time scanning
through those to find the extremal non-dead value for a range estimate.
REINDEX got rid of the dead entries; a VACUUM would have done as well.

I completely understand that bloat is different from corruption. It was just really strange that these mass deletes caused it now, because we run mass deletes on this DB regularly and it has been running for a few years without any issues. You're right about VACUUM also fixing it; I just tried it inside Docker, and as soon as VACUUM finished the issue was gone. The downside is that VACUUM took a long time to finish.
 
As noted upthread, more recent PG versions are a bit smarter about
such cases.

As far as I understand, there is not much we can do except upgrade to newer major versions, because on this DB we cannot partition the data in a way that would let us drop old partitions.
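
Maybe the most we can do in the meantime is to make autovacuum more aggressive on that table, so dead entries get cleaned up sooner after a mass delete; something like this (placeholder name, values not tested on our workload):

  -- vacuum after ~1% of the table is dead instead of the default 20%,
  -- and let autovacuum work with less throttling
  ALTER TABLE big_table SET (
      autovacuum_vacuum_scale_factor = 0.01,
      autovacuum_vacuum_cost_delay   = 2
  );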

Thanks again for your help :)

Regards,
Strahinja