Re: BUG #18177: certain queries under certain contexts take multiple orders of magnitude longer compared to v10 - Mailing list pgsql-bugs

From Tomas Vondra
Subject Re: BUG #18177: certain queries under certain contexts take multiple orders of magnitude longer compared to v10
Date
Msg-id e61a37f3-eeeb-21cb-647d-0fc3771f94c9@enterprisedb.com
In response to Re: BUG #18177: certain queries under certain contexts take multiple orders of magnitude longer compared to v10  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: BUG #18177: certain queries under certain contexts take multiple orders of magnitude longer compared to v10  (Robert Leach <rleach@princeton.edu>)
List pgsql-bugs
On 11/2/23 11:45, David Rowley wrote:
> On Thu, 2 Nov 2023 at 21:56, PG Bug reporting form
> <noreply@postgresql.org> wrote:
>> I cannot claim to understand the bug that is causing this issue, so the best
>> I can do is simply provide the explain output and try to keep from providing
>> confusing details, because this is outside the realm of my expertise:
> 
> 
>> ('                    ->  Index Scan using
>> "DataRepo_peakdata_peak_group_id_4dd87f4a" on "DataRepo_peakdata"
>> (cost=0.25..8.26 rows=1 width=8) (actual time=0.017..7.149 rows=7896
>> loops=1)',)
> 
> Nothing looks particularly bug like so far.  It seems the
> pg_class.reltuples estimate for this relation is way out.
> 
> Has autovacuum gotten to this table recently?
> 
> select * from pg_stat_user_tables where relid = '"DataRepo_peakdata"'::regclass;
> 
> The plan would likely come good if you analyzed that table.  You
> should see if you can figure out why autovacuum hasn't analyzed it.
> 
> For the future, it might be best to post EXPLAIN output as an
> attachment.  The extra formatting makes it difficult to read.
> 

I did look at the django forum, and the plans are there. For convenience
I'm attaching them here too. But there's nothing particularly revealing,
except that it indeed seems to be related to the index scan.

One thing that puzzles me is why it'd behave differently between PG10
and PG13. I'd have expected both releases to do the same thing if there
are no stats, but maybe PG10 manages to analyze it in time for some
reason?

Would be interesting to see the pg_class data (reltuples, relpages and
timestamp of last analyze/autoanalyze) for the slow/fast cases.
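Something like this should show it (just a sketch - adjust the relation
name if it lives in a non-default schema):

```sql
-- reltuples/relpages as the planner sees them, plus analyze timestamps
SELECT c.relname, c.reltuples, c.relpages,
       s.last_analyze, s.last_autoanalyze, s.n_live_tup
FROM pg_class c
JOIN pg_stat_user_tables s ON s.relid = c.oid
WHERE c.relname = 'DataRepo_peakdata';
```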

However, the forum thread also claims this:

    And here is the Django 4.2 and postgres 13 output when running just
    the one test (which is always fast):

Which seems weird. But maybe the table is modified/updated by the tests,
so a single test might see much less data, or something? Not sure.
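If the tests do churn the table, it might be worth comparing how many
rows changed since the last analyze against the autoanalyze trigger
point (a sketch, assuming the default threshold/scale-factor settings of
50 and 0.1):

```sql
-- autoanalyze fires once n_mod_since_analyze exceeds the threshold
SELECT relname, n_mod_since_analyze,
       50 + 0.1 * n_live_tup AS autoanalyze_threshold
FROM pg_stat_user_tables
WHERE relname = 'DataRepo_peakdata';
```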

However, there's one more interesting bit - the 'weird errors' seen
after placing VACUUM FULL ANALYZE into the test setup/teardown look like
this:

    psycopg2.errors.DataCorrupted: missing chunk number 0 for toast
    value 2884552 in pg_toast_2619

That seems unrelated to the planning issue, but it definitely seems like
a bug. Interestingly enough we got a report [1] about the same error on
pgsql-hackers a couple days ago. That error affected PG11-16, which
seems to be consistent with this (PG13).


[1]
https://www.postgresql.org/message-id/CALdSSPhmqoN02ciT4UxS6ax0N84NpRwPWm87nKJ_%2B0G-Na8qOQ%40mail.gmail.com


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
