Re: BUG #18177: certain queries under certain contexts take multiple orders of magnitude longer compared to v10 - Mailing list pgsql-bugs

From Robert Leach
Subject Re: BUG #18177: certain queries under certain contexts take multiple orders of magnitude longer compared to v10
Date
Msg-id 613F52D0-D7F1-4C94-B1F9-DCB53C040406@princeton.edu
Whole thread Raw
In response to Re: BUG #18177: certain queries under certain contexts take multiple orders of magnitude longer compared to v10  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-bugs
Would be interesting to see the pg_class data (reltuples, relpages and
timestamp of last analyze/autoanalyze) for the slow/fast cases.

If you give me the necessary query, I can execute it an give you the output.

However, the forum thread also claims this:

   And here is the Django 4.2 and postgres 13 output when running just
   the one test (which is always fast):

Which seems weird. But maybe the table is modified/updated by the tests,
so a single test might seem much less data, or something? Not sure.

The tables are not modified by the tests.  All that the tests do is load data and make a query.  If you run just the 1 test, it loads 1 samples and animals (and other underlying data) in the class setup (to make data available to every test in the class) and the test loads peak data and makes a query.  That's the fast case.

In the slow case, before doing all that, the other test class does roughly the same thing, but once that class's tests are done, the data is destroyed and the tests move on to the next class.

However, there's one more interesting bit - the 'weird errors' after
placing VACUUM FULL ANALYZE into test setup/teardown are this:

   psycopg2.errors.DataCorrupted: missing chunk number 0 for toast
   value 2884552 in pg_toast_2619

That seems unrelated to the planning issue, but it definitely seems like
a bug. Interestingly enough we got a report [1] about the same error on
pgsql-hackers a couple days ago. That error affected PG11-16, which
seems to be consistent with this (PG13).

IMO, you should take that error with a grain of salt.  Maybe it means something or maybe it's the result of me not knowing what I'm doing when I started inserting debug code and SQL commands that I'm unfamiliar with and could be using incorrectly.  We never got that error until I started messing around to debug this issue.

By the way, the codebase is a public repo.  Here are some links, if you would find it useful to peruse the code:

REPO:
The models:
https://github.com/Princeton-LSI-ResearchComputing/tracebase/tree/main/DataRepo/models
My debug branch for this issue (which is actively being worked on and occasionally updated):
https://github.com/Princeton-LSI-ResearchComputing/tracebase/tree/speed_debugging
The test I've been using to debug this issue:
https://github.com/Princeton-LSI-ResearchComputing/tracebase/blob/7c7fa13eec1233a12bd114ec90270bb94fe959ae/DataRepo/tests/loading/test_load_accucor_msruns.py#L929

Rob

pgsql-bugs by date:

Previous
From: Gazel Alvarado Ivan
Date:
Subject: Installation problem with application stack builder
Next
From: Laurenz Albe
Date:
Subject: Re: BUG #18178: New Restriction on "ON SELECT" rules on tables