and I've seen similar failures intermittently on other machines.
I'd suggest building this test atop a table that is more stable than pg_class. You're just waving a red flag in front of a bull if you expect stable statistics from that during a regression run. Nor do I see any particular reason for pg_class to be especially suited to the test.
Yeah, it's not good practice to use pg_class here. While looking through the test cases added by this commit, I noticed some other minor issues that could be improved. For example:
* The table 'btg' is populated with 10000 tuples, which seems a bit expensive for a test. I don't think we need such a big table to test what we want.
* I don't see why we need to manipulate the GUCs max_parallel_workers and max_parallel_workers_per_gather.
* I think we'd better write the tests with keywords consistently in either all upper case or all lower case. Mixing the two is not great, as in
explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;
* Some comments for the test queries are not easy to read.
* For this statement
CREATE INDEX idx_y_x_z ON btg(y,x,w);
I think the index name would cause confusion. It creates an index on columns y, x and w, but the name indicates an index on y, x and z.
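To make the keyword-case and index-name points concrete, here is a rough sketch of what I have in mind. The name idx_y_x_w is only my suggestion, and I'm assuming the column list (y, x, w) is what was intended rather than the name; if the index was really meant to cover z, the column list should change instead.

```sql
-- Hypothetical index name that matches the indexed columns y, x and w.
CREATE INDEX idx_y_x_w ON btg(y, x, w);

-- Same query as above, with keywords consistently in upper case.
EXPLAIN (COSTS OFF) SELECT x, y FROM btg GROUP BY x, y, z, w;
```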