Thread: Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
From
Tom Lane
Date:
nikhil raj <nikhilraj474@gmail.com> writes: > I've encountered a noticeable difference in execution time and query > execution plan row counts between PostgreSQL 13 and PostgreSQL 16 when > running a query on information_schema tables. Surprisingly, PostgreSQL 16 > is performing slower than PostgreSQL 13. Yeah, it looks like that condition on "table_name" is not getting pushed down to the scan level anymore. I'm not sure why not, but will look closer tomorrow. regards, tom lane
Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
From
David Rowley
Date:
On Tue, 27 Aug 2024 at 13:40, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yeah, it looks like that condition on "table_name" is not getting > pushed down to the scan level anymore. I'm not sure why not, > but will look closer tomorrow. I was looking for the offending commit as at first I thought it might be related to Memoize. It does not seem to be. I get the following up until 2489d76c, and from then on, it's a subquery filter. -> Index Scan using pg_class_relname_nsp_index on pg_class r_2 (cost=0.27..8.30 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=1) Index Cond: (relname = 't_c56ng1_repository'::name) Filter: ((relkind = ANY ('{r,p}'::"char"[])) AND pg_has_role(relowner, 'USAGE'::text)) So looks like it was the "Make Vars be outer-join-aware." commit that changed this. David
Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes: > On Tue, 27 Aug 2024 at 13:40, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Yeah, it looks like that condition on "table_name" is not getting >> pushed down to the scan level anymore. I'm not sure why not, >> but will look closer tomorrow. > So looks like it was the "Make Vars be outer-join-aware." commit that > changed this. Yeah, I got that same result by bisecting. It seems like it's somehow related to the cast to information_schema.sql_identifier: we are able to get rid of that normally but seem to fail to do so in this query. There was a smaller increase in the runtime at dfb75e478 "Add primary keys and unique constraints to system catalogs", but that seems to just be due to there being more rows in the relevant catalogs. (That's from testing the query in an empty database; probably the effect of dfb75e478 would be swamped in a production DB anyway.) regards, tom lane
Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
From
Justin Clift
Date:
On 2024-08-27 11:50, David Rowley wrote: > On Tue, 27 Aug 2024 at 13:40, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Yeah, it looks like that condition on "table_name" is not getting >> pushed down to the scan level anymore. I'm not sure why not, >> but will look closer tomorrow. > > I was looking for the offending commit as at first I thought it might > be related to Memoize. It does not seem to be. As a general thought, seeing that this might be an actual problem should some kind of automated testing be added that checks for performance regressions like this? Regards and best wishes, Justin Clift
Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
From
David Rowley
Date:
On Tue, 27 Aug 2024 at 18:00, Justin Clift <justin@postgresql.org> wrote: > As a general thought, seeing that this might be an actual problem > should some kind of automated testing be added that checks for > performance regressions like this? We normally try to catch these sorts of things with regression tests. Of course, that requires having a test that would catch a particular problem, which we don't seem to have for this particular case. A performance test would also require testing a particular scenario, so I don't see why that's better. A regression test is better suited as there's no middle ground between pass and fail. David
Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
From
David Rowley
Date:
On Tue, 27 Aug 2024 at 14:03, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yeah, I got that same result by bisecting. It seems like it's > somehow related to the cast to information_schema.sql_identifier: > we are able to get rid of that normally but seem to fail to do so > in this query. In case it saves you a bit of time, I stripped as much of the unrelated stuff out as I could and got: create table t (a name, b int); explain select * from (select a::varchar,b from (select distinct a,b from t) st) t right join t t2 on t.b=t2.b where t.a='test'; getting rid of the cast or swapping to INNER JOIN rather than RIGHT JOIN means that qual_is_pushdown_safe() gets a Var rather than a PlaceHolderVar. David
Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
From
Justin Clift
Date:
On 2024-08-27 20:14, David Rowley wrote: > On Tue, 27 Aug 2024 at 18:00, Justin Clift <justin@postgresql.org> > wrote: >> As a general thought, seeing that this might be an actual problem >> should some kind of automated testing be added that checks for >> performance regressions like this? > > We normally try to catch these sorts of things with regression tests. > Of course, that requires having a test that would catch a particular > problem, which we don't seem to have for this particular case. A > performance test would also require testing a particular scenario, so > I don't see why that's better. A regression test is better suited as > there's no middle ground between pass and fail. Yeah, that's the kind of thing I was thinking. Any idea who normally does those, and if it would be reasonable to add test(s) for the internal information tables? Regards and best wishes, Justin Clift
Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
From
David Rowley
Date:
On Wed, 28 Aug 2024 at 18:59, Justin Clift <justin@postgresql.org> wrote: > Any idea who normally does those, and if it would be reasonable to add > test(s) for the internal information tables? These tend to get added along with features and along with of bug fixes. I imagine any tests for the information_schema views would be for the results of the views rather than for the expected plans. However, that seems very separate from this as the bug has nothing to do with information_schema. It just happens to be a query to an information_schema view that helped highlight the bug. Those views are often quite complex and so are the resulting plans. With tests checking the expected EXPLAIN output, it's much better to give these a very narrow focus otherwise the expected output could be too unstable and the purpose of the test harder to determine for anyone working on a new patch which results in a plan change of a preexisting test. I've seen tests before rendered useless by people blindly accepting the plan change without properly determining what the test is supposed to be testing. That's much more likely to happen when the purpose of the test is less clear due to some unwieldy and complex expected plan. I managed to get a reproducer for this down to something quite simple. Probably that or something similar would be a better test to make sure this bug stays gone. David