pgsql: Repair performance regression in information_schema.triggers vie - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Repair performance regression in information_schema.triggers vie
Date
Msg-id E1jS0n1-00084x-77@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Repair performance regression in information_schema.triggers view.

Commit 32ff26911 introduced use of rank() into the triggers view to
calculate the spec-mandated action_order column.  As written, this
prevents query constraints on the table-name column from being pushed
below the window aggregate step.  That's bad for performance of this
typical usage pattern, since the view now has to be evaluated for all
tables not just the one(s) the user wants to see.  It's also the cause
of some recent buildfarm failures, in which trying to evaluate the view
rows for triggers in process of being dropped resulted in "cache lookup
failed for function NNN" errors.  Those rows aren't of interest to the
test script doing the query, but the filter that would eliminate them
is being applied too late.  None of this happened before the rank()
call was there, so it's a regression compared to v10 and before.

We can improve matters by changing the rank() call so that instead of
partitioning by OIDs, it partitions by nspname and relname, casting
those to sql_identifier so that they match the respective view output
columns exactly.  The planner has enough intelligence to know that
constraints on partitioning columns are safe to push down, so this
eliminates the performance problem and the regression test failure
risk.  We could make the other partitioning columns match view outputs
as well, but it'd be more complicated and the performance benefits
are questionable.

Side note: as this stands, the planner will push down constraints on
event_object_table and trigger_schema, but not on event_object_schema,
because it checks for ressortgroupref matches not expression
equivalence.  That might be worth improving someday, but it's not
necessary to fix the immediate concern.

Back-patch to v11 where the rank() call was added.  Ordinarily we'd not
change information_schema in released branches, but the test failure has
been seen in v12 and presumably could happen in v11 as well, so we need
to do this to keep the buildfarm happy.  The change is harmless so far
as users are concerned.  Some might wish to apply it to existing
installations if performance of this type of query is of concern,
but those who don't are no worse off.

I bumped catversion in HEAD as a pro forma matter (there's no
catalog incompatibility that would really require a re-initdb).
Obviously that can't be done in the back branches.

Discussion: https://postgr.es/m/5891.1587594470@sss.pgh.pa.us

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/baf17ad9dff4552b7e494d3f574972c21d9f90bc

Modified Files
--------------
src/backend/catalog/information_schema.sql | 11 +++++++++--
src/include/catalog/catversion.h           |  2 +-
2 files changed, 10 insertions(+), 3 deletions(-)


pgsql-committers by date:

Previous
From: Tom Lane
Date:
Subject: pgsql: Update time zone data files to tzdata release 2020a.
Next
From: Robert Haas
Date:
Subject: pgsql: Try to avoid compiler warnings in optimized builds.