Re: Statistics Import and Export - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: Statistics Import and Export
Date
Msg-id CADkLM=e_21rwvKYp_99x=knZXWvDEOSoT4ODcBBLSixFaNu8-Q@mail.gmail.com
Whole thread Raw
In response to Re: Statistics Import and Export  (jian he <jian.universality@gmail.com>)
Responses Re: Statistics Import and Export
List pgsql-hackers
On Wed, Oct 16, 2024 at 7:20 PM Corey Huinker <corey.huinker@gmail.com> wrote:
Code fix with comment on why nobody expects a relpages -1. Test case to demonstrate that relpages -1 can happen, and updated doc to reflect the new lower bound.

Additional fixes, now in a patch-set:

1. Allow relpages to be set to -1 (partitioned tables with partitions have this value after ANALYZE).
2. Turn off autovacuum on tables (where possible) if they are going to be the target of pg_set_relation_stats().
3. Allow pg_set_relation_stats to continue past an out-of-range detection on one attribute, rather than immediately returning false.
 

There is some uncertainty on what, if anything, should be returned by pg_set_relation_stats() and pg_set_attribute_stats().

Initially the function (and there was just one) returned void, but it had a bool return added around the time it split into relation/attribute stats.

Returning a boolean seems like good instrumentation and a helper for allowing other tooling to use the functions. However, it's rather limited in what it can convey.

Currently, a return of true means "a record was written", and false means that a record was not written. Cases where a record was not written for pg_set_relation_stats amount to the following:

1. No attributes were specified, therefore there is nothing to change.
2. The attributes were set to the values that the record already has, thus no change was necessary.

#2 can be confusing, because false may look like a failure, but it means "the pg_class values were already set to what you wanted".

An alternate use of boolean, suggested by Jeff was the following:

1. Return true if all of the fields specified were applied to the record.
2. Return false if any field that was specified was NOT set, even if the other ones were.

#2 is also confusing in that the user has received a false value, but the operation did modify the record, just not as fully as the caller had hoped.

These show that a boolean isn't really up to conveying the nuances of potential outcomes. Complex return types have met with considerable resistance, enumerations are similarly undesirable, no other scalar value seems up to the task, and while an INFO or LOG message could convey considerable complexity, it wouldn't be readily handled programmatically. This re-raises the question of whether the pg_set_*_stats functions should return anything at all.

Any feedback on what users would expect from these functions in terms of return value is appreciated. Bear in mind that these functions will NOT be integrated into pg_upgrade/pg_dump, as that functionality will be handled by functions that are less user friendly but more flexible and forgiving of bad data. We're talking purely about functions meant for tweaking stats to look for changes in planner behavior.











 

pgsql-hackers by date:

Previous
From: Andy Fan
Date:
Subject: Re: Considering fractional paths in Append node
Next
From: Andy Fan
Date:
Subject: Re: Considering fractional paths in Append node