Not including a trace regarding to which expression a row refers to sounds like a design mistake to me, particularly because JSON is, by design, JSON, and we don't have ordering requirements.
I'd have no issue including the expr definition in the existing object container - an optional parameter that allow (no warning) but which has no effect. We could then compare that optional parameter against the text representation of the node and issue a non-failing warning on any differences.
But in a way we're already getting this type checking if the expressions have different datatypes and either of them have most_common_values, histogram_bounds, or most_common_elems - as all of those require input coersion of the array values to datatypes of the expressions, and any failure in any of those dooms the whole exprs array.
It's true that json jumbles orders of keys, but it does not jumble the order of elements inside an array. The 2-D text[] export has exactly this same problem, btw.
If we don't include an expression text, I'm OK to give up on this idea. But let's at least include a negative attribute number with an "attribute" field.
That feels like the same thing as including the expr definition as a parameter, and I'm as amenable to this as I am to the expr node text.
We could cross-check it with the number of expressions defined in the statext object.
Yep.
On second though, as we already use negative attribute numbers for ndistinct and dependencies, perhaps it's not a bad choice to use a negative number anyway. As the attribute number assigned depends on the order of the elements in pg_stats_ext.exprs, I'd suggest to tweak the pg_dump query to rely on that rather than ORDINALITY and the order where the rows of pg_stats_ext_exprs are scanned. Using the order of the elements in the definition of the stats object is predictable. A sequential scan of a catalog offers no real guarantees.
This has come up before. pg_stats, infuriatingly, has an attname but not an attnum.
pg_stats_ext_exprs, equally infuriatingly, does not expose pg_statistic_ext.oid, which means to join back from pg_stats_ext_exprs to pg_statistic_ext, we'd have to re-join the namespace and...yeah, it gets yucky.
But instead of doing that, we could fetch the pg_get_statisticsobjdef_expressions() when we first fetch the extstats definitions, use that as a parameter in the dump query, and unnest that with ordinality and join back to pg_stats_ext_expr. I think that's the best way of enforcing the order of the expressions and not just trusting the unnest() in the view definition.
In other news, pg_get_statisticsobjdef_expressions appears to be undocumented, though it exists in v14 when extended stats expressions were introduced, so I think we're safe there.