The problem is not limited to this special case. Consider the case where
1) a remote table with many rows is heavily updated after it was last
analyzed, and then 2) postgres_fdw imports its stats before the table
gets re-analyzed. The stats postgres_fdw imports would be stale,
causing plan degradation. I don't think we should enable this feature
by default until we can guarantee stats freshness in some way.
So it seems like we have the following configurations desired by at least somebody:
0. Row sampling only.
1. Fetch stats and fall back to row sampling.
2. Always analyze the remote table first (assuming it is a relation that can hold stats), then fetch stats, and fall back to row sampling if necessary.
3. Fetch stats; if that turns up zero attribute stats, try an analyze and refetch; if that still fails, fall back to row sampling.
With the following interpretations of reltuples = 0:
a. The table is definitively empty; stop.
b. The table is missing stats, and running an analyze is cheap (assuming remote analysis is even enabled).
c. If the remote version is >= 14, treat it as (a); otherwise treat it as (b).
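
To make the knobs concrete: if fetch_stats ended up as an enum rather than a plain boolean, picking one of the four modes per server or per foreign table might look roughly like this (the option name is the one we've been using in this thread; the values, server, and table names are placeholders I made up, and none of this exists today):

    -- Hypothetical syntax sketch only; no such option values exist yet.
    ALTER SERVER loopback
        OPTIONS (ADD fetch_stats 'fetch_or_sample');          -- mode 1
    ALTER FOREIGN TABLE remote_orders
        OPTIONS (ADD fetch_stats 'analyze_fetch_or_sample');  -- mode 2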
I'm of the opinion that 3c is the best configuration for most tables, while you have advocated for 1a without an analyze option and 2a with one. Option 2 seems a bit heavy-handed to me, but I could see checking the remote pg_stat_all_tables and making an analyze/no-analyze judgement call based on that; perhaps call it analyze_stale_vacuum_interval or something like that. That could be a neat feature for v20.

So whatever default we choose for fetch_stats, I ask that we choose values that keep our options open for all 4x3 configurations enumerated above.
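
For what it's worth, the judgement call I have in mind there would be driven by a probe of the remote stats views, something along these lines (remote_tab and the schema are placeholders; the threshold would be whatever analyze_stale_vacuum_interval got set to):

    -- Rough sketch of a remote staleness probe: when was the table last
    -- (auto)analyzed, and how many rows have been modified since then?
    SELECT GREATEST(last_analyze, last_autoanalyze) AS last_analyzed,
           n_mod_since_analyze
      FROM pg_stat_all_tables
     WHERE schemaname = 'public'
       AND relname = 'remote_tab';

If last_analyzed is NULL or older than the configured interval, issue a remote ANALYZE before fetching; otherwise fetch the existing stats as-is.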