* A brand-new empty table is analyzed, since it has no statistics.
* Re-running on a still-empty table analyzes it again, because it still has no pg_statistic rows.
* A table with data but no statistics is analyzed.
* Re-running after statistics exist causes the table to be skipped.
* If a new column is added and lacks statistics, the table is analyzed again.
* After statistics are created for that column, subsequent runs skip the table.
* If statistics are manually deleted or effectively lost (e.g., crash recovery scenarios affecting stats tracking), the table is analyzed again.
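Repeated runs therefore converge toward a no-op once every relation has complete statistics. The one exception is a table that stays empty: as noted above, it is analyzed on every run because it never gets pg_statistic rows.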
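
As a rough illustration, the cases listed above can be reproduced with a small toy model in Python. This is only a sketch of the decision rule as described in this mail, not the patch's C code. The `Table` class and the function names are hypothetical, and the model assumes every column is analyzable, whereas the patch relies on examine_attribute() to decide that.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy stand-in for a relation (hypothetical, for illustration only)."""
    columns: set[str]
    row_count: int = 0
    # Columns that currently have a pg_statistic-like entry.
    stats: set[str] = field(default_factory=set)


def needs_analyze(table: Table) -> bool:
    """Return True if at least one column has no statistics entry."""
    return bool(table.columns - table.stats)


def analyze_missing_stats_only(table: Table) -> bool:
    """Analyze only if some column lacks stats; return True if analyzed."""
    if not needs_analyze(table):
        return False  # complete statistics: the table is skipped
    # Modeled assumption: analyzing an empty table stores no entries,
    # so the table still counts as "missing stats" on the next run.
    if table.row_count > 0:
        table.stats = set(table.columns)
    return True


if __name__ == "__main__":
    t = Table(columns={"id"})
    print(analyze_missing_stats_only(t))  # empty table, no stats
    print(analyze_missing_stats_only(t))  # still empty
    t.row_count = 100
    print(analyze_missing_stats_only(t))  # data, no stats
    print(analyze_missing_stats_only(t))  # stats exist
    t.columns.add("note")
    print(analyze_missing_stats_only(t))  # new column lacks stats
    print(analyze_missing_stats_only(t))  # stats complete again
```

In this model, a run returns True (analyzed) whenever any column lacks an entry and False (skipped) otherwise, which is why repeated runs settle into skips for every non-empty table.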
Regression tests are included.
As discussed earlier in the thread, I plan to start a new discussion and patch series for a separate ANALYZE (MODIFIED_STATS) option that would reuse autoanalyze-style thresholds. I believe keeping MISSING_STATS_ONLY and MODIFIED_STATS as separate, clearly defined options makes the semantics easier to reason about.
I would greatly appreciate further review and feedback on this version. Thank you all for the detailed guidance and suggestions so far, especially regarding reuse of examine_attribute() and alignment with vacuumdb behavior. This process has been very educational for me.