Simon Riggs <simon@2ndQuadrant.com> writes:
> Currently, ANALYZE collects data on all columns and stores these
> samples in pg_statistic where they can be seen via the view pg_stats.
Only if you have appropriate privileges.
> In some cases we have data that is private and we do not wish others
> to see it, such as patient names. This becomes more important when we
> have row security.
> Perhaps that data can be protected, but it would be even better if we
> simply didn't store value-revealing statistic data at all.
SET STATISTICS 0 seems like a sufficient solution for people who don't
trust the has_column_privilege() protection in the pg_stats view.
In practice I think this is a waste of time, though. Anyone who can
bypass the view restriction can probably just read the original table.
(I suppose we could consider marking pg_stats as a security_barrier
view to make this even safer. Not sure it's worth the trouble though;
the interesting columns are anyarray so it's hard to do much with them
mechanically.)
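For concreteness, the SET STATISTICS 0 approach mentioned above might look
roughly like the following. This is only a sketch: the table "patients" and
its column "name" are hypothetical, chosen to match the patient-names example.

```sql
-- Hypothetical table and column names, for illustration only.
-- A per-column statistics target of 0 tells ANALYZE to skip this column.
ALTER TABLE patients ALTER COLUMN name SET STATISTICS 0;

-- Subsequent ANALYZE runs no longer collect statistics for "name".
ANALYZE patients;

-- Inspect what pg_stats still exposes for the column. The view only shows
-- rows for columns the current user is allowed to read.
SELECT attname, null_frac, avg_width, n_distinct,
       most_common_vals, histogram_bounds
FROM pg_stats
WHERE tablename = 'patients'
  AND attname = 'name';
```

Note that a target of 0 disables all statistics for the column, including
the null fraction, average width and ndistinct, so the planner falls back on
default estimates for it. Statistics gathered before the change may well
still be present afterwards, so it seems worth checking pg_stats rather than
assuming the old entries are gone.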
> It would be good if we could collect the overall stats
> * NULL fraction
> * average width
> * ndistinct
> yet without storing either the MFVs or histogram.
Do you have any evidence whatsoever that that's worth the trouble?
I'd bet against it. And if we're being paranoid, who's to say that
those numbers couldn't reveal useful data in themselves?
regards, tom lane