Re: Multivariate MCV stats can leak data to unprivileged users - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: Multivariate MCV stats can leak data to unprivileged users
Msg-id CAEZATCWdjDpuDkA8xnGzDAAmDekVONj2ZegqdAkjiQE4honh2w@mail.gmail.com
In response to Re: Multivariate MCV stats can leak data to unprivileged users  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Multivariate MCV stats can leak data to unprivileged users
List pgsql-hackers
On Sun, 19 May 2019 at 23:45, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> Oh, right. It still has the disadvantage that it obfuscates the actual
> data stored in the pg_stats_ext_data (or whatever would it be called),
> so e.g. functions would have to do additional checks to make sure it
> actually is the right statistic type. For example pg_mcv_list_items()
> could not rely on receiving pg_mcv_list values, as per the signature,
> but would have to check the value.
>

Yes. In fact, since the user-accessible view would want to expose
datatypes specific to the stats kinds rather than bytea or cstring
values, we would need SQL-callable conversion functions for each kind:

* to_pg_ndistinct(pg_extended_stats_ext_data) returns pg_ndistinct
* to_pg_dependencies(pg_extended_stats_ext_data) returns pg_dependencies
* to_pg_mcv(pg_extended_stats_ext_data) returns pg_mcv
* ...

and each of these would throw an error if it weren't given an extended
stats object of the right kind. Then to extract MCV data, you'd have
to do pg_mcv_list_items(to_pg_mcv(ext_data)), and presumably there'd
be something similar for histograms.
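
To make the per-kind check concrete, here is a rough standalone C sketch of the single-type model. It is a toy, not PostgreSQL code: every name in it (ExtStatsData, StatsKind, to_mcv, mcv_list_nitems) is hypothetical, and returning NULL merely stands in for raising an error.

```c
/* Toy model of the "one catch-all type" design; not PostgreSQL source. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical tag identifying which stats kind a value holds. */
typedef enum { STATS_NDISTINCT, STATS_DEPENDENCIES, STATS_MCV } StatsKind;

/* Single catch-all value: a kind tag plus an opaque payload. */
typedef struct
{
    StatsKind   kind;
    const void *payload;
} ExtStatsData;

/* Hypothetical per-kind payloads, reduced to one field each. */
typedef struct { int nitems; } McvList;
typedef struct { double ndistinct; } NDistinct;

/*
 * Conversion function for one kind. A value of any other kind is
 * rejected; returning NULL stands in for throwing an error.
 */
static const McvList *
to_mcv(const ExtStatsData *d)
{
    assert(d != NULL);
    if (d->kind != STATS_MCV)
    {
        fprintf(stderr, "expected an MCV stats value\n");
        return NULL;
    }
    return (const McvList *) d->payload;
}

/*
 * A consumer can no longer rely on its signature alone: it has to go
 * through the conversion and cope with failure. Returns -1 on mismatch.
 */
static int
mcv_list_nitems(const ExtStatsData *d)
{
    const McvList *mcv = to_mcv(d);

    return (mcv != NULL) ? mcv->nitems : -1;
}
```

Every consumer of a stats value ends up repeating this check-then-cast step, which is the extra burden that columns of the right types would avoid.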

IMO, that's not a nice model, compared to just having columns of the
right types in the first place.

Also this model presupposes that all future stats kinds are most
conveniently represented in a single column, but maybe that won't be
the case. It's conceivable that a future stats kind would benefit from
splitting its data across multiple columns.


> Of course, I don't expect to have too many such functions, but overall
> this approach with a single type feels a bit too like EAV to my taste.
>

Yes, I think it is an EAV model. EAV models do have their place, but
that's largely where adding new columns is a common operation and
involves little to no extra code. I don't think either of those is
true for extended stats. What we've seen over the last couple of years
is that adding each new stats kind is a large undertaking, involving
lots of new code. That alone is going to limit just how many ever get
added, and compared to that effort, adding new columns to the catalog
is small fry.

Regards,
Dean


