Thread: pgsql: Rework the pg_statistic_ext catalog
Rework the pg_statistic_ext catalog

Since extended statistics were introduced in PostgreSQL 10, there was a
single catalog, pg_statistic_ext, storing both the definitions and the
built statistics. That is problematic when a user is supposed to have
access only to the definitions, but not to user data.

Consider for example pg_dump on a database with RLS enabled - if the
pg_statistic_ext catalog respects RLS (which it should, if it contains
user data), pg_dump would not see any records and the result would not
define any extended statistics. That would be surprising behavior.

Until now this was not a pressing issue, because the existing types of
extended statistics (functional dependencies and ndistinct coefficients)
do not include any user data directly. This changed with the introduction
of MCV lists, which do include the most common combinations of values.

The easiest way to fix this is to split the pg_statistic_ext catalog into
two - one for the definitions, one for the built statistics values. The
new catalog is called pg_statistic_ext_data, and we maintain a 1:1
relationship with the old catalog - either there are matching records in
both catalogs, or in neither of them.

Bumped CATVERSION due to changing system catalog definitions.

Author: Dean Rasheed, with improvements by me
Reviewed-by: Dean Rasheed, John Naylor
Discussion: https://postgr.es/m/CAEZATCUhT9rt7Ui%3DVdx4N%3D%3DVV5XOK5dsXfnGgVOz_JhAicB%3DZA%40mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/6cbfb784c3c91146148a76d50cda6f69ae6a79fb

Modified Files
--------------
doc/src/sgml/catalogs.sgml                  | 70 ++++++++++++++++++++++-----
doc/src/sgml/func.sgml                      |  6 +--
doc/src/sgml/perform.sgml                   | 20 ++++----
doc/src/sgml/planstats.sgml                 |  4 +-
src/backend/catalog/Makefile                |  2 +-
src/backend/commands/statscmds.c            | 73 ++++++++++++++++++++++-------
src/backend/optimizer/util/plancat.c        | 12 +++--
src/backend/statistics/README.mcv           |  9 ++--
src/backend/statistics/dependencies.c       |  7 +--
src/backend/statistics/extended_stats.c     | 61 +++++++++++++-----------
src/backend/statistics/mcv.c                |  7 +--
src/backend/statistics/mvdistinct.c         |  7 +--
src/backend/utils/cache/syscache.c          | 12 +++++
src/include/catalog/catversion.h            |  2 +-
src/include/catalog/indexing.h              | 17 ++++---
src/include/catalog/pg_statistic_ext.h      |  9 ++--
src/include/catalog/pg_statistic_ext_data.h | 52 ++++++++++++++++++++
src/include/catalog/toasting.h              |  1 +
src/include/utils/syscache.h                |  1 +
src/test/regress/expected/oidjoins.out      |  8 ++++
src/test/regress/expected/sanity_check.out  |  1 +
src/test/regress/expected/stats_ext.out     | 38 ++++++++++-----
src/test/regress/sql/oidjoins.sql           |  4 ++
src/test/regress/sql/stats_ext.sql          | 30 ++++++++----
src/tools/pgindent/typedefs.list            |  2 +

25 files changed, 337 insertions(+), 118 deletions(-)
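For readers who want to poke at the new layout, here is a rough sketch (not
part of the commit) of how the split looks from SQL. It assumes the
PostgreSQL 12-era column names - stxname/stxrelid/stxkind in
pg_statistic_ext, and stxoid plus the stxd* value columns in
pg_statistic_ext_data - and must be run as a role allowed to read the data
catalog (later in this thread, public access to it is revoked).

    -- Sketch only: list extended statistics definitions and show which
    -- kinds of statistics have been built for each, using the 1:1 join
    -- between the definitions catalog and the data catalog.
    SELECT s.oid,
           s.stxname,
           s.stxrelid::regclass        AS table_name,
           s.stxkind,
           d.stxdndistinct    IS NOT NULL AS has_ndistinct,
           d.stxddependencies IS NOT NULL AS has_dependencies,
           d.stxdmcv          IS NOT NULL AS has_mcv
    FROM pg_catalog.pg_statistic_ext s
    JOIN pg_catalog.pg_statistic_ext_data d ON d.stxoid = s.oid;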
Tomas Vondra <tomas.vondra@postgresql.org> writes:
> Rework the pg_statistic_ext catalog

So ... not one of the buildfarm members that are running TAP tests
likes this.  The failures look like

# Running: pg_dump --no-sync --file=/Users/tgl/pgsql/src/bin/pg_dump/tmp_check/tmp_test_cSh8/role.sql --role=regress_dump_test_role --schema=dump_test_second_schema postgres
pg_dump: error: query failed: ERROR:  permission denied for table pg_statistic_ext
pg_dump: error: query was: SELECT tableoid, oid, stxname, stxnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = stxowner) AS rolname FROM pg_catalog.pg_statistic_ext
not ok 4576 - role: pg_dump runs

#   Failed test 'role: pg_dump runs'
#   at t/002_pg_dump.pl line 3418.

I think probably what's happening is that pg_dump is still trying to dump
directly from the catalog, when what it needs to do now is dump from the
view, in case it's not running as superuser.

			regards, tom lane
I wrote:
> Tomas Vondra <tomas.vondra@postgresql.org> writes:
>> Rework the pg_statistic_ext catalog

> So ... not one of the buildfarm members that are running TAP tests
> likes this. ...

> I think probably what's happening is that pg_dump is still trying to dump
> directly from the catalog, when what it needs to do now is dump from the
> view, in case it's not running as superuser.

I experimented with extracting the required data from the view, and
there are at least two show-stopper problems:

* The view doesn't expose pg_statistic_ext.oid, which pg_dump has to have
for dependency tracking purposes.  I think we could just add it though.
Now that OIDs are ordinary columns it won't even look very odd.

* Rather than just not exposing the critical data for stats you don't
have privilege to read, the view doesn't expose anything at all.
I do not think that's acceptable; it creates a significant hazard of
data loss during pg_dump, for no very good reason.  What we should
be doing, IMO, is still showing all the rows but filling the data-value
columns with nulls in rows where the caller can't access the underlying
data.

			regards, tom lane
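To make the second point concrete, here is a hedged sketch of the kind of
view being described - it is not the actual system view, the view name is
made up, and the privilege test (has_table_privilege on the statistics
object's table) is an assumption chosen for illustration. Because access to
tables referenced by a view is checked against the view's owner, a
superuser-owned view like this can still read pg_statistic_ext_data after
public access to that catalog is revoked; the CASE expressions then decide
per row whether to expose the built values or NULL.

    -- Sketch only (not the committed pg_stats_ext definition): show every
    -- statistics object, but blank out the built values for callers who
    -- cannot read the underlying table.
    CREATE VIEW stats_ext_masked AS
    SELECT s.oid, s.stxrelid, s.stxname, s.stxnamespace, s.stxowner,
           s.stxkeys, s.stxkind,
           CASE WHEN has_table_privilege(s.stxrelid, 'SELECT')
                THEN d.stxdndistinct END    AS stxdndistinct,
           CASE WHEN has_table_privilege(s.stxrelid, 'SELECT')
                THEN d.stxddependencies END AS stxddependencies,
           CASE WHEN has_table_privilege(s.stxrelid, 'SELECT')
                THEN d.stxdmcv END          AS stxdmcv
    FROM pg_catalog.pg_statistic_ext s
    LEFT JOIN pg_catalog.pg_statistic_ext_data d ON d.stxoid = s.oid;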
On Sun, 16 Jun 2019, 03:05 Tom Lane, <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Tomas Vondra <tomas.vondra@postgresql.org> writes:
>>> Rework the pg_statistic_ext catalog

>> So ... not one of the buildfarm members that are running TAP tests
>> likes this. ...

>> I think probably what's happening is that pg_dump is still trying to dump
>> directly from the catalog, when what it needs to do now is dump from the
>> view, in case it's not running as superuser.

> I experimented with extracting the required data from the view, and
> there are at least two show-stopper problems:

> * The view doesn't expose pg_statistic_ext.oid, which pg_dump has to have
> for dependency tracking purposes.  I think we could just add it though.
> Now that OIDs are ordinary columns it won't even look very odd.

> * Rather than just not exposing the critical data for stats you don't
> have privilege to read, the view doesn't expose anything at all.
> I do not think that's acceptable; it creates a significant hazard of
> data loss during pg_dump, for no very good reason.  What we should
> be doing, IMO, is still showing all the rows but filling the data-value
> columns with nulls in rows where the caller can't access the underlying
> data.
Hang on. Isn't the real problem that we should be revoking public access from pg_statistic_ext_data, not pg_statistic_ext? Normal users should still be able to see the stats definitions in the catalog table, but not the stats data, so I think pg_dump wouldn't need changing.
Regards,
Dean
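If I'm reading Dean's suggestion right, the change would be roughly the
following (a sketch, not the committed wording of system_views.sql): lock
down only the catalog holding the built values and leave the definitions
readable, so pg_dump run as an ordinary user keeps working.

    -- Sketch of the suggestion: restrict the data catalog, not the
    -- definitions catalog.
    REVOKE ALL ON pg_catalog.pg_statistic_ext_data FROM PUBLIC;
    -- pg_statistic_ext itself stays readable, so the stats definitions
    -- remain visible to normal users (and to pg_dump).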
On Sun, Jun 16, 2019 at 07:18:58AM +0100, Dean Rasheed wrote:
>On Sun, 16 Jun 2019, 03:05 Tom Lane, <tgl@sss.pgh.pa.us> wrote:
>> [...]
>
>Hang on. Isn't the real problem that we should be revoking public access
>from pg_statistic_ext_data, not pg_statistic_ext? Normal users should still
>be able to see the stats definitions in the catalog table, but not the
>stats data, so I think pg_dump wouldn't need changing.
>

Yeah, that's pretty much why we split the catalog - to allow pg_dump to
read the definitions, without exposing the stats too. I'll look into
this and push a fix to unbreak the buildfarm (sorry about that).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Jun 16, 2019 at 10:56:51AM +0200, Tomas Vondra wrote:
>On Sun, Jun 16, 2019 at 07:18:58AM +0100, Dean Rasheed wrote:
>>[...]
>>
>>Hang on. Isn't the real problem that we should be revoking public access
>>from pg_statistic_ext_data, not pg_statistic_ext? Normal users should still
>>be able to see the stats definitions in the catalog table, but not the
>>stats data, so I think pg_dump wouldn't need changing.
>>
>
>Yeah, that's pretty much why we split the catalog - to allow pg_dump to
>read the definitions, without exposing the stats too. I'll look into
>this and push a fix to unbreak the buildfarm (sorry about that).
>

OK, the bug here seems to be pretty silly - the GRANT in system_views
does this:

GRANT SELECT (oid, stxrelid, stxname, stxnamespace, stxowner, stxkeys, stxkind)
    ON pg_statistic_ext TO public;

but pg_dump also uses the tableoid column. After adding it to the GRANT I
can run pg_dump as a regular user just fine. I'll push this fix shortly.

But we're not out of the woods yet, it seems. check-world still fails
for me with an error like this:

t/011_crash_recovery.pl .............. 1/3
#   Failed test 'new xid after restart is greater'
#   at t/011_crash_recovery.pl line 61.
#     '486'
#         >
#     '486'

#   Failed test 'xid is aborted after crash'
#   at t/011_crash_recovery.pl line 65.
#          got: 'committed'
#     expected: 'aborted'
# Looks like you failed 2 tests of 3.
t/011_crash_recovery.pl .............. Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/3 subtests

which I find rather strange. I have no idea why splitting a catalog and
adding a system view should affect recovery like this.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
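The fix described here would presumably look something like this (a sketch;
the exact column order in the committed GRANT may differ):

    -- Add tableoid to the per-column grant so pg_dump's query, which
    -- selects tableoid for dependency tracking, passes the column-level
    -- permission check.
    GRANT SELECT (oid, tableoid, stxrelid, stxname, stxnamespace, stxowner,
                  stxkeys, stxkind)
        ON pg_statistic_ext TO public;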
On Sun, Jun 16, 2019 at 12:03:28PM +0200, Tomas Vondra wrote:
>[...]
>
>OK, the bug here seems to be pretty silly - the GRANT in system_views
>does this:
>
>GRANT SELECT (oid, stxrelid, stxname, stxnamespace, stxowner, stxkeys, stxkind)
>    ON pg_statistic_ext TO public;
>
>but pg_dump also uses the tableoid column. After adding it to the GRANT I
>can run pg_dump as a regular user just fine. I'll push this fix shortly.
>

OK, it seems the buildfarm is getting green again. I'm not sure why I'm
seeing the failure in 011_crash_recovery.pl, while the buildfarm does
not. Perhaps my environment is borked in some way? Not sure.

Anyway, I'm wondering if we need the REVOKE/GRANT on pg_statistic_ext at
all, and if we do, whether we should just remove the list of columns and
grant SELECT on all of them. All the sensitive columns were moved to
another catalog, after all.

That also reminds me the patch should have revoked access to the new
_data catalog.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> Anyway, I'm wondering if we need the REVOKE/GRANT on pg_statistic_ext at
> all, and if we do, whether we should just remove the list of columns and
> grant SELECT on all of them. All the sensitive columns were moved to
> another catalog, after all.

+1.  Per-column permissions checks seem pointlessly inefficient here.

> That also reminds me the patch should have revoked access to the new
> _data catalog.

Uh ... yeah.

			regards, tom lane
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> OK, it seems the buildfarm is getting green again. I'm not sure why I'm
> seeing the failure in 011_crash_recovery.pl, while the buildfarm does
> not. Perhaps my environment is borked in some way? Not sure.

I see that Coelho's two bleeding-edge-clang animals are reporting that
failure too.  Normally I'd just ignore those two; they break pretty
regularly.  Maybe you're using an almost-bleeding-edge clang?

			regards, tom lane
I wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> OK, it seems the buildfarm is getting green again. I'm not sure why I'm
>> seeing the failure in 011_crash_recovery.pl, while the buildfarm does
>> not. Perhaps my environment is borked in some way? Not sure.

> I see that Coelho's two bleeding-edge-clang animals are reporting that
> failure too.  Normally I'd just ignore those two; they break pretty
> regularly.  Maybe you're using an almost-bleeding-edge clang?

Oh --- I managed to reproduce that failure locally on RHEL6 (nothing
bleeding-edge anywhere), but it went away after make-clean-and-rebuild.
I'm suspicious that the underlying cause was failure to recompile
something that knows the value of CATALOG_VERSION_NO.

This theory is insufficient to explain why Coelho's animals failed,
though.  Maybe it would, if you also posit a ccache screwup?  But it's
not obvious from their configurations that they're using ccache at all.

			regards, tom lane
Hello Tom,

>> I see that Coelho's two bleeding-edge-clang animals are reporting that
>> failure too.  Normally I'd just ignore those two;

And you'd be right.

>> they break pretty regularly.  Maybe you're using an
>> almost-bleeding-edge clang?
>
> Oh --- I managed to reproduce that failure locally on RHEL6 (nothing
> bleeding-edge anywhere), but it went away after make-clean-and-rebuild.
> I'm suspicious that the underlying cause was failure to recompile
> something that knows the value of CATALOG_VERSION_NO.

Hmmm. My animals compile out of the source tree, not sure how this could
happen.

> This theory is insufficient to explain why Coelho's animals failed,
> though.  Maybe it would, if you also posit a ccache screwup?  But it's
> not obvious from their configurations that they're using ccache at all.

Indeed, no ccache. The compilers change often (well, once a week), I do
not want anything kept across compiler versions. The animals are really
testing the compilers as much as they are testing postgres.

However, there is an autoconf cache, but I do not see why it would have
such an effect.

So no clue.

-- 
Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> This theory is insufficient to explain why Coelho's animals failed,
>> though.  Maybe it would, if you also posit a ccache screwup?  But it's
>> not obvious from their configurations that they're using ccache at all.

> Indeed, no ccache. The compilers change often (well, once a week), I do
> not want anything kept across compiler versions. The animals are really
> testing the compilers as much as they are testing postgres.

> However, there is an autoconf cache, but I do not see why it would have
> such an effect.

> So no clue.

And now they're both green again, so even less clue.  Oh well.
On another day I'd be interested to understand exactly what happened
there, but right now I lack the time or energy to look closer.

(I do still have a suspicion that it was somehow caused by the
catversion bumps ...)

			regards, tom lane
On Sun, Jun 16, 2019 at 01:52:03PM -0400, Tom Lane wrote:
>Fabien COELHO <coelho@cri.ensmp.fr> writes:
>>[...]
>> So no clue.
>
>And now they're both green again, so even less clue.  Oh well.
>On another day I'd be interested to understand exactly what happened
>there, but right now I lack the time or energy to look closer.
>
>(I do still have a suspicion that it was somehow caused by the
>catversion bumps ...)
>

FWIW I'm not using anything bleeding edge either (gcc from fedora 29,
8.3.1 to be exact), no ccache. And I can't reproduce it anymore, so I
agree it might have been some stale state after the catversion bump.
Although I usually do make-clean before running tests ...

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services