Thread: pgsql: Rework the pg_statistic_ext catalog

pgsql: Rework the pg_statistic_ext catalog

From
Tomas Vondra
Date:
Rework the pg_statistic_ext catalog

Since extended statistic got introduced in PostgreSQL 10, there was a
single catalog pg_statistic_ext storing both the definitions and built
statistic.  That's however problematic when a user is supposed to have
access only to the definitions, but not to user data.

Consider for example pg_dump on a database with RLS enabled - if the
pg_statistic_ext catalog respects RLS (which it should, if it contains
user data), pg_dump would not see any records and the result would not
define any extended statistics.  That would be a surprising behavior.

Until now this was not a pressing issue, because the existing types of
extended statistic (functional dependencies and ndistinct coefficients)
do not include any user data directly.  This changed with introduction
of MCV lists, which do include most common combinations of values.

The easiest way to fix this is to split the pg_statistic_ext catalog
into two - one for definitions, one for the built statistic values.
The new catalog is called pg_statistic_ext_data, and we're maintaining
a 1:1 relationship with the old catalog - either there are matching
records in both catalogs, or neither of them.

Bumped CATVERSION due to changing system catalog definitions.

Author: Dean Rasheed, with improvements by me
Reviewed-by: Dean Rasheed, John Naylor
Discussion: https://postgr.es/m/CAEZATCUhT9rt7Ui%3DVdx4N%3D%3DVV5XOK5dsXfnGgVOz_JhAicB%3DZA%40mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/6cbfb784c3c91146148a76d50cda6f69ae6a79fb

Modified Files
--------------
doc/src/sgml/catalogs.sgml                  | 70 ++++++++++++++++++++++-----
doc/src/sgml/func.sgml                      |  6 +--
doc/src/sgml/perform.sgml                   | 20 ++++----
doc/src/sgml/planstats.sgml                 |  4 +-
src/backend/catalog/Makefile                |  2 +-
src/backend/commands/statscmds.c            | 73 ++++++++++++++++++++++-------
src/backend/optimizer/util/plancat.c        | 12 +++--
src/backend/statistics/README.mcv           |  9 ++--
src/backend/statistics/dependencies.c       |  7 +--
src/backend/statistics/extended_stats.c     | 61 +++++++++++++-----------
src/backend/statistics/mcv.c                |  7 +--
src/backend/statistics/mvdistinct.c         |  7 +--
src/backend/utils/cache/syscache.c          | 12 +++++
src/include/catalog/catversion.h            |  2 +-
src/include/catalog/indexing.h              | 17 ++++---
src/include/catalog/pg_statistic_ext.h      |  9 ++--
src/include/catalog/pg_statistic_ext_data.h | 52 ++++++++++++++++++++
src/include/catalog/toasting.h              |  1 +
src/include/utils/syscache.h                |  1 +
src/test/regress/expected/oidjoins.out      |  8 ++++
src/test/regress/expected/sanity_check.out  |  1 +
src/test/regress/expected/stats_ext.out     | 38 ++++++++++-----
src/test/regress/sql/oidjoins.sql           |  4 ++
src/test/regress/sql/stats_ext.sql          | 30 ++++++++----
src/tools/pgindent/typedefs.list            |  2 +
25 files changed, 337 insertions(+), 118 deletions(-)


Re: pgsql: Rework the pg_statistic_ext catalog

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@postgresql.org> writes:
> Rework the pg_statistic_ext catalog

So ... not one of the buildfarm members that are running TAP tests
likes this.  The failures look like

# Running: pg_dump --no-sync --file=/Users/tgl/pgsql/src/bin/pg_dump/tmp_check/tmp_test_cSh8/role.sql
--role=regress_dump_test_role--schema=dump_test_second_schema postgres 
pg_dump: error: query failed: ERROR:  permission denied for table pg_statistic_ext
pg_dump: error: query was: SELECT tableoid, oid, stxname, stxnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE
oid= stxowner) AS rolname FROM pg_catalog.pg_statistic_ext 
not ok 4576 - role: pg_dump runs

#   Failed test 'role: pg_dump runs'
#   at t/002_pg_dump.pl line 3418.

I think probably what's happening is that pg_dump is still trying to dump
directly from the catalog, when what it needs to do now is dump from the
view, in case it's not running as superuser.

            regards, tom lane



Re: pgsql: Rework the pg_statistic_ext catalog

From
Tom Lane
Date:
I wrote:
> Tomas Vondra <tomas.vondra@postgresql.org> writes:
>> Rework the pg_statistic_ext catalog

> So ... not one of the buildfarm members that are running TAP tests
> likes this. ...
> I think probably what's happening is that pg_dump is still trying to dump
> directly from the catalog, when what it needs to do now is dump from the
> view, in case it's not running as superuser.

I experimented with extracting the required data from the view, and
there are at least two show-stopper problems:

* The view doesn't expose pg_statistic_ext.oid, which pg_dump has to have
for dependency tracking purposes.  I think we could just add it though.
Now that OIDs are ordinary columns it won't even look very odd.

* Rather than just not exposing the critical data for stats you don't
have privilege to read, the view doesn't expose anything at all.
I do not think that's acceptable; it creates a significant hazard of
data loss during pg_dump, for no very good reason.  What we should
be doing, IMO, is still showing all the rows but filling the data-value
columns with nulls in rows where the caller can't access the underlying
data.

            regards, tom lane



Re: pgsql: Rework the pg_statistic_ext catalog

From
Dean Rasheed
Date:
On Sun, 16 Jun 2019, 03:05 Tom Lane, <tgl@sss.pgh.pa.us> wrote:
I wrote:
> Tomas Vondra <tomas.vondra@postgresql.org> writes:
>> Rework the pg_statistic_ext catalog

> So ... not one of the buildfarm members that are running TAP tests
> likes this. ...
> I think probably what's happening is that pg_dump is still trying to dump
> directly from the catalog, when what it needs to do now is dump from the
> view, in case it's not running as superuser.

I experimented with extracting the required data from the view, and
there are at least two show-stopper problems:

* The view doesn't expose pg_statistic_ext.oid, which pg_dump has to have
for dependency tracking purposes.  I think we could just add it though.
Now that OIDs are ordinary columns it won't even look very odd.

* Rather than just not exposing the critical data for stats you don't
have privilege to read, the view doesn't expose anything at all.
I do not think that's acceptable; it creates a significant hazard of
data loss during pg_dump, for no very good reason.  What we should
be doing, IMO, is still showing all the rows but filling the data-value
columns with nulls in rows where the caller can't access the underlying
data.
 

Hang on. Isn't the real problem that we should be revoking public access from pg_statistic_ext_data, not pg_statistic_ext? Normal users should still be able to see the stats definitions in the catalog table, but not the stats data, so I think pg_dump wouldn't need changing.

Regards,
Dean

Re: pgsql: Rework the pg_statistic_ext catalog

From
Tomas Vondra
Date:
On Sun, Jun 16, 2019 at 07:18:58AM +0100, Dean Rasheed wrote:
>On Sun, 16 Jun 2019, 03:05 Tom Lane, <tgl@sss.pgh.pa.us> wrote:
>
>> I wrote:
>> > Tomas Vondra <tomas.vondra@postgresql.org> writes:
>> >> Rework the pg_statistic_ext catalog
>>
>> > So ... not one of the buildfarm members that are running TAP tests
>> > likes this. ...
>> > I think probably what's happening is that pg_dump is still trying to dump
>> > directly from the catalog, when what it needs to do now is dump from the
>> > view, in case it's not running as superuser.
>>
>> I experimented with extracting the required data from the view, and
>> there are at least two show-stopper problems:
>>
>> * The view doesn't expose pg_statistic_ext.oid, which pg_dump has to have
>> for dependency tracking purposes.  I think we could just add it though.
>> Now that OIDs are ordinary columns it won't even look very odd.
>>
>> * Rather than just not exposing the critical data for stats you don't
>> have privilege to read, the view doesn't expose anything at all.
>> I do not think that's acceptable; it creates a significant hazard of
>> data loss during pg_dump, for no very good reason.  What we should
>> be doing, IMO, is still showing all the rows but filling the data-value
>> columns with nulls in rows where the caller can't access the underlying
>> data.
>>
>>
>
>Hang on. Isn't the real problem that we should be revoking public access
>from pg_statistic_ext_data, not pg_statistic_ext? Normal users should still
>be able to see the stats definitions in the catalog table, but not the
>stats data, so I think pg_dump wouldn't need changing.
>

Yeah, that's pretty much why we split the catalog - to allow pg_dump to
read the definitions, without exposing the stats too. I'll look into
this and push a fix to unbreak the buildfarm (sorry about that).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: pgsql: Rework the pg_statistic_ext catalog

From
Tomas Vondra
Date:
On Sun, Jun 16, 2019 at 10:56:51AM +0200, Tomas Vondra wrote:
>On Sun, Jun 16, 2019 at 07:18:58AM +0100, Dean Rasheed wrote:
>>On Sun, 16 Jun 2019, 03:05 Tom Lane, <tgl@sss.pgh.pa.us> wrote:
>>
>>>I wrote:
>>>> Tomas Vondra <tomas.vondra@postgresql.org> writes:
>>>>> Rework the pg_statistic_ext catalog
>>>
>>>> So ... not one of the buildfarm members that are running TAP tests
>>>> likes this. ...
>>>> I think probably what's happening is that pg_dump is still trying to dump
>>>> directly from the catalog, when what it needs to do now is dump from the
>>>> view, in case it's not running as superuser.
>>>
>>>I experimented with extracting the required data from the view, and
>>>there are at least two show-stopper problems:
>>>
>>>* The view doesn't expose pg_statistic_ext.oid, which pg_dump has to have
>>>for dependency tracking purposes.  I think we could just add it though.
>>>Now that OIDs are ordinary columns it won't even look very odd.
>>>
>>>* Rather than just not exposing the critical data for stats you don't
>>>have privilege to read, the view doesn't expose anything at all.
>>>I do not think that's acceptable; it creates a significant hazard of
>>>data loss during pg_dump, for no very good reason.  What we should
>>>be doing, IMO, is still showing all the rows but filling the data-value
>>>columns with nulls in rows where the caller can't access the underlying
>>>data.
>>>
>>>
>>
>>Hang on. Isn't the real problem that we should be revoking public access
>>from pg_statistic_ext_data, not pg_statistic_ext? Normal users should still
>>be able to see the stats definitions in the catalog table, but not the
>>stats data, so I think pg_dump wouldn't need changing.
>>
>
>Yeah, that's pretty much why we split the catalog - to allow pg_dump to
>read the definitions, without exposing the stats too. I'll look into
>this and push a fix to unbreak the buildfarm (sorry about that).
>

OK, the bug here seems to be pretty silly - the GRANT in system_views
does this:

GRANT SELECT (oid, stxrelid, stxname, stxnamespace, stxowner, stxkeys, stxkind)
    ON pg_statistic_ext TO public;

but pg_dump also uses tableoid column. After adding it to the GRANT I
can run pg_dump using regular user just fine. I'll push this fix
shortly.

But we're not out of the woods yet, it seems. check-world still fails
for me with an error like this:

t/011_crash_recovery.pl .............. 1/3 
#   Failed test 'new xid after restart is greater'
#   at t/011_crash_recovery.pl line 61.
#     '486'
#         >
#     '486'

#   Failed test 'xid is aborted after crash'
#   at t/011_crash_recovery.pl line 65.
#          got: 'committed'
#     expected: 'aborted'
# Looks like you failed 2 tests of 3.
t/011_crash_recovery.pl .............. Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/3 subtests 

which I find rather strange. I have no idea why splitting a catalog and
adding a system view should affect recovery like this.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: pgsql: Rework the pg_statistic_ext catalog

From
Tomas Vondra
Date:
On Sun, Jun 16, 2019 at 12:03:28PM +0200, Tomas Vondra wrote:
>On Sun, Jun 16, 2019 at 10:56:51AM +0200, Tomas Vondra wrote:
>>On Sun, Jun 16, 2019 at 07:18:58AM +0100, Dean Rasheed wrote:
>>>On Sun, 16 Jun 2019, 03:05 Tom Lane, <tgl@sss.pgh.pa.us> wrote:
>>>
>>>>I wrote:
>>>>>Tomas Vondra <tomas.vondra@postgresql.org> writes:
>>>>>>Rework the pg_statistic_ext catalog
>>>>
>>>>>So ... not one of the buildfarm members that are running TAP tests
>>>>>likes this. ...
>>>>>I think probably what's happening is that pg_dump is still trying to dump
>>>>>directly from the catalog, when what it needs to do now is dump from the
>>>>>view, in case it's not running as superuser.
>>>>
>>>>I experimented with extracting the required data from the view, and
>>>>there are at least two show-stopper problems:
>>>>
>>>>* The view doesn't expose pg_statistic_ext.oid, which pg_dump has to have
>>>>for dependency tracking purposes.  I think we could just add it though.
>>>>Now that OIDs are ordinary columns it won't even look very odd.
>>>>
>>>>* Rather than just not exposing the critical data for stats you don't
>>>>have privilege to read, the view doesn't expose anything at all.
>>>>I do not think that's acceptable; it creates a significant hazard of
>>>>data loss during pg_dump, for no very good reason.  What we should
>>>>be doing, IMO, is still showing all the rows but filling the data-value
>>>>columns with nulls in rows where the caller can't access the underlying
>>>>data.
>>>>
>>>>
>>>
>>>Hang on. Isn't the real problem that we should be revoking public access
>>>from pg_statistic_ext_data, not pg_statistic_ext? Normal users should still
>>>be able to see the stats definitions in the catalog table, but not the
>>>stats data, so I think pg_dump wouldn't need changing.
>>>
>>
>>Yeah, that's pretty much why we split the catalog - to allow pg_dump to
>>read the definitions, without exposing the stats too. I'll look into
>>this and push a fix to unbreak the buildfarm (sorry about that).
>>
>
>OK, the bug here seems to be pretty silly - the GRANT in system_views
>does this:
>
>GRANT SELECT (oid, stxrelid, stxname, stxnamespace, stxowner, stxkeys, stxkind)
>   ON pg_statistic_ext TO public;
>
>but pg_dump also uses tableoid column. After adding it to the GRANT I
>can run pg_dump using regular user just fine. I'll push this fix
>shortly.
>

OK, it seems the buildfarm is getting green again. I'm not sure why I'm
seeing the failure in 011_crash_recovery.pl, while the buildfarm does
not. Perhaps my environment is borked in some way? Not sure.

Anyway, I'm wondering if we need the REVOKE/GRANT on pg_statistic_ext at
all, and if we do if we should just remove the list of columns, and
grant SELECT on all. All the sensitive columns were moved to another
catalog, after all.

That also reminds me the patch should have revoked access to the new
_data catalog.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: pgsql: Rework the pg_statistic_ext catalog

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> Anyway, I'm wondering if we need the REVOKE/GRANT on pg_statistic_ext at
> all, and if we do if we should just remove the list of columns, and
> grant SELECT on all. All the sensitive columns were moved to another
> catalog, after all.

+1.  Per-column permissions checks seem pointlessly inefficient here.

> That also reminds me the patch should have revoked access to the new
> _data catalog.

Uh ... yeah.

            regards, tom lane



Re: pgsql: Rework the pg_statistic_ext catalog

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> OK, it seems the buildfarm is getting green again. I'm not sure why I'm
> seeing the failure in 011_crash_recovery.pl, while the buildfarm does
> not. Perhaps my environment is borked in some way? Not sure.

I see that Coelho's two bleeding-edge-clang animals are reporting that
failure too.  Normally I'd just ignore those two; they break pretty
regularly.  Maybe you're using an almost-bleeding-edge clang?

            regards, tom lane



Re: pgsql: Rework the pg_statistic_ext catalog

From
Tom Lane
Date:
I wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> OK, it seems the buildfarm is getting green again. I'm not sure why I'm
>> seeing the failure in 011_crash_recovery.pl, while the buildfarm does
>> not. Perhaps my environment is borked in some way? Not sure.

> I see that Coelho's two bleeding-edge-clang animals are reporting that
> failure too.  Normally I'd just ignore those two; they break pretty
> regularly.  Maybe you're using an almost-bleeding-edge clang?

Oh --- I managed to reproduce that failure locally on RHEL6 (nothing
bleeding-edge anywhere), but it went away after make-clean-and-rebuild.
I'm suspicious that the underlying cause was failure to recompile
something that knows the value of CATALOG_VERSION_NO.

This theory is insufficient to explain why Coelho's animals failed,
though.  Maybe it would, if you also posit a ccache screwup?  But it's
not obvious from their configurations that they're using ccache at all.

            regards, tom lane



Re: pgsql: Rework the pg_statistic_ext catalog

From
Fabien COELHO
Date:
Hello Tom,

>> I see that Coelho's two bleeding-edge-clang animals are reporting that
>> failure too.  Normally I'd just ignore those two;

And you'd be right.

>> they break pretty regularly.  Maybe you're using an 
>> almost-bleeding-edge clang?
>
> Oh --- I managed to reproduce that failure locally on RHEL6 (nothing
> bleeding-edge anywhere), but it went away after make-clean-and-rebuild.

> I'm suspicious that the underlying cause was failure to recompile
> something that knows the value of CATALOG_VERSION_NO.

Hmmm. My animals compile out of the source tree, not sure how this could 
happen.

> This theory is insufficient to explain why Coelho's animals failed,
> though.  Maybe it would, if you also posit a ccache screwup?  But it's
> not obvious from their configurations that they're using ccache at all.

Indeed, no ccache. The compilers change often (well, once a week), I do 
not want anything kept across compiler versions. The animals are really 
testing the compilers as much as they are testing postgres.

However, there is an autoconf cache, but I do not see why it would have 
such an effect.

So no clue.

-- 
Fabien.



Re: pgsql: Rework the pg_statistic_ext catalog

From
Tom Lane
Date:
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> This theory is insufficient to explain why Coelho's animals failed,
>> though.  Maybe it would, if you also posit a ccache screwup?  But it's
>> not obvious from their configurations that they're using ccache at all.

> Indeed, no ccache. The compilers change often (well, once a week), I do 
> not want anything kept across compiler versions. The animals are really 
> testing the compilers as much as they are testing postgres.

> However, there is an autoconf cache, but I do not see why it would have 
> such an effect.

> So no clue.

And now they're both green again, so even less clue.  Oh well.
On another day I'd be interested to understand exactly what happened
there, but right now I lack the time or energy to look closer.

(I do still have a suspicion that it was somehow caused by the
catversion bumps ...)

            regards, tom lane



Re: pgsql: Rework the pg_statistic_ext catalog

From
Tomas Vondra
Date:
On Sun, Jun 16, 2019 at 01:52:03PM -0400, Tom Lane wrote:
>Fabien COELHO <coelho@cri.ensmp.fr> writes:
>>> This theory is insufficient to explain why Coelho's animals failed,
>>> though.  Maybe it would, if you also posit a ccache screwup?  But it's
>>> not obvious from their configurations that they're using ccache at all.
>
>> Indeed, no ccache. The compilers change often (well, once a week), I do
>> not want anything kept across compiler versions. The animals are really
>> testing the compilers as much as they are testing postgres.
>
>> However, there is an autoconf cache, but I do not see why it would have
>> such an effect.
>
>> So no clue.
>
>And now they're both green again, so even less clue.  Oh well.
>On another day I'd be interested to understand exactly what happened
>there, but right now I lack the time or energy to look closer.
>
>(I do still have a suspicion that it was somehow caused by the
>catversion bumps ...)
>

FWIW I'm not using anything bleeding edge either (gcc from fedora 29,
8.3.1 to be exact), no ccache. And I can't reproduce it anymore, so I
agree it might have been some stale state after catversion bump.
Although I usually do make-clean before running tests ...

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services