pgsql: Add input function for data type pg_ndistinct - Mailing list pgsql-committers

From Michael Paquier
Subject pgsql: Add input function for data type pg_ndistinct
Date
Msg-id E1vO48k-001QDn-1r@gemulon.postgresql.org
List pgsql-committers
Add input function for data type pg_ndistinct

pg_ndistinct is used as the data type for the contents of ndistinct
extended statistics.  This new input function consumes the format that
has been established by 1f927cce4498 for the output function of
pg_ndistinct, enforcing some sanity checks on:
- The input object, which should be a one-dimensional array with correct
attributes and values.
- The key names: "attributes", "ndistinct".  Both are required, other
key names are blocked.
- Value types for each key: "attributes" requires an array of integers,
and "ndistinct" an integer.
- List of attributes.  Note that this enforces a check so that each
attribute list has to be a subset of the longest attribute list found.
This does not enforce that a full group of attribute sets exists, based
on how the groups are generated when the ndistinct objects are
generated, making the list of ndistinct items a bit loose.  Note that a
check would still be required at import to see if the attributes listed
match the attribute numbers set in the definition of a statistics object.
- Based on the discussion, the checks on the values are loose, as there
is also an argument for potential stats injection.  The relation and
attribute level stats follow the same line of argument for the values.

This is required for a follow-up patch that aims to implement the import
of extended statistics.  Some tests are added to cover the code paths of
the JSON parser that check the shape of the pg_ndistinct inputs, with 90%
code coverage reached.  The tests are located in their own new test
file, for clarity.

Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Yuefei Shi <shiyuefei1004@gmail.com>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/44eba8f06e5568be35fa3d112ab781e931fe04ae

Modified Files
--------------
src/backend/utils/adt/pg_ndistinct.c       | 768 ++++++++++++++++++++++++++++-
src/test/regress/expected/pg_ndistinct.out | 447 +++++++++++++++++
src/test/regress/parallel_schedule         |   2 +-
src/test/regress/sql/pg_ndistinct.sql      | 106 ++++
src/tools/pgindent/typedefs.list           |   2 +
5 files changed, 1316 insertions(+), 9 deletions(-)

