Re: Collecting statistics about contents of JSONB columns - Mailing list pgsql-hackers

From	Justin Pryzby
Subject	Re: Collecting statistics about contents of JSONB columns
Date	April 8, 2022 03:31:22
Msg-id	20220408003122.GF24419@telsasoft.com
In response to	Re: Collecting statistics about contents of JSONB columns (Nikita Glukhov <n.gluhov@postgrespro.ru>)
List	pgsql-hackers

I noticed some typos.

diff --git a/src/backend/utils/adt/jsonb_selfuncs.c b/src/backend/utils/adt/jsonb_selfuncs.c
index f5520f88a1d..d98cd7020a1 100644
--- a/src/backend/utils/adt/jsonb_selfuncs.c
+++ b/src/backend/utils/adt/jsonb_selfuncs.c
@@ -1342,7 +1342,7 @@ jsonSelectivityContains(JsonStats stats, Jsonb *jb)
                     path->stats = jsonStatsFindPath(stats, pathstr.data,
                                                     pathstr.len);
 
-                /* Appeend path string entry for array elements, get stats. */
+                /* Append path string entry for array elements, get stats. */
                 jsonPathAppendEntry(&pathstr, NULL);
                 pstats = jsonStatsFindPath(stats, pathstr.data, pathstr.len);
                 freq = jsonPathStatsGetFreq(pstats, 0.0);
@@ -1367,7 +1367,7 @@ jsonSelectivityContains(JsonStats stats, Jsonb *jb)
             case WJB_END_ARRAY:
             {
                 struct Path *p = path;
-                /* Absoulte selectivity of the path with its all subpaths */
+                /* Absolute selectivity of the path with its all subpaths */
                 Selectivity abs_sel = p->sel * p->freq;
 
                 /* Pop last path entry */
diff --git a/src/backend/utils/adt/jsonb_typanalyze.c b/src/backend/utils/adt/jsonb_typanalyze.c
index 7882db23a87..9a759aadafb 100644
--- a/src/backend/utils/adt/jsonb_typanalyze.c
+++ b/src/backend/utils/adt/jsonb_typanalyze.c
@@ -123,10 +123,9 @@ typedef struct JsonScalarStats
 /*
  * Statistics calculated for a set of values.
  *
- *
  * XXX This seems rather complicated and needs simplification. We're not
  * really using all the various JsonScalarStats bits, there's a lot of
- * duplication (e.g. each JsonScalarStats contains it's own array, which
+ * duplication (e.g. each JsonScalarStats contains its own array, which
  * has a copy of data from the one in "jsons").
  */
 typedef struct JsonValueStats
@@ -849,7 +848,7 @@ jsonAnalyzePathValues(JsonAnalyzeContext *ctx, JsonScalarStats *sstats,
     stats->stanullfrac = (float4)(1.0 - freq);
 
     /*
-     * Similarly, we need to correct the MCV frequencies, becuse those are
+     * Similarly, we need to correct the MCV frequencies, because those are
      * also calculated only from the non-null values. All we need to do is
      * simply multiply that with the non-NULL frequency.
      */
@@ -1015,7 +1014,7 @@ jsonAnalyzeBuildPathStats(JsonPathAnlStats *pstats)
 
     /*
      * We keep array length stats here for queries like jsonpath '$.size() > 5'.
-     * Object lengths stats can be useful for other query lanuages.
+     * Object lengths stats can be useful for other query languages.
      */
     if (vstats->arrlens.values.count)
         jsonAnalyzeMakeScalarStats(&ps, "array_length", &vstats->arrlens.stats);
@@ -1069,7 +1068,7 @@ jsonAnalyzeCalcPathFreq(JsonAnalyzeContext *ctx, JsonPathAnlStats *pstats,
  * We're done with accumulating values for this path, so calculate the
  * statistics for the various arrays.
  *
- * XXX I wonder if we could introduce some simple heuristict on which
+ * XXX I wonder if we could introduce some simple heuristic on which
  * paths to keep, similarly to what we do for MCV lists. For example a
  * path that occurred just once is not very interesting, so we could
  * decide to ignore it and not build the stats. Although that won't
@@ -1414,7 +1413,7 @@ compute_json_stats(VacAttrStats *stats, AnalyzeAttrFetchFunc fetchfunc,
 
     /*
      * Collect and analyze JSON path values in single or multiple passes.
-     * Sigle-pass collection is faster but consumes much more memory than
+     * Single-pass collection is faster but consumes much more memory than
      * collecting and analyzing by the one path at pass.
      */
     if (ctx.single_pass)
