Re: Collecting statistics about contents of JSONB columns - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Collecting statistics about contents of JSONB columns
Date
Msg-id 15d637bb-900e-e67b-361b-d70a79c324d7@enterprisedb.com
In response to Re: Collecting statistics about contents of JSONB columns  (Mahendra Singh Thalor <mahi6run@gmail.com>)
Responses Re: Collecting statistics about contents of JSONB columns  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Re: Collecting statistics about contents of JSONB columns  (Nikita Glukhov <n.gluhov@postgrespro.ru>)
List pgsql-hackers

On 1/25/22 17:56, Mahendra Singh Thalor wrote:
 >
> ...
> 
> For the last few days, I was trying to understand these patches, and 
> based on Tomas's suggestion, I was doing some performance tests.
> 
> With the attached .SQL file, I can see that analyze is taking more time 
> with these patches.
> 
> *Setup:*
> autovacuum=off
> all other settings are defaults.
> 
> Run the attached file with and without the patch to compare the time 
> taken by ANALYZE.
> 
> *With json patches:*
> postgres=# analyze test ;
> ANALYZE
> Time: *28464.062 ms (00:28.464)*
> postgres=#
> postgres=# SELECT pg_size_pretty( 
> pg_total_relation_size('pg_catalog.pg_statistic') );
>   pg_size_pretty
> ----------------
>   328 kB
> (1 row)
> -- 
> 
> *Without json patches:*
> postgres=# analyze test ;
> ANALYZE
> *Time: 120.864* ms
> postgres=# SELECT pg_size_pretty( 
> pg_total_relation_size('pg_catalog.pg_statistic') );
>   pg_size_pretty
> ----------------
>   272 kB
> 
> I haven't found the root cause of this, but I feel this time is spent 
> looping over all the paths.
> In my test data, there are 951 distinct paths in total. While doing the 
> analysis, we first scan all the sample rows (30000) and collect all the 
> distinct paths (here 951), and after that, for every single path, we 
> loop over all the sample rows again to collect stats for that particular 
> path. I feel these loops might be taking the time.
> 
> I will run perf and will try to find out the root cause of this.
> 

Thanks, I've been doing some performance tests too, and you're right, it 
takes quite a bit of time. I wanted to compare how the timing changes 
with the complexity of the JSON documents (nesting, number of keys, ...), 
so I wrote a simple Python script to generate random JSON documents with 
different parameters - see the attached json-generate.py script.

It's a bit crude, but it generates synthetic documents with a chosen 
number of levels, keys per level, distinct key values, etc. The 
generated documents are loaded directly into a "json_test" database, 
into a table "test_table" with a single jsonb column called "v". 
Tweaking it to connect to a different database, or to just dump the 
generated documents to a file, should be trivial.
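
Just to give an idea what the generator does, here's a rough sketch of 
the logic in Python (this is not the attached json-generate.py, just an 
illustration - the parameter names and value ranges are made up):

  import json
  import random

  def generate_doc(levels, keys_per_level, distinct_keys):
      # one random document with the given nesting / key counts
      # (duplicate keys may collide, which is fine for a synthetic test)
      doc = {}
      for _ in range(keys_per_level):
          key = "key_%d" % random.randrange(distinct_keys)
          if levels > 1:
              doc[key] = generate_doc(levels - 1, keys_per_level, distinct_keys)
          else:
              doc[key] = random.randrange(1000)
      return doc

  if __name__ == "__main__":
      # e.g. 30000 documents, 2 levels, 8 keys per level, 1000 distinct keys
      for _ in range(30000):
          print(json.dumps(generate_doc(2, 8, 1000)))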

The attached bash script runs the data generator for a couple of 
combinations, and then measures how long it takes to analyze the table, 
how large the statistics are (in a rather crude way), etc.
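
The measurement itself is nothing fancy - roughly something like this 
sketch using psycopg2 (the dbname/table are the ones the generator uses; 
the pg_statistic query is just one crude way to estimate the size):

  import time
  import psycopg2

  conn = psycopg2.connect(dbname="json_test")
  conn.autocommit = True
  cur = conn.cursor()

  # time the ANALYZE itself
  start = time.time()
  cur.execute("ANALYZE test_table")
  analyze_ms = (time.time() - start) * 1000.0

  # crude size estimate: length of the textual dump of the pg_statistic rows
  cur.execute("""
      SELECT coalesce(sum(length(s::text)), 0)
        FROM pg_statistic s
       WHERE starelid = 'test_table'::regclass
  """)
  stats_size = cur.fetchone()[0]

  print("analyze: %.0f ms, statistics: %d bytes" % (analyze_ms, stats_size))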

The results look like this (the last two columns are analyze duration in 
milliseconds, for master and with the patch):

   levels   keys  unique keys    paths      master    patched
   ----------------------------------------------------------
        1      1            1        2         153        122
        1      1         1000     1001         134       1590
        1      8            8        9         157        367
        1      8         1000     1001         155       1838
        1     64           64       65         189       2298
        1     64         1000     1001          46       9322
        2      1            1        3         237        197
        2      1         1000    30580         152      46468
        2      8            8       73         245       1780

So yeah, it's significantly slower - in most cases not as much as you 
observed, but an order of magnitude slower than master. For the size of 
the statistics, it's similar:

   levels   keys  unique keys   paths  table size   master    patched
   ------------------------------------------------------------------
        1      1            1       2     1843200    16360      24325
        1      1         1000    1001     1843200    16819    1425400
        1      8            8       9     4710400    28948      88837
        1      8         1000    1001     6504448    42694    3915802
        1     64           64      65    30154752   209713     689842
        1     64         1000    1001    49086464     1093    7755214
        2      1            1       3     2572288    24883      48727
        2      1         1000   30580     2572288    11422   26396365
        2      8            8      73    23068672   164785     862258

This is measured by dumping pg_statistic for the column, so in the 
database it might be compressed etc. It's larger, but that's somewhat 
expected because we simply store more detailed stats. The size grows 
with the number of paths extracted - which is expected, of course.

If you're wondering why this doesn't show data for additional 
combinations (e.g. 2 levels, 8 keys and 1000 distinct key values), 
that's the bad news - it takes ages (multiple minutes) and then gets 
killed by the OOM killer because it eats gigabytes of memory.

I agree the slowness is largely due to extracting all paths and then 
processing them one by one - which means we have to loop over the tuples 
over and over. In this case there are about 850k distinct paths 
extracted, so we do ~850k loops over 30k tuples. That's gotta take time.
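
In pseudo-code, the current behavior is roughly this (a simplified Python 
illustration of the two passes, not the actual C code in the patch):

  def collect_paths(doc, prefix=()):
      # yield every path present in the (nested) document
      for key, val in doc.items():
          path = prefix + (key,)
          yield path
          if isinstance(val, dict):
              yield from collect_paths(val, path)

  def extract(doc, path):
      # value at 'path', or None if the document doesn't contain it
      cur = doc
      for key in path:
          if not isinstance(cur, dict) or key not in cur:
              return None
          cur = cur[key]
      return cur

  def analyze_naive(sample_rows):
      all_paths = set()
      for doc in sample_rows:                  # pass 1: find distinct paths
          all_paths.update(collect_paths(doc))

      stats = {}
      for path in all_paths:                   # pass 2: one scan per path
          values = [extract(doc, path) for doc in sample_rows]
          stats[path] = sum(v is not None for v in values)  # stand-in for real stats
      return stats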

I don't know what exactly to do about this, but I already mentioned we 
may need to pick a subset of paths to keep, similarly to how we pick 
items for MCV. I mean, if we only saw a path once or twice, it's 
unlikely to be interesting enough to build stats for it. I haven't 
tried, but I'd bet most of the 850k paths might be ignored like this.
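
To illustrate what I mean (the threshold is completely made up, and this 
reuses collect_paths() from the sketch above):

  from collections import Counter

  def pick_paths(sample_rows, min_rows=30):
      # count in how many sample rows each path appears, similarly to how
      # MCV items are picked, and keep only the frequent ones
      counts = Counter()
      for doc in sample_rows:
          counts.update(set(collect_paths(doc)))
      return [path for path, n in counts.items() if n >= min_rows]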

The other thing we might do is make the loops more efficient. For 
example, we might track which documents contain each path (with a small 
bitmap or something), so that in the loop we can skip rows that don't 
contain the path we're currently processing. Or something like that.
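
For illustration (again just a Python sketch, reusing the helpers from 
above):

  def analyze_with_row_sets(sample_rows):
      # remember which sample rows contain each path, so the per-path pass
      # only visits those rows instead of scanning all of them every time
      rows_for_path = {}                       # path -> set of row numbers
      for rownum, doc in enumerate(sample_rows):
          for path in set(collect_paths(doc)):
              rows_for_path.setdefault(path, set()).add(rownum)

      stats = {}
      for path, rownums in rows_for_path.items():
          values = [extract(sample_rows[r], path) for r in rownums]
          stats[path] = len(values)            # stand-in for real per-path stats
      return stats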

Of course, this can't eliminate all the overhead - we're building more 
stats and that has a cost. In the "common" case of a stable "fixed" 
schema with the same paths in all documents, we'll still need to do the 
loop for each of them. So it's bound to be slower than master.

Which probably means it's a bad idea to do this for all JSONB columns, 
because in many cases the extra stats are not needed, so the extra 
analyze time would be a waste. So I guess we'll need some way to enable 
this only for selected columns ... I argued against the idea of 
implementing this as extended statistics in the first message, but it's 
a reasonably nice way to do such stuff (expression stats are a precedent).


> Apart from this performance issue, I haven't found any crashes or issues.
> 

Well, I haven't seen any crashes either, but as I mentioned, for complex 
documents (2 levels, many distinct keys) ANALYZE starts consuming a lot 
of memory and may get killed by the OOM killer. For example, if you 
generate documents like this

  ./json-generate.py 30000 2 8 1000 6 1000

and then run ANALYZE, that'll take ages and it very quickly gets into a 
situation like this (generated from gdb by calling MemoryContextStats on 
TopMemoryContext):


-------------------------------------------------------------------------
TopMemoryContext: 80776 total in 6 blocks; 13992 free (18 chunks); 66784 used
   ...
   TopPortalContext: 8192 total in 1 blocks; 7656 free (0 chunks); 536 used
     PortalContext: 1024 total in 1 blocks; 488 free (0 chunks); 536 used: <unnamed>
       Analyze: 472726496 total in 150 blocks; 3725776 free (4 chunks); 469000720 used
         Analyze Column: 921177696 total in 120 blocks; 5123256 free (238 chunks); 916054440 used
           Json Analyze Tmp Context: 8192 total in 1 blocks; 5720 free (1 chunks); 2472 used
             Json Analyze Pass Context: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
           JSON analyze path table: 1639706040 total in 25084 blocks; 1513640 free (33 chunks); 1638192400 used
       Vacuum: 8192 total in 1 blocks; 7448 free (0 chunks); 744 used
...
Grand total: 3035316184 bytes in 25542 blocks; 10971120 free (352 chunks); 3024345064 used
-------------------------------------------------------------------------


Yes, that's the backend using 3GB of memory, out of which 1.6GB is in 
the "analyze path table" context, 400MB in "analyze" and 900MB in 
"analyze column" contexts. I mean, that seems a bit excessive. And it 
grows over time, so after a while my laptop gives up and kills the 
backend.

I'm not sure if it's a memory leak (which would be fixable), or whether 
it's due to keeping stats for all the extracted paths. I mean, in this 
particular case we have 850k paths - even if the stats are just 1kB per 
path, that's 850MB. This requires more investigation.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company