Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning
Date
Msg-id 202501311208.mihe6qhpw2fq@alvherre.pgsql
Whole thread Raw
In response to Reducing memory consumed by RestrictInfo list translations in partitionwise join planning  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
List pgsql-hackers
On 2024-Nov-25, Ashutosh Bapat wrote:

> Hmm, I am doing something similar to what you are doing. Here are my
> scripts.  setup.sql - creates partitioned table, and functions, tables
> used to run the benchmark benchmark.sh - creates queries with all
> combinations of enable_partitionwise_join, number of partitions, joins
> etc. and runs EXPLAIN on each query recording the results in a table.

I was curious about this still being alive and uncommitted, so I
wondered if Dmitry was on to something about this patch not improving
things.  So I tried to rerun Ashutosh's benchmark (of course, on a build
with no C assertions, otherwise the numbers are meaningless).  First,
the patches still apply to current master.  Turns on that they improve
planning times not insignificantly in the majority of cases (not all).
This is on my laptop and I didn't do anything in particular to keep it
stable, though.

I don't know about this modern googol sheets thing you're talking about,
so I use \crosstabview like my cavemen forefathers did.  After running
benchmark.sh (note: it needs "-X" on the psql lines, otherwise it gets
all confused by funny stuff in my .psqlrc) to get the SQL to run, I
obtain a summary table with this psql query:

select code_tag, num_parts, num_joins,
  format ('n=%s avg=%s dev=%s', count(planning_time_ms) filter (where pwj),
          (avg(planning_time_ms) filter (where pwj))::numeric(6, 2),
          (stddev(planning_time_ms) filter (where pwj))::numeric(6,2))
from msmts where code_tag = 'master'
group by code_tag, num_parts, num_joins
order by 1, 2, 3 \crosstabview 2 3 4

Here's the tables I got.

PWJ on, master:
 num_parts │             2             │             3              │              4              │              5
       
 

───────────┼───────────────────────────┼────────────────────────────┼─────────────────────────────┼─────────────────────────────
         0 │ n=10 avg=0.05 dev=0.00    │ n=10 avg=0.16 dev=0.04     │ n=10 avg=0.38 dev=0.10      │ n=10 avg=0.95
dev=0.14
        10 │ n=10 avg=0.41 dev=0.02    │ n=10 avg=1.25 dev=0.09     │ n=10 avg=4.93 dev=0.30      │ n=10 avg=12.12
dev=0.69
       100 │ n=10 avg=4.68 dev=0.40    │ n=10 avg=21.34 dev=1.04    │ n=10 avg=65.09 dev=1.91     │ n=10 avg=206.87
dev=3.87
       500 │ n=10 avg=55.11 dev=1.43   │ n=10 avg=240.97 dev=7.90   │ n=10 avg=834.72 dev=35.67   │ n=10 avg=2534.78
dev=107.28
      1000 │ n=10 avg=242.40 dev=21.09 │ n=10 avg=1085.65 dev=38.11 │ n=10 avg=3161.00 dev=151.04 │ n=10 avg=9634.34
dev=635.57

PWJ on, all patches:
 num_parts │            2             │             3             │              4              │              5
     
 

───────────┼──────────────────────────┼───────────────────────────┼─────────────────────────────┼─────────────────────────────
         0 │ n=10 avg=0.05 dev=0.00   │ n=10 avg=0.12 dev=0.01    │ n=10 avg=0.34 dev=0.01      │ n=10 avg=0.91
dev=0.02
        10 │ n=10 avg=0.37 dev=0.01   │ n=10 avg=1.17 dev=0.07    │ n=10 avg=4.09 dev=0.25      │ n=10 avg=10.31
dev=0.38
       100 │ n=10 avg=4.62 dev=0.14   │ n=10 avg=17.17 dev=0.45   │ n=10 avg=54.05 dev=0.98     │ n=10 avg=178.05
dev=2.69
       500 │ n=10 avg=61.32 dev=1.91  │ n=10 avg=229.54 dev=15.82 │ n=10 avg=701.33 dev=34.16   │ n=10 avg=2176.00
dev=84.28
      1000 │ n=10 avg=195.74 dev=5.73 │ n=10 avg=789.49 dev=16.44 │ n=10 avg=2786.55 dev=254.03 │ n=10 avg=9177.05
dev=467.33


PWJ off, master:
 num_parts │            2             │             3             │             4              │              5
    
 

───────────┼──────────────────────────┼───────────────────────────┼────────────────────────────┼─────────────────────────────
         0 │ n=10 avg=0.06 dev=0.02   │ n=10 avg=0.16 dev=0.04    │ n=10 avg=0.39 dev=0.07     │ n=10 avg=1.08
dev=0.17
        10 │ n=10 avg=0.27 dev=0.03   │ n=10 avg=0.54 dev=0.01    │ n=10 avg=1.05 dev=0.03     │ n=10 avg=2.09
dev=0.07
       100 │ n=10 avg=5.17 dev=2.45   │ n=10 avg=8.96 dev=0.14    │ n=10 avg=17.25 dev=0.29    │ n=10 avg=36.11
dev=1.06
       500 │ n=10 avg=46.82 dev=1.84  │ n=10 avg=149.06 dev=2.79  │ n=10 avg=396.95 dev=26.15  │ n=10 avg=912.93
dev=31.78
      1000 │ n=10 avg=219.86 dev=5.21 │ n=10 avg=697.27 dev=14.96 │ n=10 avg=1925.81 dev=65.78 │ n=10 avg=4857.81
dev=248.71


PWJ off, allpatches:
 num_parts │             2             │             3             │             4              │              5
     
 

───────────┼───────────────────────────┼───────────────────────────┼────────────────────────────┼─────────────────────────────
         0 │ n=10 avg=0.06 dev=0.01    │ n=10 avg=0.13 dev=0.02    │ n=10 avg=0.34 dev=0.02     │ n=10 avg=0.95
dev=0.06
        10 │ n=10 avg=0.25 dev=0.02    │ n=10 avg=0.52 dev=0.01    │ n=10 avg=0.96 dev=0.01     │ n=10 avg=1.86
dev=0.01
       100 │ n=10 avg=5.43 dev=2.37    │ n=10 avg=7.30 dev=0.16    │ n=10 avg=12.93 dev=0.35    │ n=10 avg=24.56
dev=0.49
       500 │ n=10 avg=50.10 dev=2.35   │ n=10 avg=156.04 dev=11.05 │ n=10 avg=332.48 dev=17.44  │ n=10 avg=711.22
dev=21.44
      1000 │ n=10 avg=174.02 dev=15.26 │ n=10 avg=567.23 dev=8.81  │ n=10 avg=1480.75 dev=45.04 │ n=10 avg=3578.19
dev=240.20


So it looks to me like for high number of partitions and joins, this
wins hands down in terms of planning time.  For some of the 2/3 joins
and 100/500 partitions, it loses.


As for planner memory, which this was supposed to improve, I don't find
any significant improvement, except for 1000 partitions and 5 joins
(where it goes from 1071991 to 874480 kilobytes); IMO it's not worth
framing this improvement from that point of view, because it doesn't
seem compelling, at least to me.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Hay dos momentos en la vida de un hombre en los que no debería
especular: cuando puede permitírselo y cuando no puede" (Mark Twain)



pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: jsonlog missing from logging_collector description
Next
From: Antonin Houska
Date:
Subject: Re: why there is not VACUUM FULL CONCURRENTLY?