Re: effect of JIT tuple deform? - Mailing list pgsql-hackers

From Dmitry Dolgov
Subject Re: effect of JIT tuple deform?
Date
Msg-id CA+q6zcVoPA5OJqyzRn6cByzKGnSpFK60X63+MXPA81RTgXQJug@mail.gmail.com
Whole thread Raw
In response to Re: effect of JIT tuple deform?  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: effect of JIT tuple deform?
Re: effect of JIT tuple deform?
List pgsql-hackers
> On 23 June 2018 at 08:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
> 2018-06-23 8:35 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:
>>
>> Hi
>>
>> I try to measure effect of JIT tuple deform and I don't see any possible
>> effect.
>>
>> Is it this feature active in master branch?
>>
>> Is possible to see this feature in EXPLAIN ANALYZE?
>
>
> Unfortunately I got slowdown
>
> 0. shared buffers = 1GB
> 1. create table with 50 int columns
> 2. insert into this table 4M rows

Hi,

Looks like I can reproduce the situation you're talking about (with some minor
differences)

=# explain analyze select sum(data45) from test_deform;
                                QUERY PLAN
-------------------------------------------------------------------------------
 Finalize Aggregate
 (cost=127097.71..127097.72 rows=1 width=8)
 (actual time=813.957..813.957 rows=1 loops=1)
   ->  Gather
       (cost=127097.50..127097.71 rows=2 width=8)
       (actual time=813.946..813.950 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate
             (cost=126097.50..126097.51 rows=1 width=8)
             (actual time=802.585..802.585 rows=1 loops=3)
               ->  Parallel Seq Scan on test_deform
                   (cost=0.00..121930.80 rows=1666680 width=4)
                   (actual time=0.040..207.326 rows=1333333 loops=3)
 Planning Time: 0.212 ms
 JIT:
   Functions: 6
   Generation Time: 3.076 ms
   Inlining: false
   Inlining Time: 0.000 ms
   Optimization: false
   Optimization Time: 1.328 ms
   Emission Time: 8.601 ms
 Execution Time: 820.127 ms
(16 rows)

=# set jit_tuple_deforming to off;

=# explain analyze select sum(data45) from test_deform;
                                QUERY PLAN
-------------------------------------------------------------------------------
 Finalize Aggregate
 (cost=127097.71..127097.72 rows=1 width=8)
 (actual time=742.578..742.578 rows=1 loops=1)
   ->  Gather
       (cost=127097.50..127097.71 rows=2 width=8)
       (actual time=742.529..742.569 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate
             (cost=126097.50..126097.51 rows=1 width=8)
             (actual time=727.876..727.876 rows=1 loops=3)
               ->  Parallel Seq Scan on test_deform
                   (cost=0.00..121930.80 rows=1666680 width=4)
                   (actual time=0.044..204.097 rows=1333333 loops=3)
 Planning Time: 0.361 ms
 JIT:
   Functions: 4
   Generation Time: 2.840 ms
   Inlining: false
   Inlining Time: 0.000 ms
   Optimization: false
   Optimization Time: 0.751 ms
   Emission Time: 6.436 ms
 Execution Time: 749.827 ms
(16 rows)

But at the same time on the bigger dataset (20M rows instead of 4M) the very
same query gets better:

=# explain analyze select sum(data45) from test_deform;
                                QUERY PLAN
-------------------------------------------------------------------------------
 Finalize Aggregate
 (cost=631482.92..631482.93 rows=1 width=8)
 (actual time=2350.288..2350.288 rows=1 loops=1)
   ->  Gather
       (cost=631482.71..631482.92 rows=2 width=8)
       (actual time=2350.276..2350.279 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate
             (cost=630482.71..630482.72 rows=1 width=8)
             (actual time=2328.378..2328.379 rows=1 loops=3)
               ->  Parallel Seq Scan on test_deform
                   (cost=0.00..609649.37 rows=8333337 width=4)
                   (actual time=0.027..1175.960 rows=6666667 loops=3)
 Planning Time: 0.600 ms
 JIT:
   Functions: 6
   Generation Time: 3.661 ms
   Inlining: true
   Inlining Time: 65.373 ms
   Optimization: true
   Optimization Time: 120.885 ms
   Emission Time: 58.836 ms
 Execution Time: 2543.280 ms
(16 rows)

=# set jit_tuple_deforming to off;

=# explain analyze select sum(data45) from test_deform;
                                QUERY PLAN
-------------------------------------------------------------------------------
 Finalize Aggregate
 (cost=631482.92..631482.93 rows=1 width=8)
 (actual time=3616.977..3616.977 rows=1 loops=1)
   ->  Gather
       (cost=631482.71..631482.92 rows=2 width=8)
       (actual time=3616.959..3616.971 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate
             (cost=630482.71..630482.72 rows=1 width=8)
             (actual time=3593.220..3593.220 rows=1 loops=3)
               ->  Parallel Seq Scan on test_deform
                   (cost=0.00..609649.37 rows=8333337 width=4)
                   (actual time=0.049..1027.353 rows=6666667 loops=3)
 Planning Time: 0.149 ms
 JIT:
   Functions: 4
   Generation Time: 1.803 ms
   Inlining: true
   Inlining Time: 23.529 ms
   Optimization: true
   Optimization Time: 18.843 ms
   Emission Time: 9.307 ms
 Execution Time: 3625.674 ms
(16 rows)

`perf diff` indeed shows that in the first case (with the 4M rows dataset) the
jitted version has some noticeable delta for one call, and unfortunately so far
I couldn't figure out which one exactly because of JIT (btw, who can explain
how to see a correct full `perf report` in this case? Somehow `perf
inject --jit -o
perf.data.jitted` and jit_profiling_support didn't help).

But since on the bigger dataset I've got expected results, maybe it's just a
sign that JIT kicks in too early in this case and what's necessary is to adjust
jit_above_cost/jit_optimize_above_cost/jit_inline_above_cost?


pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: Desirability of client-side expressions in psql?
Next
From: Andres Freund
Date:
Subject: Re: effect of JIT tuple deform?