Re: Removing unneeded self joins - Mailing list pgsql-hackers
From | Jaime Casanova |
---|---|
Subject | Re: Removing unneeded self joins |
Date | |
Msg-id | Yh1kTUtj7+S4Wygy@ahch-to Whole thread Raw |
In response to | Re: Removing unneeded self joins (Andrey Lepikhov <a.lepikhov@postgrespro.ru>) |
List | pgsql-hackers |
On Thu, Jul 15, 2021 at 05:49:11PM +0300, Andrey Lepikhov wrote: > On 6/7/21 13:49, Hywel Carver wrote: > > On Mon, Jul 5, 2021 at 2:20 PM Andrey Lepikhov > > <a.lepikhov@postgrespro.ru <mailto:a.lepikhov@postgrespro.ru>> wrote: > > Looking through the email chain, a previous version of this patch added > > ~0.6% to planning time in the worst case tested - does that meet the > > "essentially free" requirement? > I think these tests weren't full coverage of possible use cases. It will > depend on a number of relations in the query. For the JOIN of partitioned > tables, for example, the overhead could grow. But in the context of overall > planning time this overhead will be small till the large number of > relations. > Also, we made this feature optional to solve possible problems. > Rebased on 768ea9bcf9 > I made some tests in a machine with 16 cores and 32GB of RAM. So we can see if this is an improvement. This is what I found: +-----------------------+----------+-----------+-----------+-------+-----------+-------+ | test | mode | master | enabled | % | disabled | % | +-----------------------+----------+-----------+-----------+-------+-----------+-------+ | pgbench read only | standard | 64418.13 | 63942.94 | -0.74 | 62231.38 | -3.39 | | pgbench read only | prepared | 108463.51 | 107002.13 | -1.35 | 100960.83 | -6.92 | | pgbench read only | extended | 55409.65 | 56427.63 | 1.84 | 55927.62 | 0.93 | +-----------------------+----------+-----------+-----------+-------+-----------+-------+ | pgbench read/write | standard | 9374.91 | 9135.21 | -2.56 | 8840.68 | -5.70 | | pgbench read/write | prepared | 11849.86 | 11672.23 | -1.50 | 11393.39 | -3.85 | | pgbench read/write | extended | 7976.80 | 7947.07 | -0.37 | 7788.99 | -2.35 | +-----------------------+----------+-----------+-----------+-------+-----------+-------+ | select non optimize 1 | standard | 80.97 | 81.29 | 0.40 | 81.30 | 0.41 | | select non optimize 1 | prepared | 81.29 | 81.28 | -0.01 | 80.89 | -0.49 | | select non optimize 1 | extended | 81.07 | 80.81 | -0.32 | 80.98 | -0.11 | +-----------------------+----------+-----------+-----------+-------+-----------+-------+ | select optimized 1 | standard | 15.84 | 13.90 |-12.25 | 15.80 | -0.25 | | select optimized 1 | prepared | 15.24 | 13.82 | -9.32 | 15.55 | 2.03 | | select optimized 1 | extended | 15.38 | 13.89 | -9.69 | 15.59 | 1.37 | +-----------------------+----------+-----------+-----------+-------+-----------+-------+ | select optimized 2 | standard | 10204.91 | 10818.39 | 6.01 | 10261.07 | 0.55 | | select optimized 2 | prepared | 13284.06 | 15579.33 | 17.28 | 13116.22 | -1.26 | | select optimized 2 | extended | 10143.43 | 10645.23 | 4.95 | 10142.77 | -0.01 | +-----------------------+----------+-----------+-----------+-------+-----------+-------+ | select shoe | standard | 5645.28 | 5661.71 | 0.29 | 6180.60 | 9.48 | | select shoe | prepared | 9660.45 | 9602.37 | -0.60 | 9894.82 | 2.43 | | select shoe | extended | 5666.47 | 5634.10 | -0.57 | 5757.26 | 1.60 | +-----------------------+----------+-----------+-----------+-------+-----------+-------+ Obviously the pgbench runs are from the standard script. The numbers are not clear for me, I can see improvementes with the patch only in one case and, for some reason, if I disable the patch (enable_self_join_removal='off') I still see a regression in normal cases and curiosly an improvement in one case. I'm attaching the queries. I used the users table that is down-thread and loaded with ~200k rows using: insert into users select seq, case when random() < 0.2 then null else random() * 1000 end, random() * 10000 from generate_series(1, 1000000) seq on conflict (nullable_int) do nothing; for master I just dumped the data from the table and loaded it. I'm also attaching the queries I used. After this tests, I'm not convinced this is actually providing something performance-wise. At least not in its current state. -- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
Attachment
pgsql-hackers by date: