Re: Removing unneeded self joins - Mailing list pgsql-hackers
From | Andrei Lepikhov |
---|---|
Subject | Re: Removing unneeded self joins |
Date | |
Msg-id | 8800ccf2-ae82-461c-92a1-d4c0a22f8cf2@postgrespro.ru Whole thread Raw |
In response to | Re: Removing unneeded self joins (Alexander Korotkov <aekorotkov@gmail.com>) |
Responses |
Re: Removing unneeded self joins
|
List | pgsql-hackers |
On 12/10/2023 18:32, Alexander Korotkov wrote: > On Thu, Oct 5, 2023 at 12:17 PM Andrei Lepikhov > <a.lepikhov@postgrespro.ru> wrote: >> On 4/10/2023 14:34, Alexander Korotkov wrote: >>> > Relid replacement machinery is the most contradictory code here. We used >>> > a utilitarian approach and implemented a simplistic variant. >>> >>> > > 2) It would be nice to skip the insertion of IS NOT NULL checks when >>> > > they are not necessary. [1] points that infrastructure from [2] might >>> > > be useful. The patchset from [2] seems committed mow. However, I >>> > > can't see it is directly helpful in this matter. Could we just skip >>> > > adding IS NOT NULL clause for the columns, that have >>> > > pg_attribute.attnotnull set? >>> > Thanks for the links, I will look into that case. >> To be more precise, in the attachment, you can find a diff to the main >> patch, which shows the volume of changes to achieve the desired behaviour. >> Some explains in regression tests shifted. So, I've made additional tests: >> >> DROP TABLE test CASCADE; >> CREATE TABLE test (a int, b int not null); >> CREATE UNIQUE INDEX abc ON test(b); >> explain SELECT * FROM test t1 JOIN test t2 ON (t1.a=t2.a) >> WHERE t1.b=t2.b; >> CREATE UNIQUE INDEX abc1 ON test(a,b); >> explain SELECT * FROM test t1 JOIN test t2 ON (t1.a=t2.a) >> WHERE t1.b=t2.b; >> explain SELECT * FROM test t1 JOIN test t2 ON (t1.a=t2.a) >> WHERE t1.b=t2.b AND (t1.a=t2.a OR t2.a=t1.a); >> DROP INDEX abc1; >> explain SELECT * FROM test t1 JOIN test t2 ON (t1.a=t2.a) >> WHERE t1.b=t2.b AND (t1.b=t2.b OR t2.b=t1.b); >> >> We have almost the results we wanted to have. But in the last explain >> you can see that nothing happened with the OR clause. We should use the >> expression mutator instead of walker to handle such clauses. But It >> doesn't process the RestrictInfo node ... I'm inclined to put a solution >> of this issue off for a while. > > OK. I think it doesn't worth to eliminate IS NULL quals with this > complexity (at least at this stage of work). > > I made improvements over the code. Mostly new comments, grammar > corrections of existing comments and small refactoring. > > Also, I found that the suggestion from David Rowley [1] to qsort > array of relations to faster find duplicates is still unaddressed. > I've implemented it. That helps to evade quadratic complexity with > large number of relations. > > Also I've incorporated improvements from Alena Rybakina except one for > skipping SJ removal when no SJ quals is found. It's not yet clear for > me if this check fix some cases. But at least optimization got skipped > in some useful cases (as you can see in regression tests). I would like to propose one more minor improvement (see in attachment). The idea here is that after removing a self-join and changing clauses we should re-probe the set of relids with the same Oid, because we can find more removable self-joins (see the demo test in join.sql). -- regards, Andrey Lepikhov Postgres Professional
Attachment
pgsql-hackers by date: