Re: Removing unneeded self joins - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Removing unneeded self joins
Msg-id 20474.1526511341@sss.pgh.pa.us
In response to Re: Removing unneeded self joins  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Removing unneeded self joins  (Andres Freund <andres@anarazel.de>)
Re: Removing unneeded self joins  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
David Rowley <david.rowley@2ndquadrant.com> writes:
> On 17 May 2018 at 10:13, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah.  It'd have to be a very heuristic thing that doesn't account
>> for much beyond the number of relations in the query, and maybe their
>> sizes --- although I don't think we even know the latter at the
>> point where join removal would be desirable.  (And note that one of
>> the desirable benefits of join removal is not having to find out the
>> sizes of removed rels ... so just swapping that around doesn't appeal.)

> There's probably some argument for delaying obtaining the relation
> size until after join removal and probably partition pruning too, but
> it's currently done well before that in build_simple_rel, where the
> RelOptInfo is built.

Yeah, but that's something we ought to fix someday; IMO it's an artifact
of having wedged in remove_useless_joins without doing the extensive
refactoring that'd be needed to do it at a more desirable time.  I don't
want to build user-visible behavior that's dependent on doing that wrong.

(But wait a second ... we could improve this without quite that much work:
instead of doing estimate_rel_size immediately during get_relation_info,
couldn't it be left until the set_base_rel_sizes pass?  Since
RelationGetNumberOfBlocks involves kernel calls, skipping it for removed
rels seems worth doing.)

            regards, tom lane
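
[Editorial sketch, not part of the original message.] The deferral suggested in the parenthetical above can be illustrated with a toy model in C. This is only a sketch under stated assumptions: ToyRel, toy_probe_pages, toy_build_rel_eager, toy_build_rel_lazy and toy_set_base_rel_sizes are hypothetical names invented for illustration and are not PostgreSQL's RelOptInfo, estimate_rel_size, get_relation_info or set_base_rel_sizes. The counter stands in for the kernel calls that RelationGetNumberOfBlocks would make.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a planner relation (hypothetical; not RelOptInfo). */
typedef struct ToyRel
{
    const char *name;
    bool        removed;     /* set by a hypothetical join-removal step */
    bool        size_known;  /* has the size probe been run yet? */
    long        pages;       /* valid only when size_known is true */
} ToyRel;

/* Counts simulated "kernel calls" so the two strategies can be compared. */
static int toy_size_probes = 0;

/* Stand-in for an expensive size lookup; returns an arbitrary page count. */
static long
toy_probe_pages(const ToyRel *rel)
{
    (void) rel;
    toy_size_probes++;
    return 100;
}

/* Eager variant: the size is probed as soon as the rel is built. */
static void
toy_build_rel_eager(ToyRel *rel, const char *name)
{
    rel->name = name;
    rel->removed = false;
    rel->pages = toy_probe_pages(rel);
    rel->size_known = true;
}

/* Lazy variant: building the rel does not touch the size at all. */
static void
toy_build_rel_lazy(ToyRel *rel, const char *name)
{
    rel->name = name;
    rel->removed = false;
    rel->size_known = false;
    rel->pages = 0;
}

/* Later sizing pass: probe only rels that survived join removal. */
static void
toy_set_base_rel_sizes(ToyRel *rels, size_t nrels)
{
    for (size_t i = 0; i < nrels; i++)
    {
        if (rels[i].removed)
            continue;           /* removed rel: the probe is skipped */
        if (!rels[i].size_known)
        {
            rels[i].pages = toy_probe_pages(&rels[i]);
            rels[i].size_known = true;
        }
    }
}
```

In this toy model, building three rels lazily, marking one as removed, and then running the sizing pass performs two probes instead of the three that the eager variant would need. Whether the real planner can be rearranged this way depends on what else consumes the size information before that pass, which the sketch does not attempt to model.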

