Re: should we have a fast-path planning for OLTP starjoins? - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: should we have a fast-path planning for OLTP starjoins?
Date
Msg-id 47930f36-6373-45aa-8d59-9f7b990a2df9@vondra.me
Whole thread Raw
In response to Re: should we have a fast-path planning for OLTP starjoins?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: should we have a fast-path planning for OLTP starjoins?
List pgsql-hackers
On 9/23/25 21:46, Tom Lane wrote:
> [ sorry for ridiculously slow response to this ]
> 
> Tomas Vondra <tomas@vondra.me> writes:
>> Here's a patch trying to do it more like this - by manipulating the
>> lists describing the join problems, before it's passed the the actual
>> join search algorithm (which is where the PoC patch did this).
>> I wonder if this is roughly the place where you imagined this would be
>> done, or if you envision some other issue with this approach.
> 
> Cool.  This is proof-of-concept that manipulating the joinlist can
> do what we need done.  So we can move on to what heuristics we need
> to use.
> 

Thanks. Good to hear this seems like a reasonable place.

>> I initially tried to manipulate the joinlist much earlier - pretty much
>> right at the end of deconstruct_jointree. But that turned out to be way
>> too early. To identify dimensions etc. we need to check stuff about
>> foreign keys, join clauses, ... and that's not available that early.
> 
>> So I think this needs to happen much later in query_planner(), and the
>> patch does it right before the make_one_rel() call. Maybe that's too
>> late, but it needs to happen after match_foreign_keys_to_quals(), as it
>> relies on some of the FK info built by that call. Maybe we could call
>> match_foreign_keys_to_quals() earlier, but I don't quite see any
>> benefits of doing that ...
> 
> I don't have a problem with doing it where you did it, but the comment
> should explain the placement.  What you do have in the comment mostly
> belongs with the code, too; it's not the business of the caller.  So
> in planmain.c something like
> 
> +    /*
> +     * Try to simplify the join search problem for starjoin-like joins.
> +     * This step relies on info about FK relationships, so we can't do it
> +     * till after match_foreign_keys_to_quals().
> +     */
> 
> would be more appropriate IMO.

I agree. I've adopted your wording, and moved the original comment to
starjoin_adjust_joins (with some changes).

> I'd be slightly inclined to put the GUC test there, too:
> 
> +    if (enable_starjoin_join_search)
> +        joinlist = starjoin_adjust_joins(root, joinlist);
> 

I'm not sure I like this very mcuh. No other call in query_planner()
does it like that. Most don't have such GUC, but at least
remove_useless_self_joins does, and it still doesn't check it here.

> 
> I agree that you need to worry about join order restrictions,
> and that it's not immediately clear how to do that.  join_is_legal
> would work if we could call it, but the problem is that at this
> stage we'll only have RelOptInfos for base rels not join rels.
> If we have a joinlist (A B C D) and we are considering treating
> C as a dimension table, then the questions we have to ask are:
> (a) is it okay to build the (A B D) join without C?
> (b) is it okay to join (A B D) to C?
> 
> In this simple case, I think (b) must be true if (a) is, but
> I'm not quite sure that that's so in more complex cases with
> multiple candidates for dimension tables.  In any case,
> join_is_legal won't help us if we don't have join RelOptInfos.
> 
> I'm inclined to start by using has_join_restriction: if that
> says "false" for a candidate dimension table, it should be safe
> to postpone the join to the dimension table.  We might be able
> to refine that later.
> 

Thanks. I agree has_join_restriction seems like a good start, I'll give
that a try in the next patch version.

When thinking about this, I realized the has_join_restriction() is only
ever used in join_search_one_level(), i.e. when dealing with each small
join order problem. Doesn't this mean the deconstructed jointree must
already consider the restrictions in some way? I don't see any explicit
mentions of such join order restrictions in deconstruct_recurse. It must
not violate any ordering restrictions by splitting the joins in a
"wrong" way, right? If I set join_collapse_limit=1 it still needs to
satisfy all the rules.

I was wondering if maybe we could piggy-back on that, somehow. But maybe
that's not very practical, and has_join_restriction() is the way to go.

It's been a while since I looked at this patch, so it's possible I
already concluded that wouldn't work, and forgot about it :-/

>> The second example (create-2/select-2) is quite different, in that it's
>> nor a starjoin schema. It still joins one "main" table with multiple
>> "dimensions", but the FKs go in the other direction (to a single column
>> in the main table). But it has a very similar bottleneck - the order of
>> the joins is expensive, but it often does not matter very much, because
>> the query matches just one row anyway. And even if it returns more rows,
>> isn't the join order determined just by the selectivity of each join?
>> Maybe the starjoin optimization could be made to work for this type too?
> 
> Yeah, I'm slightly itchy about relying on FKs in this heuristic at
> all; it doesn't seem like quite the right thing.  I think we do want
> one side of the join to be joining to a PK or at least a unique index,
> but I'm not sure we need to insist on there being an FK relationship.
> 

True, requiring the FK may be unnecessary in this case. We do need to
guarantee the cardinality does not change, but a UNIQUE + LEFT JOIN
seems to be enough for that.

BTW the v3 lost the patch/ directory. I assume that wasn't intentional,
so I added it back into v4.

> A couple of minor coding notes:
> 
> * There's no point in doing anything (except recursing) if the joinlist
> contains fewer than 3 items, and maybe as a further heuristic
> this shouldn't kick in till later yet, like 5 or so items.
> When there are just a few items, the possibility of missing the
> best plan seems to outweigh the minimal savings in plan time.
> 
> * Joinlists never contain anything but RangeTblRefs and sub-lists.
> See make_rel_from_joinlist.
> 
> * Your reconstructed joinlist is overly complicated.  Instead of
> 
> +        newlist = list_make2(newlist, list_make1(lfirst(lc)));
> 
> you could just do
> 
> +        newlist = list_make2(newlist, lfirst(lc));
> 
> because a single-element subproblem is useless.
> 
> I notice that the patch doesn't apply cleanly anymore because of
> the introduction of guc_parameters.dat.  So here's a v3 that
> rebases over that, and I took the liberty of fixing the joinlist
> construction as above, but I didn't do anything else.
> 

Thanks. I agree with all of these comments, and updated v4 accordingly.
cfbot started complaining about guc_parameters.dat and a couple more
things, so v4 fixes that too.


regards

-- 
Tomas Vondra

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: IO in wrong state on riscv64
Next
From: Joshua Shanks
Date:
Subject: [PATCH] libpq: Wrap out-of-memory error messages with libpq_gettext()