Re: Support run-time partition pruning for hash join - Mailing list pgsql-hackers

From Andy Fan
Subject Re: Support run-time partition pruning for hash join
Date
Msg-id CAKU4AWq=CdOOX0FwRKmV2x+870Uyp6gO-3bUOPTULAPXEBPRnA@mail.gmail.com
Whole thread Raw
In response to Re: Support run-time partition pruning for hash join  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers


> fwiw, the current master totally ignores the cost reduction for run-time
> partition prune, even for init partition prune.  So in some real cases,
> pg chooses a hash join just because the cost of nest loop join is
> highly over estimated.

This is true about the existing code. It's a very tricky thing to cost
given that the parameter values are always unknown to the planner.
The best we have for these today is the various hardcoded constants in
selfuncs.h. While I do agree that it's not great that the costing code
knows nothing about run-time pruning, I also think that run-time
pruning during execution with parameterised nested loops is much more
likely to be able to prune partitions and save actual work than the
equivalent with Hash Joins.  It's more common for the planner to
choose to Nested Loop when there are fewer outer rows, so the pruning
code is likely to be called fewer times with Nested Loop than with
Hash Join.

Yes, I agree with this.  In my 4 years of PostgresSQL,  I just run into
2 cases of this issue and 1 of them is joining 12+ tables with run-time
partition prune for every join.  But this situation causes more issues than
generating a wrong plan, like for a simple SELECT * FROM p WHERE
partkey = $1;  generic plan will never win so we have to pay the expensive
planning cost for partitioned table. 

If we don't require very accurate costing for every case,  like we only
care about '=' operator which is the most common case,  it should be
easier than the case here since we just need to know if only 1 partition
will survive after pruning, but don't care about which one it is.  I'd like
to discuss in another thread, and leave this thread for Richard's patch
only. 

--
Best Regards
Andy Fan

pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: pg_upgrade - a function parameter shadows global 'new_cluster'
Next
From: Bruce Momjian
Date:
Subject: Re: PG 16 draft release notes ready