On Wed, Mar 5, 2025 at 1:20 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Wed, Mar 5, 2025 at 8:32 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
> > On 5/3/2025 03:27, Alexander Korotkov wrote:
> > > On Mon, Mar 3, 2025 at 1:04 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
> > >>> 2. As the usage of root->tuple_fraction for a RelOptInfo has been
> > >>> criticized, do you think we could limit this to some simple cases?
> > >>> For instance, check that the RelOptInfo is the final result
> > >>> relation for the given root.
> > >> I believe that using tuple_fraction is not an issue. Instead, it serves
> > >> as a flag that allows the upper-level optimisation to consider
> > >> additional options. The upper-level optimiser has more variants to
> > >> examine and will select the optimal path based on the knowledge
> > >> available at that level. Therefore, we're not introducing a mistake
> > >> here; we're simply adding extra work in the narrow case. However, having
> > >> only the bottom-up planning process, I don't see how we could avoid this
> > >> additional workload.
> > >
> > > Yes, but if we can assume root->tuple_fraction applies to the result
> > > of Append, it's strange that we apply the same tuple fraction to all
> > > the child rels. Later rels are less likely to be used at all and
> > > perhaps should have a smaller tuple_fraction.
> > Of course, it may happen. But I'm not sure it is a common rule.
> > Using LIMIT, we usually select data according to specific clauses.
> > Imagine, we need TOP-100 ranked goods. Appending partitions of goods, we
> > will use the index on the 'rating' column. But who knows how top-rated
> > goods are spread across partitions? Maybe a single partition contains
> > all of them? So, we need to select 100 top-rated goods from each partition.
> > Hence, applying the same limit to each partition seems reasonable, right?
>
> Ok, I didn't notice add_paths_to_append_rel() is used for MergeAppend
> as well. I thought again about regular Append. If we can get the
> required number of rows from the first few child relations, the error
> in the tuple fraction shouldn't influence plans much, and the other
> child relations wouldn't be used at all. But if we can't, we are
> unlikely to get a selectivity prediction accurate enough to tell
> exactly which child relations are going to be used. So, using the root
> tuple fraction for every child relation would be safe. So, this
> approach makes sense to me.
I've slightly revised the commit message and comments. I'm going to
push this if there are no objections.
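
For anyone following the MergeAppend argument quoted above, here is a
toy sketch of it. This is not planner code: top_n_merge() and
ordered_scan() are hypothetical names, and sorting each list just
stands in for an ordered index scan on 'rating'. It only shows that,
depending on how the data is spread, one child may have to supply all
N rows or each child may supply a few.

```python
import heapq
from collections import Counter
from itertools import islice


def ordered_scan(idx, rows):
    # Stand-in for an index scan on 'rating': yield rows of one partition
    # in descending order, tagged with the partition number.
    for rating in sorted(rows, reverse=True):
        yield (rating, idx)


def top_n_merge(partitions, n):
    # Toy model of LIMIT n over a MergeAppend of ordered child scans.
    # Returns the top-n ratings and how many rows each child contributed.
    streams = [ordered_scan(i, rows) for i, rows in enumerate(partitions)]
    merged = heapq.merge(*streams, reverse=True)
    result = list(islice(merged, n))
    contributed = Counter(idx for _, idx in result)
    return [rating for rating, _ in result], contributed


# All top-rated goods live in one partition: that child supplies every row.
skewed = [[10, 20, 30], [95, 96, 97, 98, 99], [5, 50]]
print(top_n_merge(skewed, 3))

# Top-rated goods are spread out: every child supplies one row.
spread = [[90, 10], [80, 20], [70, 30]]
print(top_n_merge(spread, 3))
```

Since we can't know in advance which of these two cases we are in,
planning every child as if it may need to deliver the full fraction
looks like the conservative choice.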
------
Regards,
Alexander Korotkov
Supabase