Re: Considering fractional paths in Append node - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Considering fractional paths in Append node
Msg-id CAPpHfdsyOv_t=VVZds-8mkEuwvx1326RGWoOzS9rgnA8uL5Y+w@mail.gmail.com
In response to Re: Considering fractional paths in Append node  (Andrei Lepikhov <lepihov@gmail.com>)
List pgsql-hackers
On Wed, Mar 5, 2025 at 8:32 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
> On 5/3/2025 03:27, Alexander Korotkov wrote:
> > On Mon, Mar 3, 2025 at 1:04 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
> >>> 2. As the usage of root->tuple_fraction for a RelOptInfo has been criticized,
> >>> do you think we could limit this to some simple cases?  For instance,
> >>> check that RelOptInfo is the final result relation for given root.
> >> I believe that using tuple_fraction is not an issue. Instead, it serves
> >> as a flag that allows the upper-level optimisation to consider
> >> additional options. The upper-level optimiser has more variants to
> >> examine and will select the optimal path based on the knowledge
> >> available at that level. Therefore, we're not introducing a mistake
> >> here; we're simply adding extra work in the narrow case. However, having
> >> only the bottom-up planning process, I don't see how we could avoid this
> >> additional workload.
> >
> > Yes, but if we can assume root->tuple_fraction applies to the result
> > of Append, it's strange that we apply the same tuple fraction to all
> > the child rels.  Later rels are less likely to be used at all and
> > perhaps should have a smaller tuple_fraction.
> Of course, it may happen. But I'm not sure it is a common rule.
> Using LIMIT, we usually select data according to specific clauses.
> Imagine, we need TOP-100 ranked goods. Appending partitions of goods, we
> will use the index on the 'rating' column. But who knows how top-rated
> goods are spread across partitions? Maybe a single partition contains
> all of them? So, we need to select 100 top-rated goods from each partition.
> Hence, applying the same limit to each partition seems reasonable, right?

Ok, I didn't notice that add_paths_to_append_rel() is used for
MergeAppend as well.  I thought again about regular Append.  If we can
get the required number of rows from the first few child relations,
the error in the tuple fraction shouldn't influence the plans much, and
the other child relations wouldn't be used at all.  But if we can't, we
are unlikely to get a selectivity estimate accurate enough to predict
exactly which child relations are going to be used.  So, using the
root tuple fraction for every child relation would be safe, and this
approach makes sense to me.
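
To make the reasoning above concrete, here is a rough sketch in Python
rather than planner code.  It assumes a fractional path cost is
estimated by linear interpolation between startup and total cost, and
that a plain Append drains its children strictly in order.  The names
Path, fractional_cost, cheapest_fractional and children_read_by_append,
and all the cost numbers, are hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Path:
    # Hypothetical stand-in for a planner path; not the real struct.
    name: str
    startup_cost: float
    total_cost: float


def fractional_cost(path, fraction):
    """Estimated cost of fetching `fraction` (0 < fraction <= 1) of the rows,
    assuming cost grows linearly from startup_cost to total_cost."""
    return path.startup_cost + fraction * (path.total_cost - path.startup_cost)


def cheapest_fractional(paths, fraction):
    """Pick the path that is cheapest for the given tuple fraction."""
    return min(paths, key=lambda p: fractional_cost(p, fraction))


def children_read_by_append(child_rows, limit):
    """Number of children a plain Append touches before `limit` rows
    have been produced, reading children in order."""
    produced = 0
    for touched, rows in enumerate(child_rows, start=1):
        produced += rows
        if produced >= limit:
            return touched
    return len(child_rows)


# Made-up candidate paths for one child relation.
paths = [
    Path("seqscan+sort", startup_cost=1000.0, total_cost=1100.0),
    Path("indexscan", startup_cost=0.5, total_cost=5000.0),
]

# With a small fraction the low-startup path wins; with the full
# result the low-total path wins.
small = cheapest_fractional(paths, 0.01)
full = cheapest_fractional(paths, 1.0)

# Large first child: later children are never read, so whatever
# fraction was used to plan them does not matter at run time.
only_first = children_read_by_append([1000, 1000, 1000], limit=100)

# Small children: the number read depends on the actual row counts.
several = children_read_by_append([40, 30, 50], limit=100)
```

In the first case the paths chosen for the later children are never
executed; in the second, which children end up being read depends on
row counts the planner can only estimate, so giving every child the
same root fraction looks like a reasonable default.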

------
Regards,
Alexander Korotkov
Supabase


