Re: Considering fractional paths in Append node - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Considering fractional paths in Append node
Date
Msg-id CAPpHfdt2oOoarXM71TNL+Ud71DLG506ScpomfNu0LYciedT3ig@mail.gmail.com
In response to Re: Considering fractional paths in Append node  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: Considering fractional paths in Append node
List pgsql-hackers
On Wed, Mar 5, 2025 at 1:20 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Wed, Mar 5, 2025 at 8:32 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
> > On 5/3/2025 03:27, Alexander Korotkov wrote:
> > > On Mon, Mar 3, 2025 at 1:04 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
> > >>> 2. Since using root->tuple_fraction for a RelOptInfo has been criticized,
> > >>> do you think we could limit this to some simple cases?  For instance,
> > >>> check that the RelOptInfo is the final result relation for the given root.
> > >> I believe that using tuple_fraction is not an issue. Instead, it serves
> > >> as a flag that allows the upper-level optimisation to consider
> > >> additional options. The upper-level optimiser has more variants to
> > >> examine and will select the optimal path based on the knowledge
> > >> available at that level. Therefore, we're not introducing a mistake
> > >> here; we're simply adding extra work in the narrow case. However, having
> > >> only the bottom-up planning process, I don't see how we could avoid this
> > >> additional workload.
> > >
> > > Yes, but if we can assume root->tuple_fraction applies to the result of
> > > Append, it's strange that we apply the same tuple fraction to all the child
> > > rels.  Later rels are less likely to be used at all and perhaps
> > > should get a smaller tuple_fraction.
> > Of course, it may happen. But I'm not sure it is a common rule.
> > Using LIMIT, we usually select data according to specific clauses.
> > Imagine, we need TOP-100 ranked goods. Appending partitions of goods, we
> > will use the index on the 'rating' column. But who knows how top-rated
> > goods are spread across partitions? Maybe a single partition contains
> > all of them? So, we need to select 100 top-rated goods from each partition.
> > Hence, applying the same limit to each partition seems reasonable, right?
>
> Ok, I didn't notice add_paths_to_append_rel() is used for MergeAppend
> as well.  I thought again about regular Append.  If we can get the required
> number of rows from the first few child relations, the error in the
> tuple fraction shouldn't influence plans much, and the other child
> relations wouldn't be used at all.  But if we can't, we're unlikely to get
> a selectivity estimate accurate enough to predict exactly which
> child relations are going to be used.  So, using the root tuple
> fraction for every child relation would be safe.  So, this approach
> makes sense to me.
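
To make the reasoning above concrete, here is a minimal toy sketch in Python
(not planner code).  It assumes the usual linear interpolation between
startup and total cost when comparing paths for a given fraction, and it
assumes the fraction is already normalized to (0, 1].  The Path class, the
cost numbers, and the helper names are all hypothetical and only illustrate
how applying the same root tuple fraction to every child can change which
child path looks cheapest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """Hypothetical stand-in for a planner path: only a name and two costs."""
    name: str
    startup_cost: float
    total_cost: float


def fractional_cost(path: Path, fraction: float) -> float:
    """Estimated cost of fetching `fraction` of the path's rows.

    Assumption: cost grows linearly from startup_cost to total_cost.
    """
    return path.startup_cost + fraction * (path.total_cost - path.startup_cost)


def cheapest_fractional(paths: list[Path], fraction: float) -> Path:
    """Pick the path that is cheapest when only `fraction` of rows is needed."""
    return min(paths, key=lambda p: fractional_cost(p, fraction))


def choose_child_paths(children: list[list[Path]], fraction: float) -> list[str]:
    """Apply the same (root) fraction to every child of the append."""
    return [cheapest_fractional(paths, fraction).name for paths in children]


# Two made-up candidate paths for each of three made-up partitions.
seq_sort = Path("seqscan+sort", startup_cost=900.0, total_cost=1000.0)
index = Path("indexscan", startup_cost=1.0, total_cost=5000.0)
children = [[seq_sort, index] for _ in range(3)]

small = choose_child_paths(children, 0.01)  # e.g. a query with a small LIMIT
full = choose_child_paths(children, 1.0)    # all rows are needed

print(small)
print(full)
```

With these invented numbers, a small fraction favors the low-startup-cost
path in every child, while fetching everything favors the low-total-cost
path.  If the later children are never reached, their choice does not matter
much, which matches the argument that reusing the root fraction is safe.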

I've slightly revised the commit message and comments.  I'm going to
push this if there are no objections.

------
Regards,
Alexander Korotkov
Supabase
