Re: Tid scan improvements - Mailing list pgsql-hackers

From Edmund Horner
Subject Re: Tid scan improvements
Msg-id CAMyN-kCRiXB50Sx_Ftwb9YFQG5Se99eLkzPPbU34qvpTdQa-2w@mail.gmail.com
In response to Re: Tid scan improvements  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Tid scan improvements
List pgsql-hackers
On Thu, 14 Mar 2019 at 23:06, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On Thu, 14 Mar 2019 at 21:12, Edmund Horner <ejrh00@gmail.com> wrote:
> > I'm not sure how an unreasonable underestimation would occur here.  If
> > you have a table bloated to say 10x its minimal size, the estimator
> > still assumes an even distribution of tuples (I don't think we can do
> > much better than that).  So the selectivity of "ctid >= <last ctid
> > that would exist without bloat>" is still going to be 0.9.
>
> Okay, think you're right there.  I guess the only risk there is just
> varying tuple density per page, and that seems no greater risk than we
> have with the existing stats.

Yeah, that is a risk, and it will probably come up in practice.  But at
least we're not just picking a hardcoded selectivity any more.
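
To make the arithmetic in the bloat example concrete, here is a minimal
sketch of the even-distribution estimate.  It is not the code from the
patch; the function name and signature are made up for illustration, and
it only considers whole blocks.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): estimate the selectivity of
 * "ctid >= (block, 0)" on a heap of nblocks pages, assuming tuples are
 * spread evenly across all pages.
 */
double
ctid_ge_selectivity(uint32_t block, uint32_t nblocks)
{
    /* An empty relation, or a bound past the end, matches nothing. */
    if (nblocks == 0 || block >= nblocks)
        return 0.0;

    /* Fraction of pages at or after the bound. */
    return (double) (nblocks - block) / (double) nblocks;
}
```

With a table bloated from 100 to 1000 pages, ctid_ge_selectivity(100, 1000)
comes out at roughly 0.9, which matches the figure quoted above.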

> Just looking again, I think the block of code starting:
>
> + if (density > 0.0)
>
> needs a comment to mention what it's doing. Perhaps:
>
> + /*
> + * Using the average tuples per page, calculate how far into
> + * the page the itemptr is likely to be and adjust block
> + * accordingly.
> + */
> + if (density > 0.0)
>
> Or some better choice of words.  With that done, I think 0001 is good to go.
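
For reference, the adjustment that the suggested comment describes could
look roughly like the sketch below.  This is an illustration under
assumptions, not the patch's code: the function name, the parameters and
the clamping are all hypothetical, and density is taken to mean average
tuples per page.

```c
#include <stdint.h>

/*
 * Hypothetical sketch (not from the patch): turn a (block, offset) item
 * pointer into a fractional block position.  Offsets are 1-based, so
 * offset - 1 tuples precede the pointer within its page.
 */
double
ctid_fractional_block(uint32_t block, uint16_t offset, double density)
{
    double      pos = (double) block;

    if (density > 0.0 && offset > 0)
    {
        /* How far into the page the itemptr is likely to be. */
        double      frac = (double) (offset - 1) / density;

        /* Never step past the end of the current page. */
        if (frac > 1.0)
            frac = 1.0;
        pos += frac;
    }

    return pos;
}
```

For example, with 10 tuples per page on average, the pointer (10, 6) maps
to position 10.5, so a "ctid >= (10, 6)" bound would be treated as starting
halfway through block 10.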

Ok, I'll look at it and hopefully get a new patch up soon.

Edmund

