Re: [HACKERS] WIP patch: distinguish selectivity of < from <= and > from >= - Mailing list pgsql-hackers

From Kuntal Ghosh
Subject Re: [HACKERS] WIP patch: distinguish selectivity of < from <= and > from >=
Date
Msg-id CAGz5QCJp45dSiY5PiB8iNUsbeJtHU9B6eacpg4ZF6D6b0fqY8g@mail.gmail.com
In response to Re: [HACKERS] WIP patch: distinguish selectivity of < from <= and > from >=  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] WIP patch: distinguish selectivity of < from <= and > from >=
List pgsql-hackers
On Thu, Jul 6, 2017 at 3:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> (Pokes at it some more...) Oh, interesting: it behaves that way except
>> when p is exactly the lowest histogram entry.
>
+ /*
+ * In the first bin (i==1), add a fudge factor that ensures
+ * that histfrac is at least eq_selec.  We do this because we
+ * know that the first histogram entry does satisfy the
+ * inequality (if !isgt) or not satisfy it (if isgt), so our
+ * estimate here should certainly not be zero even if binfrac
+ * is zero.  (XXX experimentally this is the correct way to do
+ * it, but why isn't it a linear adjustment across the whole
+ * histogram rather than just the first bin?)
+ */
Assuming that the values are distinct (I have some doubts about the
real-number case),

if histogram_bounds are assigned as,
{0,9,19,29,39,49,59,69,79,89,99,109,119,129,13,..........}

I think the buckets are defined as,
0 < bucket1 <= 9
9 < bucket2 <= 19
19 < bucket3 <= 29 and so on.

This is because histfrac is calculated as follows:

histfrac = (double) (bucket_current - 1) + (val - low) / (high - low);
histfrac /= (double) (nvalues - 1);

(where bucket_current is obtained by doing a binary search on
histogram_bounds.)

So, if val = low, then histfrac = (bucket_current - 1) / num_of_buckets
(with num_of_buckets = nvalues - 1), which means val is assumed to be
included in the previous bucket.

This means that it always fails to calculate the selectivity for the
lowest histogram boundary, because there is no previous bucket to
contain it. Hence, we need the adjustment only for the first bucket.
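
The arithmetic above can be checked with a small standalone sketch. This is
not the PostgreSQL source: the function name sketch_histfrac, the use of
plain doubles, and the boundary handling are assumptions made for
illustration only. It locates the bucket with a binary search and then
applies the same two histfrac steps.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative sketch only (hypothetical helper, not PostgreSQL code).
 * bounds[0..nvalues-1] must be sorted and distinct.
 * Returns the interpolated fraction of the histogram lying below val.
 */
double
sketch_histfrac(const double *bounds, int nvalues, double val)
{
    int lo = 0;
    int hi = nvalues;

    assert(bounds != NULL && nvalues >= 2);

    /* Binary search for the first index whose bound is > val. */
    while (lo < hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (bounds[mid] <= val)
            lo = mid + 1;
        else
            hi = mid;
    }

    if (lo <= 0)
        return 0.0;             /* val is below the whole histogram */
    if (lo >= nvalues)
        return 1.0;             /* val is at or above the last bound */

    {
        /* lo plays the role of bucket_current in the formula above. */
        double low = bounds[lo - 1];
        double high = bounds[lo];
        double histfrac = (double) (lo - 1) + (val - low) / (high - low);

        return histfrac / (double) (nvalues - 1);
    }
}
```

With bounds {0, 9, 19, 29, 39}, this returns 0.25 for val = 9 (the whole
first bucket, which includes 9), but 0.0 for val = 0, even though the entry
0 itself satisfies "<= 0". That is the case the first-bin fudge factor is
meant to cover.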

Do you think this reasoning addresses your concern?

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


