Re: doc: Make selectivity example match wording - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: doc: Make selectivity example match wording
Date
Msg-id CAKFQuwY1vSL_D9nUALVjwWfXbV8c2acrKq3kRGhHWZyh1LaqkA@mail.gmail.com
In response to Re: doc: Make selectivity example match wording  ("Dian M Fay" <dian.m.fay@gmail.com>)
Responses Re: doc: Make selectivity example match wording
List pgsql-hackers
On Sat, Jul 2, 2022 at 12:42 PM Dian M Fay <dian.m.fay@gmail.com> wrote:
On Thu Jun 9, 2022 at 11:57 AM EDT, David G. Johnston wrote:
> Reposting this to its own thread.
>
> https://www.postgresql.org/message-id/flat/CAKFQuwby1aMsJDMeibaBaohgoaZhivAo4WcqHC1%3D9-GDZ3TSng%40mail.gmail.com
>
>     doc: make unique non-null join selectivity example match the prose
>
>     The description of the computation for the unique, non-null,
>     join selectivity describes a division by the maximum of two values,
>     while the example shows a multiplication by their reciprocal.  While
>     equivalent, the max phrasing is easier to understand, which seems
>     more important here than precisely adhering to the formula used
>     in the code (for which either variant is still an approximation).

Should n_distinct and num_rows be <structname>d in the text?

Thanks for the review.  I generally like everything you said, but it made me realize that I still didn't really understand the intent behind the formula.  I spent way too much time working that out for myself, then turned what I found useful into this v2 patch.

It may still need some semantic markup, but I figured I'd first see whether the idea makes sense.

I basically rewrote the same material, in a slightly different style, into the code comments, then reworked the proof that was already present there.

I did this in something of a vacuum, and I'm not inclined to learn the whole area start to end.  If the abrupt style change is unwanted, so be it.  I'm not sure how much benefit the proof really provides.  The comments in the docs are probably sufficient for the code as well: just explain why the three pieces of the formula exist and why they are packaged into a single multiplier called selectivity as an API choice.  I suspect that once someone gets to that comment it is fair to assume some prior knowledge.  Admittedly, I didn't come into this that way...
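
For anyone following along without the docs open, here is a minimal sketch of the calculation under discussion.  It assumes the formula as given in the planner's row-estimation documentation for an equality join on columns without most-common-value lists: two non-null fractions multiplied together, then divided by the larger of the two distinct counts.  The function and parameter names are illustrative only and are not taken from the PostgreSQL source.

```python
import math


def join_selectivity_max(null_frac1, null_frac2, n_distinct1, n_distinct2):
    """Selectivity phrased as a division by the larger distinct count.

    The three pieces are: the non-null fraction of each side, and
    1 / max(n_distinct1, n_distinct2).
    """
    non_null = (1.0 - null_frac1) * (1.0 - null_frac2)
    return non_null / max(n_distinct1, n_distinct2)


def join_selectivity_reciprocal(null_frac1, null_frac2, n_distinct1, n_distinct2):
    """The same quantity phrased as a multiplication by the smaller reciprocal."""
    non_null = (1.0 - null_frac1) * (1.0 - null_frac2)
    return non_null * min(1.0 / n_distinct1, 1.0 / n_distinct2)


# Hypothetical inputs: two unique, non-null columns of 10000 rows each.
sel = join_selectivity_max(0.0, 0.0, 10000, 10000)
rows = 10000 * 10000 * sel  # estimated join rows = outer rows * inner rows * selectivity
print(sel, rows)
```

Both functions agree up to floating-point rounding, which is the equivalence the commit message relies on; the max form simply reads closer to the prose.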

David J.

