Re: Eager page freeze criteria clarification - Mailing list pgsql-hackers

From	Melanie Plageman
Subject	Re: Eager page freeze criteria clarification
Date	December 21, 2023 18:56:09
Msg-id	CAAKRu_YfyOUK8Ne9=6CrqiNPNTfsP76-Gmcv-0p=KQiN1nM14A@mail.gmail.com
In response to	Re: Eager page freeze criteria clarification (Joe Conway <mail@joeconway.com>)
Responses	Re: Eager page freeze criteria clarification Re: Eager page freeze criteria clarification
List	pgsql-hackers

On Sat, Dec 9, 2023 at 9:24 AM Joe Conway <mail@joeconway.com> wrote:
>
> On 12/8/23 23:11, Melanie Plageman wrote:
> >
> > I'd be delighted to receive any feedback, ideas, questions, or review.
>
>
> This is well thought out, well described, and a fantastic improvement in
> my view -- well done!

Thanks, Joe! That means a lot! I often see work done by hackers on the
mailing list that makes me think, "hey, that's
cool/clever/awesome!" but I don't give that feedback. I appreciate you
doing that!

> I do think we will need to consider distributions other than normal, but
> I don't know offhand what they will be.

Agreed. I plan to test with another distribution, though determining
which ones are useful is probably the more challenging exercise.
I imagine we will have to choose one distribution (as opposed to
supporting different distributions and choosing based on data access
patterns for a table). Even with a normal distribution, though, I
think it should be an improvement.

> However, even if we assume a more-or-less normal distribution, we should
> consider using subgroups in a way similar to Statistical Process
> Control[1]. The reasoning is explained in this quote:
>
>      The Math Behind Subgroup Size
>
>      The Central Limit Theorem (CLT) plays a pivotal role here. According
>      to CLT, as the subgroup size (n) increases, the distribution of the
>      sample means will approximate a normal distribution, regardless of
>      the shape of the population distribution. Therefore, as your
>      subgroup size increases, your control chart limits will narrow,
>      making the chart more sensitive to special cause variation and more
>      prone to false alarms.

I hadn't read anything about statistical process control until you
mentioned it. I read the link you sent and also googled around a
bit. I was under the impression that the more samples we have, the
better, but it seems that may not be the assumption in
statistical process control?
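
To make the subgroup-size effect in the quoted passage concrete, here
is a minimal sketch. It is not taken from the patch or from the linked
article: the function name xbar_control_limits, the 3-sigma multiplier,
and the numbers are all assumptions chosen for illustration. It uses
the textbook result that the mean of n independent samples has standard
deviation sigma / sqrt(n), so the limits on a chart of subgroup means
narrow as n grows.

```python
import math


def xbar_control_limits(process_mean, process_sigma, subgroup_size, k=3.0):
    """Return (lower, upper) k-sigma limits for a chart of subgroup means.

    Hypothetical helper for illustration only; assumes the process mean
    and standard deviation are already known.
    """
    if subgroup_size < 1:
        raise ValueError("subgroup_size must be at least 1")
    # Standard error of the mean of n independent samples: sigma / sqrt(n).
    half_width = k * process_sigma / math.sqrt(subgroup_size)
    return process_mean - half_width, process_mean + half_width


if __name__ == "__main__":
    # Made-up process parameters: mean 100, standard deviation 10.
    for n in (1, 5, 25, 100):
        lower, upper = xbar_control_limits(100.0, 10.0, n)
        print(f"n={n:3d}  limits=({lower:.2f}, {upper:.2f})")
```

With these made-up numbers the limits shrink from 100 +/- 30 at n = 1
to 100 +/- 3 at n = 100, so with large subgroups even a small shift in
the subgroup mean lands outside the limits. That seems to be the
sensitivity trade-off the quoted passage describes; more samples still
estimate the mean better, they just make the chart react to smaller
changes.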

It may help us to get more specific. I'm not sure what the
relationship between "unsets" in my code and subgroup members would
be. The article you linked suggests that each subgroup should be of
size 5 or smaller. Translating that to my code, were you imagining
subgroups of "unsets" (each time we modify a page that was previously
all-visible)?

Thanks for the feedback!

- Melanie
