Thread: Re: Doc tweak for huge_pages?

Re: Doc tweak for huge_pages?

From
Peter Eisentraut
Date:
On 11/30/17 23:35, Thomas Munro wrote:
> On Fri, Dec 1, 2017 at 5:04 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
>> On Fri, Dec 01, 2017 at 04:01:24PM +1300, Thomas Munro wrote:
>>> Hi hackers,
>>>
>>> The manual implies that only Linux can use huge pages.  That is not
>>> true: FreeBSD, Illumos and probably others support larger page sizes
>>> using transparent page coalescing algorithms.  On my FreeBSD box
>>> procstat -v often shows PostgreSQL shared buffers in "S"-flagged
>>> memory.  I think we should adjust the manual to make clear that it's
>>> the *explicit request for huge pages* that is supported only on Linux
>>> (and hopefully soon Windows).  Am I being too pedantic?
>>
>> I suggest to remove "other" and include Linux in the enumeration, since it also
>> supports "transparent" hugepages.
> 
> Hmm.  Yeah, it does, but apparently it's not so transparent.  So if we
> mention that we'd better indicate in the same paragraph that you
> probably don't actually want to use it.  How about the attached?

Part of the confusion is that the huge_pages setting is only for shared
memory, whereas the kernel settings affect all memory.  Is the same true
for the proposed Windows patch?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Doc tweak for huge_pages?

From
Thomas Munro
Date:
On Sat, Dec 2, 2017 at 4:08 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 11/30/17 23:35, Thomas Munro wrote:
>> On Fri, Dec 1, 2017 at 5:04 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
>>> On Fri, Dec 01, 2017 at 04:01:24PM +1300, Thomas Munro wrote:
>>>> Hi hackers,
>>>>
>>>> The manual implies that only Linux can use huge pages.  That is not
>>>> true: FreeBSD, Illumos and probably others support larger page sizes
>>>> using transparent page coalescing algorithms.  On my FreeBSD box
>>>> procstat -v often shows PostgreSQL shared buffers in "S"-flagged
>>>> memory.  I think we should adjust the manual to make clear that it's
>>>> the *explicit request for huge pages* that is supported only on Linux
>>>> (and hopefully soon Windows).  Am I being too pedantic?
>>>
>>> I suggest to remove "other" and include Linux in the enumeration, since it also
>>> supports "transparent" hugepages.
>>
>> Hmm.  Yeah, it does, but apparently it's not so transparent.  So if we
>> mention that we'd better indicate in the same paragraph that you
>> probably don't actually want to use it.  How about the attached?
>
> Part of the confusion is that the huge_pages setting is only for shared
> memory, whereas the kernel settings affect all memory.

Right.  And more specifically, just the main shared memory area, not
DSM segments.  Updated to make this point.

(I have wondered whether DSM segments should respect this GUC: it
seems plausible that they should when the size is a multiple of the
huge page size, so that very large DSA areas finish up mostly backed
by huge pages, so that very large shared hash tables would benefit
from lower TLB miss rates.  I have only read in an academic paper that
this is a good idea; I haven't investigated whether that would really
help us in practice, and the first problem is that Linux shm_open
doesn't support huge pages anyway, so you'd need one of the other DSM
implementation options, which are currently non-default.)
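
The "multiple of the huge page size" condition mentioned above could be
checked roughly as follows. This is only an illustrative Python sketch, not
PostgreSQL code: the function names are hypothetical, and it assumes the
Linux /proc/meminfo format, where the default huge page size appears on a
"Hugepagesize:" line in kB.

```python
import re


def huge_page_size_bytes(meminfo_text):
    """Return the default huge page size in bytes, or None if not reported.

    Expects text in the format of Linux /proc/meminfo, which contains a
    line such as 'Hugepagesize:       2048 kB'.
    """
    match = re.search(r"^Hugepagesize:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) * 1024 if match else None


def is_huge_page_multiple(segment_size, huge_page_size):
    """True when a segment of segment_size bytes is a whole number of huge pages."""
    return segment_size > 0 and segment_size % huge_page_size == 0


if __name__ == "__main__":
    # Reading the real file only works on Linux.
    with open("/proc/meminfo") as f:
        size = huge_page_size_bytes(f.read())
    print(size)
```

A caller would read the huge page size once and test each candidate segment
size against it before deciding whether to request huge pages.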

> Is the same true
> for the proposed Windows patch?

Yes.  It adds a flag to the request for the main shared memory area
(after jumping through various permissions hoops).

-- 
Thomas Munro
http://www.enterprisedb.com

Attachment

Re: Doc tweak for huge_pages?

From
Catalin Iacob
Date:
On Fri, Dec 1, 2017 at 10:09 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
>> On 11/30/17 23:35, Thomas Munro wrote:
>>> Hmm.  Yeah, it does, but apparently it's not so transparent.  So if we
>>> mention that we'd better indicate in the same paragraph that you
>>> probably don't actually want to use it.  How about the attached?

Here's a review for v3.

I find that the first paragraph is an improvement as it's more precise.

What I didn't like about the second paragraph is that it pointed out
Linux transparent huge pages too favorably while they are actually
known to cause big (huge?, pardon the pun) issues (as witnessed in
this thread as well). v3 basically says "in Linux it can be
transparent or explicit and explicit is faster than transparent".
Reading that, and seeing that explicit needs tweaking of kernel
parameters and so on, one might very well conclude "I'll use the
slightly-slower-but-still-better-than-nothing transparent version".

So I tried to redo the second paragraph and ended up with the
attached. Rationale for the changes:
* changed "this feature" to "explicitly requesting huge pages" to
contrast with the automatic one described below
* made the wording of Linux THP more negative (but still with some
wiggle room for future kernel versions which might improve THP),
contrasting with the positive explicit request from this GUC
* integrated your mention of other OSes with automatic huge pages
* moved the new text to the last paragraph to lower its importance

What do you think?

Attachment

Re: Doc tweak for huge_pages?

From
Thomas Munro
Date:
On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
> On Fri, Dec 1, 2017 at 10:09 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>>> On 11/30/17 23:35, Thomas Munro wrote:
>>>> Hmm.  Yeah, it does, but apparently it's not so transparent.  So if we
>>>> mention that we'd better indicate in the same paragraph that you
>>>> probably don't actually want to use it.  How about the attached?
>
> Here's a review for v3.

Thanks!

> I find that the first paragraph is an improvement as it's more precise.
>
> What I didn't like about the second paragraph is that it pointed out
> Linux transparent huge pages too favorably while they are actually
> known to cause big (huge?, pardon the pun) issues (as witnessed in
> this thread as well). v3 basically says "in Linux it can be
> transparent or explicit and explicit is faster than transparent".
> Reading that, and seeing that explicit needs tweaking of kernel
> parameters and so on, one might very well conclude "I'll use the
> slightly-slower-but-still-better-than-nothing transparent version".
>
> So I tried to redo the second paragraph and ended up with the
> attached. Rationale for the changes:
> * changed "this feature" to "explicitly requesting huge pages" to
> contrast with the automatic one described below
> * made the wording of Linux THP more negative (but still with some
> wiggle room for future kernel versions which might improve THP),
> contrasting with the positive explicit request from this GUC
> * integrated your mention of other OSes with automatic huge pages
> * moved the new text to the last paragraph to lower its importance
>
> What do you think?

I don't know enough about this to make such a strong recommendation
myself, which is why I was only trying to report that bad performance
had been observed on some version, not that you shouldn't do it.  Any
other views on this stronger statement?

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Doc tweak for huge_pages?

From
Peter Eisentraut
Date:
On 12/1/17 10:08, Peter Eisentraut wrote:
> Part of the confusion is that the huge_pages setting is only for shared
> memory, whereas the kernel settings affect all memory.  Is the same true
> for the proposed Windows patch?

Btw., I'm kind of hoping that the Windows patch would be committed
first, so that we don't have to rephrase this whole thing again after that.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Doc tweak for huge_pages?

From
Thomas Munro
Date:
On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
>> So I tried to redo the second paragraph and ended up with the
>> attached. Rationale for the changes:
>> * changed "this feature" to "explicitly requesting huge pages" to
>> contrast with the automatic one described below
>> * made the wording of Linux THP more negative (but still with some
>> wiggle room for future kernel versions which might improve THP),
>> contrasting with the positive explicit request from this GUC
>> * integrated your mention of other OSes with automatic huge pages
>> * moved the new text to the last paragraph to lower its importance
>>
>> What do you think?
>
> I don't know enough about this to make such a strong recommendation
> myself, which is why I was only trying to report that bad performance
> had been observed on some version, not that you shouldn't do it.  Any
> other views on this stronger statement?

Now that the Windows huge pages patch has landed, here is a rebase.  I
took your alternative and tweaked it a tiny bit more.  Thoughts?

-- 
Thomas Munro
http://www.enterprisedb.com

Attachment

Re: Doc tweak for huge_pages?

From
Justin Pryzby
Date:
On Mon, Jan 22, 2018 at 03:54:26PM +1300, Thomas Munro wrote:
> On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
> > On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
> > I don't know enough about this to make such a strong recommendation
> > myself, which is why I was only trying to report that bad performance
> > had been observed on some version, not that you shouldn't do it.  Any
> > other views on this stronger statement?
> 
> Now that the Windows huge pages patch has landed, here is a rebase.  I
> took your alternative and tweaked it a tiny bit more.  Thoughts?

+       <para>
+        Note that, besides explicitly requesting huge pages via
+        <varname>huge_pages</varname>,
=> I would just say:
"Note that, besides huge pages requested explicitly, ..."

+ In Linux this automatic use is
=> ON Linux comma?

+        called "transparent huge pages" and is not enabled by default in
+        popular distributions as of the time of writing, but since transparent

=> really ?  I don't know if I've ever seen it not enabled.  In any case,
that's a strong statement to make (to be disabled in ALL popular distributions).

I checked all our servers, including centos6 and ubuntu t-LTS and x-LTS.  On a
limited few where it was disabled, I'd explicitly done so.

On a server on which I just installed ubuntu-x LTS, with 4.13.0-26-generic:
pryzbyj@gta-ubuntu:~$ cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never

https://github.com/torvalds/linux/commit/13ece886d99cd668483113f7238e419d5331af26
=> the compile time default is to disable, but (if enabled at compile time),
the runtime default is "always".

On centos7
Linux template0 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Jan 4 01:06:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ cat /sys/kernel/mm/transparent_hugepage/enabled 
[always] madvise never
$ grep TRANS /boot/config-3.10.0-693.11.6.el7.x86_64 
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set

https://blog.nelhage.com/post/transparent-hugepages/
=> It is enabled ("enabled=always") by default in most Linux distributions.
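
The sysfs checks above can be scripted. A minimal Python sketch follows; the
function names are made up for illustration, and it assumes the file format
shown in the outputs above, where the active mode is the one in square
brackets.

```python
import re
from pathlib import Path

# sysfs file used in the shell sessions above.
THP_ENABLED_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the bracketed (active) mode from a line like 'always [madvise] never'."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"no active mode found in {text!r}")
    return match.group(1)


def current_thp_mode(path=THP_ENABLED_PATH):
    """Read the sysfs file; return None if it does not exist on this system."""
    try:
        return parse_thp_mode(path.read_text())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    print(current_thp_mode())
```

Applied to the two outputs quoted above, parse_thp_mode reports "madvise" for
the Ubuntu machine and "always" for the CentOS 7 one.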

Justin


Re: Doc tweak for huge_pages?

From
Thomas Munro
Date:
On Mon, Jan 22, 2018 at 6:30 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> On Mon, Jan 22, 2018 at 03:54:26PM +1300, Thomas Munro wrote:
>> On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>> > On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
>> > I don't know enough about this to make such a strong recommendation
>> > myself, which is why I was only trying to report that bad performance
>> > had been observed on some version, not that you shouldn't do it.  Any
>> > other views on this stronger statement?
>>
>> Now that the Windows huge pages patch has landed, here is a rebase.  I
>> took your alternative and tweaked it a tiny bit more.  Thoughts?
>
> +       <para>
> +        Note that, besides explicitly requesting huge pages via
> +        <varname>huge_pages</varname>,
> => I would just say:
> "Note that, besides huge pages requested explicitly, ..."

+1

> + In Linux this automatic use is
> => ON Linux comma?

+1

> +        called "transparent huge pages" and is not enabled by default in
> +        popular distributions as of the time of writing, but since transparent
>
> => really ?  I don't know if I've ever seen it not enabled.  In any case,
> that's a strong statement to make (to be disabled in ALL popular distributions).

Argh.

> https://blog.nelhage.com/post/transparent-hugepages/
> => It is enabled ("enabled=always") by default in most Linux distributions.

Sorry, right, that was 100% wrong.  It would probably be correct to
remove the "not", but let's just remove that bit.  New version
attached.

Thanks.

--
Thomas Munro
http://www.enterprisedb.com

Attachment

Re: Doc tweak for huge_pages?

From
Justin Pryzby
Date:
On Mon, Jan 22, 2018 at 07:10:33PM +1300, Thomas Munro wrote:
> On Mon, Jan 22, 2018 at 6:30 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > On Mon, Jan 22, 2018 at 03:54:26PM +1300, Thomas Munro wrote:
> >> On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
> >> <thomas.munro@enterprisedb.com> wrote:
> >> > On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
> >> > I don't know enough about this to make such a strong recommendation
> >> > myself, which is why I was only trying to report that bad performance
> >> > had been observed on some version, not that you shouldn't do it.  Any
> >> > other views on this stronger statement?
> >>
> >> Now that the Windows huge pages patch has landed, here is a rebase.  I
> >> took your alternative and tweaked it a tiny bit more.  Thoughts?
>
> Sorry, right, that was 100% wrong.  It would probably be correct to
> remove the "not", but let's just remove that bit.  New version
> attached.

+        <productname>PostgreSQL</productname>. On Linux, this is called
+        "transparent huge pages", but since that feature is known to cause
+        performance degradation with
+        <productname>PostgreSQL</productname> on current Linux versions
+        (unlike explicit use of <varname>huge_pages</varname>), its use is
+        discouraged.

Consider this shorter, less-severe sounding alternative:
"... (but note that this feature can degrade performance of some
<productname>PostgreSQL</productname> workloads)."

Justin


Re: Doc tweak for huge_pages?

From
Catalin Iacob
Date:
On Mon, Jan 22, 2018 at 7:23 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> Consider this shorter, less-severe sounding alternative:
> "... (but note that this feature can degrade performance of some
> <productname>PostgreSQL</productname> workloads)."

I think the patch looks good now.

As Justin mentions, as far as I see the only arguable piece is how
strong the language should be against Linux THP.

On one hand it can be argued that warning about THP issues is not the
job of this patch. On the other hand this patch does say more about
THP and Googling does bring up a lot of trouble and advice to disable
THP, including:

https://www.postgresql.org/message-id/CANQNgOrD02f8mR3Y8Pi=zFsoL14RqNQA8hwz1r4rSnDLr1b2Cw@mail.gmail.com

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/s-memory-transhuge

The RedHat article above says "However, THP is not recommended for
database workloads."

I'll leave this to the committer and switch this patch to Ready for Committer.

By the way, Fedora 27 does disable THP by default, they deviate from
upstream in this regard:

[catalin@fedie scripts]$ cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
[catalin@fedie scripts]$ grep TRANSPARENT /boot/config-4.14.13-300.fc27.x86_64
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_TRANSPARENT_HUGEPAGE=y
# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
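
The compile-time default can be read out of a kernel config listing like the
one above. The following Python sketch is only an illustration with a
hypothetical function name; it assumes the usual "KEY=y" and
"# KEY is not set" line formats. Note that "madvise" means THP is used only
for memory regions that ask for it, not that THP is compiled out.

```python
def thp_compile_time_default(config_text):
    """Return 'always', 'madvise', or None (THP not built in) from kernel config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Lines such as '# CONFIG_X is not set' are comments; skip them.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if value == "y":
            enabled.add(key)
    if "CONFIG_TRANSPARENT_HUGEPAGE" not in enabled:
        return None
    if "CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS" in enabled:
        return "always"
    if "CONFIG_TRANSPARENT_HUGEPAGE_MADVISE" in enabled:
        return "madvise"
    return None


if __name__ == "__main__":
    import platform

    # Typical location on many distributions; it may not exist everywhere.
    with open(f"/boot/config-{platform.release()}") as f:
        print(thp_compile_time_default(f.read()))
```

The runtime value in /sys/kernel/mm/transparent_hugepage/enabled can still be
changed after boot, so this only describes the default the kernel was built
with.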

When I have some time I'll try to do some digging into history of the
Fedora kernel package to see if they provide a rationale for changing
the default. That might hint whether it's likely that future RHEL will
change as well.


Re: Doc tweak for huge_pages?

From
Catalin Iacob
Date:
On Tue, Jan 23, 2018 at 7:13 PM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
> By the way, Fedora 27 does disable THP by default, they deviate from
> upstream in this regard:

> When I have some time I'll try to do some digging into history of the
> Fedora kernel package to see if they provide a rationale for changing
> the default. That might hint whether it's likely that future RHEL will
> change as well.

I see Peter assigned himself as committer, some more information below
for him to decide on the strength of the anti THP message.

commit 9a031d5070d9f8f5916c48637bd0c237cd52eaf9
Author: Josh Boyer <jwboyer@redhat.com>
Date:   Thu Mar 27 18:31:16 2014 -0400

    Switch to CONFIG_TRANSPARENT_HUGEPAGE_MADVISE instead of always on

    The benefit of THP has been somewhat questionable overall for a while,
    and it's been known to cause performance issues with some workloads.
    Upstream also considers it to be overly complicated and really not worth
    it on machines with memory in the amounts found on typical desktops/SMB
    servers.

    Switch to using it via madvise, which most applications that care about
    it should likely already be doing.

Debian 9 also seems to default to madvise instead of always.

Digging more into it, there were changes in the 4.6 kernel (released
May 2016) that should improve THP, more precisely:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=444eb2a449ef36fe115431ed7b71467c4563c7f1

This also led Debian to change their default in September 2017 (so
for the future Debian release) back to always, referencing the 444eb2a
improvements:
https://anonscm.debian.org/cgit/kernel/linux.git/commit/debian/changelog?id=611a8e67260e8b8190ab991206a3867681d6df91

Ben Hutchings <ben@decadent.org.uk>2017-09-29 14:32:09 (GMT)
thp: Enable TRANSPARENT_HUGEPAGE_ALWAYS instead of TRANSPARENT_HUGEPAGE_MADVISE
As advised by Andrea Arcangeli - since commit 444eb2a449ef "mm: thp:
set THP defrag by default to madvise and add a stall-free defrag
option" this will generally be best for performance.

So maybe we should weaken the language against THP. Maybe present the
known facts so far, even if the post 4.6 situation is vague/unknown:
before Linux 4.6 there were repeated reports of THP problems with
Postgres, Linux >= 4.6 might improve things but this isn't confirmed.
And it would be good if somebody could run benchmarks on pre 4.6 and
post 4.6 kernels. I would love to but have no access to big (or
medium) hardware.
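
For anyone sorting benchmark machines into "pre 4.6" and "post 4.6" groups,
a rough Python sketch is below. The function names are hypothetical, and the
version test is only a heuristic: distribution kernels often carry backported
patches, so a release string below 4.6 does not prove the change is absent.

```python
import re


def kernel_version_tuple(release):
    """Extract (major, minor) from a release string such as '4.13.0-26-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_at_least_4_6(release):
    """True if the upstream version number is 4.6 or later."""
    return kernel_version_tuple(release) >= (4, 6)


if __name__ == "__main__":
    import platform

    print(is_at_least_4_6(platform.release()))
```

With the release strings that appear in this thread, the Ubuntu 4.13 and
Fedora 4.14 kernels fall in the "post 4.6" group and the CentOS 7 3.10 kernel
in the "pre 4.6" group, going by version number alone.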


Re: Doc tweak for huge_pages?

From
Justin Pryzby
Date:
On Wed, Jan 24, 2018 at 07:46:41AM +0100, Catalin Iacob wrote:
> I see Peter assigned himself as committer, some more information below
> for him to decide on the strength of the anti THP message.
Thanks for digging this up!

> And it would be good if somebody could run benchmarks on pre 4.6 and
> post 4.6 kernels. I would love to but have no access to big (or
> medium) hardware.
I should be able to do this, since I have a handful of kernel upgrades on my
todo list.  Can you recommend a test?  Otherwise I'll come up with something
for pgbench.

But I think any test should be independent of and not influence the doc change
(I don't know anywhere else in the docs which talks about behaviors of specific
kernel versions, which often have vendor patches backpatched anyway).

> So maybe we should weaken the language against THP. Maybe present the
> known facts so far, even if the post 4.6 situation is vague/unknown:
> before Linux 4.6 there were repeated reports of THP problems with
> Postgres, Linux >= 4.6 might improve things but this isn't confirmed.
> And it would be good if somebody could run benchmarks on pre 4.6 and
> post 4.6 kernels. I would love to but have no access to big (or
> medium) hardware.
I think all the details should go elsewhere in the docs; config.sgml already
references this:
https://www.postgresql.org/docs/current/static/kernel-resources.html#LINUX-HUGE-PAGES
..but it doesn't currently mention "transparent" hugepages.

Justin


Re: Doc tweak for huge_pages?

From
Peter Eisentraut
Date:
On 1/22/18 01:10, Thomas Munro wrote:
> Sorry, right, that was 100% wrong.  It would probably be correct to
> remove the "not", but let's just remove that bit.  New version
> attached.

Committed that.

I reordered some of the existing material because it seemed to have
gotten a bit out of order with repeated patching.

I also softened the advice against THP just a bit, since that is
apparently still changing all the time.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services