Doc tweak for huge_pages?
Hi hackers,
The manual implies that only Linux can use huge pages. That is not
true: FreeBSD, Illumos and probably others support larger page sizes
using transparent page coalescing algorithms. On my FreeBSD box
procstat -v often shows PostgreSQL shared buffers in "S"-flagged
memory. I think we should adjust the manual to make clear that it's
the *explicit request for huge pages* that is supported only on Linux
(and hopefully soon Windows). Am I being too pedantic?
--
Thomas Munro
http://www.enterprisedb.com
Attachments:
huge-pages-doc-tweak.patch (application/octet-stream)
From 85a0675f008b840cb3e91c385aeb5a293095fe89 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Fri, 1 Dec 2017 15:34:23 +1300
Subject: [PATCH] Update documentation to mention huge pages on other OSes.
Currently the docs imply that only Linux can use huge pages. That's not quite
true: it's just that Linux is the only OS where we know how to request them
explicitly.
---
doc/src/sgml/config.sgml | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3060597011d..a911df0ab77 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1363,7 +1363,7 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- Enables/disables the use of huge memory pages. Valid values are
+ Controls whether huge memory pages are requested. Valid values are
<literal>try</literal> (the default), <literal>on</literal>,
and <literal>off</literal>.
</para>
@@ -1371,6 +1371,9 @@ include_dir 'conf.d'
<para>
At present, this feature is supported only on Linux. The setting is
ignored on other systems when set to <literal>try</literal>.
+ Note that some other operating systems including FreeBSD and Illumos
+ support huge pages (also known as "super" pages or "large" pages)
+ automatically without an explicit request from PostgreSQL.
</para>
<para>
--
2.15.0
On Fri, Dec 01, 2017 at 04:01:24PM +1300, Thomas Munro wrote:
Hi hackers,
The manual implies that only Linux can use huge pages. That is not
true: FreeBSD, Illumos and probably others support larger page sizes
using transparent page coalescing algorithms. On my FreeBSD box
procstat -v often shows PostgreSQL shared buffers in "S"-flagged
memory. I think we should adjust the manual to make clear that it's
the *explicit request for huge pages* that is supported only on Linux
(and hopefully soon Windows). Am I being too pedantic?
I suggest removing "other" and including Linux in the enumeration, since it also
supports "transparent" hugepages.
Justin
Attachments:
huge-pages-doc-tweak.patch2 (text/plain; charset=us-ascii)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3060597..98d42e5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1363,7 +1363,7 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- Enables/disables the use of huge memory pages. Valid values are
+ Controls whether huge memory pages are explicitly requested. Valid values are
<literal>try</literal> (the default), <literal>on</literal>,
and <literal>off</literal>.
</para>
@@ -1371,6 +1371,9 @@ include_dir 'conf.d'
<para>
At present, this feature is supported only on Linux. The setting is
ignored on other systems when set to <literal>try</literal>.
+ Note that some operating systems including Linux, FreeBSD and Illumos
+ support huge pages (also known as "super" pages or "large" pages)
+ automatically without an explicit request from PostgreSQL.
</para>
<para>
@@ -1384,7 +1387,7 @@ include_dir 'conf.d'
the server will try to use huge pages, but fall back to using
normal allocation if that fails. With <literal>on</literal>, failure
to use huge pages will prevent the server from starting up. With
- <literal>off</literal>, huge pages will not be used.
+ <literal>off</literal>, huge pages will not be specifically requested.
</para>
</listitem>
</varlistentry>
On Fri, Dec 1, 2017 at 5:04 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Fri, Dec 01, 2017 at 04:01:24PM +1300, Thomas Munro wrote:
Hi hackers,
The manual implies that only Linux can use huge pages. That is not
true: FreeBSD, Illumos and probably others support larger page sizes
using transparent page coalescing algorithms. On my FreeBSD box
procstat -v often shows PostgreSQL shared buffers in "S"-flagged
memory. I think we should adjust the manual to make clear that it's
the *explicit request for huge pages* that is supported only on Linux
(and hopefully soon Windows). Am I being too pedantic?
I suggest removing "other" and including Linux in the enumeration, since it also
supports "transparent" hugepages.
Hmm. Yeah, it does, but apparently it's not so transparent. So if we
mention that we'd better indicate in the same paragraph that you
probably don't actually want to use it. How about the attached?
--
Thomas Munro
http://www.enterprisedb.com
Attachments:
huge-pages-doc-tweak-v2.patch (application/octet-stream)
From 260acb8293b957774eae26aa0374860e28a5a71e Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Fri, 1 Dec 2017 15:34:23 +1300
Subject: [PATCH] Update documentation to mention huge pages on other OSes.
Currently the docs imply that only Linux can use huge pages. That's not quite
true: it's just that Linux is the only OS where we know how to request them
explicitly.
Author: Thomas Munro
Reviewed-By: Justin Pryzby
Discussion: https://postgr.es/m/CAEepm=3qzR-hfjepymohuC4XO5phxoSoipOjm6BEhnJHjNR+jg@mail.gmail.com
---
doc/src/sgml/config.sgml | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3060597011d..c104bb66217 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1363,7 +1363,7 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- Enables/disables the use of huge memory pages. Valid values are
+ Controls whether huge memory pages are requested. Valid values are
<literal>try</literal> (the default), <literal>on</literal>,
and <literal>off</literal>.
</para>
@@ -1371,6 +1371,13 @@ include_dir 'conf.d'
<para>
At present, this feature is supported only on Linux. The setting is
ignored on other systems when set to <literal>try</literal>.
+ Note that some other operating systems including FreeBSD and Illumos
+ can use huge pages (also known as "super" pages or "large" pages)
+ automatically without an explicit request from
+ <productname>PostgreSQL</productname>. Linux
+ also has an optional "transparent huge pages" feature, but its
+ performance has shown to be inferior to that of explicitly requested
+ huge pages on some versions.
</para>
<para>
--
2.15.0
On 11/30/17 23:35, Thomas Munro wrote:
On Fri, Dec 1, 2017 at 5:04 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Fri, Dec 01, 2017 at 04:01:24PM +1300, Thomas Munro wrote:
Hi hackers,
The manual implies that only Linux can use huge pages. That is not
true: FreeBSD, Illumos and probably others support larger page sizes
using transparent page coalescing algorithms. On my FreeBSD box
procstat -v often shows PostgreSQL shared buffers in "S"-flagged
memory. I think we should adjust the manual to make clear that it's
the *explicit request for huge pages* that is supported only on Linux
(and hopefully soon Windows). Am I being too pedantic?
I suggest removing "other" and including Linux in the enumeration, since it also
supports "transparent" hugepages.
Hmm. Yeah, it does, but apparently it's not so transparent. So if we
mention that we'd better indicate in the same paragraph that you
probably don't actually want to use it. How about the attached?
Part of the confusion is that the huge_pages setting is only for shared
memory, whereas the kernel settings affect all memory. Is the same true
for the proposed Windows patch?
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Dec 2, 2017 at 4:08 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
On 11/30/17 23:35, Thomas Munro wrote:
On Fri, Dec 1, 2017 at 5:04 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Fri, Dec 01, 2017 at 04:01:24PM +1300, Thomas Munro wrote:
Hi hackers,
The manual implies that only Linux can use huge pages. That is not
true: FreeBSD, Illumos and probably others support larger page sizes
using transparent page coalescing algorithms. On my FreeBSD box
procstat -v often shows PostgreSQL shared buffers in "S"-flagged
memory. I think we should adjust the manual to make clear that it's
the *explicit request for huge pages* that is supported only on Linux
(and hopefully soon Windows). Am I being too pedantic?
I suggest removing "other" and including Linux in the enumeration, since it also
supports "transparent" hugepages.
Hmm. Yeah, it does, but apparently it's not so transparent. So if we
mention that we'd better indicate in the same paragraph that you
probably don't actually want to use it. How about the attached?
Part of the confusion is that the huge_pages setting is only for shared
memory, whereas the kernel settings affect all memory.
Right. And more specifically, just the main shared memory area, not
DSM segments. Updated to make this point.
(I have wondered whether DSM segments should respect this GUC: it
seems plausible that they should when the size is a multiple of the
huge page size, so that very large DSA areas finish up mostly backed
by huge pages, so that very large shared hash tables would benefit
from lower TLB miss rates. I have only read in an academic paper that
this is a good idea; I haven't investigated whether that would really
help us in practice, and the first problem is that Linux shm_open
doesn't support huge pages anyway, so you'd need one of the other DSM
implementation options which are currently non-default.)
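The size condition mentioned above (a segment whose size is a multiple of the huge page size) can be illustrated with a small Python sketch. The 2 MiB page size and both function names are assumptions made for illustration; this is not how PostgreSQL's DSM code is written.

```python
# Hypothetical helpers illustrating "size is a multiple of the huge page size".
# The 2 MiB default is an assumption (a common huge page size on x86-64).
HUGE_PAGE_SIZE = 2 * 1024 * 1024


def round_up_to_huge_page(size: int, huge_page_size: int = HUGE_PAGE_SIZE) -> int:
    """Round a requested size up to a whole number of huge pages."""
    if size < 0 or huge_page_size <= 0:
        raise ValueError("size must be >= 0 and huge_page_size must be > 0")
    # Ceiling division without floating point.
    return -(-size // huge_page_size) * huge_page_size


def is_huge_page_multiple(size: int, huge_page_size: int = HUGE_PAGE_SIZE) -> bool:
    """True if a non-empty segment could be backed entirely by huge pages."""
    return size > 0 and size % huge_page_size == 0
```

A segment that fails the second check would have to be padded (using the first helper) or left on normal pages.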
Is the same true
for the proposed Windows patch?
Yes. It adds a flag to the request for the main shared memory area
(after jumping through various permissions hoops).
--
Thomas Munro
http://www.enterprisedb.com
Attachments:
huge-pages-doc-tweak-v3.patch (application/octet-stream)
From 8d2e69f114f3aa99d314a997a0b6e734274db77a Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Fri, 1 Dec 2017 15:34:23 +1300
Subject: [PATCH] Update documentation to mention huge pages on other OSes.
Currently the docs imply that only Linux can use huge pages. That's not quite
true: it's just that Linux is the only OS where we know how to request them
explicitly.
Author: Thomas Munro
Reviewed-By: Justin Pryzby, Peter Eisentraut
Discussion: https://postgr.es/m/CAEepm=3qzR-hfjepymohuC4XO5phxoSoipOjm6BEhnJHjNR+jg@mail.gmail.com
---
doc/src/sgml/config.sgml | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3060597011d..8b6a8beb0a3 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1363,14 +1363,21 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- Enables/disables the use of huge memory pages. Valid values are
- <literal>try</literal> (the default), <literal>on</literal>,
- and <literal>off</literal>.
+ Controls whether huge memory pages are requested for the main shared
+ memory area. Valid values are <literal>try</literal> (the default),
+ <literal>on</literal>, and <literal>off</literal>.
</para>
<para>
At present, this feature is supported only on Linux. The setting is
ignored on other systems when set to <literal>try</literal>.
+ Note that some other operating systems including FreeBSD and Illumos
+ can use huge pages (also known as "super" pages or "large" pages)
+ automatically without an explicit request from
+ <productname>PostgreSQL</productname>. Linux
+ also has an optional "transparent huge pages" feature, but its
+ performance has shown to be inferior to that of explicitly requested
+ huge pages on some versions.
</para>
<para>
--
2.15.0
On 12/01/2017 05:35 AM, Thomas Munro wrote:
since it also
supports "transparent" hugepages.
Hmm. Yeah, it does, but apparently it's not so transparent.
+1. We saw a performance drop with transparent_hugepage enabled on a server with
more than 256GB RAM. Access to the cache slowed down when the kernel tried to
defragment pages.
When that happened, we saw the function isolate_freepages_block among the most
expensive functions in perf top.
Thanks to Marc Cousin's analysis, writing "madvise" to
/sys/kernel/mm/transparent_hugepage/enabled solved the problem. He also noticed
that THP only works for "anonymous memory mappings"[1] (shared_buffers are not
anonymous).
1: https://www.kernel.org/doc/Documentation/vm/transhuge.txt
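As a rough illustration of checking that setting, here is a minimal Python sketch that reads the sysfs file named above and extracts the active mode, which the kernel marks with square brackets. The function names are hypothetical, and the sketch only reads the file; changing the value requires root and is not shown.

```python
from pathlib import Path
from typing import Optional

# Path taken from the message above; present only on Linux kernels built with THP.
THP_ENABLED_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text: str) -> str:
    """Return the bracketed (active) value, e.g. 'madvise' from 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def current_thp_mode(path: Path = THP_ENABLED_PATH) -> Optional[str]:
    """Read the active THP mode, or None if the file does not exist."""
    try:
        return parse_thp_mode(path.read_text())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    print(current_thp_mode())
```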
Regards,
--
Adrien NAYRAT
On Mon, Dec 04, 2017 at 09:48:44AM +0100, Adrien Nayrat wrote:
On 12/01/2017 05:35 AM, Thomas Munro wrote:
since it also supports "transparent" hugepages.
Hmm. Yeah, it does, but apparently it's not so transparent.
+1. We saw a performance drop with transparent_hugepage enabled on a server with
more than 256GB RAM. Access to the cache slowed down when the kernel tried to
defragment pages.
When that happened, we saw the function isolate_freepages_block among the most
expensive functions in perf top.
Thanks to Marc Cousin's analysis, writing "madvise" to
/sys/kernel/mm/transparent_hugepage/enabled solved the problem. He also noticed
that THP only works for "anonymous memory mappings"[1] (shared_buffers are not
anonymous).
Note that PostgreSQL since 9.3 can use, and prefers, anonymous mmap for shared
buffers instead of SysV shm.
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b0fc0df9364d2d2d17c0162cf3b8b59f6cb09f67
./src/backend/port/sysv_shmem.c and
src/include/portability/mem.h:#define PG_MMAP_FLAGS (MAP_SHARED|MAP_ANONYMOUS|MAP_HASSEMAPHORE)
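For readers unfamiliar with this kind of mapping, here is a minimal Python sketch of a shared anonymous mmap using the same two core flags as PG_MMAP_FLAGS. It is only an illustration on a Unix-like system, not PostgreSQL's code: the function name is hypothetical, and MAP_HASSEMAPHORE is left out.

```python
import mmap


def make_anonymous_shared_region(size: int) -> mmap.mmap:
    """Create a shared, anonymous (not file-backed) mapping of `size` bytes."""
    # fileno=-1 means the mapping is not backed by any file.
    return mmap.mmap(
        -1,
        size,
        flags=mmap.MAP_SHARED | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )


if __name__ == "__main__":
    region = make_anonymous_shared_region(mmap.PAGESIZE)
    region[:5] = b"hello"
    print(region[:5])
    region.close()
```

Because the mapping is MAP_SHARED, a child process created by fork after the mapping exists sees the same pages as its parent.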
Justin
On Fri, Dec 1, 2017 at 10:09 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
On 11/30/17 23:35, Thomas Munro wrote:
Hmm. Yeah, it does, but apparently it's not so transparent. So if we
mention that we'd better indicate in the same paragraph that you
probably don't actually want to use it. How about the attached?
Here's a review for v3.
I find that the first paragraph is an improvement as it's more precise.
What I didn't like about the second paragraph is that it pointed out
Linux transparent huge pages too favorably while they are actually
known to cause big (huge?, pardon the pun) issues (as witnessed in
this thread as well). v3 basically says "in Linux it can be
transparent or explicit and explicit is faster than transparent".
Reading that, and seeing that explicit needs tweaking of kernel
parameters and so on, one might very well conclude "I'll use the
slightly-slower-but-still-better-than-nothing transparent version".
So I tried to redo the second paragraph and ended up with the
attached. Rationale for the changes:
* changed "this feature" to "explicitly requesting huge pages" to
contrast with the automatic one described below
* made the wording of Linux THP more negative (but still with some
wiggle room for future kernel versions which might improve THP),
contrasting with the positive explicit request from this GUC
* integrated your mention of other OSes with automatic huge pages
* moved the new text to the last paragraph to lower its importance
What do you think?
Attachments:
huge-pages-doc-tweak-v3-alternative.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e4a01699e4..b6b309a943 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1363,14 +1363,15 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- Enables/disables the use of huge memory pages. Valid values are
- <literal>try</literal> (the default), <literal>on</literal>,
- and <literal>off</literal>.
+ Controls whether huge memory pages are requested for the main shared
+ memory area. Valid values are <literal>try</literal> (the default),
+ <literal>on</literal>, and <literal>off</literal>.
</para>
<para>
- At present, this feature is supported only on Linux. The setting is
- ignored on other systems when set to <literal>try</literal>.
+ At present, explicitly requesting huge pages is supported only on
+ Linux. The setting is ignored on other systems when set to
+ <literal>try</literal>.
</para>
<para>
@@ -1386,6 +1387,18 @@ include_dir 'conf.d'
to use huge pages will prevent the server from starting up. With
<literal>off</literal>, huge pages will not be used.
</para>
+
+ <para>
+ Note that, besides explicitly requesting huge pages via
+ <varname>huge_pages</varname>, operating systems including Linux,
+ FreeBSD and Illumos can also use huge pages (sometimes known as "super"
+ pages or "large" pages) automatically, without an explicit request from
+ <productname>PostgreSQL</productname>. In Linux this automatic use is
+ called "transparent huge pages" but, for some Linux kernel versions,
+ transparent huge pages are known to cause performance degradation with
+ <productname>PostgreSQL</productname> so, unlike
+ <varname>huge_pages</varname>, their use is discouraged.
+ </para>
</listitem>
</varlistentry>
On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
On Fri, Dec 1, 2017 at 10:09 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
On 11/30/17 23:35, Thomas Munro wrote:
Hmm. Yeah, it does, but apparently it's not so transparent. So if we
mention that we'd better indicate in the same paragraph that you
probably don't actually want to use it. How about the attached?
Here's a review for v3.
Thanks!
I find that the first paragraph is an improvement as it's more precise.
What I didn't like about the second paragraph is that it pointed out
Linux transparent huge pages too favorably while they are actually
known to cause big (huge?, pardon the pun) issues (as witnessed in
this thread as well). v3 basically says "in Linux it can be
transparent or explicit and explicit is faster than transparent".
Reading that, and seeing that explicit needs tweaking of kernel
parameters and so on, one might very well conclude "I'll use the
slightly-slower-but-still-better-than-nothing transparent version".
So I tried to redo the second paragraph and ended up with the
attached. Rationale for the changes:
* changed "this feature" to "explicitly requesting huge pages" to
contrast with the automatic one described below
* made the wording of Linux THP more negative (but still with some
wiggle room for future kernel versions which might improve THP),
contrasting with the positive explicit request from this GUC
* integrated your mention of other OSes with automatic huge pages
* moved the new text to the last paragraph to lower its importance
What do you think?
I don't know enough about this to make such a strong recommendation
myself, which is why I was only trying to report that bad performance
had been observed on some version, not that you shouldn't do it. Any
other views on this stronger statement?
--
Thomas Munro
http://www.enterprisedb.com
On 12/1/17 10:08, Peter Eisentraut wrote:
Part of the confusion is that the huge_pages setting is only for shared
memory, whereas the kernel settings affect all memory. Is the same true
for the proposed Windows patch?
Btw., I'm kind of hoping that the Windows patch would be committed
first, so that we don't have to rephrase this whole thing again after that.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
So I tried to redo the second paragraph and ended up with the
attached. Rationale for the changes:
* changed "this feature" to "explicitly requesting huge pages" to
contrast with the automatic one described below
* made the wording of Linux THP more negative (but still with some
wiggle room for future kernel versions which might improve THP),
contrasting with the positive explicit request from this GUC
* integrated your mention of other OSes with automatic huge pages
* moved the new text to the last paragraph to lower its importance
What do you think?
I don't know enough about this to make such a strong recommendation
myself, which is why I was only trying to report that bad performance
had been observed on some version, not that you shouldn't do it. Any
other views on this stronger statement?
Now that the Windows huge pages patch has landed, here is a rebase. I
took your alternative and tweaked it a tiny bit more. Thoughts?
--
Thomas Munro
http://www.enterprisedb.com
Attachments:
huge-pages-doc-tweak-v4.patch (application/octet-stream)
From 3e52a92a71d9a0f46846c515105c562496a09a18 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Mon, 22 Jan 2018 15:30:56 +1300
Subject: [PATCH] Update documentation to mention huge pages on other OSes.
Previously the docs implied that only Linux and Windows could use huge pages.
That's not quite true: it's just that we only know how to request them
explicitly on those OSes. Be more explicit about what huge_pages really does
and mention that some OSes may use huge pages automatically.
Author: Thomas Munro and Catalin Iacob
Reviewed-By: Justin Pryzby, Peter Eisentraut
Discussion: https://postgr.es/m/CAEepm=3qzR-hfjepymohuC4XO5phxoSoipOjm6BEhnJHjNR+jg@mail.gmail.com
---
doc/src/sgml/config.sgml | 34 ++++++++++++++++++++++++----------
1 file changed, 24 insertions(+), 10 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index cc156c6385e..a7a8a765973 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1363,14 +1363,15 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- Enables/disables the use of huge memory pages. Valid values are
- <literal>try</literal> (the default), <literal>on</literal>,
- and <literal>off</literal>.
+ Controls whether huge pages are requested for the main shared memory
+ area. Valid values are <literal>try</literal> (the default),
+ <literal>on</literal>, and <literal>off</literal>.
</para>
<para>
- At present, this feature is supported only on Linux and Windows. The
- setting is ignored on other systems when set to <literal>try</literal>.
+ At present, explicitly requesting huge pages is supported only on
+ Linux and Windows. The setting is ignored on other systems when set
+ to <literal>try</literal>.
</para>
<para>
@@ -1392,11 +1393,24 @@ include_dir 'conf.d'
</para>
<para>
- With <varname>huge_pages</varname> set to <literal>try</literal>,
- the server will try to use huge pages, but fall back to using
- normal allocation if that fails. With <literal>on</literal>, failure
- to use huge pages will prevent the server from starting up. With
- <literal>off</literal>, huge pages will not be used.
+ With <varname>huge_pages</varname> set to <literal>try</literal>, the
+ server will try to request huge pages, but fall back to the default if
+ that fails. With <literal>on</literal>, failure to request huge pages
+ will prevent the server from starting up. With
+ <literal>off</literal>, huge pages will not be requested.
+ </para>
+
+ <para>
+ Note that, besides explicitly requesting huge pages via
+ <varname>huge_pages</varname>, operating systems including Linux,
+ FreeBSD and Illumos can also use huge pages (also known as "super"
+ pages or "large" pages) automatically, without an explicit request from
+ <productname>PostgreSQL</productname>. In Linux this automatic use is
+ called "transparent huge pages" and is not enabled by default in
+ popular distributions as of the time of writing, but since transparent
+ huge pages are known to cause performance degradation with
+ <productname>PostgreSQL</productname> on current Linux versions (unlike
+ explicit use of <varname>huge_pages</varname>), their use is discouraged.
</para>
</listitem>
</varlistentry>
--
2.15.1
On Mon, Jan 22, 2018 at 03:54:26PM +1300, Thomas Munro wrote:
On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
I don't know enough about this to make such a strong recommendation
myself, which is why I was only trying to report that bad performance
had been observed on some version, not that you shouldn't do it. Any
other views on this stronger statement?
Now that the Windows huge pages patch has landed, here is a rebase. I
took your alternative and tweaked it a tiny bit more. Thoughts?
+ <para>
+ Note that, besides explicitly requesting huge pages via
+ <varname>huge_pages</varname>,
=> I would just say:
"Note that, besides huge pages requested explicitly, ..."
+ In Linux this automatic use is
=> ON Linux comma?
+ called "transparent huge pages" and is not enabled by default in
+ popular distributions as of the time of writing, but since transparent
=> really ? I don't know if I've ever seen it not enabled. In any case,
that's a strong statement to make (to be disabled in ALL popular distributions).
I checked all our servers, including centos6 and ubuntu t-LTS and x-LTS. On a
limited few where it was disabled, I'd explicitly done so.
On a server on which I just installed ubuntu-x LTS, with 4.13.0-26-generic:
pryzbyj@gta-ubuntu:~$ cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
https://github.com/torvalds/linux/commit/13ece886d99cd668483113f7238e419d5331af26
=> the compile time default is to disable, but (if enabled at compile time),
the runtime default is "always".
On centos7
Linux template0 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Jan 4 01:06:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
$ grep TRANS /boot/config-3.10.0-693.11.6.el7.x86_64
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
https://blog.nelhage.com/post/transparent-hugepages/
=> It is enabled (”enabled=always”) by default in most Linux distributions.
Justin
On Mon, Jan 22, 2018 at 6:30 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, Jan 22, 2018 at 03:54:26PM +1300, Thomas Munro wrote:
On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
I don't know enough about this to make such a strong recommendation
myself, which is why I was only trying to report that bad performance
had been observed on some version, not that you shouldn't do it. Any
other views on this stronger statement?
Now that the Windows huge pages patch has landed, here is a rebase. I
took your alternative and tweaked it a tiny bit more. Thoughts?
+ <para>
+ Note that, besides explicitly requesting huge pages via
+ <varname>huge_pages</varname>,
=> I would just say:
"Note that, besides huge pages requested explicitly, ..."
+1
+ In Linux this automatic use is
=> ON Linux comma?
+1
+ called "transparent huge pages" and is not enabled by default in
+ popular distributions as of the time of writing, but since transparent
=> really ? I don't know if I've ever seen it not enabled. In any case,
that's a strong statement to make (to be disabled in ALL popular distributions).
Argh.
https://blog.nelhage.com/post/transparent-hugepages/
=> It is enabled (”enabled=always”) by default in most Linux distributions.
Sorry, right, that was 100% wrong. It would probably be correct to
remove the "not", but let's just remove that bit. New version
attached.
Thanks.
--
Thomas Munro
http://www.enterprisedb.com
Attachments:
huge-pages-doc-tweak-v5.patch (application/octet-stream)
From 23af71f093c2ee6c0b1aa8f02bcfd43c0d492d55 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Mon, 22 Jan 2018 15:30:56 +1300
Subject: [PATCH] Update documentation to mention huge pages on other OSes.
Previously the docs implied that only Linux and Windows could use huge pages.
That's not quite true: it's just that we only know how to request them
explicitly on those OSes. Be more explicit about what huge_pages really does
and mention that some OSes may use huge pages automatically.
Author: Thomas Munro and Catalin Iacob
Reviewed-By: Justin Pryzby, Peter Eisentraut
Discussion: https://postgr.es/m/CAEepm=3qzR-hfjepymohuC4XO5phxoSoipOjm6BEhnJHjNR+jg@mail.gmail.com
---
doc/src/sgml/config.sgml | 34 ++++++++++++++++++++++++----------
1 file changed, 24 insertions(+), 10 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index cc156c6385e..662b2a98b07 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1363,14 +1363,15 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- Enables/disables the use of huge memory pages. Valid values are
- <literal>try</literal> (the default), <literal>on</literal>,
- and <literal>off</literal>.
+ Controls whether huge pages are requested for the main shared memory
+ area. Valid values are <literal>try</literal> (the default),
+ <literal>on</literal>, and <literal>off</literal>.
</para>
<para>
- At present, this feature is supported only on Linux and Windows. The
- setting is ignored on other systems when set to <literal>try</literal>.
+ At present, explicitly requesting huge pages is supported only on
+ Linux and Windows. The setting is ignored on other systems when set
+ to <literal>try</literal>.
</para>
<para>
@@ -1392,11 +1393,24 @@ include_dir 'conf.d'
</para>
<para>
- With <varname>huge_pages</varname> set to <literal>try</literal>,
- the server will try to use huge pages, but fall back to using
- normal allocation if that fails. With <literal>on</literal>, failure
- to use huge pages will prevent the server from starting up. With
- <literal>off</literal>, huge pages will not be used.
+ With <varname>huge_pages</varname> set to <literal>try</literal>, the
+ server will try to request huge pages, but fall back to the default if
+ that fails. With <literal>on</literal>, failure to request huge pages
+ will prevent the server from starting up. With
+ <literal>off</literal>, huge pages will not be requested.
+ </para>
+
+ <para>
+ Note that, besides huge pages requested explicitly, operating systems
+ including Linux, FreeBSD and Illumos can also use huge pages (also
+ known as "super" pages or "large" pages) automatically, without an
+ explicit request from
+ <productname>PostgreSQL</productname>. On Linux, this is called
+ "transparent huge pages", but since that feature is known to cause
+ performance degradation with
+ <productname>PostgreSQL</productname> on current Linux versions
+ (unlike explicit use of <varname>huge_pages</varname>), its use is
+ discouraged.
</para>
</listitem>
</varlistentry>
--
2.15.1
On Mon, Jan 22, 2018 at 07:10:33PM +1300, Thomas Munro wrote:
On Mon, Jan 22, 2018 at 6:30 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, Jan 22, 2018 at 03:54:26PM +1300, Thomas Munro wrote:
On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
I don't know enough about this to make such a strong recommendation
myself, which is why I was only trying to report that bad performance
had been observed on some version, not that you shouldn't do it. Any
other views on this stronger statement?
Now that the Windows huge pages patch has landed, here is a rebase. I
took your alternative and tweaked it a tiny bit more. Thoughts?
Sorry, right, that was 100% wrong. It would probably be correct to
remove the "not", but let's just remove that bit. New version
attached.
+ <productname>PostgreSQL</productname>. On Linux, this is called
+ "transparent huge pages", but since that feature is known to cause
+ performance degradation with
+ <productname>PostgreSQL</productname> on current Linux versions
+ (unlike explicit use of <varname>huge_pages</varname>), its use is
+ discouraged.
Consider this shorter, less-severe sounding alternative:
"... (but note that this feature can degrade performance of some
<productname>PostgreSQL</productname> workloads)."
Justin
On Mon, Jan 22, 2018 at 7:23 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
Consider this shorter, less-severe sounding alternative:
"... (but note that this feature can degrade performance of some
<productname>PostgreSQL</productname> workloads)."
I think the patch looks good now.
As Justin mentions, as far as I see the only arguable piece is how
strong the language should be against Linux THP.
On one hand it can be argued that warning about THP issues is not the
job of this patch. On the other hand this patch does say more about
THP and Googling does bring up a lot of trouble and advice to disable
THP, including:
/messages/by-id/CANQNgOrD02f8mR3Y8Pi=zFsoL14RqNQA8hwz1r4rSnDLr1b2Cw@mail.gmail.com
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/s-memory-transhuge
The Red Hat article above says "However, THP is not recommended for
database workloads."
I'll leave this to the committer and switch this patch to Ready for Committer.
By the way, Fedora 27 does disable THP by default, they deviate from
upstream in this regard:
[catalin@fedie scripts]$ cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
[catalin@fedie scripts]$ grep TRANSPARENT /boot/config-4.14.13-300.fc27.x86_64
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_TRANSPARENT_HUGEPAGE=y
# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
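The same check can be scripted. The sketch below assumes the sysfs file has the format shown in the session above, with the active mode in square brackets; the helper names `active_thp_mode` and `read_thp_mode` are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Path taken from the shell session above.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text: str) -> str:
    """Return the bracketed (currently selected) mode from the sysfs line."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"no bracketed mode found in {text!r}")
    return match.group(1)


def read_thp_mode(path: Path = THP_ENABLED) -> Optional[str]:
    """Read the active THP mode, or None if the file cannot be read
    (for example on a non-Linux system or a kernel built without THP)."""
    try:
        return active_thp_mode(path.read_text())
    except OSError:
        return None
```

For the Fedora 27 output quoted above, `active_thp_mode("always [madvise] never")` returns `"madvise"`.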
When I have some time I'll try to do some digging into history of the
Fedora kernel package to see if they provide a rationale for changing
the default. That might hint whether it's likely that future RHEL will
change as well.
On Tue, Jan 23, 2018 at 7:13 PM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
By the way, Fedora 27 does disable THP by default, they deviate from
upstream in this regard:
When I have some time I'll try to do some digging into history of the
Fedora kernel package to see if they provide a rationale for changing
the default. That might hint whether it's likely that future RHEL will
change as well.
I see Peter assigned himself as committer, some more information below
for him to decide on the strength of the anti THP message.
commit 9a031d5070d9f8f5916c48637bd0c237cd52eaf9
Author: Josh Boyer <jwboyer@redhat.com>
Date: Thu Mar 27 18:31:16 2014 -0400
Switch to CONFIG_TRANSPARENT_HUGEPAGE_MADVISE instead of always on
The benefit of THP has been somewhat questionable overall for a while,
and it's been known to cause performance issues with some workloads.
Upstream also considers it to be overly complicated and really not worth
it on machines with memory in the amounts found on typical desktops/SMB
servers.
Switch to using it via madvise, which most applications that care about
it should likely already be doing.
Debian 9 also seems to default to madvise instead of always.
Digging more into it, there were changes in the 4.6 kernel (released
May 2016) that should improve THP, more precisely:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=444eb2a449ef36fe115431ed7b71467c4563c7f1
This also led Debian to change their default in September 2017 (so
for the future Debian release) back to always, referencing the 444eb2a
improvements:
https://anonscm.debian.org/cgit/kernel/linux.git/commit/debian/changelog?id=611a8e67260e8b8190ab991206a3867681d6df91
Ben Hutchings <ben@decadent.org.uk>2017-09-29 14:32:09 (GMT)
thp: Enable TRANSPARENT_HUGEPAGE_ALWAYS instead of TRANSPARENT_HUGEPAGE_MADVISE
As advised by Andrea Arcangeli - since commit 444eb2a449ef "mm: thp:
set THP defrag by default to madvise and add a stall-free defrag
option" this will generally be best for performance.
So maybe we should weaken the language against THP. Maybe present the
known facts so far, even if the post 4.6 situation is vague/unknown:
before Linux 4.6 there were repeated reports of THP problems with
Postgres, Linux >= 4.6 might improve things but this isn't confirmed.
And it would be good if somebody could run benchmarks on pre 4.6 and
post 4.6 kernels. I would love to but have no access to big (or
medium) hardware.
On Wed, Jan 24, 2018 at 07:46:41AM +0100, Catalin Iacob wrote:
I see Peter assigned himself as committer, some more information below
for him to decide on the strength of the anti THP message.
Thanks for digging this up!
And it would be good if somebody could run benchmarks on pre 4.6 and
post 4.6 kernels. I would love to but have no access to big (or
medium) hardware.
I should be able to do this, since I have a handful of kernel upgrades on my
todo list. Can you recommend a test? Otherwise I'll come up with something
for pgbench.
But I think any test should be independent of and not influence the doc change
(I don't know anywhere else in the docs which talks about behaviors of specific
kernel versions, which often have vendor patches backpatched anyway).
So maybe we should weaken the language against THP. Maybe present the
known facts so far, even if the post 4.6 situation is vague/unknown:
before Linux 4.6 there were repeated reports of THP problems with
Postgres, Linux >= 4.6 might improve things but this isn't confirmed.
I think all the details should go elsewhere in the docs; config.sgml already
references this:
https://www.postgresql.org/docs/current/static/kernel-resources.html#LINUX-HUGE-PAGES
...but it doesn't currently mention "transparent" hugepages.
Justin
On 1/22/18 01:10, Thomas Munro wrote:
Sorry, right, that was 100% wrong. It would probably be correct to
remove the "not", but let's just remove that bit. New version
attached.
Committed that.
I reordered some of the existing material because it seemed to have
gotten a bit out of order with repeated patching.
I also softened the advice against THP just a bit, since that is
apparently still changing all the time.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services