Thread: Improve logging when using Huge Pages
Hi Hackers, In the current version, when GUC huge_pages=try, which is the default setting, no log is output regardless of the successor failure of the HugePages acquisition. If you want to output logs, you need to set log_min_messages=DEBUG3, butit will output a huge amount of extra logs. With huge_pages=try setting, if the kernel parameter vm.nr_hugepages is not enough, the administrator will not notice thatHugePages is not being used. I think it should output a log if HugePages was not available. By the way, in MySQL with almost the same architecture, the following log is output at the Warning level. [Warning] [MY-012677] [InnoDB] Failed to allocate 138412032 bytes. errno 1 [Warning] [MY-012679] [InnoDB] Using conventional memory pool The attached small patch outputs a log at the WARNING level when huge_pages = try and if the acquisition of HugePages fails. Regards, Noriyoshi Shinoda
Attachment
On Tue, Aug 31, 2021 at 1:37 PM Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com> wrote: > > In the current version, when GUC huge_pages=try, which is the default setting, no log is output regardless of the successor failure of the HugePages acquisition. If you want to output logs, you need to set log_min_messages=DEBUG3, butit will output a huge amount of extra logs. > With huge_pages=try setting, if the kernel parameter vm.nr_hugepages is not enough, the administrator will not notice thatHugePages is not being used. > I think it should output a log if HugePages was not available. I agree that the message should be promoted to a higher level. But I think we should also make that information available at the SQL level, as the log files may be truncated / rotated before you need the info, and it can be troublesome to find the information at the OS level, if you're lucky enough to have OS access.
On 2021/08/31 15:57, Julien Rouhaud wrote: > On Tue, Aug 31, 2021 at 1:37 PM Shinoda, Noriyoshi (PN Japan FSIP) > <noriyoshi.shinoda@hpe.com> wrote: >> >> In the current version, when GUC huge_pages=try, which is the default setting, no log is output regardless of the successor failure of the HugePages acquisition. If you want to output logs, you need to set log_min_messages=DEBUG3, butit will output a huge amount of extra logs. >> With huge_pages=try setting, if the kernel parameter vm.nr_hugepages is not enough, the administrator will not noticethat HugePages is not being used. >> I think it should output a log if HugePages was not available. +1 - elog(DEBUG1, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %m", + elog(WARNING, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %m", elog() should be used only for internal errors and low-level debug logging. So per your proposal, elog() is not suitable here. Instead, ereport() should be used. The log level should be LOG rather than WARNING because this message indicates the information about server activity that administrators are interested in. The message should be updated so that it follows the Error Message Style Guide. https://www.postgresql.org/docs/devel/error-style-guide.html With huge_pages=on, if shared memory fails to be allocated, the error message is reported currently. Even with huge_page=try, this error message should be used to simplify the code as follows? errno = mmap_errno; - ereport(FATAL, + ereport((huge_pages == HUGE_PAGES_ON) ? FATAL : LOG, (errmsg("could not map anonymous shared memory: %m"), (mmap_errno == ENOMEM) ? errhint("This error usually means that PostgreSQL's request " > I agree that the message should be promoted to a higher level. But I > think we should also make that information available at the SQL level, > as the log files may be truncated / rotated before you need the info, > and it can be troublesome to find the information at the OS level, if > you're lucky enough to have OS access. +1 Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Fujii-san, Julien-san Thank you very much for your comment. I followed your comment and changed the elog function to ereport function and also changed the log level. The output messageis the same as in the case of non-HugePages memory acquisition failure.I did not simplify the error messages as itwould have complicated the response to the preprocessor. > I agree that the message should be promoted to a higher level. But I > think we should also make that information available at the SQL level, > as the log files may be truncated / rotated before you need the info, > and it can be troublesome to find the information at the OS level, if > you're lucky enough to have OS access. In the attached patch, I have added an Internal GUC 'using_huge_pages' to know that it is using HugePages. This parameterwill be True only if the instance is using HugePages. Regards, Noriyoshi Shinoda -----Original Message----- From: Fujii Masao [mailto:masao.fujii@oss.nttdata.com] Sent: Wednesday, September 1, 2021 2:06 AM To: Julien Rouhaud <rjuju123@gmail.com>; Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com> Cc: pgsql-hackers@postgresql.org Subject: Re: Improve logging when using Huge Pages On 2021/08/31 15:57, Julien Rouhaud wrote: > On Tue, Aug 31, 2021 at 1:37 PM Shinoda, Noriyoshi (PN Japan FSIP) > <noriyoshi.shinoda@hpe.com> wrote: >> >> In the current version, when GUC huge_pages=try, which is the default setting, no log is output regardless of the successor failure of the HugePages acquisition. If you want to output logs, you need to set log_min_messages=DEBUG3, butit will output a huge amount of extra logs. >> With huge_pages=try setting, if the kernel parameter vm.nr_hugepages is not enough, the administrator will not noticethat HugePages is not being used. >> I think it should output a log if HugePages was not available. +1 - elog(DEBUG1, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %m", + elog(WARNING, "mmap(%zu) with MAP_HUGETLB failed, huge pages +disabled: %m", elog() should be used only for internal errors and low-level debug logging. So per your proposal, elog() is not suitable here. Instead, ereport() should be used. The log level should be LOG rather than WARNING because this message indicates the information about server activity thatadministrators are interested in. The message should be updated so that it follows the Error Message Style Guide. https://www.postgresql.org/docs/devel/error-style-guide.html With huge_pages=on, if shared memory fails to be allocated, the error message is reported currently. Even with huge_page=try,this error message should be used to simplify the code as follows? errno = mmap_errno; - ereport(FATAL, + ereport((huge_pages == HUGE_PAGES_ON) ? FATAL : LOG, (errmsg("could not map anonymous shared memory: %m"), (mmap_errno == ENOMEM) ? errhint("This error usually means that PostgreSQL's request " > I agree that the message should be promoted to a higher level. But I > think we should also make that information available at the SQL level, > as the log files may be truncated / rotated before you need the info, > and it can be troublesome to find the information at the OS level, if > you're lucky enough to have OS access. +1 Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
Hello. At Fri, 3 Sep 2021 06:28:58 +0000, "Shinoda, Noriyoshi (PN Japan FSIP)" <noriyoshi.shinoda@hpe.com> wrote in > Fujii-san, Julien-san > > Thank you very much for your comment. > I followed your comment and changed the elog function to ereport function and also changed the log level. The output messageis the same as in the case of non-HugePages memory acquisition failure.I did not simplify the error messages as itwould have complicated the response to the preprocessor. > > > I agree that the message should be promoted to a higher level. But I > > think we should also make that information available at the SQL level, > > as the log files may be truncated / rotated before you need the info, > > and it can be troublesome to find the information at the OS level, if > > you're lucky enough to have OS access. > > In the attached patch, I have added an Internal GUC 'using_huge_pages' to know that it is using HugePages. This parameterwill be True only if the instance is using HugePages. IF you are thinking to show that in GUC, you might want to look into the nearby thread [1], especially about the behavior when invoking postgres -C using_huge_pages. (Even though the word "using" in the name may suggest that the server is running, but I don't think it is neat that the variable shows "no" by the command but shows "yes" while the same server is running.) I have some comment about the patch. - if (huge_pages == HUGE_PAGES_TRY && ptr == MAP_FAILED) - elog(DEBUG1, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %m", - allocsize); + if (ptr != MAP_FAILED) + using_huge_pages = true; + else if (huge_pages == HUGE_PAGES_TRY) + ereport(LOG, + (errmsg("could not map anonymous shared memory: %m"), + (mmap_errno == ENOMEM) ? + errhint("This error usually means that PostgreSQL's request " If we set huge_pages to try and postgres falled back to regular pages, it emits a large message relative to its importance. The user specifed that "I'd like to use huge pages, but it's ok if not available.", so I think the message should be far smaller. Maybe just raising the DEBUG1 message to LOG along with moving to ereport might be sufficient. - elog(DEBUG1, "CreateFileMapping(%zu) with SEC_LARGE_PAGES failed, " - "huge pages disabled", - size); + ereport(LOG, + (errmsg("could not create shared memory segment: error code %lu", GetLastError()), + errdetail("Failed system call was CreateFileMapping(size=%zu, name=%s).", + size, szShareMem))); It doesn't seem to be a regular user-facing message. Isn't it sufficient just to raise the log level to LOG? [1] https://www.postgresql.org/message-id/20210903.141206.103927759882272221.horikyota.ntt%40gmail.com
On 2021/09/03 16:49, Kyotaro Horiguchi wrote: > IF you are thinking to show that in GUC, you might want to look into > the nearby thread [1] Yes, let's discuss this feature at that thread. > I have some comment about the patch. > > - if (huge_pages == HUGE_PAGES_TRY && ptr == MAP_FAILED) > - elog(DEBUG1, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %m", > - allocsize); > + if (ptr != MAP_FAILED) > + using_huge_pages = true; > + else if (huge_pages == HUGE_PAGES_TRY) > + ereport(LOG, > + (errmsg("could not map anonymous shared memory: %m"), > + (mmap_errno == ENOMEM) ? > + errhint("This error usually means that PostgreSQL's request " > > If we set huge_pages to try and postgres falled back to regular pages, > it emits a large message relative to its importance. The user specifed > that "I'd like to use huge pages, but it's ok if not available.", so I > think the message should be far smaller. Maybe just raising the > DEBUG1 message to LOG along with moving to ereport might be > sufficient. IMO, if the level is promoted to LOG, the message should be updated so that it follows the error message style guide. But I agree that simpler message would be better in this case. So what about something like the following? LOG: could not map anonymous shared memory (%zu bytes) with huge pages enabled HINT: The server will map anonymous shared memory again with huge pages disabled. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Fujii Masao <masao.fujii@oss.nttdata.com> writes: > IMO, if the level is promoted to LOG, the message should be updated > so that it follows the error message style guide. But I agree that simpler > message would be better in this case. So what about something like > the following? > LOG: could not map anonymous shared memory (%zu bytes) with huge pages enabled > HINT: The server will map anonymous shared memory again with huge pages disabled. That is not a hint. Maybe it qualifies as errdetail, though. regards, tom lane
On 2021/09/03 23:27, Tom Lane wrote: > Fujii Masao <masao.fujii@oss.nttdata.com> writes: >> IMO, if the level is promoted to LOG, the message should be updated >> so that it follows the error message style guide. But I agree that simpler >> message would be better in this case. So what about something like >> the following? > >> LOG: could not map anonymous shared memory (%zu bytes) with huge pages enabled >> HINT: The server will map anonymous shared memory again with huge pages disabled. > > That is not a hint. Maybe it qualifies as errdetail, though. Yes, agreed. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Hello, Thank you everyone for comments. In the thread [1] that Horiguchi told me about, there is already a review going on about GUC for HugePages memory. For this reason, I have removed the new GUC implementation and attached a patch that changes only the message at instancestartup. [1] https://www.postgresql.org/message-id/20210903.141206.103927759882272221.hor Regards, Noriyoshi Shinoda -----Original Message----- From: Fujii Masao [mailto:masao.fujii@oss.nttdata.com] Sent: Saturday, September 4, 2021 1:36 AM To: Tom Lane <tgl@sss.pgh.pa.us> Cc: Kyotaro Horiguchi <horikyota.ntt@gmail.com>; Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com>; rjuju123@gmail.com;pgsql-hackers@postgresql.org Subject: Re: Improve logging when using Huge Pages On 2021/09/03 23:27, Tom Lane wrote: > Fujii Masao <masao.fujii@oss.nttdata.com> writes: >> IMO, if the level is promoted to LOG, the message should be updated >> so that it follows the error message style guide. But I agree that >> simpler message would be better in this case. So what about something >> like the following? > >> LOG: could not map anonymous shared memory (%zu bytes) with huge >> pages enabled >> HINT: The server will map anonymous shared memory again with huge pages disabled. > > That is not a hint. Maybe it qualifies as errdetail, though. Yes, agreed. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
On 2021/09/07 13:09, Shinoda, Noriyoshi (PN Japan FSIP) wrote: > Hello, > > Thank you everyone for comments. > In the thread [1] that Horiguchi told me about, there is already a review going on about GUC for HugePages memory. > For this reason, I have removed the new GUC implementation and attached a patch that changes only the message at instancestartup. Thanks for updating the patch! Even with the patch, there are still some cases where huge pages is disabled silently. We should report something even in these cases? For example, in the platform where huge pages is not supported, it's silently disabled when huge_pages=try. One big concern about the patch is that log message is always reported when shared memory fails to be allocated with huge pages enabled when huge_pages=try. Since huge_pages=try is the default setting, many users would see this new log message whenever they start the server. Those who don't need huge pages but just use the default setting might think that such log messages would be noisy. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
At Tue, 7 Sep 2021 08:16:53 -0500, Justin Pryzby <pryzby@telsasoft.com> wrote in > On Tue, Sep 07, 2021 at 07:12:36PM +0900, Fujii Masao wrote: > > One big concern about the patch is that log message is always reported > > when shared memory fails to be allocated with huge pages enabled > > when huge_pages=try. Since huge_pages=try is the default setting, > > many users would see this new log message whenever they start > > the server. Those who don't need huge pages but just use the default > > setting might think that such log messages would be noisy. > > I don't see this as any issue. We're only talking about a single message on > each restart, which would be added in a major release. If it's a problem, the > message could be a NOTICE or INFO, and it won't be shown by default. > > I think it should say "with/out huge pages" without "enabled/disabled", without > "again", and without "The server", like: > > + (errmsg("could not map anonymous shared memory (%zu bytes)" > + " with huge pages.", allocsize), > + errdetail("Anonymous shared memory will be mapped " > + "without huge pages."))); I don't dislike the message, but I'm not sure I like the message is too verbose, especially about it has DETAILS. It seems to me something like the following is sufficient, or I'd like see it even more concise. "fall back anonymous shared memory to non-huge pages: required %zu bytes for huge pages" If we emit an error message for other than mmap failure, it would be like the following. "fall back anonymous shared memory to non-huge pages: huge pages not available" regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Hello, Thank you everyone for comments. I have attached a patch that simply changed the message like the advice from Horiguchi-san. > Even with the patch, there are still some cases where huge pages is disabled silently. We should report something evenin these cases? > For example, in the platform where huge pages is not supported, it's silently disabled when huge_pages=try. The area where this patch is written is inside the "#ifdef MAP_HUGETLB #endif" block. For this reason, I think it is excluded from binaries created in an environment that does not have the MAP_HUGETLB macro. > One big concern about the patch is that log message is always reported when shared memory fails to be allocated with hugepages enabled when huge_pages=try. Since > huge_pages=try is the default setting, many users would see this new log message whenever they start the server. Thosewho don't need huge pages but just use the default > setting might think that such log messages would be noisy. This patch is meant to let the admin know that HugePages isn't being used, so I'm sure you're right. I have no idea whatto do so far. Regards, Noriyoshi Shinoda -----Original Message----- From: Kyotaro Horiguchi [mailto:horikyota.ntt@gmail.com] Sent: Wednesday, September 8, 2021 11:18 AM To: pryzby@telsasoft.com Cc: masao.fujii@oss.nttdata.com; Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com>; pgsql-hackers@postgresql.org;rjuju123@gmail.com; tgl@sss.pgh.pa.us Subject: Re: Improve logging when using Huge Pages At Tue, 7 Sep 2021 08:16:53 -0500, Justin Pryzby <pryzby@telsasoft.com> wrote in > On Tue, Sep 07, 2021 at 07:12:36PM +0900, Fujii Masao wrote: > > One big concern about the patch is that log message is always > > reported when shared memory fails to be allocated with huge pages > > enabled when huge_pages=try. Since huge_pages=try is the default > > setting, many users would see this new log message whenever they > > start the server. Those who don't need huge pages but just use the > > default setting might think that such log messages would be noisy. > > I don't see this as any issue. We're only talking about a single > message on each restart, which would be added in a major release. If > it's a problem, the message could be a NOTICE or INFO, and it won't be shown by default. > > I think it should say "with/out huge pages" without > "enabled/disabled", without "again", and without "The server", like: > > + (errmsg("could not map anonymous shared memory (%zu bytes)" > + " with huge pages.", allocsize), > + errdetail("Anonymous shared memory will be mapped " > + "without huge > + pages."))); I don't dislike the message, but I'm not sure I like the message is too verbose, especially about it has DETAILS. It seemsto me something like the following is sufficient, or I'd like see it even more concise. "fall back anonymous shared memory to non-huge pages: required %zu bytes for huge pages" If we emit an error message for other than mmap failure, it would be like the following. "fall back anonymous shared memory to non-huge pages: huge pages not available" regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
Thanks! At Wed, 8 Sep 2021 07:52:35 +0000, "Shinoda, Noriyoshi (PN Japan FSIP)" <noriyoshi.shinoda@hpe.com> wrote in > Hello, > > Thank you everyone for comments. > I have attached a patch that simply changed the message like the advice from Horiguchi-san. > > > Even with the patch, there are still some cases where huge pages is disabled silently. We should report something evenin these cases? > > For example, in the platform where huge pages is not supported, it's silently disabled when huge_pages=try. > > The area where this patch is written is inside the "#ifdef MAP_HUGETLB #endif" block. > For this reason, I think it is excluded from binaries created in an environment that does not have the MAP_HUGETLB macro. Ah, right. > > One big concern about the patch is that log message is always reported when shared memory fails to be allocated withhuge pages enabled when huge_pages=try. Since > > huge_pages=try is the default setting, many users would see this new log message whenever they start the server. Thosewho don't need huge pages but just use the default > > setting might think that such log messages would be noisy. > > This patch is meant to let the admin know that HugePages isn't being used, so I'm sure you're right. I have no idea whatto do so far. It seems *to me* sufficient. I'm not sure what cases CreateFileMapping return ERROR_NO_SYSTEM_RESOURCES when non-huge page can be allocated successfully, though, but that doesn't matter much, maybe. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On 2021/09/07 22:16, Justin Pryzby wrote: > On Tue, Sep 07, 2021 at 07:12:36PM +0900, Fujii Masao wrote: >> One big concern about the patch is that log message is always reported >> when shared memory fails to be allocated with huge pages enabled >> when huge_pages=try. Since huge_pages=try is the default setting, >> many users would see this new log message whenever they start >> the server. Those who don't need huge pages but just use the default >> setting might think that such log messages would be noisy. > > I don't see this as any issue. We're only talking about a single message on > each restart, which would be added in a major release. I was afraid that logging the message like "could not ..." every time when the server starts up would surprise users unnecessarily. Because the message sounds like it reports a server error. So it might be good idea to change the message to something like "disabling huge pages" to avoid such surprise. > If it's a problem, the > message could be a NOTICE or INFO, and it won't be shown by default. That's an idea, but neither NOTICE nor INFO are suitable for this kind of message.... Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2021/09/08 11:17, Kyotaro Horiguchi wrote: > I don't dislike the message, but I'm not sure I like the message is > too verbose, especially about it has DETAILS. It seems to me something > like the following is sufficient, or I'd like see it even more concise. > > "fall back anonymous shared memory to non-huge pages: required %zu bytes for huge pages" > > If we emit an error message for other than mmap failure, it would be > like the following. > > "fall back anonymous shared memory to non-huge pages: huge pages not available" How about simpler message like "disabling huge pages" or "disabling huge pages due to lack of huge pages available"? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
At Fri, 17 Sep 2021 00:13:41 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > On 2021/09/08 11:17, Kyotaro Horiguchi wrote: > > I don't dislike the message, but I'm not sure I like the message is > > too verbose, especially about it has DETAILS. It seems to me something > > like the following is sufficient, or I'd like see it even more > > concise. > > "fall back anonymous shared memory to non-huge pages: required %zu > > bytes for huge pages" > > If we emit an error message for other than mmap failure, it would be > > like the following. > > "fall back anonymous shared memory to non-huge pages: huge pages not > > available" > > How about simpler message like "disabling huge pages" or > "disabling huge pages due to lack of huge pages available"? Honestly, I cannot have conficence on my wording, but "disabling huge pages" souds like somthing that happens on the OS layer. "did not use/gave up using huge pages for anounymous shared memory" might work well, I'm not sure, though... regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Hi, Thank you for your comment. > I was afraid that logging the message like "could not ..." every time when the server starts up would surprise users unnecessarily. > Because the message sounds like it reports a server error. Fujii-san, I was worried about the same thing as you. So the attached patch gets the value of the kernel parameter vm.nr_hugepages, and if it is the default value of zero, the log level is the same as before. On the other hand, if any value is set, the level is set to LOG. I hope I can find a better message other than this kind of implementation. Regards, Noriyoshi Shinoda -----Original Message----- From: Kyotaro Horiguchi [mailto:horikyota.ntt@gmail.com] Sent: Friday, September 17, 2021 1:15 PM To: masao.fujii@oss.nttdata.com Cc: pryzby@telsasoft.com; Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com>; pgsql-hackers@postgresql.org; rjuju123@gmail.com;tgl@sss.pgh.pa.us Subject: Re: Improve logging when using Huge Pages At Fri, 17 Sep 2021 00:13:41 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > On 2021/09/08 11:17, Kyotaro Horiguchi wrote: > > I don't dislike the message, but I'm not sure I like the message is > > too verbose, especially about it has DETAILS. It seems to me > > something like the following is sufficient, or I'd like see it even > > more concise. > > "fall back anonymous shared memory to non-huge pages: required %zu > > bytes for huge pages" > > If we emit an error message for other than mmap failure, it would be > > like the following. > > "fall back anonymous shared memory to non-huge pages: huge pages not > > available" > > How about simpler message like "disabling huge pages" or "disabling > huge pages due to lack of huge pages available"? Honestly, I cannot have conficence on my wording, but "disabling huge pages" souds like somthing that happens on the OS layer. "did not use/gave up using huge pages for anounymous shared memory" might work well, I'm not sure, though... regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
On 2021/09/20 17:55, Shinoda, Noriyoshi (PN Japan FSIP) wrote: > I was worried about the same thing as you. > So the attached patch gets the value of the kernel parameter vm.nr_hugepages, > and if it is the default value of zero, the log level is the same as before. > On the other hand, if any value is set, the level is set to LOG. Probably I understood your point. But isn't it more confusing to users? Because whether the log message is output or not is changed depending on the setting of the kernel parameter. For example, when vm.nr_hugepages=0 and no log message about huge pages is output, users might easily misunderstand that shared memory was successfully allocated with huge pages because they saw no such log message. IMO we should leave the log message "mmap(%zu) with MAP_HUGETLB failed..." as it is if users don't like additional log message output whenever the server restarts. In this case, instead, maybe it's better to provide GUC or something to report whether shared memory was successfully allocated with huge pages or not. OTOH, if users can accept such additional log message, I think that it's less confusing to report something like "disable huge pages ..." whenever mmap() with huge pages fails. I still prefer "disable huge pages ..." to "fall back ..." as the log message, but if many people think the latter is better, I'd follow that. Another idea is to output "Anonymous shared memory was allocated with huge pages" when it's successfully allocated with huge pages, and to output "Anonymous shared memory was allocated without huge pages" when it's successfully allocated without huge pages. I'm not sure if users may think even this message is noisy, though. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
At Tue, 21 Sep 2021 19:23:22 -0500, Justin Pryzby <pryzby@telsasoft.com> wrote in > On Wed, Sep 22, 2021 at 02:03:11AM +0900, Fujii Masao wrote: > > Another idea is to output "Anonymous shared memory was allocated with > > huge pages" when it's successfully allocated with huge pages, and to output > > "Anonymous shared memory was allocated without huge pages" > > when it's successfully allocated without huge pages. I'm not sure if users > > may think even this message is noisy, though. > > +1 +1. Positive phrase looks better. > Maybe it could show the page size instead of "with"/without: > "Shared memory allocated with 4k/1MB/1GB pages." +1. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Hi, all. Thank you for your comment. > Probably I understood your point. But isn't it more confusing to users? As you say, I think the last patch was rather confusing. For this reason, I simply reconsidered it. The attached patch just outputs a log like your advice on acquiring Huge Page. It is possible to limit the log output trigger only when huge_pages=try, but is it better not to always output it? Regards, Noriyoshi Shinoda -----Original Message----- From: Kyotaro Horiguchi [mailto:horikyota.ntt@gmail.com] Sent: Monday, September 27, 2021 11:40 AM To: pryzby@telsasoft.com Cc: masao.fujii@oss.nttdata.com; Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com>; pgsql-hackers@postgresql.org;rjuju123@gmail.com; tgl@sss.pgh.pa.us Subject: Re: Improve logging when using Huge Pages At Tue, 21 Sep 2021 19:23:22 -0500, Justin Pryzby <pryzby@telsasoft.com> wrote in > On Wed, Sep 22, 2021 at 02:03:11AM +0900, Fujii Masao wrote: > > Another idea is to output "Anonymous shared memory was allocated > > with huge pages" when it's successfully allocated with huge pages, > > and to output "Anonymous shared memory was allocated without huge pages" > > when it's successfully allocated without huge pages. I'm not sure > > if users may think even this message is noisy, though. > > +1 +1. Positive phrase looks better. > Maybe it could show the page size instead of "with"/without: > "Shared memory allocated with 4k/1MB/1GB pages." +1. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
+ ereport(LOG, (errmsg("Anonymous shared memory was allocated %s huge pages.", with_hugepages ? "with" : "without"))); You shouldn't break a sentence into pieces like this, since it breaks translation. You don't want an untranslated "without" to appear in the middle of the translated message. There are cases where a component *shouldn't* be translated, like this one: where "numeric" should not be translated. src/backend/utils/adt/numeric.c: errmsg("invalid input syntax for type %s: \"%s\"", src/backend/utils/adt/numeric.c- "numeric", str))); -- Justin
Hi, Thank you for your comment. The attached patch stops message splitting. This patch also limits the timing of message output when huge_pages = try and HugePages is not used. Regards, Noriyoshi Shinoda -----Original Message----- From: Justin Pryzby [mailto:pryzby@telsasoft.com] Sent: Friday, October 22, 2021 11:38 AM To: Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com> Cc: masao.fujii@oss.nttdata.com; pgsql-hackers@postgresql.org; rjuju123@gmail.com; tgl@sss.pgh.pa.us; Kyotaro Horiguchi <horikyota.ntt@gmail.com> Subject: Re: Improve logging when using Huge Pages + ereport(LOG, (errmsg("Anonymous shared memory was + allocated %s huge pages.", with_hugepages ? "with" : "without"))); You shouldn't break a sentence into pieces like this, since it breaks translation. You don't want an untranslated "without"to appear in the middle of the translated message. There are cases where a component *shouldn't* be translated, like this one: where "numeric" should not be translated. src/backend/utils/adt/numeric.c: errmsg("invalid input syntax for type %s: \"%s\"", src/backend/utils/adt/numeric.c- "numeric", str))); -- Justin
Attachment
Hi, On Wed, Oct 27, 2021 at 06:39:46AM +0000, Shinoda, Noriyoshi (PN Japan FSIP) wrote: > Thank you for your comment. > The attached patch stops message splitting. > This patch also limits the timing of message output when huge_pages = try and HugePages is not used. Thanks for updating the patch. I hope we've debated almost everything about its behavior, and it's nearly ready :) + } else if (!with_hugepages && huge_pages == HUGE_PAGES_TRY) + ereport(LOG, (errmsg("Anonymous shared memory was allocated without huge pages."))); I would write this as a separate "if". The preceding block is a terminal FATAL and it seems more intuitive that way. But it's up to you (and the committer). The errmsg() text should not be capitalized and not end with a period. https://www.postgresql.org/docs/devel/error-style-guide.html The parenthesis around (errmsg()) are not required since 18 months ago (e3a87b499). Since this change won't be backpatched, I think it's better to omit them. Should it include an errcode() ? ERRCODE_INSUFFICIENT_RESOURCES ? ERRCODE_OUT_OF_MEMORY ? -- Justin
On Wed, Oct 27, 2021 at 3:40 PM Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com> wrote: > > Hi, > Thank you for your comment. > The attached patch stops message splitting. > This patch also limits the timing of message output when huge_pages = try and HugePages is not used. > I've looked at the patch. Here are comments: if (huge_pages == HUGE_PAGES_TRY && ptr == MAP_FAILED) elog(DEBUG1, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %m", allocsize); + else + with_hugepages = true; ISTM the name with_hugepages could lead to confusion since it can be true even if mmap failed and huge_pages == HUGE_PAGES_ON. Also, with the patch, the log message is emitted also during initdb and starting up in single user mode: selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting default time zone ... Asia/Tokyo creating configuration files ... ok running bootstrap script ... 2021-10-29 15:45:51.408 JST [55101] LOG: Anonymous shared memory was allocated without huge pages. ok performing post-bootstrap initialization ... 2021-10-29 15:45:53.326 JST [55104] LOG: Anonymous shared memory was allocated without huge pages. ok syncing data to disk ... ok Which is noisy. Perhaps it's better to log it only when IsPostmasterEnvironment is true. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On 2021/10/29 7:05, Justin Pryzby wrote: > Hi, > > On Wed, Oct 27, 2021 at 06:39:46AM +0000, Shinoda, Noriyoshi (PN Japan FSIP) wrote: >> Thank you for your comment. >> The attached patch stops message splitting. >> This patch also limits the timing of message output when huge_pages = try and HugePages is not used. The log message should be reported even when huge_pages=off (and huge pages are not used)? Otherwise we cannot determine whether huge pages are actually used or not when no such log message is found in the server log. Or it's simpler and more intuitive to log the message "Anonymous shared memory was allocated with huge pages" only when huge pages are successfully requested? If that message is logged, we can determine that huge pages are used whatever the setting is. OTOH, if there is no such message, we can determine that huge pages are not used for some reasons, e.g., OS doesn't support huge pages, shared_memory_type is not set to mmap, etc. > + } else if (!with_hugepages && huge_pages == HUGE_PAGES_TRY) > + ereport(LOG, (errmsg("Anonymous shared memory was allocated without huge pages."))); > > I would write this as a separate "if". The preceding block is a terminal FATAL > and it seems more intuitive that way. Agreed. > Should it include an errcode() ? > ERRCODE_INSUFFICIENT_RESOURCES ? > ERRCODE_OUT_OF_MEMORY ? Probably errcode is not necessary here because it's a log message not error one? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2021/10/29 16:00, Masahiko Sawada wrote: > Which is noisy. Perhaps it's better to log it only when > IsPostmasterEnvironment is true. +1 Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Fujii-san, Sawada-san, Thank you for your comment. > Also, with the patch, the log message is emitted also during initdb and starting up in single user mode: Certainly the log output when executing the initdb command was a noise. The attached patch reflects the comments and uses IsPostmasterEnvironment to suppress the output message. Regards, Noriyoshi Shinoda -----Original Message----- From: Fujii Masao [mailto:masao.fujii@oss.nttdata.com] Sent: Tuesday, November 2, 2021 1:25 AM To: Masahiko Sawada <sawada.mshk@gmail.com>; Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com> Cc: pgsql-hackers@postgresql.org; rjuju123@gmail.com; tgl@sss.pgh.pa.us; Kyotaro Horiguchi <horikyota.ntt@gmail.com>; JustinPryzby <pryzby@telsasoft.com> Subject: Re: Improve logging when using Huge Pages On 2021/10/29 16:00, Masahiko Sawada wrote: > Which is noisy. Perhaps it's better to log it only when > IsPostmasterEnvironment is true. +1 Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
On 2021/11/02 18:31, Shinoda, Noriyoshi (PN Japan FSIP) wrote: > Fujii-san, Sawada-san, > > Thank you for your comment. > >> Also, with the patch, the log message is emitted also during initdb and starting up in single user mode: > > Certainly the log output when executing the initdb command was a noise. > The attached patch reflects the comments and uses IsPostmasterEnvironment to suppress the output message. Thanks for updating the patch! + ereport(IsPostmasterEnvironment ? LOG : NOTICE, (errmsg("Anonymous shared memory was allocated without huge pages."))); This change causes the log message to be output with NOTICE level even when IsPostmasterEnvironment is false. But do we really want to log that NOTICE message in that case? Instead, isn't it better to just output the log message with LOG level only when IsPostmasterEnvironment is true? Justin and I posted other comments upthread. Could you consider whether it's worth applying those comments to the patch? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Fujii-san, Thank you for your comment. As advised by Justin, I modified the comment according to the style guide and split the if statement. As you say, the NOTICE log was deleted as it may not be needed. Regards, Noriyoshi Shinoda -----Original Message----- From: Fujii Masao [mailto:masao.fujii@oss.nttdata.com] Sent: Tuesday, November 2, 2021 11:35 PM To: Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com>; pgsql-hackers@postgresql.org; Masahiko Sawada <sawada.mshk@gmail.com> Cc: rjuju123@gmail.com; tgl@sss.pgh.pa.us; Kyotaro Horiguchi <horikyota.ntt@gmail.com>; Justin Pryzby <pryzby@telsasoft.com> Subject: Re: Improve logging when using Huge Pages On 2021/11/02 18:31, Shinoda, Noriyoshi (PN Japan FSIP) wrote: > Fujii-san, Sawada-san, > > Thank you for your comment. > >> Also, with the patch, the log message is emitted also during initdb and starting up in single user mode: > > Certainly the log output when executing the initdb command was a noise. > The attached patch reflects the comments and uses IsPostmasterEnvironment to suppress the output message. Thanks for updating the patch! + ereport(IsPostmasterEnvironment ? LOG : NOTICE, (errmsg("Anonymous +shared memory was allocated without huge pages."))); This change causes the log message to be output with NOTICE level even when IsPostmasterEnvironment is false. But do we reallywant to log that NOTICE message in that case? Instead, isn't it better to just output the log message with LOG levelonly when IsPostmasterEnvironment is true? Justin and I posted other comments upthread. Could you consider whether it's worth applying those comments to the patch? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
On Tue, Nov 2, 2021 at 1:24 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > > > On 2021/10/29 7:05, Justin Pryzby wrote: > > Hi, > > > > On Wed, Oct 27, 2021 at 06:39:46AM +0000, Shinoda, Noriyoshi (PN Japan FSIP) wrote: > >> Thank you for your comment. > >> The attached patch stops message splitting. > >> This patch also limits the timing of message output when huge_pages = try and HugePages is not used. > > The log message should be reported even when huge_pages=off (and huge pages > are not used)? Otherwise we cannot determine whether huge pages are actually > used or not when no such log message is found in the server log. > > Or it's simpler and more intuitive to log the message "Anonymous shared > memory was allocated with huge pages" only when huge pages are successfully > requested? If that message is logged, we can determine that huge pages are > used whatever the setting is. OTOH, if there is no such message, we can > determine that huge pages are not used for some reasons, e.g., OS doesn't > support huge pages, shared_memory_type is not set to mmap, etc. If users want to know whether the shared memory is allocated with huge pages, I think it’s more intuitive to emit the log only on success when huge_pages = on/try. On the other hand, I guess that users might want to use the message to adjust vm.nr_hugepages when it is not allocated with huge pages. In this case, it’d be better to log the message on failure with the request memory size (or whatever reason for the failure). That is, we end up logging such a message on failure when huge_pages = on/try. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Sawada-san, Fujii-san, Thank you for your reviews. In a database with huge_pages = on / off explicitly set, if memory allocation fails, the instance cannot be started, so Ithink that additional logs are unnecessary. The attached patch outputs the log only when huge_pages = try, and outputs the requested size if the acquisition of HugePages fails. I have a completely different approach, setting GUC shared_memory_size_in_huge_pages to zero if the Huge Pages acquisitionfails. This GUC is currently calculated independently of getting Huge Pages. The attached patch does not includethis specification. Regards, Noriyoshi Shinoda -----Original Message----- From: Masahiko Sawada [mailto:sawada.mshk@gmail.com] Sent: Thursday, November 11, 2021 2:45 PM To: Fujii Masao <masao.fujii@oss.nttdata.com> Cc: Justin Pryzby <pryzby@telsasoft.com>; Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com>; PostgreSQL-development<pgsql-hackers@postgresql.org>; Julien Rouhaud <rjuju123@gmail.com>; Tom Lane <tgl@sss.pgh.pa.us>;Kyotaro Horiguchi <horikyota.ntt@gmail.com> Subject: Re: Improve logging when using Huge Pages On Tue, Nov 2, 2021 at 1:24 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > > > On 2021/10/29 7:05, Justin Pryzby wrote: > > Hi, > > > > On Wed, Oct 27, 2021 at 06:39:46AM +0000, Shinoda, Noriyoshi (PN Japan FSIP) wrote: > >> Thank you for your comment. > >> The attached patch stops message splitting. > >> This patch also limits the timing of message output when huge_pages = try and HugePages is not used. > > The log message should be reported even when huge_pages=off (and huge > pages are not used)? Otherwise we cannot determine whether huge pages > are actually used or not when no such log message is found in the server log. > > Or it's simpler and more intuitive to log the message "Anonymous > shared memory was allocated with huge pages" only when huge pages are > successfully requested? If that message is logged, we can determine > that huge pages are used whatever the setting is. OTOH, if there is no > such message, we can determine that huge pages are not used for some > reasons, e.g., OS doesn't support huge pages, shared_memory_type is not set to mmap, etc. If users want to know whether the shared memory is allocated with huge pages, I think it’s more intuitive to emit the logonly on success when huge_pages = on/try. On the other hand, I guess that users might want to use the message to adjustvm.nr_hugepages when it is not allocated with huge pages. In this case, it’d be better to log the message on failurewith the request memory size (or whatever reason for the failure). That is, we end up logging such a message on failurewhen huge_pages = on/try. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Attachment
As discussed in [1], we're taking this opportunity to return some patchsets that don't appear to be getting enough reviewer interest. This is not a rejection, since we don't necessarily think there's anything unacceptable about the entry, but it differs from a standard "Returned with Feedback" in that there's probably not much actionable feedback at all. Rather than code changes, what this patch needs is more community interest. You might - ask people for help with your approach, - see if there are similar patches that your code could supplement, - get interested parties to agree to review your patch in a CF, or - possibly present the functionality in a way that's easier to review overall. (Doing these things is no guarantee that there will be interest, but it's hopefully better than endlessly rebasing a patchset that is not receiving any feedback from the community.) Once you think you've built up some community support and the patchset is ready for review, you (or any interested party) can resurrect the patch entry by visiting https://commitfest.postgresql.org/38/3310/ and changing the status to "Needs Review", and then changing the status again to "Move to next CF". (Don't forget the second step; hopefully we will have streamlined this in the near future!) Thanks, --Jacob [1] https://postgr.es/m/86140760-8ba5-6f3a-3e6e-5ca6c060bd24@timescale.com
Hello, > As discussed in [1], we're taking this opportunity to return some patchsets that don't appear to be getting enough reviewerinterest. Thank you for your helpful comments. As you say, my patch doesn't seem to be of much interest to reviewers either. I will discard the patch I proposed this time and consider it again. Regards, Noriyoshi Shinoda -----Original Message----- From: Jacob Champion <jchampion@timescale.com> Sent: Tuesday, August 2, 2022 5:45 AM To: Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com>; Masahiko Sawada <sawada.mshk@gmail.com>; Fujii Masao<masao.fujii@oss.nttdata.com> Cc: Justin Pryzby <pryzby@telsasoft.com>; PostgreSQL-development <pgsql-hackers@postgresql.org>; Julien Rouhaud <rjuju123@gmail.com>;Tom Lane <tgl@sss.pgh.pa.us>; Kyotaro Horiguchi <horikyota.ntt@gmail.com> Subject: Re: Improve logging when using Huge Pages As discussed in [1], we're taking this opportunity to return some patchsets that don't appear to be getting enough reviewerinterest. This is not a rejection, since we don't necessarily think there's anything unacceptable about the entry, but it differs froma standard "Returned with Feedback" in that there's probably not much actionable feedback at all. Rather than code changes,what this patch needs is more community interest. You might - ask people for help with your approach, - see if there are similar patches that your code could supplement, - get interested parties to agree to review your patch in a CF, or - possibly present the functionality in a way that's easier to review overall. (Doing these things is no guarantee that there will be interest, but it's hopefully better than endlessly rebasing a patchsetthat is not receiving any feedback from the community.) Once you think you've built up some community support and the patchset is ready for review, you (or any interested party)can resurrect the patch entry by visiting https://commitfest.postgresql.org/38/3310/ and changing the status to "Needs Review", and then changing the status again to "Move to next CF". (Don't forget the secondstep; hopefully we will have streamlined this in the near future!) Thanks, --Jacob [1] https://postgr.es/m/86140760-8ba5-6f3a-3e6e-5ca6c060bd24@timescale.com
On Wed, Aug 3, 2022 at 8:42 PM Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com> wrote: > > As discussed in [1], we're taking this opportunity to return some patchsets that don't appear to be getting enough reviewerinterest. > Thank you for your helpful comments. > As you say, my patch doesn't seem to be of much interest to reviewers either. > I will discard the patch I proposed this time and consider it again. I wonder if the problem here is that people are reluctant to add noise to every starting system. There are people who have not configured their system and don't want to see that noise, and then some people have configured their system and would like to know about it if it doesn't work so they can be aware of that, but don't want to use "off" because they don't want a hard failure. Would it be better if there were a new level "try_log" (or something), which only logs a message if it tries and fails?
Thanks for your comment. I understand that some people don't like noise log. However, I don't understand the feeling of disliking the one-line logthat is output when the instance is started. In both MySQL and Oracle Database, a log is output if it fails to acquire HugePages with the same behavior as huge_pages=tryin PostgreSQL. Regards, Noriyoshi Shinoda -----Original Message----- From: Thomas Munro <thomas.munro@gmail.com> Sent: Thursday, November 3, 2022 10:10 AM To: Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com> Cc: Jacob Champion <jchampion@timescale.com>; Masahiko Sawada <sawada.mshk@gmail.com>; Fujii Masao <masao.fujii@oss.nttdata.com>;Justin Pryzby <pryzby@telsasoft.com>; PostgreSQL-development <pgsql-hackers@postgresql.org>;Julien Rouhaud <rjuju123@gmail.com>; Tom Lane <tgl@sss.pgh.pa.us>; Kyotaro Horiguchi <horikyota.ntt@gmail.com> Subject: Re: Improve logging when using Huge Pages On Wed, Aug 3, 2022 at 8:42 PM Shinoda, Noriyoshi (PN Japan FSIP) <noriyoshi.shinoda@hpe.com> wrote: > > As discussed in [1], we're taking this opportunity to return some patchsets that don't appear to be getting enough reviewerinterest. > Thank you for your helpful comments. > As you say, my patch doesn't seem to be of much interest to reviewers either. > I will discard the patch I proposed this time and consider it again. I wonder if the problem here is that people are reluctant to add noise to every starting system. There are people who have not configured their system and don't want to see that noise, and then some people have configured their system and would like to know aboutit if it doesn't work so they can be aware of that, but don't want to use "off" because they don't want a hard failure. Would it be better if there were a new level "try_log" (or something), which onlylogs a message if it tries and fails?
On Thu, Nov 3, 2022 at 8:11 AM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> I wonder if the problem here is that people are reluctant to add noise
> to every starting system. There are people who have not configured
> their system and don't want to see that noise, and then some people
> have configured their system and would like to know about it if it
> doesn't work so they can be aware of that, but don't want to use "off"
> because they don't want a hard failure. Would it be better if there
> were a new level "try_log" (or something), which only logs a message
> if it tries and fails?
I think the best thing to do is change huge_pages='on' to log a WARNING and fallback to regular pages if the mapping fails. That way, both dev and prod can keep the same settings, since 'on' will have both visibility and robustness. I don't see a good reason to refuse to start -- seems like an anti-pattern.
--
John Naylor
EDB: http://www.enterprisedb.com
On Sun, Nov 06, 2022 at 02:04:29PM +0700, John Naylor wrote: > On Thu, Nov 3, 2022 at 8:11 AM Thomas Munro <thomas.munro@gmail.com> wrote: > > > > I wonder if the problem here is that people are reluctant to add noise > > to every starting system. There are people who have not configured > > their system and don't want to see that noise, and then some people > > have configured their system and would like to know about it if it > > doesn't work so they can be aware of that, but don't want to use "off" > > because they don't want a hard failure. Would it be better if there > > were a new level "try_log" (or something), which only logs a message > > if it tries and fails? > > I think the best thing to do is change huge_pages='on' to log a WARNING and > fallback to regular pages if the mapping fails. That way, both dev and prod > can keep the same settings, since 'on' will have both visibility and > robustness. I don't see a good reason to refuse to start -- seems like an > anti-pattern. I'm glad to see there's still discussion on this topic :) Another idea is to add a RUNTIME_COMPUTED GUC to *display* the state of huge pages, so I can stop parsing /proc/maps to find it. -- Justin
Hi, On 2022-11-06 14:04:29 +0700, John Naylor wrote: > I think the best thing to do is change huge_pages='on' to log a WARNING and > fallback to regular pages if the mapping fails. That way, both dev and prod > can keep the same settings, since 'on' will have both visibility and > robustness. I don't see a good reason to refuse to start -- seems like an > anti-pattern. How would on still have robustness if it doesn't actually do anything other than cause a WARNING? The use of huge pages can have very substantial effects on memory usage an performance. And it's easy to just have huge_pages fail, another program that started could have used huge pages, or some config variables was changed to incerase shared memory usage... I am strongly opposed to making 'on' fall back to not using huge pages. That's what 'try' is for. I know of people that scripted cluster start so that they start with 'on' and change the system setting of the number of huge pages according to the error message. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2022-11-06 14:04:29 +0700, John Naylor wrote: >> I think the best thing to do is change huge_pages='on' to log a WARNING and >> fallback to regular pages if the mapping fails. > I am strongly opposed to making 'on' fall back to not using huge pages. That's > what 'try' is for. Agreed --- changing "on" to be exactly like "try" isn't an improvement. If you want "try" semantics, choose "try". regards, tom lane
On Tue, Nov 8, 2022 at 4:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2022-11-06 14:04:29 +0700, John Naylor wrote: > >> I think the best thing to do is change huge_pages='on' to log a WARNING and > >> fallback to regular pages if the mapping fails. > > > I am strongly opposed to making 'on' fall back to not using huge pages. That's > > what 'try' is for. > > Agreed --- changing "on" to be exactly like "try" isn't an improvement. > If you want "try" semantics, choose "try". Agreed, but how can we make the people who want a log message happy, without upsetting the people who don't want new log messages? Hence my suggestion of a new level. How about try_verbose?
At Tue, 8 Nov 2022 11:34:53 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in > On Tue, Nov 8, 2022 at 4:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Andres Freund <andres@anarazel.de> writes: > > > On 2022-11-06 14:04:29 +0700, John Naylor wrote: > > Agreed --- changing "on" to be exactly like "try" isn't an improvement. > > If you want "try" semantics, choose "try". > > Agreed, but how can we make the people who want a log message happy, > without upsetting the people who don't want new log messages? Hence > my suggestion of a new level. How about try_verbose? Honestly I don't come up with other users of the new log-level. Another possible issue is it might be a bit hard for people to connect that level to huge_pages=try, whereas I think we shouldn't put a description about the concrete impact range of that log-level. I came up with an alternative idea that add a new huge_pages value try_report or try_verbose, which tell postgresql to *always* report the result of huge_pages = try. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Wed, Nov 09, 2022 at 11:47:57AM +0900, Kyotaro Horiguchi wrote: > Honestly I don't come up with other users of the new > log-level. Another possible issue is it might be a bit hard for people > to connect that level to huge_pages=try, whereas I think we shouldn't > put a description about the concrete impact range of that log-level. > > I came up with an alternative idea that add a new huge_pages value > try_report or try_verbose, which tell postgresql to *always* report > the result of huge_pages = try. Here is an extra idea for the bucket of ideas: switch the user-visible value of huge_pages to 'on' when we are at "try" but success in using huge pages, and switch the visible value to "off". The idea of Justin in [1] to use an internal runtime-computed GUC sounds sensible, as well (say a boolean effective_huge_pages?). [1]: https://www.postgresql.org/message-id/20221106130426.GG16921@telsasoft.com -- Michael
Attachment
On Wed, Nov 09, 2022 at 02:04:00PM +0900, Michael Paquier wrote: > On Wed, Nov 09, 2022 at 11:47:57AM +0900, Kyotaro Horiguchi wrote: > > Honestly I don't come up with other users of the new > > log-level. Another possible issue is it might be a bit hard for people > > to connect that level to huge_pages=try, whereas I think we shouldn't > > put a description about the concrete impact range of that log-level. > > > > I came up with an alternative idea that add a new huge_pages value > > try_report or try_verbose, which tell postgresql to *always* report > > the result of huge_pages = try. > > Here is an extra idea for the bucket of ideas: switch the user-visible > value of huge_pages to 'on' when we are at "try" but success in using > huge pages, and switch the visible value to "off". The idea of Justin > in [1] to use an internal runtime-computed GUC sounds sensible, as well > (say a boolean effective_huge_pages?). > > [1]: https://www.postgresql.org/message-id/20221106130426.GG16921@telsasoft.com > -- > Michael Michael seemed to support this idea and nobody verbalized hatred of it, so I implemented it. In v15, we have shared_memory_size_in_huge_pages, so this adds effective_huge_pages. Feel free to suggest a better name. -- Justin
Attachment
Hi, On 2023-01-23 19:21:00 -0600, Justin Pryzby wrote: > Michael seemed to support this idea and nobody verbalized hatred of it, > so I implemented it. In v15, we have shared_memory_size_in_huge_pages, > so this adds effective_huge_pages. > > Feel free to suggest a better name. Hm. Should this be a GUC? There's a reason shared_memory_size_in_huge_pages is one - it's so it's accessible via -C. But here we could make it a function or whatnot as well. I'm ok with this being a GUC, it doesn't feel elegant, but I suspect there's no realistic elegant answer. Greetings, Andres Freund
On Mon, Jan 23, 2023 at 05:33:35PM -0800, Andres Freund wrote: > Hi, > > On 2023-01-23 19:21:00 -0600, Justin Pryzby wrote: > > Michael seemed to support this idea and nobody verbalized hatred of it, > > so I implemented it. In v15, we have shared_memory_size_in_huge_pages, > > so this adds effective_huge_pages. > > > > Feel free to suggest a better name. > > Hm. Should this be a GUC? There's a reason shared_memory_size_in_huge_pages is > one - it's so it's accessible via -C. But here we could make it a function or > whatnot as well. I have no strong opinion, but a few minor arguments in favour of a GUC: - the implementation parallels huge_pages, huge_page_size, and shared_memory_size_in_huge_pages; - it's short; - it's glob()able with the others: postgres=# \dconfig *huge* List of configuration parameters Parameter | Value ----------------------------------+------- effective_huge_pages | off huge_pages | try huge_page_size | 0 shared_memory_size_in_huge_pages | 12 > I'm ok with this being a GUC, it doesn't feel elegant, but I suspect there's > no realistic elegant answer. BTW, I didn't realize it when I made the suggestion, nor when I wrote the patch, but a GUC was implemented in the v2 patch. https://www.postgresql.org/message-id/TU4PR8401MB1152CB4FEB99658BC6B35543EECF9%40TU4PR8401MB1152.NAMPRD84.PROD.OUTLOOK.COM The original proposal was to change the elog(DEBUG1, "mmap..") to LOG, and the thread seems to have derailed out of concern for a hypothetical user who was adverse to an additional line of log messages during server start. I don't share that concern, but GUC or function seems better anyway, since the log message is not easily available (except maybe, sometimes, shortly after the server restart). I'm currently checking for huge pages in a nagios script by grepping /proc/pid/smaps (I *could* make use of a logfile if that was available, but it's better if it's a persistent state/variable). Whether it's a GUC or a function, I propose to name it hugepages_active. If there's no objections, I'll add to the CF. -- Justin
On Tue, Jan 24, 2023 at 07:37:29PM -0600, Justin Pryzby wrote: > BTW, I didn't realize it when I made the suggestion, nor when I wrote > the patch, but a GUC was implemented in the v2 patch. > https://www.postgresql.org/message-id/TU4PR8401MB1152CB4FEB99658BC6B35543EECF9%40TU4PR8401MB1152.NAMPRD84.PROD.OUTLOOK.COM > Whether it's a GUC or a function, I propose to name it hugepages_active. > If there's no objections, I'll add to the CF. As such, I re-opened the previous CF. https://commitfest.postgresql.org/38/3310/
Attachment
On 2023-Jan-24, Justin Pryzby wrote: > On Mon, Jan 23, 2023 at 05:33:35PM -0800, Andres Freund wrote: > > I'm ok with this being a GUC, it doesn't feel elegant, but I suspect there's > > no realistic elegant answer. > Whether it's a GUC or a function, I propose to name it hugepages_active. > If there's no objections, I'll add to the CF. Maybe I misread the code (actually I only read the patch), but -- does it set active=true when huge_pages=on? I think the code only works when huge_pages=try. That might be pretty confusing; I think it should say "on" in both cases. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Hay dos momentos en la vida de un hombre en los que no debería especular: cuando puede permitírselo y cuando no puede" (Mark Twain)
On Thu, Feb 02, 2023 at 04:53:37PM +0100, Alvaro Herrera wrote: > Maybe I misread the code (actually I only read the patch), but -- does > it set active=true when huge_pages=on? I think the code only works when > huge_pages=try. That might be pretty confusing; I think it should say > "on" in both cases. +1 Also, while this is indeed a runtime-computed parameter, it won't be initialized until after 'postgres -C' has been handled, so 'postgres -C' will always report it as "off". However, I'm not sure it makes sense to check it with 'postgres -C' anyway since you want to know if the current server is using huge pages. At the moment, I think I'd vote for a new function instead of a GUC. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Thu, Feb 02, 2023 at 04:53:37PM +0100, Alvaro Herrera wrote: > On 2023-Jan-24, Justin Pryzby wrote: > > On Mon, Jan 23, 2023 at 05:33:35PM -0800, Andres Freund wrote: > > > I'm ok with this being a GUC, it doesn't feel elegant, but I suspect there's > > > no realistic elegant answer. > > > Whether it's a GUC or a function, I propose to name it hugepages_active. > > If there's no objections, I'll add to the CF. > > Maybe I misread the code (actually I only read the patch), but -- does > it set active=true when huge_pages=on? I think the code only works when > huge_pages=try. That might be pretty confusing; I think it should say > "on" in both cases. Yes - I realized that too. There's no reason this GUC should be inaccurate when huge_pages=on. (I had hoped there would be a conflict needing resolution before anyone else noticed.) I don't think it makes sense to run postgres -C huge_pages_active, however, so I see no issue that that would always returns "false". If need be, maybe the documentation could say "indicates whether huge pages are active for the running server". Does anybody else want to vote for a function rather than a RUNTIME_COMPUTED GUC ? -- Justin
Attachment
On 2023-Feb-08, Justin Pryzby wrote: > I don't think it makes sense to run postgres -C huge_pages_active, > however, so I see no issue that that would always returns "false". Hmm, I would initialize it to return "unknown" rather than "off" — and make sure it turns "off" at the appropriate time. Otherwise you're just moving the confusion elsewhere. > If need be, maybe the documentation could say "indicates whether huge > pages are active for the running server". Dunno, that seems way too subtle. > Does anybody else want to vote for a function rather than a > RUNTIME_COMPUTED GUC ? I don't think I'd like to have SELECT show_block_size() et al, so I'd rather not go that way. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "¿Qué importan los años? Lo que realmente importa es comprobar que a fin de cuentas la mejor edad de la vida es estar vivo" (Mafalda)
On Thu, Feb 09, 2023 at 10:40:13AM +0100, Alvaro Herrera wrote: > On 2023-Feb-08, Justin Pryzby wrote: >> I don't think it makes sense to run postgres -C huge_pages_active, >> however, so I see no issue that that would always returns "false". > > Hmm, I would initialize it to return "unknown" rather than "off" — and > make sure it turns "off" at the appropriate time. Otherwise you're just > moving the confusion elsewhere. I think this approach would address my concerns about using a GUC. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Thu, Feb 09, 2023 at 11:29:09AM -0800, Nathan Bossart wrote: > On Thu, Feb 09, 2023 at 10:40:13AM +0100, Alvaro Herrera wrote: > > On 2023-Feb-08, Justin Pryzby wrote: > >> I don't think it makes sense to run postgres -C huge_pages_active, > >> however, so I see no issue that that would always returns "false". > > > > Hmm, I would initialize it to return "unknown" rather than "off" — and > > make sure it turns "off" at the appropriate time. Otherwise you're just > > moving the confusion elsewhere. > > I think this approach would address my concerns about using a GUC. Done that way. This also fixes the logic in win32_shmem.c. -- Justin
Attachment
On Mon, Feb 13, 2023 at 05:22:45PM -0600, Justin Pryzby wrote: > + Reports whether huge pages are in use by the current process. > + See <xref linkend="guc-huge-pages"/> for more information. nitpick: Should this say "server" instead of "current process"? > +static char *huge_pages_active = "unknown"; /* dynamically set */ nitpick: Does this need to be initialized here? > + { > + {"huge_pages_active", PGC_INTERNAL, PRESET_OPTIONS, > + gettext_noop("Indicates whether huge pages are in use."), > + NULL, > + GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED > + }, > + &huge_pages_active, > + "unknown", > + NULL, NULL, NULL > + }, I'm curious why you chose to make this a string instead of an enum. There might be little practical difference, but since there are only three possible values, I wonder if it'd be better form to make it an enum. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Mon, Feb 13, 2023 at 08:18:52PM -0800, Nathan Bossart wrote: > On Mon, Feb 13, 2023 at 05:22:45PM -0600, Justin Pryzby wrote: > > + Reports whether huge pages are in use by the current process. > > + See <xref linkend="guc-huge-pages"/> for more information. > > nitpick: Should this say "server" instead of "current process"? It should probably say "instance" :) > > +static char *huge_pages_active = "unknown"; /* dynamically set */ > > nitpick: Does this need to be initialized here? None of the GUCs' C vars need to be initialized, since the guc machinery will do it. ...but the convention is that they *are* initialized - and that's now partially enforced. See: d9d873bac67047cfacc9f5ef96ee488f2cb0f1c3 7d25958453a60337bcb7bcc986e270792c007ea4 a73952b795632b2cf5acada8476e7cf75857e9be > > + { > > + {"huge_pages_active", PGC_INTERNAL, PRESET_OPTIONS, > > + gettext_noop("Indicates whether huge pages are in use."), > > + NULL, > > + GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED > > + }, > > + &huge_pages_active, > > + "unknown", > > + NULL, NULL, NULL > > + }, > > I'm curious why you chose to make this a string instead of an enum. There > might be little practical difference, but since there are only three > possible values, I wonder if it'd be better form to make it an enum. It takes more code to write as an enum - see 002.txt. I'm not convinced this is better. But your comment made me fix its <type>, and reconsider the strings, which I changed to active={unknown/true/false} rather than {unk/on/off}. It could also be active={unknown/yes/no}... -- Justin
Attachment
On Tue, Feb 14, 2023 at 07:32:56PM -0600, Justin Pryzby wrote: > On Mon, Feb 13, 2023 at 08:18:52PM -0800, Nathan Bossart wrote: >> On Mon, Feb 13, 2023 at 05:22:45PM -0600, Justin Pryzby wrote: >> nitpick: Does this need to be initialized here? > > None of the GUCs' C vars need to be initialized, since the guc machinery > will do it. > > ...but the convention is that they *are* initialized - and that's now > partially enforced. > > See: > d9d873bac67047cfacc9f5ef96ee488f2cb0f1c3 > 7d25958453a60337bcb7bcc986e270792c007ea4 > a73952b795632b2cf5acada8476e7cf75857e9be I see. This looked a little strange to me because many of the other variables are uninitialized. In a73952b, I see that we allow the variables for string GUCs to be initialized to NULL. Anyway, this is only a nitpick. I don't feel strongly about it. >> I'm curious why you chose to make this a string instead of an enum. There >> might be little practical difference, but since there are only three >> possible values, I wonder if it'd be better form to make it an enum. > > It takes more code to write as an enum - see 002.txt. I'm not convinced > this is better. > > But your comment made me fix its <type>, and reconsider the strings, > which I changed to active={unknown/true/false} rather than {unk/on/off}. > It could also be active={unknown/yes/no}... I think unknown/true/false is fine. I'm okay with using a string if no one else thinks it should be an enum (or a bool). -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Feb 15, 2023 at 10:13:17AM -0800, Nathan Bossart wrote: > On Tue, Feb 14, 2023 at 07:32:56PM -0600, Justin Pryzby wrote: >> On Mon, Feb 13, 2023 at 08:18:52PM -0800, Nathan Bossart wrote: >>> I'm curious why you chose to make this a string instead of an enum. There >>> might be little practical difference, but since there are only three >>> possible values, I wonder if it'd be better form to make it an enum. >> >> It takes more code to write as an enum - see 002.txt. I'm not convinced >> this is better. >> >> But your comment made me fix its <type>, and reconsider the strings, >> which I changed to active={unknown/true/false} rather than {unk/on/off}. >> It could also be active={unknown/yes/no}... > > I think unknown/true/false is fine. I'm okay with using a string if no one > else thinks it should be an enum (or a bool). There's been no response for this, so I guess we can proceed with a string GUC. + Reports whether huge pages are in use by the current instance. + See <xref linkend="guc-huge-pages"/> for more information. I still think we should say "server" in place of "current instance" here. + {"huge_pages_active", PGC_INTERNAL, PRESET_OPTIONS, + gettext_noop("Indicates whether huge pages are in use."), + NULL, + GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED + }, I don't think we need to use GUC_RUNTIME_COMPUTED. 'postgres -C' seems to always report "unknown" for this GUC, so all this would do is cause that command to error unnecessarily when the server is running. It might be worth documenting exactly what "unknown" means. IIUC you'll only ever see "on" or "off" via SHOW or pg_settings, which doesn't seem tremendously obvious. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Greetings, * Nathan Bossart (nathandbossart@gmail.com) wrote: > On Wed, Feb 15, 2023 at 10:13:17AM -0800, Nathan Bossart wrote: > > On Tue, Feb 14, 2023 at 07:32:56PM -0600, Justin Pryzby wrote: > >> On Mon, Feb 13, 2023 at 08:18:52PM -0800, Nathan Bossart wrote: > >>> I'm curious why you chose to make this a string instead of an enum. There > >>> might be little practical difference, but since there are only three > >>> possible values, I wonder if it'd be better form to make it an enum. > >> > >> It takes more code to write as an enum - see 002.txt. I'm not convinced > >> this is better. > >> > >> But your comment made me fix its <type>, and reconsider the strings, > >> which I changed to active={unknown/true/false} rather than {unk/on/off}. > >> It could also be active={unknown/yes/no}... > > > > I think unknown/true/false is fine. I'm okay with using a string if no one > > else thinks it should be an enum (or a bool). > > There's been no response for this, so I guess we can proceed with a string > GUC. Just happened to see this and I'm not really a fan of this being a string when it's pretty clear that's not what it actually is. > + Reports whether huge pages are in use by the current instance. > + See <xref linkend="guc-huge-pages"/> for more information. > > I still think we should say "server" in place of "current instance" here. We certainly use 'server' a lot more in config.sgml than we do 'instance'. "currently running server" might be closer to how we describe a running PG system in other parts (we talk about "currently running server processes", "while the server is running", "When running a standby server", "when the server is running"; "instance" is used much less and seems to more typically refer to 'state of files on disk' in my reading vs. 'actively running process' though there's some of each). > + {"huge_pages_active", PGC_INTERNAL, PRESET_OPTIONS, > + gettext_noop("Indicates whether huge pages are in use."), > + NULL, > + GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED > + }, > > I don't think we need to use GUC_RUNTIME_COMPUTED. 'postgres -C' seems to > always report "unknown" for this GUC, so all this would do is cause that > command to error unnecessarily when the server is running. ... or we could consider adjusting things to actually try the mmap() and find out if we'd end up with huge pages or not. Naturally we'd only want to do that if the server isn't running though and erroring if it is would be perfectly correct. Either that or just refusing to provide it by an error or other approach (see below) seems entirely reasonable. > It might be worth documenting exactly what "unknown" means. IIUC you'll > only ever see "on" or "off" via SHOW or pg_settings, which doesn't seem > tremendously obvious. If we could get rid of that case and just make this a boolean, that seems like it'd really be the best answer. To that end- perhaps this isn't appropriate as a GUC at all? Why not have this just be a system information function? Functionally it really provides the same info- if the server is running then you get back either true or false, if it's not then you can't call it but that's hardly different from an unknown or error result. Thanks, Stephen
Attachment
On Thu, Mar 09, 2023 at 09:34:10AM -0500, Stephen Frost wrote: > Greetings, > > * Nathan Bossart (nathandbossart@gmail.com) wrote: > > On Wed, Feb 15, 2023 at 10:13:17AM -0800, Nathan Bossart wrote: > > > On Tue, Feb 14, 2023 at 07:32:56PM -0600, Justin Pryzby wrote: > > >> On Mon, Feb 13, 2023 at 08:18:52PM -0800, Nathan Bossart wrote: > > >>> I'm curious why you chose to make this a string instead of an enum. There > > >>> might be little practical difference, but since there are only three > > >>> possible values, I wonder if it'd be better form to make it an enum. > > >> > > >> It takes more code to write as an enum - see 002.txt. I'm not convinced > > >> this is better. > > >> > > >> But your comment made me fix its <type>, and reconsider the strings, > > >> which I changed to active={unknown/true/false} rather than {unk/on/off}. > > >> It could also be active={unknown/yes/no}... > > > > > > I think unknown/true/false is fine. I'm okay with using a string if no one > > > else thinks it should be an enum (or a bool). > > > > There's been no response for this, so I guess we can proceed with a string > > GUC. > > Just happened to see this and I'm not really a fan of this being a > string when it's pretty clear that's not what it actually is. I don't understand what you mean by that. Why do you say it isn't a string ? > > + Reports whether huge pages are in use by the current instance. > > + See <xref linkend="guc-huge-pages"/> for more information. > > > > I still think we should say "server" in place of "current instance" here. > > We certainly use 'server' a lot more in config.sgml than we do > 'instance'. "currently running server" might be closer to how we > describe a running PG system in other parts (we talk about "currently > running server processes", "while the server is running", "When running > a standby server", "when the server is running"; "instance" is used much > less and seems to more typically refer to 'state of files on disk' in my > reading vs. 'actively running process' though there's some of each). I called it "instance" since the GUC has no meaning when it's not running. I'm fine to rename it to "running server". > > + {"huge_pages_active", PGC_INTERNAL, PRESET_OPTIONS, > > + gettext_noop("Indicates whether huge pages are in use."), > > + NULL, > > + GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED > > + }, > > > > I don't think we need to use GUC_RUNTIME_COMPUTED. 'postgres -C' seems to > > always report "unknown" for this GUC, so all this would do is cause that > > command to error unnecessarily when the server is running. > > ... or we could consider adjusting things to actually try the mmap() and > find out if we'd end up with huge pages or not. That seems like a bad idea, since it might work one moment and fail one moment later. If we could tell in advance whether it was going to work, we wouldn't be here, and probably also wouldn't have invented huge_pages=true. > > It might be worth documenting exactly what "unknown" means. IIUC you'll > > only ever see "on" or "off" via SHOW or pg_settings, which doesn't seem > > tremendously obvious. > > If we could get rid of that case and just make this a boolean, that > seems like it'd really be the best answer. > > To that end- perhaps this isn't appropriate as a GUC at all? Why not > have this just be a system information function? Functionally it really > provides the same info- if the server is running then you get back > either true or false, if it's not then you can't call it but that's > hardly different from an unknown or error result. We talked about making it a function ~6 weeks ago. Is there an agreement to use a function, instead ? -- Justin
On 2023-Mar-09, Justin Pryzby wrote: > On Thu, Mar 09, 2023 at 09:34:10AM -0500, Stephen Frost wrote: > > > + Reports whether huge pages are in use by the current instance. > > > + See <xref linkend="guc-huge-pages"/> for more information. > > > > > > I still think we should say "server" in place of "current instance" here. > > > > We certainly use 'server' a lot more in config.sgml than we do > > 'instance'. "currently running server" might be closer to how we > > describe a running PG system in other parts (we talk about "currently > > running server processes", "while the server is running", "When running > > a standby server", "when the server is running"; "instance" is used much > > less and seems to more typically refer to 'state of files on disk' in my > > reading vs. 'actively running process' though there's some of each). > > I called it "instance" since the GUC has no meaning when it's not > running. I'm fine to rename it to "running server". I'd rather make all these other places use "instance" instead. We used to consider these terms interchangeable, but since we introduced the glossary to unify the terminology, they are no longer supposed to be. A server (== a machine) can contain many instances, and each individual instance in the server could be using huge pages or not. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Now I have my system running, not a byte was off the shelf; It rarely breaks and when it does I fix the code myself. It's stable, clean and elegant, and lightning fast as well, And it doesn't cost a nickel, so Bill Gates can go to hell."
On Thu, Mar 09, 2023 at 10:38:56AM -0600, Justin Pryzby wrote: > On Thu, Mar 09, 2023 at 09:34:10AM -0500, Stephen Frost wrote: >> To that end- perhaps this isn't appropriate as a GUC at all? Why not >> have this just be a system information function? Functionally it really >> provides the same info- if the server is running then you get back >> either true or false, if it's not then you can't call it but that's >> hardly different from an unknown or error result. > > We talked about making it a function ~6 weeks ago. > > Is there an agreement to use a function, instead ? If we're tallying votes, please count me as +1 for a system information function. I think that nicely sidesteps having to return "unknown" in some cases, which I worry will just cause confusion. IMHO it is much more obvious that the value refers to the current server if you have to log in and execute a function to see it. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Thu, Mar 09, 2023 at 06:51:21PM +0100, Alvaro Herrera wrote: > I'd rather make all these other places use "instance" instead. We used > to consider these terms interchangeable, but since we introduced the > glossary to unify the terminology, they are no longer supposed to be. > A server (== a machine) can contain many instances, and each individual > instance in the server could be using huge pages or not. Ah, good to know. I've always considered "server" in this context to mean the server process(es) for a single instance, but I can see the value in having different terminology to clearly distinguish the process(es) from the machine. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Greetings, * Justin Pryzby (pryzby@telsasoft.com) wrote: > On Thu, Mar 09, 2023 at 09:34:10AM -0500, Stephen Frost wrote: > > * Nathan Bossart (nathandbossart@gmail.com) wrote: > > > On Wed, Feb 15, 2023 at 10:13:17AM -0800, Nathan Bossart wrote: > > > > On Tue, Feb 14, 2023 at 07:32:56PM -0600, Justin Pryzby wrote: > > > >> On Mon, Feb 13, 2023 at 08:18:52PM -0800, Nathan Bossart wrote: > > > >>> I'm curious why you chose to make this a string instead of an enum. There > > > >>> might be little practical difference, but since there are only three > > > >>> possible values, I wonder if it'd be better form to make it an enum. > > > >> > > > >> It takes more code to write as an enum - see 002.txt. I'm not convinced > > > >> this is better. > > > >> > > > >> But your comment made me fix its <type>, and reconsider the strings, > > > >> which I changed to active={unknown/true/false} rather than {unk/on/off}. > > > >> It could also be active={unknown/yes/no}... > > > > > > > > I think unknown/true/false is fine. I'm okay with using a string if no one > > > > else thinks it should be an enum (or a bool). > > > > > > There's been no response for this, so I guess we can proceed with a string > > > GUC. > > > > Just happened to see this and I'm not really a fan of this being a > > string when it's pretty clear that's not what it actually is. > > I don't understand what you mean by that. > Why do you say it isn't a string ? Because it's clearly a yes/no, either the server is currently running with huge pages, or it isn't. That's the definition of a boolean. Sure, anything can be cast to text but when there's a data type that fits better, that's almost uniformly better to use. > > > + Reports whether huge pages are in use by the current instance. > > > + See <xref linkend="guc-huge-pages"/> for more information. > > > > > > I still think we should say "server" in place of "current instance" here. > > > > We certainly use 'server' a lot more in config.sgml than we do > > 'instance'. "currently running server" might be closer to how we > > describe a running PG system in other parts (we talk about "currently > > running server processes", "while the server is running", "When running > > a standby server", "when the server is running"; "instance" is used much > > less and seems to more typically refer to 'state of files on disk' in my > > reading vs. 'actively running process' though there's some of each). > > I called it "instance" since the GUC has no meaning when it's not > running. I'm fine to rename it to "running server". Great, I do think that would match better with the rest of the documentation. > > > + {"huge_pages_active", PGC_INTERNAL, PRESET_OPTIONS, > > > + gettext_noop("Indicates whether huge pages are in use."), > > > + NULL, > > > + GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED > > > + }, > > > > > > I don't think we need to use GUC_RUNTIME_COMPUTED. 'postgres -C' seems to > > > always report "unknown" for this GUC, so all this would do is cause that > > > command to error unnecessarily when the server is running. > > > > ... or we could consider adjusting things to actually try the mmap() and > > find out if we'd end up with huge pages or not. > > That seems like a bad idea, since it might work one moment and fail one > moment later. If we could tell in advance whether it was going to work, > we wouldn't be here, and probably also wouldn't have invented > huge_pages=true. Sure it might ... but I tend to disagree that it's actually all that likely for it to end up being as inconsistent as that and it'd be nice to be able to see if the server will end up successfully starting (for this part, at least), or not, when configured with huge pages set to on, or if starting with 'try' is likely to result in it actually using huge pages, or not. > > > It might be worth documenting exactly what "unknown" means. IIUC you'll > > > only ever see "on" or "off" via SHOW or pg_settings, which doesn't seem > > > tremendously obvious. > > > > If we could get rid of that case and just make this a boolean, that > > seems like it'd really be the best answer. > > > > To that end- perhaps this isn't appropriate as a GUC at all? Why not > > have this just be a system information function? Functionally it really > > provides the same info- if the server is running then you get back > > either true or false, if it's not then you can't call it but that's > > hardly different from an unknown or error result. > > We talked about making it a function ~6 weeks ago. Oh, good, glad I'm not the only one to have thought of that. > Is there an agreement to use a function, instead ? Looking back at the arguments for having it be a GUC ... I just don't really see any of them as very strong. Not that I feel super strongly about it being a function either, but it's certainly not a configuration variable and it also isn't really available with postgres -C (and therefore doesn't actually go along with how the *size GUCs work). It's literally information about the running system that the user might be curious about ... and that sure seems to fit pretty cleanly under 'System Information Functions'. Thanks, Stephen
Attachment
Greetings, * Alvaro Herrera (alvherre@alvh.no-ip.org) wrote: > On 2023-Mar-09, Justin Pryzby wrote: > > On Thu, Mar 09, 2023 at 09:34:10AM -0500, Stephen Frost wrote: > > > > + Reports whether huge pages are in use by the current instance. > > > > + See <xref linkend="guc-huge-pages"/> for more information. > > > > > > > > I still think we should say "server" in place of "current instance" here. > > > > > > We certainly use 'server' a lot more in config.sgml than we do > > > 'instance'. "currently running server" might be closer to how we > > > describe a running PG system in other parts (we talk about "currently > > > running server processes", "while the server is running", "When running > > > a standby server", "when the server is running"; "instance" is used much > > > less and seems to more typically refer to 'state of files on disk' in my > > > reading vs. 'actively running process' though there's some of each). > > > > I called it "instance" since the GUC has no meaning when it's not > > running. I'm fine to rename it to "running server". > > I'd rather make all these other places use "instance" instead. We used > to consider these terms interchangeable, but since we introduced the > glossary to unify the terminology, they are no longer supposed to be. > A server (== a machine) can contain many instances, and each individual > instance in the server could be using huge pages or not. Perhaps, but then the references to instance should probably also be changed since there's certainly some that are referring to a set of data files and not to backend processes, eg: ###### The <literal>shutdown</literal> setting is useful to have the instance ready at the exact replay point desired. The instance will still be able to replay more WAL records (and in fact will have to replay WAL records since the last checkpoint next time it is started). ###### Not sure about just changing one thing at a time though or using the 'right' term when introducing things while having the 'wrong' term be used next to it. Also not saying that this patch should be responsible for fixing everything either. 'Server' in the glossary does explicitly say it could be used when referring to an 'instance' too though, so 'running server' doesn't seem to be entirely wrong. Thanks, Stephen
Attachment
On Thu, Mar 09, 2023 at 03:02:29PM -0500, Stephen Frost wrote: > * Justin Pryzby (pryzby@telsasoft.com) wrote: > > On Thu, Mar 09, 2023 at 09:34:10AM -0500, Stephen Frost wrote: > > > * Nathan Bossart (nathandbossart@gmail.com) wrote: > > > > On Wed, Feb 15, 2023 at 10:13:17AM -0800, Nathan Bossart wrote: > > > > > On Tue, Feb 14, 2023 at 07:32:56PM -0600, Justin Pryzby wrote: > > > > >> On Mon, Feb 13, 2023 at 08:18:52PM -0800, Nathan Bossart wrote: > > > > >>> I'm curious why you chose to make this a string instead of an enum. There > > > > >>> might be little practical difference, but since there are only three > > > > >>> possible values, I wonder if it'd be better form to make it an enum. > > > > >> > > > > >> It takes more code to write as an enum - see 002.txt. I'm not convinced > > > > >> this is better. > > > > >> > > > > >> But your comment made me fix its <type>, and reconsider the strings, > > > > >> which I changed to active={unknown/true/false} rather than {unk/on/off}. > > > > >> It could also be active={unknown/yes/no}... > > > > > > > > > > I think unknown/true/false is fine. I'm okay with using a string if no one > > > > > else thinks it should be an enum (or a bool). > > > > > > > > There's been no response for this, so I guess we can proceed with a string > > > > GUC. > > > > > > Just happened to see this and I'm not really a fan of this being a > > > string when it's pretty clear that's not what it actually is. > > > > I don't understand what you mean by that. > > Why do you say it isn't a string ? > > Because it's clearly a yes/no, either the server is currently running > with huge pages, or it isn't. That's the definition of a boolean. I originally implemented it as a boolean, and I changed it in response to the feedback that postgres -C huge_pages_active should return "unknown". > > Is there an agreement to use a function, instead ? Alvaro was -1 on using a function Andres is +0 (?) Nathan is +1 Stephen is +1 I'm -0.5, but I reimplemented it as a function. I hope that helps it to progress. Please include a suggestion if there's better place for the function or global var. -- Justin
Attachment
Greetings,
On Mon, Mar 13, 2023 at 21:03 Justin Pryzby <pryzby@telsasoft.com> wrote:
On Thu, Mar 09, 2023 at 03:02:29PM -0500, Stephen Frost wrote:
> * Justin Pryzby (pryzby@telsasoft.com) wrote:
> > On Thu, Mar 09, 2023 at 09:34:10AM -0500, Stephen Frost wrote:
> > > * Nathan Bossart (nathandbossart@gmail.com) wrote:
> > > > On Wed, Feb 15, 2023 at 10:13:17AM -0800, Nathan Bossart wrote:
> > > > > On Tue, Feb 14, 2023 at 07:32:56PM -0600, Justin Pryzby wrote:
> > > > >> On Mon, Feb 13, 2023 at 08:18:52PM -0800, Nathan Bossart wrote:
> > > > >>> I'm curious why you chose to make this a string instead of an enum. There
> > > > >>> might be little practical difference, but since there are only three
> > > > >>> possible values, I wonder if it'd be better form to make it an enum.
> > > > >>
> > > > >> It takes more code to write as an enum - see 002.txt. I'm not convinced
> > > > >> this is better.
> > > > >>
> > > > >> But your comment made me fix its <type>, and reconsider the strings,
> > > > >> which I changed to active={unknown/true/false} rather than {unk/on/off}.
> > > > >> It could also be active={unknown/yes/no}...
> > > > >
> > > > > I think unknown/true/false is fine. I'm okay with using a string if no one
> > > > > else thinks it should be an enum (or a bool).
> > > >
> > > > There's been no response for this, so I guess we can proceed with a string
> > > > GUC.
> > >
> > > Just happened to see this and I'm not really a fan of this being a
> > > string when it's pretty clear that's not what it actually is.
> >
> > I don't understand what you mean by that.
> > Why do you say it isn't a string ?
>
> Because it's clearly a yes/no, either the server is currently running
> with huge pages, or it isn't. That's the definition of a boolean.
I originally implemented it as a boolean, and I changed it in response
to the feedback that postgres -C huge_pages_active should return
"unknown".
I really don’t see how that’s at all useful.
> > Is there an agreement to use a function, instead ?
Alvaro was -1 on using a function
I don’t entirely get that argument (select thisfunc(); is much worse than show thisguc; ..? Also, the former is easier to use with other functions and such, as you don’t have to do current_setting(‘thisguc’)…).
Andres is +0 (?)
Kinda felt like this was closer to +0.5 or more, for my part anyway.
Nathan is +1
Stephen is +1
I'm -0.5,
Why..?
but I reimplemented it as a function.
Thanks!
I hope that helps it to
progress. Please include a suggestion if there's better place for the
function or global var.
Will try to give it a look tomorrow.
Thanks again!
Stephen
At Mon, 13 Mar 2023 21:33:31 +0100, Stephen Frost <sfrost@snowman.net> wrote in > > On Thu, Mar 09, 2023 at 03:02:29PM -0500, Stephen Frost wrote: > > > * Justin Pryzby (pryzby@telsasoft.com) wrote: > > > Is there an agreement to use a function, instead ? > > > > Alvaro was -1 on using a function > > > I don’t entirely get that argument (select thisfunc(); is much worse than > show thisguc; ..? Also, the former is easier to use with other functions > and such, as you don’t have to do current_setting(‘thisguc’)…). > > Andres is +0 (?) > > > Kinda felt like this was closer to +0.5 or more, for my part anyway. > > Nathan is +1 > > Stephen is +1 > > > > I'm -0.5, > > > Why..? IMHO, it appears that we use GUC "internal" variables to for the lifespan-long numbers of a postmaster process or an instance. Examples of such variables includes shared_memory_size and s_m_s_in_huge_pages, integer_datetimes and data_checksums. I'm uncertain about in_hot_standby, as it can change during a postmaster's lifetime. However, pg_is_in_recovery() provides essentially the same information. Regarding huge_page_active, its value remains constant throughout a postmaster's lifespan. In this regard, GUC may be a better fit for this information. The issue with using GUC for this value is that the postgres command cannot report the final value via the -C option, which may be the reason for the third alternative "unknown". I slightly prefer using a function for this, as if GUC is used, it can only return "unknown" for the command "postgres -C huge_page_active". However, apart from this advantage, I prefer using a GUC for this information. If we implement it as a function, I suggest naming it "pg_huge_page_is_active" or something similar (the use of "is" is signinficant here) to follow the naming convention used in pg_is_in_recovery(). regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Tue, Mar 14, 2023 at 02:02:19PM +0900, Kyotaro Horiguchi wrote: > Regarding huge_page_active, its value remains constant throughout a > postmaster's lifespan. In this regard, GUC may be a better fit for > this information. The issue with using GUC for this value is that the > postgres command cannot report the final value via the -C option, > which may be the reason for the third alternative "unknown". > > I slightly prefer using a function for this, as if GUC is used, it can > only return "unknown" for the command "postgres -C > huge_page_active". However, apart from this advantage, I prefer using > a GUC for this information. The main advantage of a read-only GUC over a function is that users would not need to start a postmaster to know if huge pages would be active or not. This is the main reason why a GUC would be a better fit, in my opinion, because it makes for a cheaper check, while still allowing a SQL query to check the value of the GUC. -- Michael
Attachment
On Mon, Mar 20, 2023 at 01:54:46PM +0900, Michael Paquier wrote: > The main advantage of a read-only GUC over a function is that users > would not need to start a postmaster to know if huge pages would be > active or not. This is the main reason why a GUC would be a better > fit, in my opinion, because it makes for a cheaper check, while still > allowing a SQL query to check the value of the GUC. [ Should have read more carefully ] .. Which is something you cannot do with -C because mmap() happens after the runtime-computed logic for postgres -C. It does not sound right to do the mmap() for a GUC check, so indeed a function may be more adapted rather than move mmap() call a bit earlier in the postmaster startup sequence. -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > On Tue, Mar 14, 2023 at 02:02:19PM +0900, Kyotaro Horiguchi wrote: >> I slightly prefer using a function for this, as if GUC is used, it can >> only return "unknown" for the command "postgres -C >> huge_page_active". However, apart from this advantage, I prefer using >> a GUC for this information. > The main advantage of a read-only GUC over a function is that users > would not need to start a postmaster to know if huge pages would be > active or not. I'm confused here, because Horiguchi-san is saying that that won't work. I've not checked the code lately, but I think that "postgres -C var" prints its results before actually attempting to establish shared memory, so I suspect Horiguchi-san is right. regards, tom lane
On Mon, Mar 20, 2023 at 01:09:09AM -0400, Tom Lane wrote: > I'm confused here, because Horiguchi-san is saying that that > won't work. I've not checked the code lately, but I think that > "postgres -C var" prints its results before actually attempting > to establish shared memory, so I suspect Horiguchi-san is right. Yes, I haven't read correctly through. Sorry for the noise. -- Michael
Attachment
On Mon, Mar 20, 2023 at 02:03:13PM +0900, Michael Paquier wrote: > On Mon, Mar 20, 2023 at 01:54:46PM +0900, Michael Paquier wrote: > > The main advantage of a read-only GUC over a function is that users > > would not need to start a postmaster to know if huge pages would be > > active or not. This is the main reason why a GUC would be a better > > fit, in my opinion, because it makes for a cheaper check, while still > > allowing a SQL query to check the value of the GUC. > > [ Should have read more carefully ] > > .. Which is something you cannot do with -C because mmap() happens > after the runtime-computed logic for postgres -C. It does not sound > right to do the mmap() for a GUC check, so indeed a function may be > more adapted rather than move mmap() call a bit earlier in the > postmaster startup sequence. On Mon, Mar 20, 2023 at 02:17:33PM +0900, Michael Paquier wrote: > On Mon, Mar 20, 2023 at 01:09:09AM -0400, Tom Lane wrote: > > I'm confused here, because Horiguchi-san is saying that that > > won't work. I've not checked the code lately, but I think that > > "postgres -C var" prints its results before actually attempting > > to establish shared memory, so I suspect Horiguchi-san is right. > > Yes, I haven't read correctly through. Sorry for the noise. You set this patch to "waiting on author" twice. Would you let me know what I could do to help progress the patch? Right now, I have no idea. Most recently, you said it'd be better implemented as a GUC to allow using -C, but then recanted because -C doesn't work for this (which is why I implemented it as a string back on 2023-02-08). Which is why I reset its status on 2023-03-20. 2023-03-22 01:36:58 Michael Paquier (michael-kun) New status: Waiting on Author 2023-03-20 13:05:32 Justin Pryzby (justinpryzby) New status: Needs review 2023-03-20 05:03:53 Michael Paquier (michael-kun) New status: Waiting on Author -- Justin
On Tue, Mar 21, 2023 at 09:19:41PM -0500, Justin Pryzby wrote: > You set this patch to "waiting on author" twice. Would you let me know > what I could do to help progress the patch? Right now, I have no idea. My mistake, I've been looking at an incorrect version of the patch. Thanks for correcting me here. I have read through the proposed v5 of the patch, that seems to be the latest one available: https://www.postgresql.org/message-id/ZA+Bpk/6LcYiUXnh@telsasoft.com I have noted something.. For the WIN32 case, we have that: +++ b/src/backend/port/win32_shmem.c @@ -327,6 +327,8 @@ retry: Sleep(1000); continue; } + + huge_pages_active = ((flProtect & SEC_LARGE_PAGES) != 0); break; Are you sure that this is correct? This is set in PGSharedMemoryCreate(), part of CreateSharedMemoryAndSemaphores() in the startup sequence that creates the shmem segment. However, for a normal backend created by EXEC_BACKEND, SubPostmasterMain() reattaches to an existing shared memory segment, so we don't go through the creation path that would set huge_pages_active for the process just started, (note that InitPostmasterChild() switches IsUnderPostmaster, bypassing the shmem segment creation). -- Michael
Attachment
On Wed, Mar 22, 2023 at 04:37:01PM +0900, Michael Paquier wrote: > I have noted something.. For the WIN32 case, we have that: > > +++ b/src/backend/port/win32_shmem.c > @@ -327,6 +327,8 @@ retry: > Sleep(1000); > continue; > } > + > + huge_pages_active = ((flProtect & SEC_LARGE_PAGES) != 0); > break; > > Are you sure that this is correct? This is set in > PGSharedMemoryCreate(), part of CreateSharedMemoryAndSemaphores() in > the startup sequence that creates the shmem segment. However, for a > normal backend created by EXEC_BACKEND, SubPostmasterMain() reattaches > to an existing shared memory segment, so we don't go through the > creation path that would set huge_pages_active for the process just > started, (note that InitPostmasterChild() switches IsUnderPostmaster, > bypassing the shmem segment creation). Wow, good point. I think to make it work we'd need put huge_pages_active into BackendParameters and handle it in save_backend_variables(). If so, that'd be strong argument for using a GUC, which already has all the necessary infrastructure for exposing the server's state. -- Justin
On Wed, Mar 22, 2023 at 05:18:28PM -0500, Justin Pryzby wrote: > Wow, good point. I think to make it work we'd need put > huge_pages_active into BackendParameters and handle it in > save_backend_variables(). If so, that'd be strong argument for using a > GUC, which already has all the necessary infrastructure for exposing the > server's state. I am afraid so, duplicating an existing infrastructure for a need like the one of this thread is not really appealing. -- Michael
Attachment
At Thu, 23 Mar 2023 07:23:28 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Wed, Mar 22, 2023 at 05:18:28PM -0500, Justin Pryzby wrote: > > Wow, good point. I think to make it work we'd need put > > huge_pages_active into BackendParameters and handle it in > > save_backend_variables(). If so, that'd be strong argument for using a > > GUC, which already has all the necessary infrastructure for exposing the > > server's state. > > I am afraid so, duplicating an existing infrastructure for a need like > the one of this thread is not really appealing. Wouldn't storing the value in the shared memory itself work? Though, that space does become almost dead for the server's lifetime... regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Thu, Mar 23, 2023 at 05:25:46PM +0900, Kyotaro Horiguchi wrote: > Wouldn't storing the value in the shared memory itself work? Though, > that space does become almost dead for the server's lifetime... I'm sure it's possible, but it's also not worth writing a special implementation just to handle huge_pages_active, which is better written in 30 lines than in 300 lines. If we needed to avoid using a GUC, maybe it'd work to add huge_pages_active to PGShmemHeader. But "avoid using gucs at all costs" isn't the goal, and using a GUC parallels all the similar, and related things without needing to allocate extra bits of shared_memory. On Thu, Mar 23, 2023 at 07:23:28AM +0900, Michael Paquier wrote: > On Wed, Mar 22, 2023 at 05:18:28PM -0500, Justin Pryzby wrote: > > Wow, good point. I think to make it work we'd need put > > huge_pages_active into BackendParameters and handle it in > > save_backend_variables(). If so, that'd be strong argument for using a > > GUC, which already has all the necessary infrastructure for exposing the > > server's state. > > I am afraid so, duplicating an existing infrastructure for a need like > the one of this thread is not really appealing. This goes back to using a GUC, and: - removes GUC_RUNTIME_COMPUTED, since that causes a useless error when using -C if the server is running, rather than successfully printing "unknown". - I also renamed it from huge_pages_active = {true,false,unknown} to huge_pages_STATUS = {on,off,unknown}. This parallels huge_pages, which is documented to accept values on/off/try. Also, true/false isn't how bools are displayed. - I realized that the rename suggested implementing it as an enum GUC, and re-using the existing HUGE_PAGES_{ON,OFF} values (and adding an "UNKNOWN"). Maybe this also avoids Stephen's earlier objection to using a string ? I mis-used cirrusci to check that the GUC does work correctly under windows. If there's continuing aversions to using a GUC, please say so, and try to suggest an alternate implementation you think would be justified. It'd need to work under windows with EXEC_BACKEND. -- Justin
Attachment
On Thu, Mar 23, 2023 at 08:50:50PM -0500, Justin Pryzby wrote: > I'm sure it's possible, but it's also not worth writing a special > implementation just to handle huge_pages_active, which is better written > in 30 lines than in 300 lines. > > If we needed to avoid using a GUC, maybe it'd work to add > huge_pages_active to PGShmemHeader. But "avoid using gucs at all costs" > isn't the goal, and using a GUC parallels all the similar, and related > things without needing to allocate extra bits of shared_memory. Yeah, I kind of agree that the complications are not appealing compared to the use case. FWIW, I am fine with just using "unknown" even under -C because we don't know the status without the mmpa() call. And we don't want to assign what would be potentially a bunch of memory when running that. > This goes back to using a GUC, and: > > - removes GUC_RUNTIME_COMPUTED, since that causes a useless error when > using -C if the server is running, rather than successfully printing > "unknown". > - I also renamed it from huge_pages_active = {true,false,unknown} to > huge_pages_STATUS = {on,off,unknown}. This parallels huge_pages, > which is documented to accept values on/off/try. Also, true/false > isn't how bools are displayed. > - I realized that the rename suggested implementing it as an enum GUC, > and re-using the existing HUGE_PAGES_{ON,OFF} values (and adding an > "UNKNOWN"). Maybe this also avoids Stephen's earlier objection to > using a string ? huge_pages_status is OK here, as is reusing the enum struct to avoid an unnecessary duplication. > I mis-used cirrusci to check that the GUC does work correctly under > windows. You mean that you abused of it in some custom branch running the CI on github? If I may ask, what did you do to make sure that huge pages are set when re-attaching a backend to a shmem area? > If there's continuing aversions to using a GUC, please say so, and try > to suggest an alternate implementation you think would be justified. > It'd need to work under windows with EXEC_BACKEND. Looking at the patch, it seems like that this should work even for EXEC_BACKEND on *nix when a backend reattaches.. It may be better to be sure of that, as well, if it has not been tested. +++ b/src/backend/port/win32_shmem.c @@ -327,6 +327,10 @@ retry: Sleep(1000); continue; } + + SetConfigOption("huge_pages_status", (flProtect & SEC_LARGE_PAGES) ? + "on" : "off", PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT) Perhaps better to just move that at the end of PGSharedMemoryCreate() for WIN32? + <varlistentry id="guc-huge-pages-status" xreflabel="huge_pages_status"> + <term><varname>huge_pages_status</varname> (<type>enum</type>) + <indexterm> + <primary><varname>huge_pages_status</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Reports the state of huge pages in the current instance. + See <xref linkend="guc-huge-pages"/> for more information. + </para> + </listitem> + </varlistentry> The patch needs to provide more details about the unknown state, IMO, to make it crystal-clear to the users what are the limitations of this GUC, especially regarding the fact that this is useful when "try"-ing to allocate huge pages, and that the value can only be consulted after the server has been started. -- Michael
Attachment
On Tue, Mar 28, 2023 at 09:35:30AM +0900, Michael Paquier wrote: > The patch needs to provide more details about the unknown state, IMO, > to make it crystal-clear to the users what are the limitations of this > GUC, especially regarding the fact that this is useful when "try"-ing > to allocate huge pages, and that the value can only be consulted after > the server has been started. This is still unanswered? Perhaps at this stage we'd better consider that for v17 so as we have more time to agree on the user interface? -- Michael
Attachment
On Tue, Mar 28, 2023 at 09:35:30AM +0900, Michael Paquier wrote: > On Thu, Mar 23, 2023 at 08:50:50PM -0500, Justin Pryzby wrote: > > You mean that you abused of it in some custom branch running the CI on > github? If I may ask, what did you do to make sure that huge pages > are set when re-attaching a backend to a shmem area? Yes. I hijacked a tap test to first run "show huge_pages_active" and then also caused it to fail, such that I could check its output. https://cirrus-ci.com/task/6309510885670912 https://cirrus-ci.com/task/6613427737591808 > > If there's continuing aversions to using a GUC, please say so, and try > > to suggest an alternate implementation you think would be justified. > > It'd need to work under windows with EXEC_BACKEND. > > Looking at the patch, it seems like that this should work even for > EXEC_BACKEND on *nix when a backend reattaches.. It may be better to > be sure of that, as well, if it has not been tested. Arg, no - it wasn't working. I added an assert to check that a running server won't output "unknown". > +++ b/src/backend/port/win32_shmem.c > @@ -327,6 +327,10 @@ retry: > Sleep(1000); > continue; > } > + > + SetConfigOption("huge_pages_status", (flProtect & SEC_LARGE_PAGES) ? > + "on" : "off", PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT) > > Perhaps better to just move that at the end of PGSharedMemoryCreate() > for WIN32? Maybe - I don't know. > + <varlistentry id="guc-huge-pages-status" xreflabel="huge_pages_status"> > + <term><varname>huge_pages_status</varname> (<type>enum</type>) > + <indexterm> > + <primary><varname>huge_pages_status</varname> configuration parameter</primary> > + </indexterm> > + </term> > + <listitem> > + <para> > + Reports the state of huge pages in the current instance. > + See <xref linkend="guc-huge-pages"/> for more information. > + </para> > + </listitem> > + </varlistentry> > > The patch needs to provide more details about the unknown state, IMO, > to make it crystal-clear to the users what are the limitations of this > GUC, especially regarding the fact that this is useful when "try"-ing > to allocate huge pages, and that the value can only be consulted after > the server has been started. I'm not sure I agree. It can be confusing (even harmful) to overspecify the documentation. I used the word "instance" specifically to refer to a running server. Anyone querying the status of huge pages is going to need to understand that it doesn't make sense to ask about the memory state of a server that's not running. Actually, I'm having trouble imagining that anybody would ever run postgres -C huge_page_status except if they wondered whether it might misbehave. If what I've written is inadequate, could you propose something ? -- Justin PS. I hadn't updated this thread because my last message from you (on another thread) indicated that you'd gotten busy. I'm hoping you'll respond to these other inquiries when you're able. https://www.postgresql.org/message-id/ZCUZLiCeXq4Im7OG%40telsasoft.com https://www.postgresql.org/message-id/ZCUKL22GutwGrrZk%40telsasoft.com
Attachment
On Thu, Mar 23, 2023 at 05:25:46PM +0900, Kyotaro Horiguchi wrote: > Wouldn't storing the value in the shared memory itself work? Though, > that space does become almost dead for the server's lifetime... Sure, it would work. However, we'd still need an interface for the extra function. At this point, a GUC with an unknown state is kind of OK for me. Anyway, where would you stick this state? -- Michael
Attachment
At Tue, 11 Apr 2023 15:17:46 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Thu, Mar 23, 2023 at 05:25:46PM +0900, Kyotaro Horiguchi wrote: > > Wouldn't storing the value in the shared memory itself work? Though, > > that space does become almost dead for the server's lifetime... > > Sure, it would work. However, we'd still need an interface for the > extra function. At this point, a GUC with an unknown state is kind of > OK for me. Anyway, where would you stick this state? (Digging memory..) Sorry for confusion but I didn't mean to stick to the function. Just I thought that some people seem to dislike having the third state for the should-be-boolean variable. So, I'm okay with GUC, having "unknown". regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Thu, Mar 23, 2023 at 07:23:28AM +0900, Michael Paquier wrote: > On Wed, Mar 22, 2023 at 05:18:28PM -0500, Justin Pryzby wrote: >> Wow, good point. I think to make it work we'd need put >> huge_pages_active into BackendParameters and handle it in >> save_backend_variables(). If so, that'd be strong argument for using a >> GUC, which already has all the necessary infrastructure for exposing the >> server's state. > > I am afraid so, duplicating an existing infrastructure for a need like > the one of this thread is not really appealing. AFAICT this would involve adding a bool to BackendParameters and using it in save_backend_variables() and restore_backend_variables(), which is an additional 3 lines of code. That doesn't sound too bad to me, but perhaps I am missing something. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Thu, Apr 20, 2023 at 02:16:17PM -0700, Nathan Bossart wrote: > AFAICT this would involve adding a bool to BackendParameters and using it > in save_backend_variables() and restore_backend_variables(), which is an > additional 3 lines of code. That doesn't sound too bad to me, but perhaps > I am missing something. Appending more information to BackendParameters would be OK, still this would require the extra SQL function to access it, which is something that pg_settings is able to equally offer access to if using a GUC. -- Michael
Attachment
On Tue, May 02, 2023 at 11:17:50AM +0900, Michael Paquier wrote: > On Thu, Apr 20, 2023 at 02:16:17PM -0700, Nathan Bossart wrote: >> AFAICT this would involve adding a bool to BackendParameters and using it >> in save_backend_variables() and restore_backend_variables(), which is an >> additional 3 lines of code. That doesn't sound too bad to me, but perhaps >> I am missing something. > > Appending more information to BackendParameters would be OK, still > this would require the extra SQL function to access it, which is > something that pg_settings is able to equally offer access to if > using a GUC. Fair enough. I know I've been waffling in the GUC versus function discussion, but FWIW v7 of the patch looks reasonable to me. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Mon, Jun 12, 2023 at 02:37:15PM -0700, Nathan Bossart wrote: > Fair enough. I know I've been waffling in the GUC versus function > discussion, but FWIW v7 of the patch looks reasonable to me. + Assert(strcmp("unknown", GetConfigOption("huge_pages_status", false, false)) != 0); Not sure that there is anything to gain with this assertion in CreateSharedMemoryAndSemaphores(), because this is pretty much what check_GUC_init() looks after? @@ -627,6 +627,9 @@ CreateAnonymousSegment(Size *size) } #endif + SetConfigOption("huge_pages_status", ptr == MAP_FAILED ? "off" : "on", + PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT); The choice of this location is critical, because this is just *after* we've tried huge pages, but *before* doing the fallback mmap() call when the huge page allocation was on "try". I think that this deserves a comment. @@ -327,6 +327,10 @@ retry: Sleep(1000); continue; } + + SetConfigOption("huge_pages_status", (flProtect & SEC_LARGE_PAGES) ? + "on" : "off", PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT); Hmm. I still think that it is cleaner to move that at the end of PGSharedMemoryCreate() for the WIN32 case. There are also few FATALs in-between, which would make SetConfigOption() useless if there is an in-flight problem. Don't we need to update save_backend_variables() and add an entry in BackendParameters to make other processes launched by EXEC_BACKEND inherit the status value set? -- Michael
Attachment
On Tue, Jun 13, 2023 at 02:50:30PM +0900, Michael Paquier wrote: > Don't we need to update save_backend_variables() and add an entry > in BackendParameters to make other processes launched by EXEC_BACKEND > inherit the status value set? I thought this was handled by read/write_nondefault_variables(). -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Mon, Jun 12, 2023 at 11:11:14PM -0700, Nathan Bossart wrote: > On Tue, Jun 13, 2023 at 02:50:30PM +0900, Michael Paquier wrote: >> Don't we need to update save_backend_variables() and add an entry >> in BackendParameters to make other processes launched by EXEC_BACKEND >> inherit the status value set? > > I thought this was handled by read/write_nondefault_variables(). Ah, you are right. I forgot this part. That should be OK. -- Michael
Attachment
On Tue, Jun 13, 2023 at 02:50:30PM +0900, Michael Paquier wrote: > + Assert(strcmp("unknown", GetConfigOption("huge_pages_status", false, false)) != 0); > > Not sure that there is anything to gain with this assertion in > CreateSharedMemoryAndSemaphores(), because this is pretty much what > check_GUC_init() looks after? There was a second thing that bugged me here. Would it be worth adding some checks on huge_pages_status to make sure that it is never reported as unknown when the server is up and running? I am not sure what would be the best location for that because there is nothing specific to huge pages in the tests yet, but authentication/ with 005_sspi.pl and a second one would do the job? -- Michael
Attachment
On Tue, Jun 13, 2023 at 02:50:30PM +0900, Michael Paquier wrote: > On Mon, Jun 12, 2023 at 02:37:15PM -0700, Nathan Bossart wrote: > > Fair enough. I know I've been waffling in the GUC versus function > > discussion, but FWIW v7 of the patch looks reasonable to me. > > + Assert(strcmp("unknown", GetConfigOption("huge_pages_status", false, false)) != 0); > > Not sure that there is anything to gain with this assertion in > CreateSharedMemoryAndSemaphores(), because this is pretty much what > check_GUC_init() looks after? It seems like you misread the assertion, so I added a comment about it. Indeed, the assertion addresses the other question you asked later. That's what I already commented about, and the reason I found it compelling not to use a boolean. On Thu, Apr 06, 2023 at 04:57:58PM -0500, Justin Pryzby wrote: > I added an assert to check that a running server won't output > "unknown". On Wed, Jun 14, 2023 at 09:15:35AM +0900, Michael Paquier wrote: > There was a second thing that bugged me here. Would it be worth > adding some checks on huge_pages_status to make sure that it is never > reported as unknown when the server is up and running? -- Justin
Attachment
On Tue, Jun 20, 2023 at 06:44:20PM -0500, Justin Pryzby wrote: > On Tue, Jun 13, 2023 at 02:50:30PM +0900, Michael Paquier wrote: >> On Mon, Jun 12, 2023 at 02:37:15PM -0700, Nathan Bossart wrote: >> > Fair enough. I know I've been waffling in the GUC versus function >> > discussion, but FWIW v7 of the patch looks reasonable to me. >> >> + Assert(strcmp("unknown", GetConfigOption("huge_pages_status", false, false)) != 0); >> >> Not sure that there is anything to gain with this assertion in >> CreateSharedMemoryAndSemaphores(), because this is pretty much what >> check_GUC_init() looks after? > > It seems like you misread the assertion, so I added a comment about it. > Indeed, the assertion addresses the other question you asked later. > > That's what I already commented about, and the reason I found it > compelling not to use a boolean. Apologies for the late reply here. At the end, I am on board with the addition of this assertion and its position after PGSharedMemoryCreate(). I would also move the SetConfigOption() for the WIN32 path after ew have passed all the checks. There are a few FATALs that can be triggered so it would be a waste to call it if we are going to fail the shmem creation in this path. I could not resist adding two checks in the TAP tests to make sure that we don't report unknown. Perhaps that's not necessary, but that would provide coverage in a more broader way, and these are cheap. I have run one indentation, while on it. Note to self: check that manually on Windows. -- Michael
Attachment
On Fri, Jun 23, 2023 at 01:57:51PM +0900, Michael Paquier wrote: > I could not resist adding two checks in the TAP tests to make sure > that we don't report unknown. Perhaps that's not necessary, but that > would provide coverage in a more broader way, and these are cheap. > > I have run one indentation, while on it. > > Note to self: check that manually on Windows. I have spent a few hours on that, running more tests with -DEXEC_BACKEND, including Windows and macos, and applied it. -- Michael