Thread: New GUC to sample log queries
Hello hackers,

With OLTP traffic it is hard to catch fast queries in the logs (for example, when you want to know the parameters for only a few queries). You have to set log_min_duration_statement to 0, do a reload, wait a few seconds/minutes, set log_min_duration_statement back to its previous value and reload again. During this window you can cross your fingers that the impact will not be too important, while keeping an eye on the log size.

I suggest sampling the logs, like sample_rate does for auto_explain [1]. The attached patch introduces a new GUC, log_sample_rate; 1 means all queries will be logged (same behavior as now).

Here is a naive SELECT-only bench with a dataset which fits in RAM (scale factor = 100) and PGDATA and the log on a ramdisk:

shared_buffers = 4GB
seq_page_cost = random_page_cost = 1.0
logging_collector = on (no rotation)

pgbench -c 4 -S -T 60 bench

master:
  log_min_duration_statement = 0    TPS: 22562  log size: 1353746 lines (172MB)
  log_min_duration_statement = -1   TPS: 25654  log size: 0 lines

patched:
  log_min_duration_statement = 0,  log_sample_rate = 1     TPS: 22548  log size: 1352873 lines (171MB)
  log_min_duration_statement = 0,  log_sample_rate = 0.1   TPS: 24802  log size: 148709 lines (19MB)
  log_min_duration_statement = 0,  log_sample_rate = 0.01  TPS: 25245  log size: 15344 lines (2MB)
  log_min_duration_statement = 0,  log_sample_rate = 0     TPS: 25858  log size: 0 lines
  log_min_duration_statement = -1, log_sample_rate = 1     TPS: 25599  log size: 0 lines

I don't know the cost of the random() call. With log_sample_rate = 0.01 we get 15K lines of log, which is close to log_min_duration_statement = -1. The difference between log_min_duration_statement = 0 and -1 is about a 12% performance drop on my laptop.

I will update the documentation and postgresql.conf.sample later.

Thanks,

1: https://www.postgresql.org/docs/current/static/auto-explain.html

--
Adrien NAYRAT
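The log-size reduction in the bench follows directly from the per-query sampling decision. As a standalone illustration (not the patch's code: the MAX_RANDOM_VALUE constant and the comparison mirror what the patch does, but the simulation harness, function name and fixed seed are mine), one can simulate how many of N queries would be logged at a given rate:

```c
#include <assert.h>
#include <stdlib.h>

/* PostgreSQL defines MAX_RANDOM_VALUE as the largest value random()
 * can return (2^31 - 1 on common platforms). */
#define MAX_RANDOM_VALUE 0x7FFFFFFF

/*
 * Simulate the per-query sampling decision over n queries and return
 * how many of them would have been logged at the given sample rate.
 */
static long
simulate_logged(long n, double sample_rate, unsigned int seed)
{
    long logged = 0;

    srandom(seed);
    for (long i = 0; i < n; i++)
    {
        /* the same shape of test the patch applies per statement */
        if (random() <= sample_rate * MAX_RANDOM_VALUE)
            logged++;
    }
    return logged;
}
```

At sample_rate = 0.01 this keeps roughly 1% of queries, matching the ~15K lines out of ~1.35M reported in the bench.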
On 31 May 2018 at 06:44, Adrien Nayrat <adrien.nayrat@anayrat.info> wrote:
> Here is a naive SELECT only bench with a dataset which fit in ram (scale factor
> = 100) and PGDATA and log on a ramdisk:
> shared_buffers = 4GB
> seq_page_cost = random_page_cost = 1.0
> logging_collector = on (no rotation)

It would be better to just: SELECT 1; to try to get the true overhead of the additional logging code.

> I don't know the cost of random() call?

It's probably best to test in Postgres to see if there's an overhead to the new code. It may be worth special-casing the 0 and 1 cases so random() is not called.

+ (random() < log_sample_rate * MAX_RANDOM_VALUE);

this should be <=, or you'll randomly miss logging a query when log_sample_rate is 1.0 every 4 billion or so queries.

Of course, it would be better if we had a proper profiler, but I can see your need for this. Enabling logging of all queries in production is currently reserved for people with low-traffic servers and the insane.

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
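David's two points (special-casing 0 and 1, and using <= rather than <) can be checked in isolation. A sketch with hypothetical function names; MAX_RANDOM_VALUE is as defined by PostgreSQL:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_RANDOM_VALUE 0x7FFFFFFF    /* largest value random() returns */

/* Original form: with '<', a draw of exactly MAX_RANDOM_VALUE is
 * rejected even at sample_rate = 1.0, so one query in ~2^31 is
 * silently skipped. */
static bool
in_sample_lt(long draw, double sample_rate)
{
    return draw < sample_rate * MAX_RANDOM_VALUE;
}

/* David's fix: '<=' makes sample_rate = 1.0 log every query. */
static bool
in_sample_le(long draw, double sample_rate)
{
    return draw <= sample_rate * MAX_RANDOM_VALUE;
}

/* Special-case 0 and 1 so random() is not called at all for the two
 * most common settings. */
static bool
in_sample(double sample_rate)
{
    if (sample_rate <= 0.0)
        return false;
    if (sample_rate >= 1.0)
        return true;
    return random() <= sample_rate * MAX_RANDOM_VALUE;
}
```

The first two helpers take the random draw as a parameter only so the boundary behavior can be tested deterministically.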
On 05/31/2018 03:34 AM, David Rowley wrote:
> On 31 May 2018 at 06:44, Adrien Nayrat <adrien.nayrat@anayrat.info> wrote:
>> Here is a naive SELECT only bench with a dataset which fit in ram (scale factor
>> = 100) and PGDATA and log on a ramdisk:
>> shared_buffers = 4GB
>> seq_page_cost = random_page_cost = 1.0
>> logging_collector = on (no rotation)
>
> It would be better to just: SELECT 1; to try to get the true overhead
> of the additional logging code.

Right, I wonder why I didn't have the idea :)

>> I don't know the cost of random() call?
>
> It's probably best to test in Postgres to see if there's an overhead
> to the new code. It may be worth special casing the 0 and 1 case so
> random() is not called.

Thanks, I based it on auto_explain's code. I will send a patch for auto_explain with the same changes.

> + (random() < log_sample_rate * MAX_RANDOM_VALUE);
>
> this should be <=, or you'll randomly miss logging a query when
> log_sample_rate is 1.0 every 4 billion or so queries.

Right, it is the same in auto_explain.

> Of course, it would be better if we had a proper profiler, but I can
> see your need for this. Enabling logging of all queries in production
> is currently reserved for people with low traffic servers and the
> insane.

Yes. Here are results with "SELECT 1" and changes to avoid the random() call with 0 and 1.
master:
  log_min_duration_statement = 0    TPS: 41294  log size: 2477589 lines (199MB)
  log_min_duration_statement = -1   TPS: 49478  log size: 0 lines

patched:
  log_min_duration_statement = 0,  log_sample_rate = 1     TPS: 41635  log size: 2497994 lines (201MB)
  log_min_duration_statement = 0,  log_sample_rate = 0.1   TPS: 46915  log size: 282303 lines (23MB)
  log_min_duration_statement = 0,  log_sample_rate = 0.01  TPS: 48867  log size: 29193 lines (2.4MB)
  log_min_duration_statement = 0,  log_sample_rate = 0     TPS: 48912  log size: 0 lines
  log_min_duration_statement = -1, log_sample_rate = 1     TPS: 49061  log size: 0 lines

The difference between log_min_duration_statement = 0 and -1 is about 16%, and the logger process eats around 30% CPU on my laptop.

Revised patch attached.

Thanks,

--
Adrien NAYRAT
On Thu, May 31, 2018 at 02:37:07PM +0200, Adrien Nayrat wrote:
> On 05/31/2018 03:34 AM, David Rowley wrote:
>> On 31 May 2018 at 06:44, Adrien Nayrat <adrien.nayrat@anayrat.info> wrote:
>>> Here is a naive SELECT only bench with a dataset which fit in ram (scale factor
>>> = 100) and PGDATA and log on a ramdisk:
>>> shared_buffers = 4GB
>>> seq_page_cost = random_page_cost = 1.0
>>> logging_collector = on (no rotation)
>>
>> It would be better to just: SELECT 1; to try to get the true overhead
>> of the additional logging code.
>
> Right, I wonder why I didn't have the idea :)

Using prepared statements also helps, by reducing the load from unnecessary planning.
--
Michael
On 05/31/2018 03:22 PM, Michael Paquier wrote:
> On Thu, May 31, 2018 at 02:37:07PM +0200, Adrien Nayrat wrote:
>> On 05/31/2018 03:34 AM, David Rowley wrote:
>>> It would be better to just: SELECT 1; to try to get the true overhead
>>> of the additional logging code.
>>
>> Right, I wonder why I didn't have the idea :)
>
> Using prepared statements help also in reducing the load with
> unnecessary planning.
> --
> Michael

Thanks, but the winner is Julien Rouhaud with only ";": I reached 115K TPS on my laptop :)

Attached is the third version of the patch, with documentation.

--
Adrien NAYRAT
On 24/06/18 13:22, Adrien Nayrat wrote:
> Attached third version of the patch with documentation.

Hi. I'm reviewing this.

>  exceeded = (log_min_duration_statement == 0 ||
>              (log_min_duration_statement > 0 &&
>               (secs > log_min_duration_statement / 1000 ||
> -              secs * 1000 + msecs >= log_min_duration_statement)));
> +              secs * 1000 + msecs >= log_min_duration_statement))) &&
> +             log_sample_rate != 0 && (log_sample_rate == 1 ||
> +             random() <= log_sample_rate * MAX_RANDOM_VALUE);

A few notes on this part, which is the crux of the patch.

1) Is there an off-by-one error here? drandom() has the formula

    (double) random() / ((double) MAX_RANDOM_VALUE + 1)

but yours doesn't look like that.

2) I think the whole thing should be separated into its own expression instead of tagging along on an already hard to read expression.

3) Is it intentional to only sample with log_min_duration_statement and not also with log_duration? It seems like it should affect both. In both cases, the name is too generic. Something called "log_sample_rate" should sample **everything**.

Otherwise I think it's good. Attached is a revised patch that fixes 2) as well as some wordsmithing throughout. It does not deal with the other issues I raised.

Marked "Waiting on Author".

--
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
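Vik's point 1 can be illustrated side by side (a sketch with my own function names, not code from either patch). Both forms log everything at rate 1.0, because random() returns an integer in [0, MAX_RANDOM_VALUE] inclusive; the visible difference is at rate 0, where the patch's integer comparison still accepts a draw of exactly 0, which is one reason the explicit special case for rate 0 matters:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RANDOM_VALUE 0x7FFFFFFF

/* The patch's comparison: draw is an integer in [0, MAX_RANDOM_VALUE]. */
static bool
in_sample_patch_style(long draw, double rate)
{
    return draw <= rate * MAX_RANDOM_VALUE;
}

/* drandom()-style: map the draw into [0, 1) first, then compare. */
static bool
in_sample_drandom_style(long draw, double rate)
{
    return (double) draw / ((double) MAX_RANDOM_VALUE + 1) < rate;
}
```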
On 06/24/2018 08:41 PM, Vik Fearing wrote:
> On 24/06/18 13:22, Adrien Nayrat wrote:
>> Attached third version of the patch with documentation.
>
> Hi. I'm reviewing this.

Hi, thanks for your review.

> A few notes on this part, which is the crux of the patch.
>
> 1) Is there an off-by-one error here? drandom() has the formula
>
>     (double) random() / ((double) MAX_RANDOM_VALUE + 1)
>
> but yours doesn't look like that.

I don't think there is an error. The random() function returns a long int between 0 and MAX_RANDOM_VALUE.

> 2) I think the whole thing should be separated into its own expression
> instead of tagging along on an already hard to read expression.

+1

> 3) Is it intentional to only sample with log_min_duration_statement and
> not also with log_duration? It seems like it should affect both. In
> both cases, the name is too generic. Something called "log_sample_rate"
> should sample **everything**.

I do not think it would be useful to sample other cases such as log_duration.

But yes, the GUC is confusing and I am not comfortable introducing a new GUC in my initial patch. Maybe we should adapt the current GUC with something like:

log_min_duration_statement = <time>,<sample rate>

This gives:

log_min_duration_statement = 0,0.1

Equivalent to:

log_min_duration_statement = 0
log_sample_rate = 0.1

Thoughts?

> Otherwise I think it's good. Attached is a revised patch that fixes 2)
> as well as some wordsmithing throughout. It does not deal with the
> other issues I raised.

Thanks for your corrections.

--
Adrien
On 06/27/2018 11:13 PM, Adrien Nayrat wrote:
>> 3) Is it intentional to only sample with log_min_duration_statement and
>> not also with log_duration? It seems like it should affect both. In
>> both cases, the name is too generic. Something called "log_sample_rate"
>> should sample **everything**.
> I do not think it could be useful to sample other case such as log_duration.
>
> But yes, the GUC is confusing and I am not comfortable to introduce a new GUC in
> my initial patch.
>
> Maybe we should adapt current GUC with something like :
>
> log_min_duration_statement = <time>,<sample rate>
>
> This give :
>
> log_min_duration_statement = 0,0.1
>
> Equivalent to :
> log_min_duration_statement = 0
> log_sample_rate = 0.1
>
> Thought?

After reflection, it seems like a bad idea:

* it breaks compatibility with external tools
* it introduces a kind of "composite" GUC, which may add complexity in use, for example in the pg_settings view.

What do you think of log_min_duration_statement_sample? Is it too long?

I saw a few days ago this error on http://commitfest.cputube.org:

postgres.sgml:5202: element xref: validity error : IDREF attribute linkend references an unknown ID "log_min_duration_statement"

Patch attached with a fix for the linkend marker.

Regards,

--
Adrien
On 10/07/18 20:34, Adrien Nayrat wrote:
> On 06/27/2018 11:13 PM, Adrien Nayrat wrote:
>>> 3) Is it intentional to only sample with log_min_duration_statement and
>>> not also with log_duration? It seems like it should affect both. In
>>> both cases, the name is too generic. Something called "log_sample_rate"
>>> should sample **everything**.
>> I do not think it could be useful to sample other case such as log_duration.
>>
>> But yes, the GUC is confusing and I am not comfortable to introduce a new GUC in
>> my initial patch.
>>
>> Maybe we should adapt current GUC with something like :
>>
>> log_min_duration_statement = <time>,<sample rate>
>>
>> This give :
>>
>> log_min_duration_statement = 0,0.1
>>
>> Equivalent to :
>> log_min_duration_statement = 0
>> log_sample_rate = 0.1
>>
>> Thought?
>
> After reflection it seems a bad idea :
>
> * it breaks compatibility with external tools
> * it introduce a kind of "composite" GUC which may add complexity to use. For
> example in pg_settings view.

Yes, I think that was a very bad idea.

> What do you think of : log_min_duration_statement_sample ?

Hmm. Not sure if that last word should be _sample, _sampling, _rate, or a combination of those.

> Is it too long?

I introduced idle_in_transaction_session_timeout, so I'm not one to say a setting name is too long :)

--
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
On Sun, Jul 15, 2018 at 6:53 AM, Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
> Hmm. Not sure if that last word should be _sample, _sampling, _rate, or
> a combination of those.

+1 for rate or sample_rate. I think "sample" or "sampling" without "rate" will not be very clear.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 07/16/2018 05:24 PM, Robert Haas wrote:
> On Sun, Jul 15, 2018 at 6:53 AM, Vik Fearing
> <vik.fearing@2ndquadrant.com> wrote:
>> Hmm. Not sure if that last word should be _sample, _sampling, _rate, or
>> a combination of those.
>
> +1 for rate or sample_rate. I think "sample" or "sampling" without
> "rate" will not be very clear.

Thanks.

After reading this mail:
https://www.postgresql.org/message-id/20180710183828.GB3890%40telsasoft.com

I wonder if this GUC could be used for these logging methods?

log_statement_stats
log_parser_stats
log_planner_stats
log_executor_stats

--
Adrien
On 07/16/2018 05:24 PM, Robert Haas wrote:
> On Sun, Jul 15, 2018 at 6:53 AM, Vik Fearing
> <vik.fearing@2ndquadrant.com> wrote:
>> Hmm. Not sure if that last word should be _sample, _sampling, _rate, or
>> a combination of those.
>
> +1 for rate or sample_rate. I think "sample" or "sampling" without
> "rate" will not be very clear.

+1 to sample_rate

It's what auto_explain and pgbench use, so let's keep the naming consistent.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> On Mon, 16 Jul 2018 at 23:07, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On 07/16/2018 05:24 PM, Robert Haas wrote:
> > On Sun, Jul 15, 2018 at 6:53 AM, Vik Fearing
> > <vik.fearing@2ndquadrant.com> wrote:
> >> Hmm. Not sure if that last word should be _sample, _sampling, _rate, or
> >> a combination of those.
> >
> > +1 for rate or sample_rate. I think "sample" or "sampling" without
> > "rate" will not be very clear.
>
> +1 to sample_rate
>
> It's what auto_explain and pgbench uses, so let's keep the naming
> consistent.

This patch went through the last few commitfests without any activity, but cfbot says it still applies and doesn't break any tests. From what I see there was some agreement about naming, so the patch now just needs more review. Could anyone from the reviewers (Vik?) confirm that it's in good shape?
On 11/18/18 12:47 AM, Dmitry Dolgov wrote:
> This patch went through last few commitfests without any activity, but cfbot
> says it still applies and doesn't break any tests. From what I see there was
> some agreement about naming, so the patch is indeed needs more review. Could
> anyone from the reviewers (Vik?) confirm that it's in a good shape?

Thanks Dmitry for bringing this up.

For me, the question is how to name this GUC. Is "log_sample_rate" enough? I am not sure, because it is only related to queries logged by log_min_duration_statement.

Also, I wonder if we should use the same logic for other parameters, such as:

log_statement_stats
log_parser_stats
log_planner_stats
log_executor_stats

It was mentioned in this thread:
https://www.postgresql.org/message-id/20180710183828.GB3890%40telsasoft.com
> On Sun, 18 Nov 2018 at 10:52, Adrien Nayrat <adrien.nayrat@anayrat.info>
> wrote:
>
> For me, the question is how to name this GUC? Is "log_sample_rate" is enough?
> I am not sure because it is only related to queries logged by
> log_min_duration_statements.

Since it's hard to come up with a concise name that will mention the sampling rate in the context of log_min_duration_statement, I think it's fine to name this configuration "log_sample_rate", as long as its dependency on log_min_duration_statement is clearly explained in the documentation.
On Sun, Nov 18, 2018 at 12:18:33PM +0100, Dmitry Dolgov wrote:
> Since it's hard to come up with a concise name that will mention sampling rate
> in the context of min_duration_statement, I think it's fine to name this
> configuration "log_sample_rate", as long as it's dependency from
> log_min_duration_statements is clearly explained in the documentation.

log_sample_rate looks fine to me as a name.
--
Michael
On 11/19/18 2:57 AM, Michael Paquier wrote:
> On Sun, Nov 18, 2018 at 12:18:33PM +0100, Dmitry Dolgov wrote:
>> Since it's hard to come up with a concise name that will mention sampling rate
>> in the context of min_duration_statement, I think it's fine to name this
>> configuration "log_sample_rate", as long as it's dependency from
>> log_min_duration_statements is clearly explained in the documentation.
>
> log_sample_rate looks fine to me as a name.

That seems far too short to me - the name should indicate it applies to statement logging. I'd say log_statement_sample_rate is better.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 11/18/18 10:52 AM, Adrien Nayrat wrote:
> ...
> Alors, I wonder if we should use the same logic for other parameters, such as
> log_statement_stats
> log_parser_stats
> log_planner_stats
> log_executor_stats
>
> It was mentioned in this thread
> https://www.postgresql.org/message-id/20180710183828.GB3890%40telsasoft.com

You mean apply sampling to those logging options too? I doubt that would be very useful, as those options are IMHO meant for development. Maybe tracking some of the info would be useful, but integrating it into pg_stat_statements seems like a better solution than just dumping it into the server log.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> On Mon, Nov 19, 2018 at 2:40 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On 11/19/18 2:57 AM, Michael Paquier wrote:
> > On Sun, Nov 18, 2018 at 12:18:33PM +0100, Dmitry Dolgov wrote:
> >> Since it's hard to come up with a concise name that will mention sampling rate
> >> in the context of min_duration_statement, I think it's fine to name this
> >> configuration "log_sample_rate", as long as it's dependency from
> >> log_min_duration_statements is clearly explained in the documentation.
> >
> > log_sample_rate looks fine to me as a name.
>
> That seems far too short to me - the name should indicate it applies to
> statement logging. I'd say log_statement_sample_rate is better.

I agree, sounds reasonable.
On 11/19/18 2:52 PM, Dmitry Dolgov wrote:
>> On Mon, Nov 19, 2018 at 2:40 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> On 11/19/18 2:57 AM, Michael Paquier wrote:
>>> log_sample_rate looks fine to me as a name.
>>
>> That seems far too short to me - the name should indicate it applies to
>> statement logging. I'd say log_statement_sample_rate is better.
>
> I agree, sounds reasonable.

Thanks for your comments. Here is the updated patch. I fixed a warning for missing parentheses in this expression:

if ((exceeded && in_sample) || log_duration)

It passed make check-world and make docs.
On 21/11/2018 09:06, Adrien Nayrat wrote:
> On 11/19/18 2:52 PM, Dmitry Dolgov wrote:
>>> On Mon, Nov 19, 2018 at 2:40 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>>
>>> That seems far too short to me - the name should indicate it applies to
>>> statement logging. I'd say log_statement_sample_rate is better.
>>
>> I agree, sounds reasonable.
>
> Thanks for your comments. Here is the updated patch. I fixed a warning for
> missing parentheses in this expression:
> if ((exceeded && in_sample) || log_duration)
>
> It passed make check_world and make docs

I am happy with this version. I've marked it ready for committer.

Thanks!

--
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
On Wed, Nov 21, 2018 at 9:06 PM Adrien Nayrat <adrien.nayrat@anayrat.info> wrote:
> Thanks for your comments. Here is the updated patch. I fixed a warning for
> missing parentheses in this expression:
> if ((exceeded && in_sample) || log_duration)

Hi Adrien,

GCC 4.8 says:

guc.c:3328:3: error: initialization makes integer from pointer without a cast [-Werror]
   },
   ^
guc.c:3328:3: error: (near initialization for ‘ConfigureNamesReal[19].gen.flags’) [-Werror]

--
Thomas Munro
http://www.enterprisedb.com
On 11/26/18 12:40 AM, Thomas Munro wrote:
> On Wed, Nov 21, 2018 at 9:06 PM Adrien Nayrat
> <adrien.nayrat@anayrat.info> wrote:
>> Thanks for your comments. Here is the updated patch. I fixed a warning for
>> missing parentheses in this expression:
>> if ((exceeded && in_sample) || log_duration)
>
> Hi Adrien,
>
> GCC 4.8 says:
>
> guc.c:3328:3: error: initialization makes integer from pointer without
> a cast [-Werror]
>  },
>  ^
> guc.c:3328:3: error: (near initialization for
> ‘ConfigureNamesReal[19].gen.flags’) [-Werror]

Sorry, I missed this warning. There was one extra NULL in guc.c. Here is a new patch.

Thanks
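For context, the rough shape of a ConfigureNamesReal entry in guc.c looks like the sketch below (field layout recalled from guc.c, not copied from the patch, so treat details as an assumption). A stray extra NULL after the description strings shifts every following initializer by one field, and the first type mismatch the compiler trips over is gen.flags (an int) being initialized from a pointer, which matches the error Thomas saw:

```c
{
    {"log_statement_sample_rate", PGC_SUSET, LOGGING_WHEN,
        gettext_noop("Fraction of statements to log."),
        gettext_noop("1.0 logs all statements.")
        /* an extra NULL here would shift all the initializers below */
    },
    &log_statement_sample_rate,
    1.0, 0.0, 1.0,              /* boot value, min, max */
    NULL, NULL, NULL            /* check, assign, show hooks */
},
```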
=# select name, short_desc, extra_desc, category from pg_settings where category like 'Reporting%When%' ;
─[ RECORD 1 ]──────────────────────────────────────────────
name       │ log_min_duration_statement
short_desc │ Sets the minimum execution time above which statements will be logged.
extra_desc │ Zero prints all queries. -1 turns this feature off.
category   │ Reporting and Logging / When to Log
─[ RECORD 2 ]──────────────────────────────────────────────
name       │ log_min_error_statement
short_desc │ Causes all statements generating error at or above this level to be logged.
extra_desc │ Each level includes all the levels that follow it. The later the level, the fewer messages are sent.
category   │ Reporting and Logging / When to Log
─[ RECORD 3 ]──────────────────────────────────────────────
name       │ log_min_messages
short_desc │ Sets the message levels that are logged.
extra_desc │ Each level includes all the levels that follow it. The later the level, the fewer messages are sent.
category   │ Reporting and Logging / When to Log
─[ RECORD 4 ]──────────────────────────────────────────────
name       │ log_statement_sample_rate
short_desc │ Fraction of statements to log.
extra_desc │ 1.0 logs all statements.
category   │ Reporting and Logging / When to Log

The description here seems a bit short on details to me. I would say something like "Fraction of statements over log_min_duration_statement to log"; otherwise it's not clear why this doesn't apply to log_statement. I think the extra_desc should be more verbose too. (Not really clear to me what to put in each ... suggestions welcome.)

(More generally, I think we should add a lot more juice to the GUC description fields.)

Attached is pgindent fixes for your next submission.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 11/29/18 2:27 PM, Alvaro Herrera wrote:
> =# select name, short_desc, extra_desc, category from pg_settings where category like 'Reporting%When%' ;
> ─[ RECORD 1 ]──────────────────────────────────────────────
> name       │ log_min_duration_statement
> short_desc │ Sets the minimum execution time above which statements will be logged.
> extra_desc │ Zero prints all queries. -1 turns this feature off.
> category   │ Reporting and Logging / When to Log
> ─[ RECORD 2 ]──────────────────────────────────────────────
> name       │ log_min_error_statement
> short_desc │ Causes all statements generating error at or above this level to be logged.
> extra_desc │ Each level includes all the levels that follow it. The later the level, the fewer messages are sent.
> category   │ Reporting and Logging / When to Log
> ─[ RECORD 3 ]──────────────────────────────────────────────
> name       │ log_min_messages
> short_desc │ Sets the message levels that are logged.
> extra_desc │ Each level includes all the levels that follow it. The later the level, the fewer messages are sent.
> category   │ Reporting and Logging / When to Log
> ─[ RECORD 4 ]──────────────────────────────────────────────
> name       │ log_statement_sample_rate
> short_desc │ Fraction of statements to log.
> extra_desc │ 1.0 logs all statements.
> category   │ Reporting and Logging / When to Log
>
> The description here seems a bit short on details to me. I would say
> something like "Fraction of statements over log_min_duration_statement
> to log"; otherwise it's not clear why this doesn't apply to
> log_statement. I think the extra_desc should be more verbose too. (Not
> really clear to me what to put in each ... suggestions welcome.)

I totally agree with that.

I like your short_desc, so I propose:

short_desc: "Fraction of statements over log_min_duration_statement to log"
long_desc: "If we only want a sample, use a value between 0 (never log) and 1.0 (always log)"

If you agree with that, I will send another patch (I will complete postgresql.conf.sample with the same description).

> (More generally, I think we should add a lot more juice to the GUC
> description fields.)
>
> Attached is pgindent fixes for your next submission.

Thanks. (I have to remember to pgindent patches before submission.)
On 2018-Nov-29, Adrien NAYRAT wrote:

> > ─[ RECORD 1 ]──────────────────────────────────────────────
> > name       │ log_min_duration_statement
> > short_desc │ Sets the minimum execution time above which statements will be logged.
> > extra_desc │ Zero prints all queries. -1 turns this feature off.
> > category   │ Reporting and Logging / When to Log
> > ─[ RECORD 4 ]──────────────────────────────────────────────
> > name       │ log_statement_sample_rate
> > short_desc │ Fraction of statements to log.
> > extra_desc │ 1.0 logs all statements.
> > category   │ Reporting and Logging / When to Log

> I like your short_desc, so I propose:
>
> short_desc: "Fraction of statements over log_min_duration_statement to log"
> long_desc: "If we only want a sample, use a value between 0 (never log) and
> 1.0 (always log)"

Sounds good to me (we -> you, I suppose?). I'd tweak the extra_desc for log_min_duration_statement too, because we no longer log "all" when the rate is <1; maybe "Zero prints all queries, subject to log_statement_sample_rate. -1 turns this feature off."

I just noticed we don't use "you" anywhere in the descs, except for this one:

name       │ log_hostname
short_desc │ Logs the host name in the connection logs.
extra_desc │ By default, connection logs only show the IP address of the connecting host. If you want them to show the hostname you can turn this on, but depending on your host name resolution setup it might impose a non-negligible performance penalty.
category   │ Reporting and Logging / What to Log

Probably not worth fussing about :-)

> If you agree with that, I will send another patch (I will complete
> postgresql.conf.sample with the same description).

+1

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 11/29/18 6:35 PM, Alvaro Herrera wrote:
> Sounds good to me (we -> you, I suppose?). I'd tweak the extra_desc for
> log_min_duration_statement too, because we no longer log "all" when the
> rate is <1; maybe "Zero prints all queries, subject to
> log_statement_sample_rate. -1 turns this feature off."

Right. I also changed a part of the log_min_duration_statement documentation to mention log_statement_sample_rate.

> I just noticed we don't use "you" anywhere in the descs, except for this
> one:

Maybe we could change it to something like this:

long_desc: "Take a value between 0 (never log) and 1.0 (always log) to log a sample."

Thanks,
Thanks! I pushed this with two changes -- one was to reword the docs a bit more, and the other was to compute in_sample only if it's going to be used (when exceeded is true). I hope this won't upset any compilers ...

I wonder if there's any sensible way to verify the behavior in an automated fashion. Who knows what marvels we'd discover about the reliability/uniformity of random() across platforms.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
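The pushed shape of the decision, as a standalone sketch (the GUC stands in here as a file-level double, and the function and variable names are mine): in_sample is computed only when the duration threshold was exceeded, so random() is not drawn for statements that will not be logged anyway. In this sketch the flag is also initialized up front; the short-circuit below never reads it uninitialized, but not every compiler can prove that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_RANDOM_VALUE 0x7FFFFFFF

static double log_statement_sample_rate = 1.0;  /* stand-in for the GUC */

static bool
should_log(bool exceeded, bool log_duration_on)
{
    bool in_sample = false;

    if (exceeded)
        in_sample = log_statement_sample_rate != 0 &&
            (log_statement_sample_rate == 1 ||
             random() <= log_statement_sample_rate * MAX_RANDOM_VALUE);

    return (exceeded && in_sample) || log_duration_on;
}
```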
On Thu, Nov 29, 2018 at 1:49 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Thanks! I pushed this with two changes -- one was to reword the docs a
> bit more, and the other was to compute in_sample only if it's going to
> be used (when exceeded is true). I hope this won't upset any compilers ...
This is great news – I can imagine how helpful this feature will be for query analysis and
troubleshooting.
At the same time, there is an approach where we use application (client) code or pgbouncer's
connect_query parameter to perform sampled logging (with log_min_duration_statement = 0)
of n% of all *sessions* or *transactions*.
If you use single-query transactions only, the new parameter will do an equivalent job for you, while
significantly simplifying your life (pgbouncer is not required and you don't need to patch application
code). However, if you have multi-statement transactions, log_statement_sample_rate will not
be enough for troubleshooting – logging just a single statement of a multi-statement transaction
won't really help to troubleshoot in many cases.
That being said, I wonder: does it make sense to think about extending the functionality
just committed with some options to log all statements of n% of transactions (or sessions)?
In other words, allowing to choose at which level to perform sampling – statement, transaction, or
session?
Nik
On 11/29/18 10:48 PM, Alvaro Herrera wrote:
> Thanks! I pushed this with two changes -- one was to reword the docs a
> bit more, and the other was to compute in_sample only if it's going to
> be used (when exceeded is true). I hope this won't upset any compilers ...

Thanks!

> I wonder if there's any sensible way to verify the behavior in an
> automated fashion. Who know what marvels we'd discover about the
> reliability/uniformity of random() across platforms.

Yes, I also wondered how to add tests.
On 11/30/18 7:42 AM, Nikolay Samokhvalov wrote:
> At the same time, there is an approach, when we use application (client)
> code or pgbouncer's connect_query parameter to perform sampled logging
> (with log_min_duration_statement = 0) of n% of all *sessions* or
> *transactions*.

Good idea! Unfortunately it requires pgbouncer, but I will keep this trick in mind.

> That being said, I wonder, does it make sense to think about extending
> the functionality just committed, with some options to log all
> statements of n% of transactions (or sessions)? In other words, allowing
> to choose, at which level perform sampling – statement, transaction, or
> session?

+1, I like this idea.
Hello

I cannot build current HEAD cleanly. I get a warning:

> postgres.c: In function ‘check_log_duration’:
> postgres.c:2254:17: warning: ‘in_sample’ may be used uninitialized in this function [-Wmaybe-uninitialized]
>   if ((exceeded && in_sample) || log_duration)

Should we not have a change like this?

> - bool in_sample;
> + bool in_sample = false;

Thank you!

regards, Sergei
On 2018-Nov-30, Sergei Kornilov wrote:
> I can not build current HEAD cleanly. I got warning:
>
> > postgres.c: In function ‘check_log_duration’:
> > postgres.c:2254:17: warning: ‘in_sample’ may be used uninitialized in this function [-Wmaybe-uninitialized]
> >   if ((exceeded && in_sample) || log_duration)

Ah, I feared that some compiler would not be smart enough to be silent about that. I hope it does not emit a warning with this patch?

-- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi

> Ah, I feared that some compiler would not be smart enough to be silent
> about that. I hope it does not emit a warning with this patch?

Yes, with this change the build is clean. Thank you!

PS: currently I use an old gcc (Debian 4.9.2-10+deb8u1) 4.9.2

regards, Sergei
On 2018-Nov-30, Sergei Kornilov wrote:
> Yes, with this change build is clean. Thank you

Thanks, pushed.

> PS: currently i use old gcc (Debian 4.9.2-10+deb8u1) 4.9.2

Only five years old ... not that bad.

-- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Thanks! I pushed this with two changes -- one was to reword the docs a
> bit more, and the other was to compute in_sample only if it's going to
> be used (when exceeded is true). I hope this won't upset any compilers ...
>
> I wonder if there's any sensible way to verify the behavior in an
> automated fashion. Who know what marvels we'd discover about the
> reliability/uniformity of random() across platforms.

So Coverity doesn't like this patch:

2250     in_sample = exceeded &&
2251         log_statement_sample_rate != 0 &&
2252         (log_statement_sample_rate == 1 ||
>>> CID 1441909: Security best practices violations (DC.WEAK_CRYPTO)
>>> "random" should not be used for security related applications, as linear congruential algorithms are too easy to break.
2253          random() <= log_statement_sample_rate * MAX_RANDOM_VALUE);
2254
2255     if ((exceeded && in_sample) || log_duration)

I am not sure I buy the argument that this is a security hazard, but there are other reasons to question the use of random() here, some of which you stated yourself above. Another one is that using random() for internal purposes interferes with applications' possible use of drandom() and setseed(), ie an application depending on getting a particular random series would see different behavior depending on whether this GUC is active or not.

I wonder whether we should establish a project policy to avoid use of random() for internal purposes, ie try to get to a point where drandom() is the only caller in the backend. A quick grep says that there's a dozen or so callers, so this patch certainly isn't the only offender ... but should we make an effort to convert them all to use, say, pg_erand48()? I think all the existing callers could happily share a process-wide random state, so we could make a wrapper that's no harder to use than random().
Another idea, which would be a lot less prone to breakage by add-on code, is to change drandom() and setseed() to themselves use pg_erand48() with a private seed. Then we could positively promise that their behavior isn't affected by anything else (and it'd become platform-independent, too, which it probably isn't today). There would be a one-time compatibility breakage for anyone who's depending on the exact current behavior, but that might be OK considering how weak our guarantees are for it right now. regards, tom lane
On Wed, Dec 26, 2018 at 10:45 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wonder whether we should establish a project policy to avoid use
> of random() for internal purposes, ie try to get to a point where
> drandom() is the only caller in the backend. A quick grep says
> that there's a dozen or so callers, so this patch certainly isn't
> the only offender ... but should we make an effort to convert them
> all to use, say, pg_erand48()? I think all the existing callers
> could happily share a process-wide random state, so we could make
> a wrapper that's no harder to use than random().

I've used setseed() to make nbtree's "getting tired" behavior deterministic for specific test cases I've developed -- the random() choice of whether to split a page full of duplicates, or continue right in search of free space, becomes predictable. I've used this to determine whether my nbtree patch's pg_upgrade'd indexes have precisely the same behavior as v3 indexes on the master branch (precisely the same in terms of the structure of the final index following a bulk load).

I'm not sure whether or not this kind of debugging scenario is worth giving much weight to going forward, but it's something to consider. It seems generally useful to be able to force deterministic-ish behavior in a single session. I don't expect that the test case is even a bit portable, but the technique was quite effective.

-- Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes: > On Wed, Dec 26, 2018 at 10:45 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I wonder whether we should establish a project policy to avoid use >> of random() for internal purposes, ie try to get to a point where >> drandom() is the only caller in the backend. A quick grep says >> that there's a dozen or so callers, so this patch certainly isn't >> the only offender ... but should we make an effort to convert them >> all to use, say, pg_erand48()? I think all the existing callers >> could happily share a process-wide random state, so we could make >> a wrapper that's no harder to use than random(). > I've used setseed() to make nbtree's "getting tired" behavior > deterministic for specific test cases I've developed -- the random() > choice of whether to split a page full of duplicates, or continue > right in search of free space becomes predictable. I've used this to > determine whether my nbtree patch's pg_upgrade'd indexes have > precisely the same behavior as v3 indexes on the master branch > (precisely the same in terms of the structure of the final index > following a bulk load). TBH, I'd call it a bug --- maybe even a low-grade security hazard --- that it's possible to affect that from user level. In fact, contemplating that for a bit: it is possible, as things stand in HEAD, for a user to control which of his statements will get logged if the DBA has enabled log_statement_sample_rate. It doesn't take a lot of creativity to think of ways to abuse that. So maybe Coverity had the right idea to start with. There might well be debugging value in affecting internal PRNG usages, but let's please not think it's a good idea that that's trivially reachable from SQL. regards, tom lane
On Wed, Dec 26, 2018 at 11:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I've used setseed() to make nbtree's "getting tired" behavior > > deterministic for specific test cases I've developed -- the random() > > choice of whether to split a page full of duplicates, or continue > > right in search of free space becomes predictable. > TBH, I'd call it a bug --- maybe even a low-grade security hazard > --- that it's possible to affect that from user level. I don't disagree with that conclusion. Deterministic behavior is always preferable to random behavior when the randomness is not truly necessary. > There might well be debugging value in affecting internal PRNG usages, > but let's please not think it's a good idea that that's trivially > reachable from SQL. I hesitate to say that there is much value beyond the value that I've found in this one instance. Maybe the remaining cases where this technique could be applied just aren't very interesting. -- Peter Geoghegan
On Wed, Dec 26, 2018 at 11:46 AM Peter Geoghegan <pg@bowt.ie> wrote:
> > There might well be debugging value in affecting internal PRNG usages,
> > but let's please not think it's a good idea that that's trivially
> > reachable from SQL.
>
> I hesitate to say that there is much value beyond the value that I've
> found in this one instance. Maybe the remaining cases where this
> technique could be applied just aren't very interesting.

Actually, it looks like there may be several cases that are quite similar to the "getting tired" case that I took an interest in. You have spgdoinsert(), gistchoose(), and gin_rand()/dropItem(), just for starters -- those all seem to affect the final structure of an index.

I'm beginning to think that the technique that I came up with to make "getting tired" deterministic ought to be supported as a debugging option if we're to do away with internal use of the generic/seedable backend PRNG.

-- Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes: > I'm beginning to think that the technique that I came up with to make > "getting tired" deterministic ought to be supporting as a debugging > option if we're to do away with internal use of the generic/seedable > backend PRNG. I have no objection to providing such a thing as a debug option; I just don't want it to be reachable from SQL (at least not without pieces that aren't there normally, e.g. a loadable C function). Replacing random() might actually make that easier not harder, since we'd have more control over what happens when. regards, tom lane
On Wed, Dec 26, 2018 at 12:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Replacing random() might actually make that easier not harder, since > we'd have more control over what happens when. That does seem useful. I'm in favor. But why does the function to seed the internal PRNG have to be loadable? Can't it just be superuser-only? -- Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes: > On Wed, Dec 26, 2018 at 12:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Replacing random() might actually make that easier not harder, since >> we'd have more control over what happens when. > That does seem useful. I'm in favor. But why does the function to seed > the internal PRNG have to be loadable? Can't it just be > superuser-only? That'd work as far as I'm concerned. (But I can hear Frost wanting a built-in role for it ;-)) One thing we'd have to think about if we want to take this seriously is whether a process-wide PRNG state is really adequate; if you're trying to make a particular call site be deterministic, you'd likely wish it weren't interfered with by other call sites. On the flip side, having more call sites probably makes things more random and thus better for normal usage. Not sure how to resolve that tension (but I don't want to build a whole lot of new infrastructure here). regards, tom lane
On Wed, Dec 26, 2018 at 1:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> One thing we'd have to think about if we want to take this seriously
> is whether a process-wide PRNG state is really adequate; if you're
> trying to make a particular call site be deterministic, you'd likely
> wish it weren't interfered with by other call sites. On the flip
> side, having more call sites probably makes things more random and
> thus better for normal usage. Not sure how to resolve that tension
> (but I don't want to build a whole lot of new infrastructure here).

I don't think that you need to resolve the tension -- I'd be fine with continuing to have a process-wide PRNG that was cordoned-off for internal use. I care about making behavior deterministic when running an SQL script within a single backend. This can only be expected to work when all inputs (non-random and random alike) are carefully accounted for, including even the mutation of a seed value as the test case runs. I also found myself disabling synchronized sequential scanning within the same test case I mentioned, because it caused false positive failures very occasionally. I can imagine a similar test case where the tester has to worry about autovacuum running ANALYZE, too (as it happens, these were all INSERT ... SELECT statements, where that didn't seem to be an issue). Trying to reason about the behavior of a call site in isolation seems way too complicated to be practical.

In general, there will never be a standard way to think about this kind of debugging, and it would be unreasonable to insist on providing any kind of standard framework. It's a low-level tool, useful only to backend hackers.

-- Peter Geoghegan
On Wed, Dec 26, 2018 at 01:45:00PM -0500, Tom Lane wrote: > I am not sure I buy the argument that this is a security hazard, but > there are other reasons to question the use of random() here, some of > which you stated yourself above. I wonder whether we should > establish a project policy to avoid use of random() for internal > purposes, ie try to get to a point where drandom() is the only > caller in the backend. Agreed for all three points. > A quick grep says that there's a dozen or so callers, so this patch > certainly isn't the only offender ... but should we make an effort > to convert them all to use, say, pg_erand48()? I think all the > existing callers could happily share a process-wide random state, > so we could make a wrapper that's no harder to use than random(). Another possibility would be to extend a bit more the use of pg_strong_random(), though it is designed to really be used in cases like authentication where the random bytes are strong for cryptography. pg_erand48() would be a good step forward. -- Michael
Michael Paquier <michael@paquier.xyz> writes: > On Wed, Dec 26, 2018 at 01:45:00PM -0500, Tom Lane wrote: >> A quick grep says that there's a dozen or so callers, so this patch >> certainly isn't the only offender ... but should we make an effort >> to convert them all to use, say, pg_erand48()? I think all the >> existing callers could happily share a process-wide random state, >> so we could make a wrapper that's no harder to use than random(). > Another possibility would be to extend a bit more the use of > pg_strong_random(), though it is designed to really be used in cases > like authentication where the random bytes are strong for > cryptography. pg_erand48() would be a good step forward. I think pg_strong_random is overkill, and overly expensive, for most if not all of the existing callers of random(). We already changed the ones where it's important to be strong ... One thing I was wondering is if we should try to enforce a policy like this by putting, say, #define random() pg_random() into port.h or so. That would have the advantages of not having to touch any existing calls and not having to worry too much about future patches breaking the policy. On the other hand, it's conceivable that third-party extensions might get annoyed with us for hijacking a libc function. Thoughts? regards, tom lane
On Wed, Dec 26, 2018 at 08:46:25PM -0500, Tom Lane wrote:
> One thing I was wondering is if we should try to enforce a policy
> like this by putting, say,
>
> #define random() pg_random()
>
> into port.h or so. That would have the advantages of not having to touch
> any existing calls and not having to worry too much about future patches
> breaking the policy. On the other hand, it's conceivable that third-party
> extensions might get annoyed with us for hijacking a libc function.
> Thoughts?

Not much of a fan of that for random(), to be honest, as we are talking about 15 callers in the backend code, and it would mean enforcing such a redirection in a low-level library.

-- Michael
On Wed, Dec 26, 2018 at 5:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think pg_strong_random is overkill, and overly expensive, for > most if not all of the existing callers of random(). We already > changed the ones where it's important to be strong ... +1. There was a controversy a bit like this in the Python community a few years ago [1]. I don't think you can trust somebody to write Postgres backend code but not trust them to understand the security issues with a fast user-space PRNG (I think that I'd be willing to say the same thing about people that write Python programs of any consequence). It's always possible to make a change that might stop someone from introducing a bug. The question ought to be: why this change, and why now? [1] https://lwn.net/Articles/657269/ -- Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes: > It's always possible to make a change that might stop someone from > introducing a bug. The question ought to be: why this change, and why > now? The point here is not to be cryptographically strong at every single place where the backend might want a random number; I think we're all agreed that we don't need that. To me, the point is to ensure that the user-accessible random sequence is kept separate from internal uses, and the potential security exposure in the new random-logging patch is what justifies getting more worried about this than we were before. Now, we could probably fix that with some less intrusive patch than #define'ing random() --- in particular, if we give drandom and setseed their own private PRNG state, we've really fixed the security exposure without need to change anything else anywhere. So maybe we should just do that and be happy. regards, tom lane
On Wed, Dec 26, 2018 at 6:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > The point here is not to be cryptographically strong at every single > place where the backend might want a random number; I think we're > all agreed that we don't need that. To me, the point is to ensure that > the user-accessible random sequence is kept separate from internal uses, > and the potential security exposure in the new random-logging patch is > what justifies getting more worried about this than we were before. I agree that that's the point here. > Now, we could probably fix that with some less intrusive patch than > #define'ing random() --- in particular, if we give drandom and setseed > their own private PRNG state, we've really fixed the security exposure > without need to change anything else anywhere. So maybe we should > just do that and be happy. +1. I don't like the idea of #define'ing random() myself. We're already making fairly broad assumptions about our having control of the backend's PRNG state within InitProcessGlobals(). How should this affect the new drandom()/setseed() private state, if at all? -- Peter Geoghegan
On Thu, Dec 27, 2018 at 3:55 PM Peter Geoghegan <pg@bowt.ie> wrote: > On Wed, Dec 26, 2018 at 6:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > The point here is not to be cryptographically strong at every single > > place where the backend might want a random number; I think we're > > all agreed that we don't need that. To me, the point is to ensure that > > the user-accessible random sequence is kept separate from internal uses, > > and the potential security exposure in the new random-logging patch is > > what justifies getting more worried about this than we were before. > > I agree that that's the point here. +1, but I wonder if just separating them is enough. Is our seeding algorithm good enough for this new purpose? The initial seed is 100% predictable to a logged in user (it's made from the backend PID and backend start time, which we tell you), and not even that hard to guess from the outside, so I think Coverity's warning is an understatement in this case. Even if we separate the PRNG state used for internal stuff so that users can't clobber its seed from SQL, wouldn't it be possible to predict which statements will survive the log sampling filter given easily available information and a good guess at how many times random() (or whatever similar thing) has been called so far? > > Now, we could probably fix that with some less intrusive patch than > > #define'ing random() --- in particular, if we give drandom and setseed > > their own private PRNG state, we've really fixed the security exposure > > without need to change anything else anywhere. So maybe we should > > just do that and be happy. > > +1. I don't like the idea of #define'ing random() myself. +1, me neither. -- Thomas Munro http://www.enterprisedb.com
Hello all,

> I am not sure I buy the argument that this is a security hazard, but
> there are other reasons to question the use of random() here, some of
> which you stated yourself above. Another one is that using random()
> for internal purposes interferes with applications' possible use of
> drandom() and setseed(), ie an application depending on getting a
> particular random series would see different behavior depending on
> whether this GUC is active or not.
>
> Another idea, which would be a lot less prone to breakage by
> add-on code, is to change drandom() and setseed() to themselves
> use pg_erand48() with a private seed.

My random thoughts about random, erand48, etc., which may be slightly off topic, sorry if this is the case.

The word "random" is a misnomer for these pseudo-random generators, so that "strong" has to be used for higher quality generators:-(

On Linux, random() is advertised with a period of around 2**36; its internal state is 8 to 256 bytes (default unclear, probably 8 bytes), but seeding with srandom() provides only 32 bits, which is a drawback.

The pg_erand48 code looks like crumbs from the 70's optimized for 16 bit architectures (which it is probably not, but not going to 64 bits or 128 bits directly looks like a missed opportunity); its internal state is 48 bits as its name implies, and its period probably around 2**48, which is 2**12 better than the previous case, not an extraordinary achievement.

Initial seeding of any pseudo-random generator should NEVER use only pid & time, which are too predictable, as already noted on the thread. It should use a strong random source if available, and maybe some backup, eg hashing logs. I think that this should be directly implemented, maybe with some provision to set the seed manually for debugging purposes, although with time-dependent features that may use random I'm not sure how far this would go.
Also, I would suggest centralizing and abstracting the implementation of a default pseudo-random generator so that its actual internal size and quality can be changed. That would mean renaming pg_erand48 and hiding its state size, maybe along the lines of:

// extractors
void pg_random_bytes(int nbytes, char *where_to_put_them);
uint32 pg_random_32();
uint64 pg_random_48();
uint64 pg_random_64();
...

// dynamic?
int pg_random_state_size(void); // in bytes
// or static?
#define PG_RANDOM_STATE_SIZE 6 // bytes

// get/set state
bool pg_random_get_state(uchar *state(, int size|[PG_RANDOM_STATE_SIZE]));
bool pg_random_set_state(const uchar *state...);

Given the typical hardware a postgres instance runs on, I would shop around for a pseudo-random generator which takes advantage of 64 bit operations, and not go below 64 bit seeds, or possibly 128.

If a strong random source is available but considered too costly, so that a (weak) linear congruential algorithm must be used, a possible compromise is to reseed from the strong source every few thousand/million draws, or with a time periodicity, eg every few minutes, or maybe as a configurable option.

A not too costly security enhancer is to combine different fast generators so that if one becomes weak at some point, the combination does not.

-- Fabien.
Peter Geoghegan <pg@bowt.ie> writes: > On Wed, Dec 26, 2018 at 6:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Now, we could probably fix that with some less intrusive patch than >> #define'ing random() --- in particular, if we give drandom and setseed >> their own private PRNG state, we've really fixed the security exposure >> without need to change anything else anywhere. So maybe we should >> just do that and be happy. > +1. I don't like the idea of #define'ing random() myself. > We're already making fairly broad assumptions about our having control > of the backend's PRNG state within InitProcessGlobals(). How should > this affect the new drandom()/setseed() private state, if at all? I would think that InitProcessGlobals would initialize drandom's seed alongside random()'s seed. Hopefully to values not easily predictable from each other -- see also Munro's comment, which I'll respond to in a moment. regards, tom lane
Thomas Munro <thomas.munro@enterprisedb.com> writes: > On Thu, Dec 27, 2018 at 3:55 PM Peter Geoghegan <pg@bowt.ie> wrote: >> On Wed, Dec 26, 2018 at 6:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> The point here is not to be cryptographically strong at every single >>> place where the backend might want a random number; I think we're >>> all agreed that we don't need that. To me, the point is to ensure that >>> the user-accessible random sequence is kept separate from internal uses, >>> and the potential security exposure in the new random-logging patch is >>> what justifies getting more worried about this than we were before. > +1, but I wonder if just separating them is enough. Is our seeding > algorithm good enough for this new purpose? The initial seed is 100% > predictable to a logged in user (it's made from the backend PID and > backend start time, which we tell you), and not even that hard to > guess from the outside, so I think Coverity's warning is an > understatement in this case. Even if we separate the PRNG state used > for internal stuff so that users can't clobber its seed from SQL, > wouldn't it be possible to predict which statements will survive the > log sampling filter given easily available information and a good > guess at how many times random() (or whatever similar thing) has been > called so far? Yeah, that's a good point. Maybe we should upgrade the per-process seed initialization to make it less predictable. I could see expending a call of the strong RNG to contribute some more noise to the seeds selected in InitProcessGlobals(). regards, tom lane
I wrote: > Peter Geoghegan <pg@bowt.ie> writes: >> We're already making fairly broad assumptions about our having control >> of the backend's PRNG state within InitProcessGlobals(). How should >> this affect the new drandom()/setseed() private state, if at all? > I would think that InitProcessGlobals would initialize drandom's > seed alongside random()'s seed. Hopefully to values not easily > predictable from each other -- see also Munro's comment, which > I'll respond to in a moment. On further reflection, it seems likely that in most installations a lot of processes never invoke drandom()/setseed() at all, making such work in InitProcessGlobals a waste of cycles. Probably a better idea is to have drandom() initialize the seed on first use, if it wasn't already set by setseed(). This might also make it easier to decouple that seed value from the random() sequence --- we could use a fresh timestamp value, and perhaps another strong-RNG call, though I'm not sure if the latter is worthwhile. regards, tom lane
Fabien COELHO <coelho@cri.ensmp.fr> writes: >> Another idea, which would be a lot less prone to breakage by >> add-on code, is to change drandom() and setseed() to themselves >> use pg_erand48() with a private seed. > The pg_erand48 code looks like crumbs from the 70's optimized for 16 bits > architectures (which it is probably not, but why not going to 64 bits or > 128 bits directly looks like a missed opportunity), its internal state is > 48 bits as its name implies, and its period probably around 2**48, which > is 2**12 better than the previous case, not an extraordinary achievement. I can't get terribly excited about rewriting that code. You're arguing from a "one size should fit all" perspective, which is exactly not the design approach we're using. We've already converted security-sensitive PRNG uses to use pg_strong_random (or if we haven't, that's a localized bug in any such places we missed). What remains are places where we are not so concerned about cryptographic strength but rather speed. It does behoove us to ensure that the seed values are unpredictable and that a user-controllable seed isn't used for internal operations, but I don't feel a need for replacing the algorithm. You might argue that the SQL function drandom should be held to a higher standard, but we document exactly this same tradeoff for it. Users who want cryptographic strength are directed to pgcrypto, leaving the audience for drandom being people who want speed. I do notice a couple of things that could be improved about erand48.c: 1. The _rand48_mult and _rand48_add values are constants, but probably few compilers would notice that, given that they're assigned to in pg_srand48. I think we should replace the references to those variables with direct uses of the macros for their values, to save a few cycles in _dorand48. 2. There seems to be a bug in pg_jrand48 (which is relatively new, added in fe0a0b599, without bothering to update the header comment). 
Per POSIX, The mrand48() and jrand48() functions return signed long integers uniformly distributed over the interval [-2^31, 2^31). However, if "long" is 64 bits what you're actually going to get is unsigned values ranging up to 2^32-1. This doesn't matter to existing callers because they coerce the value to int32 or uint32, but it's going to bite somebody eventually. regards, tom lane
Hello Tom,

> I can't get terribly excited about rewriting that code. You're arguing
> from a "one size should fit all" perspective,

Not exactly. I'm not claiming that distinguishing parts that require good random from others is a bad choice. I'm arguing that:

(1) from a software engineering perspective, the PRNG implementation should hide the underlying generator name and state size.

(2) from a numerical perspective, poor seeding practices should be avoided when possible.

(3) from a cryptographic perspective, an LCG is a poor choice of fast PRNG, which a quick look at wikipedia (mother of all knowledge) confirms.

(4) the fact that pg's historical PRNG choices are well documented is not a good justification for not improving them. Better alternatives exist that do not cost much (eg xorshift variants, WELL...), some of which are optimized for 64 bit architectures.

> which is exactly not the design approach we're using.
> We've already converted security-sensitive PRNG uses to use
> pg_strong_random (or if we haven't, that's a localized bug in any such
> places we missed). What remains are places where we are not so
> concerned about cryptographic strength but rather speed.

I do agree with the speed concern. I'm arguing that better quality at speed can be achieved with better seeding practices and by not using an LCG.
About costs, not counting array accesses: - lrand48 (48 bits state as 3 uint16) is 29 ops (10 =, 8 *, 7 +, 4 >>) - xorshift+ (128 bits state as 2 uint64) is 13 ops ( 5 =, 0 *, 1 +, 3 >>, 4 ^) - xoroshiro128+ (idem) is 17 ops ( 6 =, 0 *, 1 +, 5 >>, 3 ^, 2 |, less if rot in hardware) - WELL512 (512 bits state as 16 uint32) is 38 ops (11 =, 0 *, 3 +, 7 >>, 10 ^, 4 &) probably much better quality, but probably slower than the current version I'd be of the (debatable) opinion that we could use xoroshiro128+, already used by various languages, even if it fails some specialized tests. > It does behoove us to ensure that the seed values are unpredictable and > that a user-controllable seed isn't used for internal operations, Sure. > but I don't feel a need for replacing the algorithm. Hmmm. Does that mean that you would veto any change, even if the speed concern is addressed (i.e. faster/not slower with better quality)? > You might argue that the SQL function drandom should be held to a higher > standard, but we document exactly this same tradeoff for it. Users who > want cryptographic strength are directed to pgcrypto, leaving the audience > for drandom being people who want speed. My point is that speed is not necessarily incompatible with better quality. Better quality should not replace strong random when needed. -- Fabien.
> About costs, not counting array accesses: > > - lrand48 (48 bits state as 3 uint16) is 29 ops > (10 =, 8 *, 7 +, 4 >>) > - xorshift+ (128 bits state as 2 uint64) is 13 ops > ( 5 =, 0 *, 1 +, 3 >>, 4 ^) > - xoroshiro128+ (idem) is 17 ops > ( 6 =, 0 *, 1 +, 5 >>, 3 ^, 2 |, less if rot in hardware) > - WELL512 (512 bits state as 16 uint32) is 38 ops > (11 =, 0 *, 3 +, 7 >>, 10 ^, 4 &) > probably much better quality, but probably slower than the current version > > I'd be of the (debatable) opinion that we could use xoroshiro128+, already > used by various languages, even if it fails some specialized tests. After some more digging, the better choice seems to be the 64-bit optimized xoshiro256** (xoshiro = xor shift rotate): - xoshiro256** (256 bits state as 4 uint64) is 24 ops (18 if rot in hw) 8 =, 2 *, 2 +, 5 <<, 5 ^, 2 | See http://vigna.di.unimi.it/xorshift/ -- Fabien.
>> - lrand48 (48 bits state as 3 uint16) is 29 ops >> (10 =, 8 *, 7 +, 4 >>) > > - xoshiro256** (256 bits state as 4 uint64) is 24 ops (18 if rot in hw) > 8 =, 2 *, 2 +, 5 <<, 5 ^, 2 | > > See http://vigna.di.unimi.it/xorshift/ Small benchmark on my laptop with gcc-7.3 -O3: - pg_lrand48 takes 4.0 seconds to generate 1 billion 32-bit ints - xoshiro256** takes 1.6 seconds to generate 1 billion 64-bit ints With -O2 it is 4.8 and 3.4 seconds, respectively. So significantly better speed _and_ quality are quite achievable. Note that a small attempt at optimizing these functions (inlined constants, array replaced with scalars) did not yield significant improvements. -- Fabien.
Attachment
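For reference, the xoshiro256** generator benchmarked above is only a handful of 64-bit operations per call. This sketch follows the published reference implementation (Blackman & Vigna); the variable and function names are mine, and the demo seed is arbitrary (the only requirement is that the state not be all zeros):

```c
#include <stdint.h>

/* 256-bit state; must be seeded to something other than all-zeros. */
static uint64_t xs_state[4] = {1, 2, 3, 4};   /* demo seed only */

static inline uint64_t rotl64(uint64_t x, int k)
{
    return (x << k) | (x >> (64 - k));
}

/* One step of xoshiro256**, returning a full 64-bit value. */
uint64_t xoshiro256ss_next(void)
{
    const uint64_t result = rotl64(xs_state[1] * 5, 7) * 9;
    const uint64_t t = xs_state[1] << 17;

    xs_state[2] ^= xs_state[0];
    xs_state[3] ^= xs_state[1];
    xs_state[1] ^= xs_state[2];
    xs_state[0] ^= xs_state[3];
    xs_state[2] ^= t;
    xs_state[3] = rotl64(xs_state[3], 45);

    return result;
}
```

Note there are no multiplies by large constants and no modulus in the hot path, which is consistent with the operation counts quoted upthread.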
On 12/26/18 7:45 PM, Tom Lane wrote: > I wonder whether we should establish a project policy to avoid use > of random() for internal purposes, ie try to get to a point where > drandom() is the only caller in the backend. FYI I picked the idea from auto_explain. Maybe we will have to change auto_explain too.
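The sampling decision borrowed from auto_explain amounts to a one-line predicate. A hedged sketch (illustrative names; the actual patch uses PostgreSQL's random() and MAX_RANDOM_VALUE, with the C library's random() and RAND_MAX standing in here), incorporating David Rowley's two suggestions from upthread: special-case the endpoints so random() isn't called at all, and use "<=" so a rate of exactly 1.0 never drops a statement:

```c
#include <stdbool.h>
#include <stdlib.h>

/* GUC stand-in; 1.0 means log every statement, 0.0 means log none. */
static double log_sample_rate = 1.0;

/* Decide whether this statement survives the sampling filter. */
static bool statement_sampled(void)
{
    if (log_sample_rate <= 0.0)
        return false;               /* avoid the random() call entirely */
    if (log_sample_rate >= 1.0)
        return true;                /* ditto, and never drop at rate 1.0 */
    return random() <= (long) (log_sample_rate * RAND_MAX);
}
```

With "<" instead of "<=", a rate of 1.0 would randomly miss roughly one query per 2^31 on platforms where random() can return RAND_MAX.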
Hello again, >>> - lrand48 (48 bits state as 3 uint16) is 29 ops >>> (10 =, 8 *, 7 +, 4 >>) >> >> - xoshiro256** (256 bits state as 4 uint64) is 24 ops (18 if rot in hw) >> 8 =, 2 *, 2 +, 5 <<, 5 ^, 2 | > > Small benchmark on my laptop with gcc-7.3 -O3: > > - pg_lrand48 takes 4.0 seconds to generate 1 billion 32-bit ints > - xoshiro256** takes 1.6 seconds to generate 1 billion 64-bit ints These too-good-to-be-true figures have raised doubt in my mind, so after giving it some thought, I do not trust them: the function call is probably inlined and other optimizations are performed which would not apply in a more realistic case. > With -O2 it is 4.8 and 3.4 seconds, respectively. I trust this one, which is reasonably consistent with the operation count. More tests with "clang -O2" yielded 3.2 and 1.6 respectively, which I do not trust either. I did separate compilation to prevent inlining and other undesirable optimizations: clang -Ofast or -O2 gives 5.2 and 3.9, gcc -Ofast gives 5.4 and 3.5. It seems that clang is better at compiling pg_erand48 but gcc is better at xoshiro256**. > So significantly better speed _and_ quality are quite achievable. xoshiro256** is about 1/3 faster at producing twice as much pseudo-random stuff, and it passed significant randomness tests that an LCG PRNG would not. -- Fabien.
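The separate-compilation methodology described above can be sketched as a small harness. Here the generator is taken through a function pointer for the same reason Fabien compiled it separately — so the compiler cannot inline it and specialize the loop — and a volatile sink keeps the loop from being optimized away. demo_gen is a stand-in (a simple xorshift64, not one of the PRNGs actually measured):

```c
#include <stdint.h>
#include <time.h>

/* Stand-in generator for demonstration; the real measurement would
 * link against pg_erand48, xoshiro256**, etc. */
uint64_t demo_gen(void)
{
    static uint64_t s = UINT64_C(88172645463325252);
    s ^= s << 13;
    s ^= s >> 7;
    s ^= s << 17;
    return s;
}

/* Time `iterations` calls of gen(), defeating inlining via the
 * indirect call and dead-code elimination via the volatile sink. */
double time_generator(uint64_t (*gen)(void), long iterations)
{
    volatile uint64_t sink = 0;
    clock_t t0 = clock();

    for (long i = 0; i < iterations; i++)
        sink ^= gen();

    (void) sink;
    return (double) (clock() - t0) / CLOCKS_PER_SEC;
}
```

Without either precaution, an optimizer at -O3 can constant-fold or vectorize the loop, which is presumably what produced the too-good-to-be-true 1.6-second figure.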
Fabien COELHO <coelho@cri.ensmp.fr> writes: >> but I don't feel a need for replacing the algorithm. > Hmmm. Does it mean that you would veto any change, even if the speed > concern is addressed (i.e. faster/not slower with better quality)? Well, not veto exactly, but I'd be suspicious of it. First, erand48 has been around a *long* time and its properties are pretty well understood; these other algorithms you found on the net have no real pedigree IMO. Moreover, since it is standard, there's a lower cognitive burden on people to understand what it is and what it can be trusted for. Second, we don't actually have a problem we need to fix by changing the algorithm. We do need to worry about keeping drandom's state separate from the internal random() usages, but that's independent of what the algorithm is. Nobody has complained that random() is insufficiently random, only that the seed might be predictable. I do agree, after closer inspection, that our current coding of _dorand48 is a hangover from machines lacking 64-bit arithmetic. glibc does it like this:

int
__drand48_iterate (unsigned short int xsubi[3], struct drand48_data *buffer)
{
  uint64_t X;
  uint64_t result;

  /* Initialize buffer, if not yet done.  */
  if (__glibc_unlikely (!buffer->__init))
    {
      buffer->__a = 0x5deece66dull;
      buffer->__c = 0xb;
      buffer->__init = 1;
    }

  /* Do the real work.  We choose a data type which contains at least
     48 bits.  Because we compute the modulus it does not care how
     many bits really are computed.  */
  X = (uint64_t) xsubi[2] << 32 | (uint32_t) xsubi[1] << 16 | xsubi[0];
  result = X * buffer->__a + buffer->__c;
  xsubi[0] = result & 0xffff;
  xsubi[1] = (result >> 16) & 0xffff;
  xsubi[2] = (result >> 32) & 0xffff;
  return 0;
}

and other than the pointless use of variables __a and __c, I think we ought to adopt similar coding. regards, tom lane
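What "similar coding" without the pointless state variables might look like is a short function (a sketch, not the committed pg code; the macro and function names are mine, the constants are the standard POSIX drand48 parameters):

```c
#include <stdint.h>

/* Standard POSIX drand48 LCG parameters. */
#define DRAND48_MULT UINT64_C(0x5deece66d)
#define DRAND48_ADD  UINT64_C(0xb)
#define DRAND48_MASK UINT64_C(0xffffffffffff)   /* low 48 bits */

/* One LCG step over a 48-bit state held as three 16-bit words,
 * done in native 64-bit arithmetic instead of 16-bit limb math. */
void dorand48_64(uint16_t xseed[3])
{
    uint64_t x = ((uint64_t) xseed[2] << 32) |
                 ((uint64_t) xseed[1] << 16) |
                 (uint64_t) xseed[0];

    x = (x * DRAND48_MULT + DRAND48_ADD) & DRAND48_MASK;

    xseed[0] = (uint16_t) x;
    xseed[1] = (uint16_t) (x >> 16);
    xseed[2] = (uint16_t) (x >> 32);
}
```

The masking step makes the extra high-order bits of the 64-bit product irrelevant, which is why the word count computed really doesn't matter, exactly as the glibc comment says.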
I wrote: > On further reflection, it seems likely that in most installations a lot > of processes never invoke drandom()/setseed() at all, making such work > in InitProcessGlobals a waste of cycles. Probably a better idea is to > have drandom() initialize the seed on first use, if it wasn't already > set by setseed(). Here's a proposed patch for that. It occurs to me that there's still another good reason to decouple drandom from everything else: the point of setseed(), if it has any at all, is to allow a repeatable sequence of drandom values to be generated. Without this patch, no such guarantee exists because of the possibility of the backend making additional internal calls of libc random(). regards, tom lane diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 3786099..1df4356 100644 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *************** *** 1136,1146 **** </table> <para> ! The characteristics of the values returned by ! <literal><function>random()</function></literal> depend ! on the system implementation. It is not suitable for cryptographic ! applications; see <xref linkend="pgcrypto"/> module for an alternative. ! </para> <para> Finally, <xref linkend="functions-math-trig-table"/> shows the --- 1136,1150 ---- </table> <para> ! The <function>random()</function> function uses a simple linear ! congruential algorithm. It is fast but not suitable for cryptographic ! applications; see the <xref linkend="pgcrypto"/> module for an ! alternative. ! If <function>setseed()</function> is called, the results of ! subsequent <function>random()</function> calls in the current session are ! repeatable by re-issuing <function>setseed()</function> with the same ! argument. ! 
</para> <para> Finally, <xref linkend="functions-math-trig-table"/> shows the diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c index cf9327f..add099e 100644 *** a/src/backend/utils/adt/float.c --- b/src/backend/utils/adt/float.c *************** *** 22,31 **** --- 22,34 ---- #include "catalog/pg_type.h" #include "common/int.h" #include "libpq/pqformat.h" + #include "miscadmin.h" #include "utils/array.h" + #include "utils/backend_random.h" #include "utils/float.h" #include "utils/fmgrprotos.h" #include "utils/sortsupport.h" + #include "utils/timestamp.h" /* Configurable GUC parameter */ *************** float8 degree_c_sixty = 60.0; *** 53,58 **** --- 56,65 ---- float8 degree_c_one_half = 0.5; float8 degree_c_one = 1.0; + /* State for drandom() and setseed() */ + static bool drandom_seed_set = false; + static unsigned short drandom_seed[3] = {0, 0, 0}; + /* Local function prototypes */ static double sind_q1(double x); static double cosd_q1(double x); *************** drandom(PG_FUNCTION_ARGS) *** 2378,2385 **** { float8 result; ! /* result [0.0 - 1.0) */ ! result = (double) random() / ((double) MAX_RANDOM_VALUE + 1); PG_RETURN_FLOAT8(result); } --- 2385,2414 ---- { float8 result; ! /* Initialize random seed, if not done yet in this process */ ! if (unlikely(!drandom_seed_set)) ! { ! /* ! * If possible, initialize the seed using high-quality random bits. ! * Should that fail for some reason, we fall back on a lower-quality ! * seed based on current time and PID. ! */ ! if (!pg_backend_random((char *) drandom_seed, sizeof(drandom_seed))) ! { ! TimestampTz now = GetCurrentTimestamp(); ! uint64 iseed; ! ! /* Mix the PID with the most predictable bits of the timestamp */ ! iseed = (uint64) now ^ ((uint64) MyProcPid << 32); ! drandom_seed[0] = (unsigned short) iseed; ! drandom_seed[1] = (unsigned short) (iseed >> 16); ! drandom_seed[2] = (unsigned short) (iseed >> 32); ! } ! drandom_seed_set = true; ! } ! ! 
/* pg_erand48 produces desired result range [0.0 - 1.0) */ ! result = pg_erand48(drandom_seed); PG_RETURN_FLOAT8(result); } *************** Datum *** 2392,2404 **** setseed(PG_FUNCTION_ARGS) { float8 seed = PG_GETARG_FLOAT8(0); ! int iseed; ! if (seed < -1 || seed > 1) ! elog(ERROR, "setseed parameter %f out of range [-1,1]", seed); ! iseed = (int) (seed * MAX_RANDOM_VALUE); ! srandom((unsigned int) iseed); PG_RETURN_VOID(); } --- 2421,2440 ---- setseed(PG_FUNCTION_ARGS) { float8 seed = PG_GETARG_FLOAT8(0); ! uint64 iseed; ! if (seed < -1 || seed > 1 || isnan(seed)) ! ereport(ERROR, ! (errcode(ERRCODE_INVALID_PARAMETER_VALUE), ! errmsg("setseed parameter %g is out of allowed range [-1,1]", ! seed))); ! /* Use sign bit + 47 fractional bits to fill drandom_seed[] */ ! iseed = (int64) (seed * (float8) UINT64CONST(0x7FFFFFFFFFFF)); ! drandom_seed[0] = (unsigned short) iseed; ! drandom_seed[1] = (unsigned short) (iseed >> 16); ! drandom_seed[2] = (unsigned short) (iseed >> 32); ! drandom_seed_set = true; PG_RETURN_VOID(); }
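The setseed() mapping in the patch above can be restated standalone: the float8 argument in [-1, 1] is scaled by 0x7FFFFFFFFFFF (sign bit plus 47 fractional bits) and the result split into the three 16-bit words of an erand48-style state. A sketch (the function name is mine; the arithmetic mirrors the patch):

```c
#include <stdint.h>

/* Map a seed in [-1, 1] onto three 16-bit erand48 state words,
 * as the patched setseed() does. */
void fill_seed_words(double seed, uint16_t out[3])
{
    /* 47 fractional bits plus sign; negative values wrap modulo 2^64,
     * which is fine since only the low 48 bits are kept. */
    uint64_t iseed =
        (uint64_t) (int64_t) (seed * (double) UINT64_C(0x7FFFFFFFFFFF));

    out[0] = (uint16_t) iseed;
    out[1] = (uint16_t) (iseed >> 16);
    out[2] = (uint16_t) (iseed >> 32);
}
```

Because the mapping is deterministic, re-issuing setseed() with the same argument reproduces the same drandom() sequence, which is the repeatability guarantee Tom mentions.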
I wrote: > Thomas Munro <thomas.munro@enterprisedb.com> writes: >> +1, but I wonder if just separating them is enough. Is our seeding >> algorithm good enough for this new purpose? The initial seed is 100% >> predictable to a logged in user (it's made from the backend PID and >> backend start time, which we tell you), and not even that hard to >> guess from the outside, so I think Coverity's warning is an >> understatement in this case. Even if we separate the PRNG state used >> for internal stuff so that users can't clobber its seed from SQL, >> wouldn't it be possible to predict which statements will survive the >> log sampling filter given easily available information and a good >> guess at how many times random() (or whatever similar thing) has been >> called so far? > Yeah, that's a good point. Maybe we should upgrade the per-process > seed initialization to make it less predictable. I could see expending > a call of the strong RNG to contribute some more noise to the seeds > selected in InitProcessGlobals(). Here's a simple patch to do so. Looking at this, I seem to remember that we considered doing exactly this awhile ago, but refrained because there was concern about depleting the system's reserve of entropy if we have a high backend spawn rate, and it didn't seem like there was a security reason to insist on unpredictable random() results. However, the log-sampling patch destroys the latter argument. As for the former argument, I'm not sure how big a deal that really is. Presumably, the act of spawning a backend would itself contribute some more entropy to the pool (particularly if a network connection is involved), so the depletion problem might be fictitious in the first place. Also, a few references I consulted, such as the Linux urandom(4) man page, suggest that even in a depleted-entropy state the results of reading /dev/urandom should be random enough for all but the very strictest security requirements. Thoughts? 
regards, tom lane diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index eedc617..2d5a0ac 100644 *** a/src/backend/postmaster/postmaster.c --- b/src/backend/postmaster/postmaster.c *************** ClosePostmasterPorts(bool am_syslogger) *** 2520,2530 **** /* * InitProcessGlobals -- set MyProcPid, MyStartTime[stamp], random seeds * ! * Called early in every backend. */ void InitProcessGlobals(void) { MyProcPid = getpid(); MyStartTimestamp = GetCurrentTimestamp(); MyStartTime = timestamptz_to_time_t(MyStartTimestamp); --- 2520,2532 ---- /* * InitProcessGlobals -- set MyProcPid, MyStartTime[stamp], random seeds * ! * Called early in the postmaster and every backend. */ void InitProcessGlobals(void) { + unsigned int rseed; + MyProcPid = getpid(); MyStartTimestamp = GetCurrentTimestamp(); MyStartTime = timestamptz_to_time_t(MyStartTimestamp); *************** InitProcessGlobals(void) *** 2539,2553 **** #endif /* ! * Set a different seed for random() in every backend. Since PIDs and ! * timestamps tend to change more frequently in their least significant ! * bits, shift the timestamp left to allow a larger total number of seeds ! * in a given time period. Since that would leave only 20 bits of the ! * timestamp that cycle every ~1 second, also mix in some higher bits. */ ! srandom(((uint64) MyProcPid) ^ ((uint64) MyStartTimestamp << 12) ^ ! ((uint64) MyStartTimestamp >> 20)); } --- 2541,2570 ---- #endif /* ! * Set a different seed for random() in every process. We want something ! * unpredictable, so if possible, use high-quality random bits for the ! * seed. Otherwise, fall back to a seed based on timestamp and PID. ! * ! * Note we can't use pg_backend_random here, since this is used in the ! * postmaster, and even in a backend we might not be attached to shared ! * memory yet. */ ! #ifdef HAVE_STRONG_RANDOM ! if (!pg_strong_random(&rseed, sizeof(rseed))) ! #endif ! { ! /* ! 
* Since PIDs and timestamps tend to change more frequently in their ! * least significant bits, shift the timestamp left to allow a larger ! * total number of seeds in a given time period. Since that would ! * leave only 20 bits of the timestamp that cycle every ~1 second, ! * also mix in some higher bits. ! */ ! rseed = ((uint64) MyProcPid) ^ ((uint64) MyStartTimestamp << 12) ^ ! ((uint64) MyStartTimestamp >> 20); ! } ! srandom(rseed); }
I wrote: > Looking at this, I seem to remember that we considered doing exactly this > awhile ago, but refrained because there was concern about depleting the > system's reserve of entropy if we have a high backend spawn rate, and it > didn't seem like there was a security reason to insist on unpredictable > random() results. However, the log-sampling patch destroys the latter > argument. As for the former argument, I'm not sure how big a deal that > really is. Presumably, the act of spawning a backend would itself > contribute some more entropy to the pool (particularly if a network > connection is involved), so the depletion problem might be fictitious > in the first place. Also, a few references I consulted, such as the > Linux urandom(4) man page, suggest that even in a depleted-entropy > state the results of reading /dev/urandom should be random enough > for all but the very strictest security requirements. I did some experimentation, watching /proc/sys/kernel/random/entropy_avail while continuously spawning backends, and I can't see any difference in behavior with or without this patch. If there is any effect at all, it's completely swamped by other noise (and there's a lot of noise, even on a machine that's idle). Also, further googling says there's a pretty sizable body of opinion that Linux's available-entropy calculation is bogus anyway: once the system's acquired a reasonable amount of entropy, no amount of reading from /dev/urandom will cause the randomness of the results to decrease. So there's no reason to be concerned about whether we're reading it "too much". regards, tom lane
On Sat, Dec 29, 2018 at 1:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: > > Looking at this, I seem to remember that we considered doing exactly this > > awhile ago, but refrained because there was concern about depleting the > > system's reserve of entropy if we have a high backend spawn rate, and it > > didn't seem like there was a security reason to insist on unpredictable > > random() results. However, the log-sampling patch destroys the latter > > argument. As for the former argument, I'm not sure how big a deal that > > really is. Presumably, the act of spawning a backend would itself > > contribute some more entropy to the pool (particularly if a network > > connection is involved), so the depletion problem might be fictitious > > in the first place. Also, a few references I consulted, such as the > > Linux urandom(4) man page, suggest that even in a depleted-entropy > > state the results of reading /dev/urandom should be random enough > > for all but the very strictest security requirements. > > I did some experimentation, watching /proc/sys/kernel/random/entropy_avail > while continuously spawning backends, and I can't see any difference in > behavior with or without this patch. If there is any effect at all, it's > completely swamped by other noise (and there's a lot of noise, even on a > machine that's idle). > > Also, further googling says there's a pretty sizable body of opinion that > Linux's available-entropy calculation is bogus anyway: once the system's > acquired a reasonable amount of entropy, no amount of reading from > /dev/urandom will cause the randomness of the results to decrease. > So there's no reason to be concerned about whether we're reading it > "too much". I was going to suggest that we might be able to use a single not-visible-to-users number that is mixed into the existing recipe, so that we only ever read urandom once for the cluster. 
But it sounds like it's not a problem, and it's probably better to just pass the whole problem over to the OS. -- Thomas Munro http://www.enterprisedb.com
Thomas Munro <thomas.munro@enterprisedb.com> writes: > I was going to suggest that we might be able to use a single > not-visible-to-users number that is mixed into the existing recipe, so > that we only ever read urandom once for the cluster. Yeah, I was thinking along similar lines, but there's a problem: InitProcessGlobals runs before an EXEC_BACKEND child has reconnected to shared memory, so there's no cheap way to pass state to it. No doubt there are ways around that, but I'd just as soon avoid adding complexity here. If we broke it somehow, the likely results would be silent failure of the per-process seed to be random, which might escape detection for a long time. > But it sounds > like it's not a problem, and it's probably better to just pass the > whole problem over to the OS. Yeah, that's what I'm thinking. regards, tom lane
I wrote: > Thomas Munro <thomas.munro@enterprisedb.com> writes: >> I was going to suggest that we might be able to use a single >> not-visible-to-users number that is mixed into the existing recipe, so >> that we only ever read urandom once for the cluster. > Yeah, I was thinking along similar lines, but there's a problem: > InitProcessGlobals runs before an EXEC_BACKEND child has reconnected > to shared memory, so there's no cheap way to pass state to it. > No doubt there are ways around that, but I'd just as soon avoid > adding complexity here. If we broke it somehow, the likely results > would be silent failure of the per-process seed to be random, which > might escape detection for a long time. >> But it sounds >> like it's not a problem, and it's probably better to just pass the >> whole problem over to the OS. > Yeah, that's what I'm thinking. One final point on this --- I wondered whether initializing the seed with pg_strong_random would be too costly time-wise. It doesn't seem so; I measure the time to do it as ~25us with /dev/urandom or ~80us with OpenSSL on my main development workstation. The total time to start a backend on that machine is a few milliseconds depending on what you want to count. I got numbers broadly in line with those results on half a dozen other platforms (recent Fedora, Mac, various BSD on x86_64, ARM, and PPC). So I'd judge that the time cost is not a reason not to apply this patch, but we might want to reconsider why we're preferring OpenSSL over direct use of /dev/urandom. I also wondered whether we might do something like Thomas' suggestion to try to cut the startup time. In a fork() environment it'd be pretty cheap to pass down a master seed value from the postmaster, but with EXEC_BACKEND I think the only reliable way would be to put the seed value in a file and have backends read it from there. 
That's none too cheap either --- for grins, I hacked pg_strong_random.c to read an ordinary file instead of /dev/urandom, and got runtimes around 15us, so reading /dev/urandom is really not that much slower than a plain file. On the whole I don't find this line of thought attractive, especially since it's not real clear to me how safe it would be to work like this rather than have a fully independent seed for each process. Anyway, I've run out of reasons why we might not want to do this, so I'm going to go ahead and commit it. regards, tom lane
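The /dev/urandom read path whose cost is being compared above is, at its core, just an open/read/close of a device file. A minimal sketch (illustrative only — the real pg_strong_random has platform-specific branches, an OpenSSL path, and a different signature):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Fill buf with len bytes from /dev/urandom; false on any failure.
 * On platforms without /dev/urandom this simply returns false, which
 * is why a fallback seeding path is still needed. */
bool read_urandom(void *buf, size_t len)
{
    FILE *f = fopen("/dev/urandom", "rb");
    size_t got;

    if (f == NULL)
        return false;
    got = fread(buf, 1, len, f);
    fclose(f);
    return got == len;
}
```

The ~25us vs ~15us comparison above suggests most of that cost is the file open/read machinery itself rather than anything specific to the kernel's entropy pool.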
On 19/11/2018 14:40, Tomas Vondra wrote: > On 11/19/18 2:57 AM, Michael Paquier wrote: >> On Sun, Nov 18, 2018 at 12:18:33PM +0100, Dmitry Dolgov wrote: >>> Since it's hard to come up with a concise name that will mention sampling rate >>> in the context of min_duration_statement, I think it's fine to name this >>> configuration "log_sample_rate", as long as it's dependency from >>> log_min_duration_statements is clearly explained in the documentation. >> >> log_sample_rate looks fine to me as a name. > > That seems far too short to me - the name should indicate it applies to > statement logging. I'd say log_statement_sample_rate is better. It was committed with this name, but then one has to wonder why it doesn't also apply to log_statement. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 4 January 2019 at 13:16:48 GMT+01:00, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: >On 19/11/2018 14:40, Tomas Vondra wrote: >> On 11/19/18 2:57 AM, Michael Paquier wrote: >>> On Sun, Nov 18, 2018 at 12:18:33PM +0100, Dmitry Dolgov wrote: >>>> Since it's hard to come up with a concise name that will mention sampling rate >>>> in the context of min_duration_statement, I think it's fine to name this >>>> configuration "log_sample_rate", as long as it's dependency from >>>> log_min_duration_statements is clearly explained in the documentation. >>> >>> log_sample_rate looks fine to me as a name. >> >> That seems far too short to me - the name should indicate it applies to >> statement logging. I'd say log_statement_sample_rate is better. > >It was committed with this name, but then one has to wonder why it >doesn't also apply to log_statement. > >-- >Peter Eisentraut http://www.2ndQuadrant.com/ >PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services Yes, but maybe log_min_duration_statements_sample is too long? -- Sent from my Android phone with K-9 Mail. Please excuse my brevity.