Thread: Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Tom Lane
Date:
Heikki Linnakangas <heikki.linnakangas@iki.fi> writes: > Replace PostmasterRandom() with a stronger way of generating randomness. This patch broke padmeleon: 016-10-17 09:57:17.782 EDT [5804d8bd.57c2:1] LOG: database system was shut down at 2016-10-17 09:57:17 EDT 2016-10-17 09:57:17.790 EDT [5804d8bd.57c2:2] LOG: MultiXact member wraparound protections are now enabled 2016-10-17 09:57:17.807 EDT [5804d8bd.57c6:1] LOG: autovacuum launcher started 2016-10-17 09:57:17.820 EDT [5804d8bd.57bf:1] LOG: database system is ready to accept connections 2016-10-17 09:57:18.516 EDT [5804d8bd.57bf:2] LOG: could not generate random query cancel key 2016-10-17 09:57:19.544 EDT [5804d8bd.57bf:3] LOG: could not generate random query cancel key 2016-10-17 09:57:20.563 EDT [5804d8bd.57bf:4] LOG: could not generate random query cancel key 2016-10-17 09:57:21.584 EDT [5804d8bd.57bf:5] LOG: could not generate random query cancel key 2016-10-17 09:57:22.604 EDT [5804d8bd.57bf:6] LOG: could not generate random query cancel key It's failing because this machine lacks /dev/random and /dev/urandom. It does have a non-kernel entropy daemon (prngd), which OpenSSL knows how to read from but the hard-wired code in pg_strong_random() does not. I'm not sure whether it's worth trying to make pg_strong_random() aware of prngd. The real issue here is whether we are willing to say that Postgres simply does not work anymore on machines without standard entropy sources. Doesn't matter whether the user cares about the strength of cancel keys, we're just blowing them off. That seems a bit extreme from here. I think we should be willing to fall back to the old code if we can't find a real entropy source. regards, tom lane
Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Robert Haas
Date:
On Mon, Oct 17, 2016 at 10:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > The real issue here is whether we are willing to say that > Postgres simply does not work anymore on machines without standard entropy > sources. Doesn't matter whether the user cares about the strength of > cancel keys, we're just blowing them off. That seems a bit extreme > from here. I think we should be willing to fall back to the old code > if we can't find a real entropy source. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Heikki Linnakangas
Date:
On 10/17/2016 05:50 PM, Tom Lane wrote: > Heikki Linnakangas <heikki.linnakangas@iki.fi> writes: >> Replace PostmasterRandom() with a stronger way of generating randomness. > > This patch broke padmeleon: > > 016-10-17 09:57:17.782 EDT [5804d8bd.57c2:1] LOG: database system was shut down at 2016-10-17 09:57:17 EDT > 2016-10-17 09:57:17.790 EDT [5804d8bd.57c2:2] LOG: MultiXact member wraparound protections are now enabled > 2016-10-17 09:57:17.807 EDT [5804d8bd.57c6:1] LOG: autovacuum launcher started > 2016-10-17 09:57:17.820 EDT [5804d8bd.57bf:1] LOG: database system is ready to accept connections > 2016-10-17 09:57:18.516 EDT [5804d8bd.57bf:2] LOG: could not generate random query cancel key > 2016-10-17 09:57:19.544 EDT [5804d8bd.57bf:3] LOG: could not generate random query cancel key > 2016-10-17 09:57:20.563 EDT [5804d8bd.57bf:4] LOG: could not generate random query cancel key > 2016-10-17 09:57:21.584 EDT [5804d8bd.57bf:5] LOG: could not generate random query cancel key > 2016-10-17 09:57:22.604 EDT [5804d8bd.57bf:6] LOG: could not generate random query cancel key > > It's failing because this machine lacks /dev/random and /dev/urandom. > It does have a non-kernel entropy daemon (prngd), which OpenSSL knows how > to read from but the hard-wired code in pg_strong_random() does not. > > I'm not sure whether it's worth trying to make pg_strong_random() aware > of prngd. Seems simple enough.. > The real issue here is whether we are willing to say that > Postgres simply does not work anymore on machines without standard entropy > sources. Doesn't matter whether the user cares about the strength of > cancel keys, we're just blowing them off. That seems a bit extreme > from here. I think we should be willing to fall back to the old code > if we can't find a real entropy source. I'm scared of having pg_strong_random() that is willing to fall back to not-so-strong values. We can rename it, of course, but it seems dangerous to use a weak random-number generator for authentication purposes (query cancel, MD5 salts, SCRAM nonces). I think both for the MD5 salt and SCRAM, it doesn't have to be unpredictable to attackers, as long the attacker cannot influence it so that a particular value is chosen. For query cancel, it needs to be unpredictable, but the query cancel key is quite short anyway, and we haven't worried about it so far anyway, because an attacker can't do very much damage by just canceling queries. One option would be to disable query-canceling and MD5 (and SCRAM) authentication, if we don't have an entropy source. - Heikki
Re: Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Stephen Frost
Date:
* Heikki Linnakangas (hlinnaka@iki.fi) wrote: > On 10/17/2016 05:50 PM, Tom Lane wrote: > >Heikki Linnakangas <heikki.linnakangas@iki.fi> writes: > >>Replace PostmasterRandom() with a stronger way of generating randomness. > > > >This patch broke padmeleon: > > > >016-10-17 09:57:17.782 EDT [5804d8bd.57c2:1] LOG: database system was shut down at 2016-10-17 09:57:17 EDT > >2016-10-17 09:57:17.790 EDT [5804d8bd.57c2:2] LOG: MultiXact member wraparound protections are now enabled > >2016-10-17 09:57:17.807 EDT [5804d8bd.57c6:1] LOG: autovacuum launcher started > >2016-10-17 09:57:17.820 EDT [5804d8bd.57bf:1] LOG: database system is ready to accept connections > >2016-10-17 09:57:18.516 EDT [5804d8bd.57bf:2] LOG: could not generate random query cancel key > >2016-10-17 09:57:19.544 EDT [5804d8bd.57bf:3] LOG: could not generate random query cancel key > >2016-10-17 09:57:20.563 EDT [5804d8bd.57bf:4] LOG: could not generate random query cancel key > >2016-10-17 09:57:21.584 EDT [5804d8bd.57bf:5] LOG: could not generate random query cancel key > >2016-10-17 09:57:22.604 EDT [5804d8bd.57bf:6] LOG: could not generate random query cancel key > > > >It's failing because this machine lacks /dev/random and /dev/urandom. > >It does have a non-kernel entropy daemon (prngd), which OpenSSL knows how > >to read from but the hard-wired code in pg_strong_random() does not. > > > >I'm not sure whether it's worth trying to make pg_strong_random() aware > >of prngd. > > Seems simple enough.. If it is, then that's certainly tempting, especially if that's the only platform we support where we don't have a better option, and we wish to continue supporting it. > > The real issue here is whether we are willing to say that > >Postgres simply does not work anymore on machines without standard entropy > >sources. Doesn't matter whether the user cares about the strength of > >cancel keys, we're just blowing them off. That seems a bit extreme > >from here. I think we should be willing to fall back to the old code > >if we can't find a real entropy source. > > I'm scared of having pg_strong_random() that is willing to fall back > to not-so-strong values. We can rename it, of course, but it seems > dangerous to use a weak random-number generator for authentication > purposes (query cancel, MD5 salts, SCRAM nonces). > > I think both for the MD5 salt and SCRAM, it doesn't have to be > unpredictable to attackers, as long the attacker cannot influence it > so that a particular value is chosen. For query cancel, it needs to > be unpredictable, but the query cancel key is quite short anyway, > and we haven't worried about it so far anyway, because an attacker > can't do very much damage by just canceling queries. > > One option would be to disable query-canceling and MD5 (and SCRAM) > authentication, if we don't have an entropy source. Wouldn't it be possible to make this a build-time question..? I don't particularly like the idea of pg_strong_random() falling back to not-strong values, but maybe we could have a '--use-weak-random' configure switch for platforms that don't provide a strong source. Then again, if we wish to support this platform, then perhaps we should put in the effort to support prngd. Thanks! Stephen
Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Tom Lane
Date:
Heikki Linnakangas <hlinnaka@iki.fi> writes: > On 10/17/2016 05:50 PM, Tom Lane wrote: >> The real issue here is whether we are willing to say that >> Postgres simply does not work anymore on machines without standard entropy >> sources. Doesn't matter whether the user cares about the strength of >> cancel keys, we're just blowing them off. That seems a bit extreme >> from here. I think we should be willing to fall back to the old code >> if we can't find a real entropy source. > I'm scared of having pg_strong_random() that is willing to fall back to > not-so-strong values. We can rename it, of course, but it seems > dangerous to use a weak random-number generator for authentication > purposes (query cancel, MD5 salts, SCRAM nonces). I think that it's probably moot on all modern platforms, and even on platforms as old as pademelon, the answer for people who care about strong security is "--with-openssl". What I'm on about here is whether we should make people who don't care about that jump through hoops. Not caring is a perfectly reasonable stance for non-exposed postmasters; otherwise we wouldn't have the "trust" auth method. I would be satisfied with making it a non-default build option, eg add this to pg_strong_random: if (random_from_file("/dev/random", buf, len)) return true; +#ifdef ALLOW_WEAK_SECURITY + ... old PostmasterRandom method here ... +#endif +/* None of the sources were available. */return false; regards, tom lane
Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Magnus Hagander
Date:
On Mon, Oct 17, 2016 at 8:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
-- Heikki Linnakangas <hlinnaka@iki.fi> writes:
> On 10/17/2016 05:50 PM, Tom Lane wrote:
>> The real issue here is whether we are willing to say that
>> Postgres simply does not work anymore on machines without standard entropy
>> sources. Doesn't matter whether the user cares about the strength of
>> cancel keys, we're just blowing them off. That seems a bit extreme
>> from here. I think we should be willing to fall back to the old code
>> if we can't find a real entropy source.
> I'm scared of having pg_strong_random() that is willing to fall back to
> not-so-strong values. We can rename it, of course, but it seems
> dangerous to use a weak random-number generator for authentication
> purposes (query cancel, MD5 salts, SCRAM nonces).
I think that it's probably moot on all modern platforms, and even on
platforms as old as pademelon, the answer for people who care about
strong security is "--with-openssl". What I'm on about here is whether
we should make people who don't care about that jump through hoops.
Not caring is a perfectly reasonable stance for non-exposed postmasters;
otherwise we wouldn't have the "trust" auth method.
I would be satisfied with making it a non-default build option, eg
add this to pg_strong_random:
+1 for that approach. I really wouldn't want to see it fall back completely transparently in case something stops working. But if it's a non-default build option, that's not a problem, and it should make it possible to make it work on older platforms.
Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Heikki Linnakangas
Date:
On 10/17/2016 06:21 PM, Tom Lane wrote: > Heikki Linnakangas <hlinnaka@iki.fi> writes: >> On 10/17/2016 05:50 PM, Tom Lane wrote: >>> The real issue here is whether we are willing to say that >>> Postgres simply does not work anymore on machines without standard entropy >>> sources. Doesn't matter whether the user cares about the strength of >>> cancel keys, we're just blowing them off. That seems a bit extreme >>> from here. I think we should be willing to fall back to the old code >>> if we can't find a real entropy source. > >> I'm scared of having pg_strong_random() that is willing to fall back to >> not-so-strong values. We can rename it, of course, but it seems >> dangerous to use a weak random-number generator for authentication >> purposes (query cancel, MD5 salts, SCRAM nonces). > > I think that it's probably moot on all modern platforms, and even on > platforms as old as pademelon, the answer for people who care about > strong security is "--with-openssl". What I'm on about here is whether > we should make people who don't care about that jump through hoops. > Not caring is a perfectly reasonable stance for non-exposed postmasters; > otherwise we wouldn't have the "trust" auth method. > > I would be satisfied with making it a non-default build option, eg > add this to pg_strong_random: > > if (random_from_file("/dev/random", buf, len)) > return true; > > +#ifdef ALLOW_WEAK_SECURITY > + ... old PostmasterRandom method here ... > +#endif > + > /* None of the sources were available. */ > return false; I also changed pgcrypto to use pg_strong_random(). Using a strong random number generator is even more important for pgcrypto, since it might be used for generating encryption keys. Would we allow a weak generator for pgcrypto too, with ALLOW_WEAK_SECURITY? If we don't, then pgcrypto test cases will still fail on pademelon. If we consider the option to be just for testing, never for production use, then we could allow it, but it feels a bit dangerous, and I'm not sure what's the point of testing a configuration that you should never use in production. I'm going to try implementing prngd support. It seems easy enough, and prngd can be run on modern systems too, which is important for testing it. In addition to that, I'm going to see if we can make postmaster to work sensibly without query cancel keys, if pg_strong_random() fails. That way, you could still run on platforms that don't have /dev/[u]random or prngd, just without support for query cancels, MD5 authentication, and pgcrypto key generation functions. I'd consider that similar to --disable-spinlocks; it would be a helpful step when porting to a new platform, but you wouldn't want to use it in production. One more thing: in hindsight I think would be better to not fall back to /dev/ random, if compiled with OpenSSL support. It really shouldn't fail, and if it does, it seems better to complain loudly. - Heikki
Re: Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Tom Lane
Date:
Heikki Linnakangas <hlinnaka@iki.fi> writes: > I'm going to try implementing prngd support. It seems easy enough, and > prngd can be run on modern systems too, which is important for testing it. OK, if you feel like doing the work. However: > In addition to that, I'm going to see if we can make postmaster to work > sensibly without query cancel keys, if pg_strong_random() fails. This seems like a lot of work, with the "reward" that we'd start getting hard-to-debug reports about query cancel not working. Anytime anyone ever says "cancel didn't seem to work" we'd have to wonder whether there had been a transient failure of pg_strong_random. I think if we're going to refuse to generate weak cancel keys, we should just fail, not silently fall into a functionally degraded state. But in general, I think that being this picky about cancel keys on systems that are too old to have /dev/random is not really helpful to anybody. I don't recall any reports of anyone ever having a DOS situation from weak cancel keys. It's fine to upgrade our practice where it's convenient to do that, but taking away functionality on systems where it's not convenient isn't improving anyone's life. pg_crypto has a different set of tradeoffs and I don't necessarily make the same argument there. I don't feel that cancel keys have to act the same as pg_crypto keys. regards, tom lane
Re: Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Robert Haas
Date:
On Mon, Oct 17, 2016 at 2:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > But in general, I think that being this picky about cancel keys on systems > that are too old to have /dev/random is not really helpful to anybody. > I don't recall any reports of anyone ever having a DOS situation from > weak cancel keys. It's fine to upgrade our practice where it's convenient > to do that, but taking away functionality on systems where it's not > convenient isn't improving anyone's life. Right. I strongly agree with that. If somebody's running on a platform where they don't have a good source of entropy, they are clearly going to still want query cancel to work. They are not going to want ^C to start doing nothing, and they are *definitely* not going to want PostgreSQL to fail to compile and/or start. pgcrypto is a different situation, but I think it's just crazy to say that the problems with cancel keys are so bad that we should just refuse to run at all. Anyone who is in this situation has this problem not just with PostgreSQL but with everything on their system that wishes it had cryptographically strong random numbers, which is probably quite a bit of stuff. We shouldn't take the position that a machine without a good PRNG is a brick. They just have to accept that random number generation will be weaker not only for PostgreSQL but for any software whatever that they run on that machine. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Robert Haas
Date:
On Mon, Oct 17, 2016 at 1:48 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > I'm going to try implementing prngd support. It seems easy enough, and prngd > can be run on modern systems too, which is important for testing it. TBH, if pandemolon is the only system in the BF that doesn't have any other source of entropy, I think that's going 100% in the wrong direction. Even with prngd support, this carries a significant risk of effectively desupporting a large number of obscure UNIX platforms, which I think is a bad decision. It's fine (IMHO) to have optimizations in the code here and there that cater to Windows and Linux because those are the most widely-used platfoms, but we've been pretty diligent about trying to be portable to a wide variety of UNIX-like platforms. I think that's a good decision, and reversing it over the strength of cancel keys seems like letting the tail wag the dog. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Oct 17, 2016 at 1:48 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> I'm going to try implementing prngd support. It seems easy enough, and prngd >> can be run on modern systems too, which is important for testing it. > TBH, if pandemolon is the only system in the BF that doesn't have any > other source of entropy, I think that's going 100% in the wrong > direction. Even with prngd support, this carries a significant risk > of effectively desupporting a large number of obscure UNIX platforms, > which I think is a bad decision. It's fine (IMHO) to have > optimizations in the code here and there that cater to Windows and > Linux because those are the most widely-used platfoms, but we've been > pretty diligent about trying to be portable to a wide variety of > UNIX-like platforms. I think that's a good decision, and reversing it > over the strength of cancel keys seems like letting the tail wag the > dog. This is largely my position as well. A relevant point here is that prngd did not come with that platform --- I had to install it after the fact. (If memory serves, which it may not because this was years ago, I did so because some new version of openssl started whining about lack of an entropy source.) Strong cancel keys are simply not important enough to desupport a bunch of old platforms over. I'm not convinced either way about pgcrypto. It's certainly a reasonable position that you want that to generate sound keys or else fail entirely. But if nothing else on the platform is generating sound keys, do we want to insist on being the first? If we want it to fail, and don't want to retire pademelon, there are multiple ways we could get to that goal: * Enable --with-openssl in pademelon's build (don't really want to do this, since I believe almost all the rest of the buildfarm tests with openssl) * Add variant expected-files (probably bad, it'd hide real failures) * Add a configure option to suppress building/testing pgcrypto (maybe just make it contingent on --with-openssl, which would allow deletion of a bunch of code that duplicates openssl functionality...) * Support reading entropy from prngd (but this means we have no buildfarm coverage for entropy-daemon-less platforms) None of these are perfect, but I'd say the last one is not so obviously the best that we shouldn't consider alternatives. regards, tom lane
Re: Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Michael Paquier
Date:
On Tue, Oct 18, 2016 at 5:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > If we want it to fail, and don't want to retire pademelon, there are > multiple ways we could get to that goal: > > * Enable --with-openssl in pademelon's build (don't really want to do > this, since I believe almost all the rest of the buildfarm tests with > openssl) Yes, I don't think that's a good thing to make openssl installation mandatory for this animal. > * Add variant expected-files (probably bad, it'd hide real failures) > > * Add a configure option to suppress building/testing pgcrypto (maybe > just make it contingent on --with-openssl, which would allow deletion > of a bunch of code that duplicates openssl functionality...) > > * Support reading entropy from prngd (but this means we have no buildfarm > coverage for entropy-daemon-less platforms) > > None of these are perfect, but I'd say the last one is not so obviously > the best that we shouldn't consider alternatives. In light of this discussion, it seems to me that we still want at the end the --allow-weak-keys anyway as an extreme fallback, and this even if there is additional support for prngd. An essential part is to document the weakness of this option properly, like not using pgcrypto with that if there is no other entropy source on an OS. By reading this thread, the point is that we should not complicate the support for obscure nix platforms, and it would be user-unfriendly to require users to install prngd to get more entropy from the system. And actually, enabling prngd would need to be controlled by a configure switch as well disabled by default, no? -- Michael
Re: Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes: > And actually, enabling prngd would need to be controlled by a > configure switch as well disabled by default, no? AFAICT, openssl has no configuration options related to prngd; they seem to be able to use it automatically when /dev/[u]random isn't there. This surprises me a bit because the location of prngd's random-data socket is evidently variable. I've not dug into exactly how openssl figures that out, but I'm sure a little quality time with the openssl sources would explain it. regards, tom lane
Re: Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
From
Michael Paquier
Date:
On Tue, Oct 18, 2016 at 12:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Michael Paquier <michael.paquier@gmail.com> writes: >> And actually, enabling prngd would need to be controlled by a >> configure switch as well disabled by default, no? > > AFAICT, openssl has no configuration options related to prngd; they > seem to be able to use it automatically when /dev/[u]random isn't there. > This surprises me a bit because the location of prngd's random-data socket > is evidently variable. I've not dug into exactly how openssl figures that > out, but I'm sure a little quality time with the openssl sources would > explain it. I dug a bit into the code around RAND_egd and how it gets into fetching a method to get random bytes but got tired of the game for now. The man page means visibly that OpenSSL connects directly to the daemon: https://www.openssl.org/docs/manmaster/crypto/RAND_egd.html By the way, after a short chat with Heikki, we can up with an extra idea: resurrect unix_std from pgcrypto in pg_strong_random as the last fallback method and instead of using sha1, use sha2 as SCRAM is going to need those functions in src/common/ as well. Having a configure switch to enable it may be a good idea. -- Michael