Thread: pgsql: Add pg_alterckey utility to change the cluster key

pgsql: Add pg_alterckey utility to change the cluster key

From

Bruce Momjian

Date:

26 December 2020, 01:25:05

Add pg_alterckey utility to change the cluster key

This can change the key that encrypts the data encryption keys used for
cluster file encryption.

Discussion: https://postgr.es/m/20201202213814.GG20285@momjian.us

Backpatch-through: master

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/62afb42a7f9f533efc6c19f462c3a848fa4ddb63

Modified Files
--------------
doc/src/sgml/ref/pg_alterkey.sgml   | 186 ++++++++++
src/bin/Makefile                    |   1 +
src/bin/pg_alterckey/.gitignore     |   1 +
src/bin/pg_alterckey/Makefile       |  44 +++
src/bin/pg_alterckey/pg_alterckey.c | 693 ++++++++++++++++++++++++++++++++++++
5 files changed, 925 insertions(+)

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Tom Lane

Date:

26 December 2020, 03:36:55

Bruce Momjian <bruce@momjian.us> writes:
> Add pg_alterckey utility to change the cluster key

> Modified Files
> --------------
> doc/src/sgml/ref/pg_alterkey.sgml   | 186 ++++++++++

1. I wonder why this file is "pg_alterkey.sgml" when the
program it documents is pg_alterckey.

2. Regardless of name, this file is dead code because you did not
hook it into the documentation infrastructure.

3. The buildfarm says this commit is (still) busted on Win32.

Possibly these commits need more review than you think.

            regards, tom lane

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Bruce Momjian

Date:

26 December 2020, 03:40:37

On Fri, Dec 25, 2020 at 10:36:55PM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Add pg_alterckey utility to change the cluster key
> 
> > Modified Files
> > --------------
> > doc/src/sgml/ref/pg_alterkey.sgml   | 186 ++++++++++
> 
> 1. I wonder why this file is "pg_alterkey.sgml" when the
> program it documents is pg_alterckey.

Oh, mistake then, will fix.
> 
> 2. Regardless of name, this file is dead code because you did not
> hook it into the documentation infrastructure.

Wow, another issue.

> 3. The buildfarm says this commit is (still) busted on Win32.
> 
> Possibly these commits need more review than you think.

I am fixing those now.  It is a sleep() and ifdef-inside-macro problem.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Michael Paquier

Date:

26 December 2020, 03:49:58

On Fri, Dec 25, 2020 at 10:36:55PM -0500, Tom Lane wrote:
> 3. The buildfarm says this commit is (still) busted on Win32.
>
> Possibly these commits need more review than you think.

Shared feeling here, I think that this is still too early.  FWIW, I am
surprised that this patch series includes exactly zero line of code
for tests, while the total amount of code committed is close to 3000
lines.
--
Michael

Attachment

signature.asc

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Pavel Stehule

Date:

26 December 2020, 05:18:01

so 26. 12. 2020 v 2:25 odesílatel Bruce Momjian <bruce@momjian.us> napsal:

Add pg_alterckey utility to change the cluster key

This can change the key that encrypts the data encryption keys used for
cluster file encryption.

Discussion: https://postgr.es/m/20201202213814.GG20285@momjian.us

Backpatch-through: master

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/62afb42a7f9f533efc6c19f462c3a848fa4ddb63

Modified Files
--------------
doc/src/sgml/ref/pg_alterkey.sgml | 186 ++++++++++
src/bin/Makefile | 1 +
src/bin/pg_alterckey/.gitignore | 1 +
src/bin/pg_alterckey/Makefile | 44 +++
src/bin/pg_alterckey/pg_alterckey.c | 693 ++++++++++++++++++++++++++++++++++++
5 files changed, 925 insertions(+)

Broken tests

make[2]: Vstupuje se do adresáře „/home/pavel/src/postgresql.master/src/bin/pg_alterckey“
rm -rf '/home/pavel/src/postgresql.master/src/bin/pg_alterckey'/tmp_check
/usr/bin/mkdir -p '/home/pavel/src/postgresql.master/src/bin/pg_alterckey'/tmp_check
cd . && TESTDIR='/home/pavel/src/postgresql.master/src/bin/pg_alterckey' PATH="/home/pavel/src/postgresql.master/tmp_install/usr/local/pgsql/master/bin:$PATH" LD_LIBRARY_PATH="/home/pavel/src/postgresql.master/tmp_install/usr/local/pgsql/master/lib" PGPORT='65432' PG_REGRESS='/home/pavel/src/postgresql.master/src/bin/pg_alterckey/../../../src/test/regress/pg_regress' REGRESS_SHLIB='/home/pavel/src/postgresql.master/src/test/regress/regress.so' /usr/bin/prove -I ../../../src/test/perl/ -I . t/*.pl
Cannot detect source of 't/*.pl'! at /usr/share/perl5/vendor_perl/TAP/Parser/IteratorFactory.pm line 256.
TAP::Parser::IteratorFactory::detect_source(TAP::Parser::IteratorFactory=HASH(0x5570abd663d0), TAP::Parser::Source=HASH(0x5570abc36df8)) called at /usr/share/perl5/vendor_perl/TAP/Parser/IteratorFactory.pm line 211
TAP::Parser::IteratorFactory::make_iterator(TAP::Parser::IteratorFactory=HASH(0x5570abd663d0), TAP::Parser::Source=HASH(0x5570abc36df8)) called at /usr/share/perl5/vendor_perl/TAP/Parser.pm line 472
TAP::Parser::_initialize(TAP::Parser=HASH(0x5570abc36a50), HASH(0x5570ab9ce938)) called at /usr/share/perl5/vendor_perl/TAP/Object.pm line 55
TAP::Object::new("TAP::Parser", HASH(0x5570ab9ce938)) called at /usr/share/perl5/vendor_perl/TAP/Object.pm line 130
TAP::Object::_construct(TAP::Harness=HASH(0x5570ab8f61f0), "TAP::Parser", HASH(0x5570ab9ce938)) called at /usr/share/perl5/vendor_perl/TAP/Harness.pm line 852
TAP::Harness::make_parser(TAP::Harness=HASH(0x5570ab8f61f0), TAP::Parser::Scheduler::Job=HASH(0x5570abc07428)) called at /usr/share/perl5/vendor_perl/TAP/Harness.pm line 651
TAP::Harness::_aggregate_single(TAP::Harness=HASH(0x5570ab8f61f0), TAP::Parser::Aggregator=HASH(0x5570ab46ad90), TAP::Parser::Scheduler=HASH(0x5570abc073f8)) called at /usr/share/perl5/vendor_perl/TAP/Harness.pm line 743
TAP::Harness::aggregate_tests(TAP::Harness=HASH(0x5570ab8f61f0), TAP::Parser::Aggregator=HASH(0x5570ab46ad90), "t/*.pl") called at /usr/share/perl5/vendor_perl/TAP/Harness.pm line 558
TAP::Harness::__ANON__() called at /usr/share/perl5/vendor_perl/TAP/Harness.pm line 571
TAP::Harness::runtests(TAP::Harness=HASH(0x5570ab8f61f0), "t/*.pl") called at /usr/share/perl5/vendor_perl/App/Prove.pm line 548
App::Prove::_runtests(App::Prove=HASH(0x5570ab432868), HASH(0x5570ab8b6af0), "t/*.pl") called at /usr/share/perl5/vendor_perl/App/Prove.pm line 506
App::Prove::run(App::Prove=HASH(0x5570ab432868)) called at /usr/bin/prove line 10
make[2]: *** [Makefile:41: check] Chyba 2
make[2]: Opouští se adresář „/home/pavel/src/postgresql.master/src/bin/pg_alterckey“
make[1]: *** [Makefile:43: check-pg_alterckey-recurse] Chyba 2
make[1]: Opouští se adresář „/home/pavel/src/postgresql.master/src/bin“
make: *** [GNUmakefile:71: check-world-src/bin-recurse] Chyba 2

Regards

Pavel

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Bruce Momjian

Date:

26 December 2020, 06:20:43

On Sat, Dec 26, 2020 at 06:18:01AM +0100, Pavel Stehule wrote:
>     Details
>     -------
>     https://git.postgresql.org/pg/commitdiff/
>     62afb42a7f9f533efc6c19f462c3a848fa4ddb63
> 
>     Modified Files
>     --------------
>     doc/src/sgml/ref/pg_alterkey.sgml   | 186 ++++++++++
>     src/bin/Makefile                    |   1 +
>     src/bin/pg_alterckey/.gitignore     |   1 +
>     src/bin/pg_alterckey/Makefile       |  44 +++
>     src/bin/pg_alterckey/pg_alterckey.c | 693
>     ++++++++++++++++++++++++++++++++++++
>     5 files changed, 925 insertions(+)
> 
> 
> Broken tests

Uh, how do I reproduce this failure?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Pavel Stehule

Date:

26 December 2020, 06:25:00

so 26. 12. 2020 v 7:20 odesílatel Bruce Momjian <bruce@momjian.us> napsal:

On Sat, Dec 26, 2020 at 06:18:01AM +0100, Pavel Stehule wrote:
> Details
> -------
> https://git.postgresql.org/pg/commitdiff/
> 62afb42a7f9f533efc6c19f462c3a848fa4ddb63
>
> Modified Files
> --------------
> doc/src/sgml/ref/pg_alterkey.sgml | 186 ++++++++++
> src/bin/Makefile | 1 +
> src/bin/pg_alterckey/.gitignore | 1 +
> src/bin/pg_alterckey/Makefile | 44 +++
> src/bin/pg_alterckey/pg_alterckey.c | 693
> ++++++++++++++++++++++++++++++++++++
> 5 files changed, 925 insertions(+)
>
>
> Broken tests

Uh, how do I reproduce this failure?

I just executed make check-world on master

Pavel

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Pavel Stehule

Date:

26 December 2020, 07:29:10

so 26. 12. 2020 v 7:25 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:

so 26. 12. 2020 v 7:20 odesílatel Bruce Momjian <bruce@momjian.us> napsal:
On Sat, Dec 26, 2020 at 06:18:01AM +0100, Pavel Stehule wrote:
> Details
> -------
> https://git.postgresql.org/pg/commitdiff/
> 62afb42a7f9f533efc6c19f462c3a848fa4ddb63
>
> Modified Files
> --------------
> doc/src/sgml/ref/pg_alterkey.sgml | 186 ++++++++++
> src/bin/Makefile | 1 +
> src/bin/pg_alterckey/.gitignore | 1 +
> src/bin/pg_alterckey/Makefile | 44 +++
> src/bin/pg_alterckey/pg_alterckey.c | 693
> ++++++++++++++++++++++++++++++++++++
> 5 files changed, 925 insertions(+)
>
>
> Broken tests

Uh, how do I reproduce this failure?

I just executed make check-world on master

I did recheck with same result

Pavel

Pavel

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Michael Paquier

Date:

26 December 2020, 09:16:37

On Sat, Dec 26, 2020 at 08:29:10AM +0100, Pavel Stehule wrote:
> I did recheck with same result

The Makefile of pg_alterckey is busted, and adding --enable-tap-tests
to the options of ./configure is enough to see a failure.  In short,
src/bin/pg_alterckey/Makefile includes the following lines, but it has
no TAP tests as of pg_alterckey/t/:
check:
        $(prove_check)

installcheck:
        $(prove_installcheck)

So this breaks.

I would like to point out that all non-Unix buildfarm members are
broken like fairywen because of the addition of those scripts:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2020-12-26%2009%3A04%3A27
/usr/bin/install: cannot stat 'crypto/ckey_aws.sh.sample': No such
file or directory

The CF bot at http://cfbot.cputube.org/ includes tests on Windows, so
those problems would have been detected beforehand.  Did you look at
these?  If this cannot be fixed, could it be possible to revert
please?  It looks rather clear that this has not been tested across
multiple platforms, and the absence of tests to allow the buildfarm to
stress this code does not really help either in gaining confidence
that this is stable.
--
Michael

Attachment

signature.asc

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Michael Paquier

Date:

26 December 2020, 09:24:45

On Sat, Dec 26, 2020 at 06:16:37PM +0900, Michael Paquier wrote:
> I would like to point out that all non-Unix buildfarm members are
> broken like fairywen because of the addition of those scripts:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2020-12-26%2009%3A04%3A27
> /usr/bin/install: cannot stat 'crypto/ckey_aws.sh.sample': No such
> file or directory

I may have been incorrect here, it looks like those failures come from
VPATH installations.
--
Michael

Attachment

signature.asc

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Fabien COELHO

Date:

26 December 2020, 10:03:00

Hello Bruce

Tom>> Possibly these commits need more review than you think.

Michaël> Shared feeling here, I think that this is still too early. 
Michaël> FWIW, I am surprised that this patch series includes exactly zero line of code
Michaël> for tests, while the total amount of code committed is close to 
Michaël> 3000 lines.

I intended to have a look at all that, but it got committed before I had 
time to do it. Ok, I do not have much time these past months, but I've 
been quite surprised anyway by the speed for such a feature/patch.

The feeling I expressed early in the thread is that the design should be 
extendable, so that it does not fit only one particular use-case but fail 
at any other that were not the author's, and a large reimplementation 
would be needed to get them in. I was thinking of having a closer look 
with that in mind. In particular and IMHO, the "master key" should not 
(necessarily) be hold by a postgres process, not sure what the current 
status is, but that is an example.

Committing such amount of codes and features without any test is much too 
blunt. Why the rush?

Question: Should these patches be reverted till the stuff is properly 
stabilized, tested and reviewed?

-- 
Fabien.

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Tom Lane

Date:

26 December 2020, 16:45:41

Bruce Momjian <bruce@momjian.us> writes:
> On Sat, Dec 26, 2020 at 06:16:37PM +0900, Michael Paquier wrote:
>> The CF bot at http://cfbot.cputube.org/ includes tests on Windows, so
>> those problems would have been detected beforehand.  Did you look at
>> these?  If this cannot be fixed, could it be possible to revert
>> please?  It looks rather clear that this has not been tested across
>> multiple platforms, and the absence of tests to allow the buildfarm to
>> stress this code does not really help either in gaining confidence
>> that this is stable.

> I have been working on this patch for almost two months, and wanted to
> try to get it into the tree near Christmas as sort of a Christmas
> present to the community.

It's not much of a "Christmas present" to have to spend part of the
holiday fixing somebody else's obviously under-baked commits.

I think you should revert *all* of this and not come back until
(a) you have some test cases and (b) somebody besides you has read
the entire patch.  There are way too many beginner-level errors in
what you've pushed so far.  And that's not even counting the question
of whether the design is right to begin with, which somebody already
expressed doubts about.

            regards, tom lane

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Andrew Dunstan

Date:

28 December 2020, 13:49:09

On 12/27/20 12:44 PM, Bruce Momjian wrote:
>
>> Based on the number of concerns raised by various people over the last
>> couple of days (including myself, one point being the refactoring of
>> the ciphers taken from pgcrypto that should have been in its own
>> commit), I agree that it would be better to revert this code for now.
> OK, I will do so in the next few hours.  My followup will be to:
>
> *  register it for the commit-fest so it gets cfbot and other visibility
> *  modify pgcrypto to use the new AES API (the SHA512 call no longer exists)
> *  develop TAP tests, though as I mentioned, they will be odd
>

Bruce

I think this is a wise course.

The cfbot and the vanilla buildfarm can't test some things easily, and
this might be such a case. That's where custom buildfarm modules can
prove useful. For example, we have one that is used by rhinoceros to run
the SEPgsql tests. ISTM this might well be another case. I can work with
you to develop such a module.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Fabien COELHO

Date:

28 December 2020, 14:09:11

Hello Bruce,

I put the thread back on hackers.

>> The first two keys are stored in pg_cryptokeys/ in the data directory,
>> while the third one is retrieved using a GUC for validation at server
>> startup for the other two.

>> Do we necessarily have to store the first level keys within the data 
>> directory?  I guess that this choice has been made for performance, but 
>> is that really something that a user would want all the time?  AES256 
>> is the only option available for the data keys.  What if somebody wants 
>> to roll in their own encryption?
>
> To clarify, we encrypt the data keys using AES256, but the data keys
> themselves can be 128, 192, or 256 bits.
>
>> Companies can have many requirements in terms of accepting the use of
>> one option or another.
>
> I think ultimately we will need three commands to control the keys.
> First, there is the cluster_key_command, which we have now.  Second, I
> think we will need an optional command which returns random bytes ---
> this would allow users to get random bytes from a different source than
> that used by the server code.
>
> Third, we will probably need a command that returns the data encryption
> keys directly, either heap/index or WAL keys, probably based on key
> number --- you pass the key number you want, and the command returns the
> data key.  There would not be a cluster key in this case, but the
> command could still prompt the user for perhaps a password to the KMS
> server. It could not be used if any of the previous two commands are
> used. I assume an HMAC would still be stored in the pg_cryptokeys
> directory to check that the right key has been returned.

Yep, my point is that it should be possible to have the whole key 
management outside of postgres.

This said, postgres should provide a reasonable default implementation, 
obviously, preferably by using the provided mechanism (*NOT* a direct 
internal implementation and a possible switch to something else, IMHO, 
because then it would not be tested for whether it provides the right 
level of usability).

I agree that keys need to be identified. I somehow disagree with the 
naming of the script and the implied usage.

ISTM that there could be an external interface:

  - to initialize something. It may start a suid process, it may connect to 
a remote host, it may ask for a master password, who knows?

    /path/to/init --options arguments…

the init process would return something which would be reused later on, eg 
an authentication token, or maybe a path to a socket for communication, or 
a file which contains something, or even a master/cluster key, but not 
necessarily. It may be anything. How it is passed to the next 
process/connection is an open question. Maybe on its stdin?

  - to start a process (?) which provide keys, either created (new) or 
existing (get), and possibly destroy them (or not?). The init process 
result should/could be passed somehow to this process, which may be suid 
something else. Another option would be to rely on some IPC mechanism.
I'm not sure what the best choice is.

ISTM that this process/connection could/should be persistent, with a 
simplistic text or binary based client/server interface. What this 
process/connection does it beyond postgres. In my mind, it could implement 
getting random data as well. I'd suggest that under no circumstances 
should the postgres process create cryptographic keys, although it should 
probably name them with some predefine length limit.

    /path/to/run --options arguments…

Then there should be an postgres internal interface to store the results 
for local processing, retrieve them when needed, and so on, ok.

ISTM that there should also be an internal interface to load the 
cryptographic primitives. Possibly a so/dll would do, or maybe just an 
extension mechanism which would provide the necessary functions, but this 
raise the issue of bootstraping, so maybe not so great an idea. The 
functions should probably be able to implement a counter mode, so that 
actual keys depend on the page position in file position, but what is 
really does is not postgres concern.

A cryptographic concern for me is whether it would be possible to have 
authentication/integrity checks associated to each page. This means having 
the ability to reserve some space somewhere, possibly 8-16 bytes, in a 
page. Different algorithm could have different space requirements.

The same interface should be used by other back-end commands (pg_upgrade, 
whatever).

Somehow, the design should be abstract, without implying much, so that 
very different interfaces could be provided in term of whether there 
exists a master key, how keys are derived, what key sizes are, what 
algorithms are used, and so on. Postgres itself should not store keys, 
only key identifiers.

I'm wondering whether replication should be able to work without some/all 
keys, so that a streaming replication could be implemented without the 
remote host being fully aware of the cryptographic keys.

Another functional point is to allow changing the underlying key for a 
file, and discuss how this could work with the interface, as I noted that 
it was a desired feature. I'd suggest that maybe this should be based on 
changing the name of the "key", so that the external key management would 
not need to know about it. How to achieve that as a transaction is an open 
question. Maybe it should be an change outside of postgres, which modifies 
files at the cluster level with the database stopped.

> I thought we should implement the first command, because it will
> probably be the most common and easiest to use, and then see what people
> want added.

I somehow disagree: I think that pg should provide from the start the full 
generic interface, *and* a reasonable implementation which is what the 
current proposal does, fine with me. A simplistic test-oriented interface 
could be implemented in a scripting language. I think that great care must 
be put upfront in the overall design, so that it can be reused later on by 
people with pretty different requirements (in term of auditors, legal 
constraints, functions, whatever). I would like to avoid providing an 
half-baked design which suits some use-cases but cannot be used for 
others, because of key design choices. From a number of line of code point 
of view, it may not change much, really, this is more about design and 
putting functionalities in the right places.

Now I intend to give some time to review patches with this in mind. Maybe 
I'll have some time at the end of the next CF, or the next.

-- 
Fabien.

Re: pgsql: Add pg_alterckey utility to change the cluster key

From

Fabien COELHO

Date:

28 December 2020, 14:09:11

Hello Bruce,

I put the thread back on hackers.

>> The first two keys are stored in pg_cryptokeys/ in the data directory,
>> while the third one is retrieved using a GUC for validation at server
>> startup for the other two.

>> Do we necessarily have to store the first level keys within the data 
>> directory?  I guess that this choice has been made for performance, but 
>> is that really something that a user would want all the time?  AES256 
>> is the only option available for the data keys.  What if somebody wants 
>> to roll in their own encryption?
>
> To clarify, we encrypt the data keys using AES256, but the data keys
> themselves can be 128, 192, or 256 bits.
>
>> Companies can have many requirements in terms of accepting the use of
>> one option or another.
>
> I think ultimately we will need three commands to control the keys.
> First, there is the cluster_key_command, which we have now.  Second, I
> think we will need an optional command which returns random bytes ---
> this would allow users to get random bytes from a different source than
> that used by the server code.
>
> Third, we will probably need a command that returns the data encryption
> keys directly, either heap/index or WAL keys, probably based on key
> number --- you pass the key number you want, and the command returns the
> data key.  There would not be a cluster key in this case, but the
> command could still prompt the user for perhaps a password to the KMS
> server. It could not be used if any of the previous two commands are
> used. I assume an HMAC would still be stored in the pg_cryptokeys
> directory to check that the right key has been returned.

Yep, my point is that it should be possible to have the whole key 
management outside of postgres.

This said, postgres should provide a reasonable default implementation, 
obviously, preferably by using the provided mechanism (*NOT* a direct 
internal implementation and a possible switch to something else, IMHO, 
because then it would not be tested for whether it provides the right 
level of usability).

I agree that keys need to be identified. I somehow disagree with the 
naming of the script and the implied usage.

ISTM that there could be an external interface:

  - to initialize something. It may start a suid process, it may connect to 
a remote host, it may ask for a master password, who knows?

    /path/to/init --options arguments…

the init process would return something which would be reused later on, eg 
an authentication token, or maybe a path to a socket for communication, or 
a file which contains something, or even a master/cluster key, but not 
necessarily. It may be anything. How it is passed to the next 
process/connection is an open question. Maybe on its stdin?

  - to start a process (?) which provide keys, either created (new) or 
existing (get), and possibly destroy them (or not?). The init process 
result should/could be passed somehow to this process, which may be suid 
something else. Another option would be to rely on some IPC mechanism.
I'm not sure what the best choice is.

ISTM that this process/connection could/should be persistent, with a 
simplistic text or binary based client/server interface. What this 
process/connection does it beyond postgres. In my mind, it could implement 
getting random data as well. I'd suggest that under no circumstances 
should the postgres process create cryptographic keys, although it should 
probably name them with some predefine length limit.

    /path/to/run --options arguments…

Then there should be an postgres internal interface to store the results 
for local processing, retrieve them when needed, and so on, ok.

ISTM that there should also be an internal interface to load the 
cryptographic primitives. Possibly a so/dll would do, or maybe just an 
extension mechanism which would provide the necessary functions, but this 
raise the issue of bootstraping, so maybe not so great an idea. The 
functions should probably be able to implement a counter mode, so that 
actual keys depend on the page position in file position, but what is 
really does is not postgres concern.

A cryptographic concern for me is whether it would be possible to have 
authentication/integrity checks associated to each page. This means having 
the ability to reserve some space somewhere, possibly 8-16 bytes, in a 
page. Different algorithm could have different space requirements.

The same interface should be used by other back-end commands (pg_upgrade, 
whatever).

Somehow, the design should be abstract, without implying much, so that 
very different interfaces could be provided in term of whether there 
exists a master key, how keys are derived, what key sizes are, what 
algorithms are used, and so on. Postgres itself should not store keys, 
only key identifiers.

I'm wondering whether replication should be able to work without some/all 
keys, so that a streaming replication could be implemented without the 
remote host being fully aware of the cryptographic keys.

Another functional point is to allow changing the underlying key for a 
file, and discuss how this could work with the interface, as I noted that 
it was a desired feature. I'd suggest that maybe this should be based on 
changing the name of the "key", so that the external key management would 
not need to know about it. How to achieve that as a transaction is an open 
question. Maybe it should be an change outside of postgres, which modifies 
files at the cluster level with the database stopped.

> I thought we should implement the first command, because it will
> probably be the most common and easiest to use, and then see what people
> want added.

I somehow disagree: I think that pg should provide from the start the full 
generic interface, *and* a reasonable implementation which is what the 
current proposal does, fine with me. A simplistic test-oriented interface 
could be implemented in a scripting language. I think that great care must 
be put upfront in the overall design, so that it can be reused later on by 
people with pretty different requirements (in term of auditors, legal 
constraints, functions, whatever). I would like to avoid providing an 
half-baked design which suits some use-cases but cannot be used for 
others, because of key design choices. From a number of line of code point 
of view, it may not change much, really, this is more about design and 
putting functionalities in the right places.

Now I intend to give some time to review patches with this in mind. Maybe 
I'll have some time at the end of the next CF, or the next.

-- 
Fabien.