pgsql: Add pg_alterckey utility to change the cluster key
Add pg_alterckey utility to change the cluster key
This can change the key that encrypts the data encryption keys used for
cluster file encryption.
Discussion: /messages/by-id/20201202213814.GG20285@momjian.us
Backpatch-through: master
Branch
------
master
Details
-------
https://git.postgresql.org/pg/commitdiff/62afb42a7f9f533efc6c19f462c3a848fa4ddb63
Modified Files
--------------
doc/src/sgml/ref/pg_alterkey.sgml | 186 ++++++++++
src/bin/Makefile | 1 +
src/bin/pg_alterckey/.gitignore | 1 +
src/bin/pg_alterckey/Makefile | 44 +++
src/bin/pg_alterckey/pg_alterckey.c | 693 ++++++++++++++++++++++++++++++++++++
5 files changed, 925 insertions(+)
Bruce Momjian <bruce@momjian.us> writes:
Add pg_alterckey utility to change the cluster key
Modified Files
--------------
doc/src/sgml/ref/pg_alterkey.sgml | 186 ++++++++++
1. I wonder why this file is "pg_alterkey.sgml" when the
program it documents is pg_alterckey.
2. Regardless of name, this file is dead code because you did not
hook it into the documentation infrastructure.
3. The buildfarm says this commit is (still) busted on Win32.
Possibly these commits need more review than you think.
regards, tom lane
On Fri, Dec 25, 2020 at 10:36:55PM -0500, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
Add pg_alterckey utility to change the cluster key
Modified Files
--------------
doc/src/sgml/ref/pg_alterkey.sgml | 186 ++++++++++1. I wonder why this file is "pg_alterkey.sgml" when the
program it documents is pg_alterckey.
Oh, mistake then, will fix.
2. Regardless of name, this file is dead code because you did not
hook it into the documentation infrastructure.
Wow, another issue.
3. The buildfarm says this commit is (still) busted on Win32.
Possibly these commits need more review than you think.
I am fixing those now. It is a sleep() and ifdef-inside-macro problem.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Fri, Dec 25, 2020 at 10:36:55PM -0500, Tom Lane wrote:
3. The buildfarm says this commit is (still) busted on Win32.
Possibly these commits need more review than you think.
Shared feeling here, I think that this is still too early. FWIW, I am
surprised that this patch series includes exactly zero line of code
for tests, while the total amount of code committed is close to 3000
lines.
--
Michael
Hi
so 26. 12. 2020 v 2:25 odesílatel Bruce Momjian <bruce@momjian.us> napsal:
Add pg_alterckey utility to change the cluster key
This can change the key that encrypts the data encryption keys used for
cluster file encryption.Discussion: /messages/by-id/20201202213814.GG20285@momjian.us
Backpatch-through: master
Branch
------
masterDetails
-------https://git.postgresql.org/pg/commitdiff/62afb42a7f9f533efc6c19f462c3a848fa4ddb63
Modified Files
--------------
doc/src/sgml/ref/pg_alterkey.sgml | 186 ++++++++++
src/bin/Makefile | 1 +
src/bin/pg_alterckey/.gitignore | 1 +
src/bin/pg_alterckey/Makefile | 44 +++
src/bin/pg_alterckey/pg_alterckey.c | 693
++++++++++++++++++++++++++++++++++++
5 files changed, 925 insertions(+)
Broken tests
make[2]: Vstupuje se do adresáře
„/home/pavel/src/postgresql.master/src/bin/pg_alterckey“
rm -rf '/home/pavel/src/postgresql.master/src/bin/pg_alterckey'/tmp_check
/usr/bin/mkdir -p
'/home/pavel/src/postgresql.master/src/bin/pg_alterckey'/tmp_check
cd . && TESTDIR='/home/pavel/src/postgresql.master/src/bin/pg_alterckey'
PATH="/home/pavel/src/postgresql.master/tmp_install/usr/local/pgsql/master/bin:$PATH"
LD_LIBRARY_PATH="/home/pavel/src/postgresql.master/tmp_install/usr/local/pgsql/master/lib"
PGPORT='65432'
PG_REGRESS='/home/pavel/src/postgresql.master/src/bin/pg_alterckey/../../../src/test/regress/pg_regress'
REGRESS_SHLIB='/home/pavel/src/postgresql.master/src/test/regress/regress.so'
/usr/bin/prove -I ../../../src/test/perl/ -I . t/*.pl
Cannot detect source of 't/*.pl'! at
/usr/share/perl5/vendor_perl/TAP/Parser/IteratorFactory.pm line 256.
TAP::Parser::IteratorFactory::detect_source(TAP::Parser::IteratorFactory=HASH(0x5570abd663d0),
TAP::Parser::Source=HASH(0x5570abc36df8)) called at
/usr/share/perl5/vendor_perl/TAP/Parser/IteratorFactory.pm line 211
TAP::Parser::IteratorFactory::make_iterator(TAP::Parser::IteratorFactory=HASH(0x5570abd663d0),
TAP::Parser::Source=HASH(0x5570abc36df8)) called at
/usr/share/perl5/vendor_perl/TAP/Parser.pm line 472
TAP::Parser::_initialize(TAP::Parser=HASH(0x5570abc36a50),
HASH(0x5570ab9ce938)) called at /usr/share/perl5/vendor_perl/TAP/Object.pm
line 55
TAP::Object::new("TAP::Parser", HASH(0x5570ab9ce938)) called at
/usr/share/perl5/vendor_perl/TAP/Object.pm line 130
TAP::Object::_construct(TAP::Harness=HASH(0x5570ab8f61f0), "TAP::Parser",
HASH(0x5570ab9ce938)) called at /usr/share/perl5/vendor_perl/TAP/Harness.pm
line 852
TAP::Harness::make_parser(TAP::Harness=HASH(0x5570ab8f61f0),
TAP::Parser::Scheduler::Job=HASH(0x5570abc07428)) called at
/usr/share/perl5/vendor_perl/TAP/Harness.pm line 651
TAP::Harness::_aggregate_single(TAP::Harness=HASH(0x5570ab8f61f0),
TAP::Parser::Aggregator=HASH(0x5570ab46ad90),
TAP::Parser::Scheduler=HASH(0x5570abc073f8)) called at
/usr/share/perl5/vendor_perl/TAP/Harness.pm line 743
TAP::Harness::aggregate_tests(TAP::Harness=HASH(0x5570ab8f61f0),
TAP::Parser::Aggregator=HASH(0x5570ab46ad90), "t/*.pl") called at
/usr/share/perl5/vendor_perl/TAP/Harness.pm line 558
TAP::Harness::__ANON__() called at
/usr/share/perl5/vendor_perl/TAP/Harness.pm line 571
TAP::Harness::runtests(TAP::Harness=HASH(0x5570ab8f61f0), "t/*.pl") called
at /usr/share/perl5/vendor_perl/App/Prove.pm line 548
App::Prove::_runtests(App::Prove=HASH(0x5570ab432868),
HASH(0x5570ab8b6af0), "t/*.pl") called at
/usr/share/perl5/vendor_perl/App/Prove.pm line 506
App::Prove::run(App::Prove=HASH(0x5570ab432868)) called at /usr/bin/prove
line 10
make[2]: *** [Makefile:41: check] Chyba 2
make[2]: Opouští se adresář
„/home/pavel/src/postgresql.master/src/bin/pg_alterckey“
make[1]: *** [Makefile:43: check-pg_alterckey-recurse] Chyba 2
make[1]: Opouští se adresář „/home/pavel/src/postgresql.master/src/bin“
make: *** [GNUmakefile:71: check-world-src/bin-recurse] Chyba 2
Regards
Pavel
On Sat, Dec 26, 2020 at 06:18:01AM +0100, Pavel Stehule wrote:
Details
-------
https://git.postgresql.org/pg/commitdiff/
62afb42a7f9f533efc6c19f462c3a848fa4ddb63Modified Files
--------------
doc/src/sgml/ref/pg_alterkey.sgml� �| 186 ++++++++++
src/bin/Makefile� � � � � � � � � � |� �1 +
src/bin/pg_alterckey/.gitignore� � �|� �1 +
src/bin/pg_alterckey/Makefile� � � �|� 44 +++
src/bin/pg_alterckey/pg_alterckey.c | 693
++++++++++++++++++++++++++++++++++++
5 files changed, 925 insertions(+)Broken tests
Uh, how do I reproduce this failure?
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
so 26. 12. 2020 v 7:20 odesílatel Bruce Momjian <bruce@momjian.us> napsal:
On Sat, Dec 26, 2020 at 06:18:01AM +0100, Pavel Stehule wrote:
Details
-------
https://git.postgresql.org/pg/commitdiff/
62afb42a7f9f533efc6c19f462c3a848fa4ddb63Modified Files
--------------
doc/src/sgml/ref/pg_alterkey.sgml | 186 ++++++++++
src/bin/Makefile | 1 +
src/bin/pg_alterckey/.gitignore | 1 +
src/bin/pg_alterckey/Makefile | 44 +++
src/bin/pg_alterckey/pg_alterckey.c | 693
++++++++++++++++++++++++++++++++++++
5 files changed, 925 insertions(+)Broken tests
Uh, how do I reproduce this failure?
I just executed make check-world on master
Pavel
Show quoted text
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.comThe usefulness of a cup is in its emptiness, Bruce Lee
so 26. 12. 2020 v 7:25 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
napsal:
so 26. 12. 2020 v 7:20 odesílatel Bruce Momjian <bruce@momjian.us> napsal:
On Sat, Dec 26, 2020 at 06:18:01AM +0100, Pavel Stehule wrote:
Details
-------
https://git.postgresql.org/pg/commitdiff/
62afb42a7f9f533efc6c19f462c3a848fa4ddb63Modified Files
--------------
doc/src/sgml/ref/pg_alterkey.sgml | 186 ++++++++++
src/bin/Makefile | 1 +
src/bin/pg_alterckey/.gitignore | 1 +
src/bin/pg_alterckey/Makefile | 44 +++
src/bin/pg_alterckey/pg_alterckey.c | 693
++++++++++++++++++++++++++++++++++++
5 files changed, 925 insertions(+)Broken tests
Uh, how do I reproduce this failure?
I just executed make check-world on master
I did recheck with same result
Pavel
Show quoted text
Pavel
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.comThe usefulness of a cup is in its emptiness, Bruce Lee
On Sat, Dec 26, 2020 at 08:29:10AM +0100, Pavel Stehule wrote:
I did recheck with same result
The Makefile of pg_alterckey is busted, and adding --enable-tap-tests
to the options of ./configure is enough to see a failure. In short,
src/bin/pg_alterckey/Makefile includes the following lines, but it has
no TAP tests as of pg_alterckey/t/:
check:
$(prove_check)
installcheck:
$(prove_installcheck)
So this breaks.
I would like to point out that all non-Unix buildfarm members are
broken like fairywen because of the addition of those scripts:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2020-12-26%2009%3A04%3A27
/usr/bin/install: cannot stat 'crypto/ckey_aws.sh.sample': No such
file or directory
The CF bot at http://cfbot.cputube.org/ includes tests on Windows, so
those problems would have been detected beforehand. Did you look at
these? If this cannot be fixed, could it be possible to revert
please? It looks rather clear that this has not been tested across
multiple platforms, and the absence of tests to allow the buildfarm to
stress this code does not really help either in gaining confidence
that this is stable.
--
Michael
On Sat, Dec 26, 2020 at 06:16:37PM +0900, Michael Paquier wrote:
I would like to point out that all non-Unix buildfarm members are
broken like fairywen because of the addition of those scripts:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2020-12-26%2009%3A04%3A27
/usr/bin/install: cannot stat 'crypto/ckey_aws.sh.sample': No such
file or directory
I may have been incorrect here, it looks like those failures come from
VPATH installations.
--
Michael
Hello Bruce
Tom>> Possibly these commits need more review than you think.
Michaᅵl> Shared feeling here, I think that this is still too early.
Michaᅵl> FWIW, I am surprised that this patch series includes exactly zero line of code
Michaᅵl> for tests, while the total amount of code committed is close to
Michaᅵl> 3000 lines.
I intended to have a look at all that, but it got committed before I had
time to do it. Ok, I do not have much time these past months, but I've
been quite surprised anyway by the speed for such a feature/patch.
The feeling I expressed early in the thread is that the design should be
extendable, so that it does not fit only one particular use-case but fail
at any other that were not the author's, and a large reimplementation
would be needed to get them in. I was thinking of having a closer look
with that in mind. In particular and IMHO, the "master key" should not
(necessarily) be hold by a postgres process, not sure what the current
status is, but that is an example.
Committing such amount of codes and features without any test is much too
blunt. Why the rush?
Question: Should these patches be reverted till the stuff is properly
stabilized, tested and reviewed?
--
Fabien.
On Sat, Dec 26, 2020 at 06:16:37PM +0900, Michael Paquier wrote:
On Sat, Dec 26, 2020 at 08:29:10AM +0100, Pavel Stehule wrote:
I did recheck with same result
The Makefile of pg_alterckey is busted, and adding --enable-tap-tests
to the options of ./configure is enough to see a failure. In short,
src/bin/pg_alterckey/Makefile includes the following lines, but it has
no TAP tests as of pg_alterckey/t/:
check:
$(prove_check)installcheck:
$(prove_installcheck)So this breaks.
OK, fixed.
I would like to point out that all non-Unix buildfarm members are
broken like fairywen because of the addition of those scripts:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2020-12-26%2009%3A04%3A27
/usr/bin/install: cannot stat 'crypto/ckey_aws.sh.sample': No such
file or directory
Also fixed. I got too fancy with Makefile magic.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Sat, Dec 26, 2020 at 06:16:37PM +0900, Michael Paquier wrote:
The CF bot at http://cfbot.cputube.org/ includes tests on Windows, so
those problems would have been detected beforehand. Did you look at
these? If this cannot be fixed, could it be possible to revert
please? It looks rather clear that this has not been tested across
multiple platforms, and the absence of tests to allow the buildfarm to
stress this code does not really help either in gaining confidence
that this is stable.
I have been working on this patch for almost two months, and wanted to
try to get it into the tree near Christmas as sort of a Christmas
present to the community. It has been a tough year, and I know there
are a many users waiting for this feature, even though these commits
only get us a small way to the goal. I also felt the tree would be quiet
so if I broke it, the disruption would be minor.
What I did learn from this is that the thing I was most concerned about,
the crypto code and the OpenSSL API changing between OpenSSL versions,
was a non-issue, but I forgot to test for the no-OpenSSL case, forgot to
do more spellchecking, and didn't check the TAP tests. Windows was more
of a minor issue than I thought.
The only guaranteed user-visible feature for PG 14 is the ability to
have ssl_passphrase_command prompt from the terminal when started via
pg_ctl. All other parts of this patch series will have to be
disabled/hidden by the time we get to PG 14 beta unless we can get the
data encryption part into the tree before then.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
Bruce Momjian <bruce@momjian.us> writes:
On Sat, Dec 26, 2020 at 06:16:37PM +0900, Michael Paquier wrote:
The CF bot at http://cfbot.cputube.org/ includes tests on Windows, so
those problems would have been detected beforehand. Did you look at
these? If this cannot be fixed, could it be possible to revert
please? It looks rather clear that this has not been tested across
multiple platforms, and the absence of tests to allow the buildfarm to
stress this code does not really help either in gaining confidence
that this is stable.
I have been working on this patch for almost two months, and wanted to
try to get it into the tree near Christmas as sort of a Christmas
present to the community.
It's not much of a "Christmas present" to have to spend part of the
holiday fixing somebody else's obviously under-baked commits.
I think you should revert *all* of this and not come back until
(a) you have some test cases and (b) somebody besides you has read
the entire patch. There are way too many beginner-level errors in
what you've pushed so far. And that's not even counting the question
of whether the design is right to begin with, which somebody already
expressed doubts about.
regards, tom lane
On Sat, Dec 26, 2020 at 11:45:41AM -0500, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
On Sat, Dec 26, 2020 at 06:16:37PM +0900, Michael Paquier wrote:
The CF bot at http://cfbot.cputube.org/ includes tests on Windows, so
those problems would have been detected beforehand. Did you look at
these? If this cannot be fixed, could it be possible to revert
please? It looks rather clear that this has not been tested across
multiple platforms, and the absence of tests to allow the buildfarm to
stress this code does not really help either in gaining confidence
that this is stable.I have been working on this patch for almost two months, and wanted to
try to get it into the tree near Christmas as sort of a Christmas
present to the community.It's not much of a "Christmas present" to have to spend part of the
holiday fixing somebody else's obviously under-baked commits.
I thought I could just fix it on my own on that day.
I think you should revert *all* of this and not come back until
(a) you have some test cases and (b) somebody besides you has read
the entire patch. There are way too many beginner-level errors in
what you've pushed so far.
Magnus pointed out that one big help would have been to add it to the
commitfest, so the cfbot could have warned me of these issues. I
underestimated how much help that would have been. I am also using my
old workflow since I haven't worked on any large patches like this
lately. I guess I am relearning.
I can easily revert and come back, though the buildfarm is green now.
As far as testing, I can test that the cluster key unlocks the data
keys, but there is no current interface to the data keys. Ideally we
would test the full input/output path, but with no access to the output,
how can we test it? Also, as I stated, there are some shell script APIs
that can't easily be tested, e.g. AWS, Yubikey. I can continue to test
those manually.
And that's not even counting the question
of whether the design is right to begin with, which somebody already
expressed doubts about.
What doubts?
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Sat, Dec 26, 2020 at 12:18:18PM -0500, Bruce Momjian wrote:
I can easily revert and come back, though the buildfarm is green now.
As far as testing, I can test that the cluster key unlocks the data
keys, but there is no current interface to the data keys. Ideally we
would test the full input/output path, but with no access to the output,
how can we test it? Also, as I stated, there are some shell script APIs
that can't easily be tested, e.g. AWS, Yubikey. I can continue to test
those manually.
I have an idea. I can add debug code that outputs key 0 as hex to the
file descriptor open to the terminal, and compare the initdb value to
the server-start value. The existing code doesn't do anything useful,
so having debug output until it does seems acceptable, if it is enabled.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Sat, Dec 26, 2020 at 06:03:00AM -0400, Fabien COELHO wrote:
The feeling I expressed early in the thread is that the design should be
extendable, so that it does not fit only one particular use-case but fail at
any other that were not the author's, and a large reimplementation would be
needed to get them in. I was thinking of having a closer look with that in
mind. In particular and IMHO, the "master key" should not (necessarily) be
hold by a postgres process, not sure what the current status is, but that is
an example.
Hmm. That sounds like a fair concern to me. Based on the information
given by the docs, three keys are actually created/used at initdb
time:
- One for the relation files, that does not have to be shared across
the nodes in the cluster.
- One for the WAL files, that has to be shared across the nodes of the
cluster or no physical replication could happen.
- One to encrypt the first two ones.
The first two keys are stored in pg_cryptokeys/ in the data directory,
while the third one is retrieved using a GUC for validation at server
startup for the other two. Do we necessarily have to store the first
level keys within the data directory? I guess that this choice has
been made for performance, but is that really something that a user
would want all the time? AES256 is the only option available for the
data keys. What if somebody wants to roll in their own encryption?
Companies can have many requirements in terms of accepting the use of
one option or another.
--
Michael
On Sun, Dec 27, 2020 at 10:11:17AM +0900, Michael Paquier wrote:
Hmm. That sounds like a fair concern to me. Based on the information
given by the docs, three keys are actually created/used at initdb
time:
- One for the relation files, that does not have to be shared across
the nodes in the cluster.
- One for the WAL files, that has to be shared across the nodes of the
cluster or no physical replication could happen.
- One to encrypt the first two ones.The first two keys are stored in pg_cryptokeys/ in the data directory,
while the third one is retrieved using a GUC for validation at server
startup for the other two. Do we necessarily have to store the first
level keys within the data directory? I guess that this choice has
been made for performance, but is that really something that a user
would want all the time? AES256 is the only option available for the
data keys. What if somebody wants to roll in their own encryption?
To clarify, we encrypt the data keys using AES256, but the data keys
themselves can be 128, 192, or 256 bits.
Companies can have many requirements in terms of accepting the use of
one option or another.
I think ultimately we will need three commands to control the keys.
First, there is the cluster_key_command, which we have now. Second, I
think we will need an optional command which returns random bytes ---
this would allow users to get random bytes from a different source than
that used by the server code.
Third, we will probably need a command that returns the data encryption
keys directly, either heap/index or WAL keys, probably based on key
number --- you pass the key number you want, and the command returns the
data key. There would not be a cluster key in this case, but the
command could still prompt the user for perhaps a password to the KMS
server. It could not be used if any of the previous two commands are
used. I assume an HMAC would still be stored in the pg_cryptokeys
directory to check that the right key has been returned.
I thought we should implement the first command, because it will
probably be the most common and easiest to use, and then see what people
want added.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Sat, Dec 26, 2020 at 02:00:02PM -0500, Bruce Momjian wrote:
On Sat, Dec 26, 2020 at 12:18:18PM -0500, Bruce Momjian wrote:
I can easily revert and come back, though the buildfarm is green now.
As far as testing, I can test that the cluster key unlocks the data
keys, but there is no current interface to the data keys. Ideally we
would test the full input/output path, but with no access to the output,
how can we test it? Also, as I stated, there are some shell script APIs
that can't easily be tested, e.g. AWS, Yubikey. I can continue to test
those manually.
It seems to me that it is better to figure out that while the feature
is being developed, not after committing it so as there are
fully-functional tests at the same time the feature is committed. If
we don't have that in place, how can people know the amount of testing
that has been done for this feature? And how can anybody be sure that
nothing breaks if this area of the code gets changed? Manual testing
does not scale. For example, I have seen cases in the past where
implementing tests improved the whole state of a feature design
because it was possible to finish with a more flexible set of methods
for this feature.
Based on the number of concerns raised by various people over the last
couple of days (including myself, one point being the refactoring of
the ciphers taken from pgcrypto that should have been in its own
commit), I agree that it would be better to revert this code for now.
--
Michael
On Sun, Dec 27, 2020 at 05:48:47PM +0900, Michael Paquier wrote:
On Sat, Dec 26, 2020 at 02:00:02PM -0500, Bruce Momjian wrote:
On Sat, Dec 26, 2020 at 12:18:18PM -0500, Bruce Momjian wrote:
I can easily revert and come back, though the buildfarm is green now.
As far as testing, I can test that the cluster key unlocks the data
keys, but there is no current interface to the data keys. Ideally we
would test the full input/output path, but with no access to the output,
how can we test it? Also, as I stated, there are some shell script APIs
that can't easily be tested, e.g. AWS, Yubikey. I can continue to test
those manually.It seems to me that it is better to figure out that while the feature
is being developed, not after committing it so as there are
fully-functional tests at the same time the feature is committed. If
we don't have that in place, how can people know the amount of testing
that has been done for this feature? And how can anybody be sure that
nothing breaks if this area of the code gets changed? Manual testing
does not scale. For example, I have seen cases in the past where
implementing tests improved the whole state of a feature design
because it was possible to finish with a more flexible set of methods
for this feature.Based on the number of concerns raised by various people over the last
couple of days (including myself, one point being the refactoring of
the ciphers taken from pgcrypto that should have been in its own
commit), I agree that it would be better to revert this code for now.
OK, I will do so in the next few hours. My followup will be to:
* register it for the commit-fest so it gets cfbot and other visibility
* modify pgcrypto to use the new AES API (the SHA512 call no longer exists)
* develop TAP tests, though as I mentioned, they will be odd
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Sun, Dec 27, 2020 at 12:44:50PM -0500, Bruce Momjian wrote:
Based on the number of concerns raised by various people over the last
couple of days (including myself, one point being the refactoring of
the ciphers taken from pgcrypto that should have been in its own
commit), I agree that it would be better to revert this code for now.OK, I will do so in the next few hours. My followup will be to:
* register it for the commit-fest so it gets cfbot and other visibility
* modify pgcrypto to use the new AES API (the SHA512 call no longer exists)
* develop TAP tests, though as I mentioned, they will be odd
Patch set reverted. This was obviously a bad idea. I also apologize to
anyone who was disturbed on Christmas by my activity.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On 12/27/20 12:44 PM, Bruce Momjian wrote:
Based on the number of concerns raised by various people over the last
couple of days (including myself, one point being the refactoring of
the ciphers taken from pgcrypto that should have been in its own
commit), I agree that it would be better to revert this code for now.OK, I will do so in the next few hours. My followup will be to:
* register it for the commit-fest so it gets cfbot and other visibility
* modify pgcrypto to use the new AES API (the SHA512 call no longer exists)
* develop TAP tests, though as I mentioned, they will be odd
Bruce
I think this is a wise course.
The cfbot and the vanilla buildfarm can't test some things easily, and
this might be such a case. That's where custom buildfarm modules can
prove useful. For example, we have one that is used by rhinoceros to run
the SEPgsql tests. ISTM this might well be another case. I can work with
you to develop such a module.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Hello Bruce,
I put the thread back on hackers.
The first two keys are stored in pg_cryptokeys/ in the data directory,
while the third one is retrieved using a GUC for validation at server
startup for the other two.
Do we necessarily have to store the first level keys within the data
directory? I guess that this choice has been made for performance, but
is that really something that a user would want all the time? AES256
is the only option available for the data keys. What if somebody wants
to roll in their own encryption?To clarify, we encrypt the data keys using AES256, but the data keys
themselves can be 128, 192, or 256 bits.Companies can have many requirements in terms of accepting the use of
one option or another.I think ultimately we will need three commands to control the keys.
First, there is the cluster_key_command, which we have now. Second, I
think we will need an optional command which returns random bytes ---
this would allow users to get random bytes from a different source than
that used by the server code.Third, we will probably need a command that returns the data encryption
keys directly, either heap/index or WAL keys, probably based on key
number --- you pass the key number you want, and the command returns the
data key. There would not be a cluster key in this case, but the
command could still prompt the user for perhaps a password to the KMS
server. It could not be used if any of the previous two commands are
used. I assume an HMAC would still be stored in the pg_cryptokeys
directory to check that the right key has been returned.
Yep, my point is that it should be possible to have the whole key
management outside of postgres.
This said, postgres should provide a reasonable default implementation,
obviously, preferably by using the provided mechanism (*NOT* a direct
internal implementation and a possible switch to something else, IMHO,
because then it would not be tested for whether it provides the right
level of usability).
I agree that keys need to be identified. I somehow disagree with the
naming of the script and the implied usage.
ISTM that there could be an external interface:
- to initialize something. It may start a suid process, it may connect to
a remote host, it may ask for a master password, who knows?
/path/to/init --options arguments…
the init process would return something which would be reused later on, eg
an authentication token, or maybe a path to a socket for communication, or
a file which contains something, or even a master/cluster key, but not
necessarily. It may be anything. How it is passed to the next
process/connection is an open question. Maybe on its stdin?
- to start a process (?) which provide keys, either created (new) or
existing (get), and possibly destroy them (or not?). The init process
result should/could be passed somehow to this process, which may be suid
something else. Another option would be to rely on some IPC mechanism.
I'm not sure what the best choice is.
ISTM that this process/connection could/should be persistent, with a
simplistic text or binary based client/server interface. What this
process/connection does it beyond postgres. In my mind, it could implement
getting random data as well. I'd suggest that under no circumstances
should the postgres process create cryptographic keys, although it should
probably name them with some predefine length limit.
/path/to/run --options arguments…
Then there should be an postgres internal interface to store the results
for local processing, retrieve them when needed, and so on, ok.
ISTM that there should also be an internal interface to load the
cryptographic primitives. Possibly a so/dll would do, or maybe just an
extension mechanism which would provide the necessary functions, but this
raise the issue of bootstraping, so maybe not so great an idea. The
functions should probably be able to implement a counter mode, so that
actual keys depend on the page position in file position, but what is
really does is not postgres concern.
A cryptographic concern for me is whether it would be possible to have
authentication/integrity checks associated to each page. This means having
the ability to reserve some space somewhere, possibly 8-16 bytes, in a
page. Different algorithm could have different space requirements.
The same interface should be used by other back-end commands (pg_upgrade,
whatever).
Somehow, the design should be abstract, without implying much, so that
very different interfaces could be provided in term of whether there
exists a master key, how keys are derived, what key sizes are, what
algorithms are used, and so on. Postgres itself should not store keys,
only key identifiers.
I'm wondering whether replication should be able to work without some/all
keys, so that a streaming replication could be implemented without the
remote host being fully aware of the cryptographic keys.
Another functional point is to allow changing the underlying key for a
file, and discuss how this could work with the interface, as I noted that
it was a desired feature. I'd suggest that maybe this should be based on
changing the name of the "key", so that the external key management would
not need to know about it. How to achieve that as a transaction is an open
question. Maybe it should be an change outside of postgres, which modifies
files at the cluster level with the database stopped.
I thought we should implement the first command, because it will
probably be the most common and easiest to use, and then see what people
want added.
I somehow disagree: I think that pg should provide from the start the full
generic interface, *and* a reasonable implementation which is what the
current proposal does, fine with me. A simplistic test-oriented interface
could be implemented in a scripting language. I think that great care must
be put upfront in the overall design, so that it can be reused later on by
people with pretty different requirements (in term of auditors, legal
constraints, functions, whatever). I would like to avoid providing an
half-baked design which suits some use-cases but cannot be used for
others, because of key design choices. From a number of line of code point
of view, it may not change much, really, this is more about design and
putting functionalities in the right places.
Now I intend to give some time to review patches with this in mind. Maybe
I'll have some time at the end of the next CF, or the next.
--
Fabien.
On Mon, Dec 28, 2020 at 08:49:09AM -0500, Andrew Dunstan wrote:
On 12/27/20 12:44 PM, Bruce Momjian wrote:
Based on the number of concerns raised by various people over the last
couple of days (including myself, one point being the refactoring of
the ciphers taken from pgcrypto that should have been in its own
commit), I agree that it would be better to revert this code for now.OK, I will do so in the next few hours. My followup will be to:
* register it for the commit-fest so it gets cfbot and other visibility
* modify pgcrypto to use the new AES API (the SHA512 call no longer exists)
* develop TAP tests, though as I mentioned, they will be oddBruce
I think this is a wise course.
The cfbot and the vanilla buildfarm can't test some things easily, and
this might be such a case. That's where custom buildfarm modules can
prove useful. For example, we have one that is used by rhinoceros to run
the SEPgsql tests. ISTM this might well be another case. I can work with
you to develop such a module.
Oh, wow, I never considered that would be possible. I think we are OK
not having automation of the Yubikey and AWS scripts, since they are
small and rarely change. I think we can get the rest done with normal
buildfarm members.
I am going to use the pg_upgrade TAP test as an example since it has
similar requirements and prior to its creation, I did my own testing of
that. I am not sure how to test prompting from /dev/tty.
Seems I am behind the times in learning how to write TAP tests and use
the buildfarm cfbot testing properly.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Mon, Dec 28, 2020 at 10:09:11AM -0400, Fabien COELHO wrote:
Yep, my point is that it should be possible to have the whole key management
outside of postgres.
I think this kind of discussion has to happen in a different thread,
parhsps:
/messages/by-id/20201227172526.GB17037@momjian.us
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee