Modern SHA2- based password hashes for pgcrypto

Started by Bernd Helmleover 1 year ago38 messageshackers
Jump to latest
#1Bernd Helmle
mailings@oopsware.de

Hi Hackers,

Some of you might already arrived 2025, so first a Happy New Year to
everyone already there ;)

Please find attached a patch to pgcrypto to add modern SHA-2 based
password hashes sha256crypt (256 bit) and sha512crypt (512 bit) for
crypt() and gen_salt() respectively. This is compatible on what crypt()
currently does on FreeBSD and Linux and both algorithms are considered
more secure than the currently implemented hashes.

I adapted the code from the publicly available reference implementation
at [1]https://www.akkadia.org/drepper/SHA-crypt.txt. It's based on our existing OpenSSL infrastructure in pgcrypto
and produces compatible password hashes with crypt() and "openssl
passwd" with "-5" and "-6" switches.

I documented the new supported hashes for pgcrypto, but didn't do
anything to update the benchmark table for the supported password
hashes.

Modern OS (at least Linux, BSDs) implementations for crypt() also
support yescrypt, which is the recommended (and default) password hash
algorithm there. I am also looking to implement that, but thought it
would be useful to have the SHA-2 based hashes first in the review.

I am going to add this patch to the upcoming january commitfest for
initial review.

[1]: https://www.akkadia.org/drepper/SHA-crypt.txt

--
Thanks,
Bernd

Attachments:

0001-Add-modern-SHA-2-based-password-hashes-to-pgcrypto.patchtext/x-patch; charset=UTF-8; name=0001-Add-modern-SHA-2-based-password-hashes-to-pgcrypto.patchDownload+849-3
#2Bernd Helmle
mailings@oopsware.de
In reply to: Bernd Helmle (#1)
Re: Modern SHA2- based password hashes for pgcrypto

Am Dienstag, dem 31.12.2024 um 17:06 +0100 schrieb Bernd Helmle:

I am going to add this patch to the upcoming january commitfest for
initial review.

I see cfbot fails Debian Bookworm with autoconf and on macOS with
meson. I will look into it.

#3Daniel Gustafsson
daniel@yesql.se
In reply to: Bernd Helmle (#1)
Re: Modern SHA2- based password hashes for pgcrypto

On 31 Dec 2024, at 17:06, Bernd Helmle <mailings@oopsware.de> wrote:

I adapted the code from the publicly available reference implementation
at [1]. It's based on our existing OpenSSL infrastructure in pgcrypto
and produces compatible password hashes with crypt() and "openssl
passwd" with "-5" and "-6" switches.

Potentially daft question, but since we require OpenSSL to build pgcrypto, why
do we need to include sha2 code instead of using the sha2 implementation in
libcrypto? How complicated would it be to use the OpenSSL API instead?

--
Daniel Gustafsson

#4Bernd Helmle
mailings@oopsware.de
In reply to: Daniel Gustafsson (#3)
Re: Modern SHA2- based password hashes for pgcrypto

Am Donnerstag, dem 02.01.2025 um 15:57 +0100 schrieb Daniel Gustafsson:

I adapted the code from the publicly available reference
implementation
at [1]. It's based on our existing OpenSSL infrastructure in
pgcrypto
and produces compatible password hashes with crypt() and "openssl
passwd" with "-5" and "-6" switches.

Potentially daft question, but since we require OpenSSL to build
pgcrypto, why
do we need to include sha2 code instead of using the sha2
implementation in
libcrypto? How complicated would it be to use the OpenSSL API
instead?

Not sure i got you, but i use OpenSSL and the SHA2 implementation
there. See the pgcrypto px_* API (px.h and openssl.c respectively) i am
using to create the digests.

Thanks,
Bernd

#5Daniel Gustafsson
daniel@yesql.se
In reply to: Bernd Helmle (#4)
Re: Modern SHA2- based password hashes for pgcrypto

On 2 Jan 2025, at 16:17, Bernd Helmle <mailings@oopsware.de> wrote:

Am Donnerstag, dem 02.01.2025 um 15:57 +0100 schrieb Daniel Gustafsson:

I adapted the code from the publicly available reference
implementation
at [1]. It's based on our existing OpenSSL infrastructure in
pgcrypto
and produces compatible password hashes with crypt() and "openssl
passwd" with "-5" and "-6" switches.

Potentially daft question, but since we require OpenSSL to build
pgcrypto, why
do we need to include sha2 code instead of using the sha2
implementation in
libcrypto? How complicated would it be to use the OpenSSL API
instead?

Not sure i got you, but i use OpenSSL and the SHA2 implementation
there. See the pgcrypto px_* API (px.h and openssl.c respectively) i am
using to create the digests.

Sorry, skimming the patch I misread it, nevermind.

--
Daniel Gustafsson

#6Japin Li
japinli@hotmail.com
In reply to: Bernd Helmle (#1)
Re: Modern SHA2- based password hashes for pgcrypto

On Tue, 31 Dec 2024 at 17:06, Bernd Helmle <mailings@oopsware.de> wrote:

Hi Hackers,

Some of you might already arrived 2025, so first a Happy New Year to
everyone already there ;)

Please find attached a patch to pgcrypto to add modern SHA-2 based
password hashes sha256crypt (256 bit) and sha512crypt (512 bit) for
crypt() and gen_salt() respectively. This is compatible on what crypt()
currently does on FreeBSD and Linux and both algorithms are considered
more secure than the currently implemented hashes.

I adapted the code from the publicly available reference implementation
at [1]. It's based on our existing OpenSSL infrastructure in pgcrypto
and produces compatible password hashes with crypt() and "openssl
passwd" with "-5" and "-6" switches.

I documented the new supported hashes for pgcrypto, but didn't do
anything to update the benchmark table for the supported password
hashes.

Modern OS (at least Linux, BSDs) implementations for crypt() also
support yescrypt, which is the recommended (and default) password hash
algorithm there. I am also looking to implement that, but thought it
would be useful to have the SHA-2 based hashes first in the review.

I am going to add this patch to the upcoming january commitfest for
initial review.

[1] https://www.akkadia.org/drepper/SHA-crypt.txt

Greate! I have some questions after using it.

When I use the following query, it crashed!

[local]:4012204 postgres=# select crypt('hello', '$5$$6$rounds=10000$/Zg436s2vmTwsoSz');
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
: !?>

It is caused by checking the shacrypt digest. The following can fix this crash,
but I'm unsure if it is correct.

@@ -151,8 +152,7 @@ px_crypt_shacrypt(const char *pw, const char *salt, char *passwd, unsigned dstle
         type = PGCRYPTO_SHA256CRYPT;
         dec_salt_binary += strlen(magic_bytes[0]);
     }
-
-    if (strncmp(dec_salt_binary, magic_bytes[1], strlen(magic_bytes[1])) == 0)
+    else if (strncmp(dec_salt_binary, magic_bytes[1], strlen(magic_bytes[1])) == 0)
     {
         type = PGCRYPTO_SHA512CRYPT;
         dec_salt_binary += strlen(magic_bytes[1]);

Another, if I run crypt with big rounds, it cannot be canceled, for example:

select crypt('hello', '$5$rounds=9999999999$/Zg436s2vmTwsoSz');

Maybe we should add CHECK_FOR_INTERRUPTS() in step 21.

@@ -411,6 +411,7 @@ px_crypt_shacrypt(const char *pw, const char *salt, char *passwd, unsigned dstle
* uses the notation "digest A/B" to describe this behavior.
*/
for (block = 0; block < rounds; block++) {
+ CHECK_FOR_INTERRUPTS();

/* a) start digest B */
px_md_reset(digestB);

--
Regrads,
Japin Li

#7Bernd Helmle
mailings@oopsware.de
In reply to: Bernd Helmle (#1)
Re: Modern SHA2- based password hashes for pgcrypto

Am Freitag, dem 03.01.2025 um 23:57 +0800 schrieb Japin Li:

Greate!  I have some questions after using it.

When I use the following query, it crashed!

[local]:4012204 postgres=# select crypt('hello',
'$5$$6$rounds=10000$/Zg436s2vmTwsoSz');
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
: !?>

It is caused by checking the shacrypt digest.  The following can fix
this crash,
but I'm unsure if it is correct.

Hmm, can you provide a backtrace? I am currently debugging the CI
results and i currently have a thinko in my code at crypt-sha.c:530, i
am using the wrong length to copy the result buffer, see

https://api.cirrus-ci.com/v1/artifact/task/6395274957946880/log/contrib/pgcrypto/log/postmaster.log

==33750==ERROR: AddressSanitizer: stack-buffer-overflow on address
0x7ffc385204c7 at pc 0x7fd65a24814b bp 0x7ffc38520240 sp 0x7ffc3851f9f0
READ of size 128 at 0x7ffc385204c7 thread T0
#0 0x7fd65a24814a in __interceptor_memcpy
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_intercep
tors.inc:827
#1 0x7fd64c37c25f in px_crypt_shacrypt /tmp/cirrus-ci-
build/contrib/pgcrypto/crypt-sha.c:530
#2 0x7fd64c3993f0 in run_crypt_sha /tmp/cirrus-ci-
build/contrib/pgcrypto/px-crypt.c:76

But i cannot reproduce your crash in this case, maybe this is related
to the above issue...

@@ -151,8 +152,7 @@ px_crypt_shacrypt(const char *pw, const char
*salt, char *passwd, unsigned dstle
         type = PGCRYPTO_SHA256CRYPT;
         dec_salt_binary += strlen(magic_bytes[0]);
     }
-
-    if (strncmp(dec_salt_binary, magic_bytes[1],
strlen(magic_bytes[1])) == 0)
+    else if (strncmp(dec_salt_binary, magic_bytes[1],
strlen(magic_bytes[1])) == 0)
     {
         type = PGCRYPTO_SHA512CRYPT;
         dec_salt_binary += strlen(magic_bytes[1]);

Yeah, seems the check is not strict enough for the magic byte and i
need to be more strict here. I am going to look into this. "$5$$6$"
certainly is bogus enough that we should error out instead.

Another, if I run crypt with big rounds, it cannot be canceled, for
example:

select crypt('hello', '$5$rounds=9999999999$/Zg436s2vmTwsoSz');

Maybe we should add CHECK_FOR_INTERRUPTS() in step 21.

@@ -411,6 +411,7 @@ px_crypt_shacrypt(const char *pw, const char
*salt, char *passwd, unsigned dstle
      *    uses the notation "digest A/B" to describe this behavior.
      */
     for (block = 0; block < rounds; block++) {
+        CHECK_FOR_INTERRUPTS();

         /* a) start digest B */
         px_md_reset(digestB);

Hmm not sure how expensive it is to check for interrupts for each
block. Maybe its better to check for each 10.000 blocks or so. But i
like the idea. Will investigate.

Thanks for your review,

Bernd

#8Japin Li
japinli@hotmail.com
In reply to: Bernd Helmle (#7)
Re: Modern SHA2- based password hashes for pgcrypto

On Fri, 03 Jan 2025 at 17:55, Bernd Helmle <mailings@oopsware.de> wrote:

Am Freitag, dem 03.01.2025 um 23:57 +0800 schrieb Japin Li:

Greate!  I have some questions after using it.

When I use the following query, it crashed!

[local]:4012204 postgres=# select crypt('hello',
'$5$$6$rounds=10000$/Zg436s2vmTwsoSz');
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
: !?>

It is caused by checking the shacrypt digest.  The following can fix
this crash,
but I'm unsure if it is correct.

Hmm, can you provide a backtrace? I am currently debugging the CI
results and i currently have a thinko in my code at crypt-sha.c:530, i
am using the wrong length to copy the result buffer, see

Here is the backtrace.

(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007ba852a4526e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007ba852a288ff in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007ba852a297b6 in __libc_message_impl (fmt=fmt@entry=0x7ba852bce765 "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:132
#6 0x00007ba852b36c19 in __GI___fortify_fail (msg=msg@entry=0x7ba852bce77d "stack smashing detected") at ./debug/fortify_fail.c:24
#7 0x00007ba852b37ea4 in __stack_chk_fail () at ./debug/stack_chk_fail.c:24
#8 0x00007ba853d8cb2e in px_crypt_shacrypt (pw=0x64e191229370 "hello", salt=0x64e191229388 "$5$$6$rounds=10000$/Zg436s2vmTwsoSz", passwd=0x64e1912293d8 "$6$rounds=10000$/Zg436s2vmTwsoSz$TTCnOO7S5pkJHBVJ.oL74WN1Yt0n1RfQOWd60CRb4xtd9q7ChipyZ00jwYZfhDGRRJOoJNOgYKAVGpdmA8qhT1", dstlen=128) at /data/Codes/pg/master/build/../contrib/pgcrypto/crypt-sha.c:546
#9 0x00007ba853d9d9f8 in run_crypt_sha (psw=<error reading variable: Cannot access memory at address 0x31546871384155>, salt=<error reading variable: Cannot access memory at address 0x3154687138414d>, buf=<error reading variable: Cannot access memory at address 0x31546871384145>, len=<error reading variable: Cannot access memory at address 0x31546871384141>) at /data/Codes/pg/master/build/../contrib/pgcrypto/px-crypt.c:76
Backtrace stopped: Cannot access memory at address 0x31546871384175

--
Regrads,
Japin Li

#9Bernd Helmle
mailings@oopsware.de
In reply to: Japin Li (#8)
Re: Modern SHA2- based password hashes for pgcrypto

Am Samstag, dem 04.01.2025 um 08:19 +0800 schrieb Japin Li:

(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6,
threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at
./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>,
signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ba852a4526e in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#4  0x00007ba852a288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ba852a297b6 in __libc_message_impl
(fmt=fmt@entry=0x7ba852bce765 "*** %s ***: terminated\n") at
../sysdeps/posix/libc_fatal.c:132
#6  0x00007ba852b36c19 in __GI___fortify_fail
(msg=msg@entry=0x7ba852bce77d "stack smashing detected") at
./debug/fortify_fail.c:24
#7  0x00007ba852b37ea4 in __stack_chk_fail () at
./debug/stack_chk_fail.c:24
#8  0x00007ba853d8cb2e in px_crypt_shacrypt (pw=0x64e191229370
"hello", salt=0x64e191229388 "$5$$6$rounds=10000$/Zg436s2vmTwsoSz",
passwd=0x64e1912293d8
"$6$rounds=10000$/Zg436s2vmTwsoSz$TTCnOO7S5pkJHBVJ.oL74WN1Yt0n1RfQOWd
60CRb4xtd9q7ChipyZ00jwYZfhDGRRJOoJNOgYKAVGpdmA8qhT1", dstlen=128) at
/data/Codes/pg/master/build/../contrib/pgcrypto/crypt-sha.c:546

Thank you very much.

This points to the same area i've investigated and it turned out i
confused length macros for the password salt, leading to the out_buf
buffer in px_crypt_shacrypt() being too small, stupid me.

Attached is a new version of the patch:

- Use correct length for the out_buf result buffer in
px_crypt_shacrypt()

- Silent a compiler warning in the error goto branch in
px_crypt_shacrypt()

- Don't accept rounds values lower or larger than supported (previously
they were silently changed to minimum/maximum).

- Per your suggestions, use CHECK_FOR_INTERRUPTS() during block
calculation to make it possible to interrupt the code when using large
values for the rounds option and just rely on the very first three
bytes when parsing the magic bytes of the salt string.

Please test again.

Bernd

Attachments:

0001-Add-modern-SHA-2-based-password-hashes-to-pgcrypto-v2.patchtext/x-patch; charset=UTF-8; name=0001-Add-modern-SHA-2-based-password-hashes-to-pgcrypto-v2.patchDownload+882-3
#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bernd Helmle (#9)
Re: Modern SHA2- based password hashes for pgcrypto

Hello

I was passing by and I noticed that this needs badly pgindent, and some
comments are enumerations that would lose formatting once through it.
For example, this would happen which is not good:

    /*
-    * 1. Start digest A
-    * 2. Add the password string to digest A
-    * 3. Add the salt to digest A
+    * 1. Start digest A 2. Add the password string to digest A 3. Add the
+    * salt to digest A
     */

I suggest you can fix this by adding a "-" sign to the opening "/*" line
so that pgindent doesn't mangle it (so it becomes /*- ). This appears
in several places.

It's been said in my presence that pgcrypto is obsolete and shouldn't be
used anymore. I'm not sure I believe that, but even if that's true,
it's clear that there's plenty of people who has an interest on it, so I
don't see that as an objection to reject this work. So let's move on.

Your test data (crypt-shacrypt.sql) looks a bit short. I noticed that
Drepper's SHA-crypt.txt file has a bunch of test lines (starting with
the "Test vectors from FIPS 180-2: appendix B.1." comment line, as well
as "appendix C.1" at the bottom) which perhaps could be incorporated
into the .sql script, to ensure correctness (or at least,
bug-compatibility with the reference implementation). I'd also add a
note that Drepper's implementation is public domain in crypt-sha.c's
license block.

I think the "ascii_dollar" variable is a bit useless. Why not just use the
literal $ sign where needed (or a DOLLAR_CHR if we feel like avoiding a
magic value there)? Also, I wonder if it would be better to use a
StringInfo instead of a fixed-size buffer, which would probably make
some string manipulations easier ... Or maybe not, but let's not mix
strlcat calls with strncat calls with no comments about why.

Some of your elog(ERROR)s should probably be ereport(), and I'm not sure
we want all the elog(DEBUG1)s.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"El que vive para el futuro es un iluso, y el que vive para el pasado,
un imbécil" (Luis Adler, "Los tripulantes de la noche")

#11Bernd Helmle
mailings@oopsware.de
In reply to: Alvaro Herrera (#10)
Re: Modern SHA2- based password hashes for pgcrypto

Hi Alvaro,

Am Montag, dem 13.01.2025 um 17:06 +0100 schrieb Alvaro Herrera:

Hello

I was passing by and I noticed that this needs badly pgindent, and
some
comments are enumerations that would lose formatting once through it.
For example, this would happen which is not good:

    /*
-    * 1. Start digest A
-    * 2. Add the password string to digest A
-    * 3. Add the salt to digest A
+    * 1. Start digest A 2. Add the password string to digest A 3.
Add the
+    * salt to digest A
     */

I didn't think about testing pg_ident, sorry.

I suggest you can fix this by adding a "-" sign to the opening "/*"
line
so that pgindent doesn't mangle it (so it becomes /*- ).  This
appears
in several places.

Will try.

It's been said in my presence that pgcrypto is obsolete and shouldn't
be
used anymore.  I'm not sure I believe that, but even if that's true,
it's clear that there's plenty of people who has an interest on it,
so I
don't see that as an objection to reject this work.  So let's move
on.

Oh, that's news to me. Is there a plan for it somewhere? I agree that
pgcrypto is widley used and needs a proper replacement when we decide
to deprecate it.

Your test data (crypt-shacrypt.sql) looks a bit short.  I noticed
that
Drepper's SHA-crypt.txt file has a bunch of test lines (starting with
the "Test vectors from FIPS 180-2: appendix B.1." comment line, as
well
as "appendix C.1" at the bottom) which perhaps could be incorporated
into the .sql script, to ensure correctness (or at least,
bug-compatibility with the reference implementation).  I'd also add a
note that Drepper's implementation is public domain in crypt-sha.c's
license block.

Good idea. Will do.

I think the "ascii_dollar" variable is a bit useless.  Why not just
use the
literal $ sign where needed (or a DOLLAR_CHR if we feel like avoiding
a

magic value there)?  Also, I wonder if it would be better to use a
StringInfo instead of a fixed-size buffer, which would probably make
some string manipulations easier ... Or maybe not, but let's not mix
strlcat calls with strncat calls with no comments about why.

I am going to give it a try, probably helps to get rid of the
strncat()/strlcat() mess.

I originally thought about StringInfo but went with just the fixed
length character buffers since we access them directly anyways (and the
px_*/OpenSSL API needs char * ).

Some of your elog(ERROR)s should probably be ereport(), and I'm not
sure
we want all the elog(DEBUG1)s.

I added them during development. I am not married to them, but found
them useful during testing. If we come to the conclusion they're not
really that important, i drop them entirely.

Thanks for your comments!

Bernd

#12Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bernd Helmle (#11)
Re: Modern SHA2- based password hashes for pgcrypto

Hello Bernd,

On 2025-Jan-14, Bernd Helmle wrote:

It's been said in my presence that pgcrypto is obsolete and
shouldn't be used anymore.  I'm not sure I believe that, but even if
that's true, it's clear that there's plenty of people who has an
interest on it, so I don't see that as an objection to reject this
work.  So let's move on.

Oh, that's news to me. Is there a plan for it somewhere? I agree that
pgcrypto is widley used and needs a proper replacement when we decide
to deprecate it.

I don't know about a plan, but
https://www.youtube.com/watch?v=pp6xdr3TuWM&amp;t=1088s

I originally thought about StringInfo but went with just the fixed
length character buffers since we access them directly anyways (and the
px_*/OpenSSL API needs char * ).

Note that you can access the char * via StringInfo->data. Just don't
modify it without the StringInfo APIs.

Some of your elog(ERROR)s should probably be ereport(), and I'm not
sure we want all the elog(DEBUG1)s.

I added them during development. I am not married to them, but found
them useful during testing. If we come to the conclusion they're not
really that important, i drop them entirely.

Yeah, the DEBUGs are a pretty minor issue -- it's easy to remove them
afterwards. For any actual error condition that's not a "can't happen"
one, please use ereport() for consistency. (There's no translation
support for contrib, so they won't be translated anyway.)

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"No deja de ser humillante para una persona de ingenio saber
que no hay tonto que no le pueda enseñar algo." (Jean B. Say)

#13Bernd Helmle
mailings@oopsware.de
In reply to: Alvaro Herrera (#12)
Re: Modern SHA2- based password hashes for pgcrypto

Am Dienstag, dem 14.01.2025 um 11:47 +0100 schrieb Alvaro Herrera:

Oh, that's news to me. Is there a plan for it somewhere? I agree
that
pgcrypto is widley used and needs a proper replacement when we
decide
to deprecate it.

I don't know about a plan, but
https://www.youtube.com/watch?v=pp6xdr3TuWM&amp;t=1088s

Hmm, I understand this like Peter tries to make a point regarding
transparent data encryption solutions, which pgcrypto doesn't serve
very well. Maybe it's not a point for total deprecation but that we
need additional, better solutions (for sure)...

Thanks,

Bernd

#14Bernd Helmle
mailings@oopsware.de
In reply to: Alvaro Herrera (#10)
Re: Modern SHA2- based password hashes for pgcrypto

Hi,

Please find attached a reworked patch according Alvaro's comments.

Am Montag, dem 13.01.2025 um 17:06 +0100 schrieb Alvaro Herrera:

Hello

I was passing by and I noticed that this needs badly pgindent, and
some
comments are enumerations that would lose formatting once through it.
For example, this would happen which is not good:

    /*
-    * 1. Start digest A
-    * 2. Add the password string to digest A
-    * 3. Add the salt to digest A
+    * 1. Start digest A 2. Add the password string to digest A 3.
Add the
+    * salt to digest A
     */

I suggest you can fix this by adding a "-" sign to the opening "/*"
line
so that pgindent doesn't mangle it (so it becomes /*- ).  This
appears
in several places.

This is even documented:

https://www.postgresql.org/docs/devel/source-format.html

I ran pgindent locally and verified it works like you suggested.

[...]

Your test data (crypt-shacrypt.sql) looks a bit short.  I noticed
that
Drepper's SHA-crypt.txt file has a bunch of test lines (starting with
the "Test vectors from FIPS 180-2: appendix B.1." comment line, as
well
as "appendix C.1" at the bottom) which perhaps could be incorporated
into the .sql script, to ensure correctness (or at least,
bug-compatibility with the reference implementation).  I'd also add a
note that Drepper's implementation is public domain in crypt-sha.c's
license block.

These FIPS cases tests explicit against sha{256|512}_process_bytes() in
Drepper's code, which seem to be the equivalent to OpenSSL's
EVP_Digest*() API we're using. I am not sure it makes sense to adapt
them.

I left them out for now but adapted all the testcases for the password
hashes defined in the 2nd test cases to make sure we create the same
hashes.

I think the "ascii_dollar" variable is a bit useless.  Why not just
use the
literal $ sign where needed (or a DOLLAR_CHR if we feel like avoiding
a
magic value there)?  Also, I wonder if it would be better to use a
StringInfo instead of a fixed-size buffer, which would probably make
some string manipulations easier ... Or maybe not, but let's not mix
strlcat calls with strncat calls with no comments about why.

Result buffer via out_buf is wrapped into a StringInfo now.

Some of your elog(ERROR)s should probably be ereport(), and I'm not
sure
we want all the elog(DEBUG1)s.

Done, but i kept some of the DEBUG output for now. I am not sure if i
really should set the errcode for ereport() explicitly, but i did on
some places where i thought it might be useful.

This patch version also changes the original behavior of crypt() when
called for sha{256|512}crypt hashes. Before we didn't accept values for
the rounds parameter exceeding the minimum and maximum values. This was
along with gen_salt(), which also errors out in such cases. But
Drepper's code accepts them and changes them behind the scenes either
to the minimum or maximum and calculates the password hash based on
them, depending on the exceeded boundary.

So when passing a user defined salt string to crypt(), we do the same
now but i added a NOTICE to indicate that we changed the user submitted
parameter behind the scene. This also enables us to stress
px_crypt_shacrypt() with the same test cases as Drepper's code.

Thanks,

Bernd

Attachments:

0001-Add-modern-SHA-2-based-password-hashes-to-pgcrypto-v3.patchtext/x-patch; charset=UTF-8; name=0001-Add-modern-SHA-2-based-password-hashes-to-pgcrypto-v3.patchDownload+1091-3
#15Japin Li
japinli@hotmail.com
In reply to: Bernd Helmle (#14)
Re: Modern SHA2- based password hashes for pgcrypto

On Mon, 20 Jan 2025 at 18:41, Bernd Helmle <mailings@oopsware.de> wrote:

Hi,

Please find attached a reworked patch according Alvaro's comments.

Am Montag, dem 13.01.2025 um 17:06 +0100 schrieb Alvaro Herrera:

Hello

I was passing by and I noticed that this needs badly pgindent, and
some
comments are enumerations that would lose formatting once through it.
For example, this would happen which is not good:

    /*
-    * 1. Start digest A
-    * 2. Add the password string to digest A
-    * 3. Add the salt to digest A
+    * 1. Start digest A 2. Add the password string to digest A 3.
Add the
+    * salt to digest A
     */

I suggest you can fix this by adding a "-" sign to the opening "/*"
line
so that pgindent doesn't mangle it (so it becomes /*- ).  This
appears
in several places.

This is even documented:

https://www.postgresql.org/docs/devel/source-format.html

I ran pgindent locally and verified it works like you suggested.

[...]

Your test data (crypt-shacrypt.sql) looks a bit short.  I noticed
that
Drepper's SHA-crypt.txt file has a bunch of test lines (starting with
the "Test vectors from FIPS 180-2: appendix B.1." comment line, as
well
as "appendix C.1" at the bottom) which perhaps could be incorporated
into the .sql script, to ensure correctness (or at least,
bug-compatibility with the reference implementation).  I'd also add a
note that Drepper's implementation is public domain in crypt-sha.c's
license block.

These FIPS cases tests explicit against sha{256|512}_process_bytes() in
Drepper's code, which seem to be the equivalent to OpenSSL's
EVP_Digest*() API we're using. I am not sure it makes sense to adapt
them.

I left them out for now but adapted all the testcases for the password
hashes defined in the 2nd test cases to make sure we create the same
hashes.

I think the "ascii_dollar" variable is a bit useless.  Why not just
use the
literal $ sign where needed (or a DOLLAR_CHR if we feel like avoiding
a
magic value there)?  Also, I wonder if it would be better to use a
StringInfo instead of a fixed-size buffer, which would probably make
some string manipulations easier ... Or maybe not, but let's not mix
strlcat calls with strncat calls with no comments about why.

Result buffer via out_buf is wrapped into a StringInfo now.

Some of your elog(ERROR)s should probably be ereport(), and I'm not
sure
we want all the elog(DEBUG1)s.

Done, but i kept some of the DEBUG output for now. I am not sure if i
really should set the errcode for ereport() explicitly, but i did on
some places where i thought it might be useful.

This patch version also changes the original behavior of crypt() when
called for sha{256|512}crypt hashes. Before we didn't accept values for
the rounds parameter exceeding the minimum and maximum values. This was
along with gen_salt(), which also errors out in such cases. But
Drepper's code accepts them and changes them behind the scenes either
to the minimum or maximum and calculates the password hash based on
them, depending on the exceeded boundary.

So when passing a user defined salt string to crypt(), we do the same
now but i added a NOTICE to indicate that we changed the user submitted
parameter behind the scene. This also enables us to stress
px_crypt_shacrypt() with the same test cases as Drepper's code.

Hi, Bernd Helmle

Thanks for updating the patch. Here are some comments for v3 patch.

1.
+char *
+_crypt_gensalt_sha256_rn(unsigned long count,
+                                                const char *input, int size,
+                                                char *output, int output_size)
+{
+       memset(output, 0, output_size);
+       /* set magic byte for sha512crypt */

s/sha512crypt/sha256crypt/g

2.
Both PX_SHACRYPT_BUF_LEN and PX_SHACRYPT_DIGEST_MAX_LENGTH are macros for
length; however, they use different suffixes. How about unifying them?
I prefer to use the LEN suffix.

3.
+       if ((dec_salt_binary[0] != '$')
+               && (dec_salt_binary[2] != '$'))
+       {
+               ereport(ERROR,
+                               (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                                errmsg("invalid format of salt")),
+                               errhint("magic byte format for shacrypt is either \"$5$\" or \"$6$\""));
+       }
...
+       if (strncmp(dec_salt_binary, magic_bytes[0], strlen(magic_bytes[0])) == 0)
...
+       else if (strncmp(dec_salt_binary, magic_bytes[1], strlen(magic_bytes[1])) == 0)

IMO, we could change the above code as:

+       if (strncmp(dec_salt_binary, magic_bytes[0], strlen(magic_bytes[0])) == 0)
...
+       else if (strncmp(dec_salt_binary, magic_bytes[1], strlen(magic_bytes[1])) == 0)
...
+       else
+       {
+               ereport(ERROR,
+                               (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                                errmsg("invalid format of salt")),
+                               errhint("magic byte format for shacrypt is either \"$5$\" or \"$6$\""));
+       }
4.
+/* Default number of rounds of shacrypt if not explicitly specified.  */
+#define PX_SHACRYPT_ROUNDS_DEFAULT 5000

It seems you were missing the `l` suffix, since both PX_SHACRYPT_ROUNDS_MIN and
PX_SHACRYPT_ROUNDS_MAX have this suffix.

5.
Does the following work as expected?

postgres=# select crypt('hello', '$5$$6$rounds=10000$/Zg436s2vmTwsoSz');
crypt
-------------------------------------------------
$5$$TCRu/ts4Npu8OJyeWy2WnUHCe/6bKVMSi0sROUrPh48
(1 row)

postgres=# select crypt('hello', '$5$rounds=10000$/Zg436s2vmTwsoSz');
crypt
------------------------------------------------------------------------------
$5$rounds=10000$/Zg436s2vmTwsoSz$N4Ad0oGzE6ak30z4EGSEXOc2OXQOpmauicPT3560lY2
(1 row)

According to the documentation, I think the first should output as following:

$5$rounds=5000$TCRu/ts4Npu8OJyeWy2WnUHCe/6bKVMSi0sROUrPh48

--
Regrads,
Japin Li

#16Bernd Helmle
mailings@oopsware.de
In reply to: Japin Li (#15)
Re: Modern SHA2- based password hashes for pgcrypto

Am Donnerstag, dem 23.01.2025 um 21:36 +0800 schrieb Japin Li:

Thanks for your review again. I am going to work on the other items,
but this one might need further discussion:

5.
Does the following work as expected?

postgres=# select crypt('hello',
'$5$$6$rounds=10000$/Zg436s2vmTwsoSz');
                      crypt
-------------------------------------------------
 $5$$TCRu/ts4Npu8OJyeWy2WnUHCe/6bKVMSi0sROUrPh48
(1 row)

postgres=# select crypt('hello', '$5$rounds=10000$/Zg436s2vmTwsoSz');
                                    crypt
---------------------------------------------------------------------
---------
 $5$rounds=10000$/Zg436s2vmTwsoSz$N4Ad0oGzE6ak30z4EGSEXOc2OXQOpmauicP
T3560lY2
(1 row)

According to the documentation, I think the first should output as
following:

$5$rounds=5000$TCRu/ts4Npu8OJyeWy2WnUHCe/6bKVMSi0sROUrPh48

So the password hash/salt string used here is broken. '$' is exclusive
to define the boundaries of each part of the password hash string, so
it's very ambiguous what this preample in your example should do.

Note that the code i'm using is nearly identical to what the current
implementation of 'md5' does:

bernd@localhost:bernd :1 #= select crypt('hello',
'$1$$6$/Zg436s2vmTwsoSz');
DEBUG: md5 salt len = 0, salt = $6$/Zg436s2vmTwsoSz
crypt
────────────────────────────
$1$$/PWPe740uvaQxKzRH.Zxj1
(1 row)

The DEBUG output was added by me. Since the salt len is evaluated to 0,
effectively no salt is handled at all.

So we behave exactly the same way as px_crypt_md5(): It stops after the
first '$' after the magic byte preamble. For shacrypt, this could be
the next '$' after the closing one of the non-mandatory 'rounds'
option, but with your example this doesn't happen since it gets never
parsed. The salt length will be set to 0.

Furthermore, EVP_DigestUpdate() will do nothing if a count parameter
with 0 is handed over, which happens in our case here when adding the
salt. It happily returns 1 (success), so px_update_digest() will never
ERROR out.

Btw., blowfish seems more strict about this: playing around with it a
little i couldn't make it accept a garbaged password hash string like
the current example.

So, I am not sure what to do about this. Originally i had the feeling
we just throw an ERROR if the salt couldn't be evaluated properly, e.g
when we couldn't examine any salt by checking its length being larger
than 0. And '$' is not even a valid character within a salt string. But
then decided to go along with px_crypt_md5().

In short: If you throw a garbaged password hash string as the 2nd
argument to crypt() for md5/shacrypt, you currently might get something
obscure back.

I'd like to hear other opinions on this issue.

Bernd

#17Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bernd Helmle (#16)
Re: Modern SHA2- based password hashes for pgcrypto

On 2025-Jan-24, Bernd Helmle wrote:

So we behave exactly the same way as px_crypt_md5(): It stops after the
first '$' after the magic byte preamble. For shacrypt, this could be
the next '$' after the closing one of the non-mandatory 'rounds'
option, but with your example this doesn't happen since it gets never
parsed. The salt length will be set to 0.

IMO silently using no salt or 0 iterations because the input is somewhat
broken is bad security and should be rejected. If we did so in the past
without noticing, that's bad already, but we should not replicate that
behavior any further.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem." (Tom Lane)

#18Japin Li
japinli@hotmail.com
In reply to: Alvaro Herrera (#17)
Re: Modern SHA2- based password hashes for pgcrypto

On Fri, 24 Jan 2025 at 19:06, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2025-Jan-24, Bernd Helmle wrote:

So we behave exactly the same way as px_crypt_md5(): It stops after the
first '$' after the magic byte preamble. For shacrypt, this could be
the next '$' after the closing one of the non-mandatory 'rounds'
option, but with your example this doesn't happen since it gets never
parsed. The salt length will be set to 0.

IMO silently using no salt or 0 iterations because the input is somewhat
broken is bad security and should be rejected. If we did so in the past
without noticing, that's bad already, but we should not replicate that
behavior any further.

I agree with this point, so maybe we should fix this for px_crypt_md().

--
Regrads,
Japin Li

#19Bernd Helmle
mailings@oopsware.de
In reply to: Alvaro Herrera (#17)
Re: Modern SHA2- based password hashes for pgcrypto

Am Freitag, dem 24.01.2025 um 19:06 +0100 schrieb Alvaro Herrera:

So we behave exactly the same way as px_crypt_md5(): It stops after
the
first '$' after the magic byte preamble. For shacrypt, this could
be
the next '$' after the closing one of the non-mandatory 'rounds'
option, but with your example this doesn't happen since it gets
never
parsed. The salt length will be set to 0.

IMO silently using no salt or 0 iterations because the input is
somewhat
broken is bad security and should be rejected.  If we did so in the
past
without noticing, that's bad already, but we should not replicate
that
behavior any further.

Following Japin's example we parse the broken input after the magic
byte into

$6$rounds=10000$/Zg436s2vmTwsoSz

So this is what gets passed over to the next steps after extracting the
magic byte.

We then try to parse the optional rounds option, which is expected to
be at the very first characters at this stage, which isn't true. In
that case we fall back to default 5000 iterations. So no problem with
iterations here. Note that we don't output the rounds option in that
case. I adapted this from Drepper's code to be compatible with the
tests (it suppresses the rounds option in case the default is used). I
forgot to adjust the docs about this change.

But we don't extract the salt because of the first '$', which confused
the input handling that there isn't anything interesting for it. And
that get unnoticed currently, which might be a problem for the caller.
Well, you see that something is wrong when looking at the result, but
if that happens programmatically it might be hidden.

I've looked what others do:

Drepper's code with the input above produces the following:

$5$=10000$cWSkAaJa1mL912.HQqjIhODJ9b3S7hmgsb/k9Efp7.7

It parses the salt from the input as "=10000".

Python's passlib is very strict when it comes to supported characters
within a salt string. It rejects everything thats not matching '[./0-
9A-Za-z]'. So when you provide the example above you get

File "/usr/lib/python3.13/site-packages/passlib/utils/handlers.py",
line 1459, in _norm_salt
raise ValueError("invalid characters in %s salt" % cls.name)
ValueError: invalid characters in sha256_crypt salt

openssl-passwd does this:

echo test | openssl passwd -5 -stdin -salt
'$6$rounds=10000$/Zg436s2vmTwsoSz'

$5$$6$rounds=10000$$BFtTxJrvpb/ra8SnnXkiNCJ1HGZ3OOmwvyQ2TJZx44B

It absorbs everything up to 16 bytes and uses that as the salt string.

Interestingly, python-passlib and Drepper's code accept both empty
salt, openssl-passwd seems to be buggy:

echo test | openssl passwd -5 -stdin -salt ''
<NULL>

So at least, they do *something* with the input and don't silently omit
salting the input passphrase, but the results aren't the same hash.

So i think we have two options: adapt Drepper's code and throw away
everything or be very strict like python-passlib. I myself can find
support for both, so not sure.

Bernd

#20Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bernd Helmle (#19)
Re: Modern SHA2- based password hashes for pgcrypto

On 2025-Jan-28, Bernd Helmle wrote:

Python's passlib is very strict when it comes to supported characters
within a salt string. It rejects everything thats not matching '[./0-
9A-Za-z]'. So when you provide the example above you get

The reason it uses these chars is that in their scheme the salt bytes
are base64-encoded.

The passlib docs has this page about the "modular crypt format":
https://passlib.readthedocs.io/en/stable/modular_crypt_format.html

and they point this other page as a "modern, non-ambiguous standard":
https://github.com/P-H-C/phc-string-format/blob/master/phc-sf-spec.md
About the salt, this last document says:

The role of salts is to achieve uniqueness. A random salt is fine for
that as long as its length is sufficient; a 16-byte salt would work
well (by definition, UUID are very good salts, and they encode over
exactly 16 bytes). 16 bytes encode as 22 characters in B64. Functions
should disallow salt values that are too small for security (4 bytes
should be viewed as an absolute minimum).

This "Password Hashing Competition" organization hardly seems an
authority though. It'd be great to have an IETF standard about this ...

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

#21Japin Li
japinli@hotmail.com
In reply to: Alvaro Herrera (#20)
#22Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Japin Li (#21)
#23Japin Li
japinli@hotmail.com
In reply to: Alvaro Herrera (#22)
#24Bernd Helmle
mailings@oopsware.de
In reply to: Alvaro Herrera (#22)
#25Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bernd Helmle (#24)
#26Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#25)
#27Bernd Helmle
mailings@oopsware.de
In reply to: Alvaro Herrera (#25)
#28Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bernd Helmle (#27)
#29Bernd Helmle
mailings@oopsware.de
In reply to: Alvaro Herrera (#28)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#28)
#31Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#28)
#32Bernd Helmle
mailings@oopsware.de
In reply to: Tom Lane (#30)
#33Bernd Helmle
mailings@oopsware.de
In reply to: Andres Freund (#31)
#34Bernd Helmle
mailings@oopsware.de
In reply to: Tom Lane (#30)
#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bernd Helmle (#34)
#36Andres Freund
andres@anarazel.de
In reply to: Bernd Helmle (#33)
#37Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#35)
#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#35)