Salt in encrypted password in pg_shadow
I read that the password hash in pg_shadow is salted with username. Is
this still the case? If so, since probably 99% of all PostgreSQL has
"postgres" as the superuser name, wouldn't it be better to use standard
Unix/Apache MD5 hash instead?
--
dave
David Garamond <lists@zara.6.isreserved.com> writes:
I read that the password hash in pg_shadow is salted with username. Is
this still the case? If so, since probably 99% of all PostgreSQL has
"postgres" as the superuser name, wouldn't it be better to use standard
Unix/Apache MD5 hash instead?
How does that improve anything? If we add a random salt into it, we'd
have to store the salt in pg_shadow, so there wouldn't be any secrecy
added --- an attacker who can read pg_shadow could see the salt too.
(Actually, an attacker who can read pg_shadow is already superuser,
so it's not clear there's anything left to hide from him anyway.)
regards, tom lane
Tom Lane wrote:
I read that the password hash in pg_shadow is salted with username. Is
this still the case? If so, since probably 99% of all PostgreSQL has
"postgres" as the superuser name, wouldn't it be better to use standard
Unix/Apache MD5 hash instead?How does that improve anything? If we add a random salt into it, we'd
have to store the salt in pg_shadow, so there wouldn't be any secrecy
added --- an attacker who can read pg_shadow could see the salt too.
Consider someone who creates a long list of:
MD5( "postgres" + "aaaaaaaa" )
MD5( "postgres" + "aaaaaaab" )
MD5( "postgres" + "aaaaaaac" )
...
Now if he has access to other people's pg_shadow, he can compare the
hashes with his dictionary. Replacing "postgres" with a random salt
defeats this dictionary attack (and thus he will have to resort to brute
force).
(Actually, an attacker who can read pg_shadow is already superuser,
so it's not clear there's anything left to hide from him anyway.)
But consider someone who finds a harddisk or tape containing a database
backup... he can then gain access to the real, online database.
--
dave
David Garamond wrote:
Consider someone who creates a long list of:
MD5( "postgres" + "aaaaaaaa" )
MD5( "postgres" + "aaaaaaab" )
MD5( "postgres" + "aaaaaaac" )
...Now if he has access to other people's pg_shadow, he can compare the
hashes with his dictionary. Replacing "postgres" with a random salt
defeats this dictionary attack (and thus he will have to resort to brute
force).
But surely you have to store the random salt in pg_shadow too? Or am I
missing something?
--
Richard Huxton
Archonet Ltd
Richard Huxton <dev@archonet.com> writes:
David Garamond wrote:
Consider someone who creates a long list of:
MD5( "postgres" + "aaaaaaaa" )
MD5( "postgres" + "aaaaaaab" )
MD5( "postgres" + "aaaaaaac" )
But surely you have to store the random salt in pg_shadow too? Or am I
missing something?
I think David is suggesting that the hypothetical attacker could gain
economies of scale in multiple attacks (ie, if he'd been able to steal
the contents of multiple installations' pg_shadow, he'd only need to
generate his long list of precalculated hashes once). I think this is
too far-fetched to justify an authentication protocol change though.
Also, MD5 hashing is fast enough that I'm not sure the above is really
significantly cheaper than a straight brute-force attack, ie, you just
take your list of possible passwords and compute the hashes on the fly.
The hashes are going to be much longer than the average real-world
password, so reading in a list of hashes is going to take several times
as much I/O as reading the passwords --- seems to me that it'd be
cheaper just to re-hash each password.
regards, tom lane
Tom Lane wrote:
I think David is suggesting that the hypothetical attacker could gain
economies of scale in multiple attacks (ie, if he'd been able to steal
the contents of multiple installations' pg_shadow, he'd only need to
generate his long list of precalculated hashes once). I think this is
too far-fetched to justify an authentication protocol change though.Also, MD5 hashing is fast enough that I'm not sure the above is really
significantly cheaper than a straight brute-force attack, ie, you just
take your list of possible passwords and compute the hashes on the fly.
The hashes are going to be much longer than the average real-world
password, so reading in a list of hashes is going to take several times
as much I/O as reading the passwords --- seems to me that it'd be
cheaper just to re-hash each password.
Many people use short and easy-to-guess passwords (remember we're not
talking about the superuser only here), so the dictionary attack can be
more effective than people think. And considering many people use the
same password for several things, Postgres could become one of the easy
sources to get/guess people's plaintext passwords from hacked machines.
At least Apache and Unix have been random-salting passwords for a while now.
However, I realize this will break the current MD5 hash, probably too
painful to do at the moment. Perhaps for the future, non-MD5 hash...
--
dave
David Garamond <lists@zara.6.isreserved.com> writes:
Tom Lane wrote:
Also, MD5 hashing is fast enough that I'm not sure the above is really
significantly cheaper than a straight brute-force attack, ie, you just
take your list of possible passwords and compute the hashes on the fly.
The hashes are going to be much longer than the average real-world
password, so reading in a list of hashes is going to take several times
as much I/O as reading the passwords --- seems to me that it'd be
cheaper just to re-hash each password.
Many people use short and easy-to-guess passwords (remember we're not
talking about the superuser only here), so the dictionary attack can be
more effective than people think.
And that responds to the speed argument how? I quite agree that a
guessable password is risky, but putting in a random salt offers no
real advantage if the salt has to be stored in the same place as the
encrypted password.
regards, tom lane
On Tue, Sep 07, 2004 at 03:09:28PM -0400, Tom Lane wrote:
David Garamond <lists@zara.6.isreserved.com> writes:
Tom Lane wrote:
Also, MD5 hashing is fast enough that I'm not sure the above is really
significantly cheaper than a straight brute-force attack, ie, you just
take your list of possible passwords and compute the hashes on the fly.
The hashes are going to be much longer than the average real-world
password, so reading in a list of hashes is going to take several times
as much I/O as reading the passwords --- seems to me that it'd be
cheaper just to re-hash each password.Many people use short and easy-to-guess passwords (remember we're not
talking about the superuser only here), so the dictionary attack can be
more effective than people think.And that responds to the speed argument how? I quite agree that a
guessable password is risky, but putting in a random salt offers no
real advantage if the salt has to be stored in the same place as the
encrypted password.
The usual attack against hashed passwords is to use a dictionary
driven password generator to create a large number of passwords, find
the hash of each of those, then store the passwords on disk indexed by
the hash.
That's a one-time effort that can then be used in the future to crack
any number of password hashes extremely cheaply - given any hash you
can find the corresponding password, if you have one, with one index
lookup.
A random salt stored with the hashed password increases the storage
and precomputation time required by the size of the salt (so a 16 bit
salt would increase the storage and precomputation time needed by
a factor of 65536). That increase makes the pre-computed dictionary
attack pretty much infeasible.
Cheers,
Steve
Tom Lane wrote:
Many people use short and easy-to-guess passwords (remember we're not
talking about the superuser only here), so the dictionary attack can be
more effective than people think.And that responds to the speed argument how? I quite agree that a
guessable password is risky, but putting in a random salt offers no
real advantage if the salt has to be stored in the same place as the
encrypted password.
Hm, I thought the purpose of salt is generally well understood? A
well-known string such as "postgres" is *not* a good salt at all.
Here's a couple of pages that hopefully can explain better:
http://en.wikipedia.org/wiki/Dictionary_attack
http://en.wikipedia.org/wiki/Salt_(cryptography)
--
dave
Import Notes
Reply to msg id not found: 29120938.1094585291287.JavaMail.root@hercules.ponder-stibbons.com
Steve Atkins <steve@blighty.com> writes:
A random salt stored with the hashed password increases the storage
and precomputation time required by the size of the salt (so a 16 bit
salt would increase the storage and precomputation time needed by
a factor of 65536). That increase makes the pre-computed dictionary
attack pretty much infeasible.
[ raised eyebrow... ] It is not immediately obvious that a factor of
2^16 makes the difference between feasible and infeasible. As
counterexamples, if it would otherwise take you one microsecond to break
the password, 64 milliseconds isn't going to scare you; if it would
otherwise take you a century to break the password, raising it to
64k centuries isn't going to make for a very meaningful improvement in
security either.
Show me a scheme where the random salt isn't stored right beside the
password, and I might start to get interested.
regards, tom lane
David Garamond <lists@zara.6.isreserved.com> writes:
Hm, I thought the purpose of salt is generally well understood?
Apparently not.
The purpose of salting the encrypted passwords in pg_shadow is *not* to
protect them against attackers who have somehow managed to
illegitimately read pg_shadow. (As I explained before, such attackers
are effectively superuser already, and so protecting the superuser
password from them is not nearly as interesting as all that.) The
purpose is to prevent unscrupulous DBAs from deducing the cleartext
passwords being used by their users. Since the users presumably are not
all named "postgres", the argument you are advancing is not relevant.
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
Steve Atkins <steve@blighty.com> writes:
A random salt stored with the hashed password increases the storage
...
[ raised eyebrow... ] It is not immediately obvious that a factor of
2^16 makes the difference between feasible and infeasible. As
counterexamples, if it would otherwise take you one microsecond to break
the password, 64 milliseconds isn't going to scare you; if it would
otherwise take you a century to break the password, raising it to
64k centuries isn't going to make for a very meaningful improvement in
security either.
*Storage* requirements. There's a pretty big range in the middle where the
extra 2^16 does make the difference.
The entire premise of cryptographic hashes after all depends on the assumption
that you cannot simply store an index of every possible hash value with the
plain-text that generated it.
That's only true if the number of plain-texts of concern is large enough to
make storing the hash value of each impractical. That's not true for human
chosen guessable passwords.
Now it's true that if you only want to try the top 1,000 guessable passwords
and you store a dictionary of all those with all 2^16 possible salts then it
requires only a gigabyte of storage. Perhaps a four character random salt
would make more sense.
However with a known salt you only have to store the 1,000 hashes with the
known salt. You could instead store a dictionary of 64 million password
guesses in the same gigabyte.
There's no question using a predictable salt is bad idea, postgres may as well
have not bothered salting at all. But realistically it seems kind of
irrelevant. It seems pretty unlikely that someone will gain access to
pg_shadow on many postgres installs which is the only way a dictionary attack
becomes a worry.
pg_shadow is not a public /etc/passwd like on a traditional unix machine where
I first saw salted crypt strings and it doesn't have hundreds or thousands of
entries like a public unix machine (at least not with predictable salts). The
threat model just doesn't apply.
--
greg
On Tue, Sep 07, 2004 at 10:27:40PM -0400, Tom Lane wrote:
Steve Atkins <steve@blighty.com> writes:
A random salt stored with the hashed password increases the storage
and precomputation time required by the size of the salt (so a 16 bit
salt would increase the storage and precomputation time needed by
a factor of 65536). That increase makes the pre-computed dictionary
attack pretty much infeasible.[ raised eyebrow... ] It is not immediately obvious that a factor of
2^16 makes the difference between feasible and infeasible. As
counterexamples, if it would otherwise take you one microsecond to break
the password, 64 milliseconds isn't going to scare you; if it would
otherwise take you a century to break the password, raising it to
64k centuries isn't going to make for a very meaningful improvement in
security either.Show me a scheme where the random salt isn't stored right beside the
password, and I might start to get interested.
A password salt is a pretty well understood benefit. It's very, very
standard technology, and has been in use for decades. The primary use
for it is to increase the cost of a pre-computed dictionary attack.
While the value of that has decreased as the ratio of CPU speed to
mass storage size has changed it's still a significant
value. Particularly as it still applies to parallel attacks.
We're talking about one-way hashes here, again a fairly standard
technology. Given the hash of a password the only (currently known)
way to compute a password that will validate against it is to guess
a password, compute the hash, see if it's the same, repeat.
You're never going to 'decode' a password from the hash - it's about
guessing the right password. Some passwords will be well-chosen, some
will be poorly chosen, some will be in between. The ideal situation
is where I have a number of hashes and just need to find a password
to match one of them.
So, assume I have a password generator that will generate me an
endless stream of passwords, starting from the obvious ones and
getting increasingly complex - this is the usual approach, as used
by crack and other unix password file crackers.
As one example, say I have 1,000 hashes, that it takes me ten
milliseconds to hash a password and the easiest to guess password will
be the hundred millionth produced by my password generator.
I can simply calculate the hash of each password in turn and compare
that hash against each of the thousand hashes. That'll take me about
11.5 days to crack the simplest to guess account of that list of
a thouasand.
Now, lets say that instead I have a thousand hashes, and associated
with each hash is a salt, all different. That means that to test
each generated password I'll need to calculate not a single hash,
but a thousand hashes - one against the combination of my guessed
password and the first salt, then the combination of the passsword
and the second salt and so on.
To crack the simplest to guess account of the thousand accounts I
have access to in this case will take me a thousand times as long -
about 30 years.
That's an example of why a salt is still extremely valuable, despite
the change in CPU speed:storage speed/size ration
It actually takes around 300us to compute an MD5 hash on modern
hardware (which is fast enough that use of MD5 for password validation
isn't clearly a good idea anyway, but that's another issue[1]The other issue being: When the concept was originally deployed it took around half a second to calculate the hash. As hardware has got faster, the calculation has got faster and brute force attacks against a password file have become easier. MD5 is way too fast to be used as part of a password based system that has attack trees including compromise of the crypted password file. That's one reason that unixen use shadow password files rather than making /etc/passwd readable to all local users, to reduce the chance of the known hash attack occuring.) which
changes the math somewhat, but the principle is the same.
There are other benefits to a random salt too. Lets assume that
I have (by doing something dubious with a combination of google
and an insecure application server) ten thousand password files.
If no salt is used, or the same ('postgres') salt is used for
all of them then any two accounts with the same password will
have the same hashed value. That means I can identify two
people using the same password - if I can social engineer it
out of one of them I can use it to access the others account.
Cheers,
Steve
[1]: The other issue being: When the concept was originally deployed it took around half a second to calculate the hash. As hardware has got faster, the calculation has got faster and brute force attacks against a password file have become easier. MD5 is way too fast to be used as part of a password based system that has attack trees including compromise of the crypted password file. That's one reason that unixen use shadow password files rather than making /etc/passwd readable to all local users, to reduce the chance of the known hash attack occuring.
took around half a second to calculate the hash. As hardware has got
faster, the calculation has got faster and brute force attacks against
a password file have become easier. MD5 is way too fast to be used as
part of a password based system that has attack trees including
compromise of the crypted password file. That's one reason that
unixen use shadow password files rather than making /etc/passwd
readable to all local users, to reduce the chance of the known
hash attack occuring.
Import Notes
Reply to msg id not found: 23940670.1094610491722.JavaMail.root@hercules.ponder-stibbons.com
On Tue, Sep 07, 2004 at 08:48:13PM -0700, Steve Atkins wrote:
That's an example of why a salt is still extremely valuable, despite
the change in CPU speed:storage speed/size ration
But, to clarify, I don't see any practical problem in the current
PostgreSQL implementation. It's not particularly secure, but not much
worse than the underlying OS authentication. Most of the feasible
attack trees are going to start with compromising the OS platform, by
which point weaknesses in the postgresql authentication are fairly
meaningless.
If we need to tweak the authentication protocol _anyway_ at some
point it'd be great to improve things. But until then... not worth
the pain.
Cheers,
Steve
Greg Stark <gsstark@mit.edu> writes:
However with a known salt you only have to store the 1,000 hashes with the
known salt. You could instead store a dictionary of 64 million password
guesses in the same gigabyte.
This is still not responding to my original point though: if you know
the salt that was used, you can try brute-force scan of a few thousand
probable passwords in less CPU time than it will take to read a gigabyte
of precomputed hashes. The fact that common passwords are much shorter
than the fixed-size MD5 hashes works against you in a big way.
I think the only way for the defender to get any real traction is to not
store the random salt right next to the encrypted password, so that the
attacker who hypothetically has read pg_shadow still has to guess about
the salt that was used. If someone shows me a plausible way to do that,
I'm all ears.
The threat model just doesn't apply.
This we agree on ...
regards, tom lane
Steve Atkins <steve@blighty.com> writes:
If we need to tweak the authentication protocol _anyway_ at some
point it'd be great to improve things. But until then... not worth
the pain.
I've been hearing rumblings that MD5 and all other known crypto
protocols are known vulnerable since the latest crypto symposiums.
(Not that we didn't all suspect the NSA et al could break 'em, but
now they've told us exactly how they do it.)
So as soon as someone wheels up a new crypto hash method that looks
trustworthy, we can invent a new auth protocol and maybe throw in
another level of random salting while we're at it. But right now
I doubt it's worth the effort :-(
regards, tom lane
So as soon as someone wheels up a new crypto hash method that looks
trustworthy, we can invent a new auth protocol and maybe throw in
another level of random salting while we're at it. But right now
I doubt it's worth the effort :-(
A relatively simple enhancement would be to use some or all of the user
name as the salt. That makes reverse engineering the passwords a bit
harder because there are multiple salts being used.
I suspect that with the speed of modern microprocessors that nearly any
crypto scheme is breakable using a few thousand dollars worth of hardware
and a few hours of time. If the NSA has built in shortcuts to their
sanctioned algorithms, that just makes the cracking process faster.
I know of an ecryption technique developed by a friend of mine, a retired
mathematician, that would probably be sufficiently robust. It uses group
theory to permutate the bit field and has both reversible and
non-reversible forms.
It would probably be subject to export restrictions. As I recall, he
couldn't even send a copy of the program to his son in Greece without
State Department approval.
But as long as people use vulnerable passwords, there is no password
encryption scheme that isn't vulnerable to attack, with or without
salt.
Challenge/response and one-time password schemes are more secure but
a lot more hassle for users.
--
Mike Nolan
Tom Lane wrote:
Hm, I thought the purpose of salt is generally well understood?
Apparently not.
The purpose of salting the encrypted passwords in pg_shadow is *not* to
protect them against attackers who have somehow managed to
illegitimately read pg_shadow. (As I explained before, such attackers
are effectively superuser already, and so protecting the superuser
password from them is not nearly as interesting as all that.) The
purpose is to prevent unscrupulous DBAs from deducing the cleartext
passwords being used by their users. Since the users presumably are not
all named "postgres", the argument you are advancing is not relevant.
Then I'd say the purpose is wrong. There is not much hope in protecting
unscrupulous DBAs from getting their users' password anyway (they can
most probably sniff traffic, trap/log queries, or shut down postmaster
and replace it with a trojan binary).
The purpose of a salt, by most definition, should be to discourage
dictionary attack.
Anyway, I think we've agreed that the current protocol need not be
changed, on the basis of too much pain caused. But there's no reason not
to use random salt in future protocol. It offers the benefit you
mentioned plus protects against dictionary attack.
--
dave
Import Notes
Reply to msg id not found: 31531613.1094611391726.JavaMail.root@hercules.ponder-stibbons.com
Tom Lane <tgl@sss.pgh.pa.us> writes:
Greg Stark <gsstark@mit.edu> writes:
However with a known salt you only have to store the 1,000 hashes with the
known salt. You could instead store a dictionary of 64 million password
guesses in the same gigabyte.This is still not responding to my original point though: if you know
the salt that was used, you can try brute-force scan of a few thousand
probable passwords in less CPU time than it will take to read a gigabyte
of precomputed hashes. The fact that common passwords are much shorter
than the fixed-size MD5 hashes works against you in a big way.
We must be talking past each other. The threat model salts are meant to defend
against is someone who has access to the data in pg_shadow for many users.
Without the salts or with salts you can predict beforehand you look up the
hash value in your precomputed dictionary using an index and instantaneously
get a working password.
Postgres's current method is in fact doing it wrong. Because the salt is
predictable for the "postgres" users it doesn't protect against the attack
above. If I knew I could get access to lots of pg_shadow's, say I work at an
off-site backup storage company, I could check each of them instantaneously
against a precomputed index of hundreds of thousands of guessable passwords.
The same attack against a well salted hash would require running the entire
battery of hashes against each client's password individually.
The reason I say the threat model doesn't apply is only because it's unlikely
that someone would have access to many postgres installs's pg_shadow. That's
the only situation where that attack would arise. (Or if a given install had
hundreds of users with the same initial letters I guess, but that also seems
rare)
But if you're not going to worry about that threat then salting is buying you
nothing anyways. If you're going to use a salt you may as well use one that
accomplishes what it's there for.
--
greg
On Wed, Sep 08, 2004 at 00:33:39 -0400,
Tom Lane <tgl@sss.pgh.pa.us> wrote:
I've been hearing rumblings that MD5 and all other known crypto
protocols are known vulnerable since the latest crypto symposiums.
(Not that we didn't all suspect the NSA et al could break 'em, but
now they've told us exactly how they do it.)
Things aren't currently that bad. So far people have found a way to find
two strings that give the same hash using MD5. They haven't yet found a way
to find a string which hashes to a given hash. SHA-0 was also shown to
have some weakness. From comments I have read, I don't think SHA-1 was
shown to have any weaknesses. One comment specifically mentioned that
the change made between SHA-0 and SHA-1 seems to have been made to address
the weakness found in SHA-0. I haven't read the source papers, so take this
all with a grain of salt.