Looking for advice on database encryption
What are folks doing to protect sensitive data in their databases?
We're running on the assumption that the _really_ sensitive data
is too sensitive for us to just trust the front-end programs that
connect to it.
The decision coming down from on-high is that we need to encrypt
certain fields. That's fine, looked at pgcrypto, but found
the requirement to use pgp on the command line for key management
to be a problem.
So we're trying to implement the encryption in the front-end, but
the problem we're having is searching on the encrypted fields. Since
we have to decrypt each field to search on it, queries that previously
took seconds now take minutes (or worse).
We've tested a number of cryptographic accelerator products. In
case nobody else has tried this, let me give away the ending: none
that we've found are any faster than a typical server CPU.
So, it's a pretty open-ended question, since we're still pretty open
to different approaches, but how are others approaching this problem?
The goal here is that if we're going to encrypt the data, it should
be encrypted in such a way that if an attacker gets ahold of a dump
of the database, they still can't access the data without the
passphrases of the individuals who entered the data.
--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
Bill Moran wrote on 16.04.2009 21:40:
> The goal here is that if we're going to encrypt the data, it should
> be encrypted in such a way that if an attacker gets ahold of a dump
> of the database, they still can't access the data without the
> passphrases of the individuals who entered the data.
I'm by far not an expert, but my naive attempt would be to store the
database files in an encrypted filesystem.
Thomas
Bill Moran wrote:
> What are folks doing to protect sensitive data in their databases?
I would probably do my encryption in the application layer, and only
encrypt the sensitive fields. Fields used as indexes probably should
not be encrypted, unless the only index operation is EQ/NE; then you
could use the encrypted index value as the search key. This would even
work for foreign key relations.
Of course, if part of your cryptography regimen involves key expiration
and rotation, there'd be the hellacious problem of decrypting/re-encrypting.
It really all depends on what the security requirements are.
-Somewhere- there's a weak spot. In the above model, it's the application
server that's doing the cryptography; if it gets compromised, then the
keys can be extracted, and all bets are off.
In response to Thomas Kellerer <spam_eater@gmx.net>:
> Bill Moran wrote on 16.04.2009 21:40:
> > The goal here is that if we're going to encrypt the data, it should
> > be encrypted in such a way that if an attacker gets ahold of a dump
> > of the database, they still can't access the data without the
> > passphrases of the individuals who entered the data.
>
> I'm by far not an expert, but my naive attempt would be to store the
> database files in an encrypted filesystem.
That was the first suggestion when we started brainstorming ideas.
Unfortunately, it fails to protect us from the most likely attack
vector: SQL Injection/application layer bugs. In an SQL Injection
(for example) the fact that the filesystem is encrypted does zero
to protect the sensitive data.
--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
On Thu, April 16, 2009 13:20, Bill Moran wrote:
> In response to Thomas Kellerer <spam_eater@gmx.net>:
> > Bill Moran wrote on 16.04.2009 21:40:
> > > The goal here is that if we're going to encrypt the data, it should
> > > be encrypted in such a way that if an attacker gets ahold of a dump
> > > of the database, they still can't access the data without the
> > > passphrases of the individuals who entered the data.
> >
> > I'm by far not an expert, but my naive attempt would be to store the
> > database files in an encrypted filesystem.
>
> That was the first suggestion when we started brainstorming ideas.
> Unfortunately, it fails to protect us from the most likely attack
> vector: SQL Injection/application layer bugs. In an SQL Injection
> (for example) the fact that the filesystem is encrypted does zero
> to protect the sensitive data.
>
> --
> Bill Moran
> http://www.potentialtech.com
> http://people.collaborativefusion.com/~wmoran/
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
I'll chime in here, even though I probably shouldn't. A lot is dependent
on what standard you're trying to meet. General Security (and Common
Sense) vs PCI/DSS vs NSA/DoD vs some other standard.
Do you need to decrypt the values once they're in the system?
Do you need the items in an index?
Do the values need to be part of a constraint / foreign key relationship
(because a hashed value may cause you a lot of headaches!)?
Look at these different scenarios and think about the data (both in
encrypted format and unencrypted format) before you decide HOW you want to
do it.
Tim
--
Timothy J. Bruce
On Apr 16, 2009, at 12:40 PM, Bill Moran wrote:
(This is the traditional "you're asking the wrong question" response).
> What are folks doing to protect sensitive data in their databases?
I don't think that's a useful way to look at it. Protecting sensitive
data in the entire system, where the database is just one
part of that system, is likely to lead to a much better answer.
> We're running on the assumption that the _really_ sensitive data
> is too sensitive for us to just trust the front-end programs that
> connect to it.
>
> The decision coming down from on-high is that we need to encrypt
> certain fields.
If that's the mandate, then that's what you have to do. It's unlikely to
make the system overall much more secure, and likely no more secure
than some much less intrusive approaches, though.
> That's fine, looked at pgcrypto, but found
> the requirement to use pgp on the command line for key management
> to be a problem.
>
> So we're trying to implement the encryption in the front-end, but
> the problem we're having is searching on the encrypted fields. Since
> we have to decrypt each field to search on it, queries that previously
> took seconds now take minutes (or worse).
>
> We've tested a number of cryptographic accelerator products. In
> case nobody else has tried this, let me give away the ending: none
> that we've found are any faster than a typical server CPU.
>
> So, it's a pretty open-ended question, since we're still pretty open
> to different approaches, but how are others approaching this problem?
>
> The goal here is that if we're going to encrypt the data, it should
> be encrypted in such a way that if an attacker gets ahold of a dump
> of the database, they still can't access the data without the
> passphrases of the individuals who entered the data.
If the concern is database dumps, then encrypting the output of
pg_dump will pretty much solve the problem. But if the attack
vector is the common one of compromising the front end, then
encrypting data in the database, but allowing the front end to
decrypt it is likely useless. If the concern is "what if an attacker
got access to the server?" then physical security is likely to have
much better ROI than some random encryption regime.
Can you go back and ask your management what their actual
security or compliance needs are?
If it's a real business need you probably want to find a decent
security guy and have him draft the questions that management
need to answer and start from there, rather than trying to clean
up after someone has already made 95% of the decisions, in
an uninformed way, for you.
Cheers,
Steve
Bill Moran wrote on 16.04.2009 22:20:
> > I'm by far not an expert, but my naive attempt would be to store the
> > database files in an encrypted filesystem.
>
> That was the first suggestion when we started brainstorming ideas.
> Unfortunately, it fails to protect us from the most likely attack
> vector: SQL Injection/application layer bugs. In an SQL Injection
> (for example) the fact that the filesystem is encrypted does zero
> to protect the sensitive data.
Which is something different than your statement
> The goal here is that if we're going to encrypt the data, it should
> be encrypted in such a way that if an attacker gets ahold of a dump
> of the database, they still can't access the data without the
> passphrases of the individuals who entered the data.
which only talks about someone getting hold of the contents of the server's
harddisk.
As you have to ultimately decrypt the data to display it to the user, he can
always take a screenshot (or copy & paste the text from the web front end) and
walk away. He doesn't even need to use some SQL injection.
To prevent SQL injection there are pretty robust solutions (prepared
statements, sanitizing and cleaning any user input, maybe even controlling
access to the data through stored procedures, which can add an additional
layer of security).
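As a rough illustration of the prepared-statement idea mentioned above, here is a minimal sketch in Python. It uses the standard-library sqlite3 module only so the example is self-contained; it stands in for PostgreSQL, and the employees table, its columns, and the find_by_ssn helper are hypothetical names invented for the example. The placeholder syntax depends on the driver in use.

```python
import sqlite3

def find_by_ssn(conn, ssn):
    """Look up employee names by SSN using a bound parameter."""
    # The "?" placeholder passes the value separately from the SQL text,
    # so it is never interpreted as SQL.
    cur = conn.execute("SELECT name FROM employees WHERE ssn = ?", (ssn,))
    return [row[0] for row in cur.fetchall()]

# Hypothetical in-memory table, used only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, ssn TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("alice", "111-11-1111"), ("bob", "222-22-2222")],
)

print(find_by_ssn(conn, "111-11-1111"))
# A classic injection payload is treated as an ordinary, non-matching string.
print(find_by_ssn(conn, "' OR '1'='1"))
```

Binding parameters addresses injection through that query only; it does nothing about other application-layer bugs.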
I agree with Kenneth: you need to be more precise on which scenario you have to
deal with.
Thomas
In response to Steve Atkins <steve@blighty.com>:
On Apr 16, 2009, at 12:40 PM, Bill Moran wrote:
> (This is the traditional "you're asking the wrong question" response).
> > What are folks doing to protect sensitive data in their databases?
> I don't think that's a useful way to look at it. Protecting sensitive
> data in the entire system, where the database is just one
> part of that system is likely to lead to a much better answer.
<snip>
I disagree. We're already addressing the issues of security on the
application level through extensive testing, data validation out the
wazoo (to prevent SQL Injection and other application breaches). All
our servers are in highly secure data centers. We have VPNs and
access restrictions at the IP and the user level to the 9s.
It's still not enough.
My task here is to develop a system to protect the data in the event
that all of those fail. As a result, I'm looking for general advice.
I already have a system in place. This is apparently another part
that I should have described in more detail. So, here goes:
To draw a parallel example on the application:
Imagine that you're an employee in a business. When you're hired, you
enter your SSN into the company database. Now, your department manager
needs to have access to your SSN for various reasons, so the system
grants access to your encryption key to the department manager. Based
on system policy, the division manager has access to all the data in
the department, and the company head has access to all divisions. As
a result, the company head can get your SSN out of the database using
the passphrase for his key.
However, Joe, over in IT can not access your SSN. Even though he's the
DBA and can pull a full text dump of the database at will, he can not
decrypt your SSN unless he has a passphrase to one of the keys that
can decrypt it.
All that is pretty standard PKI stuff, and I've created the tables and
the functions that implement it.
The problem comes when the company head wants to search through the
database to find out which employee has a specific SSN. He should
be able to do so, since he has access to everything, but the logistics of
doing so in a reasonable amount of time are rather complex and very
time consuming. On a million rows with the SSN unencrypted, such a
query would take less than a second with an appropriate index, but
pulling those million rows into the application in order to decrypt
each one and see if it matches can easily take a half hour or longer.
That's where we're having difficulty. Our requirements are that the
data must be strongly protected, but the appropriate people must be
able to do (often complex) searches on it that complete in record
time.
--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
In response to Thomas Kellerer <spam_eater@gmx.net>:
> Bill Moran wrote on 16.04.2009 22:20:
> > > I'm by far not an expert, but my naive attempt would be to store the
> > > database files in an encrypted filesystem.
> >
> > That was the first suggestion when we started brainstorming ideas.
> > Unfortunately, it fails to protect us from the most likely attack
> > vector: SQL Injection/application layer bugs. In an SQL Injection
> > (for example) the fact that the filesystem is encrypted does zero
> > to protect the sensitive data.
>
> Which is something different than your statement
>
> > The goal here is that if we're going to encrypt the data, it should
> > be encrypted in such a way that if an attacker gets ahold of a dump
> > of the database, they still can't access the data without the
> > passphrases of the individuals who entered the data.
>
> which only talks about someone getting hold of the contents of the server's
> harddisk.
Not really. You're making an assumption that a pg_dump can only be
run on the server itself. Let's chalk this up to miscommunication
and allow me to rephrase:
The data needs to be encrypted in such a way that if an attacker can
get an offline copy of the data by any means, they have no greater
access to the data than they would if they used the application to
access it.
I already have that using PKI. Again, it seems that I left too many
details out of my description of the problem. See my post in
response to Steve Atkins for a more detailed description of the
problem, and I apologize for being too vague the first go-round.
--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
Bill Moran wrote on 16.04.2009 23:06:
> > which only talks about someone getting hold of the contents of the server's
> > harddisk.
>
> Not really. You're making an assumption that a pg_dump can only be
> run on the server itself.
Right, I forgot that.
But then it's similar to the situation where the user displays the data and
walks away with the screenshot...
If you have an application server sitting in the middle you can limit
connections to the database to the app server itself. Or even put the appserver
on the same box as the database server and limit connections only to localhost.
In that case the attacker needs to be able to log-in to the server directly.
> and I apologize for being too vague the first go-round.
No problem. This happens to me all the time. Once a discussion starts about a
topic I find myself wondering how I could forget all the details that I'm being
asked about ;)
Thomas
Couldn't you just add a PGP based column (or similar encryption
protocol) for authentication? This would protect you against injection
attacks, would it not?
You could also use PGP or similar for key management if I'm not
mistaken.
-Will
Bill Moran wrote:
> The problem comes when the company head wants to search through the
> database to find out which employee has a specific SSN. He should
> be able to do so, since he has access to everything, but the logistics of
> doing so in a reasonable amount of time are rather complex and very
> time consuming. On a million rows with the SSN unencrypted, such a
> query would take less than a second with an appropriate index, but
> pulling those million rows into the application in order to decrypt
> each one and see if it matches can easily take a half hour or longer.
>
> That's where we're having difficulty. Our requirements are that the
> data must be strongly protected, but the appropriate people must be
> able to do (often complex) searches on it that complete in record
> time.
An index on the encrypted SSN field would do this just fine. If an
authorized person needs to find the record with a specific SSN, they
encrypt that SSN and then look up the ciphertext in the database... done.
If the purpose of encrypting the data is just to keep prying eyes from discerning what that data is, then a simple encryption can be coded. Something like adding 128 or 256, depending on the character set, to each of the chr(value) for each of the characters in the string should work just fine. You could also use a bitwise shift or xor to change the value of each character.
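A minimal sketch of the per-character xor idea, assuming a single-byte key; the function names and the key value are hypothetical, chosen only for illustration. It shows that the transform is its own inverse.

```python
def xor_obfuscate(text: str, key: int = 0x5A) -> bytes:
    """XOR every byte of the UTF-8 encoding with a single-byte key."""
    return bytes(b ^ key for b in text.encode("utf-8"))

def xor_reveal(data: bytes, key: int = 0x5A) -> str:
    """Undo xor_obfuscate; XOR with the same key is its own inverse."""
    return bytes(b ^ key for b in data).decode("utf-8")

scrambled = xor_obfuscate("123-45-6789")
print(scrambled != b"123-45-6789")  # the stored bytes differ from the input
print(xor_reveal(scrambled))        # the same key recovers the original text
```

With a single-byte key there are only 256 possibilities to try, so this keeps casual readers out but does not stop anyone who holds a dump and is willing to guess.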
If the purpose of encryption is for financial or medical data transmission security, or something of a higher order, you may want to implement a stronger type of security such as SSL or PGP or some other type of public/private key process.
You could create a schema that contains views of the data without the sensitive data and have the users use that schema for their needs, assuming it is basically used to view or report on the data.
Just some thoughts.
Michael Black
> Date: Thu, 16 Apr 2009 15:40:12 -0400
> From: wmoran@potentialtech.com
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] Looking for advice on database encryption
On Thu Apr 16 05:06 PM, Bill Moran wrote:
> The problem comes when the company head wants to search through the
> database to find out which employee has a specific SSN. He should be
> able to do so, since he has access to everything, but the logistics of
> doing so in a reasonable amount of time are rather complex and very
> time consuming. On a million rows with the SSN unencrypted, such a
> query would take less than a second with an appropriate index, but
> pulling those million rows into the application in order to decrypt
> each one and see if it matches can easily take a half hour or longer.
>
> That's where we're having difficulty. Our requirements are that the
> data must be strongly protected, but the appropriate people must be
> able to do (often complex) searches on it that complete in record time.
Would storing a one-way hash of the SSN work for you? i.e. combine sha1
and/or md5, use a salt...
SELECT ssn_encrypted FROM employees WHERE ssn_hash =
yourhashmethod(SSN_PLAINTEXT)
So you have both an encrypted version of the SSN and a one-way hash of it.
That's how we store credit card numbers.
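One way to read the "encrypted value plus one-way hash" layout above is sketched below in Python. This is an illustration under stated assumptions, not the poster's actual method: HMAC-SHA256 is swapped in for the sha1/md5-plus-salt combination, a single secret key (INDEX_KEY, hypothetical) plays the role of the salt because a different random salt per row would make the equality lookup impossible, and a dict stands in for the employees table.

```python
import hashlib
import hmac

def ssn_lookup_hash(ssn: str, index_key: bytes) -> str:
    """Keyed, deterministic digest of a normalized SSN for equality lookups."""
    normalized = ssn.replace("-", "").strip()
    return hmac.new(index_key, normalized.encode("ascii"), hashlib.sha256).hexdigest()

# Hypothetical secret; in practice it would have to live outside the database.
INDEX_KEY = b"hypothetical-secret-kept-outside-the-database"

# Toy stand-in for the table: ssn_hash -> ssn_encrypted (an opaque blob here).
rows = {
    ssn_lookup_hash("123-45-6789", INDEX_KEY): b"<ciphertext for employee 1>",
    ssn_lookup_hash("987-65-4321", INDEX_KEY): b"<ciphertext for employee 2>",
}

def find_encrypted_ssn(ssn_plaintext: str):
    # Roughly: SELECT ssn_encrypted FROM employees WHERE ssn_hash = <digest>
    return rows.get(ssn_lookup_hash(ssn_plaintext, INDEX_KEY))

print(find_encrypted_ssn("123456789"))    # found, regardless of dashes
print(find_encrypted_ssn("000-00-0000"))  # no such row
```

The hash column can be indexed like any other column, so the lookup stays fast. It only supports exact matches, and anyone who obtains the key can test guesses against the column.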
> > That's where we're having difficulty. Our requirements are that the
> > data must be strongly protected, but the appropriate people must be
> > able to do (often complex) searches on it that complete in record
> > time.
>
> an index on the encrypted SSN field would do this just fine. if
> authorized person needs to find the record with a specific SSN, they
> encrypt that SSN and then look up the ciphertext in the database...
> done.
This will only work for e(lectronic?) code book ciphers, and not
chained block ciphers, since the initialization vector will randomize
the output of the encryption so that E(foo) != E(foo) just to prevent
this sort of attack.
You're looking for a hash function, since that's a one way, stable
function meant for comparing.
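The difference between randomized encryption and a stable function can be seen in a small Python sketch. The cipher here is a deliberately simple toy (a SHA-256-based keystream with a random nonce), standing in for any scheme that uses a random IV; it is not AES-CBC and is not meant for real use. All names and the key are hypothetical.

```python
import hashlib
import hmac
import os

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Randomized encryption: a fresh 16-byte nonce is prepended to the output."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))

def toy_decrypt(key: bytes, stored: bytes) -> bytes:
    """Split off the stored nonce, rebuild the keystream, and undo the XOR."""
    nonce, body = stored[:16], stored[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))

def stable_digest(key: bytes, plaintext: bytes) -> bytes:
    """Deterministic keyed hash: the same input always gives the same output."""
    return hmac.new(key, plaintext, hashlib.sha256).digest()

key = b"hypothetical key"
a = toy_encrypt(key, b"foo")
b = toy_encrypt(key, b"foo")
print(a == b)  # equal only if two random 16-byte nonces happen to collide
print(toy_decrypt(key, a) == toy_decrypt(key, b) == b"foo")
print(stable_digest(key, b"foo") == stable_digest(key, b"foo"))
```

Because the two ciphertexts differ, a freshly encrypted search term cannot be matched against stored values, whereas the digest can.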
eric
Eric Soroos wrote:
> > an index on the encrypted SSN field would do this just fine. if
> > authorized person needs to find the record with a specific SSN, they
> > encrypt that SSN and then look up the ciphertext in the database...
> > done.
>
> This will only work for e(lectronic?) code book ciphers, and not
> chained block ciphers, since the initialization vector will randomize
> the output of the encryption so that E(foo) != E(foo) just to prevent
> this sort of attack.
Can those sorts of chained block ciphers decode blocks in a different
order than they were originally encoded? For this sort of
application, wouldn't each field or record pretty much have to be
encrypted discretely so that they can be decrypted in any order, or so
that any single record can be decrypted on its own?
Thomas Kellerer <spam_eater@gmx.net> wrote:
> Bill Moran wrote on 16.04.2009 23:06:
> > > which only talks about someone getting hold of the contents of the server's
> > > harddisk.
> >
> > Not really. You're making an assumption that a pg_dump can only be
> > run on the server itself.
>
> Right, I forgot that.
> But then it's similar to the situation where the user displays the data and
> walks away with the screenshot...
Actually, it's completely different. If a user walks away with a screenshot
of data that they had access to anyway, then the application developer is
not culpable.
However, if a flaw is found in the application and a user can use it to
gain escalated privs and access data that would normally not be available,
the application developer is going out of business.
If a user finds a flaw, but it simply results in an error because the layer
of security behind it prevents an information leak, then the application
developer doesn't look very bad at all. Layered security saves the day!
> If you have an application server sitting in the middle you can limit
> connections to the database to the app server itself. Or even put the appserver
> on the same box as the database server and limit connections only to localhost.
> In that case the attacker needs to be able to log-in to the server directly.
You're assuming that the application is perfect. With the data we're
protecting, we don't have that luxury.
This isn't a particularly new view of security. CERT has hundreds of pages
documented on how this is correct security practice. If it weren't, there
wouldn't need to be firewalls between Windows servers and the Internet.
The part that's unique (from my experience) is the demand that the data
be so readily accessible. Usually, highly secure data is understood to
be difficult to access, but that understanding doesn't exist in this
market. It's an unreasonable expectation on the part of our clients, to
be honest, but if we can find a way to meet it, we leave the competition
in the dust.
Thanks for the feedback so far.
--
Bill Moran
http://www.potentialtech.com
"Will Rutherdale (rutherw)" <rutherw@cisco.com> wrote:
> Couldn't you just add a PGP based column (or similar encryption
> protocol) for authentication? This would protect you against injection
> attacks, would it not?
>
> You could also use PGP or similar for key management if I'm not
> mistaken.
Thanks for the input, Will. We're already doing this; the problem we've
had is that the time to decrypt the data is making access too slow.
Basically, administrators need to be able to say, "show me all the
registrants whose personal medical information is x" and get results in
a reasonable amount of time. Decrypting the data to do the matching is
about 100x slower than a typical seq scan.
To give you an idea of what we've tried, I've tried pgcrypto, openssl with
rc4, des and 3des, using envelope encryption, and raw aes-128 symmetrical
encryption. In addition, we've purchased two different hardware
accelerators for crypto to find that both of them are slower than the
CPU itself, and they're both the high-end "enterprise" class cards.
--
Bill Moran
http://www.potentialtech.com
Michael Black <michaelblack75052@hotmail.com> wrote:
> If the purpose of encryption is for financial or medical data transmission security, or something of a higher order, you may want to implement a stronger type of security such as SSL or PGP or some other type of public/private key process.
> You could create a schema that contains views of the data without the sensitive data and have the users use that schema for their needs, assumes that it basically used to view or report on the data.
Thanks for the input, Michael. We're already working on using PKI; the
big problem we're having is the speed of access when an administrator
needs to search through the encrypted data.
--
Bill Moran
http://www.potentialtech.com
John R Pierce <pierce@hogranch.com> wrote:
> Eric Soroos wrote:
> > > an index on the encrypted SSN field would do this just fine. if
> > > authorized person needs to find the record with a specific SSN, they
> > > encrypt that SSN and then look up the ciphertext in the database...
> > > done.
> >
> > This will only work for e(lectronic?) code book ciphers, and not
> > chained block ciphers, since the initialization vector will randomize
> > the output of the encryption so that E(foo) != E(foo) just to prevent
> > this sort of attack.
>
> can those sorts of chained block ciphers decode blocks in a different
> order than they were originally encoded? for this sort of
> application, wouldn't each field or record pretty much have to be
> encrypted discretely so that they can be decrypted in any order, or any
> single record be decrypted on its own?
Eric is right about CBC ciphers. The problem is that any function that
will produce the same output for the same input (such as md5 or sha) leaves
us open to brute force attacks if the number of choices is small, or
pattern discovery attacks in other cases. And anything that protects
us against such attacks (such as aes-cbc) will generate data that I
can't pre-encrypt and search against.
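To make the brute-force concern concrete, here is a small Python sketch. It assumes an unsalted, unkeyed SHA-256 over a short numeric value; the 4-digit value and the function names are hypothetical, chosen so the loop finishes instantly. A 9-digit SSN has at most 10^9 candidates, which is the same attack on a larger but still small search space.

```python
import hashlib

def unsalted_hash(value: str) -> str:
    """Plain SHA-256 with no salt or key: same input, same output, for anyone."""
    return hashlib.sha256(value.encode("ascii")).hexdigest()

def brute_force(target_hash: str, digits: int):
    """Try every zero-padded number with the given number of digits."""
    for n in range(10 ** digits):
        candidate = str(n).zfill(digits)
        if unsalted_hash(candidate) == target_hash:
            return candidate
    return None

leaked = unsalted_hash("4821")  # what an attacker would see in a dump
print(brute_force(leaked, 4))   # recovers the original value by enumeration
```

A secret key or salt held outside the database raises the bar only as long as it stays secret, and equal inputs still produce equal outputs, so repeated values remain visible as a pattern.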
I haven't tried it, but I don't believe CBC ciphers can decrypt data out
of order.
In the implementation I've built, the IV is stored with the ciphertext,
much the same way that crypt() stores the salt with the password hash.
As a result, if you have the key, you then have all the data required
to decrypt the field, but you can't easily brute force it or do any
pattern analysis.
--
Bill Moran
http://www.potentialtech.com