Index on a Decrypt / Bytea2Text Function

Started by Anthony Presleyalmost 16 years ago8 messagesgeneral
Jump to latest
#1Anthony Presley
anthony@resolution.com

Hi all,

We tend to do a lot of lookups on our database that look something like:

select
e.id
from
employee e ,app_user au
where
au.id=user_id and
au.corporation_id=$1 and
e.ssn is not null and
e.ssn!=' ' and
e.ssn!='' and
e.deleted='N'and
bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
'bf'))=$2

The analyze here looks like:

explain analyze select e.id from employee e ,app_user au where

au.id=user_id and au.corporation_id=41197 and e.ssn is not null and
e.ssn!=' ' and e.ssn!='' and e.deleted='N'and
bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
'bf'))='188622250';

QUERY
PLAN
--------------------------------------------------------------------------
Nested Loop (cost=0.00..19282.05 rows=122 width=8) (actual
time=24.591..192.435 rows=1 loops=1)
-> Index Scan using emp_del on employee e (cost=0.00..18625.99
rows=122 width=16) (actual time=24.556..192.398 rows=1 loops=1)
Index Cond: (deleted = 'N'::bpchar)
Filter: ((ssn IS NOT NULL) AND (ssn <> ' '::text) AND (ssn <>
''::text) AND (bytea2text(decrypt(decode(ssn, 'hex'::text),
text2bytea((id)::text), 'bf'::text)) = '188622250'::text))
-> Index Scan using app_user_pkey on app_user au (cost=0.00..5.36
rows=1 width=8) (actual time=0.032..0.033 rows=1 loops=1)
Index Cond: (au.id = e.user_id)
Filter: (au.corporation_id = 41197)
Total runtime: 192.565 ms
(8 rows)

It would appear that almost 100% of this time is taken up by doing the
bytea2text and decrypt() functions.

How would I create an index based on the results of the decrypt and
bytea2text function to improve this select statement?

Thanks!

--
Anthony

#2Thom Brown
thombrown@gmail.com
In reply to: Anthony Presley (#1)
Re: Index on a Decrypt / Bytea2Text Function

On 14 July 2010 20:23, Anthony Presley <anthony@resolution.com> wrote:

Hi all,

We tend to do a lot of lookups on our database that look something like:

select
       e.id
from
employee e ,app_user au
       where
au.id=user_id and
au.corporation_id=$1 and
e.ssn is not null and
e.ssn!=' ' and
e.ssn!='' and
e.deleted='N'and
bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
'bf'))=$2

The analyze here looks like:

explain analyze select e.id from employee e ,app_user au where

au.id=user_id and au.corporation_id=41197 and  e.ssn is not null and
e.ssn!=' ' and e.ssn!='' and e.deleted='N'and
bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
'bf'))='188622250';

QUERY
PLAN
--------------------------------------------------------------------------
 Nested Loop  (cost=0.00..19282.05 rows=122 width=8) (actual
time=24.591..192.435 rows=1 loops=1)
  ->  Index Scan using emp_del on employee e  (cost=0.00..18625.99
rows=122 width=16) (actual time=24.556..192.398 rows=1 loops=1)
        Index Cond: (deleted = 'N'::bpchar)
        Filter: ((ssn IS NOT NULL) AND (ssn <> ' '::text) AND (ssn <>
''::text) AND (bytea2text(decrypt(decode(ssn, 'hex'::text),
text2bytea((id)::text), 'bf'::text)) = '188622250'::text))
  ->  Index Scan using app_user_pkey on app_user au  (cost=0.00..5.36
rows=1 width=8) (actual time=0.032..0.033 rows=1 loops=1)
        Index Cond: (au.id = e.user_id)
        Filter: (au.corporation_id = 41197)
 Total runtime: 192.565 ms
(8 rows)

It would appear that almost 100% of this time is taken up by doing the
bytea2text and decrypt() functions.

How would I create an index based on the results of the decrypt and
bytea2text function to improve this select statement?

Thanks!

--
Anthony

Would the following work?:

CREATE INDEX idx_employee_functional ON employee
(bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS
text)),'bf'))

Thom

#3Bill Moran
wmoran@potentialtech.com
In reply to: Anthony Presley (#1)
Re: Index on a Decrypt / Bytea2Text Function

In response to Anthony Presley <anthony@resolution.com>:

Hi all,

We tend to do a lot of lookups on our database that look something like:

select
e.id
from
employee e ,app_user au
where
au.id=user_id and
au.corporation_id=$1 and
e.ssn is not null and
e.ssn!=' ' and
e.ssn!='' and
e.deleted='N'and
bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
'bf'))=$2

The analyze here looks like:

explain analyze select e.id from employee e ,app_user au where

au.id=user_id and au.corporation_id=41197 and e.ssn is not null and
e.ssn!=' ' and e.ssn!='' and e.deleted='N'and
bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
'bf'))='188622250';

QUERY
PLAN
--------------------------------------------------------------------------
Nested Loop (cost=0.00..19282.05 rows=122 width=8) (actual
time=24.591..192.435 rows=1 loops=1)
-> Index Scan using emp_del on employee e (cost=0.00..18625.99
rows=122 width=16) (actual time=24.556..192.398 rows=1 loops=1)
Index Cond: (deleted = 'N'::bpchar)
Filter: ((ssn IS NOT NULL) AND (ssn <> ' '::text) AND (ssn <>
''::text) AND (bytea2text(decrypt(decode(ssn, 'hex'::text),
text2bytea((id)::text), 'bf'::text)) = '188622250'::text))
-> Index Scan using app_user_pkey on app_user au (cost=0.00..5.36
rows=1 width=8) (actual time=0.032..0.033 rows=1 loops=1)
Index Cond: (au.id = e.user_id)
Filter: (au.corporation_id = 41197)
Total runtime: 192.565 ms
(8 rows)

It would appear that almost 100% of this time is taken up by doing the
bytea2text and decrypt() functions.

How would I create an index based on the results of the decrypt and
bytea2text function to improve this select statement?

The best way would be to unencrypt the column and use a normal index.

Since you're simply using a value in another column as the key anyway,
your design has created all the performance headaches of encryption
with no actual security.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

#4Thom Brown
thombrown@gmail.com
In reply to: Bill Moran (#3)
Re: Index on a Decrypt / Bytea2Text Function

On 14 July 2010 20:32, Bill Moran <wmoran@potentialtech.com> wrote:

In response to Anthony Presley <anthony@resolution.com>:

Hi all,

We tend to do a lot of lookups on our database that look something like:

select
      e.id
from
employee e ,app_user au
      where
au.id=user_id and
au.corporation_id=$1 and
e.ssn is not null and
e.ssn!=' ' and
e.ssn!='' and
e.deleted='N'and
bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
'bf'))=$2

The analyze here looks like:

explain analyze select e.id from employee e ,app_user au where

au.id=user_id and au.corporation_id=41197 and  e.ssn is not null and
e.ssn!=' ' and e.ssn!='' and e.deleted='N'and
bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
'bf'))='188622250';

QUERY
PLAN
--------------------------------------------------------------------------
 Nested Loop  (cost=0.00..19282.05 rows=122 width=8) (actual
time=24.591..192.435 rows=1 loops=1)
   ->  Index Scan using emp_del on employee e  (cost=0.00..18625.99
rows=122 width=16) (actual time=24.556..192.398 rows=1 loops=1)
         Index Cond: (deleted = 'N'::bpchar)
         Filter: ((ssn IS NOT NULL) AND (ssn <> ' '::text) AND (ssn <>
''::text) AND (bytea2text(decrypt(decode(ssn, 'hex'::text),
text2bytea((id)::text), 'bf'::text)) = '188622250'::text))
   ->  Index Scan using app_user_pkey on app_user au  (cost=0.00..5.36
rows=1 width=8) (actual time=0.032..0.033 rows=1 loops=1)
         Index Cond: (au.id = e.user_id)
         Filter: (au.corporation_id = 41197)
 Total runtime: 192.565 ms
(8 rows)

It would appear that almost 100% of this time is taken up by doing the
bytea2text and decrypt() functions.

How would I create an index based on the results of the decrypt and
bytea2text function to improve this select statement?

The best way would be to unencrypt the column and use a normal index.

Since you're simply using a value in another column as the key anyway,
your design has created all the performance headaches of encryption
with no actual security.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

--

Yes, I immediately thought about what's actually happening as soon as
I sent the last message. Forget the functional index.

Thom

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thom Brown (#2)
Re: Index on a Decrypt / Bytea2Text Function

Thom Brown <thombrown@gmail.com> writes:

On 14 July 2010 20:23, Anthony Presley <anthony@resolution.com> wrote:

select
e.id
from
employee e ,app_user au
where
au.id=user_id and
au.corporation_id=$1 and
e.ssn is not null and
e.ssn!=' ' and
e.ssn!='' and
e.deleted='N'and
bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
'bf'))=$2

How would I create an index based on the results of the decrypt and
bytea2text function to improve this select statement?

Would the following work?:

CREATE INDEX idx_employee_functional ON employee
(bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS
text)),'bf'))

That would work as far as speeding up the query goes. However, as Bill
Moran points out nearby, the query reveals a totally incompetent
security design. There is no value to speak of in encrypting a data
value and then storing the decryption key right beside it. Perhaps the
excuse is to not have the SSN in cleartext on disk, nevermind whether a
halfway competent attacker could get it back --- but even with that
barely-useful goal, you do *not* want an index like this, because all
the index entries will be cleartext SSNs.

What you really need is to take two steps back and figure out why you
want to encrypt this data and what threats you intend to protect
against. It's probably possible to make a credibly-secure design that
runs faster than this does, but there's no point at all in improving
the performance of a fundamentally broken design.

regards, tom lane

#6Anthony Presley
anthony@resolution.com
In reply to: Tom Lane (#5)
Re: Index on a Decrypt / Bytea2Text Function

On Wed, 2010-07-14 at 15:56 -0400, Tom Lane wrote:

Thom Brown <thombrown@gmail.com> writes:

On 14 July 2010 20:23, Anthony Presley <anthony@resolution.com> wrote:

select
e.id
from
employee e ,app_user au
where
au.id=user_id and
au.corporation_id=$1 and
e.ssn is not null and
e.ssn!=' ' and
e.ssn!='' and
e.deleted='N'and
bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
'bf'))=$2

How would I create an index based on the results of the decrypt and
bytea2text function to improve this select statement?

Would the following work?:

CREATE INDEX idx_employee_functional ON employee
(bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS
text)),'bf'))

That would work as far as speeding up the query goes. However, as Bill
Moran points out nearby, the query reveals a totally incompetent
security design. There is no value to speak of in encrypting a data
value and then storing the decryption key right beside it. Perhaps the
excuse is to not have the SSN in cleartext on disk, nevermind whether a
halfway competent attacker could get it back --- but even with that
barely-useful goal, you do *not* want an index like this, because all
the index entries will be cleartext SSNs.

What you really need is to take two steps back and figure out why you
want to encrypt this data and what threats you intend to protect
against. It's probably possible to make a credibly-secure design that
runs faster than this does, but there's no point at all in improving
the performance of a fundamentally broken design.

regards, tom lane

Yes, you are right ... the security here serves no purpose other than to
not have SSN's stored on disk in an un-encrypted way. Unfortunately, we
need to be able to easily, and quickly, reverse the security, so that we
can get access to unencrypted data ... because our application does
export to payroll providers, and many of them still use SSN's as keys.
IE, storing the SSN in an encrypted manner (or using a one-way salt),
won't work.

IE, the reality is that our application has to be able to show / hide
the SSN, so someone breaking into the application (it is likely easier
to steal your manager's password than it would be to hack into the
server), would be able to access the data.

Even if we wanted to tackle *real* security here, I'm not sure how we'd
go about it. Encrypting any data on a web app would require that the
encryption key and/or salt be stored in some combination of the
database, or app code, which is all vulnerable if someone breaks into
and/or steals the server. There isn't a "client" piece, like you'd have
with Carbonite, etc...

--
Anthony

#7Anthony Presley
anthony@resolution.com
In reply to: Thom Brown (#2)
Re: Index on a Decrypt / Bytea2Text Function

On Wed, 2010-07-14 at 20:32 +0100, Thom Brown wrote:

On 14 July 2010 20:23, Anthony Presley <anthony@resolution.com> wrote:

Hi all,

We tend to do a lot of lookups on our database that look something like:

select
e.id
from
employee e ,app_user au
where
au.id=user_id and
au.corporation_id=$1 and
e.ssn is not null and
e.ssn!=' ' and
e.ssn!='' and
e.deleted='N'and
bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
'bf'))=$2

The analyze here looks like:

explain analyze select e.id from employee e ,app_user au where

au.id=user_id and au.corporation_id=41197 and e.ssn is not null and
e.ssn!=' ' and e.ssn!='' and e.deleted='N'and
bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
'bf'))='188622250';

QUERY
PLAN
--------------------------------------------------------------------------
Nested Loop (cost=0.00..19282.05 rows=122 width=8) (actual
time=24.591..192.435 rows=1 loops=1)
-> Index Scan using emp_del on employee e (cost=0.00..18625.99
rows=122 width=16) (actual time=24.556..192.398 rows=1 loops=1)
Index Cond: (deleted = 'N'::bpchar)
Filter: ((ssn IS NOT NULL) AND (ssn <> ' '::text) AND (ssn <>
''::text) AND (bytea2text(decrypt(decode(ssn, 'hex'::text),
text2bytea((id)::text), 'bf'::text)) = '188622250'::text))
-> Index Scan using app_user_pkey on app_user au (cost=0.00..5.36
rows=1 width=8) (actual time=0.032..0.033 rows=1 loops=1)
Index Cond: (au.id = e.user_id)
Filter: (au.corporation_id = 41197)
Total runtime: 192.565 ms
(8 rows)

It would appear that almost 100% of this time is taken up by doing the
bytea2text and decrypt() functions.

How would I create an index based on the results of the decrypt and
bytea2text function to improve this select statement?

Thanks!

--
Anthony

Would the following work?:

CREATE INDEX idx_employee_functional ON employee
(bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS
text)),'bf'))

Thom

Unfortunately, that doesn't work:

# CREATE INDEX idx_employee_functional ON employee
(bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS
text)),'bf')));
ERROR: functions in index expression must be marked IMMUTABLE

Guess we'll need to come up with something else.

--
Anthony

#8Bill Moran
wmoran@potentialtech.com
In reply to: Anthony Presley (#6)
Re: Index on a Decrypt / Bytea2Text Function

In response to Anthony Presley <anthony@resolution.com>:

On Wed, 2010-07-14 at 15:56 -0400, Tom Lane wrote:

Thom Brown <thombrown@gmail.com> writes:

On 14 July 2010 20:23, Anthony Presley <anthony@resolution.com> wrote:

select
e.id
from
employee e ,app_user au
where
au.id=user_id and
au.corporation_id=$1 and
e.ssn is not null and
e.ssn!=' ' and
e.ssn!='' and
e.deleted='N'and
bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
'bf'))=$2

How would I create an index based on the results of the decrypt and
bytea2text function to improve this select statement?

Would the following work?:

CREATE INDEX idx_employee_functional ON employee
(bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS
text)),'bf'))

That would work as far as speeding up the query goes. However, as Bill
Moran points out nearby, the query reveals a totally incompetent
security design. There is no value to speak of in encrypting a data
value and then storing the decryption key right beside it. Perhaps the
excuse is to not have the SSN in cleartext on disk, nevermind whether a
halfway competent attacker could get it back --- but even with that
barely-useful goal, you do *not* want an index like this, because all
the index entries will be cleartext SSNs.

What you really need is to take two steps back and figure out why you
want to encrypt this data and what threats you intend to protect
against. It's probably possible to make a credibly-secure design that
runs faster than this does, but there's no point at all in improving
the performance of a fundamentally broken design.

regards, tom lane

Yes, you are right ... the security here serves no purpose other than to
not have SSN's stored on disk in an un-encrypted way. Unfortunately, we
need to be able to easily, and quickly, reverse the security, so that we
can get access to unencrypted data ... because our application does
export to payroll providers, and many of them still use SSN's as keys.
IE, storing the SSN in an encrypted manner (or using a one-way salt),
won't work.

IE, the reality is that our application has to be able to show / hide
the SSN, so someone breaking into the application (it is likely easier
to steal your manager's password than it would be to hack into the
server), would be able to access the data.

Even if we wanted to tackle *real* security here, I'm not sure how we'd
go about it. Encrypting any data on a web app would require that the
encryption key and/or salt be stored in some combination of the
database, or app code, which is all vulnerable if someone breaks into
and/or steals the server. There isn't a "client" piece, like you'd have
with Carbonite, etc...

You need to do more research on this. I understand that bosses are
arbitrarily requiring "encrypt the SSN" without understanding what they're
asking for, but doing it half-assed like this is irresponsible to the
point of being criminal.

As Tom says, first identify what attack vectors you're protecting from.

If you just want to protect the data if the server if physically stolen,
disk encryption of the partition where PG has it's data files is
probably your best bet, and on most OSen is pretty easy to set up. The
only headache there is someone has to manually enter the disk passphrase
any time the system is rebooted.

If you want to protect from other attack vectors, such as SQL injections,
it's a little trickier, but still doable. The complexity depends on
the rules of your access model.

In the simplest case, you generate a shared secret that every SSN is
encrypted with. You don't keep the shared secret anywhere in the DB.
But when you create users that need access to the SSNs, you create a
copy of the shared secret that's encrypted with their password. Now,
when they log in, they can decrypt the shared secret and get access to
the SSNs.

It even makes your application look better, because when they go to access
the SSNs, a screen pops up saying "please enter your password again to
verify that your session hasn't been hijacked" ... which you can tout
as a benefit to the users.

If your access rules are more complex (which it doesn't seem that they
are, but I only know so much) the implementation gets more complex, with
hierarchies of shared secrets and interesting grant/revoke routines --
maybe even PKI -- but at it's core, it's the same setup as what I just
described with additional layers.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/