Key encryption and relational integrity

Started by Moreno Andreoabout 7 years ago25 messagesgeneral
Jump to latest
#1Moreno Andreo
moreno.andreo@evolu-s.it

Hello folks :-)

Is there any workaround to implement key encryption without breaking
relational integrity?

Thanks

Moreno.-

#2Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Moreno Andreo (#1)
Re: Key encryption and relational integrity

On 3/26/19 7:19 AM, Moreno Andreo wrote:

Hello folks :-)

Is there any workaround to implement key encryption without breaking
relational integrity?

This is going to need more information. For starters 'key' has separate
meanings for encryption and RI. I could make some guesses about what you
want, but to avoid false assumptions a simple example would be helpful.

Thanks

Moreno.-

--
Adrian Klaver
adrian.klaver@aklaver.com

#3Moreno Andreo
moreno.andreo@evolu-s.it
In reply to: Adrian Klaver (#2)
Re: Key encryption and relational integrity

Il 26/03/2019 15:24, Adrian Klaver ha scritto:

On 3/26/19 7:19 AM, Moreno Andreo wrote:

Hello folks :-)

Is there any workaround to implement key encryption without breaking
relational integrity?

This is going to need more information.

OK, I'll try to be as clearer as I can

For starters 'key' has separate meanings for encryption and RI. I
could make some guesses about what you want, but to avoid false
assumptions a simple example would be helpful.

In a master-detail relation, I need to encrypt one of master table PK or
detail table FK, in order to achieve pseudonimization, required by GDPR
in Europe when managing particular data
Imagine I have
Table users
idᅵᅵ surnameᅵᅵᅵ last name
1ᅵᅵᅵ Johnᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Doe
2ᅵᅵᅵ Janeᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Doe
3ᅵᅵᅵ Fooᅵᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Bar

Table medications
idᅵᅵᅵ user_idᅵᅵᅵ med
1ᅵᅵᅵᅵ 1ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Medication
2ᅵᅵᅵᅵ 1ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Ear check
...
...
medications.user_id is FK on users.id
we should achieve

Table medications
idᅵᅵᅵ user_idᅵᅵ ᅵᅵ ᅵ med
1ᅵᅵᅵ sgkighs98ᅵᅵᅵ Medication
2ᅵᅵᅵ sghighs98ᅵᅵᅵ Ear check

or the opposite (users.id encryption and medications.user_id kept plain)

At a first glance, it IS breaking relational integrity, so is there a
way to manage this encryption internally so RI is kept safe?

Thanks

Moreno.-

#4Kevin Brannen
KBrannen@efji.com
In reply to: Moreno Andreo (#3)
RE: Key encryption and relational integrity

-----Original Message-----
From: Moreno Andreo <moreno.andreo@evolu-s.it>
Sent: Tuesday, March 26, 2019 11:09 AM
To: Adrian Klaver <adrian.klaver@aklaver.com>; PostgreSQL mailing lists <pgsql-general@postgresql.org>
Subject: Re: Key encryption and relational integrity

In a master-detail relation, I need to encrypt one of master table PK or detail table FK, in order to achieve pseudonimization, required by GDPR in Europe when managing particular data Imagine I have Table users id surname last name
1 John Doe
2 Jane Doe
3 Foo Bar

Table medications
id user_id med
1 1 Medication
2 1 Ear check
...
...
medications.user_id is FK on users.id
we should achieve

Table medications
id user_id med
1 sgkighs98 Medication
2 sghighs98 Ear check

or the opposite (users.id encryption and medications.user_id kept plain)

At a first glance, it IS breaking relational integrity, so is there a way to manage this encryption internally so RI is kept safe?

===

To me, this is really more of an application question, not a DB question. When you dump or share the data, do your randomization then. If you want to do it in the DB, then add an extra column to your user table, let's call it "pseudonym" and when you insert a new user, fill that in with your randomization string. Then never dump or use the real name, but use the pseudonym column. Better if you can do it, just don't store the real name (though your application may not let you do that).

Honestly, dumping the user as an internal and meaningless number (user_id) seems pretty safe to me, but perhaps your GDPR doesn't allow for that.

HTH,
Kevin
This e-mail transmission, and any documents, files or previous e-mail messages attached to it, may contain confidential information. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are hereby notified that any disclosure, distribution, review, copy or use of any of the information contained in or attached to this message is STRICTLY PROHIBITED. If you have received this transmission in error, please immediately notify us by reply e-mail, and destroy the original transmission and its attachments without reading them or saving them to disk. Thank you.

#5Michel Pelletier
pelletier.michel@gmail.com
In reply to: Kevin Brannen (#4)
Re: Key encryption and relational integrity

On Tue, Mar 26, 2019 at 9:39 AM Kevin Brannen <KBrannen@efji.com> wrote:

-----Original Message-----
From: Moreno Andreo <moreno.andreo@evolu-s.it>
Sent: Tuesday, March 26, 2019 11:09 AM
To: Adrian Klaver <adrian.klaver@aklaver.com>; PostgreSQL mailing lists <
pgsql-general@postgresql.org>
Subject: Re: Key encryption and relational integrity

In a master-detail relation, I need to encrypt one of master table PK or
detail table FK, in order to achieve pseudonimization, required by GDPR in
Europe when managing particular data Imagine I have Table users id
surname last name
1 John Doe
2 Jane Doe
3 Foo Bar

Table medications
id user_id med
1 1 Medication
2 1 Ear check
...
...
medications.user_id is FK on users.id
we should achieve

Table medications
id user_id med
1 sgkighs98 Medication
2 sghighs98 Ear check

or the opposite (users.id encryption and medications.user_id kept plain)

At a first glance, it IS breaking relational integrity, so is there a way
to manage this encryption internally so RI is kept safe?

===

To me, this is really more of an application question, not a DB question.
When you dump or share the data, do your randomization then. If you want to
do it in the DB, then add an extra column to your user table, let's call it
"pseudonym" and when you insert a new user, fill that in with your
randomization string. Then never dump or use the real name, but use the
pseudonym column. Better if you can do it, just don't store the real name
(though your application may not let you do that).

Agreed, OP could also encrypt the name columns with pgcrypto and keep the
key out of the database.

-Michel

#6Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Moreno Andreo (#3)
Re: Key encryption and relational integrity

On 3/26/19 9:08 AM, Moreno Andreo wrote:

Il 26/03/2019 15:24, Adrian Klaver ha scritto:

On 3/26/19 7:19 AM, Moreno Andreo wrote:

Hello folks :-)

Is there any workaround to implement key encryption without breaking
relational integrity?

This is going to need more information.

OK, I'll try to be as clearer as I can

For starters 'key' has separate meanings for encryption and RI. I
could make some guesses about what you want, but to avoid false
assumptions a simple example would be helpful.

In a master-detail relation, I need to encrypt one of master table PK or
detail table FK, in order to achieve pseudonimization, required by GDPR
in Europe when managing particular data
Imagine I have
Table users
idᅵᅵ surnameᅵᅵᅵ last name
1ᅵᅵᅵ Johnᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Doe
2ᅵᅵᅵ Janeᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Doe
3ᅵᅵᅵ Fooᅵᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Bar

Table medications
idᅵᅵᅵ user_idᅵᅵᅵ med
1ᅵᅵᅵᅵ 1ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Medication
2ᅵᅵᅵᅵ 1ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Ear check
...
...
medications.user_id is FK on users.id
we should achieve

Table medications
idᅵᅵᅵ user_idᅵᅵ ᅵᅵ ᅵ med
1ᅵᅵᅵ sgkighs98ᅵᅵᅵ Medication
2ᅵᅵᅵ sghighs98ᅵᅵᅵ Ear check

or the opposite (users.id encryption and medications.user_id kept plain)

At a first glance, it IS breaking relational integrity, so is there a
way to manage this encryption internally so RI is kept safe?

Not that I know of. RI is based on maintaining a link between parent and
child. So by definition you would be able to get to the parent record
via the child.

A quick search on pseudonymisation found a boatload of interpretations
of how to implement this:

"Pseudonymisation' means the processing of personal data in such a
manner that the personal data can no longer be attributed to a specific
data subject without the use of additional information, provided that
such additional information is kept separately and is subject to
technical and organisational measures to ensure that the personal data
are not attributed to an identified or identifiable natural person."

To me it would seem something like:

Table medications
id user_id med
1 sgkighs98 Medication
2 sghighs98 Ear check

Table users
id surname last name
sgkighs98 John Doe
jkopkl1 Jane Doe
uepoti21 Foo Bar

Where there is no direct link between the two. Instead permissions would
prevent linking from medications to users even via a SELECT. One could
also use pgcrypto:

https://www.postgresql.org/docs/10/pgcrypto.html

on the users table to further hide the personal info.

*NOTE* I am not a lawyer so any advice on my part as to meeting legal
requirements are just me thinking out loud. I would suggest, if not
already done, getting proper legal advice on what the section quoted
above actually means.

Thanks

Moreno.-

--
Adrian Klaver
adrian.klaver@aklaver.com

#7Tony Shelver
tshelver@gmail.com
In reply to: Moreno Andreo (#1)
Re: Key encryption and relational integrity

Not in Europe, but have worked a bit with medical records systems in the
USA, including sharing across providers.

The primary key of the user is _should_ be system generated, and this is
meaningless from a user identity standpoint. If you encrypt user name and
other significant personal data on the user id record, you can query the id
column all day long, and there would be no way to identify who the user is
without the encryption key.

The only other way to do it would be to store the encrypted key value in
both user.id and medications.user_id. That would encrypt the data and
maintain relational integrity.

For US medical record systems, there is a strict act (HIPAA) that specifies
both privacy and security rules, with lists of what is regarded as
sensitive or private information, what can be shared with various providers
and outside organizations, and so on.. As far as user name goes, that is
almost never a decisive form of identification for any person in a system.

While GDPR is relatively young and untested, surely someone in your
organization (or outside) has a list of the relevant rules broken down to
specific requirements.

Also, securing the data internally on a database system MAY have very
different requirements to making that data visible to applications or
transmitting it to external parties.

Storing the user id in plain on the medications record and encrypted on the
user primary key would seem meaningless, assuming some meaning could be
derived from a system generated ID.

I would suggest you sit down with the application / legal guys generating
the requirements to see what the real rules are. if they want everything
encrypted, then automatic primary key generation in the DB has to be
invalidated, and they should provide that from the application side.

On Tue, 26 Mar 2019 at 16:19, Moreno Andreo <moreno.andreo@evolu-s.it>
wrote:

Show quoted text

Hello folks :-)

Is there any workaround to implement key encryption without breaking
relational integrity?

Thanks

Moreno.-

#8Moreno Andreo
moreno.andreo@evolu-s.it
In reply to: Tony Shelver (#7)
Re: Key encryption and relational integrity

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Il 27/03/2019 07:42, Tony Shelver ha
scritto:<br>
</div>
<blockquote type="cite"
cite="mid:CAG0dhZCLGR2YB1TckUFaP5OHwDr6UdmBc7j=g0kK6jZz-2sB8Q@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="ltr">
<div>Not in Europe, but have worked a bit with medical records
systems in the USA, including sharing across providers.<br>
</div>
<div><br>
</div>
<div>The primary key of the user is _should_ be system
generated, and this is meaningless from a user identity
standpoint.  </div>
</div>
</div>
</blockquote>
It is, I was planning to use UUID<br>
<blockquote type="cite"
cite="mid:CAG0dhZCLGR2YB1TckUFaP5OHwDr6UdmBc7j=g0kK6jZz-2sB8Q@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div>If you encrypt user name and other significant personal
data on the user id record, you can query the id column all
day long, and there would be no way to identify who the user
is without the encryption key.</div>
</div>
</div>
</blockquote>
That's the first idea that came in my mind. Unfortunately, the
customer needs to do "real time" search in personal data, so for
each key that's pressed, I need to query the whole table filtering
the encrypted value (1st layer of slowness) LIKE (2nd layer of
slowness) the value the user is typing. I ran a test on my i7-2660
with SSD and 16G RAM and on a 2500 rows table these queries last
about 2 seconds each keypress. So it's not the way to go, we have to
find something different. Same thing with clinical data, it would be
a mess because when I open the patient clinical record I need to see
all treatments, or all drugs that have been prescripted, all
diseases and so on, and it would be not-so fast if every clinical
data name (i.e. all drug names) is encrypted.<br>
<blockquote type="cite"
cite="mid:CAG0dhZCLGR2YB1TckUFaP5OHwDr6UdmBc7j=g0kK6jZz-2sB8Q@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div><br>
</div>
<div>The only other way to do it would be to store the
encrypted key value in both <a href="http://user.id&quot;
moz-do-not-send="true">user.id</a> and
medications.user_id.  That would encrypt the data and
maintain relational integrity.<br>
</div>
</div>
</div>
</blockquote>
Hmmm... if user.id and medications.user_id are the same, I can link
user with medication... and GDPR rule does not apply..... or am I
missing something?<br>
<blockquote type="cite"
cite="mid:CAG0dhZCLGR2YB1TckUFaP5OHwDr6UdmBc7j=g0kK6jZz-2sB8Q@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div><br>
</div>
<div>For US medical record systems, there is a strict act
(HIPAA) that specifies both privacy and security rules, with
lists of what is regarded as sensitive or private
information, what can be shared with various providers and
outside organizations, and so on..  As far as user name
goes, that is almost never a decisive form of identification
for any person in a system.</div>
</div>
</div>
</blockquote>
GDPR is almost the same concept, even if some rules may differ<br>
<blockquote type="cite"
cite="mid:CAG0dhZCLGR2YB1TckUFaP5OHwDr6UdmBc7j=g0kK6jZz-2sB8Q@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div><br>
</div>
<div>While GDPR is relatively young and untested, surely
someone in your organization (or outside) has a list of the
relevant rules broken down to specific requirements.<br>
</div>
</div>
</div>
</blockquote>
You sure? :-)<br>
<blockquote type="cite"
cite="mid:CAG0dhZCLGR2YB1TckUFaP5OHwDr6UdmBc7j=g0kK6jZz-2sB8Q@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div>Also, securing the data internally on a database system
MAY have very different requirements to making that data
visible to applications or transmitting it to external
parties.<br>
</div>
</div>
</div>
</blockquote>
Profiling, IMHO, has to be designed in application, here I'm trying
to find a way so <u>nobody</u> can, without the use of the
application, match a patient with their clinical records (i.e.
someone breaking into the server -- data breach)<br>
<blockquote type="cite"
cite="mid:CAG0dhZCLGR2YB1TckUFaP5OHwDr6UdmBc7j=g0kK6jZz-2sB8Q@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div><br>
</div>
<div>Storing the user id in plain on the medications record
and encrypted on the user primary key would seem
meaningless, assuming some meaning could be derived from a
system generated ID.</div>
</div>
</div>
</blockquote>
It is a system generated ID, obviously the query is more logical and
quicker if i look from master into detail, so I SELECT something
FROM medications WHERE medications.user_id = encrypt(user.id).
Encrypting the (ex-)FK appears the best solution so far, but I'm
afraid of the consequences of losing RI.<br>
<blockquote type="cite"
cite="mid:CAG0dhZCLGR2YB1TckUFaP5OHwDr6UdmBc7j=g0kK6jZz-2sB8Q@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div> </div>
<div>I would suggest you sit down with the application / legal
guys generating the requirements to see what the real rules
are.  if they want everything encrypted, then automatic
primary key generation in the DB has to be invalidated, and
they should provide that from the application side.<br>
</div>
</div>
</div>
</blockquote>
<p>Having everything encrypted would be a big performance hit. We
are trying to achieve best performance with the right masking. <br>
</p>
<p>Thanks</p>
<p>Moreno.-<br>
</p>
<br>
</body>
</html>

#9Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Moreno Andreo (#8)
Re: Key encryption and relational integrity

On 3/28/19 7:29 AM, Moreno Andreo wrote:

Il 27/03/2019 07:42, Tony Shelver ha scritto:

Not in Europe, but have worked a bit with medical records systems in
the USA, including sharing across providers.

The only other way to do it would be to store the encrypted key value
in both user.id <http://user.id&gt; and medications.user_id.  That would
encrypt the data and maintain relational integrity.

Hmmm... if user.id and medications.user_id are the same, I can link user
with medication... and GDPR rule does not apply..... or am I missing
something?

Yes the link means that someone could use the medications.user_id to
fetch the rest of the user information from the user table. Unless you
encrypted that information also, which I gather you do not want to do
for performance reasons.

--
Adrian Klaver
adrian.klaver@aklaver.com

#10Moreno Andreo
moreno.andreo@evolu-s.it
In reply to: Adrian Klaver (#6)
Re: Key encryption and relational integrity

Il 26/03/2019 18:08, Adrian Klaver ha scritto:

On 3/26/19 9:08 AM, Moreno Andreo wrote:

Il 26/03/2019 15:24, Adrian Klaver ha scritto:

On 3/26/19 7:19 AM, Moreno Andreo wrote:

Hello folks :-)

Is there any workaround to implement key encryption without
breaking relational integrity?

This is going to need more information.

OK, I'll try to be as clearer as I can

For starters 'key' has separate meanings for encryption and RI. I
could make some guesses about what you want, but to avoid false
assumptions a simple example would be helpful.

In a master-detail relation, I need to encrypt one of master table PK
or detail table FK, in order to achieve pseudonimization, required by
GDPR in Europe when managing particular data
Imagine I have
Table users
idᅵᅵ surnameᅵᅵᅵ last name
1ᅵᅵᅵ Johnᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Doe
2ᅵᅵᅵ Janeᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Doe
3ᅵᅵᅵ Fooᅵᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Bar

Table medications
idᅵᅵᅵ user_idᅵᅵᅵ med
1ᅵᅵᅵᅵ 1ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Medication
2ᅵᅵᅵᅵ 1ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Ear check
...
...
medications.user_id is FK on users.id
we should achieve

Table medications
idᅵᅵᅵ user_idᅵᅵ ᅵᅵ ᅵ med
1ᅵᅵᅵ sgkighs98ᅵᅵᅵ Medication
2ᅵᅵᅵ sghighs98ᅵᅵᅵ Ear check

or the opposite (users.id encryption and medications.user_id kept plain)

At a first glance, it IS breaking relational integrity, so is there a
way to manage this encryption internally so RI is kept safe?

Not that I know of. RI is based on maintaining a link between parent
and child. So by definition you would be able to get to the parent
record via the child.

That's what I was afraid of :-(

A quick search on pseudonymisation found a boatload of interpretations
of how to implement this:

"Pseudonymisation' means the processing of personal data in such a
manner that the personal data can no longer be attributed to a
specific data subject without the use of additional information,
provided that such additional information is kept separately and is
subject to technical and organisational measures to ensure that the
personal data are not attributed to an identified or identifiable
natural person."

To me it would seem something like:

Table medications
idᅵᅵᅵ user_idᅵᅵᅵᅵᅵᅵᅵ med
1ᅵᅵᅵ sgkighs98ᅵᅵᅵ Medication
2ᅵᅵᅵ sghighs98ᅵᅵᅵ Ear check

Table users
idᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵ surnameᅵᅵᅵ last name
sgkighs98ᅵᅵᅵᅵ Johnᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵ Doe
jkopkl1ᅵᅵᅵᅵᅵᅵ Janeᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵ Doe
uepoti21ᅵᅵᅵᅵᅵ Fooᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵ Bar

Where there is no direct link between the two.

Are you sure there isn't?... the key "sgkighs98" is present on both
tables and I can join tables on that field, so the pseudonimysation does
not apply, it's just "separation" (that was OK with the last privacy
act, but not with GDPR

The problem is not on the application side... there you can do almost
anything you want to do. The prolem is that if someone breaks in the
server (data breach) it is easy to join patients and their medications.

Instead permissions would prevent linking from medications to users
even via a SELECT. One could also use pgcrypto:

https://www.postgresql.org/docs/10/pgcrypto.html

on the users table to further hide the personal info.

That's what I used to try to encrypt first name, last name, street
address and some other fields (that would be the best solution because
RI was not impacted at all), but the customer stated that they have to
perform real-time search (like when you type in the Google search box),
and the query that has to decrypt all names and return only the ones
that begin with a certain set of characters is way too slow (tried on a
good i7 configuration, that's about 2 seconds for each key pressed on aᅵ
2500-row table). So I dropped this approach.

*NOTE* I am not a lawyer so any advice on my part as to meeting legal
requirements are just me thinking out loud. I would suggest, if not
already done, getting proper legal advice on what the section quoted
above actually means.

Relax, I'm not here to ask and then sue anyone :-)

#11Moreno Andreo
moreno.andreo@evolu-s.it
In reply to: Adrian Klaver (#9)
Re: Key encryption and relational integrity

Il 28/03/2019 15:45, Adrian Klaver ha scritto:

On 3/28/19 7:29 AM, Moreno Andreo wrote:

Il 27/03/2019 07:42, Tony Shelver ha scritto:

Not in Europe, but have worked a bit with medical records systems in
the USA, including sharing across providers.

The only other way to do it would be to store the encrypted key
value in both user.id <http://user.id&gt; and medications.user_id. 
That would encrypt the data and maintain relational integrity.

Hmmm... if user.id and medications.user_id are the same, I can link
user with medication... and GDPR rule does not apply..... or am I
missing something?

Yes the link means that someone could use the medications.user_id to
fetch the rest of the user information from the user table. Unless you
encrypted that information also, which I gather you do not want to do
for performance reasons.

Yes, but to be GDPR compliant I _have_ to remove that link. As you
reported in an earlier email, they say that you can't link patient and
medication if not with an external resource. In this case we are linking
them without an external resource....

Wandering on the web I also bumped on an article that suggested to have
the tables on 2 databases on 2 different servers with different
credentials.... Interesting, but still without RI.

I think I have to answer this question: "is it acceptable to have this
kind of master-detail structure in a database without RI?" (The example
is about 2 tables, obviously in the real world the master will have 50+
detail tables)

Thanks

Moreno.

#12Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Moreno Andreo (#11)
Aw: Re: Key encryption and relational integrity

Yes, but to be GDPR compliant I _have_ to remove that link. As you
reported in an earlier email, they say that you can't link patient and
medication if not with an external resource. In this case we are linking
them without an external resource....

I REALLY doubt that.

I believe you when you say "they say that ..." but I truly doubt that
GDPR intended to make data processing fully impractical.

(I work in the medical field)

Karsten

#13Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Karsten Hilbert (#12)
Aw: Re: Key encryption and relational integrity

I believe you when you say "they say that ..." but I truly doubt that
GDPR intended to make data processing fully impractical.

(I work in the medical field)

In Germany, that is, which usually takes things to the
extreme, for better or worse.

Karsten

#14Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Moreno Andreo (#10)
Re: Key encryption and relational integrity

On 3/28/19 10:36 AM, Moreno Andreo wrote:

Il 26/03/2019 18:08, Adrian Klaver ha scritto:

On 3/26/19 9:08 AM, Moreno Andreo wrote:

Il 26/03/2019 15:24, Adrian Klaver ha scritto:

On 3/26/19 7:19 AM, Moreno Andreo wrote:

Hello folks :-)

Is there any workaround to implement key encryption without
breaking relational integrity?

This is going to need more information.

OK, I'll try to be as clearer as I can

For starters 'key' has separate meanings for encryption and RI. I
could make some guesses about what you want, but to avoid false
assumptions a simple example would be helpful.

In a master-detail relation, I need to encrypt one of master table PK
or detail table FK, in order to achieve pseudonimization, required by
GDPR in Europe when managing particular data
Imagine I have
Table users
idᅵᅵ surnameᅵᅵᅵ last name
1ᅵᅵᅵ Johnᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Doe
2ᅵᅵᅵ Janeᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Doe
3ᅵᅵᅵ Fooᅵᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Bar

Table medications
idᅵᅵᅵ user_idᅵᅵᅵ med
1ᅵᅵᅵᅵ 1ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Medication
2ᅵᅵᅵᅵ 1ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ ᅵᅵᅵ Ear check
...
...
medications.user_id is FK on users.id
we should achieve

Table medications
idᅵᅵᅵ user_idᅵᅵ ᅵᅵ ᅵ med
1ᅵᅵᅵ sgkighs98ᅵᅵᅵ Medication
2ᅵᅵᅵ sghighs98ᅵᅵᅵ Ear check

or the opposite (users.id encryption and medications.user_id kept plain)

At a first glance, it IS breaking relational integrity, so is there a
way to manage this encryption internally so RI is kept safe?

Not that I know of. RI is based on maintaining a link between parent
and child. So by definition you would be able to get to the parent
record via the child.

That's what I was afraid of :-(

A quick search on pseudonymisation found a boatload of interpretations
of how to implement this:

"Pseudonymisation' means the processing of personal data in such a
manner that the personal data can no longer be attributed to a
specific data subject without the use of additional information,
provided that such additional information is kept separately and is
subject to technical and organisational measures to ensure that the
personal data are not attributed to an identified or identifiable
natural person."

To me it would seem something like:

Table medications
idᅵᅵᅵ user_idᅵᅵᅵᅵᅵᅵᅵ med
1ᅵᅵᅵ sgkighs98ᅵᅵᅵ Medication
2ᅵᅵᅵ sghighs98ᅵᅵᅵ Ear check

Table users
idᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵ surnameᅵᅵᅵ last name
sgkighs98ᅵᅵᅵᅵ Johnᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵ Doe
jkopkl1ᅵᅵᅵᅵᅵᅵ Janeᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵ Doe
uepoti21ᅵᅵᅵᅵᅵ Fooᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵᅵ Bar

Where there is no direct link between the two.

Are you sure there isn't?... the key "sgkighs98" is present on both
tables and I can join tables on that field, so the pseudonimysation does
not apply, it's just "separation" (that was OK with the last privacy
act, but not with GDPR

Yes but you can use permissions to make the user table is unreachable by
folks with insufficient permission.

The problem is not on the application side... there you can do almost
anything you want to do. The prolem is that if someone breaks in the
server (data breach) it is easy to join patients and their medications.

That really depends on what level of user they break in as. That is a
separate security issue. It also is the difference between
pseudonymisation and anonymization, where the latter makes the data
totally unrelated to an individuals personal information.

Instead permissions would prevent linking from medications to users
even via a SELECT. One could also use pgcrypto:

https://www.postgresql.org/docs/10/pgcrypto.html

on the users table to further hide the personal info.

That's what I used to try to encrypt first name, last name, street
address and some other fields (that would be the best solution because
RI was not impacted at all), but the customer stated that they have to
perform real-time search (like when you type in the Google search box),
and the query that has to decrypt all names and return only the ones
that begin with a certain set of characters is way too slow (tried on a
good i7 configuration, that's about 2 seconds for each key pressed on a
2500-row table). So I dropped this approach.

*NOTE* I am not a lawyer so any advice on my part as to meeting legal
requirements are just me thinking out loud. I would suggest, if not
already done, getting proper legal advice on what the section quoted
above actually means.

Relax, I'm not here to ask and then sue anyone :-)

Hey, I live in the US its just best policy to make that clear:)

--
Adrian Klaver
adrian.klaver@aklaver.com

#15Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Moreno Andreo (#10)
Re: Key encryption and relational integrity

On 2019-03-28 18:36:40 +0100, Moreno Andreo wrote:

Il 26/03/2019 18:08, Adrian Klaver ha scritto:

To me it would seem something like:

Table medications
id    user_id        med
1    sgkighs98    Medication
2    sghighs98    Ear check

Table users
id            surname    last name
sgkighs98     John            Doe
jkopkl1       Jane            Doe
uepoti21      Foo             Bar

Where there is no direct link between the two.

Are you sure there isn't?... the key "sgkighs98" is present on both
tables and I can join tables on that field, so the pseudonimysation
does not apply,

Yes. It doesn't matter whether the key is 'sgkighs98' or 1438 or
692da0c1-cf2d-476d-8910-7f82c050f8fe.

it's just "separation" (that was OK with the last privacy act, but not
with GDPR

I doubt that this is correct. The GDPR doesn't prescribe specific
technical means (there may be laws or standards in your country which
prescribe such means for medical data, but that's not the GDPR).

The problem is not on the application side... there you can do almost
anything you want to do. The prolem is that if someone breaks in the
server (data breach) it is easy to join patients and their
medications.

I sure hope that the doctors are able to join patients and their
medications. So at some level that has to be possible. If you assume a
break-in into the server, there will always be a level of penetration at
which the attacker will be able to access any data an authorized user
can access.

hp

--
_ | Peter J. Holzer | we build much bigger, better disasters now
|_|_) | | because we have much more sophisticated
| | | hjp@hjp.at | management tools.
__/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/&gt;

#16Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Moreno Andreo (#8)
Re: Key encryption and relational integrity

On 2019-03-28 15:29:50 +0100, Moreno Andreo wrote:

here I'm trying to find a way so nobody can, without the use of the
application, match a patient with their clinical records (i.e. someone
breaking into the server -- data breach)

I think it is very optimistic to assume that an intruder would get
access to the database but not the application.

hp

--
_ | Peter J. Holzer | we build much bigger, better disasters now
|_|_) | | because we have much more sophisticated
| | | hjp@hjp.at | management tools.
__/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/&gt;

#17Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Peter J. Holzer (#16)
Re: Key encryption and relational integrity

What Andrea Moreno's client seems to think is required by
GDPR is clearly bogus -- it would render illegal any and all
DICOM images existing today, because they contain
programmatically easily processable personally identifying
data right inside the clinical data.

Karsten Hilbert
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

#18Moreno Andreo
moreno.andreo@evolu-s.it
In reply to: Peter J. Holzer (#15)
Re: Key encryption and relational integrity

Il 28/03/2019 23:29, Peter J. Holzer ha scritto:

On 2019-03-28 18:36:40 +0100, Moreno Andreo wrote:

Il 26/03/2019 18:08, Adrian Klaver ha scritto:

To me it would seem something like:

Table medications
id��� user_id������� med
1��� sgkighs98��� Medication
2��� sghighs98��� Ear check

Table users
id����������� surname��� last name
sgkighs98���� John����������� Doe
jkopkl1������ Jane����������� Doe
uepoti21����� Foo������������ Bar

Where there is no direct link between the two.

Are you sure there isn't?... the key "sgkighs98" is present on both
tables and I can join tables on that field, so the pseudonimysation
does not apply,

Yes. It doesn't matter whether the key is 'sgkighs98' or 1438 or
692da0c1-cf2d-476d-8910-7f82c050f8fe.

it's just "separation" (that was OK with the last privacy act, but not
with GDPR

I doubt that this is correct. The GDPR doesn't prescribe specific
technical means (there may be laws or standards in your country which
prescribe such means for medical data, but that's not the GDPR).

That was told me by a privacy consultant, there was an Italian law
(196/2003) that introduced "minimal security measures", that has been
revoked with the GDPR appliance.

The problem is not on the application side... there you can do almost
anything you want to do. The prolem is that if someone breaks in the
server (data breach) it is easy to join patients and their
medications.

I sure hope that the doctors are able to join patients and their
medications. So at some level that has to be possible.

It would be possible at application level, that resides on another
server (so it would be compliant the separation between the
pseudonimysation and the reverse method)

If you assume a
break-in into the server, there will always be a level of penetration at
which the attacker will be able to access any data an authorized user
can access.

That's not what I got reading the GDPR article... but I may have
misunderstood (juridic text is non my cup of tea). My understanding was
that even in a data breach event there should be a mechanism that
prevents (or "mitigate the risk that") the attacker to gain access to
the data in the "joined" form, so he cannot acquire that patient John
Doe has got Alzheimer, for instance, but only that in that database
there is a patient which name is John Doe and someone that has got
Alzheimer.

And I tried to find a solution, and since I did not like that much what
I found (and it seems that neither you do :-) ), I came here hoping that
someone would share his experience to shed some light on the topic.

Show quoted text

hp

#19Moreno Andreo
moreno.andreo@evolu-s.it
In reply to: Peter J. Holzer (#16)
Re: Key encryption and relational integrity

Il 28/03/2019 23:50, Peter J. Holzer ha scritto:

On 2019-03-28 15:29:50 +0100, Moreno Andreo wrote:

here I'm trying to find a way so nobody can, without the use of the
application, match a patient with their clinical records (i.e. someone
breaking into the server -- data breach)

I think it is very optimistic to assume that an intruder would get
access to the database but not the application.

hp

Well, application resides on another instance (server), but if the
attacker has been able to take control of one server, he surely could
try to break another one, but it takes time.....

#20Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Moreno Andreo (#18)
Re: Key encryption and relational integrity

On 2019-03-29 17:01:07 +0100, Moreno Andreo wrote:

Il 28/03/2019 23:29, Peter J. Holzer ha scritto:

On 2019-03-28 18:36:40 +0100, Moreno Andreo wrote:

it's just "separation" (that was OK with the last privacy act, but not
with GDPR

I doubt that this is correct. The GDPR doesn't prescribe specific
technical means (there may be laws or standards in your country which
prescribe such means for medical data, but that's not the GDPR).

That was told me by a privacy consultant, there was an Italian law
(196/2003) that introduced "minimal security measures", that has been
revoked with the GDPR appliance.

The problem is not on the application side... there you can do almost
anything you want to do. The prolem is that if someone breaks in the
server (data breach) it is easy to join patients and their
medications.

I sure hope that the doctors are able to join patients and their
medications. So at some level that has to be possible.

It would be possible at application level, that resides on another server
(so it would be compliant the separation between the pseudonimysation and
the reverse method)

But why would you assume that an attacker cannot get access to that
"other server"?

If you assume a break-in into the server, there will always be a
level of penetration at which the attacker will be able to access
any data an authorized user can access.

That's not what I got reading the GDPR article... but I may have
misunderstood (juridic text is non my cup of tea). My understanding was that
even in a data breach event there should be a mechanism that prevents (or
"mitigate the risk that") the attacker to gain access to the data in the

Quoting from article 32 of the GDPR:

| Taking into account the state of the art, the costs of implementation
| and the nature, scope, context and purposes of processing as well as the
| risk of varying likelihood and severity for the rights and freedoms of
| natural persons, the controller and the processor shall implement
| appropriate technical and organisational measures to ensure a level of
| security appropriate to the risk, including inter alia as appropriate:

This is basically the gist of technical part of the GDPR. The controller
and processor are responsible to "ensure a level of security appropriate
to the risk", and it is their job to determine how to do that. The GDPR
doesn't say how to do that at all (the legislators were wise enough that
any attempt to do that would result in a mess). So you can't say "the
GDPR says we have to do it this way" (and if your consultant says that
it is probably time to get a different one). You have to consider all
the risks (and yes, an attacker getting access to some or all of the
data is a risk, but a doctor not being able to access a patient's
records is also a risk) and implement the best you can do considering
"the state of the art, the costs of implementation", etc.

"joined" form, so he cannot acquire that patient John Doe has got Alzheimer,
for instance, but only that in that database there is a patient which name
is John Doe and someone that has got Alzheimer.

I'm not talking about the GDPR here, but about the technical
impossibility. If an authorized user (say a doctor or a nurse) can get
the information that John Doe has Alzheimer's (and as a patient one
would hope that they can), then there will *always* be a way for an
attacker to aquire the privileges of that authorized user and get the
same information. There is no way around that. You can make it harder,
but you can't prevent it.

A much better way (IMHO) is to reduce the attack surface: Store only
data you need, allow access only for personnel which are actually
involved in treating that patient, use good authentication, physically
separate systems which can access the data from the internet, don't
throw printouts into the waste paper (don't laugh - that happened). If
there are people who need access to pseudonymized or aggregate data,
copy that data to a separate system ...

hp

--
_ | Peter J. Holzer | we build much bigger, better disasters now
|_|_) | | because we have much more sophisticated
| | | hjp@hjp.at | management tools.
__/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/&gt;

#21Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Moreno Andreo (#18)
#22Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Moreno Andreo (#19)
#23Moreno Andreo
moreno.andreo@evolu-s.it
In reply to: Adrian Klaver (#21)
#24Rory Campbell-Lange
rory@campbell-lange.net
In reply to: Moreno Andreo (#23)
#25Moreno Andreo
moreno.andreo@evolu-s.it
In reply to: Rory Campbell-Lange (#24)