Transparent Data Encryption (TDE) and encrypted files

Started by Bruce Momjian over 6 years ago | 66 messages | hackers
#1 Bruce Momjian
bruce@momjian.us

For full-cluster Transparent Data Encryption (TDE), the current plan is
to encrypt all heap and index files, WAL, and all pgsql_tmp (work_mem
overflow). The plan is:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#TODO_for_Full-Cluster_Encryption

We don't see much value to encrypting vm, fsm, pg_xact, pg_multixact, or
other files. Is that correct? Do any other PGDATA files contain user
data?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
#2 Tels
nospam-pg-abuse@bloodgate.com
In reply to: Bruce Momjian (#1)
Re: Transparent Data Encryption (TDE) and encrypted files

Moin,

On 2019-09-30 23:26, Bruce Momjian wrote:

For full-cluster Transparent Data Encryption (TDE), the current plan is
to encrypt all heap and index files, WAL, and all pgsql_tmp (work_mem
overflow). The plan is:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#TODO_for_Full-Cluster_Encryption

We don't see much value to encrypting vm, fsm, pg_xact, pg_multixact, or
other files. Is that correct? Do any other PGDATA files contain user
data?

IMHO the general rule in crypto is: encrypt everything, or don't bother.

If you don't encrypt some things, somebody is going to find loopholes
and side channels and partial-plaintext attacks. Just a silly example:
if you trick the DB into putting only one row per page, any
"bit-per-page" map suddenly reveals information about a single
encrypted row that it shouldn't reveal.

Many people with a lot of free time on their hands will sit around,
drink a nice cup of tea, and come up with all sorts of attacks on these
things that you didn't (and couldn't) anticipate now.

So IMHO it would be much better to err on the side of caution and
encrypt everything possible.

Best regards,

Tels
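
To make the "bit-per-page" example concrete, here is a minimal Python sketch. It assumes a toy map with one "all-visible" bit per page, which is a simplification and not PostgreSQL's actual visibility map layout; the function name changed_pages is hypothetical. An attacker who can read two snapshots of such an unencrypted map learns which pages were modified in between, and with one row per page that identifies individual rows.

```python
def changed_pages(map_before: bytes, map_after: bytes) -> list[int]:
    """Return page numbers whose bit was set before and is cleared after.

    Toy model: one bit per page, least significant bit first within a byte.
    This is an illustrative assumption, not the real on-disk format.
    """
    changed = []
    for byte_index, (before, after) in enumerate(zip(map_before, map_after)):
        cleared = before & ~after & 0xFF  # bits that went from 1 to 0
        for bit in range(8):
            if cleared & (1 << bit):
                changed.append(byte_index * 8 + bit)
    return changed


# Two snapshots of the unencrypted map, taken at different times.
snapshot_1 = bytes([0b11111111, 0b00001111])
snapshot_2 = bytes([0b11111011, 0b00001101])
print(changed_pages(snapshot_1, snapshot_2))  # [2, 9]
```

Nothing here needs the encryption key: the leak comes entirely from the plaintext map, even though the heap pages themselves stay encrypted.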

#3 Moon, Insung
tsukiwamoon.pgsql@gmail.com
In reply to: Tels (#2)
Re: Transparent Data Encryption (TDE) and encrypted files

Dear Tels.

On Tue, Oct 1, 2019 at 4:33 PM Tels <nospam-pg-abuse@bloodgate.com> wrote:

Moin,

On 2019-09-30 23:26, Bruce Momjian wrote:

For full-cluster Transparent Data Encryption (TDE), the current plan is
to encrypt all heap and index files, WAL, and all pgsql_tmp (work_mem
overflow). The plan is:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#TODO_for_Full-Cluster_Encryption

We don't see much value to encrypting vm, fsm, pg_xact, pg_multixact, or
other files. Is that correct? Do any other PGDATA files contain user
data?

IMHO the general rule in crypto is: encrypt everything, or don't bother.

If you don't encrypt some things, somebody is going to find loopholes
and sidechannels
and partial-plaintext attacks. Just a silly example: If you trick the DB
into putting only one row per page,
any "bit-per-page" map suddenly reveals information about a single
encrypted row that it shouldn't reveal.

Many people with a lot of free time on their hands will sit around,
drink a nice cup of tea and come up
with all sorts of attacks on these things that you didn't (and couldn't)
anticipate now.

This is just my opinion, but to minimize overhead, we are trying not to
encrypt files that do not store confidential data.

I'm not a security expert, so my thinking may be wrong.
But isn't it more dangerous to encrypt predictable data?

For example, when encrypting data other than the data entered by the user,
it may be possible to predict the plaintext.
If such data is encrypted, I think there could be a security problem.

Of course, separate encryption keys would be used.
But I thought it would be a problem if confidential data were
encrypted using the same key as the attacked data.

Best regards.
Moon.
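
The concern about predictable data can be illustrated with a toy stream cipher. A modern block cipher such as AES is designed to resist known-plaintext attacks, so predictable plaintext alone should not expose the key. The practical danger is reusing the same key and nonce, because a known plaintext then reveals the keystream. The sketch below assumes a SHA-256-based toy keystream, which is not a real cipher and not anything proposed in this thread; all names are hypothetical.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce, and a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Encrypt (or decrypt) by XORing with the keystream."""
    return xor(plaintext, keystream(key, nonce, len(plaintext)))


key = b"k" * 32
predictable = b"\x00" * 16      # content the attacker can guess
secret = b"secret user data"    # content the attacker wants

c_known = encrypt(key, b"nonce-A", predictable)
c_secret_reused = encrypt(key, b"nonce-A", secret)  # same key and nonce
c_secret_fresh = encrypt(key, b"nonce-B", secret)   # same key, new nonce

# Known plaintext XOR ciphertext yields the keystream for nonce-A.
leaked_keystream = xor(c_known, predictable)
recovered_reused = xor(c_secret_reused, leaked_keystream)
recovered_fresh = xor(c_secret_fresh, leaked_keystream)
```

Here recovered_reused equals the secret, while recovered_fresh does not. Under these assumptions, the risk comes from nonce or IV reuse rather than from encrypting predictable data as such.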

So IMHO it would be much better to err on the side of caution and
encrypt everything possible.

Best regards,

Tels

#4 Magnus Hagander
magnus@hagander.net
In reply to: Tels (#2)
Re: Transparent Data Encryption (TDE) and encrypted files

On Tue, Oct 1, 2019 at 9:33 AM Tels <nospam-pg-abuse@bloodgate.com> wrote:

Moin,

On 2019-09-30 23:26, Bruce Momjian wrote:

For full-cluster Transparent Data Encryption (TDE), the current plan is
to encrypt all heap and index files, WAL, and all pgsql_tmp (work_mem
overflow). The plan is:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#TODO_for_Full-Cluster_Encryption

We don't see much value to encrypting vm, fsm, pg_xact, pg_multixact, or
other files. Is that correct? Do any other PGDATA files contain user
data?

IMHO the general rule in crypto is: encrypt everything, or don't bother.

If you don't encrypt some things, somebody is going to find loopholes
and sidechannels
and partial-plaintext attacks. Just a silly example: If you trick the DB
into putting only one row per page,
any "bit-per-page" map suddenly reveals information about a single
encrypted row that it shouldn't reveal.

Many people with a lot of free time on their hands will sit around,
drink a nice cup of tea and come up
with all sorts of attacks on these things that you didn't (and couldn't)
anticipate now.

So IMHO it would be much better to err on the side of caution and
encrypt everything possible.

+1.

Unless we are *absolutely* certain, I bet someone will be able to find a
side-channel that somehow leaks some data or data-about-data, if we don't
encrypt everything. If nothing else, you can get use patterns out of it,
and you can make a lot from that. (E.g. by whether transactions are using
multixacts or not you can potentially determine which transaction they are,
if you know what type of transactions are being issued by the application.
In the simplest case, there might be a single pattern where multixacts end
up actually being used, and in that case being able to see the multixact
data tells you a lot about the system).

As for other things -- by default, we store the log files in text format in
the data directory. That contains *loads* of sensitive data in a lot of
cases. Will those also be encrypted?

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>

#5 Moon, Insung
tsukiwamoon.pgsql@gmail.com
In reply to: Magnus Hagander (#4)
Re: Transparent Data Encryption (TDE) and encrypted files

Dear Magnus Hagander.

On Tue, Oct 1, 2019 at 5:37 PM Magnus Hagander <magnus@hagander.net> wrote:

On Tue, Oct 1, 2019 at 9:33 AM Tels <nospam-pg-abuse@bloodgate.com> wrote:

Moin,

On 2019-09-30 23:26, Bruce Momjian wrote:

For full-cluster Transparent Data Encryption (TDE), the current plan is
to encrypt all heap and index files, WAL, and all pgsql_tmp (work_mem
overflow). The plan is:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#TODO_for_Full-Cluster_Encryption

We don't see much value to encrypting vm, fsm, pg_xact, pg_multixact, or
other files. Is that correct? Do any other PGDATA files contain user
data?

IMHO the general rule in crypto is: encrypt everything, or don't bother.

If you don't encrypt some things, somebody is going to find loopholes
and sidechannels
and partial-plaintext attacks. Just a silly example: If you trick the DB
into putting only one row per page,
any "bit-per-page" map suddenly reveals information about a single
encrypted row that it shouldn't reveal.

Many people with a lot of free time on their hands will sit around,
drink a nice cup of tea and come up
with all sorts of attacks on these things that you didn't (and couldn't)
anticipate now.

So IMHO it would be much better to err on the side of caution and
encrypt everything possible.

+1.

Unless we are *absolutely* certain, I bet someone will be able to find a side-channel that somehow leaks some data or data-about-data, if we don't encrypt everything. If nothing else, you can get use patterns out of it, and you can make a lot from that. (E.g. by whether transactions are using multixacts or not you can potentially determine which transaction they are, if you know what type of transactions are being issued by the application. In the simplest case, there might be a single pattern where multixacts end up actually being used, and in that case being able to see the multixact data tells you a lot about the system).

As for other things -- by default, we store the log files in text format in the data directory. That contains *loads* of sensitive data in a lot of cases. Will those also be encrypted?

As a result of the discussion so far, I believe the current plan is
not to encrypt the server log.

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#What_to_encrypt.2Fdecrypt

I think encrypting server logs would be very challenging, and we
would probably need to develop a separate application to view the
encrypted server logs.

Best regards.
Moon.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#6 Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Moon, Insung (#5)
Re: Transparent Data Encryption (TDE) and encrypted files

On Tue, Oct 01, 2019 at 06:30:39PM +0900, Moon, Insung wrote:

Dear Magnus Hagander.

On Tue, Oct 1, 2019 at 5:37 PM Magnus Hagander <magnus@hagander.net> wrote:

On Tue, Oct 1, 2019 at 9:33 AM Tels <nospam-pg-abuse@bloodgate.com> wrote:

Moin,

On 2019-09-30 23:26, Bruce Momjian wrote:

For full-cluster Transparent Data Encryption (TDE), the current plan is
to encrypt all heap and index files, WAL, and all pgsql_tmp (work_mem
overflow). The plan is:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#TODO_for_Full-Cluster_Encryption

We don't see much value to encrypting vm, fsm, pg_xact, pg_multixact, or
other files. Is that correct? Do any other PGDATA files contain user
data?

IMHO the general rule in crypto is: encrypt everything, or don't bother.

If you don't encrypt some things, somebody is going to find loopholes
and sidechannels
and partial-plaintext attacks. Just a silly example: If you trick the DB
into putting only one row per page,
any "bit-per-page" map suddenly reveals information about a single
encrypted row that it shouldn't reveal.

Many people with a lot of free time on their hands will sit around,
drink a nice cup of tea and come up
with all sorts of attacks on these things that you didn't (and couldn't)
anticipate now.

So IMHO it would be much better to err on the side of caution and
encrypt everything possible.

+1.

Unless we are *absolutely* certain, I bet someone will be able to find a side-channel that somehow leaks some data or data-about-data, if we don't encrypt everything. If nothing else, you can get use patterns out of it, and you can make a lot from that. (E.g. by whether transactions are using multixacts or not you can potentially determine which transaction they are, if you know what type of transactions are being issued by the application. In the simplest case, there might be a single pattern where multixacts end up actually being used, and in that case being able to see the multixact data tells you a lot about the system).

As for other things -- by default, we store the log files in text format in the data directory. That contains *loads* of sensitive data in a lot of cases. Will those also be encrypted?

As a result of the discussion so far, I believe the current plan is
not to encrypt the server log.

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#What_to_encrypt.2Fdecrypt

I think encrypting server logs would be very challenging, and we
would probably need to develop a separate application to view the
encrypted server logs.

IMO leaks of sensitive data into the server log (say, as part of error
messages, slow queries, ...) are a serious issue. It's one of the main
issues with pgcrypto-style encryption, because it's trivial to leak e.g.
keys into the server log. Even if proper key management prevents leaking
keys, there is still user data - say, credit card numbers, and such.

So I don't see how we could not encrypt the server log, in the end.

But yes, you're right, it's a challenging topic.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7 Bruce Momjian
bruce@momjian.us
In reply to: Tomas Vondra (#6)
Re: Transparent Data Encryption (TDE) and encrypted files

On Tue, Oct 1, 2019 at 03:48:31PM +0200, Tomas Vondra wrote:

IMO leaks of sensitive data into the server log (say, as part of error
messages, slow queries, ...) are a serious issue. It's one of the main
issues with pgcrypto-style encryption, because it's trivial to leak e.g.
keys into the server log. Even if proper key management prevents leaking
keys, there is still user data - say, credit card numbers, and such.

Fortunately, the full-cluster encryption keys are stored encrypted in
pg_control and are never accessible unencrypted at the SQL level.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
#8 Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#1)
Re: Transparent Data Encryption (TDE) and encrypted files

On Mon, Sep 30, 2019 at 05:26:33PM -0400, Bruce Momjian wrote:

For full-cluster Transparent Data Encryption (TDE), the current plan is
to encrypt all heap and index files, WAL, and all pgsql_tmp (work_mem
overflow). The plan is:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#TODO_for_Full-Cluster_Encryption

We don't see much value to encrypting vm, fsm, pg_xact, pg_multixact, or
other files. Is that correct? Do any other PGDATA files contain user
data?

Oh, there is also the consideration that the pg_replslot directory might
contain user data.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
#9 Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#1)
Re: Transparent Data Encryption (TDE) and encrypted files

On Mon, Sep 30, 2019 at 5:26 PM Bruce Momjian <bruce@momjian.us> wrote:

For full-cluster Transparent Data Encryption (TDE), the current plan is
to encrypt all heap and index files, WAL, and all pgsql_tmp (work_mem
overflow). The plan is:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#TODO_for_Full-Cluster_Encryption

We don't see much value to encrypting vm, fsm, pg_xact, pg_multixact, or
other files. Is that correct? Do any other PGDATA files contain user
data?

As others have said, that sounds wrong to me. I think you need to
encrypt everything.

I'm not sold on the comments that have been made about encrypting the
server log. I agree that could leak data, but that seems like somebody
else's problem: the log files aren't really under PostgreSQL's
management in the same way as pg_clog is. If you want to secure your
logs, send them to syslog and configure it to do whatever you need.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#10 Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#9)
Re: Transparent Data Encryption (TDE) and encrypted files

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:

On Mon, Sep 30, 2019 at 5:26 PM Bruce Momjian <bruce@momjian.us> wrote:

For full-cluster Transparent Data Encryption (TDE), the current plan is
to encrypt all heap and index files, WAL, and all pgsql_tmp (work_mem
overflow). The plan is:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#TODO_for_Full-Cluster_Encryption

We don't see much value to encrypting vm, fsm, pg_xact, pg_multixact, or
other files. Is that correct? Do any other PGDATA files contain user
data?

As others have said, that sounds wrong to me. I think you need to
encrypt everything.

That isn't what other database systems do though and isn't what people
actually asking for this feature are expecting to have or deal with.

People who are looking for 'encrypt all the things' should and will be
looking at filesystem-level encryption options. That's not what this
feature is about.

I'm not sold on the comments that have been made about encrypting the
server log. I agree that could leak data, but that seems like somebody
else's problem: the log files aren't really under PostgreSQL's
management in the same way as pg_clog is. If you want to secure your
logs, send them to syslog and configure it to do whatever you need.

I agree with this.

Thanks,

Stephen

#11 Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Stephen Frost (#10)
Re: Transparent Data Encryption (TDE) and encrypted files

On Thu, Oct 03, 2019 at 10:40:40AM -0400, Stephen Frost wrote:

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:

On Mon, Sep 30, 2019 at 5:26 PM Bruce Momjian <bruce@momjian.us> wrote:

For full-cluster Transparent Data Encryption (TDE), the current plan is
to encrypt all heap and index files, WAL, and all pgsql_tmp (work_mem
overflow). The plan is:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#TODO_for_Full-Cluster_Encryption

We don't see much value to encrypting vm, fsm, pg_xact, pg_multixact, or
other files. Is that correct? Do any other PGDATA files contain user
data?

As others have said, that sounds wrong to me. I think you need to
encrypt everything.

That isn't what other database systems do though and isn't what people
actually asking for this feature are expecting to have or deal with.

People who are looking for 'encrypt all the things' should and will be
looking at filesystem-level encryption options. That's not what this
feature is about.

That's almost certainly not true, at least not universally.

It may be true for some people, but a lot of the people asking for
in-database encryption essentially want to do filesystem encryption but
can't use it for various reasons. E.g. because they're running in
environments that make filesystem encryption impossible to use (OS not
supporting it directly, no access to the block device, lack of admin
privileges, ...). Or maybe they worry about people with fs access.

If you look at the two threads discussing the FDE design, both of
them pretty much started as "let's do FDE in the database".

I'm not sold on the comments that have been made about encrypting the
server log. I agree that could leak data, but that seems like somebody
else's problem: the log files aren't really under PostgreSQL's
management in the same way as pg_clog is. If you want to secure your
logs, send them to syslog and configure it to do whatever you need.

I agree with this.

I don't. I know it's not an easy problem to solve, but it may contain
user data (which is what we manage). We may allow disabling that, at
which point it becomes someone else's problem.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#12 Stephen Frost
sfrost@snowman.net
In reply to: Tomas Vondra (#11)
Re: Transparent Data Encryption (TDE) and encrypted files

Greetings,

* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:

On Thu, Oct 03, 2019 at 10:40:40AM -0400, Stephen Frost wrote:

People who are looking for 'encrypt all the things' should and will be
looking at filesystem-level encryption options. That's not what this
feature is about.

That's almost certainly not true, at least not universally.

It may be true for some people, but a lot of the people asking for
in-database encryption essentially want to do filesystem encryption but
can't use it for various reasons. E.g. because they're running in
environments that make filesystem encryption impossible to use (OS not
supporting it directly, no access to the block device, lack of admin
privileges, ...). Or maybe they worry about people with fs access.

Anyone coming from other database systems isn't asking for that though
and it wouldn't be a comparable offering to other systems.

If you look at the two threads discussing the FDE design, both of
them pretty much started as "let's do FDE in the database".

And that's how some folks continue to see it- let's just encrypt all the
things, until they actually look at it and start thinking about what
that means and how to implement it.

Yeah, it'd be great to just encrypt everything, with a bunch of
different keys, all of which are stored somewhere else, and can be
updated and changed by the user when they need to do a rekeying, but
then you have to start asking about what keys need to be available when
for doing crash recovery, how do you handle a crash in the middle of a
rekeying, how do you handle updating keys from the user, etc.

Sure, we could offer a dead simple "here, use this one key at database
start to just encrypt everything" and that would be enough for some set
of users (a very small set, imv, but that's subjective, obviously), but
I don't think we could dare promote that as having TDE because it
wouldn't be at all comparable to what other databases have, and it
wouldn't materially move us in the direction of having real TDE.

I'm not sold on the comments that have been made about encrypting the
server log. I agree that could leak data, but that seems like somebody
else's problem: the log files aren't really under PostgreSQL's
management in the same way as pg_clog is. If you want to secure your
logs, send them to syslog and configure it to do whatever you need.

I agree with this.

I don't. I know it's not an easy problem to solve, but it may contain
user data (which is what we manage). We may allow disabling that, at
which point it becomes someone else's problem.

We also send user data to clients, but I don't imagine we're suggesting
that we need to control what some downstream application does with that
data or how it gets stored. There's definitely a lot of room for
improvement in our logging (in an ideal world, we'd have a way to
actually store the logs in the database, at which point it could be
encrypted or not that way...), but I'm not seeing the need for us to
have a way to encrypt the log files. If we did encrypt them, we'd have
to make sure to do it in a way that users could still access them
without the database being up and running, which might be tricky if the
key is in the vault...

Thanks,

Stephen

#13 Peter Eisentraut
peter_e@gmx.net
In reply to: Stephen Frost (#10)
Re: Transparent Data Encryption (TDE) and encrypted files

On 2019-10-03 16:40, Stephen Frost wrote:

As others have said, that sounds wrong to me. I think you need to
encrypt everything.

That isn't what other database systems do though and isn't what people
actually asking for this feature are expecting to have or deal with.

It is what some other database systems do. Perhaps some others don't.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#14 Stephen Frost
sfrost@snowman.net
In reply to: Peter Eisentraut (#13)
Re: Transparent Data Encryption (TDE) and encrypted files

Greetings,

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:

On 2019-10-03 16:40, Stephen Frost wrote:

As others have said, that sounds wrong to me. I think you need to
encrypt everything.

That isn't what other database systems do though and isn't what people
actually asking for this feature are expecting to have or deal with.

It is what some other database systems do. Perhaps some others don't.

I looked at the contemporary databases and provided details about all of
them earlier in the thread. Please feel free to review that and let me
know if your research shows differently.

Thanks,

Stephen

#15 Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Stephen Frost (#12)
Re: Transparent Data Encryption (TDE) and encrypted files

On Thu, Oct 03, 2019 at 11:51:41AM -0400, Stephen Frost wrote:

Greetings,

* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:

On Thu, Oct 03, 2019 at 10:40:40AM -0400, Stephen Frost wrote:

People who are looking for 'encrypt all the things' should and will be
looking at filesystem-level encryption options. That's not what this
feature is about.

That's almost certainly not true, at least not universally.

It may be true for some people, but a lot of the people asking for
in-database encryption essentially want to do filesystem encryption but
can't use it for various reasons. E.g. because they're running in
environments that make filesystem encryption impossible to use (OS not
supporting it directly, no access to the block device, lack of admin
privileges, ...). Or maybe they worry about people with fs access.

Anyone coming from other database systems isn't asking for that though
and it wouldn't be a comparable offering to other systems.

I don't think that's quite accurate. In the previous message you claimed
(1) this isn't what other database systems do and (2) people who want to
encrypt everything should just use fs encryption, because that's not
what TDE is about.

Regarding (1), I'm pretty sure Oracle TDE does pretty much exactly this,
at least in the mode with tablespace-level encryption. It's true there
is also column-level mode, but from my experience it's far less used
because it has a number of annoying limitations.

So I'm somewhat puzzled by your claim that people coming from other
systems are asking for the column-level mode. At least I'm assuming
that's what they're asking for, because I don't see other options.

If you look at the two threads discussing the FDE design, both of
them pretty much started as "let's do FDE in the database".

And that's how some folks continue to see it- let's just encrypt all the
things, until they actually look at it and start thinking about what
that means and how to implement it.

This argument also works the other way, though. On Oracle, people often
start with the column-level encryption because it seems naturally
superior (hey, I can encrypt just the columns I want, ...) and then they
start running into the various limitations and eventually just switch to
the tablespace-level encryption.

Now, maybe we'll be able to solve those limitations - but I think it's
pretty unlikely, because those limitations seem quite inherent to how
encryption affects indexes etc.

Yeah, it'd be great to just encrypt everything, with a bunch of
different keys, all of which are stored somewhere else, and can be
updated and changed by the user when they need to do a rekeying, but
then you have to start asking about what keys need to be available when
for doing crash recovery, how do you handle a crash in the middle of a
rekeying, how do you handle updating keys from the user, etc.

Sure, we could offer a dead simple "here, use this one key at database
start to just encrypt everything" and that would be enough for some set
of users (a very small set, imv, but that's subjective, obviously), but
I don't think we could dare promote that as having TDE because it
wouldn't be at all comparable to what other databases have, and it
wouldn't materially move us in the direction of having real TDE.

I think that very much depends on the definition of "real TDE". I
don't know what exactly that means at this point. And as I said before,
I think such a simple mode *is* comparable to (at least some) solutions
available in other databases (as explained above).

As for the users, I don't have any objective data about this, but I
think the number of people wanting such a simple solution is non-trivial.
That does not mean we can't extend it to support more advanced features.

I'm not sold on the comments that have been made about encrypting the
server log. I agree that could leak data, but that seems like somebody
else's problem: the log files aren't really under PostgreSQL's
management in the same way as pg_clog is. If you want to secure your
logs, send them to syslog and configure it to do whatever you need.

I agree with this.

I don't. I know it's not an easy problem to solve, but it may contain
user data (which is what we manage). We may allow disabling that, at
which point it becomes someone else's problem.

We also send user data to clients, but I don't imagine we're suggesting
that we need to control what some downstream application does with that
data or how it gets stored. There's definitely a lot of room for
improvement in our logging (in an ideal world, we'd have a way to
actually store the logs in the database, at which point it could be
encrypted or not that way...), but I'm not seeing the need for us to
have a way to encrypt the log files. If we did encrypt them, we'd have
to make sure to do it in a way that users could still access them
without the database being up and running, which might be tricky if the
key is in the vault...

That's a bit of a straw-man argument, really. The client is obviously
meant to receive and handle sensitive data, that's its main purpose.
For logging systems the situation is a bit different, it's a general
purpose tool, with no idea what the data is.

I do understand it's pretty pointless to send encrypted messages to such
external tools, but IMO it'd be good to implement that at least for our
internal logging collector.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#16 Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Stephen Frost (#14)
Re: Transparent Data Encryption (TDE) and encrypted files

On Thu, Oct 03, 2019 at 11:58:55AM -0400, Stephen Frost wrote:

Greetings,

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:

On 2019-10-03 16:40, Stephen Frost wrote:

As others have said, that sounds wrong to me. I think you need to
encrypt everything.

That isn't what other database systems do though and isn't what people
actually asking for this feature are expecting to have or deal with.

It is what some other database systems do. Perhaps some others don't.

I looked at the contemporary databases and provided details about all of
them earlier in the thread. Please feel free to review that and let me
know if your research shows differently.

I assume you mean this (in one of the other threads):

/messages/by-id/20190817175217.GE16436@tamriel.snowman.net

FWIW I don't see anything contradicting the idea of just encrypting
everything (including vm, fsm etc.). The only case that seems to be an
exception is the column-level encryption in Oracle, all the other
options (especially the database-level ones) seem to be consistent with
this principle.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#17 Stephen Frost
sfrost@snowman.net
In reply to: Tomas Vondra (#15)
Re: Transparent Data Encryption (TDE) and encrypted files

Greetings,

* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:

On Thu, Oct 03, 2019 at 11:51:41AM -0400, Stephen Frost wrote:

* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:

On Thu, Oct 03, 2019 at 10:40:40AM -0400, Stephen Frost wrote:

People who are looking for 'encrypt all the things' should and will be
looking at filesystem-level encryption options. That's not what this
feature is about.

That's almost certainly not true, at least not universally.

It may be true for some people, but a lot of the people asking for
in-database encryption essentially want to do filesystem encryption but
can't use it for various reasons. E.g. because they're running in
environments that make filesystem encryption impossible to use (OS not
supporting it directly, no access to the block device, lack of admin
privileges, ...). Or maybe they worry about people with fs access.

Anyone coming from other database systems isn't asking for that though
and it wouldn't be a comparable offering to other systems.

I don't think that's quite accurate. In the previous message you claimed
(1) this isn't what other database systems do and (2) people who want to
encrypt everything should just use fs encryption, because that's not
what TDE is about.

Regarding (1), I'm pretty sure Oracle TDE does pretty much exactly this,
at least in the mode with tablespace-level encryption. It's true there
is also column-level mode, but from my experience it's far less used
because it has a number of annoying limitations.

We're probably being too general and that's ending up with us talking
past each other. Yes, Oracle provides tablespace and column level
encryption, but neither case results in *everything* being encrypted.

So I'm somewhat puzzled by your claim that people coming from other
systems are asking for the column-level mode. At least I'm assuming
that's what they're asking for, because I don't see other options.

I've seen asks for tablespace, table, and column-level, but it's always
been about the actual data. Something like clog is an entirely internal
structure that doesn't include the actual data. Yes, it's possible it
could somehow be used for a side-channel attack, as could other things,
such as WAL, and as such I'm not sure that forcing a policy of "encrypt
everything" is actually a sensible approach and it definitely adds
complexity and makes it a lot more difficult to come up with a sensible
solution.

If you look at the two threads discussing the FDE design, both of
them pretty much started as "let's do FDE in the database".

And that's how some folks continue to see it- let's just encrypt all the
things, until they actually look at it and start thinking about what
that means and how to implement it.

This argument also works the other way, though. On Oracle, people often
start with the column-level encryption because it seems naturally
superior (hey, I can encrypt just the columns I want, ...) and then they
start running into the various limitations and eventually just switch to
the tablespace-level encryption.

Now, maybe we'll be able to solve those limitations - but I think it's
pretty unlikely, because those limitations seem quite inherent to how
encryption affects indexes etc.

It would probably be useful to discuss the specific limitations that
you've seen cause people to move away from column-level encryption.

I definitely agree that figuring out how to make things work with
indexes is a non-trivial challenge, though I'm hopeful that we can come
up with something sensible.

Yeah, it'd be great to just encrypt everything, with a bunch of
different keys, all of which are stored somewhere else, and can be
updated and changed by the user when they need to do a rekeying, but
then you have to start asking about what keys need to be available when
for doing crash recovery, how do you handle a crash in the middle of a
rekeying, how do you handle updating keys from the user, etc..

Sure, we could offer a dead simple "here, use this one key at database
start to just encrypt everything" and that would be enough for some set
of users (a very small set, imv, but that's subjective, obviously), but
I don't think we could dare promote that as having TDE because it
wouldn't be at all comparable to what other databases have, and it
wouldn't materially move us in the direction of having real TDE.

I think that very much depends on the definition of "real TDE". I
don't know what exactly that means at this point. And as I said before,
I think such a simple mode *is* comparable to (at least some) solutions
available in other databases (as explained above).

When I was researching this, I couldn't find any example of a database
that wouldn't start without the one magic key that encrypts everything.
I'm happy to be told that I was wrong in my understanding of that, with
some examples.

As for the users, I don't have any objective data about this, but I
think the number of people wanting such a simple solution is non-trivial.
That does not mean we can't extend it to support more advanced features.

The concern that I raised before and that I continue to worry about is
that providing such a simple capability will have a lot of limitations
too (such as having a single key and only being able to rekey during a
complete downtime, because we have to re-encrypt clog, etc, etc), and
I don't see it helping us get to more granular TDE because, for that,
where we really need to start is by building a vault of some kind to
store the keys in and then figuring out how we do things like crash
recovery in a sensible way and, ideally, without needing to have access
to all of (any of?) the keys.

I'm not sold on the comments that have been made about encrypting the
server log. I agree that could leak data, but that seems like somebody
else's problem: the log files aren't really under PostgreSQL's
management in the same way as pg_clog is. If you want to secure your
logs, send them to syslog and configure it to do whatever you need.

I agree with this.

I don't. I know it's not an easy problem to solve, but it may contain
user data (which is what we manage). We may allow disabling that, at
which point it becomes someone else's problem.

We also send user data to clients, but I don't imagine we're suggesting
that we need to control what some downstream application does with that
data or how it gets stored. There's definitely a lot of room for
improvement in our logging (in an ideal world, we'd have a way to
actually store the logs in the database, at which point it could be
encrypted or not that way...), but I'm not seeing the need for us to
have a way to encrypt the log files. If we did encrypt them, we'd have
to make sure to do it in a way that users could still access them
without the database being up and running, which might be tricky if the
key is in the vault...

That's a bit of a straw-man argument, really. The client is obviously
meant to receive and handle sensitive data, that's its main purpose.
For logging systems the situation is a bit different, it's a general
purpose tool, with no idea what the data is.

The argument you're making is that the log isn't intended to have
sensitive data, but while that might be a nice place to get to, we
certainly aren't there today, which means that people should really be
sending the logs to a location that's trusted.

I do understand it's pretty pointless to send encrypted messages to such
external tools, but IMO it'd be good to implement that at least for our
internal logging collector.

It's also less than user friendly to log to encrypted files that you
can't read without having the database system being up, so we'd have to
figure out at least a solution to that problem, and then if you have
downstream systems where the logs are going to, you have to decrypt
them, or have a way to have them not be encrypted perhaps.

In general, wrt the logs, I feel like it's at least a reasonably small
and independent piece of this, though I wonder if it'll cause similar
problems when it comes to dealing with crash recovery (how do we log if
we don't have the key from the vault because we haven't done crash
recovery yet, for example...).

Thanks,

Stephen

#18Stephen Frost
sfrost@snowman.net
In reply to: Tomas Vondra (#16)
Re: Transparent Data Encryption (TDE) and encrypted files

Greetings,

* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:

On Thu, Oct 03, 2019 at 11:58:55AM -0400, Stephen Frost wrote:

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:

On 2019-10-03 16:40, Stephen Frost wrote:

As others have said, that sounds wrong to me. I think you need to
encrypt everything.

That isn't what other database systems do though and isn't what people
actually asking for this feature are expecting to have or deal with.

It is what some other database systems do. Perhaps some others don't.

I looked at the contemporary databases and provided details about all of
them earlier in the thread. Please feel free to review that and let me
know if your research shows differently.

I assume you mean this (in one of the other threads):

/messages/by-id/20190817175217.GE16436@tamriel.snowman.net

FWIW I don't see anything contradicting the idea of just encrypting
everything (including vm, fsm etc.). The only case that seems to be an
exception is the column-level encryption in Oracle, all the other
options (especially the database-level ones) seem to be consistent with
this principle.

I don't think I was arguing specifically about VM/FSM in particular but
rather about things which, for us, are cluster level. Admittedly, some
other database systems put more things into tablespaces or databases
than we do (it'd sure be nice if we did in some cases too, but we
don't...), but they do also have things *outside* of those, such that
you can at least bring the system up, to some extent, even if you can't
access a given tablespace or database.

Thanks,

Stephen

#19Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#18)
Re: Transparent Data Encryption (TDE) and encrypted files

On Thu, Oct 3, 2019 at 1:29 PM Stephen Frost <sfrost@snowman.net> wrote:

I don't think I was arguing specifically about VM/FSM in particular but
rather about things which, for us, are cluster level. Admittedly, some
other database systems put more things into tablespaces or databases
than we do (it'd sure be nice if we did in some cases too, but we
don't...), but they do also have things *outside* of those, such that
you can at least bring the system up, to some extent, even if you can't
access a given tablespace or database.

It sounds like you're making this up as you go along. The security
ramifications of encrypting a file don't depend on whether that file
is database-level or cluster-level, but rather on whether the contents
could be useful to an attacker. It doesn't seem like it would require
much work at all to construct an argument that a hacker might enjoy
having unfettered access to pg_clog even if no other part of the
database can be read.

My perspective on this feature is, and has always been, that there are
two different things somebody might want, both of which we seem to be
calling "TDE." One is to encrypt every single data page in the cluster
(and possibly things other than data pages, but at least those) with a
single encryption key, much as filesystem encryption would do, but
internal to the database. Contrary to your assertions, such a solution
has useful properties. One is that it will work the same way on any
system where PostgreSQL runs, whereas filesystem encryption solutions
vary. Another is that it does not require the cooperation of the
person who has root in order to set up. A third is that someone with
access to the system does not have automatic and unfettered access to
the database's data; sure, they can get it with enough work, but it's
significantly harder to fish the encryption keys out of the memory
space of a running process than to tar up the data directory that the
filesystem has already decrypted for you. I would personally not care
about any of this based on my own background as somebody who generally
had to set up systems from scratch, starting with buying the
hardware, but in enterprise and government environments they can pose
significant problems.
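
To make the first option concrete, here is a minimal sketch of whole-page
encryption under a single cluster key. It is an illustration only: the names
keystream and crypt_page are hypothetical, the HMAC-SHA256 keystream is a
standard-library stand-in for a real cipher mode such as AES-CTR, and nothing
here reflects an actual PostgreSQL patch. The only detail taken from
PostgreSQL is the default 8 kB block size.

```python
import hashlib
import hmac

PAGE_SIZE = 8192  # PostgreSQL's default block size


def keystream(key: bytes, page_no: int, length: int) -> bytes:
    """Toy keystream derived from (key, page number, counter).

    Assumption: HMAC-SHA256 stands in for a real stream mode like AES-CTR.
    """
    out = bytearray()
    counter = 0
    while len(out) < length:
        msg = page_no.to_bytes(8, "big") + counter.to_bytes(4, "big")
        out += hmac.new(key, msg, hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def crypt_page(key: bytes, page_no: int, page: bytes) -> bytes:
    """XOR a page with its keystream; the same call encrypts and decrypts."""
    ks = keystream(key, page_no, len(page))
    return bytes(a ^ b for a, b in zip(page, ks))


if __name__ == "__main__":
    cluster_key = b"\x01" * 32           # one key for the whole cluster
    plain = bytes(PAGE_SIZE)             # an all-zero page
    cipher = crypt_page(cluster_key, 7, plain)
    assert crypt_page(cluster_key, 7, cipher) == plain
```

Mixing the page number into the keystream keeps identical pages from
encrypting to identical ciphertext. A real design would also need a nonce
that changes every time a page is rewritten, which this sketch ignores, and
changing the key would mean rewriting every page.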

The other thing people sometimes want is to encrypt some of the data
within the database but not all of it. In my view, trying to implement
this is not a great idea, because it's vastly more complicated than
just encrypting everything with one key. Would I like to have the
feature? Sure. Do I expect that we're going to get that feature any
time soon? Nope. Even the thing I described in the previous paragraph,
as limited as it is, is complicated and could take several release
cycles to get into committable shape. Fine-grained encryption is
probably an order of magnitude more complicated. The problem of
figuring out which keys apply to which objects does not seem to have
any reasonably simple solution, assuming you want something that's
neither insecure nor a badly-done hack.

I am unsure what the thought process is among people, such as
yourself, who are arguing that fine-grained encryption is the only way
to go. It seems like you're determined to refuse a free Honda Civic on
the grounds that it's not a Cadillac. It's not even like accepting the
patch for the Honda Civic solution would somehow block accepting the
Cadillac if that shows up later. It wouldn't. It would just mean that,
unless or until that patch shows up, we'd have something rather than
nothing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#19)
Re: Transparent Data Encryption (TDE) and encrypted files

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:

On Thu, Oct 3, 2019 at 1:29 PM Stephen Frost <sfrost@snowman.net> wrote:

I don't think I was arguing specifically about VM/FSM in particular but
rather about things which, for us, are cluster level. Admittedly, some
other database systems put more things into tablespaces or databases
than we do (it'd sure be nice if we did in some cases too, but we
don't...), but they do also have things *outside* of those, such that
you can at least bring the system up, to some extent, even if you can't
access a given tablespace or database.

It sounds like you're making this up as you go along.

I'm not surprised, and I doubt that's really got much to do with the
actual topic.

The security
ramifications of encrypting a file don't depend on whether that file
is database-level or cluster-level, but rather on whether the contents
could be useful to an attacker.

I don't believe that I claimed otherwise. I agree with this.

It doesn't seem like it would require
much work at all to construct an argument that a hacker might enjoy
having unfettered access to pg_clog even if no other part of the
database can be read.

The question isn't about what hackers would like to have access to, it's
about what would actually provide them with a channel to get information
that's sensitive, and at what rate. Perhaps there's an argument to be
made that clog would provide a high enough rate of information that
could be used to glean sensitive information, but that's certainly not
an argument that's been put forth, instead it's the knee-jerk reaction
of "oh goodness, if anything isn't encrypted then hackers will be able
to get access to everything" and that's just not a real argument.
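
For reference when weighing that rate, the commit log stores two status bits
per transaction, four transactions per byte. The sketch below decodes those
bits from a raw buffer. It is an illustration, not a tool: xact_status is a
hypothetical helper, and it assumes the buffer starts at transaction ID 0,
ignoring how the real files are split into pages and segments.

```python
# Two-bit status codes used by PostgreSQL's commit log
# (pg_clog, renamed pg_xact in PostgreSQL 10).
STATUS = {
    0b00: "in progress",
    0b01: "committed",
    0b10: "aborted",
    0b11: "sub-committed",
}
XACTS_PER_BYTE = 4
BITS_PER_XACT = 2


def xact_status(clog: bytes, xid: int) -> str:
    """Return the status of transaction `xid`.

    Assumption: `clog` is a flat buffer whose first byte covers xids 0..3.
    """
    byte = clog[xid // XACTS_PER_BYTE]
    shift = (xid % XACTS_PER_BYTE) * BITS_PER_XACT
    return STATUS[(byte >> shift) & 0b11]


if __name__ == "__main__":
    sample = bytes([0b10010100])  # xids 0..3 packed low bits first
    for xid in range(4):
        print(xid, xact_status(sample, xid))
```

So someone reading an unencrypted clog learns, per transaction ID, whether it
committed or aborted, and nothing about which rows or values it touched.
Whether that is enough of a channel is exactly the open question here.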

My perspective on this feature is, and has always been, that there are
two different things somebody might want, both of which we seem to be
calling "TDE." One is to encrypt every single data page in the cluster
(and possibly things other than data pages, but at least those) with a
single encryption key, much as filesystem encryption would do, but
internal to the database.

Making it all up as I go along notwithstanding, I did go look at other
database systems which I considered on-par with PG, shared that
information here, and am basing my comments on that review.

Which database systems have you looked at which have the properties
you're describing above that we should be working hard towards?

The other thing people sometimes want is to encrypt some of the data
within the database but not all of it. In my view, trying to implement
this is not a great idea, because it's vastly more complicated than
just encrypting everything with one key.

Which database systems that you'd consider to be on-par with PG, and
which do have TDE, don't have some mechanism for supporting multiple
keys and for encrypting only a subset of the data?

Thanks,

Stephen

#21Magnus Hagander
magnus@hagander.net
In reply to: Stephen Frost (#20)
#22Magnus Hagander
magnus@hagander.net
In reply to: Stephen Frost (#10)
#23Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#20)
#24Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#23)
#25Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Magnus Hagander (#21)
#26Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Stephen Frost (#17)
#27Bruce Momjian
bruce@momjian.us
In reply to: Tomas Vondra (#26)
#28Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Bruce Momjian (#24)
#29Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Bruce Momjian (#27)
#30Bruce Momjian
bruce@momjian.us
In reply to: Tomas Vondra (#28)
#31Bruce Momjian
bruce@momjian.us
In reply to: Tomas Vondra (#29)
#32Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Bruce Momjian (#31)
#33Bruce Momjian
bruce@momjian.us
In reply to: Tomas Vondra (#32)
#34Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Bruce Momjian (#33)
#35Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#30)
#36Bruce Momjian
bruce@momjian.us
In reply to: Tomas Vondra (#34)
#37Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#35)
#38Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#37)
#39Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#38)
#40Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#39)
#41Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#40)
#42Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#41)
#43Magnus Hagander
magnus@hagander.net
In reply to: Bruce Momjian (#39)
#44Antonin Houska
ah@cybertec.at
In reply to: Robert Haas (#35)
#45Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Bruce Momjian (#36)
#46Robert Haas
robertmhaas@gmail.com
In reply to: Antonin Houska (#44)
#47Bruce Momjian
bruce@momjian.us
In reply to: Tomas Vondra (#45)
#48Ants Aasma
ants.aasma@cybertec.at
In reply to: Bruce Momjian (#37)
#49Antonin Houska
ah@cybertec.at
In reply to: Ants Aasma (#48)
#50Antonin Houska
ah@cybertec.at
In reply to: Robert Haas (#46)
#51Stephen Frost
sfrost@snowman.net
In reply to: Magnus Hagander (#4)
#52Robert Haas
robertmhaas@gmail.com
In reply to: Antonin Houska (#50)
#53Moon, Insung
tsukiwamoon.pgsql@gmail.com
In reply to: Antonin Houska (#50)
#54Moon, Insung
tsukiwamoon.pgsql@gmail.com
In reply to: Bruce Momjian (#1)
#55Antonin Houska
ah@cybertec.at
In reply to: Moon, Insung (#53)
#56Moon, Insung
tsukiwamoon.pgsql@gmail.com
In reply to: Antonin Houska (#55)
#57Antonin Houska
ah@cybertec.at
In reply to: Moon, Insung (#56)
#58Stephen Frost
sfrost@snowman.net
In reply to: Magnus Hagander (#22)
#59Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#41)
#60Craig Ringer
craig@2ndquadrant.com
In reply to: Stephen Frost (#58)
#61Stephen Frost
sfrost@snowman.net
In reply to: Craig Ringer (#60)
#62Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Antonin Houska (#57)
#63Antonin Houska
ah@cybertec.at
In reply to: Masahiko Sawada (#62)
#64Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Antonin Houska (#63)
#65Bruce Momjian
bruce@momjian.us
In reply to: Stephen Frost (#61)
#66Stephen Frost
sfrost@snowman.net
In reply to: Bruce Momjian (#65)