Column Redaction

Started by Simon Riggsover 11 years ago55 messageshackers
Jump to latest
#1Simon Riggs
simon@2ndQuadrant.com

Postgres currently supports column level SELECT privileges.

1. If we want to confirm a credit card number, we can issue SELECT 1
FROM customer WHERE stored_card_number = '1234 5678 5344 7733'

2. If we want to look for card fraud, we need to be able to use the
full card number to join to transaction data and look up blocked card
lists etc..

3. We want to block the direct retrieval of card numbers for
additional security.
In some cases, we might want to return an answer like '**** ***** **** 7733'

We can't do all of the above with current facilities inside the database.

The ability to mask output for data in certain cases, for the purpose
of security, is known lately as data redaction, or column-level data
redaction.

The best way to support this requirement would be to allow columns to
have an additional "output formatting function". This would be
executed only when data is about to be returned by a query. All other
uses of that would not restrict the data.

This would have other uses as well, such as default report formats, so
we can store financial amounts as NUMERIC, but format them on
retrieval as $12,345.78 etc..

Suggested user interface would be...
FORMAT functionname(parameters, if any)

e.g.
CREATE TABLE customer
( id ...
...
, stored_card_number NUMERIC FORMAT pci_card_number_redaction()
...
);

We'd need to implement something to allow pg_dump to ignore format
functions. I suggest the best way to do that is by providing a BACKUP
role that can be delegated to other users. We would then allow a
parameter for SET output_formatting = on | off, which can only be set
by superuser and BACKUP role, then have pg_dump issue SET
output_formatting = off explicitly when it runs.

Do we want redaction in PostgreSQL?
Do we want it generalised into output format functions?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Dave Page
dpage@pgadmin.org
In reply to: Simon Riggs (#1)
Re: Column Redaction

On Fri, Oct 10, 2014 at 9:57 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

Postgres currently supports column level SELECT privileges.

1. If we want to confirm a credit card number, we can issue SELECT 1
FROM customer WHERE stored_card_number = '1234 5678 5344 7733'

2. If we want to look for card fraud, we need to be able to use the
full card number to join to transaction data and look up blocked card
lists etc..

3. We want to block the direct retrieval of card numbers for
additional security.
In some cases, we might want to return an answer like '**** ***** **** 7733'

We can't do all of the above with current facilities inside the database.

The ability to mask output for data in certain cases, for the purpose
of security, is known lately as data redaction, or column-level data
redaction.

The best way to support this requirement would be to allow columns to
have an additional "output formatting function". This would be
executed only when data is about to be returned by a query. All other
uses of that would not restrict the data.

This would have other uses as well, such as default report formats, so
we can store financial amounts as NUMERIC, but format them on
retrieval as $12,345.78 etc..

Suggested user interface would be...
FORMAT functionname(parameters, if any)

e.g.
CREATE TABLE customer
( id ...
...
, stored_card_number NUMERIC FORMAT pci_card_number_redaction()
...
);

I like that idea a lot - could be very useful (it reminds me of my Pick days).

We'd need to implement something to allow pg_dump to ignore format
functions. I suggest the best way to do that is by providing a BACKUP
role that can be delegated to other users. We would then allow a
parameter for SET output_formatting = on | off, which can only be set
by superuser and BACKUP role, then have pg_dump issue SET
output_formatting = off explicitly when it runs.

That seems like a reasonable approach. I can imagine other uses for a
BACKUP role in the future.

Do we want redaction in PostgreSQL?

+1

Do we want it generalised into output format functions?

+1

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Thom Brown
thom@linux.com
In reply to: Simon Riggs (#1)
Re: Column Redaction

On 10 October 2014 09:57, Simon Riggs <simon@2ndquadrant.com> wrote:

Postgres currently supports column level SELECT privileges.

1. If we want to confirm a credit card number, we can issue SELECT 1
FROM customer WHERE stored_card_number = '1234 5678 5344 7733'

2. If we want to look for card fraud, we need to be able to use the
full card number to join to transaction data and look up blocked card
lists etc..

3. We want to block the direct retrieval of card numbers for
additional security.
In some cases, we might want to return an answer like '**** ***** **** 7733'

One question that immediately springs to mind is: would the format
apply when passing columns to other functions? If not, wouldn't
something like

SELECT upper(redacted_column::text) ...

just bypass the formatting?

Also, how would casting be handled? Would it be forbidden for such cases?

And couldn't the card number be worked out using:

SELECT 1 FROM customer WHERE stored_card_number LIKE '%1 7733';
?column?
----------
(0 rows)

SELECT 1 FROM customer WHERE stored_card_number LIKE '%2 7733';
?column?
----------
1
(1 row)

SELECT 1 FROM customer WHERE stored_card_number LIKE '%12 7733';
?column?
----------
(0 rows)

.. and so on, which could be scripted in a DO statement?

Not so much a challenge to the idea, but just wishing to understand
how it would work.

--
Thom

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#1)
Re: Column Redaction

On 10/10/2014 11:57 AM, Simon Riggs wrote:

Postgres currently supports column level SELECT privileges.

1. If we want to confirm a credit card number, we can issue SELECT 1
FROM customer WHERE stored_card_number = '1234 5678 5344 7733'

2. If we want to look for card fraud, we need to be able to use the
full card number to join to transaction data and look up blocked card
lists etc..

3. We want to block the direct retrieval of card numbers for
additional security.
In some cases, we might want to return an answer like '**** ***** **** 7733'

We can't do all of the above with current facilities inside the database.

Deny access to the underlying tables. Write SQL functions to do 1. and
2., and grant privileges to the functions, instead. For 3. create views
that do the redaction.

The ability to mask output for data in certain cases, for the purpose
of security, is known lately as data redaction, or column-level data
redaction.

The best way to support this requirement would be to allow columns to
have an additional "output formatting function". This would be
executed only when data is about to be returned by a query. All other
uses of that would not restrict the data.

I don't see how that could work. Once you have access to the datum, you
can find its value in many indirect ways, without invoking the output
function. For example, write a PL/pgSQL function that takes the card
number as argument. Use < and > to binary search its value. If you block
< and >, I'm sure there are countless other ways.

And messing with output functions seems pretty, well, messy, in general.

I think the only solution that's going to work in practice is to
implement the redaction at a higher level. Don't allow direct access to
the tables with card numbers. Create functions that do whatever joins,
etc. you need to do with them, and grant privileges to only the functions.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Damian Wolgast
damian.wolgast@si-co.net
In reply to: Simon Riggs (#1)
Re: Column Redaction

This would have other uses as well, such as default report formats, so
we can store financial amounts as NUMERIC, but format them on
retrieval as $12,345.78 etc..

Nice idea, but what if you need to do further calculations?
If you output the value of credit card transactions it works fine, but in
case you want to SUM up the values, then you need to cast it back from
text(?) to numeric, calculate it and cast it to text(?) again?
And if you do - for any reason - need the credit card number in your
application (for example sending it to the credit card company to deduct
money) how can you retrieve it¹s original value?

Moreover, if you SELECT from a sub-SELECT which already has the formatted
information and not the plain data?

Maybe you should restrict access to tables for a certain user and only
allow the user to use a view which formats the output.

Modern applications do have a presentation layer which should take care of
data formatting. I am not sure if it is a good idea to mix data storage
and data presentation in the database.

Regards,
Damian Wolgast (irc:asymetrixs)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#4)
Re: Column Redaction

On 10 October 2014 10:29, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

On 10/10/2014 11:57 AM, Simon Riggs wrote:

Postgres currently supports column level SELECT privileges.

1. If we want to confirm a credit card number, we can issue SELECT 1
FROM customer WHERE stored_card_number = '1234 5678 5344 7733'

2. If we want to look for card fraud, we need to be able to use the
full card number to join to transaction data and look up blocked card
lists etc..

3. We want to block the direct retrieval of card numbers for
additional security.
In some cases, we might want to return an answer like '**** ***** ****
7733'

We can't do all of the above with current facilities inside the database.

Deny access to the underlying tables. Write SQL functions to do 1. and 2.,
and grant privileges to the functions, instead. For 3. create views that do
the redaction.

If everything were easy to lock down the approach you suggest is of
course the best way.

The problem there is that the SQL for (2) changes frequently, so we
want to give people SQL access.

Just not the ability to retrieve data in a usable form.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Simon Riggs
simon@2ndQuadrant.com
In reply to: Thom Brown (#3)
Re: Column Redaction

On 10 October 2014 10:15, Thom Brown <thom@linux.com> wrote:

On 10 October 2014 09:57, Simon Riggs <simon@2ndquadrant.com> wrote:

Postgres currently supports column level SELECT privileges.

1. If we want to confirm a credit card number, we can issue SELECT 1
FROM customer WHERE stored_card_number = '1234 5678 5344 7733'

2. If we want to look for card fraud, we need to be able to use the
full card number to join to transaction data and look up blocked card
lists etc..

3. We want to block the direct retrieval of card numbers for
additional security.
In some cases, we might want to return an answer like '**** ***** **** 7733'

One question that immediately springs to mind is: would the format
apply when passing columns to other functions? If not, wouldn't
something like

SELECT upper(redacted_column::text) ...

just bypass the formatting?

Yes, it would. As would SELECT redacted_column || ' '

I'm not sure how to block such usage, other than to apply it prior to
final calculation of functions.

i.e. we apply it in the SELECT clause, but not in the other clauses
FROM ON/WHERE/GROUP/ORDER/HAVING etc..

Also, how would casting be handled? Would it be forbidden for such cases?

And couldn't the card number be worked out using:

SELECT 1 FROM customer WHERE stored_card_number LIKE '%1 7733';
?column?
----------
(0 rows)

SELECT 1 FROM customer WHERE stored_card_number LIKE '%2 7733';
?column?
----------
1
(1 row)

SELECT 1 FROM customer WHERE stored_card_number LIKE '%12 7733';
?column?
----------
(0 rows)

.. and so on, which could be scripted in a DO statement?

Not so much a challenge to the idea, but just wishing to understand
how it would work.

Yes, covert channels would always exist. It would really be down to
auditing to control such exploits.

Redaction is aimed at minimising access in normal usage.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Damian Wolgast
damian.wolgast@si-co.net
In reply to: Simon Riggs (#6)
Re: Column Redaction

The problem there is that the SQL for (2) changes frequently, so we
want to give people SQL access.

So you want to give people access to your SQL database and worry that they could see specific information (credit card numbers) in plain and therefore you want to format it, so that people cannot see the real data. Is that correct?

I'd either do that by only letting them access a view or be reconsidering if it is really a good idea to give them SQL access to the server as they could do other things which e.g. could slow down the server enormously.
Never trust the user. So I see what you want to achieve but I am not sure if the reason to do that is good. Can you explain please?
Maybe you should provide them an interface (e.g. web app) that restricts access to certain functions and cares about formatting.

Regards
Damian Wolgast (irc:asymetrixs)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Simon Riggs
simon@2ndQuadrant.com
In reply to: Damian Wolgast (#8)
Re: Column Redaction

On 10 October 2014 11:08, Damian Wolgast <damian.wolgast@si-co.net> wrote:

The problem there is that the SQL for (2) changes frequently, so we
want to give people SQL access.

So you want to give people access to your SQL database and worry that they could see specific information (credit card numbers) in plain and therefore you want to format it, so that people cannot see the real data. Is that correct?

I'd either do that by only letting them access a view or be reconsidering if it is really a good idea to give them SQL access to the server as they could do other things which e.g. could slow down the server enormously.
Never trust the user. So I see what you want to achieve but I am not sure if the reason to do that is good. Can you explain please?
Maybe you should provide them an interface (e.g. web app) that restricts access to certain functions and cares about formatting.

The requirement for redaction cannot be provided by a view.

A view provides a single value for each column, no matter whether it
is used in SELECT or WHERE clause.

Redaction requires output formatting only, but unchanged for other purposes.

Redaction is now a feature available in other databases. I guess its
possible its all smoke and mirrors, but thats why we discuss stuff
before we build it.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#9)
Re: Column Redaction

On 10/10/2014 01:21 PM, Simon Riggs wrote:

Redaction is now a feature available in other databases. I guess its
possible its all smoke and mirrors, but thats why we discuss stuff
before we build it.

I googled for Oracle Data redaction, and found "General Usage guidelines":

General Usage Guidelines

* Oracle Data Redaction is not intended to protect against attacks by
privileged database users who run ad hoc queries directly against the
database.

* Oracle Data Redaction is not intended to protect against users who
run exhaustive SQL queries that attempt to determine the actual
values by inference.

So it's not actually suitable for the example you gave. I don't think we
want this feature...

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#9)
Re: Column Redaction

Simon,

* Simon Riggs (simon@2ndquadrant.com) wrote:

The requirement for redaction cannot be provided by a view.

A view provides a single value for each column, no matter whether it
is used in SELECT or WHERE clause.

Redaction requires output formatting only, but unchanged for other purposes.

Redaction is now a feature available in other databases. I guess its
possible its all smoke and mirrors, but thats why we discuss stuff
before we build it.

In general, I'm on-board with the idea and similar requests have come
from users I've talked with.

Is there any additional information available on how these other
databases deal with the questions and concerns which have been raised?

Regarding functions, 'leakproof' functions should be alright to allow,
though Heikki brings up a good point regarding binary search being
possible in a plpgsql function (or even directly by a client). Of
course, that approach also requires that you have a specific item in
mind. Methods to mitigate would include not allowing regular users to
create functions or run DO blocks and rate-limiting their queries, along
with appropriate auditing.

Thanks,

Stephen

#12Pavel Stehule
pavel.stehule@gmail.com
In reply to: Simon Riggs (#1)
Re: Column Redaction

Hi

2014-10-10 10:57 GMT+02:00 Simon Riggs <simon@2ndquadrant.com>:

Postgres currently supports column level SELECT privileges.

1. If we want to confirm a credit card number, we can issue SELECT 1
FROM customer WHERE stored_card_number = '1234 5678 5344 7733'

2. If we want to look for card fraud, we need to be able to use the
full card number to join to transaction data and look up blocked card
lists etc..

3. We want to block the direct retrieval of card numbers for
additional security.
In some cases, we might want to return an answer like '**** ***** ****
7733'

We can't do all of the above with current facilities inside the database.

The ability to mask output for data in certain cases, for the purpose
of security, is known lately as data redaction, or column-level data
redaction.

The best way to support this requirement would be to allow columns to
have an additional "output formatting function". This would be
executed only when data is about to be returned by a query. All other
uses of that would not restrict the data.

This would have other uses as well, such as default report formats, so
we can store financial amounts as NUMERIC, but format them on
retrieval as $12,345.78 etc..

Suggested user interface would be...
FORMAT functionname(parameters, if any)

e.g.
CREATE TABLE customer
( id ...
...
, stored_card_number NUMERIC FORMAT pci_card_number_redaction()
...
);

We'd need to implement something to allow pg_dump to ignore format
functions. I suggest the best way to do that is by providing a BACKUP
role that can be delegated to other users. We would then allow a
parameter for SET output_formatting = on | off, which can only be set
by superuser and BACKUP role, then have pg_dump issue SET
output_formatting = off explicitly when it runs.

I see a benefit of this feature as alternative output function .. I
remember a talk about output format of boolean function. But how this
feature can help to security?

You should to disallow any expression over this column marked or you have
to enforced output alternative output function early.

When you require a alternative output format function (should be
implemented in C), then there is not too less work than implementation of
new type. So probably much more practical a any expression can be used

like

stored_card_number NUMERIC FORMAT (right(stored_card_numbe::text, 4))

Regards

Pavel

Show quoted text

Do we want redaction in PostgreSQL?
Do we want it generalised into output format functions?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Thom Brown
thom@linux.com
In reply to: Stephen Frost (#11)
Re: Column Redaction

On 10 October 2014 11:35, Stephen Frost <sfrost@snowman.net> wrote:

Simon,

* Simon Riggs (simon@2ndquadrant.com) wrote:

The requirement for redaction cannot be provided by a view.

A view provides a single value for each column, no matter whether it
is used in SELECT or WHERE clause.

Redaction requires output formatting only, but unchanged for other purposes.

Redaction is now a feature available in other databases. I guess its
possible its all smoke and mirrors, but thats why we discuss stuff
before we build it.

In general, I'm on-board with the idea and similar requests have come
from users I've talked with.

Is there any additional information available on how these other
databases deal with the questions and concerns which have been raised?

Regarding functions, 'leakproof' functions should be alright to allow,
though Heikki brings up a good point regarding binary search being
possible in a plpgsql function (or even directly by a client). Of
course, that approach also requires that you have a specific item in
mind. Methods to mitigate would include not allowing regular users to
create functions or run DO blocks and rate-limiting their queries, along
with appropriate auditing.

To be honest, this all sounds rather flaky. Even if you do rate-limit
their queries, they can use methods that avoid rate-limiting, such as
recursive queries. And if you're only after one credit card number
(to use the original example), you'd get it in a relatively short
amount of time, despite some rate-limiting system.

This gives the vague impression of security, but it really seems just
the placing of a few obstacles in the way.

And "auditing" sounds like a euphemism for "pass the problem of
security on elsewhere anyway".

Thom

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Stephen Frost
sfrost@snowman.net
In reply to: Heikki Linnakangas (#10)
Re: Column Redaction

* Heikki Linnakangas (hlinnakangas@vmware.com) wrote:

On 10/10/2014 01:21 PM, Simon Riggs wrote:

Redaction is now a feature available in other databases. I guess its
possible its all smoke and mirrors, but thats why we discuss stuff
before we build it.

I googled for Oracle Data redaction, and found "General Usage guidelines":

General Usage Guidelines

* Oracle Data Redaction is not intended to protect against attacks by
privileged database users who run ad hoc queries directly against the
database.

* Oracle Data Redaction is not intended to protect against users who
run exhaustive SQL queries that attempt to determine the actual
values by inference.

So it's not actually suitable for the example you gave. I don't
think we want this feature...

Or, we need to consider how Oracle addresses these risks and consider if
we can provide a similar capability. Those capabilities may include
specific configuration and could be a prerequisite for this feature, but
I don't think it's sensible to say we don't want this feature simply
because it can't stand alone as a perfect answer to these risks.

As has been discussed before, we are likely in a better position to
identify the concerns and problem areas, come up with recommendations
for configuration and/or develop new capabilities to mitigate those
risks, than the every-day user or DBA. If we provide it and address
these issues in a central location which is generally available, then
fixes and problems can be addressed and fixed rather than every
database implementation faced with these concerns having to address
them independently with, most likely, poorer quality solutions.

While we don't want every feature of every database, this deserves more
consideration.

Thanks,

Stephen

#15Stephen Frost
sfrost@snowman.net
In reply to: Thom Brown (#13)
Re: Column Redaction

* Thom Brown (thom@linux.com) wrote:

To be honest, this all sounds rather flaky. Even if you do rate-limit
their queries, they can use methods that avoid rate-limiting, such as
recursive queries. And if you're only after one credit card number
(to use the original example), you'd get it in a relatively short
amount of time, despite some rate-limiting system.

The discussion about looking up specific card numbers in the original
email from Simon was actually an allowed use-case, as I understood it,
not a risk concern. Indeed, if you know a valid credit card number
already, as in this example, then why are you bothering with the search?
Perhaps it would provide confirmation, but it's not the database's
responsibility to make you forget the number you already have. Doing a
random walk through a keyspace of 10^16 and extracting a significant
enough number of results to be useful should be difficult. I agree that
if we're completely unable to make it difficult then this is less
useful, but I feel it's a bit early to jump to that conclusion.

This gives the vague impression of security, but it really seems just
the placing of a few obstacles in the way.

One might consider that all security is just placing obstacles in the
way.

And "auditing" sounds like a euphemism for "pass the problem of
security on elsewhere anyway".

Auditing is a known requirement for good security.. There's certainly
different levels of it, but if you aren't at least auditing your
security configuration for the attack vectors you're concerned about,
then you're unlikely to have any real security.

Thanks,

Stephen

#16Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Stephen Frost (#11)
Re: Column Redaction

On 10/10/2014 01:35 PM, Stephen Frost wrote:

Regarding functions, 'leakproof' functions should be alright to allow,
though Heikki brings up a good point regarding binary search being
possible in a plpgsql function (or even directly by a client). Of
course, that approach also requires that you have a specific item in
mind.

It doesn't require that you have a specific item in mind. Binary search
is cheap, O(log n). It's easy to write a function to do a binary search
on a single item, passed as argument, and then apply that to all rows:

SELECT binary_search_reveal(cardnumber) FROM redacted_table;

Really, I don't see how this can possible be made to work. You can't
allow ad hoc processing of data, and still avoid revealing it to the user.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Stephen Frost
sfrost@snowman.net
In reply to: Heikki Linnakangas (#16)
Re: Column Redaction

* Heikki Linnakangas (hlinnakangas@vmware.com) wrote:

On 10/10/2014 01:35 PM, Stephen Frost wrote:

Regarding functions, 'leakproof' functions should be alright to allow,
though Heikki brings up a good point regarding binary search being
possible in a plpgsql function (or even directly by a client). Of
course, that approach also requires that you have a specific item in
mind.

It doesn't require that you have a specific item in mind. Binary
search is cheap, O(log n). It's easy to write a function to do a
binary search on a single item, passed as argument, and then apply
that to all rows:

SELECT binary_search_reveal(cardnumber) FROM redacted_table;

Note that your binary_search_reveal wouldn't be marked as leakproof and
therefore this wouldn't be allowed. If this was allowed, you'd simply
do "raise notice" inside the function and call it a day.

Thanks,

Stephen

#18Hannu Krosing
hannu@tm.ee
In reply to: Simon Riggs (#6)
Re: Column Redaction

On 10/10/2014 11:38 AM, Simon Riggs wrote:

On 10 October 2014 10:29, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

On 10/10/2014 11:57 AM, Simon Riggs wrote:

Postgres currently supports column level SELECT privileges.

1. If we want to confirm a credit card number, we can issue SELECT 1
FROM customer WHERE stored_card_number = '1234 5678 5344 7733'

2. If we want to look for card fraud, we need to be able to use the
full card number to join to transaction data and look up blocked card
lists etc..

3. We want to block the direct retrieval of card numbers for
additional security.
In some cases, we might want to return an answer like '**** ***** ****
7733'

We can't do all of the above with current facilities inside the database.

Deny access to the underlying tables. Write SQL functions to do 1. and 2.,
and grant privileges to the functions, instead. For 3. create views that do
the redaction.

If everything were easy to lock down the approach you suggest is of
course the best way.

The problem there is that the SQL for (2) changes frequently, so we
want to give people SQL access.

1. Give people access to development system with "safe" data where they
write their functions

2. once function is working, pass it to auditors

3. deploy and use the function.

Just not the ability to retrieve data in a usable form.

For an attacker any access is "in a usable form", for honest people you
can just provide a view or set-returning function.

btw, one way to do the "redaction" you suggested above is to write a
special
type, which redacts data on output.

You can even make the type output function dependent on backup role.

Just make sure that users are aware that it is not really a security
feature
which protects against attackers.

Cheers

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Stephen Frost (#17)
Re: Column Redaction

On 10/10/2014 02:05 PM, Stephen Frost wrote:

* Heikki Linnakangas (hlinnakangas@vmware.com) wrote:

On 10/10/2014 01:35 PM, Stephen Frost wrote:

Regarding functions, 'leakproof' functions should be alright to allow,
though Heikki brings up a good point regarding binary search being
possible in a plpgsql function (or even directly by a client). Of
course, that approach also requires that you have a specific item in
mind.

It doesn't require that you have a specific item in mind. Binary
search is cheap, O(log n). It's easy to write a function to do a
binary search on a single item, passed as argument, and then apply
that to all rows:

SELECT binary_search_reveal(cardnumber) FROM redacted_table;

Note that your binary_search_reveal wouldn't be marked as leakproof and
therefore this wouldn't be allowed. If this was allowed, you'd simply
do "raise notice" inside the function and call it a day.

*shrug*, just do the same with a more complicated query, then. Even if
you can't create a function that does that, you can still execute the
same logic without a function.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Thom Brown
thom@linux.com
In reply to: Stephen Frost (#15)
Re: Column Redaction

On 10 October 2014 12:00, Stephen Frost <sfrost@snowman.net> wrote:

* Thom Brown (thom@linux.com) wrote:

To be honest, this all sounds rather flaky. Even if you do rate-limit
their queries, they can use methods that avoid rate-limiting, such as
recursive queries. And if you're only after one credit card number
(to use the original example), you'd get it in a relatively short
amount of time, despite some rate-limiting system.

The discussion about looking up specific card numbers in the original
email from Simon was actually an allowed use-case, as I understood it,
not a risk concern. Indeed, if you know a valid credit card number
already, as in this example, then why are you bothering with the search?

The topic being "column redaction" rather than "column formatting"
leads me to believe that the main use-case of the feature would be to
prevent the user from discovering the full value of the column. It's
not so much point 1 I was responding do, rather point 3, where you
don't know the card number, but you get information about it in the
results. The purpose of this feature would be to prevent the user
from seeing all that data, which is a security feature, but at the
moment it just seems to be a way of making it a little less easy to
get at that data.

This gives the vague impression of security, but it really seems just
the placing of a few obstacles in the way.

One might consider that all security is just placing obstacles in the
way.

There's a difference between intending that there shouldn't be a way
past security and just making access a matter of walking a longer
route.

I wouldn't be against formatting per se, but for the purposes of that,
I would say that views can already serve that purpose.

--
Thom

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Stephen Frost
sfrost@snowman.net
In reply to: Heikki Linnakangas (#19)
#22Stephen Frost
sfrost@snowman.net
In reply to: Thom Brown (#20)
#23Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Stephen Frost (#21)
#24Thom Brown
thom@linux.com
In reply to: Stephen Frost (#22)
#25Simon Riggs
simon@2ndQuadrant.com
In reply to: Thom Brown (#13)
#26Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#16)
#27Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#15)
#28Thom Brown
thom@linux.com
In reply to: Simon Riggs (#25)
#29Claudio Freire
klaussfreire@gmail.com
In reply to: Simon Riggs (#1)
#30Stephen Frost
sfrost@snowman.net
In reply to: Heikki Linnakangas (#23)
#31Stephen Frost
sfrost@snowman.net
In reply to: Thom Brown (#24)
#32Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#27)
#33Thom Brown
thom@linux.com
In reply to: Stephen Frost (#31)
#34Claudio Freire
klaussfreire@gmail.com
In reply to: Stephen Frost (#32)
#35Rod Taylor
rbt@rbt.ca
In reply to: Stephen Frost (#31)
#36Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Thom Brown (#33)
#37Bruce Momjian
bruce@momjian.us
In reply to: Thom Brown (#28)
#38Stephen Frost
sfrost@snowman.net
In reply to: Rod Taylor (#35)
#39Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#37)
#40Simon Riggs
simon@2ndQuadrant.com
In reply to: Rod Taylor (#35)
#41Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#10)
#42Joe Conway
mail@joeconway.com
In reply to: Simon Riggs (#40)
#43Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#41)
#44Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Simon Riggs (#1)
#45Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#40)
#46Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#45)
#47Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#46)
#48Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#47)
#49Claudio Freire
klaussfreire@gmail.com
In reply to: Simon Riggs (#40)
#50Simon Riggs
simon@2ndQuadrant.com
In reply to: Claudio Freire (#49)
#51Claudio Freire
klaussfreire@gmail.com
In reply to: Simon Riggs (#50)
#52Simon Riggs
simon@2ndQuadrant.com
In reply to: Claudio Freire (#51)
#53Claudio Freire
klaussfreire@gmail.com
In reply to: Simon Riggs (#52)
#54Simon Riggs
simon@2ndQuadrant.com
In reply to: Claudio Freire (#53)
#55Ibrar Ahmed
ibrar.ahmad@gmail.com
In reply to: Simon Riggs (#1)