[v9.1] sepgsql - userspace access vector cache
The attached patch adds a cache to contrib/sepgsql for SELinux access
control decisions. It should reduce the total number of system call
invocations and so improve the performance of its access control
checks.
In the current implementation, sepgsql always sends a query to the
in-kernel SELinux. However, the same answer is returned for a given
pair of security labels and object class, unless the security policy
has been reloaded.
That is a situation where a caching mechanism works well. Of course, we
assume the security policy is not reloaded very frequently.
I measured the time taken by sepgsql_restorecon(NULL), which is used to
assign initial labels to schemas, relations, columns and procedures.
It also invokes a massive number of "relabelfrom" and "relabelto"
permission checks.
$ time -p psql -c 'SELECT sepgsql_restorecon(NULL);' postgres
without patch
real 2.73
real 2.70
real 2.72
real 2.67
real 2.68
with patch
real 0.67
real 0.61
real 0.63
real 0.63
real 0.63
The improvement is obvious: the run time drops from about 2.7 seconds
to about 0.63 seconds, roughly a 4x speedup.
From the viewpoint of implementation, this patch replaces
sepgsql_check_perms() with sepgsql_avc_check_perms(), that is, the
non-cached interface with a cached one.
Each cached item is hashed on a pair of security labels and an object
class, so even if different objects have the same security label, the
system call is invoked only once per identical combination.
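As a rough illustration of that keying scheme, here is a toy model in Python. It is not code from the patch (which is C inside contrib/sepgsql); ToyAvc, fake_kernel and the label strings are hypothetical names chosen for the sketch, and fake_kernel merely stands in for the system call.

```python
from typing import Callable, Dict, Tuple

# Key: (subject label, object label, object class). All names here are
# illustrative; the patch itself is C code inside contrib/sepgsql.
AvcKey = Tuple[str, str, str]


class ToyAvc:
    """Caches the allowed-permission set for each (scontext, tcontext, tclass)."""

    def __init__(self, ask_kernel: Callable[[str, str, str], frozenset]):
        self._ask_kernel = ask_kernel      # stand-in for the system call
        self._cache: Dict[AvcKey, frozenset] = {}
        self.kernel_calls = 0

    def check_perms(self, scontext: str, tcontext: str, tclass: str,
                    required: set) -> bool:
        key = (scontext, tcontext, tclass)
        allowed = self._cache.get(key)
        if allowed is None:                # cache miss: ask once, remember
            self.kernel_calls += 1
            allowed = self._ask_kernel(scontext, tcontext, tclass)
            self._cache[key] = allowed
        return required <= allowed         # every required perm is allowed


def fake_kernel(scontext: str, tcontext: str, tclass: str) -> frozenset:
    # Hypothetical policy: "select" is allowed, nothing else.
    return frozenset({"select"})


avc = ToyAvc(fake_kernel)
client = "unconfined_u:unconfined_r:unconfined_t:s0"
table_label = "system_u:object_r:sepgsql_table_t:s0"

# Three different tables that happen to carry the same label.
results = [avc.check_perms(client, table_label, "db_table", {"select"})
           for _table in ("t1", "t2", "t3")]
```

In this sketch the three checks share one cache entry, so the stand-in for the kernel is consulted once rather than three times.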
The only modification this patch makes to the core routines is a new
syscache for the pg_seclabel system catalog. SECLABELOID makes it
possible to look up the security label of an object through the
syscache interface.
Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
Attachments:
sepgsql-uavc.1.patch (application/octet-stream, +753 -229)
On Thu, Jun 9, 2011 at 3:59 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> The only modification this patch makes to the core routines is a new
> syscache for the pg_seclabel system catalog. SECLABELOID makes it
> possible to look up the security label of an object through the
> syscache interface.
I believe we decided against that previously on the grounds that we
don't want to add syscaches that might get really really big. In
particular, there could be a LOT of labelled large objects floating
around.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
* Kohei KaiGai (kaigai@kaigai.gr.jp) wrote:
> The only modification this patch makes to the core routines is a new
> syscache for the pg_seclabel system catalog. SECLABELOID makes it
> possible to look up the security label of an object through the
> syscache interface.
Perhaps I'm missing it, but.. why is this necessary to implement such a
cache? Also, I thought the SELinux userspace libraries provided a cache
solution? This issue is hardly unique to SELinux in PostgreSQL...
Thanks,
Stephen
2011/6/9 Robert Haas <robertmhaas@gmail.com>:
> On Thu, Jun 9, 2011 at 3:59 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>> The only modification this patch makes to the core routines is a new
>> syscache for the pg_seclabel system catalog. SECLABELOID makes it
>> possible to look up the security label of an object through the
>> syscache interface.
> I believe we decided against that previously on the grounds that we
> don't want to add syscaches that might get really really big. In
> particular, there could be a LOT of labelled large objects floating
> around.
(Sorry, I forgot to Cc: pgsql-hackers, so I am sending this again.)
As long as we use the syscache mechanism to hold security labels of
relations or other cached objects, do you think it causes no trouble?
If so, it may be a good idea to distinguish the case where we look up
the security label of a blob from the other cases.
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
2011/6/9 Stephen Frost <sfrost@snowman.net>:
> * Kohei KaiGai (kaigai@kaigai.gr.jp) wrote:
>> The only modification this patch makes to the core routines is a new
>> syscache for the pg_seclabel system catalog. SECLABELOID makes it
>> possible to look up the security label of an object through the
>> syscache interface.
> Perhaps I'm missing it, but.. why is this necessary to implement such a
> cache? Also, I thought the SELinux userspace libraries provided a cache
> solution? This issue is hardly unique to SELinux in PostgreSQL...
I'm concerned about its interface, although it might be suitable for
the X Window System...
Its avc interface identifies a security context by a pointer to a
malloc()'ed cstring.
In our case, we would have to take the result of the syscache lookup and
look that security context up again in the hash table managed by
libselinux, which makes little sense.
In addition, the avc of libselinux checks whether the security policy
has been reloaded on every avc lookup, unless we launch a thread that
monitors the system state.
But launching a worker thread for each pgsql backend is not a suitable
design.
Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
On Thu, Jun 9, 2011 at 12:39 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> As long as we use the syscache mechanism to hold security labels of
> relations or other cached objects, do you think it causes no trouble?
Maybe, but why do we need it?
--
Robert Haas
On Thu, Jun 9, 2011 at 12:54 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> I'm concerned about its interface, although it might be suitable for
> the X Window System...
> Its avc interface identifies a security context by a pointer to a
> malloc()'ed cstring.
> In our case, we would have to take the result of the syscache lookup and
> look that security context up again in the hash table managed by
> libselinux, which makes little sense.
So you're going to depend on the syscache not to move the pointers
around? Yikes.
> In addition, the avc of libselinux checks whether the security policy
> has been reloaded on every avc lookup, unless we launch a thread that
> monitors the system state.
> But launching a worker thread for each pgsql backend is not a suitable
> design.
I thought there was something you could mmap() into each backend...?
--
Robert Haas
2011/6/9 Robert Haas <robertmhaas@gmail.com>:
> On Thu, Jun 9, 2011 at 12:39 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>> As long as we use the syscache mechanism to hold security labels of
>> relations or other cached objects, do you think it causes no trouble?
> Maybe, but why do we need it?
Of course, I'd like to look up the security label of a referenced object
at the smallest possible cost.
There are two levels of lookup here.
The first is from object identifiers to security labels; it can be sped up
using the syscache mechanism. The second is from security labels to
access control decisions; it can be sped up using the userspace avc.
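The two levels can be pictured with a small Python sketch. This is only a toy model under stated assumptions: the PG_SECLABEL dict stands in for the pg_seclabel catalog, decide() stands in for the in-kernel SELinux query, and the OIDs and label strings are made up for the example. None of it is taken from the patch.

```python
from functools import lru_cache

# Hypothetical stand-ins: a dict plays the role of the pg_seclabel
# catalog, and decide() plays the role of the in-kernel SELinux query.
PG_SECLABEL = {
    ("pg_class", 16384): "system_u:object_r:sepgsql_table_t:s0",
    ("pg_class", 16390): "system_u:object_r:sepgsql_table_t:s0",
}
catalog_scans = 0
kernel_queries = 0


@lru_cache(maxsize=None)
def label_of(catalog: str, oid: int) -> str:
    """Level 1: object identifier -> security label (the syscache's job)."""
    global catalog_scans
    catalog_scans += 1
    return PG_SECLABEL[(catalog, oid)]


@lru_cache(maxsize=None)
def decide(scontext: str, tcontext: str, tclass: str) -> bool:
    """Level 2: labels -> access decision (the userspace avc's job)."""
    global kernel_queries
    kernel_queries += 1
    return True  # toy policy: allow everything


def may_access(client: str, catalog: str, oid: int, tclass: str) -> bool:
    return decide(client, label_of(catalog, oid), tclass)


client = "unconfined_u:unconfined_r:unconfined_t:s0"
for _ in range(3):
    may_access(client, "pg_class", 16384, "db_table")
    may_access(client, "pg_class", 16390, "db_table")
```

Six checks on two tables cost two catalog lookups (one per object) but only one decision query, because both tables carry the same label.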
Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
2011/6/9 Robert Haas <robertmhaas@gmail.com>:
> On Thu, Jun 9, 2011 at 12:54 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>> Its avc interface identifies a security context by a pointer to a
>> malloc()'ed cstring.
>> In our case, we would have to take the result of the syscache lookup and
>> look that security context up again in the hash table managed by
>> libselinux, which makes little sense.
> So you're going to depend on the syscache not to move the pointers
> around? Yikes.
No. It makes little sense because the cached security labels of table X
and table Y are allocated at different addresses, so it eventually invokes
a system call for each table, even if these tables have identical security
labels.
>> In addition, the avc of libselinux checks whether the security policy
>> has been reloaded on every avc lookup, unless we launch a thread that
>> monitors the system state.
>> But launching a worker thread for each pgsql backend is not a suitable
>> design.
> I thought there was something you could mmap() into each backend...?
selinux_status_open() maps a SELinux status page that lets applications
read a flag showing whether the policy has been reloaded, without any
system call invocation. This function is called when sepgsql initializes
itself, so the mapping is inherited by child processes.
(Please note that the avc of libselinux does not use this feature yet...)
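The idea of checking a mapped status page before each cache lookup can be sketched as follows. This is a toy Python model, not libselinux code: StatusPage and ReloadAwareCache are hypothetical names, and the sketch assumes the page exposes a counter that changes whenever the policy is reloaded.

```python
class StatusPage:
    """Toy stand-in for the shared status page; seqno changes on reload."""

    def __init__(self):
        self.seqno = 0

    def reload_policy(self):
        self.seqno += 1


class ReloadAwareCache:
    def __init__(self, status, compute):
        self._status = status
        self._compute = compute          # stand-in for asking the kernel
        self._seen_seqno = status.seqno
        self._entries = {}
        self.misses = 0

    def lookup(self, key):
        # Cheap in-memory comparison; no system call on the fast path.
        if self._status.seqno != self._seen_seqno:
            self._entries.clear()        # policy changed: drop everything
            self._seen_seqno = self._status.seqno
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = self._compute(key)
        return self._entries[key]


status = StatusPage()
cache = ReloadAwareCache(status, compute=lambda key: True)
key = ("client_label", "table_label", "db_table")
cache.lookup(key)
cache.lookup(key)                        # served from the cache
misses_before_reload = cache.misses
status.reload_policy()
cache.lookup(key)                        # cache was flushed, so a miss again
misses_after_reload = cache.misses
```

The point of the design is that a stale decision is never served after a reload, while the common case costs only one integer comparison.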
Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
On Thu, Jun 9, 2011 at 3:09 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> There are two levels of lookup here.
> The first is from object identifiers to security labels; it can be sped up
> using the syscache mechanism. The second is from security labels to
> access control decisions; it can be sped up using the userspace avc.
OK. Let's have two separate patches, then.
Thinking a bit more about the issue of adding a syscache, I suspect
it's probably OK to use that mechanism rather than inventing something
more complicated that can kick out entries on an LRU basis. Even if
you accessed a few tens of thousands of entries, the cache shouldn't
grow to more than a few megabytes. And that's presumably going to be
rare, and could happen with other types of objects (such as tables)
using the existing system caches. So I guess it's probably premature
optimization to worry about the issue now.
I would, however, like to see some performance results indicating how
much the cache helps, and how much memory it eats up in the process.
--
Robert Haas
2011/6/12 Robert Haas <robertmhaas@gmail.com>:
> OK. Let's have two separate patches, then.
> Thinking a bit more about the issue of adding a syscache, I suspect
> it's probably OK to use that mechanism rather than inventing something
> more complicated that can kick out entries on an LRU basis. Even if
> you accessed a few tens of thousands of entries, the cache shouldn't
> grow to more than a few megabytes. And that's presumably going to be
> rare, and could happen with other types of objects (such as tables)
> using the existing system caches. So I guess it's probably premature
> optimization to worry about the issue now.
> I would, however, like to see some performance results indicating how
> much the cache helps, and how much memory it eats up in the process.
The attached patches are the separated ones.
The smaller one adds a new SECLABELOID syscache and modifies
GetSecurityLabel() to use the syscache interface when the supplied
ObjectAddress does not point to LargeObjectRelationId.
The larger one is the contrib/sepgsql part; it adds the cache mechanism
for access control decisions.
I measured performance with and without these patches.
The avc reduced the cost of making access control decisions
in all cases. In addition, the syscache also improved
performance when pg_seclabel is not heavily updated.
* Test 1. time to run SELECT sepgsql_restorecon(NULL)
 selinux | SECLABELOID syscache
   avc   | without |   with
---------+---------+----------
 without | 2.63[s] | 2.81[s]
---------+---------+----------
 with    | 0.60[s] | 0.59[s]
---------+---------+----------
The selinux avc clearly improved performance; however,
the effect of the syscache is unclear because this workload also
updates the system catalog, which may be the bottleneck
of the benchmark.
* Test 2. time to run 50,000 SELECTs from empty tables
 selinux |  SECLABELOID syscache
   avc   |  without  |   with
---------+-----------+-----------
 without | 185.59[s] | 178.38[s]
---------+-----------+-----------
 with    |  23.58[s] |  21.79[s]
---------+-----------+-----------
So, I also measured the performance of read-only queries.
The SECLABELOID syscache also improved performance here, by
roughly 4% to 8%, even though the measured time covers the whole
of query parsing, optimizing and executing.
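For reference, the ratios implied by the Test 2 table can be recomputed with a few lines of Python. The timings are copied from the table; the variable names and the saving() helper are only illustrative.

```python
# Timings in seconds, copied from the Test 2 table.
times = {
    ("no avc", "no syscache"): 185.59,
    ("no avc", "syscache"): 178.38,
    ("avc", "no syscache"): 23.58,
    ("avc", "syscache"): 21.79,
}


def saving(before: float, after: float) -> float:
    """Fraction of run time saved going from `before` to `after`."""
    return (before - after) / before


avc_speedup = times[("no avc", "no syscache")] / times[("avc", "no syscache")]
syscache_saving_no_avc = saving(times[("no avc", "no syscache")],
                                times[("no avc", "syscache")])
syscache_saving_with_avc = saving(times[("avc", "no syscache")],
                                  times[("avc", "syscache")])

print(f"avc speedup: {avc_speedup:.1f}x")
print(f"syscache saving without avc: {syscache_saving_no_avc:.1%}")
print(f"syscache saving with avc: {syscache_saving_with_avc:.1%}")
```

On these numbers the avc accounts for a speedup of just under 8x, while the syscache saves a few percent on top of it.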
Regarding memory consumption, we don't need to worry about
the uavc, because it caches one entry per pair
of security labels and object class, and a particular security label
tends to be shared by a large number of objects.
For the syscache, a typical security label in selinux is shorter
than 64 bytes. If we assume an entry consumes 128 bytes
including the Oid pairs and pointers, it consumes about 128 KBytes
per 1,000 tables or other objects.
(Do we have a way to check the syscache status?)
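The estimate is simple arithmetic, sketched here in Python. The 128-byte entry size is the assumption stated in the message, not a measured value, and the 50,000-entry case is just an example of "a few tens of thousands of entries".

```python
ENTRY_BYTES = 128  # assumed per-entry cost, taken from the estimate


def syscache_bytes(n_entries: int, entry_bytes: int = ENTRY_BYTES) -> int:
    """Estimated memory for n cached security labels."""
    return n_entries * entry_bytes


per_thousand = syscache_bytes(1_000)         # 128,000 bytes, i.e. 125 KiB
tens_of_thousands = syscache_bytes(50_000)   # 6,400,000 bytes, about 6.1 MiB
```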
Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
On Mon, Jun 13, 2011 at 7:51 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> For the syscache, a typical security label in selinux is shorter
> than 64 bytes. If we assume an entry consumes 128 bytes
> including the Oid pairs and pointers, it consumes about 128 KBytes
> per 1,000 tables or other objects.
> (Do we have a way to check the syscache status?)
I was thinking you might start a new session, SELECT pg_backend_pid()
to get the PID, use top/ps to get its memory usage; then do a bunch of
stuff and see how much it's grown. The difference between how much it
grows with and without the patch is the amount of additional memory
the patch consumes.
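One way to take such a measurement on Linux, as an alternative to top/ps, is to read the Vm* counters from /proc/<pid>/status. The helper below is a minimal sketch; the function names are made up for the example, and it assumes a Linux /proc filesystem and permission to read the backend's status file.

```python
import re
from typing import Dict


def parse_vm_fields(status_text: str) -> Dict[str, int]:
    """Return the Vm* fields of a /proc/<pid>/status dump, in kB."""
    fields = {}
    for line in status_text.splitlines():
        match = re.match(r"(Vm\w+):\s+(\d+)\s*kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def read_vm_fields(pid: int) -> Dict[str, int]:
    """Linux only: read the counters for a live process."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_fields(f.read())


# Parsing a small sample so the sketch runs without a live backend.
sample = "Name:\tpostgres\nVmPeak:\t  150812 kB\nVmRSS:\t    3800 kB\n"
parsed = parse_vm_fields(sample)
```

In practice one would call read_vm_fields() with the PID returned by SELECT pg_backend_pid(), once before and once after the workload, and compare the two dictionaries.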
--
Robert Haas
2011/6/13 Robert Haas <robertmhaas@gmail.com>:
> I was thinking you might start a new session, SELECT pg_backend_pid()
> to get the PID, use top/ps to get its memory usage; then do a bunch of
> stuff and see how much it's grown. The difference between how much it
> grows with and without the patch is the amount of additional memory
> the patch consumes.
I checked the memory consumption of the backend with and without
the patches. Because sepgsql_restorecon() tries to reset the security
labels of all schemas, relations, columns and procedures,
running this function is a good way to maximize the difference
between the two cases.
The results show about 3MB of additional consumption
in VmRSS, even though it caches the security labels of all objects
created by default (3331 entries).
* without patches before/after sepgsql_restorecon()
VmPeak: 150812 kB -> 170864 kB
VmSize: 150804 kB -> 154712 kB
VmLck: 0 kB -> 0 kB
VmHWM: 3800 kB -> 22248 kB
VmRSS: 3800 kB -> 10620 kB
VmData: 1940 kB -> 5820 kB
VmStk: 196 kB -> 196 kB
VmExe: 5324 kB -> 5324 kB
VmLib: 2468 kB -> 2468 kB
VmPTE: 108 kB -> 120 kB
VmSwap: 0 kB -> 0 kB
* with patches before/after sepgsql_restorecon()
VmPeak: 150816 kB -> 175092 kB
VmSize: 150808 kB -> 158804 kB
VmLck: 0 kB -> 0 kB
VmHWM: 3868 kB -> 25956 kB
VmRSS: 3868 kB -> 13736 kB
VmData: 1944 kB -> 9912 kB
VmStk: 192 kB -> 192 kB
VmExe: 5324 kB -> 5324 kB
VmLib: 2472 kB -> 2472 kB
VmPTE: 100 kB -> 124 kB
VmSwap: 0 kB -> 0 kB
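The "about 3MB" figure follows from the VmRSS rows of the two listings. A short Python check, with the values copied from the listings and the variable names chosen only for this example:

```python
# VmRSS in kB, (before, after) sepgsql_restorecon(), copied from the listings.
vm_rss = {
    "without patches": (3800, 10620),
    "with patches": (3868, 13736),
}

# Growth of the resident set during the run, per configuration.
growth = {name: after - before for name, (before, after) in vm_rss.items()}

# Extra growth attributable to the patches.
extra_kb = growth["with patches"] - growth["without patches"]
print(growth, extra_kb)
```

The backend grows by 6820 kB without the patches and 9868 kB with them, a difference of 3048 kB.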
Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
The attached patch redefines pg_seclabel.provider as NameData instead of Text,
and reverts the collation-related changes to catcache.c, to stay consistent
with the security label support on shared objects.
All the format changes are hidden by (Get|Set)SecurityLabel(), so no
change is needed to the contrib/sepgsql patch.
Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
Attachments:
pgsql-v9.2-uavc-syscache.v2.patch (application/octet-stream, +53 -9)
I rebased the userspace access vector cache patch onto the latest tree.
I'll describe the background of this patch because this thread has not been
active for more than a week.
sepgsql asks the in-kernel selinux whenever it needs to make an access
control decision, so it always causes system call invocations.
However, the access control decision of selinux for a particular pair of
security labels is consistent as long as the security policy is not reloaded.
Thus, it is a good idea to cache recently used access control decisions in
userspace.
In addition, the current GetSecurityLabel() always opens the pg_seclabel
catalog and scans it to fetch the security label of a database object,
although this is a situation where we can use the syscache mechanism.
The "uavc-syscache" patch adds a new SECLABELOID syscache.
It also redefines pg_seclabel.provider as Name instead of Text, following
the suggestion from Tom.
(To avoid a collation-aware datatype)
The "uavc-selinux-cache" patch adds the cache mechanism to contrib/sepgsql.
Its internal API for communicating with selinux (sepgsql_check_perms) was
replaced by the newer sepgsql_avc_check_perms, which checks the cached access
control decision first, before any system call invocation.
The performance improvement is obvious.
* Test 2. time to run 50,000 SELECTs from empty tables
 selinux |  SECLABELOID syscache
   avc   |  without  |   with
---------+-----------+-----------
 without | 185.59[s] | 178.38[s]
---------+-----------+-----------
 with    |  23.58[s] |  21.79[s]
---------+-----------+-----------
I strongly hope this patch (and the security label support on shared
objects) gets upstreamed in this commit-fest, because it will serve as a
basis for other upcoming features.
Please volunteer to review!
Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
Sorry, the syscache part was mixed into the contrib/sepgsql part
in my previous post.
Please see the attached revision.
Although its functionality is simple enough (it just reduces the
number of system call invocations), its performance
improvement is obvious.
So, I hope someone will volunteer to review these patches.
Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
On 2011-07-14 21:46, Kohei KaiGai wrote:
> Sorry, the syscache part was mixed into the contrib/sepgsql part
> in my previous post.
> Please see the attached revision.
> Although its functionality is simple enough (it just reduces the
> number of system call invocations), its performance
> improvement is obvious.
> So, I hope someone will volunteer to review these patches.
I will be able to look at this patch next week on Monday and Tuesday,
without wanting to raise any expectations about that time being enough
for me to say anything useful. On a longer timescale, I believe that
sepgsql is one of the most important new features of PostgreSQL, and
I want to commit myself to spending more community work on this subject.
regards,
Yeb
--
Yeb Havinga
http://www.mgrid.net/
Mastering Medical Data
Hello KaiGai-san,
I've been preparing to review this patch by reading both the pgsql-hackers
history on sepgsql and the RHEL 6 guide on SELinux this weekend.
Today I installed git HEAD with --with-selinux on Scientific Linux 6
(developer installation); so far almost everything looks good.
These things should probably be added to the 9.1beta3 documentation branch:
1) the line with for DBNAME in ... do postgres --single etc, lacks a -D
argument and hence gives the error:
postgres does not know where to find the server configuration file.
2) there is a dependency on objects outside of the PostgreSQL
installation tree, in /etc/selinux/targeted/contexts/sepgsql_contexts,
and that file has an error that is thrown when contrib/sepgsql is executed:
/etc/selinux/targeted/contexts/sepgsql_contexts: line 33 has invalid
object type db_blobs
(same for db_language)
I found your fix for the error on a forum on oss.tresys.com, but IMHO
either the contrib/sepgsql documentation should mention that the dependency
exists and that it might contain errors for (older) reference policies, or
it should include a bug-free reference policy for sepgsql to replace the
system-installed refpolicy with (and mention that in the install documentation).
3) sepgsql is currently a bit hard to find in the documentation.
The www.postgresql.org website search doesn't find sepgsql, and a search
for selinux only returns an old PostgreSQL Red Hat bug from 2005 on
bugzilla.redhat.com. I had to remember by myself that it was a contrib
module. Also, sepgsql isn't linked from the SECURITY LABEL page. At the
moment I'm unsure whether I have seen all sepgsql-related SGML documentation.
After fixing the refpolicy I proceeded with the contrib/sepgsql manual,
with the goal of getting something easy done, like creating a top secret
table such as 'thisyearsbonusses' and a single user 'boss', and configuring
sepgsql in such a way that only the boss can access the top secret table.
I've read the contrib documentation and browsed the links at the bottom of
the page, but so far I don't have a clue how to proceed. Until I do,
I don't feel it's appropriate for me to review the avc patch.
Would you be willing to help me get started? Specific
questions are:
1) The contrib doc under DML permissions talks about 'db_table:select'
etc. What are these things? They are not labels, since I do not see them
listed in the output of 'select distinct label from pg_seclabel'.
2) The regression test label.sql introduces labels with types
sepgsql_trusted_proc_exec_t, sepgsql_ro_table_t. My question is: where
are these defined? What is their meaning? Can I define my own?
3) In the examples so far I've seen unconfined_u and system_u? Can I
define my own?
Thanks,
Yeb Havinga
Show quoted text
On 2011-07-14 21:46, Kohei KaiGai wrote:
Sorry, the syscache part was mixed to contrib/sepgsql part
in my previous post.
Please see the attached revision.Although its functionality is enough simple (it just reduces
number of system-call invocation), its performance
improvement is obvious.
So, I hope someone to volunteer to review these patches.Thanks,
2011/7/11 Kohei KaiGai<kaigai@kaigai.gr.jp>:
I rebased the userspace access vector cache patch to the latest tree.
I'll describe the background of this patch because this thread has not been
active for more than a week.
The sepgsql asks in-kernel selinux when it needs to make its access control
decision, so it always causes system call invocations.
However, the access control decision of selinux for a particular pair of
security labels is consistent as long as its security policy is not reloaded.
Thus, it is a good idea to cache recently used access control decisions in
userspace.
In addition, the current GetSecurityLabel() always opens the pg_seclabel
catalog and scans it to fetch the security label of database objects, although
it is a situation where we can utilize the syscache mechanism.
The "uavc-syscache" patch adds a new SECLABELOID syscache.
It also redefines pg_seclabel.provider as Name, instead of Text, according to
the suggestion from Tom.
(To avoid a collation-conscious datatype)
The "uavc-selinux-cache" patch adds a cache mechanism to contrib/sepgsql.
Its internal API to communicate with selinux (sepgsql_check_perms) was
replaced by the newer sepgsql_avc_check_perms, which checks cached access
control decisions first, prior to system call invocations.
The performance improvement is obvious.
* Test 2. time to run 50,000 SELECTs from empty tables
 selinux |  SECLABELOID syscache
   avc   |  without  |   with
---------+-----------+-----------
 without | 185.59[s] | 178.38[s]
 with    |  23.58[s] |  21.79[s]
I strongly hope this patch (and security label support on shared objects) will
get upstreamed in this commit-fest, because it will serve as a basis for
other upcoming features.
Please volunteer to review it!
Thanks,
2011/7/2 Kohei KaiGai<kaigai@kaigai.gr.jp>:
The attached patch re-defines pg_seclabel.provider as NameData, instead of Text,
and reverts the changes to catcache.c about collations, to keep consistency with
the security label support on shared objects.
All the format changes are hidden by (Get|Set)SecurityLabel(), so there is no
need to change the patch to contrib/sepgsql.
Thanks,
2011/6/13 Kohei KaiGai<kaigai@kaigai.gr.jp>:
2011/6/13 Robert Haas<robertmhaas@gmail.com>:
On Mon, Jun 13, 2011 at 7:51 AM, Kohei KaiGai<kaigai@kaigai.gr.jp> wrote:
For syscache, the length of a typical security label in selinux is
less than 64 bytes. If we assume an entry consumes 128 bytes
including Oid pairs or pointers, its consumption is 128 KBytes
per 1,000 tables or other objects.
(Do we have a way to confirm syscache status?)
I was thinking you might start a new session, SELECT pg_backend_pid()
to get the PID, use top/ps to get its memory usage; then do a bunch of
stuff and see how much it's grown. The difference between how much it
grows with and without the patch is the amount of additional memory
the patch consumes.
I checked memory consumption of the backend with / without
patches. Because sepgsql_restorecon() tries to reset the security
labels of all the schemas, relations, columns and procedures,
executing this function is suitable to emphasize the difference
between the two cases at its maximum.
The results show about 3MB of additional consumption
in VmRSS, even though it caches the security labels of all the objects
created by default (3331 entries).
* without patches before/after sepgsql_restorecon()
VmPeak: 150812 kB -> 170864 kB
VmSize: 150804 kB -> 154712 kB
VmLck: 0 kB -> 0 kB
VmHWM: 3800 kB -> 22248 kB
VmRSS: 3800 kB -> 10620 kB
VmData: 1940 kB -> 5820 kB
VmStk: 196 kB -> 196 kB
VmExe: 5324 kB -> 5324 kB
VmLib: 2468 kB -> 2468 kB
VmPTE: 108 kB -> 120 kB
VmSwap: 0 kB -> 0 kB
* with patches before/after sepgsql_restorecon()
VmPeak: 150816 kB -> 175092 kB
VmSize: 150808 kB -> 158804 kB
VmLck: 0 kB -> 0 kB
VmHWM: 3868 kB -> 25956 kB
VmRSS: 3868 kB -> 13736 kB
VmData: 1944 kB -> 9912 kB
VmStk: 192 kB -> 192 kB
VmExe: 5324 kB -> 5324 kB
VmLib: 2472 kB -> 2472 kB
VmPTE: 100 kB -> 124 kB
VmSwap: 0 kB -> 0 kB
Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
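As a rough cross-check of the figures quoted above, the per-entry estimate and
the measured VmRSS growth can be compared with a few lines of arithmetic. This
is only an illustrative Python sketch: the numbers are copied from the mails,
the 128-byte entry size is the assumption stated in the estimate, and the
variable names are made up here.

```python
# Back-of-the-envelope check of the memory figures quoted in the thread.
# All numbers come from the mails; the variable names are illustrative only.

ENTRY_BYTES = 128   # assumed size of one syscache entry (from the estimate)
ENTRIES = 3331      # labeled objects created by default

# VmRSS before/after sepgsql_restorecon(), in kB
rss_without = (3800, 10620)
rss_with = (3868, 13736)

growth_without = rss_without[1] - rss_without[0]  # growth without patches
growth_with = rss_with[1] - rss_with[0]           # growth with patches
extra_kb = growth_with - growth_without           # additional cost of the patches

# What the 128-byte-per-entry estimate alone would predict, in kB
estimated_kb = ENTRY_BYTES * ENTRIES / 1024

print(f"growth without patches: {growth_without} kB")
print(f"growth with patches:    {growth_with} kB")
print(f"additional VmRSS:       {extra_kb} kB")
print(f"128-byte estimate:      {estimated_kb:.0f} kB")
```

The measured difference matches the "about 3MB" stated in the mail and is
larger than the per-entry estimate alone; the mails do not break down where
the remainder goes.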
Yeb, Thanks for your volunteering.
2011/7/18 Yeb Havinga <yebhavinga@gmail.com>:
Hello KaiGai-san,
I've been preparing to review this patch by reading both pgsql-hackers
history on sepgsql, and also the RHEL 6 guide on SELinux this weekend, today
I installed GIT HEAD with --with-selinux on Scientific Linux 6, developer
installation, so far almost everything looking good.
Scientific Linux 6 is not suitable, because its libselinux version is a bit
older than this patch expects (libselinux-2.0.99 or later).
My recommendation is Fedora 15 instead.
1) the line with for DBNAME in ... do postgres --single etc, lacks a -D
argument and hence gives the error:
postgres does not know where to find the server configuration file.
OK. I intended users to adjust their own paths (including -D option),
but an explicit "-D /path/to/database" seems to me more helpful as
an example.
I'll submit a patch in a separate thread.
2) there is a dependency to objects outside of the postgresql installation
tree in /etc/selinux/targeted/contexts/sepgsql_contexts, and that file has
an error that is thrown when contrib/sepgsql is executed:
/etc/selinux/targeted/contexts/sepgsql_contexts: line 33 has invalid object
type db_blobs
(same for db_language)
I found your fix for the error on a forum on oss.tresys.com, but IMHO either
the contrib/sepgsql should mention that the dependency exists and it might
contain errors for (older) reference policies, or it should include a
bugfree reference policy for sepgsql to replace the system installed
refpolicy with (and mention that in the install documentation).
It is not an error, just a notice informing users that the sepgsql_contexts
file contains invalid lines. It is harmless, so we can ignore it.
I don't think sepgsql.sgml should mention this noise, because it comes purely
from a problem in libselinux and refpolicy; these are external packages
from the viewpoint of PostgreSQL.
3) sepgsql is currently a bit hard to find in the documentation.
www.postgresql.org website search doesn't find sepgsql and selinux only
refers to an old PostgreSQL redhat bug in 2005 on bugzilla.redhat.com. I had
to manually remember it was a contrib module. Also sepgsql isn't linked to
at the SECURITY LABEL page. At the moment I'm unsure if I have seen all
sepgsql related sgml-based documentation.
Improvement of documentation is an issue.
The wiki.postgresql.org should be an appropriate place, maybe.
The reason why SECURITY LABEL does not point to sepgsql.sgml is
that it is general purpose infrastructure for all the upcoming label based
security mechanisms, not only sepgsql.
It was our consensus in v9.1 development.
After fixing the refpolicy I proceeded with the contrib/sepgsql manual, with
the goal to get something easy done, like create a top secret table like
'thisyearsbonusses', and a single user 'boss' and configure sepgsql in such
a way that only the boss can access the top secret table. I've read the
contrib documentation, browsed links on the bottom of the page but until now
I don't even have a clue how to proceed. Until I do so, I don't feel it's
appropriate for me to review the avc patch.
At least, you don't need to fix anything in the policy.
The point of this patch is to replace the existing mechanism (which always
asks in-kernel selinux via a system call) with a caching mechanism (which
needs only a minimum number of system calls) without any user-visible
changes.
So, it is not necessary to define a new policy for testing.
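The caching idea can be sketched as follows. This is a minimal Python
illustration, not the patch's C code: the class DecisionCache, the function
ask_kernel, and the toy rule inside it are hypothetical stand-ins. Only the
notion of keying on a pair of security labels plus an object class, and
dropping the cache when the policy is reloaded, comes from this thread.

```python
# Minimal sketch of a userspace cache for access control decisions.
# Hypothetical names throughout; the real patch is C code in contrib/sepgsql.

from typing import Callable, Dict, Set, Tuple

Key = Tuple[str, str, str]  # (subject label, object label, object class)


class DecisionCache:
    def __init__(self, ask_kernel: Callable[[str, str, str], Set[str]]):
        self._ask_kernel = ask_kernel  # stand-in for the expensive system call
        self._cache: Dict[Key, Set[str]] = {}
        self.kernel_calls = 0

    def check(self, scontext: str, tcontext: str, tclass: str, perm: str) -> bool:
        key = (scontext, tcontext, tclass)
        allowed = self._cache.get(key)
        if allowed is None:
            # Cache miss: ask once and remember the allowed permission set.
            allowed = self._ask_kernel(scontext, tcontext, tclass)
            self._cache[key] = allowed
            self.kernel_calls += 1
        return perm in allowed

    def reset(self) -> None:
        # Meant to be called when the security policy is reloaded.
        self._cache.clear()


def ask_kernel(scontext: str, tcontext: str, tclass: str) -> Set[str]:
    # Toy rule for demonstration only; a real implementation would ask SELinux.
    if tclass == "db_table" and tcontext.endswith("sepgsql_ro_table_t"):
        return {"select"}
    return {"select", "update", "insert", "delete"}


avc = DecisionCache(ask_kernel)
for _ in range(1000):
    avc.check("staff_t", "sepgsql_table_t", "db_table", "select")
print(avc.kernel_calls)
```

Here 1000 identical checks trigger a single call to the stand-in for the
kernel, because every later lookup hits the cached entry for the same
(label, label, class) key.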
Would you be willing to help me getting a bit started? Specific questions
are:
1) The contrib doc under DML permissions talks about 'db_table:select' etc?
What are these things? They are not labels since I do not see them listed in
the output of 'select distinct label from pg_seclabel'.
The security label plays a role similar to the user-id or the
ownership/object-acl in the default database access controls. The default
controls check the relationship between the user-id and the
ownership/object-acl of the target object. If this relationship allows a
particular action such as 'select' or 'update', the action is allowed when
the user requests it.
In a similar way, 'db_table:select' is a type of action ('select' on a table
object), not an identifier of a user or an object.
SELinux defines a set of allowed actions (such as 'db_table:select') between
a particular pair of security labels (such as 'staff_t' and 'sepgsql_table_t').
pg_seclabel holds only the security labels of the objects being referenced.
So, you should look at /selinux/class/db_*/perms to see the list of
permissions defined in the security policy (only a limited number of them are
in use for now).
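For readers who want to print that list as 'class:permission' pairs, here is a
small Python sketch. It assumes that each perms entry is a directory holding
one entry per permission name, which is how selinuxfs is usually laid out; the
function name list_class_permissions is made up for this example, and on newer
systems the mount point is typically /sys/fs/selinux rather than /selinux.

```python
# Sketch: list the permissions of the db_* object classes from selinuxfs.
# Assumes <mount>/class/<class>/perms is a directory with one entry per
# permission; the function name is hypothetical.

import os
from glob import glob


def list_class_permissions(selinuxfs="/selinux", pattern="db_*"):
    result = {}
    for class_dir in sorted(glob(os.path.join(selinuxfs, "class", pattern))):
        perms_dir = os.path.join(class_dir, "perms")
        if os.path.isdir(perms_dir):
            # Map the class name to its sorted permission names.
            result[os.path.basename(class_dir)] = sorted(os.listdir(perms_dir))
    return result


# Prints nothing if selinuxfs is not mounted at the given path.
for tclass, perms in list_class_permissions().items():
    for perm in perms:
        print(f"{tclass}:{perm}")
```

The names printed this way are permissions, which is why none of them show up
in 'select distinct label from pg_seclabel'.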
2) The regression test label.sql introduces labels with types
sepgsql_trusted_proc_exec_t, sepgsql_ro_table_t. My question is: where are
these defined? What is their meaning? Can I define my own?
The system's default security policy (selinux-policy package) defines all the
necessary labels, and the access control rules between them.
So, we never need to modify the security policy to run the regression test.
sepgsql_trusted_proc_exec_t means that a function with this label is a
trusted procedure. It switches the security label of the user during
execution of the function. It is a mechanism similar to SetExec or a security
definer function.
sepgsql_ro_table_t means a 'read-only' table that disallows any write
operations except from administrative domains.
3) In the examples so far I've seen unconfined_u and system_u? Can I define
my own?
You can define your own policy; however, I intend to run the regression test
without any modification of the default security policy.
Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
On Mon, Jul 18, 2011 at 4:21 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
3) sepgsql is currently a bit hard to find in the documentation.
www.postgresql.org website search doesn't find sepgsql and selinux only
refers to an old PostgreSQL redhat bug in 2005 on bugzilla.redhat.com. I had
to manually remember it was a contrib module. Also sepgsql isn't linked to
at the SECURITY LABEL page. At the moment I'm unsure if I have seen all
sepgsql related sgml-based documentation.
Improvement of documentation is an issue.
The wiki.postgresql.org should be an appropriate place, maybe.
The reason why SECURITY LABEL does not point to sepgsql.sgml is
that it is general purpose infrastructure for all the upcoming label based
security mechanism, not only sepgsql.
It was our consensus in v9.1 development.
Actually, I think it's that way mostly because we committed the
SECURITY LABEL stuff first. I'd be in favor of adding some kind of
cross-link.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company