plpython implementation

Started by Szymon Guz about 13 years ago | 15 messages | hackers

mabewlun@gmail.com

about 13 years ago

I'm reading through plperl and plpython implementations and I don't
understand the way they work.

Comments for plperl say that there are two interpreters (trusted and
untrusted) for each user session, and they are stored in a hash.

The plpython version looks quite different: there is no such global hash of
interpreters, just a pointer to an interpreter and one global function,
_PG_init, which runs once (but per session, per user, or what?).

I'm just wondering what a plpython implementation should look like. We need
another interpreter, but the _PG_init function is run once. Should it then
create two interpreters on init, or should we let this function do nothing
and create a proper interpreter on the first call of a plpython(u) function
in the current session?

thanks,
Szymon
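
A toy model of the design question above, assuming only what the message states: plperl keeps its interpreters in a hash, while a single pointer set up once leaves no room for a second interpreter. The sketch is in Python purely for illustration (the real code is C), and `InterpreterRegistry` and `make_interpreter` are hypothetical names, not anything from the PostgreSQL source. It shows the lazy option from the question: nothing is created at init, and each kind of interpreter is built on the first call that needs it.

```python
class InterpreterRegistry:
    """Toy model: one interpreter slot per trust level, created on first use."""

    def __init__(self, factory):
        self._factory = factory
        # Keyed by the trusted flag, standing in for a per-session hash.
        self._interpreters = {}

    def get(self, trusted):
        # Create the interpreter lazily, the first time it is requested.
        if trusted not in self._interpreters:
            self._interpreters[trusted] = self._factory(trusted)
        return self._interpreters[trusted]


created = []  # records which interpreters were actually built


def make_interpreter(trusted):
    # Hypothetical stand-in for real interpreter setup.
    created.append(trusted)
    return {"trusted": trusted, "globals": {}}


registry = InterpreterRegistry(make_interpreter)
first = registry.get(trusted=False)
second = registry.get(trusted=False)  # reuses the existing interpreter
```

With this shape the init hook can stay empty, and the cost of a second interpreter is paid only by sessions that actually use it.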

Martijn van Oosterhout

kleptog@svana.org

about 13 years ago

In reply to: Szymon Guz (#1)

Re: plpython implementation

On Sun, Jun 30, 2013 at 01:49:53PM +0200, Szymon Guz wrote:

> I'm reading through plperl and plpython implementations and I don't
> understand the way they work.
>
> Comments for plperl say that there are two interpreters (trusted and
> untrusted) for each user session, and they are stored in a hash.

The point is that python has no version for untrusted users, since it's
been accepted that there's no way to build a python sandbox for
untrusted code. There was actually a small competition to make one, but
it failed; since then they haven't bothered.

Perl does provide a sandbox, hence you can have two interpreters in a
single backend.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

He who writes carelessly confesses thereby at the very outset that he does
not attach much importance to his own thoughts.

-- Arthur Schopenhauer

Andrew Dunstan

andrew@dunslane.net

about 13 years ago

In reply to: Szymon Guz (#1)

Re: plpython implementation

On 06/30/2013 07:49 AM, Szymon Guz wrote:

> I'm reading through plperl and plpython implementations and I don't
> understand the way they work.
>
> Comments for plperl say that there are two interpreters (trusted and
> untrusted) for each user session, and they are stored in a hash.
>
> The plpython version looks quite different: there is no such global hash of
> interpreters, just a pointer to an interpreter and one global function,
> _PG_init, which runs once (but per session, per user, or what?).
>
> I'm just wondering what a plpython implementation should look like. We need
> another interpreter, but the _PG_init function is run once. Should it then
> create two interpreters on init, or should we let this function do nothing
> and create a proper interpreter on the first call of a plpython(u) function
> in the current session?

python does not have any sort of reliable sandbox, so there is no
plpython, only plpythonu - hence only one interpreter per backend is needed.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Szymon Guz

mabewlun@gmail.com

about 13 years ago

In reply to: Andrew Dunstan (#3)

Re: plpython implementation

On 30 June 2013 14:13, Andrew Dunstan <andrew@dunslane.net> wrote:

> On 06/30/2013 07:49 AM, Szymon Guz wrote:
>
>> I'm reading through plperl and plpython implementations and I don't
>> understand the way they work.
>>
>> Comments for plperl say that there are two interpreters (trusted and
>> untrusted) for each user session, and they are stored in a hash.
>>
>> The plpython version looks quite different: there is no such global hash of
>> interpreters, just a pointer to an interpreter and one global function,
>> _PG_init, which runs once (but per session, per user, or what?).
>>
>> I'm just wondering what a plpython implementation should look like. We need
>> another interpreter, but the _PG_init function is run once. Should it then
>> create two interpreters on init, or should we let this function do nothing
>> and create a proper interpreter on the first call of a plpython(u) function
>> in the current session?
>
> python does not have any sort of reliable sandbox, so there is no plpython,
> only plpythonu - hence only one interpreter per backend is needed.

Is there any record of the discussion concluding that there is no way to make
a sandbox? I managed to create some kind of sandbox, a simple modification
which totally disables importing modules, so I'm just wondering why it
cannot be done.

Szymon

Martijn van Oosterhout

kleptog@svana.org

about 13 years ago

In reply to: Szymon Guz (#4)

Re: plpython implementation

On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:

>> python does not have any sort of reliable sandbox, so there is no plpython,
>> only plpythonu - hence only one interpreter per backend is needed.
>
> Is there any record of the discussion concluding that there is no way to make
> a sandbox? I managed to create some kind of sandbox, a simple modification
> which totally disables importing modules, so I'm just wondering why it
> cannot be done.

http://wiki.python.org/moin/SandboxedPython

This is the thread I was thinking of:
http://mail.python.org/pipermail/python-dev/2009-February/086401.html

If you read through it I think you will understand the difficulties.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

He who writes carelessly confesses thereby at the very outset that he does
not attach much importance to his own thoughts.

-- Arthur Schopenhauer

Andrew Dunstan

andrew@dunslane.net

about 13 years ago

In reply to: Szymon Guz (#4)

Re: plpython implementation

On 06/30/2013 08:18 AM, Szymon Guz wrote:

>> python does not have any sort of reliable sandbox, so there is no plpython,
>> only plpythonu - hence only one interpreter per backend is needed.
>
> Is there any record of the discussion concluding that there is no way to make
> a sandbox? I managed to create some kind of sandbox, a simple modification
> which totally disables importing modules, so I'm just wondering why it
> cannot be done.

If your sandbox is simple it's almost certainly going to be broken. I
suggest you use Google to research the topic. Our discussions should be
in the mailing list archives.

cheers

andrew
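
One hedged illustration of why a simple sandbox tends to be broken: even when code runs with an empty builtins table (so no `import` and no `open`), ordinary attribute access on a literal still reaches `object` and, through it, the classes loaded in the process. The helper name `run_with_no_builtins` is made up for this sketch; it is not part of plpython.

```python
def run_with_no_builtins(source):
    """Run `source` with an empty builtins table (hypothetical helper)."""
    env = {"__builtins__": {}}
    exec(source, env)
    return env


# No builtin names are used here, only attribute access on a tuple literal:
# () -> tuple -> object -> every direct subclass of object.
probe = (
    "subclasses = ().__class__.__base__.__subclasses__()\n"
    "names = [cls.__name__ for cls in subclasses]\n"
)

env = run_with_no_builtins(probe)
names = env["names"]
```

Having that list in hand does not by itself grant file or module access, but it shows that removing names from the namespace does not remove the objects behind them, which is one commonly described starting point for bypassing this kind of restriction.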


Szymon Guz

mabewlun@gmail.com

about 13 years ago

In reply to: Martijn van Oosterhout (#5)

Re: plpython implementation

On 30 June 2013 14:31, Martijn van Oosterhout <kleptog@svana.org> wrote:

> On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
>
>>> python does not have any sort of reliable sandbox, so there is no plpython,
>>> only plpythonu - hence only one interpreter per backend is needed.
>>
>> Is there any record of the discussion concluding that there is no way to make
>> a sandbox? I managed to create some kind of sandbox, a simple modification
>> which totally disables importing modules, so I'm just wondering why it
>> cannot be done.
>
> http://wiki.python.org/moin/SandboxedPython
>
> This is the thread I was thinking of:
> http://mail.python.org/pipermail/python-dev/2009-February/086401.html
>
> If you read through it I think you will understand the difficulties.

Hi Martijn,
thanks for the links. I was thinking about something else. In fact we don't
need a full sandbox; I think it would be enough to have a safe python that
couldn't import any outside module. Wouldn't that be enough?

It seems like the sandbox modules want to limit many external operations.
I'm thinking about not being able to import any module, even standard ones;
wouldn't that be enough?

Szymon

Andres Freund

andres@anarazel.de

about 13 years ago

In reply to: Szymon Guz (#7)

Re: plpython implementation

On 2013-06-30 14:42:24 +0200, Szymon Guz wrote:

> On 30 June 2013 14:31, Martijn van Oosterhout <kleptog@svana.org> wrote:
>
>> On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
>>
>>>> python does not have any sort of reliable sandbox, so there is no plpython,
>>>> only plpythonu - hence only one interpreter per backend is needed.
>>>
>>> Is there any record of the discussion concluding that there is no way to make
>>> a sandbox? I managed to create some kind of sandbox, a simple modification
>>> which totally disables importing modules, so I'm just wondering why it
>>> cannot be done.
>>
>> http://wiki.python.org/moin/SandboxedPython
>>
>> This is the thread I was thinking of:
>> http://mail.python.org/pipermail/python-dev/2009-February/086401.html
>>
>> If you read through it I think you will understand the difficulties.
>
> thanks for the links. I was thinking about something else. In fact we don't
> need a full sandbox; I think it would be enough to have a safe python that
> couldn't import any outside module. Wouldn't that be enough?
>
> It seems like the sandbox modules want to limit many external operations.
> I'm thinking about not being able to import any module, even standard ones;
> wouldn't that be enough?

python

open('/etc/passwd', 'r').readlines()

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
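
The one-liner above makes the point that file access needs no import at all. A minimal sketch of that, assuming a sandbox that only replaces `__import__`; the helper `run_without_imports` is hypothetical, and a temporary file stands in for /etc/passwd so the example runs anywhere:

```python
import builtins
import os
import tempfile


def run_without_imports(source, extra_globals=None):
    """Run `source` with every import statement disabled (hypothetical helper)."""
    def blocked_import(*args, **kwargs):
        raise ImportError("imports are disabled in this sandbox")

    # Copy the normal builtins, then swap out only the import hook.
    restricted = dict(vars(builtins))
    restricted["__import__"] = blocked_import

    env = {"__builtins__": restricted}
    env.update(extra_globals or {})
    exec(source, env)
    return env


# A temporary file plays the role of a sensitive file such as /etc/passwd.
fd, secret_path = tempfile.mkstemp()
with os.fdopen(fd, "w") as handle:
    handle.write("root:x:0:0\n")

# The import restriction itself works as intended...
try:
    run_without_imports("import os")
    import_blocked = False
except ImportError:
    import_blocked = True

# ...but the builtin open() is still there, so the file is readable anyway.
env = run_without_imports(
    "lines = open(path, 'r').readlines()", {"path": secret_path}
)
leaked = env["lines"]

os.remove(secret_path)
```

Blocking imports and blocking file access are separate problems; this sketch only addresses the first, which is why the restriction does not help here.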


Szymon Guz

mabewlun@gmail.com

about 13 years ago

In reply to: Andres Freund (#8)

Re: plpython implementation

On 30 June 2013 14:45, Andres Freund <andres@2ndquadrant.com> wrote:

> On 2013-06-30 14:42:24 +0200, Szymon Guz wrote:
>
>> On 30 June 2013 14:31, Martijn van Oosterhout <kleptog@svana.org> wrote:
>>
>>> On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
>>>
>>>>> python does not have any sort of reliable sandbox, so there is no plpython,
>>>>> only plpythonu - hence only one interpreter per backend is needed.
>>>>
>>>> Is there any record of the discussion concluding that there is no way to make
>>>> a sandbox? I managed to create some kind of sandbox, a simple modification
>>>> which totally disables importing modules, so I'm just wondering why it
>>>> cannot be done.
>>>
>>> http://wiki.python.org/moin/SandboxedPython
>>>
>>> This is the thread I was thinking of:
>>> http://mail.python.org/pipermail/python-dev/2009-February/086401.html
>>>
>>> If you read through it I think you will understand the difficulties.
>>
>> thanks for the links. I was thinking about something else. In fact we don't
>> need a full sandbox; I think it would be enough to have a safe python that
>> couldn't import any outside module. Wouldn't that be enough?
>>
>> It seems like the sandbox modules want to limit many external operations.
>> I'm thinking about not being able to import any module, even standard ones;
>> wouldn't that be enough?
>
> python
>
> open('/etc/passwd', 'r').readlines()

thanks :)

#10

Claudio Freire

klaussfreire@gmail.com

about 13 years ago

In reply to: Andres Freund (#8)

Re: plpython implementation

On Sun, Jun 30, 2013 at 9:45 AM, Andres Freund <andres@2ndquadrant.com> wrote:

> On 2013-06-30 14:42:24 +0200, Szymon Guz wrote:
>
>> On 30 June 2013 14:31, Martijn van Oosterhout <kleptog@svana.org> wrote:
>>
>>> On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
>>>
>>>>> python does not have any sort of reliable sandbox, so there is no plpython,
>>>>> only plpythonu - hence only one interpreter per backend is needed.
>>>>
>>>> Is there any record of the discussion concluding that there is no way to make
>>>> a sandbox? I managed to create some kind of sandbox, a simple modification
>>>> which totally disables importing modules, so I'm just wondering why it
>>>> cannot be done.
>>>
>>> http://wiki.python.org/moin/SandboxedPython
>>>
>>> This is the thread I was thinking of:
>>> http://mail.python.org/pipermail/python-dev/2009-February/086401.html
>>>
>>> If you read through it I think you will understand the difficulties.
>>
>> thanks for the links. I was thinking about something else. In fact we don't
>> need a full sandbox; I think it would be enough to have a safe python that
>> couldn't import any outside module. Wouldn't that be enough?
>>
>> It seems like the sandbox modules want to limit many external operations.
>> I'm thinking about not being able to import any module, even standard ones;
>> wouldn't that be enough?
>
> python
>
> open('/etc/passwd', 'r').readlines()

Not only that, the CPython interpreter is rather fuzzy about the
division between interpreters. You can initialize multiple
interpreters, but they share a lot of state, so you can never fully
separate them. You'd have some state from the untrusted interpreter
spill over into the trusted one within the same session, which is not
ideal at all (and in fact can be exploited).

In essence, you'd have to use another implementation. CPython guys
have left it very clear they don't intend to "fix" that, as they don't
consider it a bug. It's just how it is.


#11

James Mansion

james@mansionfamily.plus.com

about 13 years ago

In reply to: Claudio Freire (#10)

Re: plpython implementation

On 01/07/2013 02:43, Claudio Freire wrote:

> In essence, you'd have to use another implementation. CPython guys
> have left it very clear they don't intend to "fix" that, as they don't
> consider it a bug. It's just how it is.

Given how useful it is to have a scripting language that can be used outside
of the database as well as inside it, would it be reasonable to consider
'promoting' pllua?

My understanding is that it (lua) is much cleaner under the hood (than
CPython).
Although I do recognise that Python as a whole has always had more traction.


#12

Claudio Freire

klaussfreire@gmail.com

about 13 years ago

In reply to: James Mansion (#11)

Re: plpython implementation

On Mon, Jul 1, 2013 at 2:29 AM, james <james@mansionfamily.plus.com> wrote:

> On 01/07/2013 02:43, Claudio Freire wrote:
>
>> In essence, you'd have to use another implementation. CPython guys
>> have left it very clear they don't intend to "fix" that, as they don't
>> consider it a bug. It's just how it is.
>
> Given how useful it is to have a scripting language that can be used outside
> of the database as well as inside it, would it be reasonable to consider
> 'promoting' pllua?
>
> My understanding is that it (lua) is much cleaner under the hood (than
> CPython).
> Although I do recognise that Python as a whole has always had more traction.

Well, that, or you can use another implementation. There are many, and
PyPy should be seriously considered given its JIT and how much faster
it is for raw computation power, which is what a DB is most likely
going to care about. I bet PyPy's sandboxing is a lot better as well.

Making a postgres-interfacing pypy fork I guess would be a nice
project, it's as "simple" as implementing all of plpy's API in RPython
and translating a C module out of it.

No, I'm not volunteering ;-)


#13

Hannu Krosing

hannu@tm.ee

about 13 years ago

In reply to: Claudio Freire (#12)

Re: plpython implementation

On 07/01/2013 07:53 AM, Claudio Freire wrote:

> On Mon, Jul 1, 2013 at 2:29 AM, james <james@mansionfamily.plus.com> wrote:
>
>> On 01/07/2013 02:43, Claudio Freire wrote:
>>
>>> In essence, you'd have to use another implementation. CPython guys
>>> have left it very clear they don't intend to "fix" that, as they don't
>>> consider it a bug. It's just how it is.
>>
>> Given how useful it is to have a scripting language that can be used outside
>> of the database as well as inside it, would it be reasonable to consider
>> 'promoting' pllua?
>>
>> My understanding is that it (lua) is much cleaner under the hood (than
>> CPython).
>> Although I do recognise that Python as a whole has always had more traction.
>
> Well, that, or you can use another implementation. There are many, and
> PyPy should be seriously considered given its JIT and how much faster
> it is for raw computation power, which is what a DB is most likely
> going to care about.

OTOH, pypy startup time is longer than CPython's. It is also generally
slower at running small on-call functions before the JIT kicks in.

> I bet PyPy's sandboxing is a lot better as well.

The pypy sandbox implementation seems to be a sound one, as it
delegates all "unsafe" operations to an outside controller at bytecode
level, the outside controller usually being a standard CPython wrapper.
Of course this makes any such operations slower, but this is the price
to pay for sandboxing.

> Making a postgres-interfacing pypy fork I guess would be a nice
> project, it's as "simple" as implementing all of plpy's API in RPython
> and translating a C module out of it.

I have some ideas about allowing new pl-s to be written in pl/pythonu

If any of you interested in this are at Europython come talk to me about
this after my presentations ;)

> No, I'm not volunteering ;-)

Neither am I, at least not yet

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ


#14

Andres Freund

andres@anarazel.de

about 13 years ago

In reply to: Claudio Freire (#10)

Re: plpython implementation

On 2013-06-30 22:43:52 -0300, Claudio Freire wrote:

> Not only that, the CPython interpreter is rather fuzzy about the
> division between interpreters. You can initialize multiple
> interpreters, but they share a lot of state, so you can never fully
> separate them. You'd have some state from the untrusted interpreter
> spill over into the trusted one within the same session, which is not
> ideal at all (and in fact can be exploited).
>
> In essence, you'd have to use another implementation. CPython guys
> have left it very clear they don't intend to "fix" that, as they don't
> consider it a bug. It's just how it is.

Doesn't zope's RestrictedPython have a history of working reasonably
well? Now, you sure pay a price for that, but ...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#15

Peter Eisentraut

peter_e@gmx.net

about 13 years ago

In reply to: James Mansion (#11)

Re: plpython implementation

On 7/1/13 1:29 AM, james wrote:

> Given how useful it is to have a scripting language that can be used outside
> of the database as well as inside it, would it be reasonable to consider
> 'promoting' pllua?

You can start promoting pllua by making it work with current PostgreSQL
versions. It hasn't been updated in 5 years, and doesn't build cleanly
last I checked.

Having a well-maintained and fully featured pllua available would surely
be welcomed by many.
