Thread: plpython implementation

plpython implementation

From
Szymon Guz
Date:
I'm reading through plperl and plpython implementations and I don't understand the way they work.

Comments for plperl say that there are two interpreters (trusted and untrusted) for each user session, and they are stored in a hash.

The plpython version looks quite different: there is no such global hash with interpreters, just a pointer to an interpreter and one global function, _PG_init, which runs once (but per session, user, or what?).

I'm just wondering what a plpython implementation should look like. We need another interpreter, but the _PG_init function runs only once. Should it create two interpreters on init, or should it do nothing and let the first call of a plpython(u) function in the current session create the proper interpreter?
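
The second option (create nothing at init, build the interpreter on first use) can be sketched as a small cache. This is only a minimal illustration, assuming interpreters are keyed by trust level and user; `Interpreter`, `InterpreterRegistry`, and `get` are hypothetical names, not anything from the PostgreSQL source.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Interpreter:
    """Stand-in for a language interpreter handle (hypothetical)."""
    trusted: bool
    user_id: int


@dataclass
class InterpreterRegistry:
    """Per-session cache; interpreters are created lazily on first use."""
    _cache: Dict[Tuple[bool, int], Interpreter] = field(default_factory=dict)
    created: int = 0

    def get(self, trusted: bool, user_id: int) -> Interpreter:
        key = (trusted, user_id)
        if key not in self._cache:
            # First call for this (trust level, user): build it now.
            self._cache[key] = Interpreter(trusted, user_id)
            self.created += 1
        return self._cache[key]


# An init hook would only set up the empty registry.
registry = InterpreterRegistry()
first = registry.get(trusted=True, user_id=10)
again = registry.get(trusted=True, user_id=10)      # reuses the cached one
untrusted = registry.get(trusted=False, user_id=10)  # separate interpreter
```

With this shape the init function stays cheap, and a session that never calls a trusted function never pays for a trusted interpreter.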

thanks,
Szymon

Re: plpython implementation

From
Martijn van Oosterhout
Date:
On Sun, Jun 30, 2013 at 01:49:53PM +0200, Szymon Guz wrote:
> I'm reading through plperl and plpython implementations and I don't
> understand the way they work.
>
> Comments for plperl say that there are two interpreters (trusted and
> untrusted) for each user session, and they are stored in a hash.

The point is that Python has no version for untrusted users, since it's
been accepted that there's no way to build a Python sandbox for
untrusted code. There was actually a small competition to make one, but
it failed, and since then nobody has bothered.

Perl does provide a sandbox, hence you can have two interpreters in a
single backend.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.  -- Arthur Schopenhauer

Re: plpython implementation

From
Andrew Dunstan
Date:
On 06/30/2013 07:49 AM, Szymon Guz wrote:
> I'm reading through plperl and plpython implementations and I don't 
> understand the way they work.
>
> Comments for plperl say that there are two interpreters (trusted and 
> untrusted) for each user session, and they are stored in a hash.
>
> Plpython version looks quite different, there is no such global hash 
> with interpreters, there is just a pointer to an interpreter and one 
> global function _PG_init, which runs once (but per session, user, or 
> what?).
>
> I'm just wondering how a plpython implementation should look like. We 
> need another interpreter, but PG_init function is run once, should it 
> then create two interpreters on init, or should we let this function 
> do nothing and create a proper interpreter in the first call of 
> plpython(u) function for current session?
>
>


Python does not have any sort of reliable sandbox, so there is no
plpython, only plpythonu - hence only one interpreter per backend is needed.

cheers

andrew



Re: plpython implementation

From
Szymon Guz
Date:
On 30 June 2013 14:13, Andrew Dunstan <andrew@dunslane.net> wrote:

> On 06/30/2013 07:49 AM, Szymon Guz wrote:
> > I'm reading through plperl and plpython implementations and I don't understand the way they work.
> >
> > Comments for plperl say that there are two interpreters (trusted and untrusted) for each user session, and they are stored in a hash.
> >
> > Plpython version looks quite different, there is no such global hash with interpreters, there is just a pointer to an interpreter and one global function _PG_init, which runs once (but per session, user, or what?).
> >
> > I'm just wondering how a plpython implementation should look like. We need another interpreter, but PG_init function is run once, should it then create two interpreters on init, or should we let this function do nothing and create a proper interpreter in the first call of plpython(u) function for current session?
>
> python does not have any sort of reliable sandbox, so there is no plpython, only plpythonu - hence only one interpreter per backend is needed.


Is there any record of the discussion concluding that there is no way to make the sandbox? I managed to create some kind of sandbox, a simple modification that totally disables importing modules, so I'm wondering why it cannot be done.

Szymon 

Re: plpython implementation

From
Martijn van Oosterhout
Date:
On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
> > python does not any any sort of reliable sandbox, so there is no plpython,
> > only plpythonu - hence only one interpreter per backend is needed.
> >
> Is there any track of the discussion that there is no way to make the
> sandbox? I managed to create some kind of sandbox, a simple modification
> which totally disables importing modules, so I'm just wondering why it
> cannot be done.

http://wiki.python.org/moin/SandboxedPython

This is the thread I was thinking of:
http://mail.python.org/pipermail/python-dev/2009-February/086401.html

If you read through it I think you will understand the difficulties.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.  -- Arthur Schopenhauer

Re: plpython implementation

From
Andrew Dunstan
Date:
On 06/30/2013 08:18 AM, Szymon Guz wrote:
>
>
>
>     python does not any any sort of reliable sandbox, so there is no
>     plpython, only plpythonu - hence only one interpreter per backend
>     is needed.
>
>
> Is there any track of the discussion that there is no way to make the 
> sandbox? I managed to create some kind of sandbox, a simple 
> modification which totally disables importing modules, so I'm just 
> wondering why it cannot be done.
>


If your sandbox is simple it's almost certainly going to be broken. I 
suggest you use Google to research the topic. Our discussions should be 
in the mailing list archives.
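
One well-known reason simple sandboxes break, as a minimal sketch: even with every builtin removed, code can walk the object graph from a literal to reach classes that are already loaded. The helper name `run_with_empty_builtins` is hypothetical and only for illustration.

```python
def run_with_empty_builtins(source):
    """Execute source with no builtins at all (a naive 'sandbox')."""
    env = {"__builtins__": {}}
    exec(source, env)
    return env


# The name `open` is gone, so a direct call fails...
try:
    run_with_empty_builtins("open")
    open_visible = True
except NameError:
    open_visible = False

# ...but attribute access on a tuple literal still reaches `object`,
# and from there every class that is currently loaded.
probe = "classes = ().__class__.__base__.__subclasses__()"
env = run_with_empty_builtins(probe)
names = {cls.__name__ for cls in env["classes"]}
```

Which classes turn up depends on the Python version and on what has been loaded, but the set is never empty, and it is the usual starting point for getting back to file and module access.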

cheers

andrew




Re: plpython implementation

From
Szymon Guz
Date:
On 30 June 2013 14:31, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
> > > python does not any any sort of reliable sandbox, so there is no plpython,
> > > only plpythonu - hence only one interpreter per backend is needed.
> > >
> > Is there any track of the discussion that there is no way to make the
> > sandbox? I managed to create some kind of sandbox, a simple modification
> > which totally disables importing modules, so I'm just wondering why it
> > cannot be done.
>
> http://wiki.python.org/moin/SandboxedPython
>
> This is the thread I was thinking of:
> http://mail.python.org/pipermail/python-dev/2009-February/086401.html
>
> If you read through it I think you will understand the difficulties.


Hi Martijn,
thanks for the links. I was thinking about something else. In fact we don't need a full sandbox; I think it would be enough to have a safe Python that couldn't import any outside module. Wouldn't that be enough?

It seems the sandbox modules want to limit many external operations. I'm thinking about not being able to import any module, even standard ones. Wouldn't that be enough?

Szymon

Re: plpython implementation

From
Andres Freund
Date:
On 2013-06-30 14:42:24 +0200, Szymon Guz wrote:
> On 30 June 2013 14:31, Martijn van Oosterhout <kleptog@svana.org> wrote:
> 
> > On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
> > > > python does not any any sort of reliable sandbox, so there is no
> > plpython,
> > > > only plpythonu - hence only one interpreter per backend is needed.
> > > >
> > > Is there any track of the discussion that there is no way to make the
> > > sandbox? I managed to create some kind of sandbox, a simple modification
> > > which totally disables importing modules, so I'm just wondering why it
> > > cannot be done.
> >
> > http://wiki.python.org/moin/SandboxedPython
> >
> > This is the thread I was thinking of:
> > http://mail.python.org/pipermail/python-dev/2009-February/086401.html
> >
> > If you read through it I think you will understand the difficulties.
> >
> thanks for links. I was thinking about something else. In fact we don't
> need full sandbox, I think it would be enough to have safe python, if it
> couldn't import any outside module. Wouldn't be enough?
> 
> It seems like the sandbox modules want to limit many external operations,
> I'm thinking about not being able to import any module, even standard ones,
> wouldn't be enough?

python
>>> open('/etc/passwd', 'r').readlines()
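
The same point as a self-contained sketch, using a temporary file as a stand-in for /etc/passwd. It assumes the "sandbox" works by removing `__import__` from the builtins visible to the executed code; `run_without_imports` is a hypothetical helper, not part of PL/Python.

```python
import builtins
import os
import tempfile


def run_without_imports(source, extra_globals=None):
    """Execute source with __import__ removed from its builtins."""
    restricted = {
        name: getattr(builtins, name)
        for name in dir(builtins)
        if name != "__import__"
    }
    env = {"__builtins__": restricted}
    if extra_globals:
        env.update(extra_globals)
    exec(source, env)
    return env


# Stand-in for a sensitive file on the server.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("secret\n")
    secret_path = handle.name

# The import statement is blocked as intended...
try:
    run_without_imports("import os")
    import_blocked = False
except ImportError:
    import_blocked = True

# ...but the builtin open() needs no import at all.
env = run_without_imports("leaked = open(path).readlines()",
                          {"path": secret_path})
leaked = env["leaked"]
os.unlink(secret_path)
```

Removing `open` as well only moves the problem, since other builtins and already-loaded objects remain reachable.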

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: plpython implementation

From
Szymon Guz
Date:



On 30 June 2013 14:45, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-06-30 14:42:24 +0200, Szymon Guz wrote:
> > On 30 June 2013 14:31, Martijn van Oosterhout <kleptog@svana.org> wrote:
> >
> > > On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
> > > > > python does not any any sort of reliable sandbox, so there is no
> > > > > plpython,
> > > > > only plpythonu - hence only one interpreter per backend is needed.
> > > > >
> > > > Is there any track of the discussion that there is no way to make the
> > > > sandbox? I managed to create some kind of sandbox, a simple modification
> > > > which totally disables importing modules, so I'm just wondering why it
> > > > cannot be done.
> > >
> > > http://wiki.python.org/moin/SandboxedPython
> > >
> > > This is the thread I was thinking of:
> > > http://mail.python.org/pipermail/python-dev/2009-February/086401.html
> > >
> > > If you read through it I think you will understand the difficulties.
> > >
> > thanks for links. I was thinking about something else. In fact we don't
> > need full sandbox, I think it would be enough to have safe python, if it
> > couldn't import any outside module. Wouldn't be enough?
> >
> > It seems like the sandbox modules want to limit many external operations,
> > I'm thinking about not being able to import any module, even standard ones,
> > wouldn't be enough?
>
> python
> >>> open('/etc/passwd', 'r').readlines()


thanks :) 

Re: plpython implementation

From
Claudio Freire
Date:
On Sun, Jun 30, 2013 at 9:45 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-06-30 14:42:24 +0200, Szymon Guz wrote:
>> On 30 June 2013 14:31, Martijn van Oosterhout <kleptog@svana.org> wrote:
>>
>> > On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
>> > > > python does not any any sort of reliable sandbox, so there is no
>> > plpython,
>> > > > only plpythonu - hence only one interpreter per backend is needed.
>> > > >
>> > > Is there any track of the discussion that there is no way to make the
>> > > sandbox? I managed to create some kind of sandbox, a simple modification
>> > > which totally disables importing modules, so I'm just wondering why it
>> > > cannot be done.
>> >
>> > http://wiki.python.org/moin/SandboxedPython
>> >
>> > This is the thread I was thinking of:
>> > http://mail.python.org/pipermail/python-dev/2009-February/086401.html
>> >
>> > If you read through it I think you will understand the difficulties.
>> >
>> thanks for links. I was thinking about something else. In fact we don't
>> need full sandbox, I think it would be enough to have safe python, if it
>> couldn't import any outside module. Wouldn't be enough?
>>
>> It seems like the sandbox modules want to limit many external operations,
>> I'm thinking about not being able to import any module, even standard ones,
>> wouldn't be enough?
>
> python
>>> open('/etc/passwd', 'r').readlines()

Not only that, the CPython interpreter is rather fuzzy about the
division between interpreters. You can initialize multiple
interpreters, but they share a lot of state, so you can never fully
separate them. You'd have some state from the untrusted interpreter
spill over into the trusted one within the same session, which is not
ideal at all (and in fact can be exploited).

In essence, you'd have to use another implementation. CPython guys
have left it very clear they don't intend to "fix" that, as they don't
consider it a bug. It's just how it is.



Re: plpython implementation

From
james
Date:
On 01/07/2013 02:43, Claudio Freire wrote:
> In essence, you'd have to use another implementation. CPython guys
> have left it very clear they don't intend to "fix" that, as they don't
> consider it a bug. It's just how it is.
Given how useful it is to have a scripting language that can be used outside
of the database as well as inside it, would it be reasonable to consider
'promoting' pllua?

My understanding is that it (lua) is much cleaner under the hood (than 
CPython).
Although I do recognise that Python as a whole has always had more traction.





Re: plpython implementation

From
Claudio Freire
Date:
On Mon, Jul 1, 2013 at 2:29 AM, james <james@mansionfamily.plus.com> wrote:
> On 01/07/2013 02:43, Claudio Freire wrote:
>>
>> In essence, you'd have to use another implementation. CPython guys
>> have left it very clear they don't intend to "fix" that, as they don't
>> consider it a bug. It's just how it is.
>
> Given how useful it is to have a scripting language that can be used outside
> of the database as well as inside it, would it be reasonable to consider
> 'promoting' pllua?
>
> My understanding is that it (lua) is much cleaner under the hood (than
> CPython).
> Although I do recognise that Python as a whole has always had more traction.

Well, that, or you can use another implementation. There are many, and
PyPy should be seriously considered given its JIT and how much faster
it is for raw computation power, which is what a DB is most likely
going to care about. I bet PyPy's sandboxing is a lot better as well.

Making a postgres-interfacing PyPy fork would be a nice project, I
guess. It's as "simple" as implementing all of plpy's API in RPython
and translating a C module out of it.

No, I'm not volunteering ;-)



Re: plpython implementation

From
Hannu Krosing
Date:
On 07/01/2013 07:53 AM, Claudio Freire wrote:
> On Mon, Jul 1, 2013 at 2:29 AM, james <james@mansionfamily.plus.com> wrote:
>> On 01/07/2013 02:43, Claudio Freire wrote:
>>> In essence, you'd have to use another implementation. CPython guys
>>> have left it very clear they don't intend to "fix" that, as they don't
>>> consider it a bug. It's just how it is.
>> Given how useful it is to have a scripting language that can be used outside
>> of the database as well as inside it, would it be reasonable to consider
>> 'promoting' pllua?
>>
>> My understanding is that it (lua) is much cleaner under the hood (than
>> CPython).
>> Although I do recognise that Python as a whole has always had more traction.
> Well, that, or you can use another implementation. There are many, and
> PyPy should be seriously considered given its JIT and how much faster
> it is for raw computation power, which is what a DB is most likely
> going to care about. 
OTOH, PyPy's startup time is longer than CPython's. It is also generally
slower at running small on-call functions before the JIT kicks in.
> I bet PyPy's sandboxing is a lot better as well.
The PyPy sandbox implementation seems to be a sound one, as it
delegates all "unsafe" operations to an outside controller at the bytecode
level. The outside controller is usually a standard CPython wrapper.
Of course this makes any such operations slower, but that is the price
to pay for sandboxing.
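
The delegation idea can be sketched in a few lines. This shows only the control flow, under the assumption that the untrusted side has no other route to the outside world (plain CPython does not enforce that, which is the whole problem); `Controller`, `handle`, and `sandboxed_job` are hypothetical names, not PyPy's API.

```python
class Controller:
    """Trusted side: decides which requests from sandboxed code are allowed."""

    def __init__(self, virtual_files):
        self.virtual_files = dict(virtual_files)

    def handle(self, operation, *args):
        if operation == "read_file":
            (path,) = args
            if path in self.virtual_files:
                return self.virtual_files[path]
            raise PermissionError(f"read of {path!r} denied")
        # Anything not explicitly allowed is refused.
        raise PermissionError(f"operation {operation!r} denied")


def sandboxed_job(request):
    """Untrusted side: every external operation goes through request()."""
    return request("read_file", "/virtual/config").upper()


controller = Controller({"/virtual/config": "mode=safe"})
result = sandboxed_job(controller.handle)

try:
    controller.handle("read_file", "/etc/passwd")
    denied = False
except PermissionError:
    denied = True
```

Every external operation costs a round trip to the controller, which is where the slowdown comes from.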
> Making a postgres-interphasing pypy fork I guess would be a nice
> project, it's as "simple" as implementing all of plpy's API in RPython
> and translating a C module out of it.
I have some ideas about allowing new PLs to be written in PL/Pythonu.

If any of you who are interested in this are at EuroPython, come talk to
me about it after my presentations ;)
> No, I'm not volunteering ;-)
Neither am I, at least not yet

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ




Re: plpython implementation

From
Andres Freund
Date:
On 2013-06-30 22:43:52 -0300, Claudio Freire wrote:
> Not only that, the CPython interpreter is rather fuzzy about the
> division between interpreters. You can initialize multiple
> interpreters, but they share a lot of state, so you can never fully
> separate them. You'd have some state from the untrusted interpreter
> spill over into the trusted one within the same session, which is not
> ideal at all (and in fact can be exploited).
> 
> In essence, you'd have to use another implementation. CPython guys
> have left it very clear they don't intend to "fix" that, as they don't
> consider it a bug. It's just how it is.

Doesn't zope's RestrictedPython have a history of working reasonably
well? Now, you sure pay a price for that, but ...

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: plpython implementation

From
Peter Eisentraut
Date:
On 7/1/13 1:29 AM, james wrote:
> Given how useful it is to have a scripting language that can be used
> outside
> of the database as well as inside it, would it be reasonable to consider
> 'promoting' pllua?

You can start promoting pllua by making it work with current PostgreSQL
versions.  It hasn't been updated in 5 years, and doesn't build cleanly
last I checked.

Having a well-maintained and fully featured pllua available would surely
be welcome by many.



PL/Lua (was: plpython implementation)

From
Luis Carvalho
Date:
Hi all,

Claudio Freire wrote:
> On Mon, Jul 1, 2013 at 2:29 AM, james <james@mansionfamily.plus.com> wrote:
> > On 01/07/2013 02:43, Claudio Freire wrote:
> >>
> >> In essence, you'd have to use another implementation. CPython guys
> >> have left it very clear they don't intend to "fix" that, as they don't
> >> consider it a bug. It's just how it is.
> >
> > Given how useful it is to have a scripting language that can be used outside
> > of the database as well as inside it, would it be reasonable to consider
> > 'promoting' pllua?
> >
> > My understanding is that it (lua) is much cleaner under the hood (than
> > CPython).
> > Although I do recognise that Python as a whole has always had more traction.
> 
> Well, that, or you can use another implementation. There are many, and
> PyPy should be seriously considered given its JIT and how much faster
> it is for raw computation power, which is what a DB is most likely
> going to care about. I bet PyPy's sandboxing is a lot better as well.

<snip>
I think that 'promoting' PL/Lua would be too early, but it'd be a great
addition. The latest version, for instance, can run LuaJIT which has a FFI
(check the example in "Anonymous Blocks" at PL/Lua's docs.) I think there are
two main problems: finding maintainers in the core, and lack of popularity to
warrant its promotion (the two problems are related, of course.)


Peter Eisentraut wrote:
> On 7/1/13 1:29 AM, james wrote:
> > Given how useful it is to have a scripting language that can be used
> > outside
> > of the database as well as inside it, would it be reasonable to consider
> > 'promoting' pllua?
> 
> You can start promoting pllua by making it work with current PostgreSQL
> versions.  It hasn't been updated in 5 years, and doesn't build cleanly
> last I checked.
> 
> Having a well-maintained and fully featured pllua available would surely
> be welcome by many.

Thanks for the feedback. Actually, PL/Lua's latest version (1.0) was out one
month ago,

http://pgfoundry.org/frs/?group_id=1000314

but the previous version took around 4 years. I was waiting for bug reports,
since I deemed PL/Lua to be fairly featured, but I have now declared it
"stable".

The project is maintained -- I don't know how to tell when something is
well-maintained, but a low frequency of code updates is not one of my
criteria; Lua, for instance, took six years between versions 5.1 and 5.2.
BTW, just out of curiosity, when was the last time PL/Tcl was updated?

I think that the project is also fully featured, but I'd appreciate any
comments to the contrary (that is, feature requests.) I might be mistaken, but
PL/Lua has all the features that PL/Python, PL/Perl, and PL/Tcl have. In
addition, it has a trusted flavor, which PL/Python does not, and proper
type mappings, which PL/Perl does not (everything is translated to text.)

PL/Lua 1.0 adds anonymous blocks and a TRUNCATE trigger, and it should run on
PostgreSQL 9.2. It can be used with Lua 5.1, 5.2, and LuaJIT 2.0 (if you want
speed and an easy C interface through a FFI, you should try LuaJIT!)

I'd like to take this opportunity to kindly ask the PostgreSQL doc maintainers
to include PL/Lua in the table at Appendix H.3:

Name: PL/Lua
Language: Lua
Website: http://pgfoundry.org/projects/pllua/

Cheers,
Luis

-- 
Computers are useless. They can only give you answers.               -- Pablo Picasso

-- 
Luis Carvalho (Kozure)
lua -e 'print((("lexcarvalho@NO.gmail.SPAM.com"):gsub("(%u+%.)","")))'



Re: PL/Lua (was: plpython implementation)

From
Peter Eisentraut
Date:
On Mon, 2013-07-01 at 18:15 -0400, Luis Carvalho wrote:
> The project is maintained -- I don't know how to say when something is
> well-maintained, but small frequency of code updates is not one of my
> criteria; 

The bug tracker contains bugs about build problems with PG 8.4, 9.2, and
9.3, which have not been addressed.




Re: PL/Lua (was: plpython implementation)

From
Luis Carvalho
Date:
Peter Eisentraut wrote:
> On Mon, 2013-07-01 at 18:15 -0400, Luis Carvalho wrote:
> > The project is maintained -- I don't know how to say when something is
> > well-maintained, but small frequency of code updates is not one of my
> > criteria; 
> 
> The bug tracker contains bugs about build problems with PG 8.4, 9.2, and
> 9.3, which have not been addressed.

Done (it took me a while to see the bug tracker in pgfoundry...) BTW, thanks
for the patch; I'll release a new version of PL/Lua once PG 9.3 is out.

Cheers,
Luis

-- 
Computers are useless. They can only give you answers.               -- Pablo Picasso

-- 
Luis Carvalho (Kozure)
lua -e 'print((("lexcarvalho@NO.gmail.SPAM.com"):gsub("(%u+%.)","")))'



Re: PL/Lua (was: plpython implementation)

From
Andreas Karlsson
Date:
On 07/02/2013 01:54 AM, Luis Carvalho wrote:
> Peter Eisentraut wrote:
>> On Mon, 2013-07-01 at 18:15 -0400, Luis Carvalho wrote:
>>> The project is maintained -- I don't know how to say when something is
>>> well-maintained, but small frequency of code updates is not one of my
>>> criteria;
>>
>> The bug tracker contains bugs about build problems with PG 8.4, 9.2, and
>> 9.3, which have not been addressed.
>
> Done (it took me a while to see the bug tracker in pgfoundry...) BTW, thanks
> for the patch; I'll release a new version of PL/Lua once PG 9.3 is out.

It might be worth looking at the feature set of PL/v8, which currently
seems to be larger than those of PL/Perl, PL/Python and PL/Tcl. It
includes, for example, the ability to implement window functions.

http://pgxn.org/dist/plv8/doc/plv8.html#Window.function.API

Nice job with PL/Lua,
Andreas

-- 
Andreas Karlsson