Re: plpython - Mailing list pgsql-hackers

From elein
Subject Re: plpython
Date
Msg-id 20030904155033.Q3762@cookie
In response to plpython  (James Pye <flaw@rhid.com>)
List pgsql-hackers
NO!!! Don't remove SD and GD!!! They are useful.
I use them in several applications, primarily
for running aggregates.

What needs to be fixed is that the SD needs to be
initialized at the start of each statement.
Joe Conway just implemented this in PL/R and
Tom Lane had an idea about it too.

See http://www.varlena.com/GeneralBits/TidBits
for the talk and code I gave on running aggregation
with plpython at OSCON.  It illustrates the
initialization problem.
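
A minimal sketch of the running-aggregate pattern and the initialization
problem, not the code from the talk. In plpython the handler supplies SD to
the function; here it is simulated with an ordinary dict passed explicitly so
the example runs outside the database. The names running_sum and
run_statement are hypothetical.

```python
# Simulation only: in plpython the handler provides SD; here it is an
# explicit argument so the sketch can run as plain Python.

def running_sum(SD, value):
    """Return the running total of the values seen so far, kept in SD."""
    SD["total"] = SD.get("total", 0) + value
    return SD["total"]


def run_statement(SD, values, reset=False):
    """Stand-in for one SQL statement calling running_sum once per row.

    reset=True models the requested fix: initialize SD at statement start.
    """
    if reset:
        SD.clear()
    return [running_sum(SD, v) for v in values]


SD = {}
first = run_statement(SD, [1, 2, 3])
# Without initialization, the second statement continues from the old total.
stale = run_statement(SD, [1, 2, 3])
# With initialization at statement start, the totals begin again from zero.
fresh = run_statement(SD, [1, 2, 3], reset=True)
print(first, stale, fresh)
```

The second statement picks up where the first left off, which is why SD
needs to be initialized per statement for running aggregates to be safe.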

And don't remove plpy.  You can move it or replace
its implementation, but do not remove it.  People
are really using these things.

People are also depending on python's loose
type conversion from strings.  If you add another
kind of conversion interpretation, you must keep the backward
compatibility or call it something different.
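
A rough illustration of the compatibility concern, under the assumption that
existing functions were written expecting plain strings. The function
area_code and the class Phone are hypothetical and are not part of plpython.

```python
def area_code(phone):
    """Hypothetical user function that assumes its argument is a str."""
    return phone.split("-")[0]


class Phone:
    """Hypothetical richer object a new conversion layer might pass instead."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text


# Existing behaviour: the value arrives as a string and the function works.
old_style = area_code("510-555-0100")

# A different conversion hands over an object without str methods.
try:
    area_code(Phone("510-555-0100"))
    broke = False
except AttributeError:
    broke = True

# Converting back to a string first keeps the old function working.
compat = area_code(str(Phone("510-555-0100")))
print(old_style, broke, compat)
```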

It seems to me that you need to talk more to
people using plpython.  I am just one person.
There are others.  I hope I've misunderstood you
about some of these things...

elein@varlena.com

On Thu, Sep 04, 2003 at 03:01:57PM -0700, James Pye wrote:
> 
> Greetings,
> 
>     I've recently been spending some quality time with the plpython module, and I think I'm well on the road to an
> improved version of it (although, nothing about a trusted variant).  By improved, I mostly mean cleaned up, and
> reorganized..
> 
> Here are some of the changes that I have made in my own version:
> 
>     Compilation and execution have been greatly simplified and should be faster(at least execution should be).
>     Caching of compiled code no longer references a Python dictionary(PLyProcedureCache). The handler keeps its own
> vector of procedure structs(should be faster, and is trivial).
>     Removal of plpython generated dictionaries SD and GD. They don't seem be very useful, as they are forgotten when
> the postmaster exits and not remembered when a new one starts. SD is questionable, does/did anyone find SD very useful?
> GD seems almost pointless as the global keyword should be sufficient. Although, I do think there was a mention of GD
> being "safe globals", but I don't know why it would be safer than "global var".
>     Removal of the built-in "plpy" python module that plpython creates. This is done because it provides interfaces
> to pgsql functions that I feel should be located elsewhere; elsewhere being another python module. I've already
> generated a preliminary interface to elog and SPI_* with SWIG that at first glance seems quite functional(it links, and
> is at least able to properly call elog, I haven't really tested SPI).
>     Improvement to tracebacks, as it now NOTICE's the python tracebacks(There is already an ERROR, so I don't think
> WARNING is necessary). PLy_traceback, originally, seemed to ignore the tb of the PyErr_Fetch.
>     Removal of plpython type conversion routines and data structures. This was done because I felt that there was a
> better way to do it. Not sure what yet, as it is one of my questions to the list, but it will probably end up being a
> similar implementation.
>     I also plan to make some changes to trigger handling, but I haven't done anything worth mentioning yet..
> 
> 
> Type conversion
> 
>     plpython's current type conversion implementation appears to be dependent on strings as the common format. This
> is fine, but not very extensible as is, unless you don't mind explicitly parsing strings inside each function that takes
> an unsupported data type.
>     I was thinking that a better solution would be creating a python object type inside the database. Thus allowing
> users to write casts to and from non-standard or unimplemented data types with little difficulty(well, maybe some :).
> This would allow conversion in an extensible way, which doesn't require modification to plpython. Storage could be
> easily achieved by pickling the object.
>     Another thought would be to just pass valid PyObject pointers in and out of conversion procedures, effectively
> disallowing storage(outside the process in which the object was created in), unless it is possible to have a persistent
> storage mechanism that makes it possible to go through pickle?.?..(yeah, I'm new to pgsql dev).
> 
> 
> Python PostgreSQL Interface
> 
>     plpython, currently, implements its own built-in module to interface with a few pgsql routines, and it works, but
> I feel it should be located elsewhere, as I said before.
>     For the most part, I can only see most people using elog, and SPI within plpy, but perhaps that is too narrow of
> a view. Perhaps it would be useful to many to have access to some backend routines through plpy, but I'm not sure and
> that is why I'm asking the list.
>     How far should such an PostgreSQL interface module go?
>     What should its name be if full/semi-full interface is created? I was thinking simply py-pgsql as the package
> name, and the module name, of course, would be pgsql.
>     What should the name be if it was only elog and SPI? py-pgspi?
>     I'm leaning towards py-pgsql, a partial interface consisting of elog and SPI and perhaps a few other useful
> routines. But have the module as a package as to allow easy extensions to the package as subpackages..
>     From this interface, a DB-API 2.0 compatible SPI interface will come as well.
> 
> 
>     My version has a short ways to go before it is ready for usage, but if you want to see what I've done, just drop
> me an e-mail.
> 
> 
> Comments? Criticisms? Feature suggestions?
> Anyone else doing significant work on plpython?
> 
> 
> -James
> 



