Thread: splitting plpython into smaller parts
Hi, attached are two incremental patches that refactor plpython into smaller modules. The first one factors out some bolerplate related to executing SPI functions in subtransactions (and idea borrowed from pltcl.c). The second one is the actual split. plpython.c has been split into 11 separate files and one header. The separate files are: * plpython.c - top-level handlers and stuff that everything else uses * plpython_io.c - transforming Python objects to PG structures and vice versa * plpython_procedure.c - handling and caching PLyProcedure objects * plpython_exec.c - actually executing the Python code * plpython_plpy.c - defining the global plpy module and setting up the Python interpreter * plpython_spi.c - interface to SPI functions * plpython_result.c, plpython_plan.c, plpython_subtrasaction.c - a file per Python class created by plpython with their method definitions * plpython_functions.c - Python functions available from the plpy module * plpython_elog.c - transforming Python errors into Postgres elogs All regression tests pass, I tested on Python 2.3, 2.7 and 3.1. The other plpython patch I submitted (cursor support) is not included here. If it gets accepted, I'll update this patch to add a plpython_cursor.c file. If this gets accepted first, I'll update the cursor patch accordingly. There's still a lot of room for refactoring and getting rid of repetitive code from plpython, but that split should be fundamental to make it a bit more manageable (it's almost 5K lines now). I've tried to change as little code as possible during the split, apart from making a bunch of functions non-static I only had to change the type initialization to call functions from the plpython_{result,plan,...} modules to avoid exposing the PyTypeObject structs outside of their respective files and get rid of the is_PLyPlanObject macro in favour of a function. Cheers, Jan PS: the patches are gzipped because they're rather big - 270K uncompressed. J PPS: I guess a README in the plpython dir would be in order. If we accept these patches, I'll write one up based on the contents of this mail. J
Attachment
On 11/13/2011 09:45 AM, Jan Urbański wrote: > The first one factors out some bolerplate related to executing SPI > functions in subtransactions (and idea borrowed from pltcl.c). While I haven't looked at the code, this seems worthwhile from the description. > The second one is the actual split. plpython.c has been split into 11 > separate files and one header. Could you comment a bit more about what the goal of this is? We don't have a reviewer for this patch yet, and I think part of the reason is because it's not really obvious what it's supposed to be doing, and why that's useful.
On 28/11/11 11:00, Greg Smith wrote: > On 11/13/2011 09:45 AM, Jan Urbański wrote: >> The second one is the actual split. plpython.c has been split into 11 >> separate files and one header. > > Could you comment a bit more about what the goal of this is? We don't > have a reviewer for this patch yet, and I think part of the reason is > because it's not really obvious what it's supposed to be doing, and why > that's useful. The idea of splitting plpython.c (an almost 5k lines file) into something more modular. It's been floated around here: http://postgresql.1045698.n5.nabble.com/Large-C-files-tt4766446.html#a4773493 and I think at other occasions, too. The patch introduces no behavioural changes, it's only shuffling code around. The only goal is to improve the maintainability. I guess the reviewer could verify that a) I haven't botched the split and it all still compiles and workds b) the choice of which modules were defined is correct Cheers, Jan
On mån, 2011-11-28 at 02:00 -0800, Greg Smith wrote: > On 11/13/2011 09:45 AM, Jan Urbański wrote: > > The first one factors out some bolerplate related to executing SPI > > functions in subtransactions (and idea borrowed from pltcl.c). > > While I haven't looked at the code, this seems worthwhile from the > description. > > > The second one is the actual split. plpython.c has been split into 11 > > separate files and one header. > > Could you comment a bit more about what the goal of this is? We don't > have a reviewer for this patch yet, and I think part of the reason is > because it's not really obvious what it's supposed to be doing, and why > that's useful. I will look into it.
Rebased against master after the SPI cursor patch has been committed. The first patch removes SPI boilerplate from the cursor functions as well and the second patch creates a plpython_cursor.c file. A side effect of creating a separate file for cursors is that I had to make PLy_spi_transaction_{begin,commit,abort} helper functions external since they're used both by regular SPI execution functions and the cursor functions. They live the plpython_spi.c which is not an ideal place for them, but IMHO it's not bad either. Cheers, Jan
Attachment
How to people feel about naming the files (as proposed) ! OBJS = plpython.o plpython_io.o plpython_procedure.o plpython_exec.o \ ! plpython_plpy.o plpython_spi.o plpython_result.o plpython_cursor.o \ ! plpython_plan.o plpython_subtransaction.o plpython_functions.o \ ! plpython_elog.o vs. say ! OBJS = main.o io.o procedure.o exec.o plpy.o spi.o result.o cursor.o \ ! plan.o subtransaction.o functions.o elog.o ?
Excerpts from Peter Eisentraut's message of jue dic 15 12:00:13 -0300 2011: > How to people feel about naming the files (as proposed) > > ! OBJS = plpython.o plpython_io.o plpython_procedure.o plpython_exec.o \ > ! plpython_plpy.o plpython_spi.o plpython_result.o plpython_cursor.o \ > ! plpython_plan.o plpython_subtransaction.o plpython_functions.o \ > ! plpython_elog.o > > vs. say > > ! OBJS = main.o io.o procedure.o exec.o plpy.o spi.o result.o cursor.o \ > ! plan.o subtransaction.o functions.o elog.o > > ? I find the extra prefix unnecessary and ugly; if we had to had a prefix, I'd choose a shorter one (maybe "py" instead of "plpython_"). -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Excerpts from Peter Eisentraut's message of jue dic 15 12:00:13 -0300 2011: >> How to people feel about naming the files (as proposed) >> >> ! OBJS = plpython.o plpython_io.o plpython_procedure.o plpython_exec.o \ >> ! plpython_plpy.o plpython_spi.o plpython_result.o plpython_cursor.o \ >> ! plpython_plan.o plpython_subtransaction.o plpython_functions.o \ >> ! plpython_elog.o >> >> vs. say >> >> ! OBJS = main.o io.o procedure.o exec.o plpy.o spi.o result.o cursor.o \ >> ! plan.o subtransaction.o functions.o elog.o >> >> ? > I find the extra prefix unnecessary and ugly; if we had to had a > prefix, I'd choose a shorter one (maybe "py" instead of "plpython_"). +1 for a prefix, mainly because the shorter names duplicate some names already in use elsewhere in our tree. But I agree with Alvaro that "py" would be sufficient. regards, tom lane
On tis, 2011-12-06 at 00:58 +0100, Jan Urbański wrote: > Rebased against master after the SPI cursor patch has been committed. > > The first patch removes SPI boilerplate from the cursor functions as > well and the second patch creates a plpython_cursor.c file. > > A side effect of creating a separate file for cursors is that I had to > make PLy_spi_transaction_{begin,commit,abort} helper functions external > since they're used both by regular SPI execution functions and the > cursor functions. > > They live the plpython_spi.c which is not an ideal place for them, but > IMHO it's not bad either. Committed now. I moved a few more things around. I split up the plpython.h header file to create a separate header file for each .c file, so that the hierarchical releationship of the modules is clearer. (The only cases of circular relationships should be caused by use of global variables.) Also, I named the files that contain classes or modules more like they are in the CPython source code, e.g., plpy_cursorobject.c.
On 18/12/11 20:53, Peter Eisentraut wrote: > On tis, 2011-12-06 at 00:58 +0100, Jan Urbański wrote: >> Rebased against master after the SPI cursor patch has been committed. >> >> The first patch removes SPI boilerplate from the cursor functions as >> well and the second patch creates a plpython_cursor.c file. >> >> A side effect of creating a separate file for cursors is that I had to >> make PLy_spi_transaction_{begin,commit,abort} helper functions external >> since they're used both by regular SPI execution functions and the >> cursor functions. >> >> They live the plpython_spi.c which is not an ideal place for them, but >> IMHO it's not bad either. > > Committed now. Great, thanks! I hope this will make for a more maintanable PL/Python. By the way, the buildfarm is turning red because it's missing the attached patch. Cheers, Jan