Re: Psycopg2 and LIXA - Mailing list psycopg

From Christian Ferrari
Subject Re: Psycopg2 and LIXA
Date
Msg-id 1329084178.24559.YahooMailNeo@web29501.mail.ird.yahoo.com
In response to Re: Psycopg2 and LIXA  (Daniele Varrazzo <daniele.varrazzo@gmail.com>)
Responses Re: Psycopg2 and LIXA
Re: Psycopg2 and LIXA
List psycopg
Hi Daniele,
I greatly appreciate your answer: there is a lot to work on.
At first glance, some of the standards LIXA follows seem to conflict with some Python de-facto standards, but I think a
solution path can be found.

>>  Thinking about integration between LIXA and Psycopg2 I'm proposing three different paths:
[...]

>>  3. it could be very easy to overload the "psycopg2.connect()" method: if it accepted a "PGconn *" too, the
>>  integration would be straightforward, like: psycopg2.connect([...], lixa.lixa_pq_get_conn() )

> The big shortcoming I see in this method is that it takes a Python
> object which is a wrapper to the PGconn, to be accessed from C. Swig
> is not the only way to create such a wrapper, so I wouldn't like to
> bind the psycopg C implementation specifically to swig (note that
> psycopg should unpack manually the swig wrapper: it wouldn't be
> automatic as psycopg is not swig-generated). A more portable method
> would be to have connect() just receive an integer which would be a
> pointer to the PGconn... but I wouldn't define it as a robust
> interface! An entirely different wrapper for dynamic libraries is the
> one provided by ctypes <http://docs.python.org/library/ctypes.html>,
> which is part of the Python standard library and requires no code
> generation beforehand (two reasons for which it could be considered a
> somewhat blessed wrapper).

At this time I'm looking at SWIG just because it could help me wrap LIXA for Python, PHP, Perl and Ruby;
without a tool like SWIG, the effort would probably be too large.
I don't know ctypes (unfortunately I'm not a Python expert at all), so in the meantime I'm going to explore what it
can do.
I agree with you: if Psycopg2 is not based on SWIG, it would be a bad choice to bind it to SWIG for the sake of this
integration.
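As a reference for the "integer as a pointer" idea mentioned in the quoted text, here is a minimal ctypes sketch. It is only an illustration under stated assumptions: it uses a plain C int instead of a real PGconn so that it runs without libpq or LIXA, and the helper names address_of and handle_from_address are hypothetical.

```python
import ctypes


def address_of(obj):
    """Return the raw memory address of a ctypes object as a plain int."""
    return ctypes.addressof(obj)


def handle_from_address(address, c_type):
    """Rebuild a typed pointer from a plain integer address.

    Nothing checks that the integer really refers to a live object of
    c_type, which is why such an interface is hard to call robust.
    """
    return ctypes.cast(ctypes.c_void_p(address), ctypes.POINTER(c_type))


# Stand-in for a C-side handle (a real case would hold a PGconn *).
value = ctypes.c_int(42)
address = address_of(value)
pointer = handle_from_address(address, ctypes.c_int)
print(pointer.contents.value)
```

The receiving side has to trust the integer completely, so a wrong or stale address would crash the interpreter instead of raising a Python exception.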
The idea behind 

psycopg2.connect([...], lixa.lixa_pq_get_conn() )

is: if Psycopg2 could accept a PostgreSQL (libpq-fe) connection supplied by a third party through an overloaded method, I
could integrate LIXA with Psycopg2 without strange tricks and hacks inside LIXA. The type of object accepted by an
overloaded psycopg2.connect method should be whatever is best for Psycopg2 and practical for LIXA.
We could discuss this point in a later reply, because I'm more interested in the following issues.

>>  What's your opinions and suggestions?
>
> I guess it depends on how do you want your library to be used by the
> Python code, what level or transparency you require from it, or
> conversely how much explicit you want using lixa to be.

I think this is the right time to explain some design choices of LIXA.
When I started developing LIXA (3 years ago), I was principally interested in the XA protocol. I studied the official
document published by X/Open and discovered XA is a *system* interface: it was designed as a standard API (and protocol)
between one Transaction Manager and many Resource Managers. It was not designed to be used by the developers of an
Application Program (I'm using the same terminology used by the X/Open documentation).
I discovered there was a standard for the interface exposed by a Transaction Manager and invoked by an Application
Program: its name is "TX (Transaction Demarcation) Specification".
In my honest opinion TX is not a marvel, but incidentally it was supported by Encina (now IBM TXSeries) and by Tuxedo
(once BEA, now Oracle).
I chose to avoid reinventing the wheel and stuck to the TX standard.
The TX API is not complete and does not solve some issues, but it has two interesting features: it's easy to understand
(and implement) and it doesn't impose too many restrictions (some issues can be bypassed). Speaking about TX, I would
say "it just works".
The TX API was designed for the C and COBOL languages: I suppose no one at that time could imagine a crazy guy would
try to extend it to Python, PHP, Perl and Ruby.

> One thing I
> notice, I don't know how much do you know about it, is that all Python
> database modules implement the same basic interface, called DBAPI
> <http://www.python.org/dev/peps/pep-0249/>: this interface also
> defines how to perform 2-phase commit, and it declares to follow the
> same XA X/Open standard lixa implements... although it does with
> different methods.

I briefly examined DBAPI, but unfortunately there is a major drawback: DBAPI supplies an API that is equivalent to XA
and can be used to implement a Transaction Manager, but it should not be used by an Application Program. If an
Application Program had to deal with the "prepare" and "recover" verbs, it would be implementing a Transaction Manager itself.
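To make the point concrete, here is a rough sketch of what an Application Program ends up writing when it drives the DBAPI two-phase methods directly. The method names (xid, tpc_begin, tpc_prepare, tpc_commit, tpc_rollback) come from PEP 249; StubConnection and two_phase_commit are hypothetical names used only so the example runs without a database, and the sketch deliberately omits the durable logging and tpc_recover() handling a real Transaction Manager needs.

```python
class StubConnection:
    """Stand-in for a DBAPI connection with TPC support; it only records calls."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.calls = []

    def xid(self, format_id, gtrid, bqual):
        return (format_id, gtrid, bqual)

    def tpc_begin(self, xid):
        self.calls.append("tpc_begin")

    def tpc_prepare(self):
        if self.fail_on_prepare:
            raise RuntimeError("%s cannot prepare" % self.name)
        self.calls.append("tpc_prepare")

    def tpc_commit(self):
        self.calls.append("tpc_commit")

    def tpc_rollback(self):
        self.calls.append("tpc_rollback")


def two_phase_commit(connections, gtrid, work):
    """Coordinate one global transaction; return True if it was committed."""
    # One branch per Resource Manager, all sharing the same global id.
    for index, conn in enumerate(connections):
        conn.tpc_begin(conn.xid(1, gtrid, "branch-%d" % index))
    try:
        work(connections)
        # Phase one: every Resource Manager must agree to commit.
        for conn in connections:
            conn.tpc_prepare()
    except Exception:
        # Any failure before the commit decision rolls back every branch.
        for conn in connections:
            conn.tpc_rollback()
        return False
    # Phase two: a crash here would leave prepared branches that only a
    # recovery procedure (tpc_recover) could resolve.
    for conn in connections:
        conn.tpc_commit()
    return True
```

Everything inside two_phase_commit is Transaction Manager logic, which is exactly what LIXA already implements in C behind the TX API.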

> I'm afraid I'm no 2PC expert (although I've
> implemented the support for such methods in psycopg). So in first
> place I wonder if there is a different level at which the libraries
> may interoperate, using the DBAPI 2PC-related method. BTW, if lixa
> could operate with the generic DBAPI 2-phase interface, it could work
> for free not only with psycopg but with any driver implementing such
> interface (well, to be honest I don't know how many of them exist
> yet...).

LIXA already implements all the Transaction Manager logic, and all the
code is C code: extending that logic to deal with a Python API is a
complex task with many risks.

> Also note that if you force your database connection to have a
> specific interface of yours, you would make harder for Python users to
> use such connection in conjunction with already written code or third
> party libraries: you'd have much more success if you could map the xa
> methods to dbapi methods, which means no tx_close() and such (good
> Python interfaces tend to differ from good C interface, that's why
> swig-generated wrappers are usually poor ones by themselves and
> require further wrapping to stop being painful).

I don't think this is a real issue: LIXA supports Distributed Transaction Processing using the TX API. There's no way to
pick up an Application Program designed for one-phase commit and convert it to an Application Program for two-phase
commit without some changes. The scenario I imagine is the following one:
1. there is an Application Program designed for only one Resource Manager, for example PostgreSQL
2. the same Application Program must be re-engineered to deal with two Resource Managers (PostgreSQL and MySQL) and
some transactions must change data inside both Resource Managers (INSERT INTO PostgreSQL, UPDATE MySQL).
If the data were critical, the developer would use a Transaction Manager with two-phase commit support: LIXA might be a
choice.

> Before suggesting any specific solution, I'd like to know what is
> between the client function I understand the lixa user should invoke
> (lixa_pq_get_conn()) and the function that actually creates a libpq
> connection (lixa_pq_open()):

Using the TX API (supplied by LIXA), there are four steps:
1. tx_open()
2. tx_begin()
3. connection handler retrieval
4. business logic (using connection handler, PGconn * for PostgreSQL)

(2. and 3. can be swapped if necessary)
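Seen from Python, the four steps could look like the sketch below. This is purely hypothetical: no LIXA Python binding exists yet, so StubTx is an invented stand-in that records calls, get_conn plays the role of lixa_pq_get_conn(), and run_transaction is just one possible way to arrange the sequence together with the closing tx_commit()/tx_rollback()/tx_close() calls.

```python
class StubTx:
    """Hypothetical binding: records TX calls instead of contacting LIXA."""

    def __init__(self):
        self.calls = []

    def tx_open(self):
        self.calls.append("tx_open")

    def tx_begin(self):
        self.calls.append("tx_begin")

    def get_conn(self):
        # Stand-in for lixa_pq_get_conn(); a real binding would return
        # the connection opened by the Transaction Manager.
        self.calls.append("get_conn")
        return "connection-handle"

    def tx_commit(self):
        self.calls.append("tx_commit")

    def tx_rollback(self):
        self.calls.append("tx_rollback")

    def tx_close(self):
        self.calls.append("tx_close")


def run_transaction(tx, business_logic):
    """Run business_logic inside one TX-demarcated transaction."""
    tx.tx_open()                          # step 1
    try:
        tx.tx_begin()                     # step 2
        conn = tx.get_conn()              # step 3
        try:
            result = business_logic(conn)  # step 4
        except Exception:
            tx.tx_rollback()
            raise
        tx.tx_commit()
        return result
    finally:
        tx.tx_close()
```

Note that the Application Program never opens the connection itself: it only asks for the handle after tx_open().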
This is one of the key aspects of TX: the connection must be opened by the Transaction Manager, and the configuration
necessary to open the connection must be managed by the system engineers in charge of the Transaction Manager.
When you use TX, you don't have to know how the Resource Managers will be reached. The TX standard does not specify how
the Transaction Manager implements this behavior; it only specifies that it is the Transaction Manager's task.
The LIXA implementation uses a flexible approach: a configuration file contains some profiles, and every profile
references a set of Resource Managers and their configurations. If the Application Program does not specify the
LIXA_PROFILE environment variable, it will use the default (first) available profile; if the Application Program
specifies the LIXA_PROFILE environment variable, it will use the desired profile. Some commercial Transaction Managers
do not allow such flexibility: they behave like LIXA with a single configured profile.
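The selection rule can be summarized with a small Python sketch. It is not LIXA code: select_profile, the profile names and the Resource Manager labels are invented for illustration, and raising an error for an unknown profile is my assumption for the sake of the example.

```python
import os


def select_profile(profiles, environ=None):
    """Pick a profile from a list of (name, resource_managers) pairs.

    The list is assumed to be in configuration-file order, so the first
    entry is the default one.
    """
    if environ is None:
        environ = os.environ
    if not profiles:
        raise ValueError("no profile configured")
    wanted = environ.get("LIXA_PROFILE")
    if wanted is None:
        # No environment variable: fall back to the first profile.
        return profiles[0]
    for profile in profiles:
        if profile[0] == wanted:
            return profile
    # Assumption: an unknown profile name is treated as an error.
    raise KeyError(wanted)


# Hypothetical configuration with two profiles.
PROFILES = [
    ("pg_only", ["PostgreSQL"]),
    ("pg_and_my", ["PostgreSQL", "MySQL"]),
]
```

For example, select_profile(PROFILES, {"LIXA_PROFILE": "pg_and_my"}) picks the second profile, while an empty environment picks "pg_only".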
The X/Open (TX) standard specifies the connection must be opened by the Transaction Manager for many reasons; there are
at least two really important reasons:
1. the Resource Manager cannot be used independently of the Transaction Manager (otherwise an Application Program could
create a session, perform some work and then pass it to the Transaction Manager, creating a potentially inconsistent state)
2. the Transaction Manager inspects the Resource Managers at tx_open() time to perform automatic recovery of previously
prepared/recovery-pending transactions

The method "lixa_pq_get_conn()" is a work-around necessary for PostgreSQL and MySQL (lixa_my_get_conn()). Oracle and
DB2 do not need such a work-around: they supply a specific API; PostgreSQL and MySQL *do* *not* *implement* *standard*
*XA*, but only some proprietary extensions that can be used to arrange an XA-like interface (LIXA provides those stubs
too).

> I suspect a good solution could be for
> the lixa code to create a psycopg connection (which in turn calls
> PQconnectdb) and get the PGconn from there, then returning the python
> object to the invoking python code. Also, because lixa seems modular,
> couldn't you create a "psycopg" module, which would largely use the
> same "postgresql" module implementation but would offers methods that
> are meaningful for a Python user (i.e. return a connectionObject
> instead of a PGconn)?

Wrapping "lixa_pq_get_conn()" (that's C code) with a method returning a different type could be done at the C level and
at the Python level as well. Where could I find the exact "connectionObject" specification? Could you point me in the
right direction as a first step?

> I believe it is possible for the two libraries to interoperate, but I
> think the implementation is only a detail, easy to solve: I'd rather
> try to understand the way lixa would  be used from Python (creation,
> usage, finalization of the connections, of the transactions and the
> relation between the two) and derive the implementation from such use
> case.

> -- Daniele

There's probably another interesting detail: the LIXA implementation of the TX API is thread safe and "thread related".
Two distinct threads must each invoke their own "tx_open()/tx_close()" pair.
The transactional state must not be shared between distinct threads because the state is indexed using the thread id
(TX functions do not pass a reference to the state, so it must be implicitly managed by the API).
This is an example:
Thread1                  Thread2
tx_open()                tx_open()
tx_begin()               tx_begin()
some stuff               some stuff
tx_commit()              tx_rollback()
tx_begin()               tx_close()
some stuff
tx_rollback()
tx_close()
The connection handler must not be passed by Thread1 to Thread2 and vice versa.

Thanks in advance.
Ch.
