msdtc with 32-bit app fails to resolve in-doubt or not-notified transactions - Mailing list pgsql-odbc

From Craig Ringer
Subject msdtc with 32-bit app fails to resolve in-doubt or not-notified transactions
Msg-id 53A45B59.70303@2ndquadrant.com
Responses Re: msdtc with 32-bit app fails to resolve in-doubt or not-notified transactions  ("Inoue, Hiroshi" <inoue@tpf.co.jp>)
List pgsql-odbc
Hi folks

I've found an issue with psqlODBC's MSDTC support and pgxalib.dll, where
a 32-bit application on a 64-bit server will intermittently leave
transactions in the "only failed to notify" state in MSDTC.

This occurs when:

- The application exits normally after its final ITransaction::Commit
call returns but before MSDTC has invoked
ITransactionResourceAsync::CommitRequest on the psqlODBC-provided
IAsyncPG object; or

- The application or server crashes after MSDTC Phase I but before
Phase II.

In both these cases the resource manager is supposed to handle
transaction resolution. It uses pgxalib.dll for this as that's the
registered XA co-ordinator for the resource type.

I've been able to trace pgxalib.dll (which, btw, was painful; I'll
follow up on that) and found that XAConnection::xa_recover() is being
called on the transaction, as expected. It calls into
XAConnection::ActivateConnection, which fails to establish an ODBC
connection and bails out at the test at line 142 after getting return
code -1 from SQLDriverConnect(...).

http://msdn.microsoft.com/en-us/library/ms716219(v=vs.85).aspx

suggests that this is SQL_ERROR. pgxalib.dll doesn't call SQLGetDiagRec
or SQLGetDiagField to get any details and log them; I'll submit a
separate patch for that.

It took me a while to figure it out, but SQLDriverConnect is failing
because it's using the name of the 32-bit driver, since it got the DSN
from a 32-bit application. So there's no such driver as far as the
64-bit application is concerned.

(It didn't help that I couldn't enable system-wide ODBC tracing on the
system for unrelated and annoying as-yet-unresolved reasons with the
ODBC driver manager).

Anyway - it looks like it'll be necessary to figure out in pgxalib.dll
when this is happening and remap the driver name. That seems pretty
crude, though, so I'm looking for better ideas.
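
To make the "remap the driver name" idea concrete, here is a minimal
sketch of the string rewriting involved. It is not code from pgxalib.dll:
remap_driver_for_x64 is a hypothetical helper, and it assumes the 64-bit
driver is registered under the 32-bit name plus an "(x64)" suffix (e.g.
"PostgreSQL Unicode" becoming "PostgreSQL Unicode(x64)"), which should be
checked against the drivers actually installed. It only touches an
explicit DRIVER= attribute and leaves everything else alone.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Case-insensitive comparison for ODBC attribute keywords.
static bool iequals(const std::string &a, const std::string &b)
{
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(),
                      [](unsigned char x, unsigned char y) {
                          return std::tolower(x) == std::tolower(y);
                      });
}

// Hypothetical helper (not part of pgxalib.dll): rewrite DRIVER= so a
// 32-bit psqlODBC driver name becomes the assumed 64-bit name.
// ASSUMPTION: 64-bit name == 32-bit name + "(x64)".
std::string remap_driver_for_x64(const std::string &conn_str)
{
    static const std::string suffix = "(x64)";
    std::string out;
    std::size_t start = 0;

    while (start < conn_str.size()) {
        // Find the end of this attribute; ';' inside {...} is not a separator.
        std::size_t end = start;
        bool in_braces = false;
        while (end < conn_str.size() && (in_braces || conn_str[end] != ';')) {
            if (conn_str[end] == '{')
                in_braces = true;
            else if (conn_str[end] == '}')
                in_braces = false;
            ++end;
        }

        std::string attr = conn_str.substr(start, end - start);
        std::size_t eq = attr.find('=');
        if (eq != std::string::npos && iequals(attr.substr(0, eq), "DRIVER")) {
            std::string value = attr.substr(eq + 1);
            bool braced = value.size() >= 2 &&
                          value.front() == '{' && value.back() == '}';
            std::string name =
                braced ? value.substr(1, value.size() - 2) : value;
            bool has_suffix =
                name.size() >= suffix.size() &&
                name.compare(name.size() - suffix.size(), suffix.size(),
                             suffix) == 0;
            // Only remap names that look like psqlODBC drivers.
            if (name.compare(0, 10, "PostgreSQL") == 0 && !has_suffix)
                name += suffix;
            attr = attr.substr(0, eq + 1) + (braced ? "{" + name + "}" : name);
        }

        out += attr;
        if (end < conn_str.size())
            out += ';';
        start = end + 1;
    }
    return out;
}
```

The result would be what gets passed to SQLDriverConnect in place of the
original string. A connection string that names only a DSN has no DRIVER=
attribute and passes through unchanged, so this sketch would not help in
that case.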

I'll follow up when it's not midnight with:

- a patch to add proper error diagnostics in pgxalib.dll on connection
failure;

- results of testing a hack that just mangles the DSN connection string
manually, as a proof of concept to show that this is really the issue; and

- if I can figure out how to do it the right way (as opposed to just
abusing a breakpoint to set the lvalue on return like I ended up doing),
some documentation on how to turn pgxalib tracing on.


As part of this I've been wondering whether it's possible to deal with
that exit race condition. I'm not sure how to tackle that - I don't
speak fluent COM or OLE. Do you think it'd be legal to delay in the
IAsyncPG dtor until either we confirm commit of a tx we know is in
flight or we hit a (short) timeout?
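
For discussion, the "wait in the dtor until resolved or timed out" idea
could look roughly like the sketch below. PendingCommitGuard is a
hypothetical stand-in, not the real IAsyncPG, and none of its names come
from psqlODBC. It shows only the wait-with-timeout mechanics using the
C++ standard library; it says nothing about whether blocking there is
legal under COM's rules, which is the part I'm unsure about.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for an object that tracks one in-flight tx.
class PendingCommitGuard {
public:
    explicit PendingCommitGuard(std::chrono::milliseconds timeout)
        : timeout_(timeout), resolved_(false) {}

    // Would be called from the Phase II path (e.g. once CommitRequest
    // has been handled) to mark the transaction as resolved.
    void notify_resolved()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            resolved_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until the tx is resolved or the timeout expires.
    // Returns true if resolved, false if we gave up waiting.
    bool wait_for_resolution()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout_, [this] { return resolved_; });
    }

    // The idea under discussion: delay destruction for a bounded time.
    ~PendingCommitGuard() { wait_for_resolution(); }

private:
    std::chrono::milliseconds timeout_;
    bool resolved_;
    std::mutex mutex_;
    std::condition_variable cv_;
};
```

If notify_resolved() has already been called, the destructor returns
immediately; otherwise it blocks for at most the configured timeout, so
a normal exit is delayed by a bounded amount at worst.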

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

