Re: CC_send_query_append crash - Mailing list pgsql-odbc

From Malcolm MacLeod
Subject Re: CC_send_query_append crash
Date
Msg-id 1399473449.4578.127.camel@watchmen.homenetwork
Whole thread Raw
In response to Re: CC_send_query_append crash  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: CC_send_query_append crash
List pgsql-odbc
On Mon, 2014-05-05 at 11:07 +0300, Heikki Linnakangas wrote:
> On 05/01/2014 02:47 PM, Malcolm MacLeod wrote:
> >
> >> <malcolm.macleod@tshwanedje.com> wrote:
> >>> The crash seems to occur because CC_send_query_append crash takes a
> >>> local copy of the pointer 'self->sock' at the top of the function,
> >>> 'self' is then passed around to various functions (some of which have
> >>> the side effect of setting self->sock to NULL (and deleting) if there is
> >>> a lost connection) and then the local copy of the pointer (which is now
> >>> dangling) is dereferenced lower down in the function.
> >>> Essentially if there is a disconnect while CC_send_query_append is
> >>> running there is a risk of crash.
> >> Looking at the code, I am seeing that the problem is related to
> >> CC_on_abort where conn->sock is set to NULL when the connection is
> >> considered as dead. And I am indeed seeing two code paths (when
> >> sending the 'C' message there is an ABORT check and in cleanup
> >> section) that could use this NULL socket afterwards. Your patch is
> >> perhaps a bit too much. So I am proposing the attached patch instead.
> >> Let me know if this fixes your issue as well.
> >
> > Thanks for the fast response!
> > Your proposed patch would also fix the issue, so I have no problem with
> > it being used instead.
> >
> > I guess from my side I just don't personally understand the point of
> > keeping the local pointer copy at all (it just seems like an invitation
> > for this sort of thing to occur) - so it made more sense to me to remove
> > it entirely to prevent future occurrences of similar issues - although I
> > suppose also the less code disturbed the better. I am not overly
> > familiar with the code so can't say what is best.
>
> I agree that keeping the local pointer copy is an accident waiting to
> happen. Michael's more surgical approach would make sense if we had to
> carefully track the places where self->sock might become NULL. But we
> don't, as all the SOCK_* macros and functions contain NULL checks.
>
> I think the extra null check that both patch versions added to the
> cleanup code path is also unnecessary and wrong. SOCK_get_errcode()
> returns SOCKET_CLOSED error if the argument is null, which will cause
> the connection to be marked as dead, which is the right thing to do.
> However, when I removed that check, the "diagnostics" regression test
> started to fail. What happens is that when the socket is closed while
> processing results in CC_send_query_append, the incomplete QResult
> object is discarded, even though it contained a useful error message. So
> we lose some information that way. But that in turn is pretty simple to
> fix: if we don't set ReadyToReturn in the cleanup routine, then the
> partial QResult object is returned again.
>
> Looking closer at the loop in CC_send_query_append(), I think it's
> basically wrong to set ReadyToReturn when an error occurs. ReadyToReturn
> indicates that the whole result set has been processed, and we're ready
> to send a new query. In many of the error conditions where we currently
> set ReadyToReturn, we're clearly not ready to process the next query.
> For example, in the 'T' case, if QR_Constructor() fails because of
> out-of-memory, we set ReadyToReturn=true and just bail out of the loop.
> That's wrong - we must process the rest of the result set before returning.
>
> Anyway, I committed your original patch with minor changes, which fixes
> your original problem (commit 22b151e). But the error handling in that
> function could use some more work...

Thanks! Will get a copy of the latest code and use it for testing.
Unfortunately I don't have the time to get familiar with the code myself
or I would try help a bit more. I will throw our stress tests at the
latest code and see if anything else turns up though.

Any idea on when the next official psqlodbc build is likely to take
place? Would be nice to know for planning purposes so we can get our
clients onto a fixed version.

- Malcolm MacLeod



pgsql-odbc by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: CC_send_query_append crash
Next
From: PG User
Date:
Subject: keep alive support