Thread: BUG #6208: crash when c: occupied by removable drive

BUG #6208: crash when c: occupied by removable drive

From
"Richard G. Bayer"
Date:
The following bug has been logged online:

Bug reference:      6208
Logged by:          Richard G. Bayer
Email address:      bayer@nws.cz
PostgreSQL version: 8.4.7
Operating system:   Windows XP 32-bit
Description:        crash when c: occupied by removable drive
Details:

After hours of debugging an application we bought, we figured out that the PG
library was the cause.
The computer had one peculiarity: it was installed with an all-in-one card
reader (Apacer) attached, so the main HDD partition ended up on E:, leaving
C: and D: for the removable SD, CompactFlash, etc. slots.
The library handled the case where C: was not present at all. But as soon as
C: was a removable media drive with no media in it (the UI says "insert
media"), the library crashed. It somehow checks C: and does not handle this
exception (Insert-Some-Damn-SD-Exception).

I found that out when I moved those removables to X: and Y:. With no C: at
all the application started to work like a charm.

I have no idea whether this persists in later versions, but it is such a
rare circumstance that I believe your testers could have missed it.

Hope this was helpful, Richard G. Bayer

Re: BUG #6208: crash when c: occupied by removable drive

From
Craig Ringer
Date:
On 09/16/2011 05:37 PM, Richard G. Bayer wrote:
> The following bug has been logged online:
>
> Bug reference:      6208
> Logged by:          Richard G. Bayer
> Email address:      bayer@nws.cz
> PostgreSQL version: 8.4.7
> Operating system:   Windows XP 32-bit
> Description:        crash when c: occupied by removable drive
> Details:
>
> After hours of debugging an application we bought we figured out that the PG
> library was the reason.
When you say "the Pg library" do you mean libpq? PgODBC? The PostgreSQL
server (postgres.exe)?

If you mean libpq or PgODBC, how did you determine that it was the
culprit of the crash? Do you have a test case or simplified demo program
showing the crash? Do you have a backtrace?

If it's actually postgres.exe, not a library, that crashes, when exactly
does it crash? What is shown in the server log? Can you try to get a
backtrace using the instructions here:

http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows

--
Craig Ringer

Re: BUG #6208: crash when c: occupied by removable drive

From
Craig Ringer
Date:
Please reply all or reply to the pgsql-bugs list, not direct to me.

My response follows inline.

On 18/09/2011 2:03 PM, Bayer G. Richard wrote:
>
> The suspicious library is the libpq.dll that the application uses to
> connect to a database.
>
> There are no developer tools on the machine that has the problem.
> Together with the vendor of the software we pinpointed the problem with
> a primitive approach using MsgBoxes, and found that it is the call to
> the library function that crashes the app.
>
OK, that's the immediate cause, but not necessarily the root cause
("bug"). You can crash in a library by passing a null `char*' where it
expects a valid pointer, by passing a wrong length parameter or a
too-short struct, by having a corrupt heap you haven't detected yet, or
all sorts of other ways.

That doesn't mean there isn't a bug in libpq, just that it's not
necessarily as simple as "it crashed in libpq so the bug is in libpq".

> I cannot even specify the term "call to library function" any more precisely
>
Ah, so you're not the author of the program having problems?

Can you get the author/vendor to get in touch, or to submit a test-case?

> I wrote the rest in the description.
>
> In short: no C: at all worked, but having a drive with removable media
> mapped to C: with no media in it caused trouble.

Yeah, but that's not all that much information. It'd be really good to
know which _exact_ function call failed, and with which arguments. Does it
still fail when run in a cut-down test program that only does the
minimum work required to make that function call?
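
A rough sketch of such a cut-down test, assuming a 32-bit Python with ctypes
can be put on the affected machine (the libpq.dll from the report is 32-bit,
so the interpreter has to match). It loads libpq.dll directly and makes one
PQconnectdb() call, nothing else. The helper names build_conninfo and
try_connect, the DLL path, and the "postgres" database and user are all
placeholders, not anything taken from the vendor's application. If this
crashes with an empty removable drive on C: and works without it, that points
at libpq (or something it loads) rather than at the calling program.

```python
import ctypes
import sys


def build_conninfo(host, dbname, user, port=5432, timeout=5):
    """Build a libpq keyword=value connection string with quoted values."""
    def quote(value):
        # libpq conninfo rule: inside single quotes, escape \ and ' with a backslash.
        text = str(value)
        return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"

    pairs = [("host", host), ("port", port), ("dbname", dbname),
             ("user", user), ("connect_timeout", timeout)]
    return " ".join("%s=%s" % (key, quote(value)) for key, value in pairs)


def try_connect(libpq_path, conninfo):
    """Load libpq and make a single PQconnectdb call; return (ok, message)."""
    libpq = ctypes.CDLL(libpq_path)  # libpq exports plain cdecl functions

    # Declare signatures so pointers are not truncated to C int.
    libpq.PQconnectdb.argtypes = [ctypes.c_char_p]
    libpq.PQconnectdb.restype = ctypes.c_void_p
    libpq.PQstatus.argtypes = [ctypes.c_void_p]
    libpq.PQstatus.restype = ctypes.c_int
    libpq.PQerrorMessage.argtypes = [ctypes.c_void_p]
    libpq.PQerrorMessage.restype = ctypes.c_char_p
    libpq.PQfinish.argtypes = [ctypes.c_void_p]
    libpq.PQfinish.restype = None

    conn = libpq.PQconnectdb(conninfo.encode("utf-8"))
    if conn is None:
        return False, "PQconnectdb returned NULL"
    try:
        ok = libpq.PQstatus(conn) == 0  # CONNECTION_OK is 0 in ConnStatusType
        message = (libpq.PQerrorMessage(conn) or b"").decode("utf-8", "replace")
    finally:
        libpq.PQfinish(conn)
    return ok, message


# Usage: python pqtest.py <path to libpq.dll> <server address>
if __name__ == "__main__" and len(sys.argv) >= 3:
    ok, message = try_connect(
        sys.argv[1], build_conninfo(sys.argv[2], "postgres", "postgres"))
    print("connected" if ok else "connection failed: " + message)
```

A failed connection that prints an error message is fine for this purpose;
what matters is whether the process dies inside the call. Loading libpq.dll
this way also needs its dependent DLLs to be findable, so running the script
from the application's own directory is probably the easiest setup.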

I'm not enthusiastic about setting up a weirdly configured Windows VM to
play with this. I suffer Windows development in my free time too much
already :S

> All I know is that it is the call
> where the IP of the server is first used (some kind of connect() maybe)
>

OK, that's something. It's probably PQconnectdbParams(...),
PQconnectdb(...), PQsetdbLogin(...) or PQsetdb(...), assuming you're
actually using libpq directly. You might be using libpq via PgODBC
though. Again, a test case or backtrace (from the vendor perhaps?) would
really help; as you can see even if it is libpq there are still a lot of
unknown variables with the amount of information you've provided.

I realise you don't want to spend more time on this, so it's up to you
whether you follow up with the vendor of your software and get a decent
test case or a backtrace or something. Even if you could test with
PgAdmin III and verify that *it* crashes when you connect to the
database, that'd help.

If you don't want to do more, it's possible that one of the more
Windows-focused PostgreSQL outfits like EnterpriseDB might pick this up
and look into it, but given the weird nature of the problem config it's
just as likely they won't be able to justify the time/cost. Unless you
can at least show that it crashes psql or PgAdmin III there isn't even
any convincing evidence it's not just a bug in the vendor's app that's
being tripped over inside a call to libpq.

--
Craig Ringer