Thread: pgwin32_open returning EINVAL

pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Hi,

We've seen several cases of people (running 8.1 or 8.2) that see
messages like this:

2007-11-26 11:41:59 ERROR:  could not open relation 1663/352369/353685: Invalid argument

The platform is Win32.

The problem is that pgwin32_open reduces any error code from
GetLastError that's not ERROR_PATH_NOT_FOUND, ERROR_FILE_NOT_FOUND,
ERROR_FILE_EXISTS or ERROR_ACCESS_DENIED into EINVAL, without further
notice.  This is problematic because the true cause of the problem is
being hidden and it's hard to track down what's going on.

Can we do something like this to report the Win32 error code so that the
user has a higher chance of figuring out what's going on?

--
Alvaro Herrera                          Developer, http://www.PostgreSQL.org/
"Las navajas y los monos deben estar siempre distantes"   (Germán Poo)

Attachment

Re: pgwin32_open returning EINVAL

From
Martijn van Oosterhout
Date:
On Wed, Nov 28, 2007 at 11:57:35AM -0300, Alvaro Herrera wrote:
> Can we do something like this to report the Win32 error code so that the
> user has a higher chance of figuring out what's going on?

The Windows API provides a way to get the error message associated
with the code. It seems it would be more user-friendly if the message
was displayed (just as for unix) rather than the code.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
>  -- John F Kennedy

Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Martijn van Oosterhout wrote:
> On Wed, Nov 28, 2007 at 11:57:35AM -0300, Alvaro Herrera wrote:
> > Can we do something like this to report the Win32 error code so that the
> > user has a higher chance of figuring out what's going on?
> 
> The Windows API provides a way to get the error message associated
> with the code. It seems it would be more user-friendly if the message
> was displayed (just as for unix) rather than the code.

Yeah, I looked into that (it's FormatMessage) but it's sufficiently hard
to use that I refrained.  All the more when I can't actually compile
that code.

-- 
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"Nunca se desea ardientemente lo que solo se desea por razón" (F. Alexandre)


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Wed, Nov 28, 2007 at 12:24:26PM -0300, Alvaro Herrera wrote:
> Martijn van Oosterhout wrote:
> > On Wed, Nov 28, 2007 at 11:57:35AM -0300, Alvaro Herrera wrote:
> > > Can we do something like this to report the Win32 error code so that the
> > > user has a higher chance of figuring out what's going on?

We already do something very similar to what you're donig in
backend/port/wni32/socket.c :: TranslateSocketError().

So I think it's a good idea to do it, yes.


> > The Windows API provides a way to get the error message associated
> > with the code. It seems it would be more user-friendly if the message
> > was displayed (just as for unix) rather than the code.
> 
> Yeah, I looked into that (it's FormatMessage) but it's sufficiently hard
> to use that I refrained.  All the more when I can't actually compile
> that code.

It's fairly trivial to use, really. See for example pg_SSPI_error in
auth.c. But if it only happens rarely, we shouldn't need it. Let's try it
with just an int for now, and we can perhaps detect the specific cases when
it happens often, and deal with those separately.

I can put together one that uses messages if you want me to (since I can
compile it...) though. I assume you consider your current patch safe enough
not to need help buiklding that part before it hits the buildfarm :-P


//Magnus


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Wed, Nov 28, 2007 at 05:20:46PM +0100, Magnus Hagander wrote:
> On Wed, Nov 28, 2007 at 12:24:26PM -0300, Alvaro Herrera wrote:
> > Martijn van Oosterhout wrote:
> > > On Wed, Nov 28, 2007 at 11:57:35AM -0300, Alvaro Herrera wrote:
> > > > Can we do something like this to report the Win32 error code so that the
> > > > user has a higher chance of figuring out what's going on?
>
> We already do something very similar to what you're donig in
> backend/port/wni32/socket.c :: TranslateSocketError().
>
> So I think it's a good idea to do it, yes.

Thinking about this some more, I think this is a better patch - we already
have a function that takes care of this, including both error and debug
logging. Anybody disagree? If not, I'll go ahead and apply it...

//Magnus


Attachment

Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Magnus Hagander wrote:
> On Wed, Nov 28, 2007 at 05:20:46PM +0100, Magnus Hagander wrote:
> > On Wed, Nov 28, 2007 at 12:24:26PM -0300, Alvaro Herrera wrote:
> > > Martijn van Oosterhout wrote:
> > > > On Wed, Nov 28, 2007 at 11:57:35AM -0300, Alvaro Herrera wrote:
> > > > > Can we do something like this to report the Win32 error code so that the
> > > > > user has a higher chance of figuring out what's going on?
> > 
> > We already do something very similar to what you're donig in
> > backend/port/wni32/socket.c :: TranslateSocketError().
> > 
> > So I think it's a good idea to do it, yes.
> 
> Thinking about this some more, I think this is a better patch - we already
> have a function that takes care of this, including both error and debug
> logging. Anybody disagree? If not, I'll go ahead and apply it...

Hmm, but this still mixes some error codes.  To absolutely get the
Windows error code you would have to be running with DEBUG5, which I
don't think is acceptable for a production system during any interesting
length of time ...  Can we have that DEBUG5 message changed to LOG
instead?

Or maybe have _dosmaperr receive the elevel with which to report the
message as a parameter, and have current callers use DEBUG5, and
pgwin32_open use LOG.  That way we can backpatch it to 8.2 without
changing current behavior.

-- 
Alvaro Herrera       Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"Prefiero omelette con amigos que caviar con tontos"                                                 (Alain Nonnet)


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Thu, Nov 29, 2007 at 09:09:47AM -0300, Alvaro Herrera wrote:
> Magnus Hagander wrote:
> > On Wed, Nov 28, 2007 at 05:20:46PM +0100, Magnus Hagander wrote:
> > > On Wed, Nov 28, 2007 at 12:24:26PM -0300, Alvaro Herrera wrote:
> > > > Martijn van Oosterhout wrote:
> > > > > On Wed, Nov 28, 2007 at 11:57:35AM -0300, Alvaro Herrera wrote:
> > > > > > Can we do something like this to report the Win32 error code so that the
> > > > > > user has a higher chance of figuring out what's going on?
> > > 
> > > We already do something very similar to what you're donig in
> > > backend/port/wni32/socket.c :: TranslateSocketError().
> > > 
> > > So I think it's a good idea to do it, yes.
> > 
> > Thinking about this some more, I think this is a better patch - we already
> > have a function that takes care of this, including both error and debug
> > logging. Anybody disagree? If not, I'll go ahead and apply it...
> 
> Hmm, but this still mixes some error codes.  To absolutely get the
> Windows error code you would have to be running with DEBUG5, which I
> don't think is acceptable for a production system during any interesting
> length of time ...  Can we have that DEBUG5 message changed to LOG
> instead?

Maybe. I'm concerned we might end up logging a whole lot more, for cases
where it's not an actual error. For example, a file that doesn't exist
doesn't necessarily mean it's an error... I don't want to have to go
through all code-paths that end up calling that function to see if that may
be so...

We can safely drop it to some lower level DEBUG though, maybe DEBUG1?


But it is true that we have mixed error codes. But we only do that when we
know they're actually there. If we have an unknown code, we don't just
return it as EINVAL without logging (as open did before) - and that log 
goes out as LOG.

(the open code already mixed two win32 codes into ENOENT already, btw)


> Or maybe have _dosmaperr receive the elevel with which to report the
> message as a parameter, and have current callers use DEBUG5, and
> pgwin32_open use LOG.  That way we can backpatch it to 8.2 without
> changing current behavior.

I don't think we can, or at least should, touch the signature for
_dosmaperr(). It's a system replacment, I think we should keep the same
signature for it.


//Magnus


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Magnus Hagander wrote:
> On Thu, Nov 29, 2007 at 09:09:47AM -0300, Alvaro Herrera wrote:
> > Magnus Hagander wrote:
> > > On Wed, Nov 28, 2007 at 05:20:46PM +0100, Magnus Hagander wrote:
> > > > On Wed, Nov 28, 2007 at 12:24:26PM -0300, Alvaro Herrera wrote:
> > > > > Martijn van Oosterhout wrote:
> > > > > > On Wed, Nov 28, 2007 at 11:57:35AM -0300, Alvaro Herrera wrote:
> > > > > > > Can we do something like this to report the Win32 error code so that the
> > > > > > > user has a higher chance of figuring out what's going on?
> > > > 
> > > > We already do something very similar to what you're donig in
> > > > backend/port/wni32/socket.c :: TranslateSocketError().
> > > > 
> > > > So I think it's a good idea to do it, yes.
> > > 
> > > Thinking about this some more, I think this is a better patch - we already
> > > have a function that takes care of this, including both error and debug
> > > logging. Anybody disagree? If not, I'll go ahead and apply it...
> > 
> > Hmm, but this still mixes some error codes.  To absolutely get the
> > Windows error code you would have to be running with DEBUG5, which I
> > don't think is acceptable for a production system during any interesting
> > length of time ...  Can we have that DEBUG5 message changed to LOG
> > instead?
> 
> Maybe. I'm concerned we might end up logging a whole lot more, for cases
> where it's not an actual error. For example, a file that doesn't exist
> doesn't necessarily mean it's an error... I don't want to have to go
> through all code-paths that end up calling that function to see if that may
> be so...

I just checked.  I see there are only five callers.  In three cases (two
in file/fd.c and one in port/dirent.c), there is at a single error code
which is "possibly expected".  It is taken care of without calling
_dosmaperr at all.  In syslogger.c there are two possibly expected error
codes, dealt with in the same way.  And the last caller is
port/getrusage.c, which has no possibly-expected error code.

So I don't think this is a concern -- whenever _dosmaperr is called, a
"true" error message is already going to be logged.

(The only case where a message would be logged inappropriately would be
in file/fd.c if _dosmaperr returned EINTR, but AFAICS we don't map any
code to that).

> But it is true that we have mixed error codes. But we only do that when we
> know they're actually there. If we have an unknown code, we don't just
> return it as EINVAL without logging (as open did before) - and that log 
> goes out as LOG.

Yeah, I understand.  But for example there are several different cases
that are mapped to EACCES.  In the cases we're currently following,
having the exact error code could prove crucial to determining the cause
of the error.

> (the open code already mixed two win32 codes into ENOENT already, btw)

Yeah, I saw that, but it's not really a problem because the errors are
"file does not exist" and "dir does not exist", and it can be very
easily confirmed whether the file and dir actually exist or not.  But if
you have ERROR_LOCK_VIOLATION reported identically to
ERROR_SHARING_VIOLATION, or whatever, there's no way you can tell which
one actually occured.


> > Or maybe have _dosmaperr receive the elevel with which to report the
> > message as a parameter, and have current callers use DEBUG5, and
> > pgwin32_open use LOG.  That way we can backpatch it to 8.2 without
> > changing current behavior.
> 
> I don't think we can, or at least should, touch the signature for
> _dosmaperr(). It's a system replacment, I think we should keep the same
> signature for it.

Good point.  If this were truly a problem we could make another routine
with the same code called differently, but I don't think it's really
important.

-- 
Alvaro Herrera                 http://www.amazon.com/gp/registry/CTMLCN8V17R4
"I dream about dreams about dreams", sang the nightingale
under the pale moon (Sandman)


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Thu, Nov 29, 2007 at 09:43:30AM -0300, Alvaro Herrera wrote:
> Magnus Hagander wrote:
> > On Thu, Nov 29, 2007 at 09:09:47AM -0300, Alvaro Herrera wrote:
> > > Magnus Hagander wrote:
> > > > On Wed, Nov 28, 2007 at 05:20:46PM +0100, Magnus Hagander wrote:
> > > > > On Wed, Nov 28, 2007 at 12:24:26PM -0300, Alvaro Herrera wrote:
> > > > > > Martijn van Oosterhout wrote:
> > > > > > > On Wed, Nov 28, 2007 at 11:57:35AM -0300, Alvaro Herrera wrote:
> > > > > > > > Can we do something like this to report the Win32 error code so that the
> > > > > > > > user has a higher chance of figuring out what's going on?
> > > > > 
> > > > > We already do something very similar to what you're donig in
> > > > > backend/port/wni32/socket.c :: TranslateSocketError().
> > > > > 
> > > > > So I think it's a good idea to do it, yes.
> > > > 
> > > > Thinking about this some more, I think this is a better patch - we already
> > > > have a function that takes care of this, including both error and debug
> > > > logging. Anybody disagree? If not, I'll go ahead and apply it...
> > > 
> > > Hmm, but this still mixes some error codes.  To absolutely get the
> > > Windows error code you would have to be running with DEBUG5, which I
> > > don't think is acceptable for a production system during any interesting
> > > length of time ...  Can we have that DEBUG5 message changed to LOG
> > > instead?
> > 
> > Maybe. I'm concerned we might end up logging a whole lot more, for cases
> > where it's not an actual error. For example, a file that doesn't exist
> > doesn't necessarily mean it's an error... I don't want to have to go
> > through all code-paths that end up calling that function to see if that may
> > be so...
> 
> I just checked.  I see there are only five callers.  In three cases (two
> in file/fd.c and one in port/dirent.c), there is at a single error code
> which is "possibly expected".  It is taken care of without calling
> _dosmaperr at all.  In syslogger.c there are two possibly expected error
> codes, dealt with in the same way.  And the last caller is
> port/getrusage.c, which has no possibly-expected error code.
> 
> So I don't think this is a concern -- whenever _dosmaperr is called, a
> "true" error message is already going to be logged.

What about all points that call readdir() which maps to that acll in
port/dirent.c?

If we can disregard that problem, then I think it's good to increase the
level of logging for that one to either NOTICE or LOG.


> (The only case where a message would be logged inappropriately would be
> in file/fd.c if _dosmaperr returned EINTR, but AFAICS we don't map any
> code to that).

No, we don't - the concept of EINTR doesn't really exist on win32, since
there are no signals..


> > But it is true that we have mixed error codes. But we only do that when we
> > know they're actually there. If we have an unknown code, we don't just
> > return it as EINVAL without logging (as open did before) - and that log 
> > goes out as LOG.
> 
> Yeah, I understand.  But for example there are several different cases
> that are mapped to EACCES.  In the cases we're currently following,
> having the exact error code could prove crucial to determining the cause
> of the error.

Yeah, agreed.

//Magnus



Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Magnus Hagander wrote:
> On Thu, Nov 29, 2007 at 09:43:30AM -0300, Alvaro Herrera wrote:
> > Magnus Hagander wrote:

> > > Maybe. I'm concerned we might end up logging a whole lot more, for cases
> > > where it's not an actual error. For example, a file that doesn't exist
> > > doesn't necessarily mean it's an error... I don't want to have to go
> > > through all code-paths that end up calling that function to see if that may
> > > be so...
> >
> > I just checked.  I see there are only five callers.  In three cases (two
> > in file/fd.c and one in port/dirent.c), there is at a single error code
> > which is "possibly expected".  It is taken care of without calling
> > _dosmaperr at all.  In syslogger.c there are two possibly expected error
> > codes, dealt with in the same way.  And the last caller is
> > port/getrusage.c, which has no possibly-expected error code.
> > 
> > So I don't think this is a concern -- whenever _dosmaperr is called, a
> > "true" error message is already going to be logged.
> 
> What about all points that call readdir() which maps to that acll in
> port/dirent.c?

Sorry, I don't follow.  I think the expected case is that FindNextFile
fails with ERROR_NO_MORE_FILES when there are no more files, on which
case we don't call _dosmaperr.

...

Oh, I see what you mean: for the unexpected cases that readdir() does
call _dosmaperr, readdir returns NULL but what does the caller do?

The good news is that most callers of readdir are in frontend programs:
pg_standby, initdb, pg_resetxlog.  There are two callers in the backend:
file/fd.c again, which already calls ereport(ERROR) if anything weird
happen, and pgfnames() which also logs a WARNING.

Callers in frontend programs are not a problem, because the current
_dosmaperr already calls fprintf(stderr) with the code mapping message
in all cases.


Hmm, I just noticed a bug in those fprintf calls -- they are missing
the terminating newline.  Please change that too if you're going to
patch this part of the code.  In order to avoid translation problems, I
think it should be like this:

fprintf(stderr, _("mapped win32 error code %lu to %d" "\n"), e, errno);

Thanks!

-- 
Alvaro Herrera                 http://www.amazon.com/gp/registry/CTMLCN8V17R4
One man's impedance mismatch is another man's layer of abstraction.
(Lincoln Yeoh)


Re: pgwin32_open returning EINVAL

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Magnus Hagander wrote:
>>> Maybe. I'm concerned we might end up logging a whole lot more, for cases
>>> where it's not an actual error.

I'm very concerned about that too, and think that DEBUG5 is just fine.

> Hmm, I just noticed a bug in those fprintf calls -- they are missing
> the terminating newline.  Please change that too if you're going to
> patch this part of the code.  In order to avoid translation problems, I
> think it should be like this:

> fprintf(stderr, _("mapped win32 error code %lu to %d" "\n"), e, errno);

That's both unreadable and useless, because gettext will smash it to one
string anyway.
        regards, tom lane


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Magnus Hagander wrote:
> >>> Maybe. I'm concerned we might end up logging a whole lot more, for cases
> >>> where it's not an actual error.
> 
> I'm very concerned about that too, and think that DEBUG5 is just fine.

Well, the analysis was already done which says this shouldn't be a problem.

> > Hmm, I just noticed a bug in those fprintf calls -- they are missing
> > the terminating newline.  Please change that too if you're going to
> > patch this part of the code.  In order to avoid translation problems, I
> > think it should be like this:
> 
> > fprintf(stderr, _("mapped win32 error code %lu to %d" "\n"), e, errno);
> 
> That's both unreadable and useless, because gettext will smash it to one
> string anyway.

Huh, right.  I meant to have the second string outside the _() call, but
I don't think that can be made to work easily.

-- 
Alvaro Herrera                  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"Puedes elegir el color de tu auto -- siempre y cuando sea negro."
(HenryFord)
 


Re: pgwin32_open returning EINVAL

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Tom Lane wrote:
>> I'm very concerned about that too, and think that DEBUG5 is just fine.

> Well, the analysis was already done which says this shouldn't be a problem.

That analysis is full of holes --- FileRead and FileWrite for starters.
        regards, tom lane


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Tom Lane wrote:
> >> I'm very concerned about that too, and think that DEBUG5 is just fine.
> 
> > Well, the analysis was already done which says this shouldn't be a problem.
> 
> That analysis is full of holes --- FileRead and FileWrite for starters.

I already did.  The case where they retry do not call _dosmaperr.

-- 
Alvaro Herrera                          Developer, http://www.PostgreSQL.org/
"No es bueno caminar con un hombre muerto"


Re: pgwin32_open returning EINVAL

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Tom Lane wrote:
>> That analysis is full of holes --- FileRead and FileWrite for starters.

> I already did.  The case where they retry do not call _dosmaperr.

What's retry got to do with it?  What's displeasing me is the idea of
LOG messages showing up during perfectly normal operation.
        regards, tom lane


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Tom Lane wrote:
> >> That analysis is full of holes --- FileRead and FileWrite for starters.
> 
> > I already did.  The case where they retry do not call _dosmaperr.
> 
> What's retry got to do with it?  What's displeasing me is the idea of
> LOG messages showing up during perfectly normal operation.

Oh, I see your point.  It's that those routines will sometimes be
called, they return an error, and this is *ignored* by the caller.  In
the case of FileRead, the only such caller is ExecHashJoinGetSavedTuple.

In the case of FileWrite, the thing is quite a bit more difficult to
follow, but it goes through BufFileWrite and BufFileFlush.

So yeah, it seems there is valid code trying to call FileRead and
FileWrite, have it error out, and silently ignore the error.  I'm not
going to argue this late in the cycle that all that code be changed, so
I think a reasonable compromise is to turn the ereport() in _dosmaperr
to DEBUG1 instead.  That way it won't clutter any log by default, and in
the cases where we're actually interested in tracking the problematic
situation, we don't need to get huge amounts of log traffic coming from
other parts of the system.

All the cases where BufFileFlush is called and it ignores an error are
bugs.  I think it's quite safe to modify BufFileFlush to ereport(ERROR)
if it cannot do what it was asked.

And all the callers of BufFileWrite immediately ereport() if it cannot
write the specified amount of bytes.

So there is exactly one case where these routines would be unnecessarily
noisy, and that is ExecHashJoinGetSavedTuple.

-- 
Alvaro Herrera                 http://www.amazon.com/gp/registry/DXLWNGRJD34J
"No necesitamos banderasNo reconocemos fronteras"                  (Jorge González)


Re: pgwin32_open returning EINVAL

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> I think a reasonable compromise is to turn the ereport() in _dosmaperr
> to DEBUG1 instead.  That way it won't clutter any log by default, and in
> the cases where we're actually interested in tracking the problematic
> situation, we don't need to get huge amounts of log traffic coming from
> other parts of the system.

I'm still not convinced what you think the problematic situation is.
It's already the case (and reasonable, I think) that _dosmaperr issues a
LOG message if it sees a GetLastError code it doesn't recognize; that
addresses the problem that this thread started with.  Why do we need to
make the success case chattier?
        regards, tom lane


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>> I think a reasonable compromise is to turn the ereport() in _dosmaperr
>> to DEBUG1 instead.  That way it won't clutter any log by default, and in
>> the cases where we're actually interested in tracking the problematic
>> situation, we don't need to get huge amounts of log traffic coming from
>> other parts of the system.
> 
> I'm still not convinced what you think the problematic situation is.
> It's already the case (and reasonable, I think) that _dosmaperr issues a
> LOG message if it sees a GetLastError code it doesn't recognize; that
> addresses the problem that this thread started with.  Why do we need to
> make the success case chattier?

I believe Alvaros point is that several different GetLastError codes map
to the same errno code, making it impossible to see the difference
between those errors.

//Magnus


Re: pgwin32_open returning EINVAL

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> Tom Lane wrote:
>> I'm still not convinced what you think the problematic situation is.

> I believe Alvaros point is that several different GetLastError codes map
> to the same errno code, making it impossible to see the difference
> between those errors.

(1) If we need to know the difference, then we shouldn't be mapping them
to the same thing.

(2) Do we have any live cases where we must know this?  Much less enough
cases to justify a global increase in the log chattiness?  It's not like
we cannot get the information if we really need it.
        regards, tom lane


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > Tom Lane wrote:
> >> I'm still not convinced what you think the problematic situation is.
> 
> > I believe Alvaros point is that several different GetLastError codes map
> > to the same errno code, making it impossible to see the difference
> > between those errors.
> 
> (1) If we need to know the difference, then we shouldn't be mapping them
> to the same thing.

I totally agree with that, but the problem is the API is not very
flexible in this regard.  We can only use error codes that strerror() is
going to recognize.

A radical idea would be to store the Windows error code from
GetLastError in a new field in ErrorData, and display it in the next
ereport().  Perhaps use a special error callback to add an errcontext in
the possibly problematic cases.

> (2) Do we have any live cases where we must know this?  Much less enough
> cases to justify a global increase in the log chattiness?  It's not like
> we cannot get the information if we really need it.

Yes.  This thread shows the problem:

http://archives.postgresql.org/pgsql-es-ayuda/2007-11/msg00354.php

Basically he is getting this error:

2007-11-16 14:54:16 ERROR:  could not open relation 1663/16403/16487: Invalid argument

and no further indication of what's going on.  I did some searching a
couple of days ago and none of the cases I could find got a resolution.

http://archives.free.net.ph/message/20060930.223306.6ac7d657.en.html
http://www.mydatabasesupport.com/forums/postgresql/314656-general-postgre-crash-currval-problem-help.html

-- 
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem."  (Tom Lane)


Re: pgwin32_open returning EINVAL

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Tom Lane wrote:
>> (2) Do we have any live cases where we must know this?

> Yes.  This thread shows the problem:
> http://archives.postgresql.org/pgsql-es-ayuda/2007-11/msg00354.php
> Basically he is getting this error:
> 2007-11-16 14:54:16 ERROR:  could not open relation 1663/16403/16487: Invalid argument

Well, since EINVAL is the default result from _dosmaperr, and none of
the cases it represents are "expected", why don't we just remove all of
the explicit mappings to EINVAL from doserrors[]?  Then we will get the
LOG message you need, and we won't have to increase the chattiness level
for anything else.
        regards, tom lane


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Tom Lane wrote:
> >> (2) Do we have any live cases where we must know this?
> 
> > Yes.  This thread shows the problem:
> > http://archives.postgresql.org/pgsql-es-ayuda/2007-11/msg00354.php
> > Basically he is getting this error:
> > 2007-11-16 14:54:16 ERROR:  could not open relation 1663/16403/16487: Invalid argument
> 
> Well, since EINVAL is the default result from _dosmaperr, and none of
> the cases it represents are "expected", why don't we just remove all of
> the explicit mappings to EINVAL from doserrors[]?  Then we will get the
> LOG message you need, and we won't have to increase the chattiness level
> for anything else.

Well, the problematic routine is not already using _dosmaperr currently.
It is doing it's own mapping and neglecting to report anything.  In
fact, after all the problems that appeared after Magnus proposed to use
_dosmaperr, I'm inclined to go with my original suggestion: don't use
_dosmaperr at all and instead add an ereport(LOG) with the Windows error
code.

The routine I'm talking about (pgwin32_open) has this:
       switch (err)       {               /* EMFILE, ENFILE should not occur from CreateFile. */           case
ERROR_PATH_NOT_FOUND:          case ERROR_FILE_NOT_FOUND:               errno = ENOENT;               break;
caseERROR_FILE_EXISTS:               errno = EEXIST;               break;           case ERROR_ACCESS_DENIED:
   errno = EACCES;               break;           default:               errno = EINVAL;       }
 

So _anything_ could be EINVAL.  Including the several cases that
_dosmaperr treat as EACCES.  So I'm afraid that for this experiment to
be successful, we would have to remove not only the EINVAL cases from
doserrors[], but also any other code that appears more than once on it.
Otherwise the output could be ambiguous.

-- 
Alvaro Herrera                               http://www.PlanetPostgreSQL.org/
"El Maquinismo fue proscrito so pena de cosquilleo hasta la muerte"
(Ijon Tichy en Viajes, Stanislaw Lem)


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
Alvaro Herrera wrote:
> Tom Lane wrote:
>> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>>> Tom Lane wrote:
>>>> (2) Do we have any live cases where we must know this?
>>> Yes.  This thread shows the problem:
>>> http://archives.postgresql.org/pgsql-es-ayuda/2007-11/msg00354.php
>>> Basically he is getting this error:
>>> 2007-11-16 14:54:16 ERROR:  could not open relation 1663/16403/16487: Invalid argument
>> Well, since EINVAL is the default result from _dosmaperr, and none of
>> the cases it represents are "expected", why don't we just remove all of
>> the explicit mappings to EINVAL from doserrors[]?  Then we will get the
>> LOG message you need, and we won't have to increase the chattiness level
>> for anything else.
> 
> Well, the problematic routine is not already using _dosmaperr currently.
> It is doing it's own mapping and neglecting to report anything.  In
> fact, after all the problems that appeared after Magnus proposed to use
> _dosmaperr, I'm inclined to go with my original suggestion: don't use
> _dosmaperr at all and instead add an ereport(LOG) with the Windows error
> code.

That'll just cause the same problem in a different location, no? We'd
still be logging too often?

If we want to avoid duplicate code, we could go around to your idea of
being able to specify the elog level, by creating a _dosmaperr_internal
that takes that as parameter, and then have _dosmaperr call it with
DEBUG5 as argument... But that's only if we think it's ok for
pgwin32_open() to log that much - I thought we didn't like that?


> The routine I'm talking about (pgwin32_open) has this:
> 
>         switch (err)
>         {
>                 /* EMFILE, ENFILE should not occur from CreateFile. */
>             case ERROR_PATH_NOT_FOUND:
>             case ERROR_FILE_NOT_FOUND:
>                 errno = ENOENT;
>                 break;
>             case ERROR_FILE_EXISTS:
>                 errno = EEXIST;
>                 break;
>             case ERROR_ACCESS_DENIED:
>                 errno = EACCES;
>                 break;
>             default:
>                 errno = EINVAL;
>         }
> 
> So _anything_ could be EINVAL.  Including the several cases that
> _dosmaperr treat as EACCES.  So I'm afraid that for this experiment to
> be successful, we would have to remove not only the EINVAL cases from
> doserrors[], but also any other code that appears more than once on it.
> Otherwise the output could be ambiguous.
> 

Could be, but may not be :P

//Magnus


Re: pgwin32_open returning EINVAL

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Tom Lane wrote:
>> Well, since EINVAL is the default result from _dosmaperr, and none of
>> the cases it represents are "expected", why don't we just remove all of
>> the explicit mappings to EINVAL from doserrors[]?

> Well, the problematic routine is not already using _dosmaperr currently.
> It is doing it's own mapping and neglecting to report anything.

Oh, well then why are we arguing?  There is no reason at all to assume
that _dosmaperr wouldn't give us a sufficiently good fix on the error
if it were only being used.  At the very least, I think we should put
in Magnus' patch and find out whether it gives sufficient information.
If it doesn't, then we can think about changing to a higher log level.
        regards, tom lane


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Thu, Nov 29, 2007 at 05:48:23PM -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Tom Lane wrote:
> >> Well, since EINVAL is the default result from _dosmaperr, and none of
> >> the cases it represents are "expected", why don't we just remove all of
> >> the explicit mappings to EINVAL from doserrors[]?
> 
> > Well, the problematic routine is not already using _dosmaperr currently.
> > It is doing it's own mapping and neglecting to report anything.
> 
> Oh, well then why are we arguing?  There is no reason at all to assume
> that _dosmaperr wouldn't give us a sufficiently good fix on the error
> if it were only being used.  At the very least, I think we should put
> in Magnus' patch and find out whether it gives sufficient information.
> If it doesn't, then we can think about changing to a higher log level.

Applied.

//Magnus


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:

> So _anything_ could be EINVAL.  Including the several cases that
> _dosmaperr treat as EACCES.  So I'm afraid that for this experiment to
> be successful, we would have to remove not only the EINVAL cases from
> doserrors[], but also any other code that appears more than once on it.
> Otherwise the output could be ambiguous.

OK, this got tested by the original reporter and the result is this:

2007-12-12 10:55:49 ERROR:  could not open relation 1663/16403/1247: Permission denied

I think this is EACCES, right?  So we now need to know which one it is:

ERROR_ACCESS_DENIED
ERROR_CURRENT_DIRECTORY
ERROR_LOCK_VIOLATION
ERROR_SHARING_VIOLATION
ERROR_NETWORK_ACCESS_DENIED
ERROR_CANNOT_MAKE
ERROR_FAIL_I24
ERROR_DRIVE_LOCKED
ERROR_SEEK_ON_DEVICE
ERROR_NOT_LOCKED
ERROR_LOCK_FAILED

(These are the Win32 codes mapped to EACCES by our _dosmaperr.)

Many of these are nonsensical -- we know this is not a device, nor
network access.  Still there is more than one possibility, and I don't
know which ones should be really acceptable in this context or not.
(What's ERROR_FAIL_I24??)  SHARING_VIOLATION seems the most likely
problem; an antivirus perhaps?

My suggestion continues to be: these codes oughta be logged separately
from the strerror() message.  If there are cases which we know are
deemed to occur in regular operation, let's suppress only those.  None
of them is one of these EACCES, AFAIR.

-- 
Alvaro Herrera                         http://www.flickr.com/photos/alvherre/
Maybe there's lots of data loss but the records of data loss are also lost.
(Lincoln Yeoh)


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Thu, Dec 13, 2007 at 09:55:33AM -0300, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
> 
> > So _anything_ could be EINVAL.  Including the several cases that
> > _dosmaperr treat as EACCES.  So I'm afraid that for this experiment to
> > be successful, we would have to remove not only the EINVAL cases from
> > doserrors[], but also any other code that appears more than once on it.
> > Otherwise the output could be ambiguous.
> 
> OK, this got tested by the original reporter and the result is this:
> 
> 2007-12-12 10:55:49 ERROR:  could not open relation 1663/16403/1247: Permission denied
> 
> I think this is EACCES, right?  So we now need to know which one it is:
> 
> ERROR_ACCESS_DENIED
> ERROR_CURRENT_DIRECTORY
> ERROR_LOCK_VIOLATION
> ERROR_SHARING_VIOLATION
> ERROR_NETWORK_ACCESS_DENIED
> ERROR_CANNOT_MAKE
> ERROR_FAIL_I24
> ERROR_DRIVE_LOCKED
> ERROR_SEEK_ON_DEVICE
> ERROR_NOT_LOCKED
> ERROR_LOCK_FAILED
> 
> (These are the Win32 codes mapped to EACCES by our _dosmaperr.)
> 
> Many of these are nonsensical -- we know this is not a device, nor
> network access.  Still there is more than one possibility, and I don't
> know which ones should be really acceptable in this context or not.
> (What's ERROR_FAIL_I24??)  SHARING_VIOLATION seems the most likely
> problem; an antivirus perhaps?

If you have an antivirus running on the system, you really should get rid
of taht long before you start looking at the code...

But you still have a point :-)

I take it it's not reproducible enough to run at DEBUG5 until it happens?


> My suggestion continues to be: these codes oughta be logged separately
> from the strerror() message.  If there are cases which we know are
> deemed to occur in regular operation, let's suppress only those.  None
> of them is one of these EACCES, AFAIR.

Ok. Say we added a third field to the struct named "verbose" which
indicates which values shuold be logged? And then log those separately
("mapped error %i to %i" or similar). The clearest message would be if we
could log it at the point of the caller, but that's not a reasonable change
IMHO (would change every single caller). But this way we could at least get
something traceable. What level would you have that logged?

//Magnus


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Magnus Hagander wrote:
> On Thu, Dec 13, 2007 at 09:55:33AM -0300, Alvaro Herrera wrote:
> > 
> > Many of these are nonsensical -- we know this is not a device, nor
> > network access.  Still there is more than one possibility, and I don't
> > know which ones should be really acceptable in this context or not.
> > (What's ERROR_FAIL_I24??)  SHARING_VIOLATION seems the most likely
> > problem; an antivirus perhaps?
> 
> If you have an antivirus running on the system, you really should get rid
> of taht long before you start looking at the code...

FWIW I noticed by accident that the latest stable version of a
not-competing database system has fixed a related bug:

http://bugs.mysql.com/bug.php?id=9709
(yes, it only took them two and a half years to fix it).

Note that their behavior on finding SHARING_VIOLATION or LOCK_VIOLATION
is to retry forever until the error goes away, on the theory that the
antivirus/backup software will soon release the file.

-- 
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"Someone said that it is at least an order of magnitude more work to do
production software than a prototype. I think he is wrong by at least
an order of magnitude."                              (Brian Kernighan)


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
Alvaro Herrera wrote:
> Magnus Hagander wrote:
>> On Thu, Dec 13, 2007 at 09:55:33AM -0300, Alvaro Herrera wrote:
>>> Many of these are nonsensical -- we know this is not a device, nor
>>> network access.  Still there is more than one possibility, and I don't
>>> know which ones should be really acceptable in this context or not.
>>> (What's ERROR_FAIL_I24??)  SHARING_VIOLATION seems the most likely
>>> problem; an antivirus perhaps?
>> If you have an antivirus running on the system, you really should get rid
>> of taht long before you start looking at the code...
> 
> FWIW I noticed by accident that the latest stable version of a
> not-competing database system has fixed a related bug:
> 
> http://bugs.mysql.com/bug.php?id=9709
> (yes, it only took them two and a half years to fix it).
> 
> Note that their behavior on finding SHARING_VIOLATION or LOCK_VIOLATION
> is to retry forever until the error goes away, on the theory that the
> antivirus/backup software will soon release the file.

Interesting. Maybe forever is going a bit too far, but retrying for <n>
seconds or so.

So assuming we'd want to do that, how do we do it. If it's just the
"open" operation that needs it, we can probably just stick the retry in
the port file for that. But do we need to be able to retry on say
read/write as well?

//Magnus



Re: pgwin32_open returning EINVAL

From
Andrew Dunstan
Date:

Magnus Hagander wrote:
> Alvaro Herrera wrote:
>   
>> Magnus Hagander wrote:
>>     
>>> On Thu, Dec 13, 2007 at 09:55:33AM -0300, Alvaro Herrera wrote:
>>>       
>>>> Many of these are nonsensical -- we know this is not a device, nor
>>>> network access.  Still there is more than one possibility, and I don't
>>>> know which ones should be really acceptable in this context or not.
>>>> (What's ERROR_FAIL_I24??)  SHARING_VIOLATION seems the most likely
>>>> problem; an antivirus perhaps?
>>>>         
>>> If you have an antivirus running on the system, you really should get rid
>>> of taht long before you start looking at the code...
>>>       
>> FWIW I noticed by accident that the latest stable version of a
>> not-competing database system has fixed a related bug:
>>
>> http://bugs.mysql.com/bug.php?id=9709
>> (yes, it only took them two and a half years to fix it).
>>
>> Note that their behavior on finding SHARING_VIOLATION or LOCK_VIOLATION
>> is to retry forever until the error goes away, on the theory that the
>> antivirus/backup software will soon release the file.
>>     
>
> Interesting. Maybe forever is going a bit too far, but retrying for <n>
> seconds or so.
>
> So assuming we'd want to do that, how do we do it. If it's just the
> "open" operation that needs it, we can probably just stick the retry in
> the port file for that. But do we need to be able to retry on say
> read/write as well?
>
>
>   

Let's start with open and see what happens - we're surely going to need 
it there anyway. We already have a 30 second retry loop in unlink.

cheers

andrew


Re: pgwin32_open returning EINVAL

From
Gregory Stark
Date:
"Andrew Dunstan" <andrew@dunslane.net> writes:

> Magnus Hagander wrote:
>> Alvaro Herrera wrote:
>>> Magnus Hagander wrote:
>>>     
>>> Note that their behavior on finding SHARING_VIOLATION or LOCK_VIOLATION
>>> is to retry forever until the error goes away, on the theory that the
>>> antivirus/backup software will soon release the file.
>>
>> Interesting. Maybe forever is going a bit too far, but retrying for <n>
>> seconds or so.

I think looping forever is the right thing. Having a fixed timeout just means
Postgres will break sometimes instead of all the time. And it introduces
non-deterministic behaviour too.

You could print a warning after 30s but then I think you have to keep trying
forever. Just like database operations hang forever waiting for a lock.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!


Re: pgwin32_open returning EINVAL

From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
> "Andrew Dunstan" <andrew@dunslane.net> writes:
>> Interesting. Maybe forever is going a bit too far, but retrying for <n>
>> seconds or so.

> I think looping forever is the right thing. Having a fixed timeout just means
> Postgres will break sometimes instead of all the time. And it introduces
> non-deterministic behaviour too.

Looping forever would be considered broken by a very large fraction of
the community.

IIRC we have a 30-second timeout in rename() for Windows, and that seems
to be working well enough, so I'd be inclined to copy the behavior for
this case.
        regards, tom lane


Re: pgwin32_open returning EINVAL

From
Gregory Stark
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> "Andrew Dunstan" <andrew@dunslane.net> writes:
>>> Interesting. Maybe forever is going a bit too far, but retrying for <n>
>>> seconds or so.
>
>> I think looping forever is the right thing. Having a fixed timeout just means
>> Postgres will break sometimes instead of all the time. And it introduces
>> non-deterministic behaviour too.
>
> Looping forever would be considered broken by a very large fraction of
> the community.

Really? I understood we're talking about having Postgres fail with an error if
any of its files are opened by another program such as backup software. So
with a 30s limit it means Postgres might or might not fail depending on how
long this other software has the file open. That doesn't seem like an
improvement.

>
> IIRC we have a 30-second timeout in rename() for Windows, and that seems
> to be working well enough, so I'd be inclined to copy the behavior for
> this case.

I thought it was unlink, and the worst-case there is that we leak a file until
some later time. I'm wasn't exactly following that case though.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Sun, Dec 16, 2007 at 10:31:59AM +0000, Gregory Stark wrote:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
> 
> > Gregory Stark <stark@enterprisedb.com> writes:
> >> "Andrew Dunstan" <andrew@dunslane.net> writes:
> >>> Interesting. Maybe forever is going a bit too far, but retrying for <n>
> >>> seconds or so.
> >
> >> I think looping forever is the right thing. Having a fixed timeout just means
> >> Postgres will break sometimes instead of all the time. And it introduces
> >> non-deterministic behaviour too.
> >
> > Looping forever would be considered broken by a very large fraction of
> > the community.
> 
> Really? I understood we're talking about having Postgres fail with an error if
> any of its files are opened by another program such as backup software. So
> with a 30s limit it means Postgres might or might not fail depending on how
> long this other software has the file open. That doesn't seem like an
> improvement.

If your software is locking a file for that long, that software is more
than just broken, it's horribly broken. Having a workaround against
something that might happen once or twice because of a bug in the other
software is one thing, but if it's actually *designed* to do that you
really need to get that software removed from your machine.

Having the server completely stop for 30 seconds waiting for something to
happen is bad enough, I think. Better to error out to let the user know
that there's a major configuration problem on the machine.

> > IIRC we have a 30-second timeout in rename() for Windows, and that seems
> > to be working well enough, so I'd be inclined to copy the behavior for
> > this case.
> 
> I thought it was unlink, and the worst-case there is that we leak a file until
> some later time. I'm wasn't exactly following that case though.

We do it on both rename and unlink.

//Magnus


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Sat, Dec 15, 2007 at 10:31:38PM -0500, Tom Lane wrote:
> Gregory Stark <stark@enterprisedb.com> writes:
> > "Andrew Dunstan" <andrew@dunslane.net> writes:
> >> Interesting. Maybe forever is going a bit too far, but retrying for <n>
> >> seconds or so.
>
> > I think looping forever is the right thing. Having a fixed timeout just means
> > Postgres will break sometimes instead of all the time. And it introduces
> > non-deterministic behaviour too.
>
> Looping forever would be considered broken by a very large fraction of
> the community.
>
> IIRC we have a 30-second timeout in rename() for Windows, and that seems
> to be working well enough, so I'd be inclined to copy the behavior for
> this case.

Here's a patch that I think implements this ;) Alvaro - do you have a build
env so you can test it? I can't reproduce the problem in my environment...

Also, it currently just silently loops. Would it be interesting to
ereport(WARNING) that it's looping on the open, to let the user know
there's a problem? (Naturally, only warning the first time it tries it on
each file, so we don't spam the log too hard)

//Magnus

Attachment

Re: pgwin32_open returning EINVAL

From
Gregory Stark
Date:
"Magnus Hagander" <magnus@hagander.net> writes:

> If your software is locking a file for that long, that software is more
> than just broken, it's horribly broken. Having a workaround against
> something that might happen once or twice because of a bug in the other
> software is one thing, but if it's actually *designed* to do that you
> really need to get that software removed from your machine.

I was under the impression though that this was just how open worked on
windows. Only one process can have a file open at the same time.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Wed, Dec 19, 2007 at 01:05:00PM +0000, Gregory Stark wrote:
> "Magnus Hagander" <magnus@hagander.net> writes:
> 
> > If your software is locking a file for that long, that software is more
> > than just broken, it's horribly broken. Having a workaround against
> > something that might happen once or twice because of a bug in the other
> > software is one thing, but if it's actually *designed* to do that you
> > really need to get that software removed from your machine.
> 
> I was under the impression though that this was just how open worked on
> windows. Only one process can have a file open at the same time.

Then you're wrong. You can open files in shared mode no problem - it's how
we have multiple backensd opening the same file. You do have to remember to
specify the flag when you open the file - something backup software for
example has always done, unless it's really crappy design. 

Most proper backup software will use VSS these days, which makes the whole
point moot.

And antivirus is supposed to be entirely transparent, so it's definitely a
bug when it happens. But we've seen a lot of buggy antivirus.

//Magnus


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Magnus Hagander wrote:
> On Sat, Dec 15, 2007 at 10:31:38PM -0500, Tom Lane wrote:
> > Gregory Stark <stark@enterprisedb.com> writes:
> > > "Andrew Dunstan" <andrew@dunslane.net> writes:
> > >> Interesting. Maybe forever is going a bit too far, but retrying for <n>
> > >> seconds or so.
> > 
> > > I think looping forever is the right thing. Having a fixed timeout just means
> > > Postgres will break sometimes instead of all the time. And it introduces
> > > non-deterministic behaviour too.
> > 
> > Looping forever would be considered broken by a very large fraction of
> > the community.
> > 
> > IIRC we have a 30-second timeout in rename() for Windows, and that seems
> > to be working well enough, so I'd be inclined to copy the behavior for
> > this case.
> 
> Here's a patch that I think implements this ;) Alvaro - do you have a build
> env so you can test it? I can't reproduce the problem in my environment...

Thanks -- forwarded to the appropriate parties. :-)

> Also, it currently just silently loops. Would it be interesting to
> ereport(WARNING) that it's looping on the open, to let the user know
> there's a problem? (Naturally, only warning the first time it tries it on
> each file, so we don't spam the log too hard)

Yeah, I think it would be useful to log one message if after (say) 5
seconds we still haven't been able to open the file.

Is the sleep time correct?  If I'm reading it right, it sleeps 100 ms
each time, 30 times, that totals 3 seconds ... ?

(Are we OK with the idea of sleeping 1 second each time?)



-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Wed, Dec 19, 2007 at 11:23:35AM -0300, Alvaro Herrera wrote:
> Magnus Hagander wrote:
> > On Sat, Dec 15, 2007 at 10:31:38PM -0500, Tom Lane wrote:
> > > Gregory Stark <stark@enterprisedb.com> writes:
> > > > "Andrew Dunstan" <andrew@dunslane.net> writes:
> > > >> Interesting. Maybe forever is going a bit too far, but retrying for <n>
> > > >> seconds or so.
> > > 
> > > > I think looping forever is the right thing. Having a fixed timeout just means
> > > > Postgres will break sometimes instead of all the time. And it introduces
> > > > non-deterministic behaviour too.
> > > 
> > > Looping forever would be considered broken by a very large fraction of
> > > the community.
> > > 
> > > IIRC we have a 30-second timeout in rename() for Windows, and that seems
> > > to be working well enough, so I'd be inclined to copy the behavior for
> > > this case.
> > 
> > Here's a patch that I think implements this ;) Alvaro - do you have a build
> > env so you can test it? I can't reproduce the problem in my environment...
> 
> Thanks -- forwarded to the appropriate parties. :-)

Thanks. Let us know the results :-)

> > Also, it currently just silently loops. Would it be interesting to
> > ereport(WARNING) that it's looping on the open, to let the user know
> > there's a problem? (Naturally, only warning the first time it tries it on
> > each file, so we don't spam the log too hard)
> 
> Yeah, I think it would be useful to log one message if after (say) 5
> seconds we still haven't been able to open the file.

Either that, or on the first run.


> Is the sleep time correct?  If I'm reading it right, it sleeps 100 ms
> each time, 30 times, that totals 3 seconds ... ?

Uh, I copied that from pgunlink() and pgrename(), but forgot a zero on the
loop. It's supposed to loop 300 times.

> (Are we OK with the idea of sleeping 1 second each time?)

I think not. 0.1 seconds is better. We don't want to delay a full second if
it's just a transient thing.

//Magnus


Re: pgwin32_open returning EINVAL

From
"Zeugswetter Andreas ADI SD"
Date:
> > Yeah, I think it would be useful to log one message if after (say) 5
> > seconds we still haven't been able to open the file.
>
> Either that, or on the first run.

Imho 1-5s is better, so that would be after the 10-50th try.

> loop. It's supposed to loop 300 times.

Yes.
> > (Are we OK with the idea of sleeping 1 second each time?)
>
> I think not. 0.1 seconds is better. We don't want to delay a full
second if
> it's just a transient thing.

Yes 0.1 s is imho good. Btw. m$ is talking about milliseconds
(http://support.microsoft.com/kb/316609)

Andreas


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Zeugswetter Andreas ADI SD wrote:

> > I think not. 0.1 seconds is better. We don't want to delay a full
> > second if it's just a transient thing.
> 
> Yes 0.1 s is imho good. Btw. m$ is talking about milliseconds
> (http://support.microsoft.com/kb/316609) 

Hm, the article only mentions ERROR_SHARING_VIOLATION but we're also
considering ERROR_LOCKING_VIOLATION.  Should we retry only on the
SHARING violation?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: pgwin32_open returning EINVAL

From
Andrew Dunstan
Date:

Zeugswetter Andreas ADI SD wrote:
>
>
> Yes 0.1 s is imho good. Btw. m$ is talking about milliseconds
> (http://support.microsoft.com/kb/316609) 
>
>
>   

We have seen cases in the past where these locks last quite a long time. 
That 30s total timeout in rename and unlink was not chosen arbitrarily - 
it's based on experience.

I think a retry interval of 0.1s here should be OK.

cheers

andrew


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Wed, Dec 19, 2007 at 04:20:23PM +0100, Zeugswetter Andreas ADI SD wrote:
> 
> > > Yeah, I think it would be useful to log one message if after (say) 5
> > > seconds we still haven't been able to open the file.
> > 
> > Either that, or on the first run.
> 
> Imho 1-5s is better, so that would be after the 10-50th try.

Ok, so I'll put in a warning after 50 tries.


> > loop. It's supposed to loop 300 times.
> 
> Yes.
>  
> > > (Are we OK with the idea of sleeping 1 second each time?)
> > 
> > I think not. 0.1 seconds is better. We don't want to delay a full
> second if
> > it's just a transient thing.
> 
> Yes 0.1 s is imho good. Btw. m$ is talking about milliseconds
> (http://support.microsoft.com/kb/316609) 

Their sample code sleeps for 0.25 seconds though. (Sleep() takes
milliseconds). So we're definitely fine with 0.1sec I think.

//Magnus


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Wed, Dec 19, 2007 at 04:44:12PM +0100, Magnus Hagander wrote:
> On Wed, Dec 19, 2007 at 04:20:23PM +0100, Zeugswetter Andreas ADI SD wrote:
> >
> > > > Yeah, I think it would be useful to log one message if after (say) 5
> > > > seconds we still haven't been able to open the file.
> > >
> > > Either that, or on the first run.
> >
> > Imho 1-5s is better, so that would be after the 10-50th try.
>
> Ok, so I'll put in a warning after 50 tries.

Updated version attached. Comments on the wording of the messages are also
welcome ;-)

//Magnus

Attachment

Re: pgwin32_open returning EINVAL

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> Updated version attached. Comments on the wording of the messages are also
> welcome ;-)

Well, since you asked ;-) ... I don't particularly like this.

1. Doesn't distinguish SHARING_VIOLATION from LOCK_VIOLATION.
We might want to know that.

2. Do we really want this to be WARNING?  LOG seems a better idea,
since it's not warning about anything the client app did wrong.

3. Doesn't follow the message style guide: shoehorning two sentences
into errmsg() is surely not compliant.  Perhaps
errmsg: could not open file \"%s\": sharing violationOR errmsg: could not open file \"%s\": lock violation
errdetail: Continuing to retry for up to 30 seconds.
errhint: OK as you have it.
        regards, tom lane


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Wed, Dec 19, 2007 at 07:50:29PM -0500, Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > Updated version attached. Comments on the wording of the messages are also
> > welcome ;-)
> 
> Well, since you asked ;-) ... I don't particularly like this.

Well, it's better to have you say that before than to have to redo it
later, of course. Especially when we're this close to translation deadline,
and it contains new strings to translate :-P


> 1. Doesn't distinguish SHARING_VIOLATION from LOCK_VIOLATION.
> We might want to know that.

Yeah, I had that fixed locally already.


> 2. Do we really want this to be WARNING?  LOG seems a better idea,
> since it's not warning about anything the client app did wrong.

I put it as warning because I wanted to be sure the admin notices. If your
database is hanging 5+ seconds to open a file, you have a *big* problem,
and you need to fix it. Just putting it as LOG will probably make it much
more likely it's missed.

Is there a guideline somewhere that says what shuld be WARNING and what
should be LOG in general? If not, maybe there shuold be? ;-)


> 3. Doesn't follow the message style guide: shoehorning two sentences
> into errmsg() is surely not compliant.  Perhaps
> 
>     errmsg: could not open file \"%s\": sharing violation
>     OR errmsg: could not open file \"%s\": lock violation
> 
>     errdetail: Continuing to retry for up to 30 seconds.
> 
>     errhint: OK as you have it.

Yeah, Alvaro pointed parts of that out on IM already. Updated before
application, pending discussion about log level above.

ereport(WARNING,   (errmsg("could not open file \"%s\": %s violation", fileName,   (GetLastError() ==
ERROR_SHARING_VIOLATION)?_("sharing"):_("lock")), errdetail("Continuing to retry for 30 seconds."), errhint("You may
haveantivirus, backup or similar software
 
interfering with the database.")));


Is _("") the proper way to get a translatable string in like that, or is
there a better way?

//Magnus


Re: pgwin32_open returning EINVAL

From
Martijn van Oosterhout
Date:
On Thu, Dec 20, 2007 at 09:32:38AM +0100, Magnus Hagander wrote:
>     (errmsg("could not open file \"%s\": %s violation", fileName,
>       (GetLastError() ==
> ERROR_SHARING_VIOLATION)?_("sharing"):_("lock")),
>
> Is _("") the proper way to get a translatable string in like that, or is
> there a better way?

Hmm, from a translation point of view wouldn't it be better to have the
word "violation" inside each of the _(), since not all languages might
allow such substitions (gender problems).

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
>  -- John F Kennedy

Re: pgwin32_open returning EINVAL

From
"Trevor Talbot"
Date:
On 12/20/07, Magnus Hagander <magnus@hagander.net> wrote:

> ereport(WARNING,
>     (errmsg("could not open file \"%s\": %s violation", fileName,
>           (GetLastError() ==
> ERROR_SHARING_VIOLATION)?_("sharing"):_("lock")),
>          errdetail("Continuing to retry for 30 seconds."),
>          errhint("You may have antivirus, backup or similar software
> interfering with the database.")));

Without looking myself, is it possible for errhint() or errdetail() to
do something that affects GetLastError()? It's like errno, checking it
very far away from the call site makes me nervous.


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Thu, Dec 20, 2007 at 04:39:55AM -0800, Trevor Talbot wrote:
> On 12/20/07, Magnus Hagander <magnus@hagander.net> wrote:
> 
> > ereport(WARNING,
> >     (errmsg("could not open file \"%s\": %s violation", fileName,
> >           (GetLastError() ==
> > ERROR_SHARING_VIOLATION)?_("sharing"):_("lock")),
> >          errdetail("Continuing to retry for 30 seconds."),
> >          errhint("You may have antivirus, backup or similar software
> > interfering with the database.")));
> 
> Without looking myself, is it possible for errhint() or errdetail() to
> do something that affects GetLastError()? It's like errno, checking it
> very far away from the call site makes me nervous.

I guess, but it shouldn't matter. We're giong to loop right back up and do a
new CreateFile() after this, which will overwrite it again.

//Magnus


Re: pgwin32_open returning EINVAL

From
"Trevor Talbot"
Date:
On 12/20/07, Magnus Hagander <magnus@hagander.net> wrote:
> On Thu, Dec 20, 2007 at 04:39:55AM -0800, Trevor Talbot wrote:
> > On 12/20/07, Magnus Hagander <magnus@hagander.net> wrote:

> > > ereport(WARNING,
> > >     (errmsg("could not open file \"%s\": %s violation", fileName,
> > >           (GetLastError() ==
> > > ERROR_SHARING_VIOLATION)?_("sharing"):_("lock")),
> > >          errdetail("Continuing to retry for 30 seconds."),
> > >          errhint("You may have antivirus, backup or similar software
> > > interfering with the database.")));

> > Without looking myself, is it possible for errhint() or errdetail() to
> > do something that affects GetLastError()? It's like errno, checking it
> > very far away from the call site makes me nervous.

> I guess, but it shouldn't matter. We're giong to loop right back up and do a
> new CreateFile() after this, which will overwrite it again.

I mean for the purposes of the report. I'm worried the message might
say it's a lock violation when it's really a sharing violation.


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Trevor Talbot escribió:

> Without looking myself, is it possible for errhint() or errdetail() to
> do something that affects GetLastError()? It's like errno, checking it
> very far away from the call site makes me nervous.

Yeah, they call gettext(), palloc/pfree/pstrdup and various StringInfo
routines.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Thu, Dec 20, 2007 at 04:47:33AM -0800, Trevor Talbot wrote:
> On 12/20/07, Magnus Hagander <magnus@hagander.net> wrote:
> > On Thu, Dec 20, 2007 at 04:39:55AM -0800, Trevor Talbot wrote:
> > > On 12/20/07, Magnus Hagander <magnus@hagander.net> wrote:
> 
> > > > ereport(WARNING,
> > > >     (errmsg("could not open file \"%s\": %s violation", fileName,
> > > >           (GetLastError() ==
> > > > ERROR_SHARING_VIOLATION)?_("sharing"):_("lock")),
> > > >          errdetail("Continuing to retry for 30 seconds."),
> > > >          errhint("You may have antivirus, backup or similar software
> > > > interfering with the database.")));
> 
> > > Without looking myself, is it possible for errhint() or errdetail() to
> > > do something that affects GetLastError()? It's like errno, checking it
> > > very far away from the call site makes me nervous.
> 
> > I guess, but it shouldn't matter. We're giong to loop right back up and do a
> > new CreateFile() after this, which will overwrite it again.
> 
> I mean for the purposes of the report. I'm worried the message might
> say it's a lock violation when it's really a sharing violation.

Wouldn't we then have the same problem for every place that does a %i and
report GetLastError() or errno? And we have a *lot* of those... I would've
thought the framework thinks of that, but I haven't actually verified that.

If it's a problem, it's trivial to save the value off in a variable, of
course.

//Magnus


Re: pgwin32_open returning EINVAL

From
Tom Lane
Date:
"Trevor Talbot" <quension@gmail.com> writes:
> Without looking myself, is it possible for errhint() or errdetail() to
> do something that affects GetLastError()?

Hmm ... the macro framework is designed so that the arguments get
evaluated before anything very interesting happens, but it might be
better to use a variable anyway --- for onm thing we could get rid of
the redundant GetLastError calls in the test in front of this.

BTW, I concur with Martijn's comment to avoid constructing phrases from
spare parts; it creates a translation problem, and in this case you'd
only be saving a few bytes anyway.
        regards, tom lane


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Thu, Dec 20, 2007 at 10:11:10AM -0500, Tom Lane wrote:
> "Trevor Talbot" <quension@gmail.com> writes:
> > Without looking myself, is it possible for errhint() or errdetail() to
> > do something that affects GetLastError()?
> 
> Hmm ... the macro framework is designed so that the arguments get
> evaluated before anything very interesting happens, but it might be
> better to use a variable anyway --- for onm thing we could get rid of
> the redundant GetLastError calls in the test in front of this.

I'd expect the compiler to optimize away those, but I'll make it a var
anyawy.

> BTW, I concur with Martijn's comment to avoid constructing phrases from
> spare parts; it creates a translation problem, and in this case you'd
> only be saving a few bytes anyway.

Ok. Are you saying whole sentence, or are you saying it's fine to just
break out the part that's after the colon into "sharing violation" and
"lock violation"? I assume the second?

//Magnus


Re: pgwin32_open returning EINVAL

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Wed, Dec 19, 2007 at 07:50:29PM -0500, Tom Lane wrote:
>> 2. Do we really want this to be WARNING?  LOG seems a better idea,
>> since it's not warning about anything the client app did wrong.

> I put it as warning because I wanted to be sure the admin notices. If your
> database is hanging 5+ seconds to open a file, you have a *big* problem,
> and you need to fix it. Just putting it as LOG will probably make it much
> more likely it's missed.

This reasoning is faulty.  For logging purposes, LOG is *more* severe
(higher priority) than WARNING.  I think it's fairly common to set
log_min_messages = ERROR, which would mean that warnings disappear.
On the client side, unless you're issuing queries by hand with psql,
it's entirely likely that all non-error messages go into the bit bucket.
You can't count on anyone ever noticing them in a production app.

Use LOG.  That's what it's there for.  (If you want a more formal
definition, I'd say it's for messages that a DBA would be interested in
but are not directly relevant to a client app.)
        regards, tom lane


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Thu, Dec 20, 2007 at 10:26:46AM -0500, Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > On Wed, Dec 19, 2007 at 07:50:29PM -0500, Tom Lane wrote:
> >> 2. Do we really want this to be WARNING?  LOG seems a better idea,
> >> since it's not warning about anything the client app did wrong.
> 
> > I put it as warning because I wanted to be sure the admin notices. If your
> > database is hanging 5+ seconds to open a file, you have a *big* problem,
> > and you need to fix it. Just putting it as LOG will probably make it much
> > more likely it's missed.
> 
> This reasoning is faulty.  For logging purposes, LOG is *more* severe
> (higher priority) than WARNING.  I think it's fairly common to set
> log_min_messages = ERROR, which would mean that warnings disappear.
> On the client side, unless you're issuing queries by hand with psql,
> it's entirely likely that all non-error messages go into the bit bucket.
> You can't count on anyone ever noticing them in a production app.
> 
> Use LOG.  That's what it's there for.  (If you want a more formal
> definition, I'd say it's for messages that a DBA would be interested in
> but are not directly relevant to a client app.)

Ah, wasn't aware of that at all. Then LOG certainly makes a lot more sense,
yes. Thanks for clearifying.


//Magnus


Re: pgwin32_open returning EINVAL

From
"Trevor Talbot"
Date:
On 12/20/07, Magnus Hagander <magnus@hagander.net> wrote:
> On Thu, Dec 20, 2007 at 04:39:55AM -0800, Trevor Talbot wrote:
> > On 12/20/07, Magnus Hagander <magnus@hagander.net> wrote:

> > > ereport(WARNING,
> > >     (errmsg("could not open file \"%s\": %s violation", fileName,
> > >           (GetLastError() ==
> > > ERROR_SHARING_VIOLATION)?_("sharing"):_("lock")),
> > >          errdetail("Continuing to retry for 30 seconds."),
> > >          errhint("You may have antivirus, backup or similar software
> > > interfering with the database.")));

> > Without looking myself, is it possible for errhint() or errdetail() to
> > do something that affects GetLastError()? It's like errno, checking it
> > very far away from the call site makes me nervous.

> Wouldn't we then have the same problem for every place that does a %i and
> report GetLastError() or errno? And we have a *lot* of those... I would've
> thought the framework thinks of that, but I haven't actually verified that.

A function's arguments are evaluated before the call, so that's safe
in general. What is implementation-specific is the order of evaluation
of different arguments, and I don't know if the parentheses above
override that.


On 12/20/07, Magnus Hagander <magnus@hagander.net> wrote:
> On Thu, Dec 20, 2007 at 10:11:10AM -0500, Tom Lane wrote:

> > Hmm ... the macro framework is designed so that the arguments get
> > evaluated before anything very interesting happens, but it might be
> > better to use a variable anyway --- for onm thing we could get rid of
> > the redundant GetLastError calls in the test in front of this.

Sounds like it was already thought of then.

> I'd expect the compiler to optimize away those, but I'll make it a var
> anyawy.

It can't; it's an opaque callout to kernel32.dll, and there's nothing
that tells the optimizer when you can expect to get the same result.
That said, it's cheaper than it looks, since the error code is stored
at a fixed location in thread-specific VM space. I guess it wasn't
interesting enough to make an MSVC intrinsic instead of keeping it a
system detail.


Re: pgwin32_open returning EINVAL

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Thu, Dec 20, 2007 at 10:11:10AM -0500, Tom Lane wrote:
>> BTW, I concur with Martijn's comment to avoid constructing phrases from
>> spare parts; it creates a translation problem, and in this case you'd
>> only be saving a few bytes anyway.

> Ok. Are you saying whole sentence, or are you saying it's fine to just
> break out the part that's after the colon into "sharing violation" and
> "lock violation"? I assume the second?

Right, that's one reason for using the colon: to help divide the message
into phrases that can be translated separately without too much
grammatical mayhem.  _("sharing violation") seems OK to me.
        regards, tom lane


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Thu, Dec 20, 2007 at 07:57:23AM -0800, Trevor Talbot wrote:
> > I'd expect the compiler to optimize away those, but I'll make it a var
> > anyawy.
> 
> It can't; it's an opaque callout to kernel32.dll, and there's nothing
> that tells the optimizer when you can expect to get the same result.
> That said, it's cheaper than it looks, since the error code is stored
> at a fixed location in thread-specific VM space. I guess it wasn't
> interesting enough to make an MSVC intrinsic instead of keeping it a
> system detail.

Interesting. I was certainly it was an intrinsic, but it seems not...
Variable, it is..

//Magnus


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
Magnus Hagander wrote:
> On Thu, Dec 20, 2007 at 10:26:46AM -0500, Tom Lane wrote:
>> Magnus Hagander <magnus@hagander.net> writes:
>>> On Wed, Dec 19, 2007 at 07:50:29PM -0500, Tom Lane wrote:
>>>> 2. Do we really want this to be WARNING?  LOG seems a better idea,
>>>> since it's not warning about anything the client app did wrong.
>>> I put it as warning because I wanted to be sure the admin notices. If your
>>> database is hanging 5+ seconds to open a file, you have a *big* problem,
>>> and you need to fix it. Just putting it as LOG will probably make it much
>>> more likely it's missed.
>> This reasoning is faulty.  For logging purposes, LOG is *more* severe
>> (higher priority) than WARNING.  I think it's fairly common to set
>> log_min_messages = ERROR, which would mean that warnings disappear.
>> On the client side, unless you're issuing queries by hand with psql,
>> it's entirely likely that all non-error messages go into the bit bucket.
>> You can't count on anyone ever noticing them in a production app.
>>
>> Use LOG.  That's what it's there for.  (If you want a more formal
>> definition, I'd say it's for messages that a DBA would be interested in
>> but are not directly relevant to a client app.)
> 
> Ah, wasn't aware of that at all. Then LOG certainly makes a lot more sense,
> yes. Thanks for clearifying.

I've applied a patch for this to head, to get some proper buildfarming
on it. If it passes the tests that Alvaro's contact will be running
(since they had a reasonably repeatable case), I think we should
backpatch it. But not until then...

//Magnus


Re: pgwin32_open returning EINVAL

From
"Jaime Casanova"
Date:
On Dec 20, 2007 3:29 PM, Magnus Hagander <magnus@hagander.net> wrote:
> Magnus Hagander wrote:
> > On Thu, Dec 20, 2007 at 10:26:46AM -0500, Tom Lane wrote:
> >> Magnus Hagander <magnus@hagander.net> writes:
> >>> On Wed, Dec 19, 2007 at 07:50:29PM -0500, Tom Lane wrote:
> >>>> 2. Do we really want this to be WARNING?  LOG seems a better idea,
> >>>> since it's not warning about anything the client app did wrong.
> >>> I put it as warning because I wanted to be sure the admin notices. If your
> >>> database is hanging 5+ seconds to open a file, you have a *big* problem,
> >>> and you need to fix it. Just putting it as LOG will probably make it much
> >>> more likely it's missed.
> >> This reasoning is faulty.  For logging purposes, LOG is *more* severe
> >> (higher priority) than WARNING.  I think it's fairly common to set
> >> log_min_messages = ERROR, which would mean that warnings disappear.
> >> On the client side, unless you're issuing queries by hand with psql,
> >> it's entirely likely that all non-error messages go into the bit bucket.
> >> You can't count on anyone ever noticing them in a production app.
> >>
> >> Use LOG.  That's what it's there for.  (If you want a more formal
> >> definition, I'd say it's for messages that a DBA would be interested in
> >> but are not directly relevant to a client app.)
> >
> > Ah, wasn't aware of that at all. Then LOG certainly makes a lot more sense,
> > yes. Thanks for clearifying.
>
> I've applied a patch for this to head, to get some proper buildfarming
> on it. If it passes the tests that Alvaro's contact will be running
> (since they had a reasonably repeatable case), I think we should
> backpatch it. But not until then...
>

it doesn't compile on current head on mingw 5.1 and msys 1.0.10; of
course, it doesn't compile on 8.2 neither in order to Alvaro's contact
to test it


gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
-DFRONTEND -I. -I../../../src/include -I./src/include/port/win32
-DEXEC_BACKEND  "-I../../../src/include/port/win32"
-I../../../src/port  -c -o pthread-win32.o pthread-win32.c
dllwrap  -o libpq.dll --dllname libpq.dll  --def libpqdll.def
fe-auth.o fe-connect.o fe-exec.o fe-misc.o fe-print.o fe-lobj.o
fe-protocol2.o fe-protocol3.o pqexpbuffer.o pqsignal.o fe-secure.o
md5.o ip.o wchar.o encnames.o noblock.o pgstrcasecmp.o thread.o
crypt.o inet_aton.o strlcpy.o getaddrinfo.o open.o win32error.o
snprintf.o win32.o libpqrc.o pthread-win32.o  -L../../../src/port
-lshfolder -lwsock32 -lws2_32 -lsecur32
open.o:open.c:(.text+0x144): undefined reference to `pg_usleep'
collect2: ld returned 1 exit status
d:\mingw\bin\dllwrap.exe: d:\mingw\bin\gcc exited with status 1
make[3]: *** [libpq.a] Error 1
make[3]: Leaving directory
`/home/General/PG_RELEASES/pgsql-8.3/src/interfaces/libpq'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/home/General/PG_RELEASES/pgsql-8.3/src/interfaces'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/home/General/PG_RELEASES/pgsql-8.3/src'
make: *** [all] Error 2




-- 
Atentamente,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."                                      Richard Cook


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Jaime Casanova escribió:

> it doesn't compile on current head on mingw 5.1 and msys 1.0.10; of
> course, it doesn't compile on 8.2 neither in order to Alvaro's contact
> to test it

Yeah, the buildfarm is failing in the same way.

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dawn_bat&dt=2007-12-20%2022:00:01

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Jaime Casanova escribió:

> it doesn't compile on current head on mingw 5.1 and msys 1.0.10; of
> course, it doesn't compile on 8.2 neither in order to Alvaro's contact
> to test it

I think this patch should fix it.  I think win32.mak needs to be
similarly patched.


--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachment

Re: pgwin32_open returning EINVAL

From
Andrew Dunstan
Date:

Alvaro Herrera wrote:
> Jaime Casanova escribió:
>
>   
>> it doesn't compile on current head on mingw 5.1 and msys 1.0.10; of
>> course, it doesn't compile on 8.2 neither in order to Alvaro's contact
>> to test it
>>     
>
> I think this patch should fix it.  I think win32.mak needs to be
> similarly patched.
>
>
>   

Don't you also need to add pgsleep.o to $(OBJS) in the win32 stanza?

cheers

andrew


Re: pgwin32_open returning EINVAL

From
"Jaime Casanova"
Date:
On Dec 20, 2007 9:31 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> Jaime Casanova escribió:
>
> > it doesn't compile on current head on mingw 5.1 and msys 1.0.10; of
> > course, it doesn't compile on 8.2 neither in order to Alvaro's contact
> > to test it
>
> I think this patch should fix it.  I think win32.mak needs to be
> similarly patched.
>
>

ok. it's solved now. i added the lines in the patch and pgsleep.o in
(circa line 34 in src/interfaces/libpq)

OBJS=    fe-auth.o fe-connect.o fe-exec.o fe-misc.o fe-print.o fe-lobj.o \fe-protocol2.o fe-protocol3.o pqexpbuffer.o
pqsignal.ofe-secure.o \md5.o ip.o wchar.o encnames.o noblock.o pgstrcasecmp.o thread.o \$(filter crypt.o getaddrinfo.o
inet_aton.oopen.o snprintf.o 
strerror.o strlcpy.o win32error.o pgsleep.o, $(LIBOBJS))


--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."                                      Richard Cook


Re: pgwin32_open returning EINVAL

From
Magnus Hagander
Date:
On Fri, Dec 21, 2007 at 01:30:07AM -0500, Jaime Casanova wrote:
> On Dec 20, 2007 9:31 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> > Jaime Casanova escribió:
> >
> > > it doesn't compile on current head on mingw 5.1 and msys 1.0.10; of
> > > course, it doesn't compile on 8.2 neither in order to Alvaro's contact
> > > to test it
> >
> > I think this patch should fix it.  I think win32.mak needs to be
> > similarly patched.
> >
> >
> 
> ok. it's solved now. i added the lines in the patch and pgsleep.o in
> (circa line 34 in src/interfaces/libpq)
> 
> OBJS=    fe-auth.o fe-connect.o fe-exec.o fe-misc.o fe-print.o fe-lobj.o \
>     fe-protocol2.o fe-protocol3.o pqexpbuffer.o pqsignal.o fe-secure.o \
>     md5.o ip.o wchar.o encnames.o noblock.o pgstrcasecmp.o thread.o \
>     $(filter crypt.o getaddrinfo.o inet_aton.o open.o snprintf.o
> strerror.o strlcpy.o win32error.o pgsleep.o, $(LIBOBJS))

That's the wrong place - it should be added for win32 only.

Proper patch committed. Sorry 'bout that one. *grumble*

//Magnus


Re: pgwin32_open returning EINVAL

From
Alvaro Herrera
Date:
Andrew Dunstan escribió:
>
> Alvaro Herrera wrote:

>> I think this patch should fix it.  I think win32.mak needs to be
>> similarly patched.
>
> Don't you also need to add pgsleep.o to $(OBJS) in the win32 stanza?

Hmm.  Wow, that's silly.  I introduced a hack in a Replicator's Makefile
to avoid having to duplicate the file list.  I'll look into submitting
it into libpq's for 8.4.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support