Thread: again...
I'm still getting a lot of these entries in my eventlog whenever I have a reasonably large amount of logging:
Event Type: Error
Event Source: PostgreSQL
Event Category: None
Event ID: 0
Date: 10/22/2008
Time: 9:36:28 AM
User: N/A
Computer: ----------
Description:
could not write to log file: Bad file descriptor
production server 8.3.3 (tried with 8.3.4 - same thing), win 2003 server. I'd send this to psql-bugs, but I can't figure out how to reproduce it.. and
apparently noone else has seen this? Or is noone else running postgres on win2003??? (survey says??)
The timing of each appears to be when log is rolled over (size limited to 10MB) and postgres attempts to write a log entry at the same time:
Any ideas? help? How to proceed???
Cheers
Ati
My log settings from postgresql.conf:
#------------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#------------------------------------------------------------------------------
# - Where to Log -
log_destination = 'stderr' # Valid values are combinations of
# This is used when logging to stderr:
logging_collector = on # Enable capturing of stderr and csvlog
# These are only used if logging_collector is on:
#log_directory = 'pg_log' # directory where log files are written,
# can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
# can include strftime() escapes
#log_truncate_on_rotation = off # If on, an existing log file of the
# same name as the new log file will be
# truncated rather than appended to.
# But such truncation only occurs on
# time-driven rotation, not on restarts
# or size-driven rotation. Default is
# off, meaning append to existing files
# in all cases.
#log_rotation_age = 1d # Automatic rotation of logfiles will
# happen after that time. 0 to disable.
log_rotation_size = 10MB # Automatic rotation of logfiles will
# happen after that much log output.
# 0 to disable.
# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
# - When to Log -
#client_min_messages = notice # values in order of decreasing detail:
#log_min_messages = notice # values in order of decreasing detail:
#log_error_verbosity = default # terse, default, or verbose messages
#log_min_error_statement = error # values in order of decreasing detail:
#log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements
# and their durations, > 0 logs only
# statements running at least this time.
#silent_mode = off # DO NOT USE without syslog or
# logging_collector
# (change requires restart)
# - What to Log -
#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = off
#log_checkpoints = off
log_connections = on
log_disconnections = on
#log_duration = off
#log_hostname = off
#log_line_prefix = '%t ' # special values:
log_line_prefix = '%t [%h:%u@%d] '
#log_lock_waits = off # log lock waits >= deadlock_timeout
log_statement = 'all' # none, ddl, mod, all
#log_temp_files = -1 # log temporary files equal or larger
# than specified size;
# -1 disables, 0 logs all temp files
#log_timezone = unknown # actually, defaults to TZ environment
Event Type: Error
Event Source: PostgreSQL
Event Category: None
Event ID: 0
Date: 10/22/2008
Time: 9:36:28 AM
User: N/A
Computer: ----------
Description:
could not write to log file: Bad file descriptor
production server 8.3.3 (tried with 8.3.4 - same thing), win 2003 server. I'd send this to psql-bugs, but I can't figure out how to reproduce it.. and
apparently noone else has seen this? Or is noone else running postgres on win2003??? (survey says??)
The timing of each appears to be when log is rolled over (size limited to 10MB) and postgres attempts to write a log entry at the same time:
Any ideas? help? How to proceed???
Cheers
Ati
My log settings from postgresql.conf:
#------------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#------------------------------------------------------------------------------
# - Where to Log -
log_destination = 'stderr' # Valid values are combinations of
# This is used when logging to stderr:
logging_collector = on # Enable capturing of stderr and csvlog
# These are only used if logging_collector is on:
#log_directory = 'pg_log' # directory where log files are written,
# can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
# can include strftime() escapes
#log_truncate_on_rotation = off # If on, an existing log file of the
# same name as the new log file will be
# truncated rather than appended to.
# But such truncation only occurs on
# time-driven rotation, not on restarts
# or size-driven rotation. Default is
# off, meaning append to existing files
# in all cases.
#log_rotation_age = 1d # Automatic rotation of logfiles will
# happen after that time. 0 to disable.
log_rotation_size = 10MB # Automatic rotation of logfiles will
# happen after that much log output.
# 0 to disable.
# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
# - When to Log -
#client_min_messages = notice # values in order of decreasing detail:
#log_min_messages = notice # values in order of decreasing detail:
#log_error_verbosity = default # terse, default, or verbose messages
#log_min_error_statement = error # values in order of decreasing detail:
#log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements
# and their durations, > 0 logs only
# statements running at least this time.
#silent_mode = off # DO NOT USE without syslog or
# logging_collector
# (change requires restart)
# - What to Log -
#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = off
#log_checkpoints = off
log_connections = on
log_disconnections = on
#log_duration = off
#log_hostname = off
#log_line_prefix = '%t ' # special values:
log_line_prefix = '%t [%h:%u@%d] '
#log_lock_waits = off # log lock waits >= deadlock_timeout
log_statement = 'all' # none, ddl, mod, all
#log_temp_files = -1 # log temporary files equal or larger
# than specified size;
# -1 disables, 0 logs all temp files
#log_timezone = unknown # actually, defaults to TZ environment
On Sat, Oct 25, 2008 at 9:12 AM, Ati Rosselet <ati.rosselet@gmail.com> wrote: > I'm still getting a lot of these entries in my eventlog whenever I have a > reasonably large amount of logging: > > Event Type: Error > Event Source: PostgreSQL > Event Category: None > Event ID: 0 > Date: 10/22/2008 > Time: 9:36:28 AM > User: N/A > Computer: ---------- > Description: > could not write to log file: Bad file descriptor > > production server 8.3.3 (tried with 8.3.4 - same thing), win 2003 server. > I'd send this to psql-bugs, but I can't figure out how to reproduce it.. and > apparently noone else has seen this? Or is noone else running postgres on > win2003??? (survey says??) > > The timing of each appears to be when log is rolled over (size limited to > 10MB) and postgres attempts to write a log entry at the same time: > > Any ideas? help? How to proceed??? Maybe you've got a virus scanner that's locking your log files on you?
I wish... no such luck. No virus scanner on the back end (nothing gets on the server except the database :))
It seems to be a race condition where the old file is closed before the new file is opened, and the logging attempt fails right at that time. The new file is created just fine, and from then, until the next 10MB checkpoint, all is fine. The reason I'm guessing this, is that it doesn't occur at each file rollover, just sometimes.. indicating (to me at least) that its a timing problem... I tried to get the code and figure out what its doing.. but.. well... um..not really able to figure it out :(
It seems to be a race condition where the old file is closed before the new file is opened, and the logging attempt fails right at that time. The new file is created just fine, and from then, until the next 10MB checkpoint, all is fine. The reason I'm guessing this, is that it doesn't occur at each file rollover, just sometimes.. indicating (to me at least) that its a timing problem... I tried to get the code and figure out what its doing.. but.. well... um..not really able to figure it out :(
Maybe you've got a virus scanner that's locking your log files on you?
----- Original Message ----- From: "Scott Marlowe" <scott.marlowe@gmail.com> To: "Ati Rosselet" <ati.rosselet@gmail.com> Cc: <pgsql-general@postgresql.org> Sent: Saturday, October 25, 2008 12:04 PM Subject: Re: [GENERAL] again... > On Sat, Oct 25, 2008 at 9:12 AM, Ati Rosselet <ati.rosselet@gmail.com> > wrote: >> I'm still getting a lot of these entries in my eventlog whenever I have a >> reasonably large amount of logging: >> >> Event Type: Error >> Event Source: PostgreSQL >> Event Category: None >> Event ID: 0 >> Date: 10/22/2008 >> Time: 9:36:28 AM >> User: N/A >> Computer: ---------- >> Description: >> could not write to log file: Bad file descriptor >> >> production server 8.3.3 (tried with 8.3.4 - same thing), win 2003 server. >> I'd send this to psql-bugs, but I can't figure out how to reproduce it.. >> and >> apparently noone else has seen this? Or is noone else running postgres >> on >> win2003??? (survey says??) >> >> The timing of each appears to be when log is rolled over (size limited to >> 10MB) and postgres attempts to write a log entry at the same time: >> >> Any ideas? help? How to proceed??? > > Maybe you've got a virus scanner that's locking your log files on you? Hi Ati, Inconsistent misbehavior is always fun ;). You may need to increase the probability that the misbehavior shows itself. I'd change the 10 MB limit to 1MB and document your findings. Additionally, you may take the size limit out of the picture and lower the time limit to something ridiculously low. (120 minutes?) Just a suggestion... Good luck! Jeff > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
good suggestion jeff
I would also add starting postgres with log_connections=on
postgres -c log_connections=on
check postgressql.cong logdirectory
log_destination = '/fubar'
just to make sure you have permission to write to fubar..
HTH
Martin
______________________________________________
Disclaimer and confidentiality note
Everything in this e-mail and any attachments relates to the official business of Sender. This transmission is of a confidential nature and Sender does not endorse distribution to any party other than intended recipient. Sender does not necessarily endorse content contained within this transmission.
> From: postgres@image-place.com
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] again...
> Date: Sun, 26 Oct 2008 11:47:24 -0400
>
>
> ----- Original Message -----
> From: "Scott Marlowe" <scott.marlowe@gmail.com>
> To: "Ati Rosselet" <ati.rosselet@gmail.com>
> Cc: <pgsql-general@postgresql.org>
> Sent: Saturday, October 25, 2008 12:04 PM
> Subject: Re: [GENERAL] again...
>
>
> > On Sat, Oct 25, 2008 at 9:12 AM, Ati Rosselet <ati.rosselet@gmail.com>
> > wrote:
> >> I'm still getting a lot of these entries in my eventlog whenever I have a
> >> reasonably large amount of logging:
> >>
> >> Event Type: Error
> >> Event Source: PostgreSQL
> >> Event Category: None
> >> Event ID: 0
> >> Date: 10/22/2008
> >> Time: 9:36:28 AM
> >> User: N/A
> >> Computer: ----------
> >> Description:
> >> could not write to log file: Bad file descriptor
> >>
> >> production server 8.3.3 (tried with 8.3.4 - same thing), win 2003 server.
> >> I'd send this to psql-bugs, but I can't figure out how to reproduce it..
> >> and
> >> apparently noone else has seen this? Or is noone else running postgres
> >> on
> >> win2003??? (survey says??)
> >>
> >> The timing of each appears to be when log is rolled over (size limited to
> >> 10MB) and postgres attempts to write a log entry at the same time:
> >>
> >> Any ideas? help? How to proceed???
> >
> > Maybe you've got a virus scanner that's locking your log files on you?
>
> Hi Ati,
>
> Inconsistent misbehavior is always fun ;). You may need to increase the
> probability that the misbehavior shows itself. I'd change the 10 MB limit
> to 1MB and document your findings. Additionally, you may take the size
> limit out of the picture and lower the time limit to something ridiculously
> low. (120 minutes?)
>
> Just a suggestion... Good luck!
>
> Jeff
>
>
> >
> > --
> > Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-general
> >
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
When your life is on the go—take your life with you. Try Windows Mobile® today
I would also add starting postgres with log_connections=on
postgres -c log_connections=on
check postgressql.cong logdirectory
log_destination = '/fubar'
just to make sure you have permission to write to fubar..
HTH
Martin
______________________________________________
Disclaimer and confidentiality note
Everything in this e-mail and any attachments relates to the official business of Sender. This transmission is of a confidential nature and Sender does not endorse distribution to any party other than intended recipient. Sender does not necessarily endorse content contained within this transmission.
> From: postgres@image-place.com
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] again...
> Date: Sun, 26 Oct 2008 11:47:24 -0400
>
>
> ----- Original Message -----
> From: "Scott Marlowe" <scott.marlowe@gmail.com>
> To: "Ati Rosselet" <ati.rosselet@gmail.com>
> Cc: <pgsql-general@postgresql.org>
> Sent: Saturday, October 25, 2008 12:04 PM
> Subject: Re: [GENERAL] again...
>
>
> > On Sat, Oct 25, 2008 at 9:12 AM, Ati Rosselet <ati.rosselet@gmail.com>
> > wrote:
> >> I'm still getting a lot of these entries in my eventlog whenever I have a
> >> reasonably large amount of logging:
> >>
> >> Event Type: Error
> >> Event Source: PostgreSQL
> >> Event Category: None
> >> Event ID: 0
> >> Date: 10/22/2008
> >> Time: 9:36:28 AM
> >> User: N/A
> >> Computer: ----------
> >> Description:
> >> could not write to log file: Bad file descriptor
> >>
> >> production server 8.3.3 (tried with 8.3.4 - same thing), win 2003 server.
> >> I'd send this to psql-bugs, but I can't figure out how to reproduce it..
> >> and
> >> apparently noone else has seen this? Or is noone else running postgres
> >> on
> >> win2003??? (survey says??)
> >>
> >> The timing of each appears to be when log is rolled over (size limited to
> >> 10MB) and postgres attempts to write a log entry at the same time:
> >>
> >> Any ideas? help? How to proceed???
> >
> > Maybe you've got a virus scanner that's locking your log files on you?
>
> Hi Ati,
>
> Inconsistent misbehavior is always fun ;). You may need to increase the
> probability that the misbehavior shows itself. I'd change the 10 MB limit
> to 1MB and document your findings. Additionally, you may take the size
> limit out of the picture and lower the time limit to something ridiculously
> low. (120 minutes?)
>
> Just a suggestion... Good luck!
>
> Jeff
>
>
> >
> > --
> > Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-general
> >
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
When your life is on the go—take your life with you. Try Windows Mobile® today
Ati Rosselet wrote: > I'm still getting a lot of these entries in my eventlog whenever I > have a reasonably large amount of logging: > > Event Type: Error > Event Source: PostgreSQL > Event Category: None > Event ID: 0 > Date: 10/22/2008 > Time: 9:36:28 AM > User: N/A > Computer: ---------- > Description: > could not write to log file: Bad file descriptor > > production server 8.3.3 (tried with 8.3.4 - same thing), win 2003 > server. I'd send this to psql-bugs, but I can't figure out how to > reproduce it.. and > apparently noone else has seen this? Or is noone else running > postgres on win2003??? (survey says??) > > The timing of each appears to be when log is rolled over (size limited > to 10MB) and postgres attempts to write a log entry at the same time: > > Any ideas? help? How to proceed??? Change the log settings to panic and up the size limit. Almost nothing gets written to the log and the log never gets full enough to rotate. Other than that, you'll need to get visual studio and the source and work it out. Reproducing it is relatively easy - make a function that does raise notice and call it a lot. I had one that essentially took out the server. http://archives.postgresql.org/pgsql-general/2008-08/msg00615.php +1 for survey. klint. -- Klint Gore Database Manager Sheep CRC A.G.B.U. University of New England Armidale NSW 2350 Ph: 02 6773 3789 Fax: 02 6773 3266 EMail: kgore4@une.edu.au
Agreed!
make SURE the user you used to login has rights
cmd
cacls LogFileName
HTH
Martin
______________________________________________
Disclaimer and confidentiality note
Everything in this e-mail and any attachments relates to the official business of Sender. This transmission is of a confidential nature and Sender does not endorse distribution to any party other than intended recipient. Sender does not necessarily endorse content contained within this transmission.
> Date: Mon, 27 Oct 2008 13:11:36 +1100
> From: kgore4@une.edu.au
> To: ati.rosselet@gmail.com
> CC: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] again... (win32 logging errors)
>
> Ati Rosselet wrote:
> > I'm still getting a lot of these entries in my eventlog whenever I
> > have a reasonably large amount of logging:
> >
> > Event Type: Error
> > Event Source: PostgreSQL
> > Event Category: None
> > Event ID: 0
> > Date: 10/22/2008
> > Time: 9:36:28 AM
> > User: N/A
> > Computer: ----------
> > Description:
> > could not write to log file: Bad file descriptor
> >
> > production server 8.3.3 (tried with 8.3.4 - same thing), win 2003
> > server. I'd send this to psql-bugs, but I can't figure out how to
> > reproduce it.. and
> > apparently noone else has seen this? Or is noone else running
> > postgres on win2003??? (survey says??)
> >
> > The timing of each appears to be when log is rolled over (size limited
> > to 10MB) and postgres attempts to write a log entry at the same time:
> >
> > Any ideas? help? How to proceed???
>
> Change the log settings to panic and up the size limit. Almost nothing
> gets written to the log and the log never gets full enough to rotate.
> Other than that, you'll need to get visual studio and the source and
> work it out.
>
> Reproducing it is relatively easy - make a function that does raise
> notice and call it a lot. I had one that essentially took out the server.
> http://archives.postgresql.org/pgsql-general/2008-08/msg00615.php
>
> +1 for survey.
>
> klint.
>
> --
> Klint Gore
> Database Manager
> Sheep CRC
> A.G.B.U.
> University of New England
> Armidale NSW 2350
>
> Ph: 02 6773 3789
> Fax: 02 6773 3266
> EMail: kgore4@une.edu.au
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
Stay organized with simple drag and drop from Windows Live Hotmail. Try it
make SURE the user you used to login has rights
cmd
cacls LogFileName
HTH
Martin
______________________________________________
Disclaimer and confidentiality note
Everything in this e-mail and any attachments relates to the official business of Sender. This transmission is of a confidential nature and Sender does not endorse distribution to any party other than intended recipient. Sender does not necessarily endorse content contained within this transmission.
> Date: Mon, 27 Oct 2008 13:11:36 +1100
> From: kgore4@une.edu.au
> To: ati.rosselet@gmail.com
> CC: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] again... (win32 logging errors)
>
> Ati Rosselet wrote:
> > I'm still getting a lot of these entries in my eventlog whenever I
> > have a reasonably large amount of logging:
> >
> > Event Type: Error
> > Event Source: PostgreSQL
> > Event Category: None
> > Event ID: 0
> > Date: 10/22/2008
> > Time: 9:36:28 AM
> > User: N/A
> > Computer: ----------
> > Description:
> > could not write to log file: Bad file descriptor
> >
> > production server 8.3.3 (tried with 8.3.4 - same thing), win 2003
> > server. I'd send this to psql-bugs, but I can't figure out how to
> > reproduce it.. and
> > apparently noone else has seen this? Or is noone else running
> > postgres on win2003??? (survey says??)
> >
> > The timing of each appears to be when log is rolled over (size limited
> > to 10MB) and postgres attempts to write a log entry at the same time:
> >
> > Any ideas? help? How to proceed???
>
> Change the log settings to panic and up the size limit. Almost nothing
> gets written to the log and the log never gets full enough to rotate.
> Other than that, you'll need to get visual studio and the source and
> work it out.
>
> Reproducing it is relatively easy - make a function that does raise
> notice and call it a lot. I had one that essentially took out the server.
> http://archives.postgresql.org/pgsql-general/2008-08/msg00615.php
>
> +1 for survey.
>
> klint.
>
> --
> Klint Gore
> Database Manager
> Sheep CRC
> A.G.B.U.
> University of New England
> Armidale NSW 2350
>
> Ph: 02 6773 3789
> Fax: 02 6773 3266
> EMail: kgore4@une.edu.au
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
Stay organized with simple drag and drop from Windows Live Hotmail. Try it
On Sat, Oct 25, 2008 at 6:12 PM, Ati Rosselet <ati.rosselet@gmail.com> wrote: > I'm still getting a lot of these entries in my eventlog whenever I have a > reasonably large amount of logging: > > Event Type: Error > Event Source: PostgreSQL > Event Category: None > Event ID: 0 > Date: 10/22/2008 > Time: 9:36:28 AM > User: N/A > Computer: ---------- > Description: > could not write to log file: Bad file descriptor > > production server 8.3.3 (tried with 8.3.4 - same thing), win 2003 server. > I'd send this to psql-bugs, but I can't figure out how to reproduce it.. and > apparently noone else has seen this? Or is noone else running postgres on > win2003??? (survey says??) The same problem is registered for PG 8.3.3 instance under win2k3 (w/o SP, w/o antivirus) control. Trying to figure out what is the reason and how to solve it (installation is not mine)... This message is just to say that you're not alone :-) Maybe this is common problem for win2k3, will see. -- Sincerely yours, Nikolay Samokhvalov Postgresmen LLC, http://postgresmen.ru
Hi All,
Acting on a customer's report I analyzed this bug, and have found a fix for it. It is not a critical error, but it definitely is a bug, and can have security implications.
This error is raised from syslogger.c, and since this sub-process is not responsible for any backend communication, the backend (or the non-backend) that raised this error has already successfully done its work of communicating the message back to the application, if any.
The security implication is that, that the actual error which was supposed to be also logged in the server log file, is lost; and instead we have this error message in the WindowsErrorLog/Syslog.
Issue:
====
On Windows, the write to log file is done by a thread (whose main function is pipeThread() ), and since it works completely independent of the SysLoggerMain() ( which is responsible for calling logfile_rotate() periodically, which in turn changes the global variable syslogFile) this is causing a race condition due to an error in the way we are using the related critical section.
Here's the flow in my opinion which is causing this error:
main => SysLoggerMain() and thread => pipeThread()
main : calls logfile_rotate()
thread: calls process_pipe_input()/flush_pipe_input()
-> calls write_syslogger_file()
thread: reads in the current value of global variable
syslogFile.
main : enters critical section sysfileSection, and
assigns a new value to the global variable
syslogFile
thread: blocked by the same critical section
main : leaves critical section
thread: enters the critical section and attempts to
write to the old value of syslogFile.
OS : throws an error, as that handle is already
closed by main
The solution is to read the global variable inside the critical section, in write_syslogger_file().
How to reproduce:
=================
Apply the syslogger_race_bug_reproduce.patch patch.
Set the following GUCs in postgresql.conf:
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_rotation_size = 1kB
log_min_messages = notice
Restart the DB for these changes to take effect.
Create the following plpgsql function:
create or replace function raise_notices() returns void as $$
begin
loop
raise notice 'dummy';
end loop;
end;
$$ language plpgsql;
Execute: select raise_notices();
Keep an eye on Windows error log (refresh often; screenshot).
Attachments:
=========
syslogger_race_bug_reproduce.patch :
To reproduce the bug.
syslogger_race_bug_reproduce_temp_fix.patch :
To see the effect of final patch, but with other support code.
syslogger_race_bugfix.patch :
Final patch, ready for application to HEAD.
Best regards,
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
Mail sent from my BlackLaptop device
Acting on a customer's report I analyzed this bug, and have found a fix for it. It is not a critical error, but it definitely is a bug, and can have security implications.
This error is raised from syslogger.c, and since this sub-process is not responsible for any backend communication, the backend (or the non-backend) that raised this error has already successfully done its work of communicating the message back to the application, if any.
The security implication is that, that the actual error which was supposed to be also logged in the server log file, is lost; and instead we have this error message in the WindowsErrorLog/Syslog.
Issue:
====
On Windows, the write to log file is done by a thread (whose main function is pipeThread() ), and since it works completely independent of the SysLoggerMain() ( which is responsible for calling logfile_rotate() periodically, which in turn changes the global variable syslogFile) this is causing a race condition due to an error in the way we are using the related critical section.
Here's the flow in my opinion which is causing this error:
main => SysLoggerMain() and thread => pipeThread()
main : calls logfile_rotate()
thread: calls process_pipe_input()/flush_pipe_input()
-> calls write_syslogger_file()
thread: reads in the current value of global variable
syslogFile.
main : enters critical section sysfileSection, and
assigns a new value to the global variable
syslogFile
thread: blocked by the same critical section
main : leaves critical section
thread: enters the critical section and attempts to
write to the old value of syslogFile.
OS : throws an error, as that handle is already
closed by main
The solution is to read the global variable inside the critical section, in write_syslogger_file().
How to reproduce:
=================
Apply the syslogger_race_bug_reproduce.patch patch.
Set the following GUCs in postgresql.conf:
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_rotation_size = 1kB
log_min_messages = notice
Restart the DB for these changes to take effect.
Create the following plpgsql function:
create or replace function raise_notices() returns void as $$
begin
loop
raise notice 'dummy';
end loop;
end;
$$ language plpgsql;
Execute: select raise_notices();
Keep an eye on Windows error log (refresh often; screenshot).
Attachments:
=========
syslogger_race_bug_reproduce.patch :
To reproduce the bug.
syslogger_race_bug_reproduce_temp_fix.patch :
To see the effect of final patch, but with other support code.
syslogger_race_bugfix.patch :
Final patch, ready for application to HEAD.
Best regards,
On Sat, Oct 25, 2008 at 8:42 PM, Ati Rosselet <ati.rosselet@gmail.com> wrote:
I'm still getting a lot of these entries in my eventlog whenever I have a reasonably large amount of logging:
Event Type: Error
Event Source: PostgreSQL
Event Category: None
Event ID: 0
Date: 10/22/2008
Time: 9:36:28 AM
User: N/A
Computer: ----------
Description:
could not write to log file: Bad file descriptor
production server 8.3.3 (tried with 8.3.4 - same thing), win 2003 server. I'd send this to psql-bugs, but I can't figure out how to reproduce it.. and
apparently noone else has seen this? Or is noone else running postgres on win2003??? (survey says??)
The timing of each appears to be when log is rolled over (size limited to 10MB) and postgres attempts to write a log entry at the same time:
Any ideas? help? How to proceed???
Cheers
Ati
My log settings from postgresql.conf:
#------------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#------------------------------------------------------------------------------
# - Where to Log -
log_destination = 'stderr' # Valid values are combinations of
# This is used when logging to stderr:
logging_collector = on # Enable capturing of stderr and csvlog
# These are only used if logging_collector is on:
#log_directory = 'pg_log' # directory where log files are written,
# can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
# can include strftime() escapes
#log_truncate_on_rotation = off # If on, an existing log file of the
# same name as the new log file will be
# truncated rather than appended to.
# But such truncation only occurs on
# time-driven rotation, not on restarts
# or size-driven rotation. Default is
# off, meaning append to existing files
# in all cases.
#log_rotation_age = 1d # Automatic rotation of logfiles will
# happen after that time. 0 to disable.
log_rotation_size = 10MB # Automatic rotation of logfiles will
# happen after that much log output.
# 0 to disable.
# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
# - When to Log -
#client_min_messages = notice # values in order of decreasing detail:
#log_min_messages = notice # values in order of decreasing detail:
#log_error_verbosity = default # terse, default, or verbose messages
#log_min_error_statement = error # values in order of decreasing detail:
#log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements
# and their durations, > 0 logs only
# statements running at least this time.
#silent_mode = off # DO NOT USE without syslog or
# logging_collector
# (change requires restart)
# - What to Log -
#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = off
#log_checkpoints = off
log_connections = on
log_disconnections = on
#log_duration = off
#log_hostname = off
#log_line_prefix = '%t ' # special values:
log_line_prefix = '%t [%h:%u@%d] '
#log_lock_waits = off # log lock waits >= deadlock_timeout
log_statement = 'all' # none, ddl, mod, all
#log_temp_files = -1 # log temporary files equal or larger
# than specified size;
# -1 disables, 0 logs all temp files
#log_timezone = unknown # actually, defaults to TZ environment
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
Mail sent from my BlackLaptop device