Re: log : bad file dscriptor???? - Mailing list pgsql-bugs
From | Gurjeet Singh |
---|---|
Subject | Re: log : bad file dscriptor???? |
Date | |
Msg-id | 65937bea0903141130n28f593fcm823486849d98297b@mail.gmail.com Whole thread Raw |
In response to | log : bad file dscriptor???? ("Ati Rosselet" <ati.rosselet@gmail.com>) |
Responses |
Re: log : bad file dscriptor????
|
List | pgsql-bugs |
--- Sorry about the previous mail; that did not have the proper subject line (for no fault of mine)
Hi All,
Acting on a customer's report I analyzed this bug, and have found a fix for it. It is not a critical error, but it definitely is a bug, and can have security implications.
This error is raised from syslogger.c, and since this sub-process is not responsible for any backend communication, the backend (or the non-backend) that raised this error has already successfully done its work of communicating the message back to the application, if any.
The security implication is that, that the actual error which was supposed to be also logged in the server log file, is lost; and instead we have this error message in the WindowsErrorLog/Syslog.
Issue:
====
On Windows, the write to log file is done by a thread (whose main function is pipeThread() ), and since it works completely independent of the SysLoggerMain() ( which is responsible for calling logfile_rotate() periodically, which in turn changes the global variable syslogFile) this is causing a race condition due to an error in the way we are using the related critical section.
Here's the flow in my opinion which is causing this error:
main => SysLoggerMain() and thread => pipeThread()
main : calls logfile_rotate()
thread: calls process_pipe_input()/flush_pipe_input()
-> calls write_syslogger_file()
thread: reads in the current value of global variable
syslogFile.
main : enters critical section sysfileSection, and
assigns a new value to the global variable
syslogFile
thread: blocked by the same critical section
main : leaves critical section
thread: enters the critical section and attempts to
write to the old value of syslogFile.
OS : throws an error, as that handle is already
closed by main
The solution is to read the global variable inside the critical section, in write_syslogger_file().
How to reproduce:
=================
Apply the syslogger_race_bug_reproduce.patch patch.
Set the following GUCs in postgresql.conf:
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_rotation_size = 1kB
log_min_messages = notice
Restart the DB for these changes to take effect.
Create the following plpgsql function:
create or replace function raise_notices() returns void as $$
begin
loop
raise notice 'dummy';
end loop;
end;
$$ language plpgsql;
Execute: select raise_notices();
Keep an eye on Windows error log (refresh often; screenshot).
Attachments:
=========
syslogger_race_bug_reproduce.patch :
To reproduce the bug.
syslogger_race_bug_reproduce_temp_fix.patch :
To see the effect of final patch, but with other support code.
syslogger_race_bugfix.patch :
Final patch, ready for application to HEAD.
Best regards,
Hi All,
Acting on a customer's report I analyzed this bug, and have found a fix for it. It is not a critical error, but it definitely is a bug, and can have security implications.
This error is raised from syslogger.c, and since this sub-process is not responsible for any backend communication, the backend (or the non-backend) that raised this error has already successfully done its work of communicating the message back to the application, if any.
The security implication is that, that the actual error which was supposed to be also logged in the server log file, is lost; and instead we have this error message in the WindowsErrorLog/Syslog.
Issue:
====
On Windows, the write to log file is done by a thread (whose main function is pipeThread() ), and since it works completely independent of the SysLoggerMain() ( which is responsible for calling logfile_rotate() periodically, which in turn changes the global variable syslogFile) this is causing a race condition due to an error in the way we are using the related critical section.
Here's the flow in my opinion which is causing this error:
main => SysLoggerMain() and thread => pipeThread()
main : calls logfile_rotate()
thread: calls process_pipe_input()/flush_pipe_input()
-> calls write_syslogger_file()
thread: reads in the current value of global variable
syslogFile.
main : enters critical section sysfileSection, and
assigns a new value to the global variable
syslogFile
thread: blocked by the same critical section
main : leaves critical section
thread: enters the critical section and attempts to
write to the old value of syslogFile.
OS : throws an error, as that handle is already
closed by main
The solution is to read the global variable inside the critical section, in write_syslogger_file().
How to reproduce:
=================
Apply the syslogger_race_bug_reproduce.patch patch.
Set the following GUCs in postgresql.conf:
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_min_messages = notice
Restart the DB for these changes to take effect.
Create the following plpgsql function:
create or replace function raise_notices() returns void as $$
begin
loop
raise notice 'dummy';
end loop;
end;
$$ language plpgsql;
Execute: select raise_notices();
Keep an eye on Windows error log (refresh often; screenshot).
Attachments:
=========
syslogger_race_bug_reproduce.patch :
To reproduce the bug.
syslogger_race_bug_reproduce_temp_fix.patch :
To see the effect of final patch, but with other support code.
syslogger_race_bugfix.patch :
Final patch, ready for application to HEAD.
Best regards,
On Sat, Oct 25, 2008 at 8:43 PM, Ati Rosselet <ati.rosselet@gmail.com> wrote:
I'm still getting a lot of these entries in my eventlog whenever I have a reasonably large amount of logging:
Event Type: Error
Event Source: PostgreSQL
Event Category: None
Event ID: 0
Date: 10/22/2008
Time: 9:36:28 AM
User: N/A
Computer: ----------
Description:
could not write to log file: Bad file descriptor
production server 8.3.3 (tried with 8.3.4 - same thing), win 2003 server. I'd register this as a formal bug, but I can't figure out how to reproduce it.. and
apparently noone else has seen this? Or is noone else running postgres on win2003??? (survey says??)
The timing of each appears to be when log is rolled over (size limited to 10MB) and postgres attempts to write a log entry at the same time:
Any ideas? help? How to proceed???
Cheers
Ati
My log settings from postgresql.conf:
#------------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#------------------------------------------------------------------------------
# - Where to Log -
log_destination = 'stderr' # Valid values are combinations of
# This is used when logging to stderr:
logging_collector = on # Enable capturing of stderr and csvlog
# These are only used if logging_collector is on:
#log_directory = 'pg_log' # directory where log files are written,
# can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
# can include strftime() escapes
#log_truncate_on_rotation = off # If on, an existing log file of the
# same name as the new log file will be
# truncated rather than appended to.
# But such truncation only occurs on
# time-driven rotation, not on restarts
# or size-driven rotation. Default is
# off, meaning append to existing files
# in all cases.
#log_rotation_age = 1d # Automatic rotation of logfiles will
# happen after that time. 0 to disable.
log_rotation_size = 10MB # Automatic rotation of logfiles will
# happen after that much log output.
# 0 to disable.
# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
# - When to Log -
#client_min_messages = notice # values in order of decreasing detail:
#log_min_messages = notice # values in order of decreasing detail:
#log_error_verbosity = default # terse, default, or verbose messages
#log_min_error_statement = error # values in order of decreasing detail:
#log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements
# and their durations, > 0 logs only
# statements running at least this time.
#silent_mode = off # DO NOT USE without syslog or
# logging_collector
# (change requires restart)
# - What to Log -
#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = off
#log_checkpoints = off
log_connections = on
log_disconnections = on
#log_duration = off
#log_hostname = off
#log_line_prefix = '%t ' # special values:
log_line_prefix = '%t [%h:%u@%d] '
#log_lock_waits = off # log lock waits >= deadlock_timeout
log_statement = 'all' # none, ddl, mod, all
#log_temp_files = -1 # log temporary files equal or larger
# than specified size;
# -1 disables, 0 logs all temp files
#log_timezone = unknown # actually, defaults to TZ environment
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
Mail sent from my BlackLaptop device
Attachment
pgsql-bugs by date: