Thread: Re: query log corrupted-looking entries
Hoping to resurrect this thread. I am seeing more and more of this as the database gets more usage, and it really messes up query log analysis.

A quick summary: when I first posted, I was getting corrupted query log entries. I still am. They look like this (the problem line plus one line before and after):

2007-05-15 14:24:52 PDT [3859]: [968-1] LOG:  duration: 0.071 ms
2007-05-15 14:24:52 PDT [3859]: [969-1] LOG:  statement2007-05-15 14:24:52 PDT [3885]: [208-1] LOG:  duration: 11.800 ms
: PREPARE <unnamed> AS SELECT P.IS_DISPLAY, P.IS_ACTIVE, P.CITY_POSTAL_CODE_ID, P.property_name, P.address_1, P.address_2,

This seems to happen with statements of all lengths (some are long, but many are not). It used to happen on 8.1.3. I am now on: PostgreSQL 8.1.8 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3).

I think what Tom suggested last (see below) is that I try something on my machine, but it was not entirely clear to me what exactly, and I failed to follow up. I'd be happy to run any tests to help diagnose this better, although I don't have a reliable way to reproduce it ("put a lot of stress on the DB" is all I can do, and hope it happens).

Thanks,

George

> -----Original Message-----
> From: George Pavlov
> Sent: Wednesday, October 18, 2006 1:23 PM
> To: Tom Lane
> Cc: pgsql-hackers@postgreSQL.org
> Subject: RE: [GENERAL] query log corrupted-looking entries
>
> > the behavior. It'd be interesting to verify whether it's the same on
> > George's machine though.
>
> Let me know how to test this. (Please do a "for dummies" version -- I
> am not sure I can figure it out from the thread even though someone
> else might be able to.)
> > -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Wednesday, October 18, 2006 1:17 PM
> To: George Pavlov
> Cc: pgsql-hackers@postgreSQL.org
> Subject: Re: [GENERAL] query log corrupted-looking entries
>
> I wrote:
> > I checked around with some kernel/glibc gurus in Red Hat, and the
> > consensus seemed to be that we'd be better off to bypass fprintf() and
> > just send message strings to stderr using write() --- ie, instead of
> > elog.c doing
> >     fprintf(stderr, "%s", buf.data);
> > do
> >     write(fileno(stderr), buf.data, strlen(buf.data));
>
> I did some strace'ing of the backend, and observed that even for very
> long messages (upwards of 100K) the fprintf results in a single write()
> call. This was true on both Fedora Core 5 and a pretty old HPUX version.
> So it'd seem that most versions of libc already know about optimizing
> fprintf-%s into a direct write(), and changing our code wouldn't change
> the behavior. It'd be interesting to verify whether it's the same on
> George's machine though.
>
> 			regards, tom lane
On Wednesday 23 May 2007 1:04 pm, George Pavlov wrote:
> Hoping to resurrect this thread. I am seeing more and more of
> this as the database gets more usage and it really messes up
> query log analysis.
>
> A quick summary: When I posted this was getting corrupted
> query log entries. I still am. They look like this (the
> problem line + one line before and after):
>
> 2007-05-15 14:24:52 PDT [3859]: [968-1] LOG:  duration: 0.071 ms
> 2007-05-15 14:24:52 PDT [3859]: [969-1] LOG:  statement2007-05-15 14:24:52 PDT [3885]: [208-1] LOG:  duration: 11.800 ms

FWIW, I've also been seeing this sort of query log corruption for as long as I can remember, 7.1 through 8.2, HPUX (parisc, ia64), Linux on intel, amd...

Ed
On 5/29/2007 10:19 AM, Ed L. wrote:
> On Wednesday 23 May 2007 1:04 pm, George Pavlov wrote:
>
> FWIW, I've also been seeing this sort of query log corruption for
> as long as I can remember, 7.1 through 8.2, HPUX (parisc, ia64),
> Linux on intel, amd...

Do you have any tricks for dealing with the problem from a query analyzer perspective? That is, if you use something like pgFouine, do you have any quick and easy way to remove those lines (and the affected lines around them)? Or some way to "fix" the corrupted lines? I'd say that lately ~40% of my daily query logs suffer from this problem, making query analysis very hard.

George
On Friday 01 June 2007 3:09 pm, George Pavlov wrote:
> On 5/29/2007 10:19 AM, Ed L. wrote:
> > On Wednesday 23 May 2007 1:04 pm, George Pavlov wrote:
> > FWIW, I've also been seeing this sort of query log
> > corruption for as long as I can remember, 7.1 through 8.2,
> > HPUX (parisc, ia64), Linux on intel, amd...
>
> Do you have any tricks for dealing with the problem from a
> query analyzer perspective? That is, if you use something like
> pgfouine do you have any quick and easy way to remove those
> lines (and the affected lines around them)? Or some way to
> "fix" the corrupted lines? I'd say that lately ~40% of my
> daily query logs are suffering from this problem making query
> analysis very hard.

Not really. I filter with Perl regular expressions, but I wouldn't say that is reliable. However, our query volumes are high enough that losing a few here and there is no big deal in analysis. I long ago realized that reliably replaying query logs was infeasible, due in part to this issue. Regex filtering is problematic and error-prone (for example, how do you reliably tell the difference between a log timestamp and DB data?). Perhaps others have better answers.

Ed
"George Pavlov" <gpavlov@mynewplace.com> writes:
> On 5/29/2007 10:19 AM, Ed L. wrote:
>> FWIW, I've also been seeing this sort of query log corruption for
>> as long as I can remember, 7.1 through 8.2, HPUX (parisc, ia64),
>> Linux on intel, amd...

What *exactly* is the logging setup you guys use, and have you tried alternatives?

As best I can tell, the variant with a downstream syslogger process ought to be OK for messages up to PIPE_BUF bytes long on most standard Unixen. Of course that's no help if you've got lots of very verbose queries, but the examples posted so far didn't seem to indicate that.

			regards, tom lane
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> "George Pavlov" <gpavlov@mynewplace.com> writes:
> > On 5/29/2007 10:19 AM, Ed L. wrote:
> >> FWIW, I've also been seeing this sort of query log corruption for
> >> as long as I can remember, 7.1 through 8.2, HPUX (parisc, ia64),
> >> Linux on intel, amd...
>
> What *exactly* is the logging setup you guys use, and have you tried
> alternatives?

Have not tried many alternatives, as I am using pgFouine and it is fairly rigid regarding the log format it expects. As I originally posted, my logging setup is:

> the relevant config settings are (show all | grep ' log_' output):
>
> log_connections            | on
> log_destination            | stderr
> log_directory              | pg_log
> log_disconnections         | on
> log_duration               | on
> log_error_verbosity        | default
> log_executor_stats         | off
> log_filename               | postgresql-%a-%H.log
> log_hostname               | off
> log_line_prefix            | %t [%p]: [%l-1]
> log_min_duration_statement | -1
> log_min_error_statement    | panic
> log_min_messages           | notice
> log_parser_stats           | off
> log_planner_stats          | off
> log_rotation_age           | 60
> log_rotation_size          | 0
> log_statement              | all
> log_statement_stats        | off
> log_truncate_on_rotation   | on

Also redirect_stderr = on. Let me know if there are any other config settings that might be important.

I am currently on "PostgreSQL 8.1.8 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3)", but this has happened with other 8.1.* versions, and Ed L. indicates with other major versions as well. It happens even with very short messages (definitely less than 4096 bytes) -- I had pasted some examples originally; it does not seem to be related to message/query length.

George
On Friday 01 June 2007 3:36 pm, Tom Lane wrote:
> "George Pavlov" <gpavlov@mynewplace.com> writes:
> > On 5/29/2007 10:19 AM, Ed L. wrote:
> >> FWIW, I've also been seeing this sort of query log
> >> corruption for as long as I can remember, 7.1 through 8.2,
> >> HPUX (parisc, ia64), Linux on intel, amd...
>
> What *exactly* is the logging setup you guys use, and have you
> tried alternatives?
>
> As best I can tell, the variant with a downstream syslogger
> process ought to be OK for messages up to PIPE_BUF bytes long
> on most standard Unixen. Of course that's no help if you've
> got lots of very verbose queries, but the examples posted so
> far didn't seem to indicate that.

On older versions, we use stderr piped to Apache's rotatelogs. On newer versions we use the built-in logging. We do have considerably verbose queries (sometimes a page or so). No idea what PIPE_BUF equals here.

Ed
On Friday 01 June 2007 3:36 pm, Tom Lane wrote:
> What *exactly* is the logging setup you guys use, and have you
> tried alternatives?

redirect_stderr = on                        # Enable capturing of stderr into log
log_directory = '/users/.../logs'           # Directory where log files are written
log_filename = 'server_log.%a.%H:00-%H:59'  # Log file name pattern.
log_truncate_on_rotation = on               # If on, any existing log file of the same
log_rotation_age = 60min                    # Automatic rotation of logfiles will
log_min_messages = notice                   # Values, in order of decreasing detail:
log_connections = on
log_disconnections = on
log_duration = on
log_line_prefix = '%m [%p] %u %r %d '       # Special values:
log_statement = 'all'                       # none, ddl, mod, all

I have not tried alternatives. The status quo is adequate.

Ed
"George Pavlov" <gpavlov@mynewplace.com> writes:
>> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>> What *exactly* is the logging setup you guys use, and have you tried
>> alternatives?

> ... Also redirect_stderr = on.

Hm. Well, that's the bit that ought to get you into the PIPE_BUF exception. There's been some speculation that a change like the attached would help. I've found that it makes no difference with my libc, but maybe yours is different --- want to try it?

			regards, tom lane

*** src/backend/utils/error/elog.c.orig	Thu May  3 22:01:02 2007
--- src/backend/utils/error/elog.c	Fri Jun  1 18:22:49 2007
***************
*** 1788,1794 ****
  			write_eventlog(edata->elevel, buf.data);
  		else
  #endif
! 			fprintf(stderr, "%s", buf.data);
  	}

  	/* If in the syslogger process, try to write messages direct to file */
--- 1788,1794 ----
  			write_eventlog(edata->elevel, buf.data);
  		else
  #endif
! 			write(fileno(stderr), buf.data, buf.len);
  	}

  	/* If in the syslogger process, try to write messages direct to file */
***************
*** 2120,2125 ****
--- 2120,2126 ----
  #ifndef WIN32
  	/* On Unix, we just fprintf to stderr */
  	vfprintf(stderr, fmt, ap);
+ 	fflush(stderr);
  #else
  	/*
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> "George Pavlov" <gpavlov@mynewplace.com> writes:
> > ... Also redirect_stderr = on.
>
> Hm. Well, that's the bit that ought to get you into the PIPE_BUF
> exception. There's been some speculation that a change like the
> attached would help. I've found that it makes no difference with
> my libc, but maybe yours is different --- want to try it?

I will. I may need some time though, since I first need to find a way to reproduce the problem reliably in my test environments, and right now I cannot seem to. I have seen the problem mostly under production loads (also under certain kinds of stress-testing, but I forget exactly which kind...).

In the meantime I went and looked at the logs in more detail, and the corrupted entries seem much more prevalent than I originally thought. Apart from the ones pgFouine complains about, there are many more such lines. For example, out of an (average-load) day's log file with 17+ million lines, pgFouine complains about 8 lines, but there are in fact 1400+ lines with these kinds of entries.

George
I tried the patch and it has no effect whatsoever -- even with the patch, under the right load the corrupted entries come fast and furious (I found a load profile in my app that reproduces them very clearly). Here are a few other observations for what they are worth:

The problem is very easy to reproduce in my production-like environment: 16GB memory, 4 CPUs, Red Hat, only the DB running on that machine; the DB is accessed by 4 appservers running on 2 other machines, each appserver configured with up to 20 connections in its pool, incoming connections load-balanced among the appservers.

Conversely, the problem is very hard (but not impossible) to reproduce in a "lesser" environment: 4GB memory, 2 CPUs, Fedora Core, DB and 1 appserver running on the same machine (and competing for resources), the appserver still configured for up to 20 connections.

The problem only happens when I put a bit of a load on the application -- not necessarily a lot of connections, but a steady stream of requests per second. A few simulated users hammering on it without pauses produce at least one corrupted line every couple of seconds. So it seems related to how many processes are writing at the same time, uninterrupted.

Anything else I can do to diagnose?

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of George Pavlov
> Sent: Saturday, June 02, 2007 11:33 AM
> To: Tom Lane
> Cc: Ed L.; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] query log corrupted-looking entries
>
> > From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> > "George Pavlov" <gpavlov@mynewplace.com> writes:
> > > ... Also redirect_stderr = on.
> >
> > Hm. Well, that's the bit that ought to get you into the PIPE_BUF
> > exception. There's been some speculation that a change like the
> > attached would help. I've found that it makes no difference with
> > my libc, but maybe yours is different --- want to try it?
>
> I will. I may need some time though, since I first need to find a way
> to reproduce the problem reliably on my test environments and right
> now I cannot seem to. I have seen the problem mostly under production
> loads (also under certain kinds of stress-testing, but I forget
> exactly which kind...)
>
> In the meantime I went and looked at the logs in more detail and the
> corrupted entries seem much more prevalent than what I originally
> thought. Apart from the ones pgfouine complains about there are many
> more such lines. For example out of a (average-load) day's log file
> with 17+ million lines pgfouine complains about 8 lines, but there are
> in fact 1400+ lines with these kinds of entries.
>
> George
"George Pavlov" <gpavlov@mynewplace.com> writes:
> I tried the patch and it has no effect whatsoever -- even with the
> patch, under the correct load the corrupted entries are coming fast and
> furious (I found a load profile on my app that reproduces these very
> clearly).

What are the total lengths of the log entries in which you see the failure? (The "length" here includes all the lines belonging to a single logical entry, eg, ERROR, DETAIL, HINT.)

I believe that this shouldn't be happening except in the case in which the entry-interpolated-into exceeds PIPE_BUF bytes. I'm not entirely sure what PIPE_BUF is on Linux machines, but IIRC the Posix spec says it has to be at least 512, and on older Unix kernels it tends to be 8K.

			regards, tom lane
Tom Lane wrote:
> "George Pavlov" <gpavlov@mynewplace.com> writes:
> > I tried the patch and it has no effect whatsoever -- even with the
> > patch, under the correct load the corrupted entries are coming fast and
> > furious (I found a load profile on my app that reproduces these very
> > clearly).
>
> What are the total lengths of the log entries in which you see the
> failure? (The "length" here includes all the lines belonging to a
> single logical entry, eg, ERROR, DETAIL, HINT.)
>
> I believe that this shouldn't be happening except in the case in which
> the entry-interpolated-into exceeds PIPE_BUF bytes. I'm not entirely
> sure what PIPE_BUF is on Linux machines, but IIRC the Posix spec says it
> has to be at least 512, and on older Unix kernels it tends to be 8K.

4096, although for some archs it may be bigger (it's the page size).

-- 
Alvaro Herrera                       http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> What are the total lengths of the log entries in which you see the
> failure? (The "length" here includes all the lines belonging to a
> single logical entry, eg, ERROR, DETAIL, HINT.)

It is very hard to tease these apart, because now that I look at it closely it is a total mess: there are multiple interruptions, and interruptions inside of interruptions... The more I look at this the worse it seems, and I am actually impressed that pgFouine manages not to trip on most of these. It also makes me more suspicious of the logs, although the law of large numbers hopefully makes them at least directionally correct.

The interruption can happen anywhere, including inside the leading log timestamp, so I am looking at the full log entry when counting. So, counting from the beginning of the log entry, timestamp included (e.g. "2007-06-01 17:00:17 PDT [28955]: [1225-1] LOG: statement: PREPARE <unnamed> AS SELECT ..."), to the first interruption (the first start of another log entry in the midst of the original one, e.g. up to and including the "f" in "... left outer join f2007-06-01 17:00:19 PDT [28920]: [1595-1] LOG: statement: EXECUTE <unnamed> [PREPARE: SELECT PROPERTY_ID ..."), I see all sorts of lengths, both smaller and larger than 4096: interruptions at the 24th character, at the 399th, the 4050th, the 5063rd, etc.

As to the full length of the entries that get interrupted, they do all seem to be on the long side -- I can't say with total certainty, but the couple of dozen I looked at were all > 4096 once the interruptions are taken out. So I think I can say that the corruption happens within "long" entries, but it can happen anywhere within such an entry, and one entry can contain multiple interruptions.
On Friday 08 June 2007 10:30 am, George Pavlov wrote:
> It is very hard to tease these apart because now that I look
> at it closely it is a total mess; there are multiple
> interruptions and interruptions inside of interruptions...
> The interruption can happen anywhere, including the leading
> log timestamp

This describes my experience as well on our high-volume clusters.

Ed
On Fri, Jun 08, 2007 at 09:30:21AM -0700, George Pavlov wrote:
> As to the full length of the entries that get
> interrupted they do seem to be all on the long side--I can't say with
> total certainty, but the couple of dozen that I looked at were all >
> 4096 when all the interruptions are taken out. So I think I can say that
> I see corruptions happen within "long" entries, but they can happen
> anywhere within that long entry, and one can have multiple interruptions
> within one entry.

That sounds about right. There is a requirement that writes smaller than PIPE_BUF be atomic, but for larger writes there are no guarantees at all. I imagine the kernel notes that the write is large and proceeds to fill the last block of the buffer up to a round 4k. That would basically mean entries can get chopped into pieces of any size.

Incidentally, on systems I've worked with, logging went through syslog, and it seems to split long lines at 80 characters or so:

[528-1] SELECT fo...
[528-2] ... etc...

I always assumed it was postgres doing the splitting, but perhaps it was syslog all along.

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.