Thread: Add jsonlog log_destination for JSON server logs
Hi,
This patch adds a new log_destination, "jsonlog", that writes log entries as lines of JSON. It was originally started by David Fetter using the jsonlog module by Michael Paquier (https://github.com/michaelpq/pg_plugins/blob/master/jsonlog/jsonlog.c) as a basis for how to serialize the log messages. Thanks to both of them because this wouldn't be possible without that starting point.
The first commit splits out the destination in log pipe messages into its own field. Previously it would piggyback on the "is_last" field. This adds an int to the message size but makes the rest of the code easier to follow.
The second commit adds a TAP test for log_destination "csvlog". This was done to both confirm that the previous change didn't break anything and as a skeleton for the test in the next commit.
The third commit adds the new log_destination "jsonlog". The output format is one line per entry with the top level output being a JSON object keyed with the log fields. Newlines in the output fields are escaped as \n so the output file has exactly one line per log entry. It also includes a new test for verifying the JSON output with some basic regex checks (similar to the csvlog test).
Here's a sample of what the log entries look like:
{"timestamp":"2021-08-31 10:15:25.129 EDT","user":"sehrope","dbname":"postgres","pid":12012,"remote_host":"[local]","session_id":"612e397d.2eec","line_num":1,"ps":"idle","session_start":"2021-08-31 10:15:25 EDT","vxid":"3/2","txid":"0","error_severity":"LOG","application_name":"006_jsonlog.pl","message":"statement: SELECT 1/0"}
It builds and passes "make check-world" on Linux. It also includes code to handle Windows, but I have not actually tried building it there.
On Tue, Aug 31, 2021 at 11:34:56AM -0400, Sehrope Sarkuni wrote:
> The second commit adds a TAP test for log_destination "csvlog". This was
> done to both confirm that the previous change didn't break anything and as
> a skeleton for the test in the next commit.

+note "Before sleep";
+usleep(100_000);
+note "Before rotate";
+$node->logrotate();
+note "After rotate";
+usleep(100_000);

Do you really need a rotation of the log files here? Wouldn't it be
better to grab the position of the current log file with a fixed log
file name, and then slurp the file from this position with your
expected output? That would make the test faster, as well.

> The third commit adds the new log_destination "jsonlog". The output format
> is one line per entry with the top level output being a JSON object keyed
> with the log fields. Newlines in the output fields are escaped as \n so the
> output file has exactly one line per log entry. It also includes a new test
> for verifying the JSON output with some basic regex checks (similar to the
> csvlog test).

+ * Write logs in json format.
+ */
+static void
+write_jsonlog(ErrorData *edata)
+{

Rather than making elog.c larger, I think that we should try to split
that into more files. Why not refactoring out the CSV part first?
You could just call that csvlog.c, then create a new jsonlog.c for the
meat of the patch.

The list of fields is not up to date. At quick glance, you are
missing:
- backend type.
- leader PID.
- query ID.
- Session start timestamp (?)
--
Michael
On Tue, Aug 31, 2021 at 8:43 PM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Aug 31, 2021 at 11:34:56AM -0400, Sehrope Sarkuni wrote:
> The second commit adds a TAP test for log_destination "csvlog". This was
> done to both confirm that the previous change didn't break anything and as
> a skeleton for the test in the next commit.
+note "Before sleep";
+usleep(100_000);
+note "Before rotate";
+$node->logrotate();
+note "After rotate";
+usleep(100_000);
Do you really need a rotation of the log files here? Wouldn't it be
better to grab the position of the current log file with a fixed log
file name, and then slurp the file from this position with your
expected output? That would make the test faster, as well.
Yes, that was intentional. I used the log rotation tap test as a base and kept that piece in there so it verifies that the csv log files are actually rotated. Same for the json logs.
Rather than making elog.c larger, I think that we should try to split
that into more files. Why not refactoring out the CSV part first?
You could just call that csvlog.c, then create a new jsonlog.c for the
meat of the patch.
That's a good idea. I'll try that out.
The list of fields is not up to date. At quick glance, you are
missing:
- backend type.
- leader PID.
- query ID.
- Session start timestamp (?)
Thanks. I'll take a look.
Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/
Updated patch set is attached.
This version splits out the existing csvlog code into its own file and centralizes the common helpers into a new elog-internal.h so that they're only included by the actual write_xyz sources.
That makes the elog.c changes in the JSON logging patch minimal as all it's really doing is invoking the new write_jsonlog(...) function.
It also adds those missing fields to the JSON logger output.
On Wed, Sep 01, 2021 at 04:39:43PM -0400, Sehrope Sarkuni wrote:
> That makes the elog.c changes in the JSON logging patch minimal as all it's
> really doing is invoking the new write_jsonlog(...) function.

Looking at 0001, to do things in order.

> @@ -46,8 +46,8 @@ typedef struct
>     char    nuls[2];    /* always \0\0 */
>     uint16  len;        /* size of this chunk (counts data only) */
>     int32   pid;        /* writer's pid */
> -   char    is_last;    /* last chunk of message? 't' or 'f' ('T' or
> -                        * 'F' for CSV case) */
> +   int32   dest;       /* log destination */
> +   char    is_last;    /* last chunk of message? 't' or 'f' */
>     char    data[FLEXIBLE_ARRAY_MEMBER]; /* data payload starts here */
> } PipeProtoHeader;

Making PipeProtoHeader larger is not free, and that could penalize
workloads with a lot of short messages and many backends as the
syslogger relies on pipes with sync calls. Why not switching is_last
to bits8 flags instead? That should be enough for the addition of
JSON. 3 bits are enough at the end: one to know if it is the last
chunk of message, one for CSV and one for JSON.
--
Michael
On 9/8/21 2:58 AM, Michael Paquier wrote:
> On Wed, Sep 01, 2021 at 04:39:43PM -0400, Sehrope Sarkuni wrote:
>> That makes the elog.c changes in the JSON logging patch minimal as all it's
>> really doing is invoking the new write_jsonlog(...) function.
>
> Looking at 0001, to do things in order.
>
>> @@ -46,8 +46,8 @@ typedef struct
>>     char    nuls[2];    /* always \0\0 */
>>     uint16  len;        /* size of this chunk (counts data only) */
>>     int32   pid;        /* writer's pid */
>> -   char    is_last;    /* last chunk of message? 't' or 'f' ('T' or
>> -                        * 'F' for CSV case) */
>> +   int32   dest;       /* log destination */
>> +   char    is_last;    /* last chunk of message? 't' or 'f' */
>>     char    data[FLEXIBLE_ARRAY_MEMBER]; /* data payload starts here */
>> } PipeProtoHeader;
>
> Making PipeProtoHeader larger is not free, and that could penalize
> workloads with a lot of short messages and many backends as the
> syslogger relies on pipes with sync calls. Why not switching is_last
> to bits8 flags instead? That should be enough for the addition of
> JSON. 3 bits are enough at the end: one to know if it is the last
> chunk of message, one for CSV and one for JSON.

Yeah. A very simple change would be to use two different values for json
(say 'y' and 'n'). A slightly more principled scheme might use the top
bit for the end marker and the bottom 3 bits for the dest type (so up to
8 types possible), with the rest available for future use.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Wed, Sep 08, 2021 at 08:46:44AM -0400, Andrew Dunstan wrote:
> Yeah. A very simple change would be to use two different values for json
> (say 'y' and 'n'). A slightly more principled scheme might use the top
> bit for the end marker and the bottom 3 bits for the dest type (so up to
> 8 types possible), with the rest available for future use.

I was thinking to track stderr as a case where no bits are set in the
flags for the area of the destinations, but that's a bit crazy if we
have a lot of margin in what can be stored. I have looked at that and
finished with the attached which is an improvement IMO, especially
when it comes to the header validation.

FWIW, while looking at my own external module for the JSON logs, I
noticed that I used the chunk protocol when the log redirection is
enabled, but just enforced everything to be sent to stderr.
--
Michael
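(For reference, a rough standalone sketch of the flag layout being discussed: one "last chunk" bit plus one bit per destination, replacing the old 't'/'f'/'T'/'F' characters. The macro names and values are illustrative, not necessarily what was committed.)

#include <stdio.h>

typedef unsigned char bits8;

/* Illustrative flag layout for the pipe chunk header. */
#define PIPE_PROTO_IS_LAST       0x01
#define PIPE_PROTO_DEST_STDERR   0x10
#define PIPE_PROTO_DEST_CSVLOG   0x20
#define PIPE_PROTO_DEST_JSONLOG  0x40

int
main(void)
{
    bits8 flags = PIPE_PROTO_IS_LAST | PIPE_PROTO_DEST_JSONLOG;

    if (flags & PIPE_PROTO_IS_LAST)
        printf("last chunk of message\n");
    if (flags & PIPE_PROTO_DEST_JSONLOG)
        printf("destination: jsonlog\n");
    return 0;
}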
Fwiw I was shocked when I saw the t/f T/F kluge when I went to work on jsonlogging. That's the kind of dead-end short-sighted hack that just lays traps and barriers for future hackers to have to clean up before they can do the work they want to do. Please just put a "format" field (or "channel" field -- the logging daemon doesn't really care what format) with a list of defined formats that can easily be extended in the future. If you want to steal the high bit for "is last" and only allow 128 values instead of 256 so be it.
On Wed, Sep 08, 2021 at 10:58:51PM -0400, Greg Stark wrote:
> Please just put a "format" field (or "channel" field -- the logging
> daemon doesn't really care what format) with a list of defined formats
> that can easily be extended in the future. If you want to steal the
> high bit for "is last" and only allow 128 values instead of 256 so be
> it.

Which is what I just posted here:
https://www.postgresql.org/message-id/YTlunSciDRl1z7ik@paquier.xyz

Well, we could also do things so as we have two fields, as of something
like:

typedef struct
{
    [...]
    bits8   flags:4,
            format:4;
    [...]
} PipeProtoHeader;

I am not sure if this is an improvement in readability though, and
that's less consistent with the recent practice we've been trying to
follow with bitmasks and flag-like options.
--
Michael
On Thu, Sep 09, 2021 at 11:17:01AM +0900, Michael Paquier wrote:
> I was thinking to track stderr as a case where no bits are set in the
> flags for the area of the destinations, but that's a bit crazy if we
> have a lot of margin in what can be stored. I have looked at that and
> finished with the attached which is an improvement IMO, especially
> when it comes to the header validation.

One part that was a bit flaky after more consideration is that the
header validation would consider as correct the case where both stderr
and csvlog are set in the set of flags. I have finished by just using
pg_popcount() on one byte with a filter on the log destinations, making
the whole more robust.

If there are any objections, please let me know.
--
Michael
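(A minimal standalone sketch of that kind of validation, using GCC/Clang's __builtin_popcount rather than PostgreSQL's pg_popcount(), and the same illustrative flag names as in the earlier sketch:)

#include <stdio.h>
#include <stdbool.h>

typedef unsigned char bits8;

/* Illustrative destination bits in the chunk header flags. */
#define PIPE_PROTO_DEST_STDERR   0x10
#define PIPE_PROTO_DEST_CSVLOG   0x20
#define PIPE_PROTO_DEST_JSONLOG  0x40
#define PIPE_PROTO_DESTINATIONS \
    (PIPE_PROTO_DEST_STDERR | PIPE_PROTO_DEST_CSVLOG | PIPE_PROTO_DEST_JSONLOG)

/*
 * A header is only valid if exactly one destination bit is set; this is
 * what a popcount-based check buys over testing the bits one by one,
 * and it rejects "stderr and csvlog both set" outright.
 */
static bool
dest_flags_valid(bits8 flags)
{
    return __builtin_popcount(flags & PIPE_PROTO_DESTINATIONS) == 1;
}

int
main(void)
{
    printf("%d\n", dest_flags_valid(PIPE_PROTO_DEST_CSVLOG));   /* prints 1 */
    printf("%d\n", dest_flags_valid(PIPE_PROTO_DEST_STDERR |
                                    PIPE_PROTO_DEST_CSVLOG));   /* prints 0 */
    return 0;
}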
On Wed, Sep 01, 2021 at 04:39:43PM -0400, Sehrope Sarkuni wrote:
> This version splits out the existing csvlog code into its own file and
> centralizes the common helpers into a new elog-internal.h so that they're
> only included by the actual write_xyz sources.
>
> That makes the elog.c changes in the JSON logging patch minimal as all it's
> really doing is invoking the new write_jsonlog(...) function.
>
> It also adds those missing fields to the JSON logger output.

Forking a bit this thread while looking at 0002 that adds new tests
for csvlog. While I agree that it would be useful to have more
coverage with the syslogger message chunk protocol in this area, I
think that having a separate test is a waste of resources. Creating a
new node is not cheap either, and this adds more wait phases, making
the tests take longer. It would be much better to extend
004_logrotate.pl and update it to use log_destination = 'stderr,
csvlog', to minimize the number of nodes we create as well as the
additional amount of time we'd spend for those tests. Plugging in
JSON into that would not be complicated either once we have in place a
set of small routines that limit the code duplication between the
checks for each log destination type.
--
Michael
On Fri, Sep 10, 2021 at 01:07:00PM +0900, Michael Paquier wrote:
> Forking a bit this thread while looking at 0002 that adds new tests
> for csvlog. While I agree that it would be useful to have more
> coverage with the syslogger message chunk protocol in this area, I
> think that having a separate test is a waste of resources. Creating a
> new node is not cheap either, and this adds more wait phases, making
> the tests take longer. It would be much better to extend
> 004_logrotate.pl and update it to use log_destination = 'stderr,
> csvlog', to minimize the number of nodes we create as well as the
> additional amount of time we'd spend for those tests. Plugging in
> JSON into that would not be complicated either once we have in place a
> set of small routines that limit the code duplication between the
> checks for each log destination type.

And this part leads me to the attached, where the addition of the JSON
format would result in the addition of a couple of lines.
--
Michael
On Fri, Sep 10, 2021 at 03:56:18PM +0900, Michael Paquier wrote:
> And this part leads me to the attached, where the addition of the JSON
> format would result in the addition of a couple of lines.

Okay, I have worked through the first half of the patch set, and
applied the improved versions of 0001 (refactoring of the chunk
protocol) and 0002 (addition of the tests for csvlog).

I have not looked in details at 0003 and 0004 yet. Still, here are
some comments after a quick scan.

+ * elog-internal.h
I'd rather avoid the hyphen, and use elog_internal.h.

+#define FORMATTED_TS_LEN 128
+extern char formatted_start_time[FORMATTED_TS_LEN];
+extern char formatted_log_time[FORMATTED_TS_LEN];
+
+void setup_formatted_log_time(void);
+void setup_formatted_start_time(void);
We could just use a static buffer in each one of those routines, and
return back a pointer to the caller.

+ else if ((Log_destination & LOG_DESTINATION_JSONLOG) &&
+     (jsonlogFile == NULL ||
+      time_based_rotation || (size_rotation_for & LOG_DESTINATION_JSONLOG)))
[...]
+ /* Write to JSON log if enabled */
+ else if (Log_destination & LOG_DESTINATION_JSONLOG)
+ {
Those bits in 0004 are wrong. They should be two "if" clauses. This
is leading to issues when setting log_destination to multiple values
with jsonlog in the set of values and logging_collector = on, and the
logs are not getting redirected properly to the .json file. We would
get the data for the .log and .csv files though. This issue also
happens with the original patch set applied on top of e757080. I
think that we should also be more careful with the way we free
StringInfoData in send_message_to_server_log(), or we are going to
finish with unwelcome leaks.

The same coding pattern is now repeated three times in logfile_rotate():
- Check if a logging type is enabled.
- Optionally open new file, with logfile_open().
- Business with ENFILE and EMFILE.
- pfree() and save of the static FILE* and file name for each type.
I think that we have enough material for a wrapper routine that does
this work, where we pass down LOG_DESTINATION_* and pointers to the
static FILE* and the static last file names. That would have avoided
the bug I found above.

The rebased patch set, that has reworked the tests to be in line with
HEAD, also fails. They compile. Sehrope, could you adjust that?
--
Michael
On Sun, Sep 12, 2021 at 10:22 PM Michael Paquier <michael@paquier.xyz> wrote:
On Fri, Sep 10, 2021 at 03:56:18PM +0900, Michael Paquier wrote:
> And this part leads me to the attached, where the addition of the JSON
> format would result in the addition of a couple of lines.
Okay, I have worked through the first half of the patch set, and
applied the improved versions of 0001 (refactoring of the chunk
protocol) and 0002 (addition of the tests for csvlog).
Thanks. I finally got a chance to look through those changes. I like it. The popcount and pulling out the flags are much easier to follow than the old hard coded letters.
I have not looked in details at 0003 and 0004 yet. Still, here are
some comments after a quick scan.
+ * elog-internal.h
I'd rather avoid the hyphen, and use elog_internal.h.
+#define FORMATTED_TS_LEN 128
+extern char formatted_start_time[FORMATTED_TS_LEN];
+extern char formatted_log_time[FORMATTED_TS_LEN];
+
+void setup_formatted_log_time(void);
+void setup_formatted_start_time(void);
We could just use a static buffer in each one of those routines, and
return back a pointer to the caller.
+1
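(As a rough sketch of the pattern being suggested, with illustrative names only: the routine owns a static buffer and hands back a pointer, so nothing needs to be exported from the header.)

#include <stdio.h>
#include <time.h>

#define FORMATTED_TS_LEN 128

/*
 * Sketch: format the log timestamp into a buffer owned by the routine
 * and return a pointer to it, instead of exposing the array itself.
 */
static const char *
get_formatted_log_time(void)
{
    static char formatted_log_time[FORMATTED_TS_LEN];
    time_t      now = time(NULL);

    strftime(formatted_log_time, sizeof(formatted_log_time),
             "%Y-%m-%d %H:%M:%S %Z", localtime(&now));
    return formatted_log_time;
}

int
main(void)
{
    printf("%s\n", get_formatted_log_time());
    return 0;
}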
+ else if ((Log_destination & LOG_DESTINATION_JSONLOG) &&
+ (jsonlogFile == NULL ||
+ time_based_rotation || (size_rotation_for & LOG_DESTINATION_JSONLOG)))
[...]
+ /* Write to JSON log if enabled */
+ else if (Log_destination & LOG_DESTINATION_JSONLOG)
+ {
Those bits in 0004 are wrong. They should be two "if" clauses. This
is leading to issues when setting log_destination to multiple values
with jsonlog in the set of values and logging_collector = on, and the
logs are not getting redirected properly to the .json file. We would
get the data for the .log and .csv files though. This issue also
happens with the original patch set applied on top of e757080. I
think that we should also be more careful with the way we free
StringInfoData in send_message_to_server_log(), or we are going to
finish with unwelcome leaks.
Good catch. Staring at that piece again, that's tricky as it tries to aggressively free the buffer before calling write_csvlog(...). Which can't just be duplicated for additional destinations.
I think we need to pull up the negative case (i.e. syslogger not available) before the other destinations and if it matches, free and exit early. Otherwise, free the buffer and call whatever destination routines are enabled.
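(A compact standalone sketch of that control-flow idea, with stand-in names and flag values rather than the actual backend code:)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

/* Illustrative stand-ins for the backend pieces involved. */
typedef struct StringInfoData { char *data; } StringInfoData;
#define LOG_DESTINATION_CSVLOG   0x02
#define LOG_DESTINATION_JSONLOG  0x10

static int  Log_destination = LOG_DESTINATION_CSVLOG | LOG_DESTINATION_JSONLOG;
static bool redirection_done = true;

static void write_console(const char *msg)   { fputs(msg, stderr); }
static void write_csvlog_entry(void)  { /* per-destination CSV formatting */ }
static void write_jsonlog_entry(void) { /* per-destination JSON formatting */ }

/*
 * Sketch of the flow described above: handle the "syslogger not
 * available" case once, up front, then free the shared buffer and let
 * each enabled file destination do its own formatting.
 */
static void
send_message_sketch(StringInfoData *buf)
{
    if (!redirection_done)
    {
        /* negative case first: fall back to plain stderr and bail out */
        write_console(buf->data);
        free(buf->data);
        return;
    }

    free(buf->data);            /* shared buffer freed exactly once */

    if (Log_destination & LOG_DESTINATION_CSVLOG)
        write_csvlog_entry();
    if (Log_destination & LOG_DESTINATION_JSONLOG)
        write_jsonlog_entry();
}

int
main(void)
{
    StringInfoData buf;

    buf.data = strdup("example log line\n");
    send_message_sketch(&buf);
    return 0;
}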
The same coding pattern is now repeated three times in logfile_rotate():
- Check if a logging type is enabled.
- Optionally open new file, with logfile_open().
- Business with ENFILE and EMFILE.
- pfree() and save of the static FILE* and file name for each type.
I think that we have enough material for a wrapper routine that does
this work, where we pass down LOG_DESTINATION_* and pointers to the
static FILE* and the static last file names. That would have avoided
the bug I found above.
I started on a bit of this as well. There's so much overlap already between the syslog_ and csvlog code that I'm going to put that together first. Then the json addition should just fall into a couple of call sites.
I'm thinking of adding an internal struct to house the FILE* and file names. Then all the opening, closing, and log rotation code can pass pointers to those and centralize the pfree() and NULL checks.
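(Something along these lines, with purely illustrative field and type names rather than what the patch ends up calling them:)

#include <stdio.h>

/*
 * Illustrative sketch of the per-destination bookkeeping struct being
 * described: one instance per file-based destination, so open, close,
 * and rotation code can take a pointer instead of touching separate
 * static variables for stderr, csvlog, and jsonlog.
 */
typedef struct FileLogDest
{
    int         dest_flag;      /* LOG_DESTINATION_* bit this entry serves */
    const char *suffix;         /* ".log", ".csv", ".json" */
    FILE       *file;           /* currently open log file, or NULL */
    char       *last_file_name; /* last name written to current_logfiles */
} FileLogDest;

static void
close_log_dest(FileLogDest *dest)
{
    if (dest->file != NULL)
    {
        fclose(dest->file);
        dest->file = NULL;
    }
}

int
main(void)
{
    FileLogDest jsonlog_dest = {0x10 /* hypothetical bit */, ".json", NULL, NULL};

    close_log_dest(&jsonlog_dest);      /* no-op here, nothing open yet */
    return 0;
}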
The rebased patch set, that has reworked the tests to be in line with
HEAD, also fails. They compile.
Sehrope, could you adjust that?
Yes I'm looking to sync those up and address the above later this week.
On Mon, Sep 13, 2021 at 11:56:52PM -0400, Sehrope Sarkuni wrote:
> Good catch. Staring at that piece again, that's tricky as it tries to
> aggressively free the buffer before calling write_csvlog(...). Which can't
> just be duplicated for additional destinations.
>
> I think we need to pull up the negative case (i.e. syslogger not available)
> before the other destinations and if it matches, free and exit early.
> Otherwise, free the buffer and call whatever destination routines are
> enabled.

Yes, I got a similar impression.

> I started on a bit of this as well. There's so much overlap already between
> the syslog_ and csvlog code that I'm going to put that together first. Then
> the json addition should just fall into a couple of call sites.
>
> I'm thinking of adding an internal struct to house the FILE* and file
> names. Then all the opening, closing, and log rotation code can pass
> pointers to those and centralize the pfree() and NULL checks.

Agreed on both points (using a structure and doing the refactoring as
a first patch).
--
Michael
Attached three patches refactor the syslogger handling of file based destinations a bit ... and then a lot.
First patch adds a new constant to represent both file based destinations. This should make it easier to ensure additional destinations are handled in "For all file destinations..." situations (e.g. when we add the jsonlog destination).
Second patch refactors the file descriptor serialization and re-opening in EXEC_BACKEND forking. Previously the code was duplicated for both stderr and csvlog. Again, this should simplify adding new destinations as they'd just invoke the same helper. There's an existing comment about not handling failed opens in syslogger_parseArgs(...) and this patch doesn't fix that, but it does provide a single location to do so in the future.
The third patch adds a new internal (to syslogger.c) structure, FileLogDestination, for file based log destinations and modifies the existing handling for syslogFile and csvlogFile to defer to a bunch of helper functions. It also renames "syslogFile" to "stderr_file_info". Working through this patch, it was initially confusing that the stderr log file was named "syslogFile", the C file is named syslogger.c, and there's an entirely separate syslog (the message logging API) destination. I think this clears that up a bit.
The patches pass check-world on Linux. I haven't tested it on Windows but it does pass check-world with EXEC_BACKEND defined to try out the forking code paths. Definitely needs some review to ensure it's functionally correct for the different log files.
I haven't tried splitting the csvlog code out or adding the new jsonlog atop this yet. There's enough changes here that I'd like to get this settled first.
The previous patches failed on the cfbot Appveyor Windows build. That also makes me question whether I did the EXEC_BACKEND testing correctly as it should have caught that compile error. I'm going to look into that.
In the meantime the attached should correct that part of the build.
On Thu, Sep 16, 2021 at 05:27:20PM -0400, Sehrope Sarkuni wrote:
> Attached three patches refactor the syslogger handling of file based
> destinations a bit ... and then a lot.
>
> First patch adds a new constant to represent both file based destinations.
> This should make it easier to ensure additional destinations are handled in
> "For all file destinations..." situations (e.g. when we add the jsonlog
> destination).
>
> Second patch refactors the file descriptor serialization and re-opening in
> EXEC_BACKEND forking. Previously the code was duplicated for both stderr
> and csvlog. Again, this should simplify adding new destinations as they'd
> just invoke the same helper. There's an existing comment about not handling
> failed opens in syslogger_parseArgs(...) and this patch doesn't fix that,
> but it does provide a single location to do so in the future.
>
> The third patch adds a new internal (to syslogger.c) structure,
> FileLogDestination, for file based log destinations and modifies the
> existing handling for syslogFile and csvlogFile to defer to a bunch of
> helper functions. It also renames "syslogFile" to "stderr_file_info".
> Working through this patch, it was initially confusing that the stderr log
> file was named "syslogFile", the C file is named syslogger.c, and there's
> an entirely separate syslog (the message logging API) destination. I think
> this clears that up a bit.
>
> The patches pass check-world on Linux. I haven't tested it on Windows but
> it does pass check-world with EXEC_BACKEND defined to try out the forking
> code paths. Definitely needs some review to ensure it's functionally
> correct for the different log files.

Compilation with EXEC_BACKEND on Linux should work, so no need to
bother with Windows when it comes to that. There is a buildfarm
machine testing this configuration, for example.

> I haven't tried splitting the csvlog code out or adding the new jsonlog
> atop this yet. There's enough changes here that I'd like to get this
> settled first.

Makes sense.

I am not really impressed by 0001 and the introduction of
LOG_DESTINATIONS_WITH_FILES. So I would leave that out and keep the
list of destinations listed instead. And we are talking about two
places here, only within syslogger.c.

0002 looks like an improvement, but I think that 0003 makes the
readability of the code worse with the introduction of eight static
routines to handle all those cases.
file_log_dest_should_rotate_for_size(), file_log_dest_close(),
file_log_dest_check_rotate_for_size(), get_syslogger_file() don't
bring major improvements. On the contrary,
file_log_dest_write_metadata and file_log_dest_rotate seem to reduce a
bit the noise.
--
Michael
On Thu, Sep 16, 2021 at 9:36 PM Michael Paquier <michael@paquier.xyz> wrote:
I am not really impressed by 0001 and the introduction of
LOG_DESTINATIONS_WITH_FILES. So I would leave that out and keep the
list of destinations listed instead. And we are talking about two
places here, only within syslogger.c.
I've taken that out for now. The idea was to simplify the additions when jsonlog is added but can circle back to that later if it makes sense.
0002 looks like an improvement,
Nice. That's left unchanged (renamed to 0001 now).
but I think that 0003 makes the
readability of the code worse with the introduction of eight static
routines to handle all those cases.
file_log_dest_should_rotate_for_size(), file_log_dest_close(),
file_log_dest_check_rotate_for_size(), get_syslogger_file() don't
bring major improvements. On the contrary,
file_log_dest_write_metadata and file_log_dest_rotate seem to reduce a
bit the noise.
It was helpful to split them out while working on the patch but I see your point upon re-reading through the result.
Attached version (renamed to 002) adds only three static functions for checking rotation size, performing the actual rotation, and closing. Also one other to handle generating the logfile names with a suffix to handle the pfree-ing (wrapping logfile_open(...)).
The rest of the changes happen in place using the new structures.
Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/
On Fri, Sep 17, 2021 at 04:36:57PM -0400, Sehrope Sarkuni wrote:
> It was helpful to split them out while working on the patch but I see your
> point upon re-reading through the result.
>
> Attached version (renamed to 002) adds only three static functions for
> checking rotation size, performing the actual rotation, and closing. Also
> one other to handle generating the logfile names with a suffix to handle
> the pfree-ing (wrapping logfile_open(...)).
>
> The rest of the changes happen in place using the new structures.

I have looked at that in details, and found that the introduction of
FileLogDestination makes the code harder to follow, and that the
introduction of the file extension, the destination name and the
expected target destination LOG_DESTINATION_* had a limited impact
because they are used in few places. The last two useful pieces are
the FILE* handle and the last file name for current_logfiles.

Attached are updated patches. The logic of 0001 to refactor the fd
fetch/save logic when forking the syslogger in EXEC_BACKEND builds is
unchanged. I have tweaked the patch with more comments and different
routine names though. Patch 0002 refactors the main point that
introduced FileLogDestination by refactoring the per-destination file
rotation, not forgetting the fact that the file last name and handle
for stderr can never be cleaned up even if LOG_DESTINATION_STDERR is
disabled. Grepping after LOG_DESTINATION_CSVLOG in the code tree, I'd
be fine to live with this level of abstraction as each per-destination
change are grouped with each other so they are hard to miss.

0001 is in a rather commitable shape, and I have made the code
consistent with HEAD. However, I think that its handling of
_get_osfhandle() is clunky for 64-bit compilations as long is 32b in
WIN32 but intptr_t is platform-dependent as it could be 32b or 64b, so
atoi() would overflow if the handle is larger than INT_MAX for 64b
builds:
https://docs.microsoft.com/en-us/cpp/c-runtime-library/standard-types
This problem deserves a different thread.

It would be good for 0002 if an extra pair of eyes looks at it. While
on it, I have renamed the existing last_file_name to last_sys_file_name
in 0002 to make the naming more consistent with syslogFile. It is
independent of 0001, so it could be done first as well.
--
Michael
On Tue, Sep 28, 2021 at 12:30:10PM +0900, Michael Paquier wrote:
> 0001 is in a rather commitable shape, and I have made the code
> consistent with HEAD. However, I think that its handling of
> _get_osfhandle() is clunky for 64-bit compilations as long is 32b in
> WIN32 but intptr_t is platform-dependent as it could be 32b or 64b, so
> atoi() would overflow if the handle is larger than INT_MAX for 64b
> builds:
> https://docs.microsoft.com/en-us/cpp/c-runtime-library/standard-types
> This problem deserves a different thread.

This happens to not be a problem as only 32 bits are significant for
handles for both Win32 and Win64. This also means that we should be
able to remove the use for "long" in this code, making the routines
more symmetric. I have done more tests with Win32 and Win64, and
applied it. I don't have MinGW environments at hand, but looking at
the upstream code that should not matter. The buildfarm will let
us know soon enough if there is a problem thanks to the TAP tests of
pg_ctl.
--
Michael
On Wed, Sep 29, 2021 at 11:02:10AM +0900, Michael Paquier wrote:
> This happens to not be a problem as only 32 bits are significant for
> handles for both Win32 and Win64. This also means that we should be
> able to remove the use for "long" in this code, making the routines
> more symmetric. I have done more tests with Win32 and Win64, and
> applied it. I don't have MinGW environments at hand, but looking at
> the upstream code that should not matter. The buildfarm will let
> us know soon enough if there is a problem thanks to the TAP tests of
> pg_ctl.

So, I have been looking at the rest of the patch set for the last
couple of days, and I think that I have spotted all the code paths
that need to be smarter when it comes to multiple file-based log
destinations. Attached is a new patch set:
- 0001 does some refactoring of the file rotation in syslogger.c,
that's the same patch as previously posted.
- 0002 is more refactoring of elog.c, adding routines for the start
timestamp, log timestamp, the backend type and an extra one to check
if a query can be logged or not.
- 0003 is a change to send_message_to_server_log() to be smarter
regarding the fallback to stderr if a csvlog (or a jsonlog!) entry
cannot be logged because the redirection is not ready yet. The code
of HEAD processes first stderr, then csvlog, with csvlog moving back
to stderr if not done yet. That's a bit strange, because for example
on WIN32 we would lose any csvlog entry for a service. I propose here
to do csvlog first, and fall back to stderr so as it gets done in one
code path instead of two. I have spent quite a bit of time thinking
about the best way to handle the case of multiple file log
destinations here because we don't want to log multiple times to
stderr if csvlog and jsonlog are both enabled. And I think that this
is the simplest thing we could do.
- 0004 moves the CSV-specific code into its own file. This includes
some refactoring of elog.c that should be moved to 0002, as this
requires more routines of elog.c to be published:
-- write_pipe_chunks()
-- error_severity()
- 0005 is the main meat, that introduces JSON as log_destination.
This compiles and passes all my tests, but I have not really done an
in-depth review of this code yet.

0002 and 0004 could be more polished and most of their pieces had
better be squashed together. 0003, though, would improve the case of
WIN32 where only csvlog is enabled so as log entries are properly
redirected to the event logs if the redirection is not done yet. I'd
like to move on with 0001 and 0003 as independent pieces.

Sehrope, any thoughts?
--
Michael
On 10/5/21, 12:22 AM, "Michael Paquier" <michael@paquier.xyz> wrote:
> - 0001 does some refactoring of the file rotation in syslogger.c,
> that's the same patch as previously posted.

My compiler is unhappy with 5c6e33f:

syslogger.c: In function ‘logfile_rotate_dest’:
syslogger.c:1302:11: warning: ‘logFileExt’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   filename = logfile_getname(fntime, logFileExt);
   ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The attached patch seems to fix it.

Nathan
On Thu, Oct 07, 2021 at 06:00:08AM +0000, Bossart, Nathan wrote:
> The attached patch seems to fix it.

Thanks, sorry about that. I was able to see that once I compiled
without assertions.
--
Michael
On Tue, Oct 05, 2021 at 04:18:17PM +0900, Michael Paquier wrote:
> 0002 and 0004 could be more polished and most of their pieces had
> better be squashed together. 0003, though, would improve the case of
> WIN32 where only csvlog is enabled so as log entries are properly
> redirected to the event logs if the redirection is not done yet. I'd
> like to move on with 0001 and 0003 as independent pieces.

0001 and 0003 have been applied independently, attached is a rebased
version.
--
Michael
On Fri, Oct 08, 2021 at 12:28:58PM +0900, Michael Paquier wrote:
> 0001 and 0003 have been applied independently, attached is a rebased
> version.

Attached are rebased versions of the patch set, where I have done a
cleaner split:
- 0001 includes all the refactoring of the routines from elog.c.
- 0002 moves csv logging into its own file.
- 0003 introduces the JSON log.

0001 and 0002, the refactoring bits, are in a rather committable
shape, so I'd like to apply that as the last refactoring pieces I know
of for this thread. 0003 still needs a closer lookup, and one part I
do not like much in it is the split for [u]int and long values when it
comes to key and values.
--
Michael
On Tue, Oct 19, 2021 at 08:02:02PM +0900, Michael Paquier wrote:
> 0001 and 0002, the refactoring bits, are in a rather committable
> shape, so I'd like to apply that as the last refactoring pieces I know
> of for this thread. 0003 still needs a closer lookup, and one part I
> do not like much in it is the split for [u]int and long values when it
> comes to key and values.

I have finally come around 0003 and reviewed it. There were a couple
of issues within it, from complications in the code that did not feel
necessary to incorrect handling of the values logged, mostly around
when values should be escaped or not. jsonlog.c has been reorganized
so as its fields match with csvlog.c, and I have simplified the APIs
in charge of saving the integers into a single one with an argument
list and an option to control if the value should be escaped or not.
postgresql.conf.sample also needed a refresh.

I have also spent some time on the documentation, where the list of
JSON keys with their descriptions and types has been changed to a
table, for clarity. The list was a bit incorrect (incorrect fields
and missing entries), so that should hopefully be clean now.

Patch 0003 has been heavily reworked, and it would be good to have an
extra pair of eyes on it. So I have switched the CF entry as "Needs
Review" and added my name to the list of authors (originally this
stuff took code portions of own module, as well).
--
Michael
Hi,

On 2021-11-10 22:44:49 +0900, Michael Paquier wrote:
> Patch 0003 has been heavily reworked, and it would be good to have an
> extra pair of eyes on it. So I have switched the CF entry as "Needs
> Review" and added my name to the list of authors (originally this
> stuff took code portions of own module, as well).

The tests don't seem to pass on windows:
https://cirrus-ci.com/task/5412456754315264?logs=test_bin#L47
https://api.cirrus-ci.com/v1/artifact/task/5412456754315264/tap/src/bin/pg_ctl/tmp_check/log/regress_log_004_logrotate

psql:<stdin>:1: ERROR: division by zero
could not open "c:/cirrus/src/bin/pg_ctl/tmp_check/t_004_logrotate_primary_data/pgdata/current_logfiles": The system cannot find the file specified at t/004_logrotate.pl line 87.

Greetings,

Andres Freund
On Sun, Jan 02, 2022 at 01:34:45PM -0800, Andres Freund wrote:
> The tests don't seem to pass on windows:
> https://cirrus-ci.com/task/5412456754315264?logs=test_bin#L47
> https://api.cirrus-ci.com/v1/artifact/task/5412456754315264/tap/src/bin/pg_ctl/tmp_check/log/regress_log_004_logrotate
>
> psql:<stdin>:1: ERROR: division by zero
> could not open "c:/cirrus/src/bin/pg_ctl/tmp_check/t_004_logrotate_primary_data/pgdata/current_logfiles": The system cannot find the file specified at t/004_logrotate.pl line 87.

This seems to point out that the syslogger is too slow to capture the
logrotate signal, and the patch set is introducing nothing new in
terms of infrastructure, just an extra value for log_destination.
This stuff passes here, and I am not spotting something amiss after an
extra close read.

Attached is an updated patch set that increases the test timeout (5min
-> 10min). That should help, I assume.
--
Michael
On 1/5/22 02:32, Michael Paquier wrote:
> On Sun, Jan 02, 2022 at 01:34:45PM -0800, Andres Freund wrote:
>> The tests don't seem to pass on windows:
>> https://cirrus-ci.com/task/5412456754315264?logs=test_bin#L47
>> https://api.cirrus-ci.com/v1/artifact/task/5412456754315264/tap/src/bin/pg_ctl/tmp_check/log/regress_log_004_logrotate
>>
>> psql:<stdin>:1: ERROR: division by zero
>> could not open "c:/cirrus/src/bin/pg_ctl/tmp_check/t_004_logrotate_primary_data/pgdata/current_logfiles": The system cannot find the file specified at t/004_logrotate.pl line 87.
>
> This seems to point out that the syslogger is too slow to capture the
> logrotate signal, and the patch set is introducing nothing new in
> terms of infrastructure, just an extra value for log_destination.
> This stuff passes here, and I am not spotting something amiss after an
> extra close read.
>
> Attached is an updated patch set that increases the test timeout (5min
> -> 10min). That should help, I assume.

ITYM 3 min -> 6 min. But in any case, is that really going to solve
this? The file should exist, even if its contents are not up to date, AIUI.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On 1/6/22 13:06, Andrew Dunstan wrote:
> On 1/5/22 02:32, Michael Paquier wrote:
>> On Sun, Jan 02, 2022 at 01:34:45PM -0800, Andres Freund wrote:
>>> The tests don't seem to pass on windows:
>>> https://cirrus-ci.com/task/5412456754315264?logs=test_bin#L47
>>> https://api.cirrus-ci.com/v1/artifact/task/5412456754315264/tap/src/bin/pg_ctl/tmp_check/log/regress_log_004_logrotate
>>>
>>> psql:<stdin>:1: ERROR: division by zero
>>> could not open "c:/cirrus/src/bin/pg_ctl/tmp_check/t_004_logrotate_primary_data/pgdata/current_logfiles": The system cannot find the file specified at t/004_logrotate.pl line 87.
>> This seems to point out that the syslogger is too slow to capture the
>> logrotate signal, and the patch set is introducing nothing new in
>> terms of infrastructure, just an extra value for log_destination.
>> This stuff passes here, and I am not spotting something amiss after an
>> extra close read.
>>
>> Attached is an updated patch set that increases the test timeout (5min
>> -> 10min). That should help, I assume.
>
> ITYM 3 min -> 6 min. But in any case, is that really going to solve
> this? The file should exist, even if its contents are not up to date, AIUI.

I have tested on an msys2 setup with your v8 patches and I am getting this:

# Failed test 'current_logfiles is sane'
# at t/004_logrotate.pl line 96.
#   'stderr log/postgresql-2022-01-06_222419.log
# csvlog log/postgresql-2022-01-06_222419.csv
# '
#   doesn't match '(?^:^stderr log/postgresql-.*log
# csvlog log/postgresql-.*csv
# jsonlog log/postgresql-.*json$)'
# Failed test 'found expected log file content for stderr'
# at t/004_logrotate.pl line 103.
#   ''
#   doesn't match '(?^:division by zero)'
# Failed test 'found expected log file content for jsonlog'
# at t/004_logrotate.pl line 105.
#   undef
#   doesn't match '(?^:division by zero)'
# Failed test 'pg_current_logfile() gives correct answer with jsonlog'
# at t/004_logrotate.pl line 105.
#   got: ''
#   expected: undef
# Looks like you failed 4 tests of 14.
[22:37:31] t/004_logrotate.pl ...
Dubious, test returned 4 (wstat 1024, 0x400)
Failed 4/14 subtests
[22:37:31]

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Thu, Jan 06, 2022 at 06:28:26PM -0500, Andrew Dunstan wrote:
> I have tested on an msys2 setup with your v8 patches and I am getting this:
>
> # Failed test 'current_logfiles is sane'
> # at t/004_logrotate.pl line 96.
> #   'stderr log/postgresql-2022-01-06_222419.log
> # csvlog log/postgresql-2022-01-06_222419.csv

Yes, I was waiting for the latest results, but that did not help at
all. Something is wrong with the patch, I am not sure what yet, but
that seems like a mistake in the backend part of it rather than the
tests. I have switched the CF entry as waiting on author until this
is addressed.
--
Michael
On Fri, Jan 07, 2022 at 03:49:47PM +0900, Michael Paquier wrote:
> Yes, I was waiting for the latest results, but that did not help at
> all. Something is wrong with the patch, I am not sure what yet, but
> that seems like a mistake in the backend part of it rather than the
> tests. I have switched the CF entry as waiting on author until this
> is addressed.

The issue comes from an incorrect change in syslogger_parseArgs()
where I missed that the incrementation of argv by 3 has no need to be
changed. A build with -DEXEC_BACKEND is enough to show the failure,
which caused a crash when starting up the syslogger because of a NULL
pointer dereference. The attached v9 should be enough to switch the
CF bot to green.
--
Michael
On 1/10/22, 4:51 AM, "Michael Paquier" <michael@paquier.xyz> wrote:
> The issue comes from an incorrect change in syslogger_parseArgs()
> where I missed that the incrementation of argv by 3 has no need to be
> changed. A build with -DEXEC_BACKEND is enough to show the failure,
> which caused a crash when starting up the syslogger because of a NULL
> pointer dereference. The attached v9 should be enough to switch the
> CF bot to green.

I've been looking at the latest patch set intermittently and playing
around with jsonlog a little. It seems to work well, and I don't have
any significant comments about the code. 0001 and 0002 seem
straightforward and uncontroversial. IIUC 0003 simply introduces
jsonlog using the existing framework.

I wonder if we should consider tracking each log destination as a set
of function pointers. The main logging code would just loop through
the enabled log destinations and use these functions, and it otherwise
would be completely detached (i.e., no "if jsonlog" blocks). This
might open up the ability to define custom log destinations via
modules, too. However, I don't know if there's any real demand for
something like this, and it should probably be done separately from
introducing jsonlog, anyway.

Nathan
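(For what it's worth, here is a standalone sketch of what that could look like. It is purely hypothetical, not something any of the patches implement; the type, names, and flag values are made up for illustration.)

#include <stdio.h>

/* Hypothetical per-destination writer interface. */
typedef struct LogDestinationRoutine
{
    const char *name;
    int         dest_flag;                  /* matching LOG_DESTINATION_* bit */
    void      (*write_entry) (const char *message);
} LogDestinationRoutine;

static void write_stderr_entry(const char *m)  { fprintf(stderr, "%s\n", m); }
static void write_csvlog_entry(const char *m)  { /* format as CSV ... */ (void) m; }
static void write_jsonlog_entry(const char *m) { /* format as JSON ... */ (void) m; }

static const LogDestinationRoutine routines[] = {
    {"stderr",  0x01, write_stderr_entry},
    {"csvlog",  0x02, write_csvlog_entry},
    {"jsonlog", 0x10, write_jsonlog_entry},
};

int
main(void)
{
    int log_destination = 0x01 | 0x10;  /* stderr + jsonlog enabled */

    /* The main logging code would just loop, with no per-format branches. */
    for (size_t i = 0; i < sizeof(routines) / sizeof(routines[0]); i++)
    {
        if (log_destination & routines[i].dest_flag)
            routines[i].write_entry("example log message");
    }
    return 0;
}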
On Tue, Jan 11, 2022 at 08:34:26PM +0000, Bossart, Nathan wrote:
> I've been looking at the latest patch set intermittently and playing
> around with jsonlog a little. It seems to work well, and I don't have
> any significant comments about the code. 0001 and 0002 seem
> straightforward and uncontroversial.

Thanks. I have looked again at 0001 and 0002 today and applied both,
so it means that we are done with all the refactoring pieces proposed
up to now.

> IIUC 0003 simply introduces jsonlog using the existing framework.

This part will have to wait a bit more, but yes, this piece should be
straight-forward.

> I wonder if we should consider tracking each log destination as a set
> of function pointers. The main logging code would just loop through
> the enabled log destinations and use these functions, and it otherwise
> would be completely detached (i.e., no "if jsonlog" blocks). This
> might open up the ability to define custom log destinations via
> modules, too. However, I don't know if there's any real demand for
> something like this, and it should probably be done separately from
> introducing jsonlog, anyway.

I am not sure that this is worth the complications, either.
--
Michael
On Wed, Jan 12, 2022 at 03:27:19PM +0900, Michael Paquier wrote:
> This part will have to wait a bit more, but yes, this piece should be
> straight-forward.

Okay, this last piece has been applied this morning, after more review
and a couple of adjustments, mainly cosmetic (pg_current_logfile
missed a refresh, incorrect copyright in jsonlog.c, etc.). Let's see
what the buildfarm thinks.
--
Michael
On Mon, Jan 17, 2022 at 10:48:06AM +0900, Michael Paquier wrote:
> Okay, this last piece has been applied this morning, after more review
> and a couple of adjustments, mainly cosmetic (pg_current_logfile
> missed a refresh, incorrect copyright in jsonlog.c, etc.). Let's see
> what the buildfarm thinks.

By the way, while on it, using directly COPY to load the logs from a
generated .json file can be trickier than it looks, as backslashes
require an extra escape when loading the data. One idea, while not
the best performance-wise, is to rely on COPY FROM PROGRAM with
commands like that:

CREATE TABLE logs (data jsonb);
COPY logs FROM PROGRAM 'cat logs.json | sed ''s/\\/\\\\/g''';
--
Michael
So, thinking about this, there is one important piece that is missing
here, which is the ability to change the default format for what we
write to stderr. Right now, if you have stderr output, it is always in
the "plain multiline" format, with no option to change it. If you want
a JSON log, you have to read a file. But ISTM it would be pretty useful
if you could say "log_default_format=json" and get the log that we get
in stderr in the JSON format instead.

From what I hear in the container world, what they would *prefer* (but
they don't often get) is to receive the JSON-format logs directly in
stderr from the daemons they run; they capture stderr and they have the
logs just in the format they need, without having to open the log files,
parsing the lines to rewrite in a different format as is done currently.

I think this would be a relatively easy patch to do. Opinions?

--
Álvaro Herrera          Valdivia, Chile  —  https://www.EnterpriseDB.com/
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> So, thinking about this, there is one important piece that is missing
> here, which is the ability to change the default format for what we
> write to stderr. Right now, if you have stderr output, it is always in
> the "plain multiline" format, with no option to change it. If you want
> a JSON log, you have to read a file. But ISTM it would be pretty useful
> if you could say "log_default_format=json" and get the log that we get
> in stderr in the JSON format instead.

> From what I hear in the container world, what they would *prefer* (but
> they don't often get) is to receive the JSON-format logs directly in
> stderr from the daemons they run; they capture stderr and they have the
> logs just in the format they need, without having to open the log files,
> parsing the lines to rewrite in a different format as is done currently.

> I think this would be a relatively easy patch to do. Opinions?

I think assuming that everything that comes out on the postmaster's
stderr is generated by our code is hopelessly naive. See for example
glibc's bleats when it detects malloc corruption, or when loading a
shlib fails. So I don't believe something like this can be made to
work reliably.

The existing syslogger logic does have the ability to cope with such
out-of-protocol data. So maybe, if you are using syslogger, you could
have it transform such messages into some lowest-common-denominator
jsonlog format. But it's not going to work to expect that to happen
with raw stderr.

regards, tom lane
On Thu, Feb 10, 2022 at 07:45:17PM -0300, Alvaro Herrera wrote:
> From what I hear in the container world, what they would *prefer* (but
> they don't often get) is to receive the JSON-format logs directly in
> stderr from the daemons they run; they capture stderr and they have the
> logs just in the format they need, without having to open the log files,
> parsing the lines to rewrite in a different format as is done currently.

Yes, I have been pinged about that, which is why there are still cases
for my out-of-core extension jsonlog that uses the elog hook.

> I think this would be a relatively easy patch to do. Opinions?

The postmaster goes through a couple of loops with the fd to open for
the default format, that the syslogger inherits from the postmaster,
and I am pretty sure that there are a couple of code paths around the
postmaster startup that can be tricky to reason about. Making the new
parameter PGC_POSTMASTER makes things easier to handle, still the
postmaster generates a couple of LOG entries and redirects them to
stderr before loading any GUC values, which would mean that we cannot
make sure that all the logs are valid JSON objects. If we want to be
100% waterproof here, we may want to track down the format to use by
default with a mean different than a GUC for the postmaster startup?
A file holding this information in the root of the data folder would
be one way.
--
Michael