Thread: Change the csv log to 'key:value' to make it easier for users to understand and process the logs

Dear Hackers,

While recently auditing a PostgreSQL database, I found that with the log destination configured as csvlog, the output looks like this:

"database ""lp_db1"" does not exist",,,,,"DROP DATABASE lp_db1;",,"dropdb, dbcommands.c:841","","client backend",,0

It is very inconvenient to work out the real meaning of each field. Moreover, in the log content " is escaped as "", which is not friendly to regular-expression matching. I would therefore like to modify the csvlog code to emit a key:value format, set fields that have no content to NULL, and escape " as \" in the log content. After the modification, the log above would look like this:

Log_time:"2022-03-15 09:17:55.289 CST",User_name:"postgres",Database_name:"lp_db",Process_id:"17995",Remote_host:"192.168.88.130",Remote_port:"38402",Line_number:"622fe941.464b",PS_display:"DROP DATABASE",Session_start_timestamp:"2022-03-15 09:17:53 CST",Virtual_transaction_id:"3/2",Transaction_id:"NULL",Error_severity:"ERROR",SQL_state_code:"3D000",Errmessage:"database \"lp_db1\" does not exist",Errdetail:"NULL",Errhint:"NULL",Internal_query:"NULL",Internal_pos:"0",Errcontext:"NULL",User_query:"DROP DATABASE lp_db1;",Cursorpos:"NULL",File_location:"dropdb, dbcommands.c:841",Application_name:"NULL",Backend_type:"client backend",Leader_PID:"0",Query_id:"0"
Regards,

--
-lupeng

Hi,

On Tue, Mar 15, 2022 at 09:31:19AM +0800, lupeng wrote:
>
> When I audit the Postgresql database recently, I found that after configuring
> the log type as csv, the output log content is as follows: "database
> ""lp_db1"" does not exist",,,,,"DROP DATABASE lp_db1;",,"dropdb,
> dbcommands.c:841","","client backend",,0 It is very inconvenient to
> understand the real meaning of each field. And in the log content," is
> escaped as "", which is not friendly to regular expression matching.
> Therefore, I want to modify the csv log function, change its format to
> key:value, assign the content of the non-existing field to NULL, and at the
> same time, " will be escaped as  \" in the log content. After the
> modification, the above log format is as follows: Log_time:"2022-03-15
> 09:17:55.289
> CST",User_name:"postgres",Database_name:"lp_db",Process_id:"17995", [...]

This would make the logs a lot more verbose, and a lot harder to process
with tools intended for CSV files.
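For what it's worth, the doubled-quote escaping that the original post finds regex-unfriendly is exactly what any standard CSV parser undoes for you, so no regular expressions are needed at all. A minimal sketch in Python (the sample line is the one from the post above):

```python
import csv
import io

# One csvlog line from the post above (empty positions are absent fields).
line = ('"database ""lp_db1"" does not exist",,,,,'
        '"DROP DATABASE lp_db1;",,"dropdb, dbcommands.c:841",'
        '"","client backend",,0\n')

# csv.reader undoes the "" escaping automatically; the embedded
# quotes come back as plain " characters.
row = next(csv.reader(io.StringIO(line)))
print(row[0])  # database "lp_db1" does not exist
print(row[5])  # DROP DATABASE lp_db1;
```

The point being that the escaping rule is a property of the CSV format, not something each consumer has to reimplement.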

You should consider using the newly introduced jsonlog format (as soon as pg15
is released), which seems closer to what you want.
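A jsonlog line is one self-describing JSON object per line, so standard tooling gets named fields for free without repeating a fixed schema positionally. A rough sketch (the line and its key names here are illustrative approximations, not the exact jsonlog key set; check the pg15 documentation):

```python
import json

# Hypothetical jsonlog-style line; key names are illustrative only.
line = ('{"timestamp":"2022-03-15 09:17:55.289 CST",'
        '"error_severity":"ERROR",'
        '"message":"database \\"lp_db1\\" does not exist"}')

rec = json.loads(line)
# Absent fields are simply missing keys -- no positional counting,
# and embedded quotes need no special regex handling.
print(rec["error_severity"], rec["message"])
```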



On Tue, Mar 15, 2022 at 09:31:19AM +0800, lupeng wrote:
> Dear Hackers
> When I audit the Postgresql database recently, I found that after
> configuring the log type as csv, the output log content is as follows:
> "database ""lp_db1"" does not exist",,,,,"DROP DATABASE
> lp_db1;",,"dropdb, dbcommands.c:841","","client backend",,0 It is very
> inconvenient to understand the real meaning of each field. And in the
> log content," is escaped as "", which is not friendly to regular
> expression matching. Therefore, I want to modify the csv log function,
> change its format to key:value, assign the content of the non-existing
> field to NULL, and at the same time, " will be escaped as  \" in the
> log content. After the modification, the above log format is as
> follows: Log_time:"2022-03-15 09:17:55.289
>
> CST",User_name:"postgres",Database_name:"lp_db",Process_id:"17995",Remote_host:"192.168.88.130",Remote_port:"38402",Line_number:
> "622fe941.464b",PS_display:"DROP
> DATABASE",Session_start_timestamp:"2022-03-15 09:17:53
> CST",Virtual_transaction_id:"3/2",Transaction_id:"NULL",Error_severity:"ERROR",SQL_state_code
> :"3D000",Errmessage:"database \"lp_db1\" does not
> exist",Errdetail:"NULL",Errhint:"NULL",Internal_query:"NULL",Internal_pos:"0",Errcontext:"NULL",User_query
> :"DROP DATABASE lp_db1;",Cursorpos:"NULL",File_location:"dropdb,
> dbcommands.c:841",Application_name:"NULL",Backend_type:"client
> backend",Leader_PID:"0",Query_id:"0"

CSV format is well documented
(https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG).

If you want named fields you can wait for pg15 and its jsonlog
(https://www.depesz.com/2022/01/17/waiting-for-postgresql-15-introduce-log_destinationjsonlog/).

I, for one, wouldn't want to have to deal with field names repeated in
every single record.
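For completeness: since the column order is fixed and documented, a client can attach names itself without any server-side format change. A minimal sketch, using my reading of the documented csvlog column list for recent versions (double-check it against your server's major version, since the list grows over time):

```python
import csv
import io

# csvlog column names as documented for recent PostgreSQL versions;
# verify against your server's docs before relying on this list.
COLUMNS = [
    "log_time", "user_name", "database_name", "process_id",
    "connection_from", "session_id", "session_line_num", "command_tag",
    "session_start_time", "virtual_transaction_id", "transaction_id",
    "error_severity", "sql_state_code", "message", "detail", "hint",
    "internal_query", "internal_query_pos", "context", "query",
    "query_pos", "location", "application_name", "backend_type",
    "leader_pid", "query_id",
]

def rows_with_names(fileobj):
    """Yield each csvlog record as a dict keyed by column name."""
    for row in csv.reader(fileobj):
        yield dict(zip(COLUMNS, row))

# Truncated sample record, just the first few columns.
sample = '"2022-03-15 09:17:55.289 CST","postgres","lp_db","17995"\n'
rec = next(rows_with_names(io.StringIO(sample)))
print(rec["database_name"])  # lp_db
```

That keeps the names in exactly one place on the consumer side instead of in every record on disk.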

depesz



On 3/15/22 09:30, hubert depesz lubaczewski wrote:
> On Tue, Mar 15, 2022 at 09:31:19AM +0800, lupeng wrote:
>> Dear Hackers
>> When I audit the Postgresql database recently, I found that after
>> configuring the log type as csv, the output log content is as follows:
>> [...]
> CSV format is well documented
> (https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG).
>
> If you want named fields you can wait for pg15 and its jsonlog
> (https://www.depesz.com/2022/01/17/waiting-for-postgresql-15-introduce-log_destinationjsonlog/).
>
> I, for one, wouldn't want to have to deal with field names repeated in
> every single record.
>

Indeed. And even if this were a good idea, which it's not, it would be
15 years too late.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




On 3/15/22 10:12, Andrew Dunstan wrote:
> 
> On 3/15/22 09:30, hubert depesz lubaczewski wrote:
>> On Tue, Mar 15, 2022 at 09:31:19AM +0800, lupeng wrote:
>>> Dear Hackers
>>> When I audit the Postgresql database recently, I found that after
>>> configuring the log type as csv, the output log content is as follows:
>>> [...]
>> CSV format is well documented
>> (https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG).
>>
>> If you want named fields you can wait for pg15 and its jsonlog
>> (https://www.depesz.com/2022/01/17/waiting-for-postgresql-15-introduce-log_destinationjsonlog/).
>>
>> I, for one, wouldn't want to have to deal with field names repeated in
>> every single record.
>>
> 
> Indeed. And even if this were a good idea, which it's not, it would be
> 15 years too late.

Also, the CSV format, while human-readable to a degree, wasn't meant for 
direct human consumption. It was meant to be read by programs, and at 
the time CSV made the most sense.


Regards, Jan



On Tue, Mar 15, 2022 at 10:33:42AM -0400, Jan Wieck wrote:
> Also, the CSV format, while human readable to a degree, wasn't meant for
> direct, human consumption. It was meant to be read by programs and at the
> time, CSV made the most sense.

FWIW, I have noticed that this patch was still listed in the next CF,
with a reference to an incorrect thread:
https://commitfest.postgresql.org/38/3591/

I have updated the CF entry to point to this thread, and since it is
clear that csvlog is not going to change now, the patch status has been
switched to Rejected.
--
Michael
