Change the csv log to 'key:value' to facilitate the user to understanding and processing of logs

Started by lupeng · almost 4 years ago · 6 messages
#1 lupeng
lpmstsc@foxmail.com

Dear Hackers

While auditing a PostgreSQL database recently, I found that after configuring the log destination as csvlog, the output log content looks like this:

"database ""lp_db1"" does not exist",,,,,"DROP DATABASE lp_db1;",,"dropdb, dbcommands.c:841","","client backend",,0

It is very inconvenient to work out the real meaning of each field. Also, in the log content " is escaped as "", which is not friendly to regular-expression matching. Therefore, I would like to modify the csvlog code to change its format to key:value, write NULL for fields that have no value, and escape " as \" in the log content. After the modification, the log line above would look like this:

Log_time:"2022-03-15 09:17:55.289 CST",User_name:"postgres",Database_name:"lp_db",Process_id:"17995",Remote_host:"192.168.88.130",Remote_port:"38402",Line_number:"622fe941.464b",PS_display:"DROP DATABASE",Session_start_timestamp:"2022-03-15 09:17:53 CST",Virtual_transaction_id:"3/2",Transaction_id:"NULL",Error_severity:"ERROR",SQL_state_code:"3D000",Errmessage:"database \"lp_db1\" does not exist",Errdetail:"NULL",Errhint:"NULL",Internal_query:"NULL",Internal_pos:"0",Errcontext:"NULL",User_query:"DROP DATABASE lp_db1;",Cursorpos:"NULL",File_location:"dropdb, dbcommands.c:841",Application_name:"NULL",Backend_type:"client backend",Leader_PID:"0",Query_id:"0"
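For reference, the "" escaping described above is standard CSV quoting, and any stock CSV parser undoes it automatically, so no regular-expression handling of quotes is needed; a minimal Python sketch using the line above:

```python
import csv
import io

# The exact csvlog line from the report above; csv.reader undoes the ""
# escaping, so each field comes back with plain embedded quotes.
line = ('"database ""lp_db1"" does not exist",,,,,"DROP DATABASE lp_db1;",,'
        '"dropdb, dbcommands.c:841","","client backend",,0')

row = next(csv.reader(io.StringIO(line)))
print(row[0])   # -> database "lp_db1" does not exist
print(row[5])   # -> DROP DATABASE lp_db1;
```

Empty positional fields simply come back as empty strings, which a consumer can map to NULL if desired.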

Regards,

--
-lupeng

#2 Julien Rouhaud
rjuju123@gmail.com
In reply to: lupeng (#1)
Re: Change the csv log to 'key:value' to facilitate the user to understanding and processing of logs

Hi,

On Tue, Mar 15, 2022 at 09:31:19AM +0800, lupeng wrote:

When I audit the Postgresql database recently, I found that after configuring
the log type as csv, the output log content is as follows: "database
""lp_db1"" does not exist",,,,,"DROP DATABASE lp_db1;",,"dropdb,
dbcommands.c:841","","client backend",,0 It is very inconvenient to
understand the real meaning of each field. And in the log content," is
escaped as "", which is not friendly to regular expression matching.
Therefore, I want to modify the csv log function, change its format to
key:value, assign the content of the non-existing field to NULL, and at the
same time, " will be escaped as \" in the log content. After the
modification, the above log format is as follows: Log_time:"2022-03-15
09:17:55.289
CST",User_name:"postgres",Database_name:"lp_db",Process_id:"17995", [...]

This would make the logs a lot more verbose, and much harder to process with
tools intended for CSV files.

You should consider using the newly introduced jsonlog format (as soon as pg15
is released), which seems closer to what you want.
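A jsonlog line carries its field names in every record, so consuming it needs no column bookkeeping. A minimal sketch, assuming jsonlog is enabled with log_destination = 'jsonlog' in pg15; the field names below are an illustrative subset, not the full documented set:

```python
import json

# One hypothetical jsonlog line in the style of pg15's jsonlog output
# (illustrative keys; consult the pg15 docs for the real field list).
line = ('{"timestamp": "2022-03-15 09:17:55.289 CST", '
        '"error_severity": "ERROR", "state_code": "3D000", '
        '"message": "database \\"lp_db1\\" does not exist"}')

rec = json.loads(line)          # every field arrives already named
print(rec["error_severity"], rec["message"])
```

Absent fields are simply omitted from the JSON object rather than serialized as a "NULL" string, which is easier for downstream tools to test for.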

#3 hubert depesz lubaczewski
In reply to: lupeng (#1)
Re: Change the csv log to 'key:value' to facilitate the user to understanding and processing of logs

On Tue, Mar 15, 2022 at 09:31:19AM +0800, lupeng wrote:

Dear Hackers
When I audit the Postgresql database recently, I found that after
configuring the log type as csv, the output log content is as follows:
"database ""lp_db1"" does not exist",,,,,"DROP DATABASE
lp_db1;",,"dropdb, dbcommands.c:841","","client backend",,0 It is very
inconvenient to understand the real meaning of each field. [...]

CSV format is well documented
(https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG).

If you want named fields you can wait for pg15 and its jsonlog
(https://www.depesz.com/2022/01/17/waiting-for-postgresql-15-introduce-log_destinationjsonlog/).

I, for one, wouldn't want to have to deal with field names repeated in
every single record.
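Since the csvlog column order is fixed and documented, names can be attached once on the reading side instead of being repeated in every record; a sketch using an illustrative subset of the documented columns:

```python
import csv
import io

# Illustrative subset of the documented csvlog column order; the full
# list is in the runtime-config-logging docs linked above.
cols = ["error_severity", "sql_state_code", "message", "query"]
line = 'ERROR,3D000,"database ""lp_db1"" does not exist","DROP DATABASE lp_db1;"'

rec = next(csv.DictReader(io.StringIO(line), fieldnames=cols))
print(rec["message"])   # doubled quotes already unescaped by the parser
```

This gives named-field access with none of the per-record key overhead.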

depesz

#4 Andrew Dunstan
andrew@dunslane.net
In reply to: hubert depesz lubaczewski (#3)
Re: Change the csv log to 'key:value' to facilitate the user to understanding and processing of logs

On 3/15/22 09:30, hubert depesz lubaczewski wrote:

CSV format is well documented
(https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG).

If you want named fields you can wait for pg15 and its jsonlog
(https://www.depesz.com/2022/01/17/waiting-for-postgresql-15-introduce-log_destinationjsonlog/).

I, for one, wouldn't want to have to deal with field names repeated in
every single record.

Indeed. And even if this were a good idea, which it's not, it would be
15 years too late.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#5 Jan Wieck
jan@wi3ck.info
In reply to: Andrew Dunstan (#4)
Re: Change the csv log to 'key:value' to facilitate the user to understanding and processing of logs

On 3/15/22 10:12, Andrew Dunstan wrote:

Indeed. And even if this were a good idea, which it's not, it would be
15 years too late.

Also, the CSV format, while human-readable to a degree, wasn't meant for
direct human consumption. It was meant to be read by programs, and at
the time CSV made the most sense.

Regards, Jan

#6 Michael Paquier
michael@paquier.xyz
In reply to: Jan Wieck (#5)
Re: Change the csv log to 'key:value' to facilitate the user to understanding and processing of logs

On Tue, Mar 15, 2022 at 10:33:42AM -0400, Jan Wieck wrote:

Also, the CSV format, while human readable to a degree, wasn't meant for
direct, human consumption. It was meant to be read by programs and at the
time, CSV made the most sense.

FWIW, I have noticed that this patch was still listed in the next CF,
with a reference to an incorrect thread:
https://commitfest.postgresql.org/38/3591/

I have updated the CF entry to point to this thread, and it is clear
that csvlog is not going to change now, so this patch's status has been
switched to Rejected.
--
Michael