Truncate logs by max_log_size
Hi hackers,
I would like to suggest a patch to truncate some long queries. I believe
sometimes there is no need to log a query containing some gigabytes of
blob, for example. In patch a new parameter, named max_log_size, is
introduced. It defines the maximum size of logged query, in bytes.
Everything beyond that size is truncated.
Best regards,
Kirill Gavrilov
Attachments:
0001-parameter-max_log_size-to-truncate-logs.patchapplication/octet-stream; name=0001-parameter-max_log_size-to-truncate-logs.patchDownload
From 8f2d9a5c9aa6cf863a29df6ba178f9b054e9a484 Mon Sep 17 00:00:00 2001
From: diphantxm <diphantxm@yandex-team.ru>
Date: Thu, 26 Sep 2024 21:13:21 +0300
Subject: [PATCH] parameter max_log_size to truncate logs
There is no need to log the entire query, because it may be large and take lots of space on disk. Parameter max_log_size set the maximum length for logged query. Everything beyond that length is truncated. Value 0 disables the parameter.
---
src/backend/utils/error/elog.c | 8 ++++++++
src/backend/utils/misc/guc_tables.c | 11 +++++++++++
src/backend/utils/misc/postgresql.conf.sample | 2 ++
src/bin/pg_ctl/t/004_logrotate.pl | 15 +++++++++++++++
src/include/utils/elog.h | 1 +
5 files changed, 37 insertions(+)
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 5898100acb..5938aecf69 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -114,6 +114,7 @@ int Log_destination = LOG_DESTINATION_STDERR;
char *Log_destination_string = NULL;
bool syslog_sequence_numbers = true;
bool syslog_split_messages = true;
+int max_log_size = 0;
/* Processed form of backtrace_symbols GUC */
static char *backtrace_symbol_list;
@@ -1675,6 +1676,13 @@ EmitErrorReport(void)
CHECK_STACK_DEPTH();
oldcontext = MemoryContextSwitchTo(edata->assoc_context);
+ if (max_log_size != 0 && debug_query_string != NULL)
+ {
+ char* str = debug_query_string;
+ str[pg_mbcliplen(str, strlen(str), max_log_size)] = '\0';
+ debug_query_string = str;
+ }
+
/*
* Call hook before sending message to log. The hook function is allowed
* to turn off edata->output_to_server, so we must recheck that afterward.
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index b46e3b8c55..4e18bfee8c 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3467,6 +3467,17 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_log_size", PGC_SIGHUP, LOGGING_WHAT,
+ gettext_noop("Sets max size of logged statement."),
+ NULL
+ },
+ &max_log_size,
+ 5 * (1024 * 1024),
+ 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index d06074b86f..f26812d6da 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -592,6 +592,8 @@
# bind-parameter values to N bytes;
# -1 means print in full, 0 disables
#log_statement = 'none' # none, ddl, mod, all
+#max_log_size = 0 # max size of logged statement_timeout
+ # 0 disables the feature
#log_replication_commands = off
#log_temp_files = -1 # log temporary files equal or larger
# than the specified size in kilobytes;
diff --git a/src/bin/pg_ctl/t/004_logrotate.pl b/src/bin/pg_ctl/t/004_logrotate.pl
index 10815a60d4..d1a0a21473 100644
--- a/src/bin/pg_ctl/t/004_logrotate.pl
+++ b/src/bin/pg_ctl/t/004_logrotate.pl
@@ -69,6 +69,7 @@ log_destination = 'stderr, csvlog, jsonlog'
# these ensure stability of test results:
log_rotation_age = 0
lc_messages = 'C'
+max_log_size = 32
));
$node->start();
@@ -135,6 +136,20 @@ check_log_pattern('stderr', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('csvlog', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('jsonlog', $new_current_logfiles, 'syntax error', $node);
+$node->psql('postgres', 'INSERT INTO SOME_NON_EXISTANT_TABLE VALUES (TEST)');
+for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
+{
+ eval {
+ $current_logfiles = slurp_file($node->data_dir . '/current_logfiles');
+ };
+ last unless $@;
+ usleep(100_000);
+}
+die $@ if $@;
+check_log_pattern('stderr', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+check_log_pattern('csvlog', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+check_log_pattern('jsonlog', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+
$node->stop();
done_testing();
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 4a9562fdaa..43330ab893 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -501,6 +501,7 @@ extern PGDLLIMPORT int Log_destination;
extern PGDLLIMPORT char *Log_destination_string;
extern PGDLLIMPORT bool syslog_sequence_numbers;
extern PGDLLIMPORT bool syslog_split_messages;
+extern PGDLLIMPORT int max_log_size;
/* Log destination bitmap */
#define LOG_DESTINATION_STDERR 1
--
2.39.2 (Apple Git-143)
On Thu, Sep 26, 2024, at 3:30 PM, diPhantxm wrote:
I would like to suggest a patch to truncate some long queries. I believe sometimes there is no need to log a query containing some gigabytes of blob, for example. In patch a new parameter, named max_log_size, is introduced. It defines the maximum size of logged query, in bytes. Everything beyond that size is truncated.
I don't know how useful is this proposal. IMO the whole query is usually
crucial for an analysis. Let's say you arbitrarily provide max_log_size = 100
but it means you cannot see a WHERE clause and you have a performance issue in
that query. It won't be possible to obtain the whole query for an EXPLAIN. It
would break audit systems that requires the whole query. I don't know if there
are some log-based replication systems but it would break such tools too.
There are other ways to avoid logging such long queries. The GRANT ... ON
PARAMETER and SET LOCAL commands are your friends. Hence, you can disable
specific long queries even if you are not a superuser.
If your main problem is disk space, you can adjust the rotation settings or have
an external tool to manage your log files (or even use syslog).
--
Euler Taveira
EDB https://www.enterprisedb.com/
On 27 Sep 2024, at 03:30, Euler Taveira <euler@eulerto.com> wrote:
Let's say you arbitrarily provide max_log_size = 100
Consider max_log_size = 10Mb. The perspective might look very different. It’s not about WHERE anymore. It's a guard against heavy abuse.
The feature looks very important for me.
Best regards, Andrey Borodin.
On 27.09.24 12:36, Andrey M. Borodin wrote:
Consider max_log_size = 10Mb. The perspective might look very different.
It’s not about WHERE anymore. It's a guard against heavy abuse.The feature looks very important for me.
I have the same opinion. As a fail safe it sounds very useful to me.
Unfortunately, the patch does not apply:
$ git apply
~/patches/max_log_query/0001-parameter-max_log_size-to-truncate-logs.patch
-v
Checking patch src/backend/utils/error/elog.c...
error: while searching for:
char *Log_destination_string = NULL;
bool syslog_sequence_numbers = true;
bool syslog_split_messages = true;
/* Processed form of backtrace_symbols GUC */
static char *backtrace_symbol_list;
error: patch failed: src/backend/utils/error/elog.c:114
error: src/backend/utils/error/elog.c: patch does not apply
Checking patch src/backend/utils/misc/guc_tables.c...
Hunk #1 succeeded at 3714 (offset 247 lines).
Checking patch src/backend/utils/misc/postgresql.conf.sample...
Hunk #1 succeeded at 615 (offset 23 lines).
Checking patch src/bin/pg_ctl/t/004_logrotate.pl...
error: while searching for:
check_log_pattern('csvlog', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('jsonlog', $new_current_logfiles, 'syntax error', $node);
$node->stop();
done_testing();
error: patch failed: src/bin/pg_ctl/t/004_logrotate.pl:135
error: src/bin/pg_ctl/t/004_logrotate.pl: patch does not apply
Checking patch src/include/utils/elog.h...
Hunk #1 succeeded at 502 (offset 1 line).
Does it truncate only single queries or also a transaction with many
statements?
Thanks!
Best, Jim
On Fri, Sep 27, 2024 at 6:37 AM Andrey M. Borodin <x4mmm@yandex-team.ru>
wrote:
Consider max_log_size = 10Mb. The perspective might look very different.
It’s not about WHERE anymore. It's a guard against heavy abuse.
Can you elaborate on this? Do you mean someone purposefully writing large
entries to your log file?
On Tue, Oct 1, 2024 at 2:46 PM Jim Jones <jim.jones@uni-muenster.de> wrote:
On 27.09.24 12:36, Andrey M. Borodin wrote:
Consider max_log_size = 10Mb. The perspective might look very different.
It’s not about WHERE anymore. It's a guard against heavy abuse.The feature looks very important for me.
I have the same opinion. As a fail safe it sounds very useful to me.
Unfortunately, the patch does not apply:
$ git apply
~/patches/max_log_query/0001-parameter-max_log_size-to-truncate-logs.patch
-v
Checking patch src/backend/utils/error/elog.c...
error: while searching for:
char *Log_destination_string = NULL;
bool syslog_sequence_numbers = true;
bool syslog_split_messages = true;/* Processed form of backtrace_symbols GUC */
static char *backtrace_symbol_list;error: patch failed: src/backend/utils/error/elog.c:114
error: src/backend/utils/error/elog.c: patch does not apply
Checking patch src/backend/utils/misc/guc_tables.c...
Hunk #1 succeeded at 3714 (offset 247 lines).
Checking patch src/backend/utils/misc/postgresql.conf.sample...
Hunk #1 succeeded at 615 (offset 23 lines).
Checking patch src/bin/pg_ctl/t/004_logrotate.pl...
error: while searching for:
check_log_pattern('csvlog', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('jsonlog', $new_current_logfiles, 'syntax error', $node);$node->stop();
done_testing();
error: patch failed: src/bin/pg_ctl/t/004_logrotate.pl:135
error: src/bin/pg_ctl/t/004_logrotate.pl: patch does not apply
Checking patch src/include/utils/elog.h...
Hunk #1 succeeded at 502 (offset 1 line).Does it truncate only single queries or also a transaction with many
statements?Thanks!
Best, Jim
My apologies, attached patch should work on master branch.
It truncates single queries and a statement in transaction that occurred an
error.
Attachments:
0002-parameter-max_log_size-to-truncate-logs.patchapplication/octet-stream; name=0002-parameter-max_log_size-to-truncate-logs.patchDownload
From d2eb9ccde970f752be12ce4dbcf40c55806599c9 Mon Sep 17 00:00:00 2001
From: diphantxm <diphantxm@yandex-team.ru>
Date: Tue, 1 Oct 2024 17:28:16 +0300
Subject: [PATCH] parameter max_log_size to truncate logs
There is no need to log the entire query, because it may be large and take lots of space on disk. Parameter max_log_size set the maximum length for logged query. Everything beyond that length is truncated. Value 0 disables the parameter.
---
src/backend/utils/error/elog.c | 8 ++++++++
src/backend/utils/misc/guc_tables.c | 11 +++++++++++
src/backend/utils/misc/postgresql.conf.sample | 2 ++
src/bin/pg_ctl/t/004_logrotate.pl | 15 +++++++++++++++
src/include/utils/elog.h | 1 +
5 files changed, 37 insertions(+)
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 5cbb5b5416..0690ec57ce 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -111,6 +111,7 @@ int Log_destination = LOG_DESTINATION_STDERR;
char *Log_destination_string = NULL;
bool syslog_sequence_numbers = true;
bool syslog_split_messages = true;
+int max_log_size = 0;
/* Processed form of backtrace_functions GUC */
static char *backtrace_function_list;
@@ -1693,6 +1694,13 @@ EmitErrorReport(void)
CHECK_STACK_DEPTH();
oldcontext = MemoryContextSwitchTo(edata->assoc_context);
+ if (max_log_size != 0 && debug_query_string != NULL)
+ {
+ char* str = debug_query_string;
+ str[pg_mbcliplen(str, strlen(str), max_log_size)] = '\0';
+ debug_query_string = str;
+ }
+
/*
* Reset the formatted timestamp fields before emitting any logs. This
* includes all the log destinations and emit_log_hook, as the latter
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 686309db58..cf1404c059 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3714,6 +3714,17 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_log_size", PGC_SIGHUP, LOGGING_WHAT,
+ gettext_noop("Sets max size of logged statement."),
+ NULL
+ },
+ &max_log_size,
+ 5 * (1024 * 1024),
+ 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 667e0dc40a..0ffb32d5fb 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -615,6 +615,8 @@
# bind-parameter values to N bytes;
# -1 means print in full, 0 disables
#log_statement = 'none' # none, ddl, mod, all
+#max_log_size = 0 # max size of logged statement_timeout
+ # 0 disables the feature
#log_replication_commands = off
#log_temp_files = -1 # log temporary files equal or larger
# than the specified size in kilobytes;
diff --git a/src/bin/pg_ctl/t/004_logrotate.pl b/src/bin/pg_ctl/t/004_logrotate.pl
index eacca1a652..1c1f89cb0a 100644
--- a/src/bin/pg_ctl/t/004_logrotate.pl
+++ b/src/bin/pg_ctl/t/004_logrotate.pl
@@ -69,6 +69,7 @@ log_destination = 'stderr, csvlog, jsonlog'
# these ensure stability of test results:
log_rotation_age = 0
lc_messages = 'C'
+max_log_size = 32
));
$node->start();
@@ -135,6 +136,20 @@ check_log_pattern('stderr', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('csvlog', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('jsonlog', $new_current_logfiles, 'syntax error', $node);
+$node->psql('postgres', 'INSERT INTO SOME_NON_EXISTANT_TABLE VALUES (TEST)');
+for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
+{
+ eval {
+ $current_logfiles = slurp_file($node->data_dir . '/current_logfiles');
+ };
+ last unless $@;
+ usleep(100_000);
+}
+die $@ if $@;
+check_log_pattern('stderr', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+check_log_pattern('csvlog', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+check_log_pattern('jsonlog', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+
$node->stop();
done_testing();
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index e54eca5b48..a8e30006f2 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -502,6 +502,7 @@ extern PGDLLIMPORT int Log_destination;
extern PGDLLIMPORT char *Log_destination_string;
extern PGDLLIMPORT bool syslog_sequence_numbers;
extern PGDLLIMPORT bool syslog_split_messages;
+extern PGDLLIMPORT int max_log_size;
/* Log destination bitmap */
#define LOG_DESTINATION_STDERR 1
--
2.39.2 (Apple Git-143)
On 01.10.24 17:46, Kirill Gavrilov wrote:
My apologies, attached patch should work on master branch.
Nice.
I tested the feature and it does what it is intended to.
postgres=# SHOW max_log_size;
max_log_size
--------------
10
(1 row)
postgres=# INSERT INTO t VALUES (1,2,3,4,5,6,7,8,9,0);
ERROR: relation "t" does not exist
LINE 1: INSERT INTO t VALUES (1,2,3,4,5,6,7,8,9,0);
Log file:
2024-10-02 00:57:13.618 CEST [1975926] ERROR: relation "t" does not
exist at character 13
2024-10-02 00:57:13.618 CEST [1975926] STATEMENT: INSERT INT
A few observations:
1) You missed a const qualifier in elog.c:
elog.c: In function ‘EmitErrorReport’:
elog.c:1699:29: warning: initialization discards ‘const’ qualifier from
pointer target type [-Wdiscarded-qualifiers]
1699 | char* str = debug_query_string;
| ^~~~~~~~~~~~~~~~~~
2) The new parameter cannot be set within a session with SET. Is it
supposed to be like this? IMHO it would be nice to able to temporarily
set this parameter without having to reload the postgresql.conf
postgres=# SET max_log_size TO 100;
ERROR: parameter "max_log_size" cannot be changed now
3) I personally find -1 more intuitive than 0 to disable a parameter,
but I have no strong opinion about it.
4) It still lacks documentation.
--
Jim
On Tue, 1 Oct 2024 at 20:46, Kirill Gavrilov <diphantxm@gmail.com> wrote:
On Tue, Oct 1, 2024 at 2:46 PM Jim Jones <jim.jones@uni-muenster.de> wrote:
On 27.09.24 12:36, Andrey M. Borodin wrote:
Consider max_log_size = 10Mb. The perspective might look very different.
It’s not about WHERE anymore. It's a guard against heavy abuse.The feature looks very important for me.
I have the same opinion. As a fail safe it sounds very useful to me.
Unfortunately, the patch does not apply:
$ git apply
~/patches/max_log_query/0001-parameter-max_log_size-to-truncate-logs.patch
-v
Checking patch src/backend/utils/error/elog.c...
error: while searching for:
char *Log_destination_string = NULL;
bool syslog_sequence_numbers = true;
bool syslog_split_messages = true;/* Processed form of backtrace_symbols GUC */
static char *backtrace_symbol_list;error: patch failed: src/backend/utils/error/elog.c:114
error: src/backend/utils/error/elog.c: patch does not apply
Checking patch src/backend/utils/misc/guc_tables.c...
Hunk #1 succeeded at 3714 (offset 247 lines).
Checking patch src/backend/utils/misc/postgresql.conf.sample...
Hunk #1 succeeded at 615 (offset 23 lines).
Checking patch src/bin/pg_ctl/t/004_logrotate.pl...
error: while searching for:
check_log_pattern('csvlog', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('jsonlog', $new_current_logfiles, 'syntax error', $node);$node->stop();
done_testing();
error: patch failed: src/bin/pg_ctl/t/004_logrotate.pl:135
error: src/bin/pg_ctl/t/004_logrotate.pl: patch does not apply
Checking patch src/include/utils/elog.h...
Hunk #1 succeeded at 502 (offset 1 line).Does it truncate only single queries or also a transaction with many
statements?Thanks!
Best, Jim
My apologies, attached patch should work on master branch.
It truncates single queries and a statement in transaction that occurred an error.
Hello! Please deliver the most recent patch version and fix the issues
Jim identified [0]/messages/by-id/35096a36-04d4-480b-a7cd-a2d8151fb737@uni-muenster.de as the current commitfest draws to a close.
Do not forget to include both parts of this patch (This was actually
developed off-list, and we are now using this on our cloud PostgreSQL
distribution on versions 12–17).
[0]: /messages/by-id/35096a36-04d4-480b-a7cd-a2d8151fb737@uni-muenster.de
--
Best regards,
Kirill Reshke
On 25.11.24 17:52, Kirill Reshke wrote:
Hello! Please deliver the most recent patch version and fix the issues
Jim identified [0] as the current commitfest draws to a close.
Do not forget to include both parts of this patch (This was actually
developed off-list, and we are now using this on our cloud PostgreSQL
distribution on versions 12–17).[0] /messages/by-id/35096a36-04d4-480b-a7cd-a2d8151fb737@uni-muenster.de
In addition to these points, this feature seems to fail with queries
containing special characters (more than one byte):
With this "max_log_size"..
postgres=# SHOW max_log_size;
max_log_size
--------------
20
(1 row)
... and this query ..
postgres=# SELECT "ÄÜÖ" FROM t;
ERROR: relation "t" does not exist
LINE 1: SELECT "ÄÜÖ" FROM t;
.. this is the [truncated] log entry we get ..
2024-11-28 14:58:57.912 CET [2258876] ERROR: relation "t" does not
exist at character 19
2024-11-28 14:58:57.912 CET [2258876] STATEMENT: SELECT "ÄÜÖ" FROM
... although the query originally had exactly 20 characters:
postgres=# SELECT length('SELECT "ÄÜÖ" FROM t;');
length
--------
20
(1 row)
postgres=# SELECT length('ÄÜÖ'::bytea), length('AUO'::bytea);
length | length
--------+--------
6 | 3
(1 row)
If it is supposed to be like this, it should be clearly stated so in the
docs.
--
Jim
On Thu, Nov 28, 2024 at 5:23 PM Jim Jones <jim.jones@uni-muenster.de> wrote:
On 25.11.24 17:52, Kirill Reshke wrote:
Hello! Please deliver the most recent patch version and fix the issues
Jim identified [0] as the current commitfest draws to a close.
Do not forget to include both parts of this patch (This was actually
developed off-list, and we are now using this on our cloud PostgreSQL
distribution on versions 12–17).[0]
/messages/by-id/35096a36-04d4-480b-a7cd-a2d8151fb737@uni-muenster.de
In addition to these points, this feature seems to fail with queries
containing special characters (more than one byte):With this "max_log_size"..
postgres=# SHOW max_log_size;
max_log_size
--------------
20
(1 row)... and this query ..
postgres=# SELECT "ÄÜÖ" FROM t;
ERROR: relation "t" does not exist
LINE 1: SELECT "ÄÜÖ" FROM t;.. this is the [truncated] log entry we get ..
2024-11-28 14:58:57.912 CET [2258876] ERROR: relation "t" does not
exist at character 19
2024-11-28 14:58:57.912 CET [2258876] STATEMENT: SELECT "ÄÜÖ" FROM... although the query originally had exactly 20 characters:
postgres=# SELECT length('SELECT "ÄÜÖ" FROM t;');
length
--------
20
(1 row)postgres=# SELECT length('ÄÜÖ'::bytea), length('AUO'::bytea);
length | length
--------+--------
6 | 3
(1 row)If it is supposed to be like this, it should be clearly stated so in the
docs.--
JimHere is version 3 of this patch. I found another place where this
setting can be applied.
Also added some documentation and specified that this setting truncates
queries by size in bytes.
Attachments:
V3-0001-parameter-max_log_size-to-truncate-logs.patchapplication/octet-stream; name=V3-0001-parameter-max_log_size-to-truncate-logs.patchDownload
From 9439eb1c5c194d38a6aa388d8d8ca34350ae300f Mon Sep 17 00:00:00 2001
From: diphantxm <diphantxm@yandex-team.ru>
Date: Thu, 28 Nov 2024 21:24:59 +0300
Subject: [PATCH] parameter max_log_size to truncate logs
There is no need to log the entire query, because it may be large and take lots of space on disk. Parameter max_log_size set the maximum length for logged query. Everything beyond that length is truncated. Value 0 disables the parameter.
---
doc/src/sgml/config.sgml | 15 ++++++++
src/backend/tcop/postgres.c | 10 ++++-
src/backend/utils/error/elog.c | 38 +++++++++++++++++++
src/backend/utils/misc/guc_tables.c | 11 ++++++
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/pg_ctl/t/004_logrotate.pl | 15 ++++++++
src/include/utils/elog.h | 2 +
7 files changed, 91 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 76ab72db96..5110b61ba5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7913,6 +7913,21 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-log-size" xreflabel="max_log_size">
+ <term><varname>max_log_size</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_log_size</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ If greater than zero, each query logged is truncated to this many bytes.
+ Zero disables the setting.
+ Default value is 5kB.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
<sect2 id="runtime-config-logging-csvlog">
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 4b985bd056..549be13e5b 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -70,6 +70,7 @@
#include "tcop/pquery.h"
#include "tcop/tcopprot.h"
#include "tcop/utility.h"
+#include "utils/elog.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
#include "utils/lsyscache.h"
@@ -1029,11 +1030,14 @@ exec_simple_query(const char *query_string)
bool was_logged = false;
bool use_implicit_block;
char msec_str[32];
+ char* query_log;
/*
* Report query to various monitoring facilities.
*/
debug_query_string = query_string;
+ bool copied = false;
+ query_log = build_query_log(query_string, &copied);
pgstat_report_activity(STATE_RUNNING, query_string);
@@ -1078,7 +1082,7 @@ exec_simple_query(const char *query_string)
if (check_log_statement(parsetree_list))
{
ereport(LOG,
- (errmsg("statement: %s", query_string),
+ (errmsg("statement: %s", query_log),
errhidestmt(true),
errdetail_execute(parsetree_list)));
was_logged = true;
@@ -1377,7 +1381,7 @@ exec_simple_query(const char *query_string)
case 2:
ereport(LOG,
(errmsg("duration: %s ms statement: %s",
- msec_str, query_string),
+ msec_str, query_log),
errhidestmt(true),
errdetail_execute(parsetree_list)));
break;
@@ -1388,6 +1392,8 @@ exec_simple_query(const char *query_string)
TRACE_POSTGRESQL_QUERY_DONE(query_string);
+ if (query_log && copied)
+ pfree(query_log);
debug_query_string = NULL;
}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 289059435a..aa9712ba64 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -111,6 +111,7 @@ int Log_destination = LOG_DESTINATION_STDERR;
char *Log_destination_string = NULL;
bool syslog_sequence_numbers = true;
bool syslog_split_messages = true;
+int max_log_size = 0;
/* Processed form of backtrace_functions GUC */
static char *backtrace_function_list;
@@ -1693,6 +1694,10 @@ EmitErrorReport(void)
CHECK_STACK_DEPTH();
oldcontext = MemoryContextSwitchTo(edata->assoc_context);
+ const char* old_query_string = debug_query_string;
+ bool copied = false;
+ debug_query_string = build_query_log(debug_query_string, &copied);
+
/*
* Reset the formatted timestamp fields before emitting any logs. This
* includes all the log destinations and emit_log_hook, as the latter
@@ -1733,6 +1738,12 @@ EmitErrorReport(void)
MemoryContextSwitchTo(oldcontext);
recursion_depth--;
+
+ if (debug_query_string && copied)
+ {
+ pfree(debug_query_string);
+ debug_query_string = old_query_string;
+ }
}
/*
@@ -3773,3 +3784,30 @@ write_stderr(const char *fmt,...)
#endif
va_end(ap);
}
+
+/*
+ * Apply truncation to build query that will be logged.
+ *
+ * If query needs to be truncated, copied will be set to true
+ * and returned string must be freed
+*/
+char*
+build_query_log(const char* query, bool *copied)
+{
+ *copied = false;
+ if (!query)
+ return NULL;
+
+ size_t query_len = strlen(query);
+ if (max_log_size == 0 || query_len < max_log_size)
+ {
+ return query;
+ }
+
+ *copied = true;
+ size_t query_log_len = pg_mbcliplen(query, query_len, max_log_size);
+ char* query_log = (char*)palloc(query_log_len+1);
+ memcpy(query_log, query, query_log_len);
+ query_log[query_log_len] = '\0';
+ return query_log;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 9845abd693..64f21377ad 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3724,6 +3724,17 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_log_size", PGC_SUSET, LOGGING_WHAT,
+ gettext_noop("Sets max size in bytes of logged statement."),
+ NULL
+ },
+ &max_log_size,
+ 5 * (1024 * 1024),
+ 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 407cd1e08c..44971cecc6 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -617,6 +617,8 @@
# bind-parameter values to N bytes;
# -1 means print in full, 0 disables
#log_statement = 'none' # none, ddl, mod, all
+#max_log_size = 0 # max size of logged statement_timeout (in bytes)
+ # 0 disables the feature
#log_replication_commands = off
#log_temp_files = -1 # log temporary files equal or larger
# than the specified size in kilobytes;
diff --git a/src/bin/pg_ctl/t/004_logrotate.pl b/src/bin/pg_ctl/t/004_logrotate.pl
index eacca1a652..1c1f89cb0a 100644
--- a/src/bin/pg_ctl/t/004_logrotate.pl
+++ b/src/bin/pg_ctl/t/004_logrotate.pl
@@ -69,6 +69,7 @@ log_destination = 'stderr, csvlog, jsonlog'
# these ensure stability of test results:
log_rotation_age = 0
lc_messages = 'C'
+max_log_size = 32
));
$node->start();
@@ -135,6 +136,20 @@ check_log_pattern('stderr', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('csvlog', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('jsonlog', $new_current_logfiles, 'syntax error', $node);
+$node->psql('postgres', 'INSERT INTO SOME_NON_EXISTANT_TABLE VALUES (TEST)');
+for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
+{
+ eval {
+ $current_logfiles = slurp_file($node->data_dir . '/current_logfiles');
+ };
+ last unless $@;
+ usleep(100_000);
+}
+die $@ if $@;
+check_log_pattern('stderr', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+check_log_pattern('csvlog', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+check_log_pattern('jsonlog', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+
$node->stop();
done_testing();
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index e54eca5b48..f6689796a5 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -502,6 +502,7 @@ extern PGDLLIMPORT int Log_destination;
extern PGDLLIMPORT char *Log_destination_string;
extern PGDLLIMPORT bool syslog_sequence_numbers;
extern PGDLLIMPORT bool syslog_split_messages;
+extern PGDLLIMPORT int max_log_size;
/* Log destination bitmap */
#define LOG_DESTINATION_STDERR 1
@@ -516,6 +517,7 @@ extern void log_status_format(StringInfo buf, const char *format,
extern void DebugFileOpen(void);
extern char *unpack_sql_state(int sql_state);
extern bool in_error_recursion_trouble(void);
+extern char* build_query_log(const char* query, bool *copied);
/* Common functions shared across destinations */
extern void reset_formatted_start_time(void);
--
2.39.2 (Apple Git-143)
On 28.11.24 20:20, Kirill Gavrilov wrote:
Here is version 3 of this patch. I found another place where this
setting can be applied.
Also added some documentation and specified that this setting
truncates queries by size in bytes.
Thanks. It is now possible to change the parameter using SET
postgres=# SET max_log_size TO 15;
SET
postgres=# SHOW max_log_size ;
max_log_size
--------------
15
(1 row)
In the postgresql.conf the default value is set to 0
#max_log_size = 0
But if we take a look at the parameter it shows something else
postgres=# SHOW max_log_size ;
max_log_size
--------------
5242880
(1 row)
Perhaps it would be best to keep the default value in the
postgresql.conf consistent with the documentation and the actual
max_log_size value. IMHO 0 (or -1) should disable it.
There are also several compilation issues:
postgres.c: In function ‘exec_simple_query’:
postgres.c:1040:9: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
1040 | bool copied = false;
| ^~~~
elog.c: In function ‘EmitErrorReport’:
elog.c:1697:9: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
1697 | const char* old_query_string = debug_query_string;
| ^~~~~
elog.c:1744:23: warning: passing argument 1 of ‘pfree’ discards ‘const’
qualifier from pointer target type [-Wdiscarded-qualifiers]
1744 | pfree(debug_query_string);
| ^~~~~~~~~~~~~~~~~~
In file included from ../../../../src/include/postgres.h:47,
from elog.c:55:
../../../../src/include/utils/palloc.h:86:25: note: expected ‘void *’
but argument is of type ‘const char *’
86 | extern void pfree(void *pointer);
| ~~~~~~^~~~~~~
elog.c: In function ‘build_query_log’:
elog.c:3798:9: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
3798 | size_t query_len = strlen(query);
| ^~~~~~
elog.c:3801:24: warning: return discards ‘const’ qualifier from pointer
target type [-Wdiscarded-qualifiers]
3801 | return query;
| ^~~~~
elog.c:3805:9: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
3805 | size_t query_log_len = pg_mbcliplen(query, query_len,
max_log_size);
| ^~~~~~
and there is a trailing whitespace at:
+ * If query needs to be truncated, copied will be set to true
/home/jim/patches/max_log_query/V3-0001-parameter-max_log_size-to-truncate-logs.patch:141:
trailing whitespace.
* If query needs to be truncated, copied will be set to true
Checking patch doc/src/sgml/config.sgml...
Hunk #1 succeeded at 7865 (offset -48 lines).
Checking patch src/backend/tcop/postgres.c...
Hunk #1 succeeded at 71 (offset 1 line).
Hunk #2 succeeded at 1031 (offset 1 line).
Hunk #3 succeeded at 1083 (offset 1 line).
Hunk #4 succeeded at 1382 (offset 1 line).
Hunk #5 succeeded at 1393 (offset 1 line).
Checking patch src/backend/utils/error/elog.c...
Hunk #4 succeeded at 3781 (offset -3 lines).
Checking patch src/backend/utils/misc/guc_tables.c...
Hunk #1 succeeded at 3714 (offset -10 lines).
Checking patch src/backend/utils/misc/postgresql.conf.sample...
Hunk #1 succeeded at 615 (offset -2 lines).
Checking patch src/bin/pg_ctl/t/004_logrotate.pl...
Checking patch src/include/utils/elog.h...
Applied patch doc/src/sgml/config.sgml cleanly.
Applied patch src/backend/tcop/postgres.c cleanly.
Applied patch src/backend/utils/error/elog.c cleanly.
Applied patch src/backend/utils/misc/guc_tables.c cleanly.
Applied patch src/backend/utils/misc/postgresql.conf.sample cleanly.
Applied patch src/bin/pg_ctl/t/004_logrotate.pl cleanly.
Applied patch src/include/utils/elog.h cleanly.
warning: 1 line adds whitespace errors.
--
Jim
On Fri, Nov 29, 2024 at 3:24 AM Jim Jones <jim.jones@uni-muenster.de> wrote:
On 28.11.24 20:20, Kirill Gavrilov wrote:
Here is version 3 of this patch. I found another place where this
setting can be applied.
Also added some documentation and specified that this setting
truncates queries by size in bytes.Thanks. It is now possible to change the parameter using SET
postgres=# SET max_log_size TO 15;
SET
postgres=# SHOW max_log_size ;
max_log_size
--------------
15
(1 row)
In the postgresql.conf the default value is set to 0
#max_log_size = 0
But if we take a look at the parameter it shows something else
postgres=# SHOW max_log_size ;
max_log_size
--------------
5242880
(1 row)Perhaps it would be best to keep the default value in the
postgresql.conf consistent with the documentation and the actual
max_log_size value. IMHO 0 (or -1) should disable it.There are also several compilation issues:
postgres.c: In function ‘exec_simple_query’:
postgres.c:1040:9: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
1040 | bool copied = false;
| ^~~~
elog.c: In function ‘EmitErrorReport’:
elog.c:1697:9: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
1697 | const char* old_query_string = debug_query_string;
| ^~~~~
elog.c:1744:23: warning: passing argument 1 of ‘pfree’ discards ‘const’
qualifier from pointer target type [-Wdiscarded-qualifiers]
1744 | pfree(debug_query_string);
| ^~~~~~~~~~~~~~~~~~
In file included from ../../../../src/include/postgres.h:47,
from elog.c:55:
../../../../src/include/utils/palloc.h:86:25: note: expected ‘void *’
but argument is of type ‘const char *’
86 | extern void pfree(void *pointer);
| ~~~~~~^~~~~~~
elog.c: In function ‘build_query_log’:
elog.c:3798:9: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
3798 | size_t query_len = strlen(query);
| ^~~~~~
elog.c:3801:24: warning: return discards ‘const’ qualifier from pointer
target type [-Wdiscarded-qualifiers]
3801 | return query;
| ^~~~~
elog.c:3805:9: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
3805 | size_t query_log_len = pg_mbcliplen(query, query_len,
max_log_size);
| ^~~~~~and there is a trailing whitespace at:
+ * If query needs to be truncated, copied will be set to true
/home/jim/patches/max_log_query/V3-0001-parameter-max_log_size-to-truncate-logs.patch:141:
trailing whitespace.
* If query needs to be truncated, copied will be set to true
Checking patch doc/src/sgml/config.sgml...
Hunk #1 succeeded at 7865 (offset -48 lines).
Checking patch src/backend/tcop/postgres.c...
Hunk #1 succeeded at 71 (offset 1 line).
Hunk #2 succeeded at 1031 (offset 1 line).
Hunk #3 succeeded at 1083 (offset 1 line).
Hunk #4 succeeded at 1382 (offset 1 line).
Hunk #5 succeeded at 1393 (offset 1 line).
Checking patch src/backend/utils/error/elog.c...
Hunk #4 succeeded at 3781 (offset -3 lines).
Checking patch src/backend/utils/misc/guc_tables.c...
Hunk #1 succeeded at 3714 (offset -10 lines).
Checking patch src/backend/utils/misc/postgresql.conf.sample...
Hunk #1 succeeded at 615 (offset -2 lines).
Checking patch src/bin/pg_ctl/t/004_logrotate.pl...
Checking patch src/include/utils/elog.h...
Applied patch doc/src/sgml/config.sgml cleanly.
Applied patch src/backend/tcop/postgres.c cleanly.
Applied patch src/backend/utils/error/elog.c cleanly.
Applied patch src/backend/utils/misc/guc_tables.c cleanly.
Applied patch src/backend/utils/misc/postgresql.conf.sample cleanly.
Applied patch src/bin/pg_ctl/t/004_logrotate.pl cleanly.
Applied patch src/include/utils/elog.h cleanly.
warning: 1 line adds whitespace errors.--
Jim
Thanks for spotting mistakes.
Setting max_log_size to 0 already disables truncation.
I changed guc units to bytes, just to make it more human-friendly. Did
some refactoring and fixed all warnings. Set default value in
postgresql.conf.
Attachments:
V4-0001-parameter-max_log_size-to-truncate-logs.patchapplication/octet-stream; name=V4-0001-parameter-max_log_size-to-truncate-logs.patchDownload
From 978693e3375f8ac53a44e9a1496d085cbd99fa9e Mon Sep 17 00:00:00 2001
From: diphantxm <diphantxm@yandex-team.ru>
Date: Fri, 29 Nov 2024 14:02:04 +0300
Subject: [PATCH] parameter max_log_size to truncate logs
There is no need to log the entire query, because it may be large and take lots of space on disk. Parameter max_log_size set the maximum length for logged query. Everything beyond that length is truncated. Value 0 disables the parameter.
---
doc/src/sgml/config.sgml | 15 ++++++
src/backend/tcop/postgres.c | 18 ++++++-
src/backend/utils/error/elog.c | 47 +++++++++++++++++++
src/backend/utils/misc/guc_tables.c | 12 +++++
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/pg_ctl/t/004_logrotate.pl | 15 ++++++
src/include/utils/elog.h | 4 ++
7 files changed, 111 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 76ab72db96..82dc458a4a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7913,6 +7913,21 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-log-size" xreflabel="max_log_size">
+ <term><varname>max_log_size</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_log_size</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ If greater than zero, each query logged is truncated to this many bytes.
+ Zero disables the setting.
+ Default value is 5MB.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
<sect2 id="runtime-config-logging-csvlog">
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 4b985bd056..033de5862a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -70,6 +70,7 @@
#include "tcop/pquery.h"
#include "tcop/tcopprot.h"
#include "tcop/utility.h"
+#include "utils/elog.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
#include "utils/lsyscache.h"
@@ -1029,11 +1030,22 @@ exec_simple_query(const char *query_string)
bool was_logged = false;
bool use_implicit_block;
char msec_str[32];
+ char* truncated_query = NULL;
+ const char* query_log;
/*
* Report query to various monitoring facilities.
*/
debug_query_string = query_string;
+ if (need_truncate_query_log(query_string))
+ {
+ truncated_query = truncate_query_log(query_string);
+ query_log = truncated_query;
+ }
+ else
+ {
+ query_log = query_string;
+ }
pgstat_report_activity(STATE_RUNNING, query_string);
@@ -1078,7 +1090,7 @@ exec_simple_query(const char *query_string)
if (check_log_statement(parsetree_list))
{
ereport(LOG,
- (errmsg("statement: %s", query_string),
+ (errmsg("statement: %s", query_log),
errhidestmt(true),
errdetail_execute(parsetree_list)));
was_logged = true;
@@ -1377,7 +1389,7 @@ exec_simple_query(const char *query_string)
case 2:
ereport(LOG,
(errmsg("duration: %s ms statement: %s",
- msec_str, query_string),
+ msec_str, query_log),
errhidestmt(true),
errdetail_execute(parsetree_list)));
break;
@@ -1388,6 +1400,8 @@ exec_simple_query(const char *query_string)
TRACE_POSTGRESQL_QUERY_DONE(query_string);
+ if (truncated_query)
+ pfree(truncated_query);
debug_query_string = NULL;
}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 289059435a..8a1071677b 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -111,6 +111,7 @@ int Log_destination = LOG_DESTINATION_STDERR;
char *Log_destination_string = NULL;
bool syslog_sequence_numbers = true;
bool syslog_split_messages = true;
+int max_log_size = 0;
/* Processed form of backtrace_functions GUC */
static char *backtrace_function_list;
@@ -1688,11 +1689,18 @@ EmitErrorReport(void)
{
ErrorData *edata = &errordata[errordata_stack_depth];
MemoryContext oldcontext;
+ char* truncated_query = NULL;
recursion_depth++;
CHECK_STACK_DEPTH();
oldcontext = MemoryContextSwitchTo(edata->assoc_context);
+ if (need_truncate_query_log(debug_query_string))
+ {
+ truncated_query = truncate_query_log(debug_query_string);
+ debug_query_string = truncated_query;
+ }
+
/*
* Reset the formatted timestamp fields before emitting any logs. This
* includes all the log destinations and emit_log_hook, as the latter
@@ -1733,6 +1741,9 @@ EmitErrorReport(void)
MemoryContextSwitchTo(oldcontext);
recursion_depth--;
+
+ if (truncated_query)
+ pfree(truncated_query);
}
/*
@@ -3773,3 +3784,39 @@ write_stderr(const char *fmt,...)
#endif
va_end(ap);
}
+
+/*
+ * Apply truncation to build query that will be logged.
+ *
+ * If query needs to be truncated, copied will be set to true
+ * and returned string must be freed
+*/
+char*
+truncate_query_log(const char* query)
+{
+ size_t truncated_query_len;
+ char* truncatd_query;
+ size_t query_len;
+
+ if (!query)
+ return NULL;
+
+ query_len = strlen(query);
+ truncated_query_len = pg_mbcliplen(query, query_len, max_log_size);
+ truncatd_query = (char *) palloc(truncated_query_len+1);
+ memcpy(truncatd_query, query, truncated_query_len);
+ truncatd_query[truncated_query_len] = '\0';
+ return truncatd_query;
+}
+
+/*
+ * Checks if query should be truncated
+ * according to max_log_size
+*/
+bool
+need_truncate_query_log(const char* query)
+{
+ if (!query)
+ return false;
+ return !(max_log_size == 0 || strlen(query) < max_log_size);
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 9845abd693..8aa6e3c7a8 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3724,6 +3724,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_log_size", PGC_SUSET, LOGGING_WHAT,
+ gettext_noop("Sets max size in bytes of logged statement."),
+ NULL,
+ GUC_UNIT_BYTE
+ },
+ &max_log_size,
+ 5 * (1024 * 1024),
+ 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 407cd1e08c..848067d87c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -617,6 +617,8 @@
# bind-parameter values to N bytes;
# -1 means print in full, 0 disables
#log_statement = 'none' # none, ddl, mod, all
+#max_log_size = 5MB # max size of logged statement (in bytes)
+ # 0 disables the feature
#log_replication_commands = off
#log_temp_files = -1 # log temporary files equal or larger
# than the specified size in kilobytes;
diff --git a/src/bin/pg_ctl/t/004_logrotate.pl b/src/bin/pg_ctl/t/004_logrotate.pl
index eacca1a652..1c1f89cb0a 100644
--- a/src/bin/pg_ctl/t/004_logrotate.pl
+++ b/src/bin/pg_ctl/t/004_logrotate.pl
@@ -69,6 +69,7 @@ log_destination = 'stderr, csvlog, jsonlog'
# these ensure stability of test results:
log_rotation_age = 0
lc_messages = 'C'
+max_log_size = 32
));
$node->start();
@@ -135,6 +136,20 @@ check_log_pattern('stderr', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('csvlog', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('jsonlog', $new_current_logfiles, 'syntax error', $node);
+$node->psql('postgres', 'INSERT INTO SOME_NON_EXISTANT_TABLE VALUES (TEST)');
+for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
+{
+ eval {
+ $current_logfiles = slurp_file($node->data_dir . '/current_logfiles');
+ };
+ last unless $@;
+ usleep(100_000);
+}
+die $@ if $@;
+check_log_pattern('stderr', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+check_log_pattern('csvlog', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+check_log_pattern('jsonlog', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+
$node->stop();
done_testing();
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index e54eca5b48..cafa8de3c0 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -502,6 +502,7 @@ extern PGDLLIMPORT int Log_destination;
extern PGDLLIMPORT char *Log_destination_string;
extern PGDLLIMPORT bool syslog_sequence_numbers;
extern PGDLLIMPORT bool syslog_split_messages;
+extern PGDLLIMPORT int max_log_size;
/* Log destination bitmap */
#define LOG_DESTINATION_STDERR 1
@@ -517,6 +518,9 @@ extern void DebugFileOpen(void);
extern char *unpack_sql_state(int sql_state);
extern bool in_error_recursion_trouble(void);
+extern bool need_truncate_query_log(const char* query);
+extern char* truncate_query_log(const char* query);
+
/* Common functions shared across destinations */
extern void reset_formatted_start_time(void);
extern char *get_formatted_start_time(void);
--
2.39.2 (Apple Git-143)
On 29.11.24 12:05, Kirill Gavrilov wrote:
Setting max_log_size to 0 already disables truncation.
I changed guc units to bytes, just to make it more human-friendly.
Did some refactoring and fixed all warnings. Set default value in
postgresql.conf.
Nice. The patch applies cleanly and the warnings are gone.
A few other things:
IMHO the documentation can be a bit clearer. It says to input the values
in bytes, but the default value is in megabytes.
+ If greater than zero, each query logged is truncated to this
many bytes.
+ Zero disables the setting.
+ Default value is 5MB.
Perhaps something like "If this value is specified without units, it is
taken as bytes." (see wal_keep_size)
And a pedantic note: This hint suggests that you can use TB as unit ..
postgres=# set max_log_size = '1foo';
ERROR: invalid value for parameter "max_log_size": "1foo"
HINT: Valid units for this parameter are "B", "kB", "MB", "GB", and "TB".
.. but it can never be used, as it exceeds the integer range:
postgres=# SET max_log_size = '1TB';
ERROR: invalid value for parameter "max_log_size": "1TB"
HINT: Value exceeds integer range.
Thanks
--
Jim
On Fri, Nov 29, 2024 at 10:12 PM Jim Jones <jim.jones@uni-muenster.de>
wrote:
On 29.11.24 12:05, Kirill Gavrilov wrote:
Setting max_log_size to 0 already disables truncation.
I changed guc units to bytes, just to make it more human-friendly.
Did some refactoring and fixed all warnings. Set default value in
postgresql.conf.Nice. The patch applies cleanly and the warnings are gone.
A few other things:
IMHO the documentation can be a bit clearer. It says to input the values
in bytes, but the default value is in megabytes.+ If greater than zero, each query logged is truncated to this many bytes. + Zero disables the setting. + Default value is 5MB.Perhaps something like "If this value is specified without units, it is
taken as bytes." (see wal_keep_size)
Added to documentation. I hope it's clear now.
And a pedantic note: This hint suggests that you can use TB as unit ..
postgres=# set max_log_size = '1foo';
ERROR: invalid value for parameter "max_log_size": "1foo"
HINT: Valid units for this parameter are "B", "kB", "MB", "GB", and "TB"... but it can never be used, as it exceeds the integer range:
postgres=# SET max_log_size = '1TB';
ERROR: invalid value for parameter "max_log_size": "1TB"
HINT: Value exceeds integer range.Thanks
--
Jim
Same thing applies to log_parameter_max_length, for example.
postgres=# set log_parameter_max_length = '1foo';
ERROR: invalid value for parameter "log_parameter_max_length": "1foo"
HINT: Valid units for this parameter are "B", "kB", "MB", "GB", and "TB".
postgres=# set log_parameter_max_length = '1TB';
ERROR: invalid value for parameter "log_parameter_max_length": "1TB"
HINT: Value exceeds integer range.
I think we can leave it as is.
Attachments:
V5-0001-parameter-max_log_size-to-truncate-logs.patchapplication/octet-stream; name=V5-0001-parameter-max_log_size-to-truncate-logs.patchDownload
From 15aa94e0cf2066e3135e6739d6fc957510568a58 Mon Sep 17 00:00:00 2001
From: diphantxm <diphantxm@yandex-team.ru>
Date: Fri, 29 Nov 2024 23:52:11 +0300
Subject: [PATCH] parameter max_log_size to truncate logs
There is no need to log the entire query, because it may be large and take lots of space on disk. Parameter max_log_size set the maximum length for logged query. Everything beyond that length is truncated. Value 0 disables the parameter.
---
doc/src/sgml/config.sgml | 16 +++++++
src/backend/tcop/postgres.c | 18 ++++++-
src/backend/utils/error/elog.c | 47 +++++++++++++++++++
src/backend/utils/misc/guc_tables.c | 12 +++++
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/pg_ctl/t/004_logrotate.pl | 15 ++++++
src/include/utils/elog.h | 4 ++
7 files changed, 112 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 76ab72db96..83b1b40b0f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7913,6 +7913,22 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-log-size" xreflabel="max_log_size">
+ <term><varname>max_log_size</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_log_size</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ If greater than zero, each query logged is truncated to this many bytes.
+ Zero disables the setting.
+ If this value is specified without units, it is taken as bytes.
+ This feature is disabled by default.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
<sect2 id="runtime-config-logging-csvlog">
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 4b985bd056..033de5862a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -70,6 +70,7 @@
#include "tcop/pquery.h"
#include "tcop/tcopprot.h"
#include "tcop/utility.h"
+#include "utils/elog.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
#include "utils/lsyscache.h"
@@ -1029,11 +1030,22 @@ exec_simple_query(const char *query_string)
bool was_logged = false;
bool use_implicit_block;
char msec_str[32];
+ char* truncated_query = NULL;
+ const char* query_log;
/*
* Report query to various monitoring facilities.
*/
debug_query_string = query_string;
+ if (need_truncate_query_log(query_string))
+ {
+ truncated_query = truncate_query_log(query_string);
+ query_log = truncated_query;
+ }
+ else
+ {
+ query_log = query_string;
+ }
pgstat_report_activity(STATE_RUNNING, query_string);
@@ -1078,7 +1090,7 @@ exec_simple_query(const char *query_string)
if (check_log_statement(parsetree_list))
{
ereport(LOG,
- (errmsg("statement: %s", query_string),
+ (errmsg("statement: %s", query_log),
errhidestmt(true),
errdetail_execute(parsetree_list)));
was_logged = true;
@@ -1377,7 +1389,7 @@ exec_simple_query(const char *query_string)
case 2:
ereport(LOG,
(errmsg("duration: %s ms statement: %s",
- msec_str, query_string),
+ msec_str, query_log),
errhidestmt(true),
errdetail_execute(parsetree_list)));
break;
@@ -1388,6 +1400,8 @@ exec_simple_query(const char *query_string)
TRACE_POSTGRESQL_QUERY_DONE(query_string);
+ if (truncated_query)
+ pfree(truncated_query);
debug_query_string = NULL;
}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 289059435a..8a1071677b 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -111,6 +111,7 @@ int Log_destination = LOG_DESTINATION_STDERR;
char *Log_destination_string = NULL;
bool syslog_sequence_numbers = true;
bool syslog_split_messages = true;
+int max_log_size = 0;
/* Processed form of backtrace_functions GUC */
static char *backtrace_function_list;
@@ -1688,11 +1689,18 @@ EmitErrorReport(void)
{
ErrorData *edata = &errordata[errordata_stack_depth];
MemoryContext oldcontext;
+ char* truncated_query = NULL;
recursion_depth++;
CHECK_STACK_DEPTH();
oldcontext = MemoryContextSwitchTo(edata->assoc_context);
+ if (need_truncate_query_log(debug_query_string))
+ {
+ truncated_query = truncate_query_log(debug_query_string);
+ debug_query_string = truncated_query;
+ }
+
/*
* Reset the formatted timestamp fields before emitting any logs. This
* includes all the log destinations and emit_log_hook, as the latter
@@ -1733,6 +1741,9 @@ EmitErrorReport(void)
MemoryContextSwitchTo(oldcontext);
recursion_depth--;
+
+ if (truncated_query)
+ pfree(truncated_query);
}
/*
@@ -3773,3 +3784,39 @@ write_stderr(const char *fmt,...)
#endif
va_end(ap);
}
+
+/*
+ * Apply truncation to build query that will be logged.
+ *
+ * If query needs to be truncated, copied will be set to true
+ * and returned string must be freed
+*/
+char*
+truncate_query_log(const char* query)
+{
+ size_t truncated_query_len;
+ char* truncatd_query;
+ size_t query_len;
+
+ if (!query)
+ return NULL;
+
+ query_len = strlen(query);
+ truncated_query_len = pg_mbcliplen(query, query_len, max_log_size);
+ truncatd_query = (char *) palloc(truncated_query_len+1);
+ memcpy(truncatd_query, query, truncated_query_len);
+ truncatd_query[truncated_query_len] = '\0';
+ return truncatd_query;
+}
+
+/*
+ * Checks if query should be truncated
+ * according to max_log_size
+*/
+bool
+need_truncate_query_log(const char* query)
+{
+ if (!query)
+ return false;
+ return !(max_log_size == 0 || strlen(query) < max_log_size);
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 9845abd693..8ad18f7a47 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3724,6 +3724,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_log_size", PGC_SUSET, LOGGING_WHAT,
+ gettext_noop("Sets max size in bytes of logged statement."),
+ NULL,
+ GUC_UNIT_BYTE
+ },
+ &max_log_size,
+ 0,
+ 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 407cd1e08c..de695e257d 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -617,6 +617,8 @@
# bind-parameter values to N bytes;
# -1 means print in full, 0 disables
#log_statement = 'none' # none, ddl, mod, all
+#max_log_size = 0 # max size of logged statement
+ # 0 disables the feature
#log_replication_commands = off
#log_temp_files = -1 # log temporary files equal or larger
# than the specified size in kilobytes;
diff --git a/src/bin/pg_ctl/t/004_logrotate.pl b/src/bin/pg_ctl/t/004_logrotate.pl
index eacca1a652..1c1f89cb0a 100644
--- a/src/bin/pg_ctl/t/004_logrotate.pl
+++ b/src/bin/pg_ctl/t/004_logrotate.pl
@@ -69,6 +69,7 @@ log_destination = 'stderr, csvlog, jsonlog'
# these ensure stability of test results:
log_rotation_age = 0
lc_messages = 'C'
+max_log_size = 32
));
$node->start();
@@ -135,6 +136,20 @@ check_log_pattern('stderr', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('csvlog', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('jsonlog', $new_current_logfiles, 'syntax error', $node);
+$node->psql('postgres', 'INSERT INTO SOME_NON_EXISTANT_TABLE VALUES (TEST)');
+for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
+{
+ eval {
+ $current_logfiles = slurp_file($node->data_dir . '/current_logfiles');
+ };
+ last unless $@;
+ usleep(100_000);
+}
+die $@ if $@;
+check_log_pattern('stderr', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+check_log_pattern('csvlog', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+check_log_pattern('jsonlog', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+
$node->stop();
done_testing();
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index e54eca5b48..cafa8de3c0 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -502,6 +502,7 @@ extern PGDLLIMPORT int Log_destination;
extern PGDLLIMPORT char *Log_destination_string;
extern PGDLLIMPORT bool syslog_sequence_numbers;
extern PGDLLIMPORT bool syslog_split_messages;
+extern PGDLLIMPORT int max_log_size;
/* Log destination bitmap */
#define LOG_DESTINATION_STDERR 1
@@ -517,6 +518,9 @@ extern void DebugFileOpen(void);
extern char *unpack_sql_state(int sql_state);
extern bool in_error_recursion_trouble(void);
+extern bool need_truncate_query_log(const char* query);
+extern char* truncate_query_log(const char* query);
+
/* Common functions shared across destinations */
extern void reset_formatted_start_time(void);
extern char *get_formatted_start_time(void);
--
2.39.2 (Apple Git-143)
On Sat, 30 Nov 2024 at 01:57, Kirill Gavrilov <diphantxm@gmail.com> wrote:
On Fri, Nov 29, 2024 at 10:12 PM Jim Jones <jim.jones@uni-muenster.de> wrote:
On 29.11.24 12:05, Kirill Gavrilov wrote:
Setting max_log_size to 0 already disables truncation.
I changed guc units to bytes, just to make it more human-friendly.
Did some refactoring and fixed all warnings. Set default value in
postgresql.conf.Nice. The patch applies cleanly and the warnings are gone.
A few other things:
IMHO the documentation can be a bit clearer. It says to input the values
in bytes, but the default value is in megabytes.+ If greater than zero, each query logged is truncated to this many bytes. + Zero disables the setting. + Default value is 5MB.Perhaps something like "If this value is specified without units, it is
taken as bytes." (see wal_keep_size)Added to documentation. I hope it's clear now.
And a pedantic note: This hint suggests that you can use TB as unit ..
postgres=# set max_log_size = '1foo';
ERROR: invalid value for parameter "max_log_size": "1foo"
HINT: Valid units for this parameter are "B", "kB", "MB", "GB", and "TB"... but it can never be used, as it exceeds the integer range:
postgres=# SET max_log_size = '1TB';
ERROR: invalid value for parameter "max_log_size": "1TB"
HINT: Value exceeds integer range.Thanks
--
JimSame thing applies to log_parameter_max_length, for example.
postgres=# set log_parameter_max_length = '1foo';
ERROR: invalid value for parameter "log_parameter_max_length": "1foo"
HINT: Valid units for this parameter are "B", "kB", "MB", "GB", and "TB".
postgres=# set log_parameter_max_length = '1TB';
ERROR: invalid value for parameter "log_parameter_max_length": "1TB"
HINT: Value exceeds integer range.I think we can leave it as is.
Hi
+for (my $attempts = 0; $attempts < $max_attempts; $attempts++) +{ + eval { + $current_logfiles = slurp_file($node->data_dir . '/current_logfiles'); + }; + last unless $@; + usleep(100_000); +}
`usleep` in tap tests is usually a bad pattern. Do we have a chance to
test this using `wait_for_log` or similar?
--
Best regards,
Kirill Reshke
Hi
+for (my $attempts = 0; $attempts < $max_attempts; $attempts++) +{ + eval { + $current_logfiles = slurp_file($node->data_dir . '/current_logfiles'); + }; + last unless $@; + usleep(100_000); +}`usleep` in tap tests is usually a bad pattern. Do we have a chance to
test this using `wait_for_log` or similar?
I'm not sure we can use `wait_for_log` because it checks for only one
logfile. But even if it's possible, I don't think it's a good idea to use
different checks in the same file or to change tests for another feature. I
used test case from above as an example for mine.
On 29.11.24 21:57, Kirill Gavrilov wrote:
Same thing applies to log_parameter_max_length, for example.
postgres=# set log_parameter_max_length = '1foo';
ERROR: invalid value for parameter "log_parameter_max_length": "1foo"
HINT: Valid units for this parameter are "B", "kB", "MB", "GB", and "TB".
postgres=# set log_parameter_max_length = '1TB';
ERROR: invalid value for parameter "log_parameter_max_length": "1TB"
HINT: Value exceeds integer range.I think we can leave it as is.
I see. So I guess it is out of scope to change this message here.
Small nitpicks:
1) The indentation of the comment at postgresql.conf.sample is a little
bit off
#max_log_size = 0 # max size of logged statement
# 0 disables the feature
IMHO it looks better like this:
#max_log_size = 0 # max size of logged statement
# 0 disables the feature
2) You introduced a trailing whitespace at L34 (Not critical :))
+ Zero disables the setting.
It happens to me all the time, so I usually try to apply my patches in a
clean branch just to make sure I didn't miss anything.
Other than that, I have nothing more to add at this point.
Thanks
--
Jim
On Mon, Dec 2, 2024 at 1:09 PM Jim Jones <jim.jones@uni-muenster.de> wrote:
On 29.11.24 21:57, Kirill Gavrilov wrote:
Same thing applies to log_parameter_max_length, for example.
postgres=# set log_parameter_max_length = '1foo';
ERROR: invalid value for parameter "log_parameter_max_length": "1foo"
HINT: Valid units for this parameter are "B", "kB", "MB", "GB", and"TB".
postgres=# set log_parameter_max_length = '1TB';
ERROR: invalid value for parameter "log_parameter_max_length": "1TB"
HINT: Value exceeds integer range.I think we can leave it as is.
I see. So I guess it is out of scope to change this message here.
Small nitpicks:
1) The indentation of the comment at postgresql.conf.sample is a little
bit off#max_log_size = 0 # max size of logged statement
# 0 disables the featureIMHO it looks better like this:
#max_log_size = 0 # max size of logged statement
# 0 disables the feature2) You introduced a trailing whitespace at L34 (Not critical :))
+ Zero disables the setting.
It happens to me all the time, so I usually try to apply my patches in a
clean branch just to make sure I didn't miss anything.Other than that, I have nothing more to add at this point.
Thanks
--
Jim
Sorry for the long silence. I fixed the indentation and a trailing
whitespace. Should look fine now.
Attachments:
V6-0001-parameter-max_log_size-to-truncate-logs.patchapplication/octet-stream; name=V6-0001-parameter-max_log_size-to-truncate-logs.patchDownload
From a020529b6ac33393e9992e2118762f2f39762d9b Mon Sep 17 00:00:00 2001
From: diphantxm <diphantxm@yandex-team.ru>
Date: Fri, 31 Jan 2025 13:31:50 +0300
Subject: [PATCH] parameter max_log_size to truncate logs
There is no need to log the entire query, because it may be large and take lots of space on disk. Parameter max_log_size set the maximum length for logged query. Everything beyond that length is truncated. Value 0 disables the parameter.
---
doc/src/sgml/config.sgml | 16 +++++++
src/backend/tcop/postgres.c | 18 ++++++-
src/backend/utils/error/elog.c | 47 +++++++++++++++++++
src/backend/utils/misc/guc_tables.c | 12 +++++
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/pg_ctl/t/004_logrotate.pl | 15 ++++++
src/include/utils/elog.h | 4 ++
7 files changed, 112 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 76ab72db96..86dcebfe36 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7913,6 +7913,22 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-log-size" xreflabel="max_log_size">
+ <term><varname>max_log_size</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_log_size</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ If greater than zero, each query logged is truncated to this many bytes.
+ Zero disables the setting.
+ If this value is specified without units, it is taken as bytes.
+ This feature is disabled by default.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
<sect2 id="runtime-config-logging-csvlog">
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 4b985bd056..033de5862a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -70,6 +70,7 @@
#include "tcop/pquery.h"
#include "tcop/tcopprot.h"
#include "tcop/utility.h"
+#include "utils/elog.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
#include "utils/lsyscache.h"
@@ -1029,11 +1030,22 @@ exec_simple_query(const char *query_string)
bool was_logged = false;
bool use_implicit_block;
char msec_str[32];
+ char* truncated_query = NULL;
+ const char* query_log;
/*
* Report query to various monitoring facilities.
*/
debug_query_string = query_string;
+ if (need_truncate_query_log(query_string))
+ {
+ truncated_query = truncate_query_log(query_string);
+ query_log = truncated_query;
+ }
+ else
+ {
+ query_log = query_string;
+ }
pgstat_report_activity(STATE_RUNNING, query_string);
@@ -1078,7 +1090,7 @@ exec_simple_query(const char *query_string)
if (check_log_statement(parsetree_list))
{
ereport(LOG,
- (errmsg("statement: %s", query_string),
+ (errmsg("statement: %s", query_log),
errhidestmt(true),
errdetail_execute(parsetree_list)));
was_logged = true;
@@ -1377,7 +1389,7 @@ exec_simple_query(const char *query_string)
case 2:
ereport(LOG,
(errmsg("duration: %s ms statement: %s",
- msec_str, query_string),
+ msec_str, query_log),
errhidestmt(true),
errdetail_execute(parsetree_list)));
break;
@@ -1388,6 +1400,8 @@ exec_simple_query(const char *query_string)
TRACE_POSTGRESQL_QUERY_DONE(query_string);
+ if (truncated_query)
+ pfree(truncated_query);
debug_query_string = NULL;
}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 289059435a..8a1071677b 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -111,6 +111,7 @@ int Log_destination = LOG_DESTINATION_STDERR;
char *Log_destination_string = NULL;
bool syslog_sequence_numbers = true;
bool syslog_split_messages = true;
+int max_log_size = 0;
/* Processed form of backtrace_functions GUC */
static char *backtrace_function_list;
@@ -1688,11 +1689,18 @@ EmitErrorReport(void)
{
ErrorData *edata = &errordata[errordata_stack_depth];
MemoryContext oldcontext;
+ char* truncated_query = NULL;
recursion_depth++;
CHECK_STACK_DEPTH();
oldcontext = MemoryContextSwitchTo(edata->assoc_context);
+ if (need_truncate_query_log(debug_query_string))
+ {
+ truncated_query = truncate_query_log(debug_query_string);
+ debug_query_string = truncated_query;
+ }
+
/*
* Reset the formatted timestamp fields before emitting any logs. This
* includes all the log destinations and emit_log_hook, as the latter
@@ -1733,6 +1741,9 @@ EmitErrorReport(void)
MemoryContextSwitchTo(oldcontext);
recursion_depth--;
+
+ if (truncated_query)
+ pfree(truncated_query);
}
/*
@@ -3773,3 +3784,39 @@ write_stderr(const char *fmt,...)
#endif
va_end(ap);
}
+
+/*
+ * Apply truncation to build query that will be logged.
+ *
+ * If query needs to be truncated, copied will be set to true
+ * and returned string must be freed
+*/
+char*
+truncate_query_log(const char* query)
+{
+ size_t truncated_query_len;
+ char* truncatd_query;
+ size_t query_len;
+
+ if (!query)
+ return NULL;
+
+ query_len = strlen(query);
+ truncated_query_len = pg_mbcliplen(query, query_len, max_log_size);
+ truncatd_query = (char *) palloc(truncated_query_len+1);
+ memcpy(truncatd_query, query, truncated_query_len);
+ truncatd_query[truncated_query_len] = '\0';
+ return truncatd_query;
+}
+
+/*
+ * Checks if query should be truncated
+ * according to max_log_size
+*/
+bool
+need_truncate_query_log(const char* query)
+{
+ if (!query)
+ return false;
+ return !(max_log_size == 0 || strlen(query) < max_log_size);
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 9845abd693..8ad18f7a47 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3724,6 +3724,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_log_size", PGC_SUSET, LOGGING_WHAT,
+ gettext_noop("Sets max size in bytes of logged statement."),
+ NULL,
+ GUC_UNIT_BYTE
+ },
+ &max_log_size,
+ 0,
+ 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 407cd1e08c..80a45341df 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -617,6 +617,8 @@
# bind-parameter values to N bytes;
# -1 means print in full, 0 disables
#log_statement = 'none' # none, ddl, mod, all
+#max_log_size = 0 # max size of logged statement
+ # 0 disables the feature
#log_replication_commands = off
#log_temp_files = -1 # log temporary files equal or larger
# than the specified size in kilobytes;
diff --git a/src/bin/pg_ctl/t/004_logrotate.pl b/src/bin/pg_ctl/t/004_logrotate.pl
index eacca1a652..1c1f89cb0a 100644
--- a/src/bin/pg_ctl/t/004_logrotate.pl
+++ b/src/bin/pg_ctl/t/004_logrotate.pl
@@ -69,6 +69,7 @@ log_destination = 'stderr, csvlog, jsonlog'
# these ensure stability of test results:
log_rotation_age = 0
lc_messages = 'C'
+max_log_size = 32
));
$node->start();
@@ -135,6 +136,20 @@ check_log_pattern('stderr', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('csvlog', $new_current_logfiles, 'syntax error', $node);
check_log_pattern('jsonlog', $new_current_logfiles, 'syntax error', $node);
+$node->psql('postgres', 'INSERT INTO SOME_NON_EXISTANT_TABLE VALUES (TEST)');
+for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
+{
+ eval {
+ $current_logfiles = slurp_file($node->data_dir . '/current_logfiles');
+ };
+ last unless $@;
+ usleep(100_000);
+}
+die $@ if $@;
+check_log_pattern('stderr', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+check_log_pattern('csvlog', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+check_log_pattern('jsonlog', $current_logfiles, 'INSERT INTO SOME_NON_EXISTANT_TA(?!(BLE VALUES \(TEST\)))', $node);
+
$node->stop();
done_testing();
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index e54eca5b48..cafa8de3c0 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -502,6 +502,7 @@ extern PGDLLIMPORT int Log_destination;
extern PGDLLIMPORT char *Log_destination_string;
extern PGDLLIMPORT bool syslog_sequence_numbers;
extern PGDLLIMPORT bool syslog_split_messages;
+extern PGDLLIMPORT int max_log_size;
/* Log destination bitmap */
#define LOG_DESTINATION_STDERR 1
@@ -517,6 +518,9 @@ extern void DebugFileOpen(void);
extern char *unpack_sql_state(int sql_state);
extern bool in_error_recursion_trouble(void);
+extern bool need_truncate_query_log(const char* query);
+extern char* truncate_query_log(const char* query);
+
/* Common functions shared across destinations */
extern void reset_formatted_start_time(void);
extern char *get_formatted_start_time(void);
--
2.39.2 (Apple Git-143)
Hi Kirill
On 31.01.25 11:46, Kirill Gavrilov wrote:
Sorry for the long silence. I fixed the indentation and a trailing
whitespace. Should look fine now.
The patch applies cleanly, the documentation is clear, and all tests pass.
It is possible to change this new parameter session-wise, which is nice!
postgres=# SET max_log_size TO 7;
SET
postgres=# SHOW max_log_size;
max_log_size
--------------
7B
(1 row)
The default value now is clear and it corresponds to the value set on
postgresql.conf:
#max_log_size = 0 # max size of logged statement
postgres=# SHOW max_log_size;
max_log_size
--------------
0
(1 row)
Logs are truncated as expected:
postgres=# SET max_log_size TO 6;
SET
postgres=# SELECT length('CALL xyz;');
length
--------
9
(1 row)
postgres=# CALL xyz;
ERROR: syntax error at or near ";"
LINE 1: CALL xyz;
log entry:
2025-02-03 10:58:19.975 CET [123945] ERROR: syntax error at or near ";"
at character 9
2025-02-03 10:58:19.975 CET [123945] STATEMENT: CALL x
The issue with log entry sizes for queries containing special characters
was resolved by setting the unit to bytes.
Overall, everythingLGTM.
The new status of this patch is: Ready for Committer
Jim
On 2025/02/03 19:31, Jim Jones wrote:
Hi Kirill
On 31.01.25 11:46, Kirill Gavrilov wrote:
Sorry for the long silence. I fixed the indentation and a trailing
whitespace. Should look fine now.The patch applies cleanly, the documentation is clear, and all tests pass.
It is possible to change this new parameter session-wise, which is nice!
postgres=# SET max_log_size TO 7;
SET
postgres=# SHOW max_log_size;
max_log_size
--------------
7B
(1 row)The default value now is clear and it corresponds to the value set on
postgresql.conf:#max_log_size = 0 # max size of logged statement
postgres=# SHOW max_log_size;
max_log_size
--------------
0
(1 row)Logs are truncated as expected:
postgres=# SET max_log_size TO 6;
SET
postgres=# SELECT length('CALL xyz;');
length
--------
9
(1 row)postgres=# CALL xyz;
ERROR: syntax error at or near ";"
LINE 1: CALL xyz;log entry:
2025-02-03 10:58:19.975 CET [123945] ERROR: syntax error at or near ";"
at character 9
2025-02-03 10:58:19.975 CET [123945] STATEMENT: CALL xThe issue with log entry sizes for queries containing special characters
was resolved by setting the unit to bytes.Overall, everythingLGTM.
The new status of this patch is: Ready for Committer
Since this patch is marked as ready-for-committer, I started reviewing it.
Basically I like the proposed idea.
When I set log_statement to 'all', max_log_size to 3, and ran "SELECT 1/0",
only the first three bytes of the query were logged by log_statement.
However, no query statement appeared under STATEMENT, as shown below.
Is this a bug?
--------------------------
=# SET log_statement TO 'all';
=# SET max_log_size TO 3;
=# SELECT 1/0;
LOG: statement: SEL
ERROR: division by zero
STATEMENT:
--------------------------
When log_replication_commands is enabled, replication commands are logged.
Should max_log_size apply to these logs as well to prevent excessively
large commands from being logged in full?
The parameter name max_log_size seems misleading. It sounds like
it controls the maximum log file size. Would a name like
log_statement_max_length be clearer?
The functions like exec_parse_message(), exec_bind_message(),
and exec_execute_message() may log query statements (e.g., via
log_min_duration_statement), but the patch doesn't seem to
update them to consider max_log_size.
Queries can also be logged in the CONTEXT line, such as when running
"DO $$ BEGIN SELECT 1/0; END $$;", but max_log_size doesn't seem to
apply in this case.
There might be other cases where queries are logged, but the patch
doesn't handle them. I'm not sure we can identify and address all of
them before the feature freeze.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION