Fix log_line_prefix to display the transaction id (%x) for statements not in a transaction block
Implement TODO item:
Fix log_line_prefix to display the transaction id (%x) for statements
not in a transaction block
Currently it displays zero.
Check that the XID has been assigned at the location where the statement
log is now printed. If not, no statement log is output.
And then before finish_xact_command. If the statement has not been
output to the log. Here the log can get XID.
DML that does not manipulate any data still does not get XID.
[32718]: [0] LOG: statement: delete from t1;
[32718]: [0] LOG: statement: delete from t1;
[32718]: [0] LOG: statement: delete from t1;
--
Quan Zongliang
Attachments:
postgres.patchtext/plain; charset=UTF-8; name=postgres.patchDownload
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 21b9763183..b73676ab9d 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1063,8 +1063,9 @@ exec_simple_query(const char *query_string)
*/
parsetree_list = pg_parse_query(query_string);
- /* Log immediately if dictated by log_statement */
- if (check_log_statement(parsetree_list))
+ /* Log immediately if dictated by log_statement and XID assigned. */
+ if (GetTopTransactionIdIfAny() != InvalidTransactionId &&
+ check_log_statement(parsetree_list))
{
ereport(LOG,
(errmsg("statement: %s", query_string),
@@ -1281,6 +1282,16 @@ exec_simple_query(const char *query_string)
PortalDrop(portal, false);
+ /* Log if dictated by log_statement and has not been logged. */
+ if (!was_logged && check_log_statement(parsetree_list))
+ {
+ ereport(LOG,
+ (errmsg("statement: %s", query_string),
+ errhidestmt(true),
+ errdetail_execute(parsetree_list)));
+ was_logged = true;
+ }
+
if (lnext(parsetree_list, parsetree_item) == NULL)
{
/*
Hi.
+ /* Log immediately if dictated by log_statement and XID assigned. */
+ if (GetTopTransactionIdIfAny() != InvalidTransactionId &&
+ check_log_statement(parsetree_list))
change to
+ /* Log immediately if dictated by log_statement and XID assigned. */
+ if ( check_log_statement(parsetree_list) &&
+ GetTopTransactionIdIfAny() != InvalidTransactionId)
I think it would reduce GetTopTransactionIdIfAny() calls.
I guess people will have different opinion that
simple query like:
`explain(analyze) select g from generate_series(1,1e6) g, pg_sleep(10);`
The log output will only be generated after 10 seconds.
of course, there is pg_stat_activity and other ways to view the running query.
playing around with the patch.
The patch is better than the current HEAD, in some cases.
both under condition:
alter system set log_line_prefix = '%m [%p] %q%u@%d/%a XID:%x ';
set log_statement = 'all';
select pg_reload_conf();
With Patch:
src3=# create table x1(a int);
2024-01-11 17:11:51.150 CST [115782] jian@src3/psql XID:814 LOG:
statement: create table x1(a int);
CREATE TABLE
src3=#
src3=# insert into x1 select 100;
2024-01-11 17:12:06.953 CST [115782] jian@src3/psql XID:815 LOG:
statement: insert into x1 select 100;
INSERT 0 1
src3=# begin;
2024-01-11 17:12:17.543 CST [115782] jian@src3/psql XID:0 LOG:
statement: begin;
BEGIN
src3=*# insert into x1 select 100;
2024-01-11 17:12:24.779 CST [115782] jian@src3/psql XID:816 LOG:
statement: insert into x1 select 100;
INSERT 0 1
src3=*# commit;
2024-01-11 17:12:29.851 CST [115782] jian@src3/psql XID:816 LOG:
statement: commit;
COMMIT
src3=# select 11;
2024-01-11 17:14:22.909 CST [115782] jian@src3/psql XID:0 LOG:
statement: select 11;
?column?
----------
11
(1 row)
src3=# drop table x1;
2024-01-11 17:15:01.409 CST [115782] jian@src3/psql XID:817 LOG:
statement: drop table x1;
DROP TABLE
src3=# select pg_current_xact_id();
2024-01-11 17:21:55.602 CST [115782] jian@src3/psql XID:818 LOG:
statement: select pg_current_xact_id();
pg_current_xact_id
--------------------
818
(1 row)
---------------------------------------------------------------------------------
without patch:
src4=# insert into x1 select 100;
2024-01-11 17:07:13.556 CST [115240] jian@src4/psql XID:0 LOG:
statement: insert into x1 select 100;
INSERT 0 1
src4=# begin;
2024-01-11 17:07:31.345 CST [115240] jian@src4/psql XID:0 LOG:
statement: begin;
BEGIN
src4=*# insert into x1 select 100;
2024-01-11 17:07:35.475 CST [115240] jian@src4/psql XID:0 LOG:
statement: insert into x1 select 100;
INSERT 0 1
src4=*# commit;
2024-01-11 17:07:39.095 CST [115240] jian@src4/psql XID:863 LOG:
statement: commit;
COMMIT
src4=# show logging_collector;
2024-01-11 17:09:59.307 CST [115240] jian@src4/psql XID:0 LOG:
statement: show logging_collector;
logging_collector
-------------------
off
(1 row)
src4=# select 11;
2024-01-11 17:14:30.001 CST [115240] jian@src4/psql XID:0 LOG:
statement: select 11;
?column?
----------
11
(1 row)
src4=# drop table x1;
2024-01-11 17:15:08.010 CST [115240] jian@src4/psql XID:0 LOG:
statement: drop table x1;
DROP TABLE
src4=# select pg_current_xact_id();
2024-01-11 17:21:22.085 CST [115240] jian@src4/psql XID:0 LOG:
statement: select pg_current_xact_id();
pg_current_xact_id
--------------------
867
(1 row)
Hi
another big difference compare to HEAD:
select name,setting
from pg_settings
where name in
('auto_explain.log_timing','auto_explain.log_analyze',
'auto_explain.log_min_duration','log_statement','log_line_prefix')
;
name | setting
-------------------------------+----------------------------
auto_explain.log_analyze | on
auto_explain.log_min_duration | 0
auto_explain.log_timing | on
log_line_prefix | %m [%p] %q%u@%d/%a XID:%x
log_statement | all
simple query: explain(analyze, costs off) select 1 from pg_sleep(10);
with patch:
src3=# explain(analyze, costs off) select 1 from pg_sleep(10);
2024-01-12 08:43:14.750 CST [5739] jian@src3/psql XID:0 LOG:
duration: 10010.167 ms plan:
Query Text: explain(analyze, costs off) select 1 from pg_sleep(10);
Function Scan on pg_sleep (cost=0.00..0.01 rows=1 width=4)
(actual time=10010.155..10010.159 rows=1 loops=1)
2024-01-12 08:43:14.750 CST [5739] jian@src3/psql XID:0 LOG:
statement: explain(analyze, costs off) select 1 from pg_sleep(10);
QUERY PLAN
-----------------------------------------------------------------------------
Function Scan on pg_sleep (actual time=10010.155..10010.159 rows=1 loops=1)
Planning Time: 0.115 ms
Execution Time: 10010.227 ms
(3 rows)
without patch:
src4=#
src4=# explain(analyze, costs off) select 1 from pg_sleep(10);
2024-01-12 08:43:13.462 CST [5869] jian@src4/psql XID:0 LOG:
statement: explain(analyze, costs off) select 1 from pg_sleep(10);
2024-01-12 08:43:23.473 CST [5869] jian@src4/psql XID:0 LOG:
duration: 10010.133 ms plan:
Query Text: explain(analyze, costs off) select 1 from pg_sleep(10);
Function Scan on pg_sleep (cost=0.00..0.01 rows=1 width=4)
(actual time=10010.126..10010.128 rows=1 loops=1)
QUERY PLAN
-----------------------------------------------------------------------------
Function Scan on pg_sleep (actual time=10010.126..10010.128 rows=1 loops=1)
Planning Time: 0.031 ms
Execution Time: 10010.172 ms
(3 rows)
A little tweak to the code.
GetTopTransactionIdIfAny() != InvalidTransactionId
changed to
TransactionIdIsValid(GetTopTransactionIdIfAny()
On 2024/1/12 08:51, jian he wrote:
Hi
...
with patch:
src3=# explain(analyze, costs off) select 1 from pg_sleep(10);
2024-01-12 08:43:14.750 CST [5739] jian@src3/psql XID:0 LOG:
duration: 10010.167 ms plan:
Query Text: explain(analyze, costs off) select 1 from pg_sleep(10);
Function Scan on pg_sleep (cost=0.00..0.01 rows=1 width=4)
(actual time=10010.155..10010.159 rows=1 loops=1)
2024-01-12 08:43:14.750 CST [5739] jian@src3/psql XID:0 LOG:
statement: explain(analyze, costs off) select 1 from pg_sleep(10);
QUERY PLAN
-----------------------------------------------------------------------------
Function Scan on pg_sleep (actual time=10010.155..10010.159 rows=1 loops=1)
Planning Time: 0.115 ms
Execution Time: 10010.227 ms
(3 rows)
This problem does exist in a statement that takes a long time to run.
XID is applied only for the first change tuple. If the user want to see
it in a single statement log, they have to wait until the statement has
finished executing. And we don't know how long it will take until the
statement ends. It is not appropriate to output the log twice because of
xid. Besides, without parsing log_line_prefix we don't know if the user
wants to see xid.
Attachments:
postgres-v2.patchtext/plain; charset=UTF-8; name=postgres-v2.patchDownload
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 1a34bd3715..bd08b64450 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1064,8 +1064,9 @@ exec_simple_query(const char *query_string)
*/
parsetree_list = pg_parse_query(query_string);
- /* Log immediately if dictated by log_statement */
- if (check_log_statement(parsetree_list))
+ /* Log immediately if dictated by log_statement and XID assigned. */
+ if (TransactionIdIsValid(GetTopTransactionIdIfAny()) &&
+ check_log_statement(parsetree_list))
{
ereport(LOG,
(errmsg("statement: %s", query_string),
@@ -1282,6 +1283,16 @@ exec_simple_query(const char *query_string)
PortalDrop(portal, false);
+ /* Log if dictated by log_statement and has not been logged. */
+ if (!was_logged && check_log_statement(parsetree_list))
+ {
+ ereport(LOG,
+ (errmsg("statement: %s", query_string),
+ errhidestmt(true),
+ errdetail_execute(parsetree_list)));
+ was_logged = true;
+ }
+
if (lnext(parsetree_list, parsetree_item) == NULL)
{
/*
On 12 Jan 2024, at 05:51, jian he <jian.universality@gmail.com> wrote:
another big difference compare to HEAD:
Hi Jian,
thanks for looking into this. Would you be willing to review the next version of the patch?
As long as you are looking into this, you might be interested in registering yourself in a CF entry as a reviewer. [0]https://commitfest.postgresql.org/47/4606/
Thanks!
Best regards, Andrey Borodin.
On Mon, Mar 4, 2024 at 2:12 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
On 12 Jan 2024, at 05:51, jian he <jian.universality@gmail.com> wrote:
another big difference compare to HEAD:
Hi Jian,
thanks for looking into this. Would you be willing to review the next version of the patch?
I just applied postgres-v2.patch.
Previously, I just mentioned, there is a big difference between to HEAD.
I didn't mention it explicitly at [1]/messages/by-id/CACJufxHg1oir8sd=xScMP3n+tYcbug=zusG5KiA2KzH5PmOeuQ@mail.gmail.com
postgres-v2.patch still has the problem.
it's not related to XID, it's the misbehavior of the patch.
When we load the 'auto_explain' module, we can log two things, the
query plan and statement.
After applying the patch, the outputs of the query plan and the
statement will be mixed together, which
obviously is far from the desired behavior.
Maybe we can tolerate LOG, first output the query plan then statement.
As long as you are looking into this, you might be interested in registering yourself in a CF entry as a reviewer. [0]
sure.
[1]: /messages/by-id/CACJufxHg1oir8sd=xScMP3n+tYcbug=zusG5KiA2KzH5PmOeuQ@mail.gmail.com
On 2024/3/4 15:48, jian he wrote:
Maybe we can tolerate LOG, first output the query plan then statement.
It is more appropriate to let the extension solve its own problems.
Of course, this change is not easy to implement.
Due to the way XID is assigned, there seems to be no good solution at
the moment.
On 2024/3/11 09:25, Quan Zongliang wrote:
On 2024/3/4 15:48, jian he wrote:
Maybe we can tolerate LOG, first output the query plan then statement.
It is more appropriate to let the extension solve its own problems.
Of course, this change is not easy to implement.
Due to the way XID is assigned, there seems to be no good solution at
the moment.
According to the discussion with Jian He. Use the guc hook to check if
the xid needs to be output. If needed, the statement log can be delayed
to be output.
Attachments:
postgres-v3.patchtext/plain; charset=UTF-8; name=postgres-v3.patchDownload
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 76f48b13d2..217910c9de 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1068,11 +1068,14 @@ exec_simple_query(const char *query_string)
/* Log immediately if dictated by log_statement */
if (check_log_statement(parsetree_list))
{
- ereport(LOG,
- (errmsg("statement: %s", query_string),
- errhidestmt(true),
- errdetail_execute(parsetree_list)));
- was_logged = true;
+ if (!Log_has_xid || TransactionIdIsValid(GetTopTransactionIdIfAny()))
+ {
+ ereport(LOG,
+ (errmsg("statement: %s", query_string),
+ errhidestmt(true),
+ errdetail_execute(parsetree_list)));
+ was_logged = true;
+ }
}
/*
@@ -1283,6 +1286,16 @@ exec_simple_query(const char *query_string)
PortalDrop(portal, false);
+ /* Log if dictated by log_statement and has not been logged. */
+ if (!was_logged && check_log_statement(parsetree_list))
+ {
+ ereport(LOG,
+ (errmsg("statement: %s", query_string),
+ errhidestmt(true),
+ errdetail_execute(parsetree_list)));
+ was_logged = true;
+ }
+
if (lnext(parsetree_list, parsetree_item) == NULL)
{
/*
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 605ff3b045..e7dd9f5027 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -79,6 +79,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
+#include "utils/guc.h"
#include "utils/guc_hooks.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
@@ -114,6 +115,9 @@ char *Log_destination_string = NULL;
bool syslog_sequence_numbers = true;
bool syslog_split_messages = true;
+/* Whether the transaction id will appear in the log messages. */
+bool Log_has_xid = false;
+
/* Processed form of backtrace_functions GUC */
static char *backtrace_function_list;
@@ -286,6 +290,50 @@ message_level_is_interesting(int elevel)
}
+/*
+ * check_log_line_prefix: GUC check_hook for Log_line_prefix
+ */
+bool check_log_line_prefix(char **newval, void **extra, GucSource source)
+{
+ const char *p;
+
+ if ((*newval) == NULL)
+ {
+ Log_has_xid = false;
+ return true;
+ }
+
+ Log_has_xid = false;
+
+ for (p = (*newval); *p != '\0'; p++)
+ {
+ if (*p != '%')
+ {
+ /* literal char, just skip */
+ continue;
+ }
+
+ /* must be a '%', so skip to the next char */
+ p++;
+ if (*p == '\0')
+ break; /* format error - ignore it */
+ else if (*p == '%')
+ {
+ /* string contains %% */
+ continue;
+ }
+
+ /* process the option */
+ if (*p == 'x')
+ {
+ Log_has_xid = true;
+ break;
+ }
+ }
+
+ return true;
+}
+
/*
* in_error_recursion_trouble --- are we at risk of infinite error recursion?
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c68fdc008b..96948e5005 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -4103,7 +4103,7 @@ struct config_string ConfigureNamesString[] =
},
&Log_line_prefix,
"%m [%p] ",
- NULL, NULL, NULL
+ check_log_line_prefix, NULL, NULL
},
{
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 054dd2bf62..9536146ce3 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -502,6 +502,8 @@ extern PGDLLIMPORT char *Log_destination_string;
extern PGDLLIMPORT bool syslog_sequence_numbers;
extern PGDLLIMPORT bool syslog_split_messages;
+extern PGDLLIMPORT bool Log_has_xid;
+
/* Log destination bitmap */
#define LOG_DESTINATION_STDERR 1
#define LOG_DESTINATION_SYSLOG 2
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index d64dc5fcdb..41f8ba3f16 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -78,6 +78,7 @@ extern bool check_log_destination(char **newval, void **extra,
extern void assign_log_destination(const char *newval, void *extra);
extern const char *show_log_file_mode(void);
extern bool check_log_stats(bool *newval, void **extra, GucSource source);
+extern bool check_log_line_prefix(char **newval, void **extra, GucSource source);
extern bool check_log_timezone(char **newval, void **extra, GucSource source);
extern void assign_log_timezone(const char *newval, void *extra);
extern const char *show_log_timezone(void);
On Tue, Apr 16, 2024 at 3:16 AM Quan Zongliang <quanzongliang@yeah.net> wrote:
According to the discussion with Jian He. Use the guc hook to check if
the xid needs to be output. If needed, the statement log can be delayed
to be output.
I appreciate the work that both of you have put into this, but I think
we should reject this patch and remove the TODO item. We currently
have some facilities (like log_statement) that log the statement
before parsing it or executing it, and others (like
log_min_duration_statement) that log it afterward. That is probably
not documented as clearly as it should be, but it's long-established
behavior.
What this patch does is change the behavior of log_statement so that
log_statement sometimes logs the statement before it's executed, and
sometimes after the statement. I think that's going to be confusing
and unhelpful. In particular, right now you can assume that if you set
log_statement=all and there's a statement running, it's already been
logged. With this change, that would sometimes be true and sometimes
false.
For example, suppose that at 9am sharp, I run an UPDATE command that
takes ten seconds to complete. Right now, the log_statement message
will appear at 9am. With this change, it will run at 9am if I do it
inside a transaction block that has an XID already, and at 9:00:10am
if I do it in a transaction block that does not yet have an XID, or if
I do it outside of a transaction. I don't think the benefit of getting
the XID in the log message is nearly enough to justify such a strange
behavior.
--
Robert Haas
EDB: http://www.enterprisedb.com
On 2024/5/16 00:58, Robert Haas wrote:
On Tue, Apr 16, 2024 at 3:16 AM Quan Zongliang <quanzongliang@yeah.net> wrote:
According to the discussion with Jian He. Use the guc hook to check if
the xid needs to be output. If needed, the statement log can be delayed
to be output.I appreciate the work that both of you have put into this, but I think
we should reject this patch and remove the TODO item. We currently
have some facilities (like log_statement) that log the statement
before parsing it or executing it, and others (like
log_min_duration_statement) that log it afterward. That is probably
not documented as clearly as it should be, but it's long-established
behavior.What this patch does is change the behavior of log_statement so that
log_statement sometimes logs the statement before it's executed, and
sometimes after the statement. I think that's going to be confusing
and unhelpful. In particular, right now you can assume that if you set
log_statement=all and there's a statement running, it's already been
logged. With this change, that would sometimes be true and sometimes
false.For example, suppose that at 9am sharp, I run an UPDATE command that
takes ten seconds to complete. Right now, the log_statement message
will appear at 9am. With this change, it will run at 9am if I do it
inside a transaction block that has an XID already, and at 9:00:10am
if I do it in a transaction block that does not yet have an XID, or if
I do it outside of a transaction. I don't think the benefit of getting
the XID in the log message is nearly enough to justify such a strange
behavior.
I thought about writing statement log when xid assigned. But it's seemed
too complicated.
I'm inclined to keep it for a while. Until we find a good way or give
up. It's a reasonable request, after all.
On Thu, May 16, 2024 at 6:01 AM Quan Zongliang <quanzongliang@yeah.net> wrote:
I thought about writing statement log when xid assigned. But it's seemed
too complicated.
I'm inclined to keep it for a while. Until we find a good way or give
up. It's a reasonable request, after all.
I don't think it's reasonable at all. We can't log the XID before it's
assigned, and we can't log the statement after the XID is assigned
without completely changing how the parameter works.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, May 16, 2024 at 08:37:43AM -0400, Robert Haas wrote:
On Thu, May 16, 2024 at 6:01 AM Quan Zongliang <quanzongliang@yeah.net> wrote:
I thought about writing statement log when xid assigned. But it's seemed
too complicated.
I'm inclined to keep it for a while. Until we find a good way or give
up. It's a reasonable request, after all.I don't think it's reasonable at all. We can't log the XID before it's
assigned, and we can't log the statement after the XID is assigned
without completely changing how the parameter works.
I have removed the TODO item.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.