log_line_prefix: make it possible to add the search_path

Started by Pierre Ducroquetover 3 years ago7 messages
#1Pierre Ducroquet
p.psql@pinaraf.info
1 attachment(s)

Hello

I'm working on several databases where schemas are used to differentiate the
tenants.
This is great for performance, but several tools are lacking around this
usecase by not showing the schema, one of them being log_line_prefix.
It is possible to work around this using the application_name, but a mistake
on the application side would be fatal, while the search_path would still
indicate the real tables used in a query.
The attached patch implements this, using %S. I've not written the
documentation yet, since I'm not sure this would be acceptable as is, or if a
more "generic" method should be used (I thought of %{name} to fetch an
arbitrary GUC, but did not implement due to a lack of need for that feature)

Attachments:

0001-log_line_prefix-make-it-possible-to-add-search_path.patchtext/x-patch; charset=UTF-8; name=0001-log_line_prefix-make-it-possible-to-add-search_path.patchDownload
From d28ea4452c6e78bf8e67db49d1c72dbf4bd1ca48 Mon Sep 17 00:00:00 2001
From: Pierre Ducroquet <p.psql@pinaraf.info>
Date: Mon, 25 Jul 2022 09:33:49 +0200
Subject: [PATCH] log_line_prefix: make it possible to add search_path

This adds support for %S to add the current search_path in log_line_prefix, thus
making it easier to track slow queries on databases with for instance one schema
per tenant.
---
 src/backend/utils/error/elog.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 95f32de4e2..41e894bc83 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -68,6 +68,7 @@
 
 #include "access/transam.h"
 #include "access/xact.h"
+#include "catalog/namespace.h"
 #include "libpq/libpq.h"
 #include "libpq/pqformat.h"
 #include "mb/pg_wchar.h"
@@ -83,7 +84,6 @@
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
 
-
 /* In this module, access gettext() via err_gettext() */
 #undef _
 #define _(x) err_gettext(x)
@@ -2791,6 +2791,12 @@ log_status_format(StringInfo buf, const char *format, ErrorData *edata)
 					appendStringInfo(buf, "%lld",
 									 (long long) pgstat_get_my_query_id());
 				break;
+			case 'S':
+				if (padding != 0)
+					appendStringInfo(buf, "%*s", padding, namespace_search_path);
+				else
+					appendStringInfoString(buf, namespace_search_path);
+
 			default:
 				/* format error - ignore it */
 				break;
-- 
2.37.1

#2Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Pierre Ducroquet (#1)
Re: log_line_prefix: make it possible to add the search_path

On 2022-Jul-25, Pierre Ducroquet wrote:

This is great for performance, but several tools are lacking around this
usecase by not showing the schema, one of them being log_line_prefix.

The attached patch implements this, using %S. I've not written the
documentation yet, since I'm not sure this would be acceptable as is, or if a
more "generic" method should be used (I thought of %{name} to fetch an
arbitrary GUC, but did not implement due to a lack of need for that feature)

It seems that this would be too noisy to be truly usable. What if we
emitted a log line when the variable changed, and the value that's in
use when the connection starts?

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"En las profundidades de nuestro inconsciente hay una obsesiva necesidad
de un universo lógico y coherente. Pero el universo real se halla siempre
un paso más allá de la lógica" (Irulan)

#3Pierre
p.psql@pinaraf.info
In reply to: Alvaro Herrera (#2)
Re: log_line_prefix: make it possible to add the search_path

On Monday, July 25, 2022 11:52:41 AM CEST Alvaro Herrera wrote:

On 2022-Jul-25, Pierre Ducroquet wrote:

This is great for performance, but several tools are lacking around this
usecase by not showing the schema, one of them being log_line_prefix.

The attached patch implements this, using %S. I've not written the
documentation yet, since I'm not sure this would be acceptable as is, or
if a more "generic" method should be used (I thought of %{name} to fetch
an arbitrary GUC, but did not implement due to a lack of need for that
feature)

It seems that this would be too noisy to be truly usable. What if we
emitted a log line when the variable changed, and the value that's in
use when the connection starts?

Then the log files would be filled by these messages, only to be able to make
use of the few slow queries that are logged during the day. Or it would need
to be a log before each slow query. I'm not sure how well it would work.

#4Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Pierre (#3)
Re: log_line_prefix: make it possible to add the search_path

On 2022-Jul-25, Pierre wrote:

On Monday, July 25, 2022 11:52:41 AM CEST Alvaro Herrera wrote:

It seems that this would be too noisy to be truly usable. What if we
emitted a log line when the variable changed, and the value that's in
use when the connection starts?

Then the log files would be filled by these messages, only to be able to make
use of the few slow queries that are logged during the day.

Ah, yeah, that's not useful for that case ...

Or it would need to be a log before each slow query. I'm not sure how
well it would work.

... and this would probably be prohibitively complex to implement and
use, as well as too slow for the high traffic case.

Maybe your idea of allowing arbitrary GUCs is not a bad one, something
like
%{search_path}G
(where we add a letter at the end just so we can add other things in the
future that aren't GUCs.)

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#4)
Re: log_line_prefix: make it possible to add the search_path

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

Maybe your idea of allowing arbitrary GUCs is not a bad one, something
like
%{search_path}G
(where we add a letter at the end just so we can add other things in the
future that aren't GUCs.)

I'm pretty uncomfortable about the amount of code that could potentially
be reached during an error logging attempt if we do that.

regards, tom lane

#6Lukas Fittl
lukas@fittl.com
In reply to: Pierre Ducroquet (#1)
Re: log_line_prefix: make it possible to add the search_path

On Mon, Jul 25, 2022 at 12:38 AM Pierre Ducroquet <p.psql@pinaraf.info>
wrote:

usecase by not showing the schema, one of them being log_line_prefix.
It is possible to work around this using the application_name, but a
mistake
on the application side would be fatal, while the search_path would still
indicate the real tables used in a query.

I'm assuming this is mostly referring to STATEMENT log lines and other
situations where the original query is output (e.g. auto_explain).

+1 on the benefit of solving this (I've had this use case before), but I
think we can keep this more specific than a general log_line_prefix option.
The search_path isn't relevant to any log line that doesn't reference a
query, since e.g. autovacuum log output fully qualifies its relation names,
and many other common log lines have nothing to do with tables or queries.

What if we instead had something like this, as an extra CONTEXT (or DETAIL)
log line:

LOG: duration: 4079.697 ms execute <unnamed>:
SELECT * FROM x WHERE y = $1 LIMIT $2
DETAIL: parameters: $1 = 'long string', $2 = '1'
CONTEXT: settings: search_path = 'my_tenant_schema, "$user", public'

That way you could determine that the slow query was affecting the "x"
table in "my_tenant_schema".

This log output would be controlled by a new GUC, e.g.
"log_statement_search_path" with three settings: (1) never, (2)
non_default, (3) always.

The default would be "never" (same as today). "non_default" would output
the search path when a SET has modified it in the current session (and so
we couldn't infer it from the config or the role/database overrides).
"always" would always output the search path for statement-related log
lines.

Thanks,
Lukas

--
Lukas Fittl

#7Pierre
p.psql@pinaraf.info
In reply to: Lukas Fittl (#6)
1 attachment(s)
Re: log_line_prefix: make it possible to add the search_path

On Tuesday, July 26, 2022 3:08:01 AM CEST Lukas Fittl wrote:

On Mon, Jul 25, 2022 at 12:38 AM Pierre Ducroquet <p.psql@pinaraf.info>

wrote:

usecase by not showing the schema, one of them being log_line_prefix.
It is possible to work around this using the application_name, but a
mistake
on the application side would be fatal, while the search_path would still
indicate the real tables used in a query.

I'm assuming this is mostly referring to STATEMENT log lines and other
situations where the original query is output (e.g. auto_explain).

+1 on the benefit of solving this (I've had this use case before), but I
think we can keep this more specific than a general log_line_prefix option.
The search_path isn't relevant to any log line that doesn't reference a
query, since e.g. autovacuum log output fully qualifies its relation names,
and many other common log lines have nothing to do with tables or queries.

What if we instead had something like this, as an extra CONTEXT (or DETAIL)
log line:

LOG: duration: 4079.697 ms execute <unnamed>:
SELECT * FROM x WHERE y = $1 LIMIT $2
DETAIL: parameters: $1 = 'long string', $2 = '1'
CONTEXT: settings: search_path = 'my_tenant_schema, "$user", public'

That way you could determine that the slow query was affecting the "x"
table in "my_tenant_schema".

This log output would be controlled by a new GUC, e.g.
"log_statement_search_path" with three settings: (1) never, (2)
non_default, (3) always.

The default would be "never" (same as today). "non_default" would output
the search path when a SET has modified it in the current session (and so
we couldn't infer it from the config or the role/database overrides).
"always" would always output the search path for statement-related log
lines.

Thanks,
Lukas

Hi

This is a good idea. I've hacked a first implementation of it (lacking
documentation, and several logs are still missing) attached to this email.
The biggest issue I had was with knowing where the setting come from since no
guc.h function expose that information. I worked around this a bit, but I'm
sure it would be preferable to do it otherwise.

Thanks for your feedbacks

Regards

Pierre

Attachments:

0001-log-the-search_path-of-statements-configurable.patchtext/x-patch; charset=UTF-8; name=0001-log-the-search_path-of-statements-configurable.patchDownload
From 11bbdec931481ffc02ce147db2c7c8b08e6503e4 Mon Sep 17 00:00:00 2001
From: Pierre Ducroquet <pducroquet@entrouvert.com>
Date: Tue, 26 Jul 2022 14:39:59 +0200
Subject: [PATCH] log the search_path of statements (configurable)

Introduce a new GUC, log_statement_search_path, to make it possible
to log the search_path of queries.
---
 src/backend/catalog/namespace.c |  4 ++++
 src/backend/tcop/postgres.c     | 34 ++++++++++++++++++++++++++++-----
 src/backend/utils/misc/guc.c    | 11 +++++++++++
 src/include/catalog/namespace.h |  2 ++
 src/include/tcop/tcopprot.h     |  8 ++++++++
 src/include/utils/guc.h         |  1 +
 6 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index fafb9349cc..ec6c260f15 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -197,6 +197,7 @@ static SubTransactionId myTempNamespaceSubID = InvalidSubTransactionId;
  * of the GUC variable 'search_path'.
  */
 char	   *namespace_search_path = NULL;
+GucSource	namespace_search_path_source = 0;
 
 
 /* Local functions */
@@ -4348,6 +4349,9 @@ check_search_path(char **newval, void **extra, GucSource source)
 	pfree(rawname);
 	list_free(namelist);
 
+	/* Store the source */
+	namespace_search_path_source = source;
+
 	return true;
 }
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d0bbd30d2b..4fa6260f63 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -39,6 +39,7 @@
 #include "access/parallel.h"
 #include "access/printtup.h"
 #include "access/xact.h"
+#include "catalog/namespace.h"
 #include "catalog/pg_type.h"
 #include "commands/async.h"
 #include "commands/prepare.h"
@@ -96,6 +97,15 @@ bool		Log_disconnections = false;
 
 int			log_statement = LOGSTMT_NONE;
 
+const struct config_enum_entry log_statement_search_path_options[] = {
+        {"never", LOG_STATEMENT_SEARCH_PATH_NEVER, false},
+        {"non_default", LOG_STATEMENT_SEARCH_PATH_NON_DEFAULT, false},
+        {"always", LOG_STATEMENT_SEARCH_PATH_ALWAYS, false},
+        {NULL, 0, false}
+};
+
+int			log_statement_search_path = LOG_STATEMENT_SEARCH_PATH_NEVER;
+
 /* GUC variable for maximum stack depth (measured in kilobytes) */
 int			max_stack_depth = 100;
 
@@ -1328,11 +1338,25 @@ exec_simple_query(const char *query_string)
 					 errhidestmt(true)));
 			break;
 		case 2:
-			ereport(LOG,
-					(errmsg("duration: %s ms  statement: %s",
-							msec_str, query_string),
-					 errhidestmt(true),
-					 errdetail_execute(parsetree_list)));
+			if ((log_statement_search_path > LOG_STATEMENT_SEARCH_PATH_NEVER)
+				&& (
+					namespace_search_path_source > PGC_S_INTERACTIVE
+					||
+					log_statement_search_path > LOG_STATEMENT_SEARCH_PATH_NON_DEFAULT
+				)) {
+				ereport(LOG,
+						(errmsg("duration: %s ms  statement: %s",
+								msec_str, query_string),
+						errhidestmt(true),
+						errdetail("search_path: %s", namespace_search_path),
+						errdetail_execute(parsetree_list)));
+			} else {
+				ereport(LOG,
+						(errmsg("duration: %s ms  statement: %s",
+								msec_str, query_string),
+						errhidestmt(true),
+						errdetail_execute(parsetree_list)));
+			}
 			break;
 	}
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index af4a1c3068..524d6befa6 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -604,6 +604,7 @@ extern const struct config_enum_entry archive_mode_options[];
 extern const struct config_enum_entry recovery_target_action_options[];
 extern const struct config_enum_entry sync_method_options[];
 extern const struct config_enum_entry dynamic_shared_memory_options[];
+extern const struct config_enum_entry log_statement_search_path_options[];
 
 /*
  * GUC option variables that are exported from this module
@@ -4836,6 +4837,16 @@ static struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"log_statement_search_path", PGC_SUSET, LOGGING_WHEN,
+			gettext_noop("Sets whether to log the search path of logged statements."),
+			NULL
+		},
+		&log_statement_search_path,
+		LOG_STATEMENT_SEARCH_PATH_NEVER, log_statement_search_path_options,
+		NULL, NULL, NULL
+	},
+
 	{
 		{"log_min_messages", PGC_SUSET, LOGGING_WHEN,
 			gettext_noop("Sets the message levels that are logged."),
diff --git a/src/include/catalog/namespace.h b/src/include/catalog/namespace.h
index 1bc55c01a5..903531b231 100644
--- a/src/include/catalog/namespace.h
+++ b/src/include/catalog/namespace.h
@@ -16,6 +16,7 @@
 
 #include "nodes/primnodes.h"
 #include "storage/lock.h"
+#include "utils/guc.h"
 
 
 /*
@@ -183,6 +184,7 @@ extern void AtEOSubXact_Namespace(bool isCommit, SubTransactionId mySubid,
 
 /* stuff for search_path GUC variable */
 extern PGDLLIMPORT char *namespace_search_path;
+extern PGDLLIMPORT GucSource namespace_search_path_source;
 
 extern List *fetch_search_path(bool includeImplicit);
 extern int	fetch_search_path_array(Oid *sarray, int sarray_len);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 70d9dab25b..0d3d7bf4e0 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -41,6 +41,14 @@ typedef enum
 	LOGSTMT_ALL					/* log all statements */
 } LogStmtLevel;
 
+typedef enum 
+{
+	LOG_STATEMENT_SEARCH_PATH_NEVER = 0,
+	LOG_STATEMENT_SEARCH_PATH_NON_DEFAULT,
+	LOG_STATEMENT_SEARCH_PATH_ALWAYS
+} LogStmtSearchPathOptions;
+
+
 extern PGDLLIMPORT int log_statement;
 
 extern List *pg_parse_query(const char *query_string);
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index e734493a48..34b406092b 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -270,6 +270,7 @@ extern PGDLLIMPORT double log_statement_sample_rate;
 extern PGDLLIMPORT double log_xact_sample_rate;
 extern PGDLLIMPORT char *backtrace_functions;
 extern PGDLLIMPORT char *backtrace_symbol_list;
+extern PGDLLIMPORT int log_statement_search_path;
 
 extern PGDLLIMPORT int temp_file_limit;
 
-- 
2.35.1