[PATCH] Add log_transaction setting

Started by Sergey Solovievalmost 2 years ago5 messageshackers
Jump to latest
#1Sergey Soloviev
sergey.soloviev@tantorlabs.ru

Hi.

We encountered a problem with excessive logging when transaction is
sampled.
When it is getting sampled every statement is logged, event SELECT. This
can
lead to performance degradation and log polluting.
I have added new setting to filter statements when transaction is
sampled - log_transaction.

Overview
===================
This parameter works very similar to log_statement, but it filters out
statements only
in sampled transaction. It has same values as log_statement, access
rights (only superuser),
setup in postgresql.conf or by superuser with SET.
But by default it has value "all" in compliance with existing behaviour
(to log every statement).

Example usage
==================
Log every DDL, but only subset of other statements.

postgresql.conf

log_transaction = ddl
log_statement = all
log_transaction_sample_rate = 1
log_statement_sample_rate = 0.3

backend:

BEGIN;
CREATE TABLE t1(value text);
INSERT INTO t1(value) VALUES ('hello'), ('world');
SELECT * FROM t1;
DROP TABLE t1;
COMMIT;

logfile:

LOG:  duration: 6.465 ms  statement: create table t1(value text);
LOG:  statement: insert into t1(value) values ('hello'), ('world');
LOG:  duration: 6.457 ms  statement: drop table t1;

As you can see CREATE and DROP were logged with duration (because it is
DDL), but
only INSERT was logged (without duration) - not SELECT.

Testing
===================
All tests are passed - default configuration is compatible with existing
behaviour.
Honestly, I could not find any TAP/regress tests that would test logging
- some
tests only use logs to ensure test results.
I did not find suitable place for such tests, so no tests provided

Implementation details
===================
I have modified signature of check_log_duration function - this accepts
List of
statements that were executed. This is to decide whether we should log
current statement if transaction is sampled.
But this list can be empty, in that case we decide to log. NIL is passed
only
in fast path, PARSE and BIND functions - by default these are logged
if transaction is sampled, so we are compliant with existing behaviour
again.
In first implementation version, there was boolean flag (instead of
List), but
it was replaced by List to defer determining (function call) whether it
is worth logging.

-------------
Best regards, Sergey Solovev

Attachments:

0001-Add-log_transaction-configuration-setting.patchtext/x-patch; charset=UTF-8; name=0001-Add-log_transaction-configuration-setting.patchDownload+81-10
#2Kirill Reshke
reshkekirill@gmail.com
In reply to: Sergey Soloviev (#1)
Re: [PATCH] Add log_transaction setting

On Thu, 4 Jul 2024 at 21:46, Sergey Solovev
<sergey.soloviev@tantorlabs.ru> wrote:

Hi.

We encountered a problem with excessive logging when transaction is
sampled.
When it is getting sampled every statement is logged, event SELECT. This
can
lead to performance degradation and log polluting.
I have added new setting to filter statements when transaction is
sampled - log_transaction.

Overview
===================
This parameter works very similar to log_statement, but it filters out
statements only
in sampled transaction. It has same values as log_statement, access
rights (only superuser),
setup in postgresql.conf or by superuser with SET.
But by default it has value "all" in compliance with existing behaviour
(to log every statement).

Example usage
==================
Log every DDL, but only subset of other statements.

postgresql.conf

log_transaction = ddl
log_statement = all
log_transaction_sample_rate = 1
log_statement_sample_rate = 0.3

backend:

BEGIN;
CREATE TABLE t1(value text);
INSERT INTO t1(value) VALUES ('hello'), ('world');
SELECT * FROM t1;
DROP TABLE t1;
COMMIT;

logfile:

LOG: duration: 6.465 ms statement: create table t1(value text);
LOG: statement: insert into t1(value) values ('hello'), ('world');
LOG: duration: 6.457 ms statement: drop table t1;

As you can see CREATE and DROP were logged with duration (because it is
DDL), but
only INSERT was logged (without duration) - not SELECT.

Testing
===================
All tests are passed - default configuration is compatible with existing
behaviour.
Honestly, I could not find any TAP/regress tests that would test logging
- some
tests only use logs to ensure test results.
I did not find suitable place for such tests, so no tests provided

Implementation details
===================
I have modified signature of check_log_duration function - this accepts
List of
statements that were executed. This is to decide whether we should log
current statement if transaction is sampled.
But this list can be empty, in that case we decide to log. NIL is passed
only
in fast path, PARSE and BIND functions - by default these are logged
if transaction is sampled, so we are compliant with existing behaviour
again.
In first implementation version, there was boolean flag (instead of
List), but
it was replaced by List to defer determining (function call) whether it
is worth logging.

-------------
Best regards, Sergey Solovev

Hi!

As I understand, the need of this GUC variable comes from fact, that
is the transaction is sampled for logging, all statements within this
tx are logged and this is not configurable now, correct?
Well, if this is the case, why should we add a new GUC? Maybe we
should use `log_statement` in this case too, so there is a bug, that
log_statement is honored not during tx sampling?

Also, tests are failing[1]https://cirrus-ci.com/task/5645711230894080

[1]: https://cirrus-ci.com/task/5645711230894080

--
Best regards,
Kirill Reshke

#3Sergey Soloviev
sergey.soloviev@tantorlabs.ru
In reply to: Kirill Reshke (#2)
Re: [PATCH] Add log_transaction setting

Attachments:

0002-Add-log_transaction-configuration-setting.patchtext/x-diff; name=0002-Add-log_transaction-configuration-setting.patchDownload+81-10
#4Michael Paquier
michael@paquier.xyz
In reply to: Sergey Soloviev (#3)
Re: [PATCH] Add log_transaction setting

On Wed, Aug 14, 2024 at 09:08:00PM +0300, Сергей Соловьев wrote:

From b5e779771e8a7582951aff6f43a716e9e5975884 Mon Sep 17 00:00:00 2001
From: "Sergey Solovev" <sergey.soloviev@tantorlabs.ru>
Date: Thu, 4 Jul 2024 17:02:13 +0300
Subject: [PATCH] Add log_transaction configuration setting

CF bot is red, please provide a rebase. And also, perhaps avoid HTML
messages..
--
Michael

#5Kirill Reshke
reshkekirill@gmail.com
In reply to: Sergey Soloviev (#3)
Re: [PATCH] Add log_transaction setting

On Wed, 14 Aug 2024 at 23:08, Сергей Соловьев
<sergey.soloviev@tantorlabs.ru> wrote:

10.08.2024, 16:40, "Kirill Reshke" <reshkekirill@gmail.com>:

On Thu, 4 Jul 2024 at 21:46, Sergey Solovev
<sergey.soloviev@tantorlabs.ru> wrote:

Hi.

We encountered a problem with excessive logging when transaction is
sampled.
When it is getting sampled every statement is logged, event SELECT. This
can
lead to performance degradation and log polluting.
I have added new setting to filter statements when transaction is
sampled - log_transaction.

Overview
===================
This parameter works very similar to log_statement, but it filters out
statements only
in sampled transaction. It has same values as log_statement, access
rights (only superuser),
setup in postgresql.conf or by superuser with SET.
But by default it has value "all" in compliance with existing behaviour
(to log every statement).

Example usage
==================
Log every DDL, but only subset of other statements.

postgresql.conf

log_transaction = ddl
log_statement = all
log_transaction_sample_rate = 1
log_statement_sample_rate = 0.3

backend:

BEGIN;
CREATE TABLE t1(value text);
INSERT INTO t1(value) VALUES ('hello'), ('world');
SELECT * FROM t1;
DROP TABLE t1;
COMMIT;

logfile:

LOG: duration: 6.465 ms statement: create table t1(value text);
LOG: statement: insert into t1(value) values ('hello'), ('world');
LOG: duration: 6.457 ms statement: drop table t1;

As you can see CREATE and DROP were logged with duration (because it is
DDL), but
only INSERT was logged (without duration) - not SELECT.

Testing
===================
All tests are passed - default configuration is compatible with existing
behaviour.
Honestly, I could not find any TAP/regress tests that would test logging
- some
tests only use logs to ensure test results.
I did not find suitable place for such tests, so no tests provided

Implementation details
===================
I have modified signature of check_log_duration function - this accepts
List of
statements that were executed. This is to decide whether we should log
current statement if transaction is sampled.
But this list can be empty, in that case we decide to log. NIL is passed
only
in fast path, PARSE and BIND functions - by default these are logged
if transaction is sampled, so we are compliant with existing behaviour
again.
In first implementation version, there was boolean flag (instead of
List), but
it was replaced by List to defer determining (function call) whether it
is worth logging.

-------------
Best regards, Sergey Solovev

Hi!

As I understand, the need of this GUC variable comes from fact, that
is the transaction is sampled for logging, all statements within this
tx are logged and this is not configurable now, correct?
Well, if this is the case, why should we add a new GUC? Maybe we
should use `log_statement` in this case too, so there is a bug, that
log_statement is honored not during tx sampling?

Also, tests are failing[1]

[1] https://cirrus-ci.com/task/5645711230894080

--
Best regards,
Kirill Reshke

Hi, Kirill!

Thanks for review.

Also, tests are failing[1]

That is my mistake: I have tested patch on REL_17BETA2, but set 18 version, and
it seems that I run tests without proper ./configure flags. I reproduced error and fixed
it in new patch.

As I understand, the need of this GUC variable comes from fact, that
is the transaction is sampled for logging, all statements within this
tx are logged and this is not configurable now, correct?

Yes. That's the point - all or nothing.

Well, if this is the case, why should we add a new GUC? Maybe we
should use `log_statement` in this case too, so there is a bug, that
log_statement is honored not during tx sampling?

That can be good solution, but I proceed from possibility of flexible logging setup.
Shown example (when all DDL statements were logged) is just one of such use cases.
Another example is security policy: when we must log every data modification -
set `log_transaction = mod`.

Hi Sergey!
We are still awaiting a rebase.
I moved this to the next CF, as there is little chance of being
finished within this CF.

--
Best regards,
Kirill Reshke