pg_stat_wal: tracking the compression effect

Started by Ken Kato over 3 years ago, 6 messages
#1 Ken Kato
katouknl@oss.nttdata.com

Hi hackers,

We can specify the compression method (for example, lz4 or zstd), but it
is hard to know how effective each method is. There is already a way to
see the compression effect using pg_waldump. However, having these
statistics in a view would make them more accessible. I propose adding
statistics that keep track of the compression effect to the pg_stat_wal
view.

The design I have in mind is below:

 compression_saved | compression_times
-------------------+-------------------
             38741 |                 6

The idea is to accumulate how much space is saved by each compression
(size before compression - size after compression) and to keep track of
how many times compression has happened, so that one can know how much
space is saved on average.
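
As a rough illustration of the accumulation described above, here is a minimal Python sketch. The class and method names are hypothetical and are not part of any patch; only the two counter names come from the proposed columns.

```python
from dataclasses import dataclass


@dataclass
class WalCompressionStats:
    """Hypothetical counters mirroring the two proposed columns."""
    compression_saved: int = 0  # total bytes saved across all compressions
    compression_times: int = 0  # number of compressions performed

    def record(self, size_before: int, size_after: int) -> None:
        # Accumulate (size before compression - size after compression).
        self.compression_saved += size_before - size_after
        self.compression_times += 1

    def average_saved(self) -> float:
        # Average bytes saved per compression; 0.0 if nothing was compressed.
        if self.compression_times == 0:
            return 0.0
        return self.compression_saved / self.compression_times


# Values taken from the example row above.
stats = WalCompressionStats(compression_saved=38741, compression_times=6)
print(round(stats.average_saved(), 1))  # prints 6456.8
```

With the example row, the view would tell us that each compression saved about 6.5 kB on average.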

What do you think?

Regards,

--
Ken Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#2 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Ken Kato (#1)
Re: pg_stat_wal: tracking the compression effect

At Thu, 25 Aug 2022 16:04:50 +0900, Ken Kato <katouknl@oss.nttdata.com> wrote in

Accumulating the values, which indicates how much space is saved by
each compression (size before compression - size after compression),
and keep track of how many times compression has happened. So that one
can know how much space is saved on average.

Honestly, I don't think it's very useful.
How about adding them to pg_waldump and pg_walinspect instead?

# It further widens the output of pg_waldump, though..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#3 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#2)
Re: pg_stat_wal: tracking the compression effect

At Fri, 26 Aug 2022 11:55:27 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Thu, 25 Aug 2022 16:04:50 +0900, Ken Kato <katouknl@oss.nttdata.com> wrote in

Accumulating the values, which indicates how much space is saved by
each compression (size before compression - size after compression),
and keep track of how many times compression has happened. So that one
can know how much space is saved on average.

Honestly, I don't think its useful much.
How about adding them to pg_waldump and pg_walinspect instead?

# It further widens the output of pg_waldump, though..

Sorry, that was apparently too short.

I know you already see that in per-record output of pg_waldump, but
maybe we need the summary of saved bytes in "pg_waldump -b -z" output
and the corresponding output of pg_walinspect.
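
Until such a summary exists, something similar can be approximated by post-processing the per-record output. The Python sketch below assumes that each compressed full-page image shows up with a fragment of the form "compression saved: N, method: M"; that fragment format is an assumption to check against your pg_waldump version, and the sample lines are hand-written, not real output.

```python
import re
from collections import defaultdict

# Assumed fragment of a block reference line for a compressed image.
PATTERN = re.compile(r"compression saved: (\d+), method: (\w+)")


def summarize(lines):
    # Maps method -> (number of compressed images, total bytes saved).
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        for saved, method in PATTERN.findall(line):
            totals[method][0] += 1
            totals[method][1] += int(saved)
    return {method: tuple(counts) for method, counts in totals.items()}


# Hand-written sample lines for illustration only.
sample = [
    "... compression saved: 5120, method: lz4",
    "... compression saved: 6000, method: lz4",
    "... compression saved: 7000, method: zstd",
    "... a record without a compressed image",
]

for method, (count, saved) in summarize(sample).items():
    print(f"{method}: {count} images, {saved} bytes saved")
```

In practice the lines would come from piping the output of pg_waldump -b into the script instead of the sample list.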

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#4 Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Ken Kato (#1)
Re: pg_stat_wal: tracking the compression effect

On 25 Aug 2022, at 12:04, Ken Kato <katouknl@oss.nttdata.com> wrote:

What do you think?

I think users will need to choose between Lz4 and Zstd, so they need to know the tradeoff: compression ratio vs. CPU time spent per page (or any other segment).

I know that Zstd must be kind of "better", but I doubt it has enough runway on 1 block to show off. If only we could persist the compression context across many pages...
The compression ratio may differ between workloads, so a system view or something similar could be of use.
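
To make the "runway" point concrete, here is a small Python sketch that compares compressing each 8 kB page with a fresh context against keeping one context across pages. It uses zlib from the standard library purely as a stand-in, since lz4 and zstd bindings are not in the stdlib, and the pages are synthetic and deliberately similar to each other. The numbers say nothing about real WAL or about Lz4 vs. Zstd; the sketch only shows how ratio and CPU time could be measured side by side.

```python
import hashlib
import time
import zlib

BLOCK_SIZE = 8192  # default PostgreSQL block size


def make_pages(count):
    # Synthetic pages: a shared 8 kB body with a small per-page header.
    body = b"".join(
        hashlib.sha256(str(j).encode()).hexdigest().encode()
        for j in range(BLOCK_SIZE // 64)
    )
    pages = []
    for i in range(count):
        header = f"page-{i:08d}".encode()
        pages.append(header + body[len(header):])
    return pages


def compress_per_page(pages):
    # Fresh compression context for every page.
    start = time.process_time()
    compressed = sum(len(zlib.compress(page)) for page in pages)
    return compressed, time.process_time() - start


def compress_shared_context(pages):
    # One context kept across pages; Z_SYNC_FLUSH emits each page's
    # output immediately while preserving the history window.
    comp = zlib.compressobj()
    start = time.process_time()
    compressed = 0
    for page in pages:
        compressed += len(comp.compress(page))
        compressed += len(comp.flush(zlib.Z_SYNC_FLUSH))
    compressed += len(comp.flush())
    return compressed, time.process_time() - start


pages = make_pages(64)
raw = len(pages) * BLOCK_SIZE
for name, fn in (("per page", compress_per_page),
                 ("shared context", compress_shared_context)):
    size, cpu = fn(pages)
    print(f"{name}: ratio {raw / size:.2f}, cpu {cpu:.4f}s")
```

On this favorable data the shared context compresses far better, because later pages can refer back to earlier ones. Real workloads will differ, which is why per-workload statistics would help.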

Thanks!

Best regards, Andrey Borodin.

#5 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Kyotaro Horiguchi (#3)
Re: pg_stat_wal: tracking the compression effect

On Fri, Aug 26, 2022 at 8:39 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Fri, 26 Aug 2022 11:55:27 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Thu, 25 Aug 2022 16:04:50 +0900, Ken Kato <katouknl@oss.nttdata.com> wrote in

Accumulating the values, which indicates how much space is saved by
each compression (size before compression - size after compression),
and keep track of how many times compression has happened. So that one
can know how much space is saved on average.

Honestly, I don't think its useful much.
How about adding them to pg_waldump and pg_walinspect instead?

# It further widens the output of pg_waldump, though..

Sorry, that was apparently too short.

I know you already see that in per-record output of pg_waldump, but
maybe we need the summary of saved bytes in "pg_waldump -b -z" output
and the corresponding output of pg_walinspect.

+1 for adding compression stats such as type and saved bytes to
pg_waldump and pg_walinspect, given that the WAL records already have
the saved bytes info. Collecting them in the server via pg_stat_wal
would require some extra effort; for instance, that code would have to
run on every WAL record insert. When users want to analyze the
compression effect, they can use either pg_walinspect or pg_waldump
and change the compression type if required.

--
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/

#6 Ken Kato
katouknl@oss.nttdata.com
In reply to: Bharath Rupireddy (#5)
Re: pg_stat_wal: tracking the compression effect

On 2022-08-27 16:48, Bharath Rupireddy wrote:

On Fri, Aug 26, 2022 at 8:39 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Fri, 26 Aug 2022 11:55:27 +0900 (JST), Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote in

At Thu, 25 Aug 2022 16:04:50 +0900, Ken Kato <katouknl@oss.nttdata.com> wrote in

Accumulating the values, which indicates how much space is saved by
each compression (size before compression - size after compression),
and keep track of how many times compression has happened. So that one
can know how much space is saved on average.

Honestly, I don't think its useful much.
How about adding them to pg_waldump and pg_walinspect instead?

# It further widens the output of pg_waldump, though..

Sorry, that was apparently too short.

I know you already see that in per-record output of pg_waldump, but
maybe we need the summary of saved bytes in "pg_waldump -b -z" output
and the corresponding output of pg_walinspect.

+1 for adding compression stats such as type and saved bytes to
pg_waldump and pg_walinspect given that the WAL records already have
the saved bytes info. Collecting them in the server via pg_stat_wal
will require some extra effort, for instance, every WAL record insert
requires that code to be executed. When users want to analyze the
compression efforts they can either use pg_walinspect or pg_waldump
and change the compression type if required.

Thank you for all the comments!

I will go with adding the compression stats in pg_waldump and
pg_walinspect.

Regards,
--
Ken Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION