Add statistics refresh materialized view

Started by Seino Yukialmost 5 years ago5 messageshackers
Jump to latest
#1Seino Yuki
seinoyu@oss.nttdata.com

Hi.

This is a proposal for a new feature in statistics collector.
I think we need to add statistics about refresh matview to
pg_stat_all_tables view.

When the "REFRESH MATERIALIZED VIEW" was executed, the number of times
it was executed
and date it took were not recorded anywhere.

"pg_stat_statements" can be used to get the number of executions and the
date and time of execution,
but this information is statement-based, not view-based.
Also, that method requires the high cost of "pg_stat_statements".

This patch will add statistics(count, last time) about "REFRESH
MATERIALIZED VIEW"
to pg_stat_all_tables(pg_stat_user_tables, [pg_stat_sys_tables]).

What do you think?

Regards,
Seino Yuki

Attachments:

add_statistics_refresh_matview_v1.patchtext/x-diff; name=add_statistics_refresh_matview_v1.patchDownload+147-5
#2Fujii Masao
masao.fujii@gmail.com
In reply to: Seino Yuki (#1)
Re: Add statistics refresh materialized view

On 2021/07/09 1:39, Seino Yuki wrote:

Hi.

This is a proposal for a new feature in statistics collector.
I think we need to add statistics about refresh matview to pg_stat_all_tables view.

Why do you want to treat only REFRESH MATERIALIZED VIEW command special?
What about other utility commands like TRUNCATE, CLUSTER, etc?

It's not good design to add new columns per utility command into
pg_stat_all_tables. Otherwise pg_stat_all_tables will have to have lots of
columns to expose the stats of many utility commands at last. Which is
ugly and very user-unfriendly.

Most entries in pg_stat_all_tables are basically for tables. So the columns
about REFRESH MATERIALIZED VIEW are useless for those most entries.
This is another reason why I think the design is not good.

When the "REFRESH MATERIALIZED VIEW" was executed, the number of times it was executed
and date it took were not recorded anywhere.

pg_stat_statements and log_statement would help?

"pg_stat_statements" can be used to get the number of executions and the date and time of execution,
but this information is statement-based, not view-based.

pg_stat_statements reports different records for REFRESH MATERIALIZED VIEW
commands on different views. So ISTM that we can aggregate the information
per view, from pg_stat_statements. No?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#3Seino Yuki
seinoyu@oss.nttdata.com
In reply to: Fujii Masao (#2)
Re: Add statistics refresh materialized view

On 2021-09-01 23:15, Fujii Masao wrote:

Why do you want to treat only REFRESH MATERIALIZED VIEW command
special?
What about other utility commands like TRUNCATE, CLUSTER, etc?

First of all, knowing the update date and time of the MATVIEW is
essential for actual operation.
Without that information, users will not be able to trust the MATVIEW.

In terms of the reliability of the information in the table,
I think the priority of the REFRESHED MATVIEW is higher than that of
TRUNCATE and CLUSTER.

It's not good design to add new columns per utility command into
pg_stat_all_tables. Otherwise pg_stat_all_tables will have to have lots
of
columns to expose the stats of many utility commands at last. Which is
ugly and very user-unfriendly.

Most entries in pg_stat_all_tables are basically for tables. So the
columns
about REFRESH MATERIALIZED VIEW are useless for those most entries.
This is another reason why I think the design is not good.

I agree with this opinion.
Initially, I thought about storing this information in pg_matviews,
but decided against it because of the overhead of adding it to the
system catalog.

pg_stat_statements reports different records for REFRESH MATERIALIZED
VIEW
commands on different views. So ISTM that we can aggregate the
information
per view, from pg_stat_statements. No?

I made this suggestion based on the premise that the last update date
and time of the Mateview should always be retained.
I think the same concept applies to Oracle Database.
https://docs.oracle.com/cd/F19136_01/refrn/ALL_MVIEWS.html#GUID-8B9432B5-6B66-411A-936E-590D9D7671E9
I thought it would be useless to enable pg_stat_statements and
log_statement to see this information.

However, as you said, for most use cases, pg_stat_statements and
log_statement may be sufficient.
I would like to withdraw this proposal.

Regards,

#4Michael Paquier
michael@paquier.xyz
In reply to: Seino Yuki (#3)
Re: Add statistics refresh materialized view

On Tue, Sep 07, 2021 at 06:11:14PM +0900, Seino Yuki wrote:

I would like to withdraw this proposal.

This was registered in the CF, so marked as RwF.
--
Michael

#5Said Assemlal
sassemlal@neurorx.com
In reply to: Seino Yuki (#3)
Re: Add statistics refresh materialized view

Hi,

However, as you said, for most use cases, pg_stat_statements and
log_statement may be sufficient.
I would like to withdraw this proposal.

Well, they either require extensions or parameters to be set properly.
One advantage I see to store those kind of information is that it can be
queried by application developers (users are reporting old data for
example).

We currently have to rely on other ways to figure out if materialized
views were properly refreshed.