Inquiry About Determining Parallel Plans for REFRESH MATERIALIZED VIEW

Started by Zhang Mingli12 months ago4 messages

zmlpostgres@gmail.com

12 months ago

Hi, all

I am currently exploring the execution of the REFRESH MATERIALIZED VIEW command and have a specific question regarding the underlying query plan. As you know, the REFRESH command is a utility command, and using EXPLAIN REFRESH does not provide a plan structure for analysis.

During development, we want to ascertain whether the SELECT part of the REFRESH command executes with a parallel plan. While I understand that we can run EXPLAIN on the SELECT statement directly, we are interested in knowing if there is a method to determine the execution plan under the REFRESH command context.

Currently, we have configured the necessary settings for parallel queries and have been recording the execution time of the REFRESH command to compare parallel and non-parallel executions.

However, I’m looking for a more definitive way to verify if the plan is indeed parallel during the execution of the REFRESH command.

Any suggestions?

Zhang Mingli
www.hashdata.xyz

Import Notes

Reply to msg id not found: 35481658-92e7-49c9-a3b5-77e8667ec971@SparkReference msg id not found: 35481658-92e7-49c9-a3b5-77e8667ec971@Spark

Yugo Nagata

nagata@sraoss.co.jp

12 months ago

In reply to: Zhang Mingli (#1)

Re: Inquiry About Determining Parallel Plans for REFRESH MATERIALIZED VIEW

On Thu, 16 Jan 2025 12:39:06 +0800
Zhang Mingli <zmlpostgres@gmail.com> wrote:

Hi, all

I am currently exploring the execution of the REFRESH MATERIALIZED VIEW command and have a specific question regarding the underlying query plan. As you know, the REFRESH command is a utility command, and using EXPLAIN REFRESH does not provide a plan structure for analysis.

During development, we want to ascertain whether the SELECT part of the REFRESH command executes with a parallel plan. While I understand that we can run EXPLAIN on the SELECT statement directly, we are interested in knowing if there is a method to determine the execution plan under the REFRESH command context.

Currently, we have configured the necessary settings for parallel queries and have been recording the execution time of the REFRESH command to compare parallel and non-parallel executions.

However, I’m looking for a more definitive way to verify if the plan is indeed parallel during the execution of the REFRESH command.

Any suggestions?

You will be able to log the plan during the REFRESH by using auto_explain and setting
log_analyze and log_nested_statements to on.

Regards,
Yugo Nagata

--
Yugo Nagata <nagata@sraoss.co.jp>

Zhang Mingli

zmlpostgres@gmail.com

12 months ago

In reply to: Yugo Nagata (#2)

Re: Inquiry About Determining Parallel Plans for REFRESH MATERIALIZED VIEW

Zhang Mingli
www.hashdata.xyz
On Jan 16, 2025 at 14:04 +0800, Yugo Nagata <nagata@sraoss.co.jp>, wrote:

You will be able to log the plan during the REFRESH by using auto_explain and setting
log_analyze and log_nested_statements to on.

Hi Yugo,

Thank you for your help! That’s certainly a viable approach to logging the plan during the REFRESH operation.
However, I want to clarify that we’re particularly interested in examining the SQL cases. When there are numerous queries that may also include REFRESH, it can be challenging to sift through the logs and identify the specific query we want to analyze using SQL.

Ideally, it would be beneficial if we could obtain an explanation of the SQL associated with a REFRESH command, allowing us to see the SELECT plan without having to execute the REFRESH itself.
We could limit EXPLAIN utility command to only REFRESH , on the AS SELECT part, similar to how we can with CREATE TABLE AS, is it possible and worthwhile?

jian he

jian.universality@gmail.com

12 months ago

In reply to: Zhang Mingli (#3)

Re: Inquiry About Determining Parallel Plans for REFRESH MATERIALIZED VIEW

On Thu, Jan 16, 2025 at 3:05 PM Zhang Mingli <zmlpostgres@gmail.com> wrote:

Thank you for your help! That’s certainly a viable approach to logging the plan during the REFRESH operation.
However, I want to clarify that we’re particularly interested in examining the SQL cases. When there are numerous queries that may also include REFRESH, it can be challenging to sift through the logs and identify the specific query we want to analyze using SQL.

Ideally, it would be beneficial if we could obtain an explanation of the SQL associated with a REFRESH command, allowing us to see the SELECT plan without having to execute the REFRESH itself.
We could limit EXPLAIN utility command to only REFRESH , on the AS SELECT part, similar to how we can with CREATE TABLE AS, is it possible and worthwhile?

you can use pg_get_viewdef to get the matview definition then use it
in the plpgsql function.

create function explain_query_json(query_sql text)
returns table (explain_line json)
language plpgsql as
$$
begin
return query execute 'EXPLAIN (FORMAT json) ' || query_sql;
end;
$$;