Adding description about the random_page_cost and parallel_leader_participation

Started by PG Bug reporting form over 5 years ago | 2 messages | docs
#1 PG Bug reporting form
noreply@postgresql.org

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/12/runtime-config-query.html
Description:

I recently observed that the combination of random_page_cost and
parallel_leader_participation can cause a performance regression. However,
the official documentation does not suggest that these two settings are
directly related.
I run PostgreSQL 11 on Ubuntu 18.04. I created a database of about 16MB
and ran a select query with pgbench. With random_page_cost set to 4 and
parallel_leader_participation on, the average latency of the select
query is about 72 ms for five clients. However, if I change
random_page_cost to 1 or turn off parallel_leader_participation, the
average latency is only 54 ms. When I run the query manually from a
single client, changing the value of random_page_cost doesn't change the
performance.
I checked the workload and found that when random_page_cost is 4 and
parallel_leader_participation is on, the query planner tends to choose a
Merge gather join for the select clause, which requires more memory for
sorting. When random_page_cost is 1 and parallel_leader_participation is
on, the planner chooses a nested loop, which requires less memory. I
guess this might be the root cause of the performance regression.
If my guess is correct, I would suggest mentioning in the official
documentation that the combination of random_page_cost and
parallel_leader_participation can cause a query to use more memory.
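
One way to check the guess above is to capture the plan for the same query under each combination of the two settings and compare them side by side. The following is only a minimal sketch: it generates a SQL script to paste into psql, the helper name build_plan_comparison is hypothetical, and QUERY is a placeholder because the report does not show the actual statement. Note that EXPLAIN with ANALYZE really executes the query.

```python
from itertools import product

# Hypothetical placeholder: the report does not show the actual query,
# so substitute the statement used in the pgbench script.
QUERY = "SELECT 1"


def build_plan_comparison(query, page_costs=(4, 1), leader_modes=("on", "off")):
    """Return SQL text that runs EXPLAIN on `query` under every combination
    of random_page_cost and parallel_leader_participation."""
    statements = []
    for cost, leader in product(page_costs, leader_modes):
        statements += [
            f"SET random_page_cost = {cost};",
            f"SET parallel_leader_participation = {leader};",
            # ANALYZE executes the query; BUFFERS adds buffer usage counts.
            f"EXPLAIN (ANALYZE, BUFFERS) {query.rstrip(';')};",
        ]
    # Restore the session defaults afterwards.
    statements += [
        "RESET random_page_cost;",
        "RESET parallel_leader_participation;",
    ]
    return "\n".join(statements)


if __name__ == "__main__":
    print(build_plan_comparison(QUERY))
```

The join and sort nodes in the four resulting plans can then be compared directly, which is also the kind of output a performance list would typically ask for.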

#2 Bruce Momjian
bruce@momjian.us
In reply to: PG Bug reporting form (#1)
Re: Adding description about the random_page_cost and parallel_leader_participation

On Tue, Sep 1, 2020 at 05:26:50PM +0000, PG Doc comments form wrote:

> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/12/runtime-config-query.html
> Description:
>
> I recently observed that the combination of random_page_cost and
> parallel_leader_participation can cause a performance regression. However,
> the official documentation does not suggest that these two settings are
> directly related.
> I run PostgreSQL 11 on Ubuntu 18.04. I created a database of about 16MB
> and ran a select query with pgbench. With random_page_cost set to 4 and
> parallel_leader_participation on, the average latency of the select
> query is about 72 ms for five clients. However, if I change
> random_page_cost to 1 or turn off parallel_leader_participation, the
> average latency is only 54 ms. When I run the query manually from a
> single client, changing the value of random_page_cost doesn't change the
> performance.
> I checked the workload and found that when random_page_cost is 4 and
> parallel_leader_participation is on, the query planner tends to choose a
> Merge gather join for the select clause, which requires more memory for
> sorting. When random_page_cost is 1 and parallel_leader_participation is
> on, the planner chooses a nested loop, which requires less memory. I
> guess this might be the root cause of the performance regression.
> If my guess is correct, I would suggest mentioning in the official
> documentation that the combination of random_page_cost and
> parallel_leader_participation can cause a query to use more memory.

There are a lot of interactions between settings, and mentioning all of
them would be impossible. I suggest you ask this on our performance
email list, and then you can get some ideas on the causes of this.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee