pg_upgrade vs vacuum_cost_delay
Shouldn't pg_upgrade turn off vacuum cost delay when it vacuums the new
cluster? Not talking about the post-analyze script, but when it runs
vacuumdb to analyze and freeze before loading the new schema, in
prepare_new_cluster()? Those run during downtime, so it seems like you'd
want those to run as fast as possible.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 16-06-2016 09:05, Magnus Hagander wrote:
Shouldn't pg_upgrade turn off vacuum cost delay when it vacuums the new
cluster? Not talking about the post-analyze script, but when it runs
vacuumdb to analyze and freeze before loading the new schema, in
prepare_new_cluster()? Those run during downtime, so it seems like you'd
want those to run as fast as possible.
Doesn't --new-options do the job?
--
Euler Taveira Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jun 16, 2016 at 4:35 PM, Euler Taveira <euler@timbira.com.br> wrote:
On 16-06-2016 09:05, Magnus Hagander wrote:
Shouldn't pg_upgrade turn off vacuum cost delay when it vacuums the new
cluster? Not talking about the post-analyze script, but when it runs
vacuumdb to analyze and freeze before loading the new schema, in
prepare_new_cluster()? Those run during downtime, so it seems like you'd
want those to run as fast as possible.Doesn't --new-options do the job?
You could, but it seems like it should do it by default.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Thu, Jun 16, 2016 at 04:45:14PM +0200, Magnus Hagander wrote:
On Thu, Jun 16, 2016 at 4:35 PM, Euler Taveira <euler@timbira.com.br> wrote:
On 16-06-2016 09:05, Magnus Hagander wrote:
Shouldn't pg_upgrade turn off vacuum cost delay when it vacuums the new
cluster? Not talking about the post-analyze script, but when it runs
vacuumdb to analyze and freeze before loading the new schema, in
prepare_new_cluster()? Those run during downtime, so it seems like you'd
want those to run as fast as possible.Doesn't --new-options do the job?
You could, but it seems like it should do it by default.
Based on this seven year old post, I realized there are minimal
directions in pg_upgrade docs about how to generate statistics quickly,
so I created this patch to help.
We do have docs on updating planner statistics:
https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-STATISTICS
but that doesn't seem to cover cases where you are doing an upgrade or
pg_dump restore. Should I move this information into there instead?
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.
Attachments:
analyze.difftext/x-diff; charset=us-asciiDownload
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 4f78e0e1c0..c72e69bb67 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -784,6 +784,14 @@ psql --username=postgres --file=script.sql postgres
of the upgrade. You might need to set connection parameters to
match your new cluster.
</para>
+
+ <para>
+ Using <command>vacuumdb --all --analyze-only</command> can efficiently
+ generate such statistics, and the use of <option>--jobs</option>
+ can speed it up. Option <option>--analyze-in-stages</option> can
+ be used to generate minimal statistics quickly. Non-zero values of
+ <varname>vacuum_cost_delay</varname> will delay statistics generation.
+ </para>
</step>
<step>
On Thu, Nov 23, 2023 at 5:23 AM Bruce Momjian <bruce@momjian.us> wrote:
On Thu, Jun 16, 2016 at 04:45:14PM +0200, Magnus Hagander wrote:
On Thu, Jun 16, 2016 at 4:35 PM, Euler Taveira <euler@timbira.com.br> wrote:
On 16-06-2016 09:05, Magnus Hagander wrote:
Shouldn't pg_upgrade turn off vacuum cost delay when it vacuums the new
cluster? Not talking about the post-analyze script, but when it runs
vacuumdb to analyze and freeze before loading the new schema, in
prepare_new_cluster()? Those run during downtime, so it seems like you'd
want those to run as fast as possible.Doesn't --new-options do the job?
You could, but it seems like it should do it by default.
Based on this seven year old post, I realized there are minimal
directions in pg_upgrade docs about how to generate statistics quickly,
so I created this patch to help.We do have docs on updating planner statistics:
https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-STATISTICS
but that doesn't seem to cover cases where you are doing an upgrade or
pg_dump restore. Should I move this information into there instead?
Wow, that's... A while :)
I don't think that final sentence really helps much - for anybody who
doesn't know that functionality well already, it will just be
confusing. At the very least it should be a link that sends you to the
documentation of how that functionality works?
But beyond that, perhaps what we'd really want (now that vacuumdb has
gained more functionality, and is used instead of the custom script
all the way) to add is a parameter --no-cost-delay that would issue a
SET to turn it off for the run?
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
Hi,
On Wed, Nov 22, 2023 at 11:23:34PM -0500, Bruce Momjian wrote:
+ Non-zero values of + <varname>vacuum_cost_delay</varname> will delay statistics generation.
Now I wonder wheter vacuumdb maybe should have an option to explicitly
force vacuum_cost_delay to 0 (I don't think it has?)?
Michael
On Fri, Nov 24, 2023 at 11:21 AM Michael Banck <mbanck@gmx.net> wrote:
Hi,
On Wed, Nov 22, 2023 at 11:23:34PM -0500, Bruce Momjian wrote:
+ Non-zero values of + <varname>vacuum_cost_delay</varname> will delay statistics generation.Now I wonder wheter vacuumdb maybe should have an option to explicitly
force vacuum_cost_delay to 0 (I don't think it has?)?
That's exactly what I proposed, isn't it? :)
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
Hi,
On Fri, Nov 24, 2023 at 12:17:56PM +0100, Magnus Hagander wrote:
On Fri, Nov 24, 2023 at 11:21 AM Michael Banck <mbanck@gmx.net> wrote:
On Wed, Nov 22, 2023 at 11:23:34PM -0500, Bruce Momjian wrote:
+ Non-zero values of + <varname>vacuum_cost_delay</varname> will delay statistics generation.Now I wonder wheter vacuumdb maybe should have an option to explicitly
force vacuum_cost_delay to 0 (I don't think it has?)?That's exactly what I proposed, isn't it? :)
You're right, I somehow only saw your mail after I had already sent
mine.
To make up for this, I created a patch that implements our propoals, see
attached.
Michael
Attachments:
0001-Add-no-cost-delay-option-to-vacuumdb.patchtext/x-diff; charset=us-asciiDownload
From b4a6e23f297994a237e35b9f96d2b806a17ba7a8 Mon Sep 17 00:00:00 2001
From: Michael Banck <michael.banck@credativ.de>
Date: Fri, 24 Nov 2023 13:00:31 +0100
Subject: [PATCH] Add --no-cost-delay option to vacuumdb.
This sets the vacuum_cost_delay parameter to 0 and overrides the system
setting (which is 0 by default as well). This can be useful in scripts where
the actual value of vacuum_cost_delay is not known and vacuumdb should operate
as fast as possible.
If --no-cost-delay is selected in addition to --analze-in-stages, all stages
are always analyzed without cost-based vacuum. Otherwise, the previous
behaviour of only overriding the system settting of vacuum_cost_delay for the
first stage is retained.
---
src/bin/scripts/vacuumdb.c | 29 +++++++++++++++++++++++++++--
1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index dd0d51659b..67373c6506 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -47,6 +47,7 @@ typedef struct vacuumingOptions
bool process_toast;
bool skip_database_stats;
char *buffer_usage_limit;
+ bool no_cost_delay;
} vacuumingOptions;
/* object filter options */
@@ -127,6 +128,7 @@ main(int argc, char *argv[])
{"no-process-toast", no_argument, NULL, 11},
{"no-process-main", no_argument, NULL, 12},
{"buffer-usage-limit", required_argument, NULL, 13},
+ {"no-cost-delay", no_argument, NULL, 14},
{NULL, 0, NULL, 0}
};
@@ -157,6 +159,7 @@ main(int argc, char *argv[])
vacopts.do_truncate = true;
vacopts.process_main = true;
vacopts.process_toast = true;
+ vacopts.no_cost_delay = false;
pg_logging_init(argv[0]);
progname = get_progname(argv[0]);
@@ -274,6 +277,9 @@ main(int argc, char *argv[])
case 13:
vacopts.buffer_usage_limit = escape_quotes(optarg);
break;
+ case 14:
+ vacopts.no_cost_delay = true;
+ break;
default:
/* getopt_long already emitted a complaint */
pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -508,6 +514,11 @@ vacuum_one_database(ConnParams *cparams,
"SET default_statistics_target=10; RESET vacuum_cost_delay;",
"RESET default_statistics_target;"
};
+ const char *stage_commands_no_cost_delay[] = {
+ "SET default_statistics_target=1; SET vacuum_cost_delay=0;",
+ "SET default_statistics_target=10; SET vacuum_cost_delay=0",
+ "RESET default_statistics_target; SET vacuum_cost_delay=0"
+ };
const char *stage_messages[] = {
gettext_noop("Generating minimal optimizer statistics (1 target)"),
gettext_noop("Generating medium optimizer statistics (10 targets)"),
@@ -799,10 +810,22 @@ vacuum_one_database(ConnParams *cparams,
* ourselves before setting up the slots.
*/
if (stage == ANALYZE_NO_STAGE)
- initcmd = NULL;
+ {
+ /* Switch off vacuum_cost_delay, if requested */
+ if (vacopts->no_cost_delay)
+ {
+ initcmd = "SET vacuum_cost_delay=0;";
+ executeCommand(conn, initcmd, echo);
+ }
+ else
+ initcmd = NULL;
+ }
else
{
- initcmd = stage_commands[stage];
+ if (vacopts->no_cost_delay)
+ initcmd = stage_commands_no_cost_delay[stage];
+ else
+ initcmd = stage_commands[stage];
executeCommand(conn, initcmd, echo);
}
@@ -1171,6 +1194,8 @@ help(const char *progname)
printf(_(" --no-process-main skip the main relation\n"));
printf(_(" --no-process-toast skip the TOAST table associated with the table to vacuum\n"));
printf(_(" --no-truncate don't truncate empty pages at the end of the table\n"));
+ printf(_(" --no-cost-delay force cost-based vacuum to off, overriding the\n"
+ " vacuum_cost_delay configuration parameter\n"));
printf(_(" -n, --schema=SCHEMA vacuum tables in the specified schema(s) only\n"));
printf(_(" -N, --exclude-schema=SCHEMA do not vacuum tables in the specified schema(s)\n"));
printf(_(" -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n"));
--
2.39.2
On Fri, Nov 24, 2023 at 01:10:01PM +0100, Michael Banck wrote:
Hi,
On Fri, Nov 24, 2023 at 12:17:56PM +0100, Magnus Hagander wrote:
On Fri, Nov 24, 2023 at 11:21 AM Michael Banck <mbanck@gmx.net> wrote:
On Wed, Nov 22, 2023 at 11:23:34PM -0500, Bruce Momjian wrote:
+ Non-zero values of + <varname>vacuum_cost_delay</varname> will delay statistics generation.Now I wonder wheter vacuumdb maybe should have an option to explicitly
force vacuum_cost_delay to 0 (I don't think it has?)?That's exactly what I proposed, isn't it? :)
You're right, I somehow only saw your mail after I had already sent
mine.To make up for this, I created a patch that implements our propoals, see
attached.
This is already posssible with PGOPTIONS, so I don't see the need for
a separate option:
PGOPTIONS='-c vacuum_cost_delay=99' psql -c 'SHOW vacuum_cost_delay;'
test
vacuum_cost_delay
-------------------
99ms
(1 row)
Here is a patch which shows its usage.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.
Attachments:
analyze.difftext/x-diff; charset=us-asciiDownload
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 4f78e0e1c0..2520f6c50d 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -784,6 +784,17 @@ psql --username=postgres --file=script.sql postgres
of the upgrade. You might need to set connection parameters to
match your new cluster.
</para>
+
+ <para>
+ Using <command>vacuumdb --all --analyze-only</command> can efficiently
+ generate such statistics, and the use of <option>--jobs</option>
+ can speed it up. Option <option>--analyze-in-stages</option>
+ can be used to generate minimal statistics quickly.
+ If <varname>vacuum_cost_delay</varname> is set to a non-zero
+ value, this can be overridden to speed up statistics generation
+ using <envar>PGOPTIONS</envar>, e.g., <literal>PGOPTIONS='-c
+ vacuum_cost_delay=0' vacuumdb ...</literal>.
+ </para>
</step>
<step>
On Fri, Nov 24, 2023 at 5:34 PM Bruce Momjian <bruce@momjian.us> wrote:
On Fri, Nov 24, 2023 at 01:10:01PM +0100, Michael Banck wrote:
Hi,
On Fri, Nov 24, 2023 at 12:17:56PM +0100, Magnus Hagander wrote:
On Fri, Nov 24, 2023 at 11:21 AM Michael Banck <mbanck@gmx.net> wrote:
On Wed, Nov 22, 2023 at 11:23:34PM -0500, Bruce Momjian wrote:
+ Non-zero values of + <varname>vacuum_cost_delay</varname> will delay statistics generation.Now I wonder wheter vacuumdb maybe should have an option to explicitly
force vacuum_cost_delay to 0 (I don't think it has?)?That's exactly what I proposed, isn't it? :)
You're right, I somehow only saw your mail after I had already sent
mine.To make up for this, I created a patch that implements our propoals, see
attached.This is already posssible with PGOPTIONS, so I don't see the need for
a separate option:PGOPTIONS='-c vacuum_cost_delay=99' psql -c 'SHOW vacuum_cost_delay;'
test
vacuum_cost_delay
-------------------
99ms
(1 row)Here is a patch which shows its usage.
Given how common this would be I think that's a pretty use-unfriendly
way to do it. I'd vote for still adding it.
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On Fri, Nov 24, 2023 at 06:20:28PM +0100, Magnus Hagander wrote:
On Fri, Nov 24, 2023 at 5:34 PM Bruce Momjian <bruce@momjian.us> wrote:
On Fri, Nov 24, 2023 at 01:10:01PM +0100, Michael Banck wrote:
You're right, I somehow only saw your mail after I had already sent
mine.To make up for this, I created a patch that implements our propoals, see
attached.This is already posssible with PGOPTIONS, so I don't see the need for
a separate option:PGOPTIONS='-c vacuum_cost_delay=99' psql -c 'SHOW vacuum_cost_delay;'
test
vacuum_cost_delay
-------------------
99ms
(1 row)Here is a patch which shows its usage.
Given how common this would be I think that's a pretty use-unfriendly
way to do it. I'd vote for still adding it.
Well, the big question is how many people have a non-default
vacuum_cost_delay, since it defaults to zero. If someone has changed
the default (a small percentage), how many of those will be confused by
PGOPTIONS? At that point, it seems unnecessary. Also consider that a
new option will only be useful for those who have non-default
vacuum_cost_delay values, which can also be confusing.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.
On Fri, Nov 24, 2023 at 06:20:28PM +0100, Magnus Hagander wrote:
On Fri, Nov 24, 2023 at 5:34 PM Bruce Momjian <bruce@momjian.us> wrote:
On Fri, Nov 24, 2023 at 01:10:01PM +0100, Michael Banck wrote:
Hi,
On Fri, Nov 24, 2023 at 12:17:56PM +0100, Magnus Hagander wrote:
On Fri, Nov 24, 2023 at 11:21 AM Michael Banck <mbanck@gmx.net> wrote:
On Wed, Nov 22, 2023 at 11:23:34PM -0500, Bruce Momjian wrote:
+ Non-zero values of + <varname>vacuum_cost_delay</varname> will delay statistics generation.Now I wonder wheter vacuumdb maybe should have an option to explicitly
force vacuum_cost_delay to 0 (I don't think it has?)?That's exactly what I proposed, isn't it? :)
You're right, I somehow only saw your mail after I had already sent
mine.To make up for this, I created a patch that implements our propoals, see
attached.This is already posssible with PGOPTIONS, so I don't see the need for
a separate option:PGOPTIONS='-c vacuum_cost_delay=99' psql -c 'SHOW vacuum_cost_delay;'
test
vacuum_cost_delay
-------------------
99ms
(1 row)Here is a patch which shows its usage.
Given how common this would be I think that's a pretty use-unfriendly
way to do it. I'd vote for still adding it.
Patch applied to master.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.