Support --include-analyze in pg_dump, pg_dumpall, pg_restore
hi.
after watching https://www.youtube.com/live/k4A9-WZET_4?si=vz3lTud735s2vcCO
then trying to hack it too.
the implemented feature is as the $subject description, making pg_dump
also includes the ANALYZE command in its output.
if option --include-analyze not specified, then pg_dump will not dump
ANALYZE, that's the status quo.
option --include-analyze support table relkind: RELKIND_PARTITIONED_TABLE,
RELKIND_MATVIEW, RELKIND_RELATION.
this option cannot be used with option --schema-only.
for materialized view, it will dump after the refresh materialized
view command.
(tested in several cases, ANALYZE will really be at the end of the dump).
By default, pg_restore does not restore the ANALYZE commands.
This means that if the archive contains ANALYZE commands,
you still need to explicitly specify the --include-analyze option to
restore them.
doc added.
demo for dump:
create materialized view mvw as select a from generate_series(1, 3) a;
---<<<<<<<<<<<<<<portion of pg_dump output starts>>>>>>>>>>>>>>>>>>>>>>
--
-- Name: mvw; Type: MATERIALIZED VIEW DATA; Schema: public; Owner: jian
--
REFRESH MATERIALIZED VIEW public.mvw;
--
-- Name: mvw; Type: ANALYZE; Schema: public; Owner: jian
--
ANALYZE public.mvw;
---<<<<<<<<<<<<<<portion of pg_dump output ends>>>>>>>>>>>>>>>>>>>>>>
demo for dump partitioned table:
CREATE TABLE prt1 (a int, b int, c text) PARTITION BY RANGE(a) ;
CREATE TABLE prt1_p1 PARTITION OF prt1 FOR VALUES FROM (0) TO (5);
CREATE TABLE prt1_p2 PARTITION OF prt1 FOR VALUES FROM (8) TO (11);
INSERT INTO prt1 SELECT i, i+11, to_char(i, 'FM0000') FROM
generate_series(0, 2) i;
pg_dump --table=*prt1* --include-analyze
---<<<<<<<<<<<<<<portion of output ends>>>>>>>>>>>>>>>>>>>>>>
--
-- Data for Name: prt1_p1; Type: TABLE DATA; Schema: public; Owner: jian
--
COPY public.prt1_p1 (a, b, c) FROM stdin;
0 11 0000
1 12 0001
2 13 0002
\.
--
-- Data for Name: prt1_p2; Type: TABLE DATA; Schema: public; Owner: jian
--
COPY public.prt1_p2 (a, b, c) FROM stdin;
\.
--
-- Name: prt1; Type: ANALYZE; Schema: public; Owner: jian
--
ANALYZE ONLY public.prt1;
--
-- Name: prt1_p1; Type: ANALYZE; Schema: public; Owner: jian
--
ANALYZE public.prt1_p1;
--
-- Name: prt1_p2; Type: ANALYZE; Schema: public; Owner: jian
--
ANALYZE public.prt1_p2;
---<<<<<<<<<<<<<<portion of output starts>>>>>>>>>>>>>>>>>>>>>>
TODO item: writing tests.
idea credits to Andrey Borodin, Nikolay Samokhvalov, Kirk Wolak
what do you think?
Attachments:
v1-0001-Support-include-analyze-in-pg_dump-pg_dumpall-pg_.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Support-include-analyze-in-pg_dump-pg_dumpall-pg_.patchDownload+203-24
jian he <jian.universality@gmail.com> writes:
the implemented feature is as the $subject description, making pg_dump
also includes the ANALYZE command in its output.
Isn't this pretty much obsoleted by the ongoing work to dump and
restore statistics?
regards, tom lane
On Tue, Jan 14, 2025 at 09:32:19AM -0500, Tom Lane wrote:
jian he <jian.universality@gmail.com> writes:
the implemented feature is as the $subject description, making pg_dump
also includes the ANALYZE command in its output.Isn't this pretty much obsoleted by the ongoing work to dump and
restore statistics?
Yeah, and I believe that work is still on track for v18. That's likely
going to be much faster than analyzing everything, and it'll be usable in
pg_upgrade, too.
/messages/by-id/flat/CADkLM=cB0rF3p_FuWRTMSV0983ihTRpsH+OCpNyiqE7Wk0vUWA@mail.gmail.com
--
nathan