[PATCH] pg_dump: Do not dump statistics for excluded tables
Hi hackers,
I've attached a patch against master that addresses a small bug in pg_dump.
Previously, pg_dump would include CREATE STATISTICS statements for
tables that were excluded from the dump, causing reload to fail if any
excluded tables had extended statistics.
The patch skips the creation of the StatsExtInfo if the associated
table does not have the DUMP_COMPONENT_DEFINITION flag set. This is
similar to how getPublicationTables behaves if a table is excluded.
I've covered this with a regression test by altering one of the CREATE
STATISTICS examples to work with the existing 'exclude_test_table'
run. Without the fix, that causes the test to fail with:
    # Failed test 'exclude_test_table: should not dump CREATE STATISTICS extended_stats_no_options'
    # at t/002_pg_dump.pl line 4934.
Regards,
Rian
Attachments:
v1-0001-pg_dump-Do-not-dump-statistics-for-excluded-tables.patch (text/plain, +34 -15)
Rian McGuire <rian.mcguire@buildkite.com> writes:
> I've attached a patch against master that addresses a small bug in pg_dump.
> Previously, pg_dump would include CREATE STATISTICS statements for
> tables that were excluded from the dump, causing reload to fail if any
> excluded tables had extended statistics.
I agree that's a bug ...
> The patch skips the creation of the StatsExtInfo if the associated
> table does not have the DUMP_COMPONENT_DEFINITION flag set. This is
> similar to how getPublicationTables behaves if a table is excluded.
... but I don't like the details of this patch (and I'm not too
thrilled with the implementation of getPublicationTables, either).
The style in pg_dump is to put such decisions into separate
policy-setting subroutines. Also, skipping creation of the
DumpableObject altogether is the wrong thing because it'd prevent
pg_dump from tracing or reasoning about dependencies involving the
stats object, which can be relevant even if the object itself isn't
dumped --- this is why all the other data-collection subroutines
operate as they do. getPublicationTables can probably get away
with its low-rent approach given that publication membership isn't
represented by pg_depend entries, but it's far from clear that it'll
never be an issue for stats.
So I think it needs to be more like the attached.
(I did use your test case verbatim.)
regards, tom lane
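For readers following along without the attachment, here is a minimal,
self-contained sketch of the design distinction discussed above. It is not
the attached patch and not pg_dump's real code: the struct layouts, the bit
values, and the function names select_dumpable_stats and
should_emit_definition are hypothetical simplifications. Only the names
DumpableObject, StatsExtInfo, and DUMP_COMPONENT_DEFINITION come from the
thread. The idea shown is that the statistics object is always created, and
a separate policy routine clears its dump mask when the owning table is
excluded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for pg_dump's component bitmask (values illustrative). */
#define DUMP_COMPONENT_NONE        0u
#define DUMP_COMPONENT_DEFINITION  (1u << 0)
#define DUMP_COMPONENT_ALL         0xFFFFu

/* Hypothetical, pared-down versions of the pg_dump structs. */
typedef struct DumpableObject
{
    const char  *name;
    unsigned int dump;          /* bitmask of components to dump */
} DumpableObject;

typedef struct TableInfo
{
    DumpableObject dobj;
} TableInfo;

typedef struct StatsExtInfo
{
    DumpableObject   dobj;
    const TableInfo *stattable; /* table the statistics object belongs to */
} StatsExtInfo;

/*
 * Policy-setting routine (hypothetical name).  The StatsExtInfo already
 * exists when this runs, so it stays visible to dependency analysis; only
 * its dump mask changes.
 */
void
select_dumpable_stats(StatsExtInfo *sobj)
{
    assert(sobj != NULL && sobj->stattable != NULL);

    if (sobj->stattable->dobj.dump & DUMP_COMPONENT_DEFINITION)
        sobj->dobj.dump = DUMP_COMPONENT_ALL;
    else
        sobj->dobj.dump = DUMP_COMPONENT_NONE;
}

/* Emission step: only objects with the definition bit set produce output. */
bool
should_emit_definition(const DumpableObject *dobj)
{
    return (dobj->dump & DUMP_COMPONENT_DEFINITION) != 0;
}
```

With this shape, a statistics object on an excluded table is still present
in memory after collection, but should_emit_definition returns false for
it, so no CREATE STATISTICS would be written.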