pg_dump sort priority mismatch for large objects
(moving to a new thread)
On Thu, Jul 10, 2025 at 06:05:26PM +0530, Nitin Motiani wrote:
There might be an existing issue here, because dbObjectTypePriorities
has the following comment:* NOTE: object-type priorities must match the section assignments made in
* pg_dump.c; that is, PRE_DATA objects must sort before DO_PRE_DATA_BOUNDARY,
* POST_DATA objects must sort after DO_POST_DATA_BOUNDARY, and DATA objects
* must sort between them.But dumpLO() puts large objects in SECTION_DATA, and PRIO_LARGE_OBJECT
is before PRIO_PRE_DATA_BOUNDARY. I admittedly haven't spent too much
time investigating this, though.I looked through the history of this to see how this happened and if it
could be an existing issue. Prior to a45c78e3284b, dumpLO used to put large
objects in SECTION_PRE_DATA. That commit changed dumpLO and also changed
addBoundaryDependencies to move DO_LARGE_OBJECT from pre-data to data
section. Seems like since then this has been inconsistent with
pg_dump_sort.c. I think the change in pg_dump_sort.c should be backported
to PG17 & 18 independent of the state of the larger patch.
Thanks for doing the archaeology!
But even with the inconsistency, it doesn't look like there is an existing
issue. As the dependencies were changed in addBoundaryDependencies, that
should take precedence over the order in pg_dump_sort.c. The
dbObjectTypePriorities are used by sortDumpableObjectsByTypeName. But right
after that sortDumpableObjects sorts the objects based on dependencies
therefore the change in boundary dependencies should ensure that this is
working as intended. Still I think dbObjectTypePriorities should be made
consistent with the rest.
+1, if for no other reason than we'll need it to be below PRIO_TABLE_DATA
for the speed-up-pg_upgrade-with-many-LOs patch [0]/messages/by-id/aBkQLSkx1zUJ-LwJ@nathan. Does anyone see any
problems with applying something like the following down to v17?
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index 0b0977788f1..538e7dcb493 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -76,10 +76,10 @@ enum dbObjectTypePriorities
PRIO_TABLE_ATTACH,
PRIO_DUMMY_TYPE,
PRIO_ATTRDEF,
- PRIO_LARGE_OBJECT,
PRIO_PRE_DATA_BOUNDARY, /* boundary! */
PRIO_TABLE_DATA,
PRIO_SEQUENCE_SET,
+ PRIO_LARGE_OBJECT,
PRIO_LARGE_OBJECT_DATA,
PRIO_STATISTICS_DATA_DATA,
PRIO_POST_DATA_BOUNDARY, /* boundary! */
[0]: /messages/by-id/aBkQLSkx1zUJ-LwJ@nathan
--
nathan
Nathan Bossart <nathandbossart@gmail.com> writes:
On Thu, Jul 10, 2025 at 06:05:26PM +0530, Nitin Motiani wrote:
I looked through the history of this to see how this happened and if it
could be an existing issue. Prior to a45c78e3284b, dumpLO used to put large
objects in SECTION_PRE_DATA. That commit changed dumpLO and also changed
addBoundaryDependencies to move DO_LARGE_OBJECT from pre-data to data
section. Seems like since then this has been inconsistent with
pg_dump_sort.c. I think the change in pg_dump_sort.c should be backported
to PG17 & 18 independent of the state of the larger patch.
+1, if for no other reason than we'll need it to be below PRIO_TABLE_DATA
for the speed-up-pg_upgrade-with-many-LOs patch [0]. Does anyone see any
problems with applying something like the following down to v17?
That's clearly an oversight in a45c78e3284b. I agree that fixing
pg_dump_sort.c to match shouldn't create any functional difficulties.
It might make the topological sort step marginally faster by
reducing the number of ordering violations that have to be fixed.
regards, tom lane