Order of tables dumped by pg_dump
Can somebody help me understand the order in which pg_dump produces tables?
I sometimes write migration scripts to apply schema changes. Part of
validating these scripts involves running my schema file (a file or files
which create the schema from scratch, with CREATE TABLE and other
statements), then dumping the schema. Next apply the draft migration script
to a copy of production, then dump the schema again.
Ideally, the schema dump obtained by applying the migration script to
production and the one obtained by running the migration script on a copy
of production should be identical.
In the past I used to have problems with things being out of order,
sometimes leading to large diffs that were really just re-orderings. More
recent Postgres versions have been better. Sometimes small "real" diffs
result in large text diffs, but I think this is due to dependency changes
and probably hard to avoid.
Right now however I'm back to a situation where 3 tables are generated
differently in the 2 dumps. In one, all 3 are together while in the other
they are in 3 different locations. As a result I have 8 diffs: 4 for the
table definitions, and 4 for the permissions. I have verified that the
differences are only an ordering difference by diffing the output of \d+
for the 3 tables and by diffing the two schemas after sorting their
respective lines.
So I have 2 identical schemas that produce different dumps. This makes me
wonder if the pg_dump order is not entirely deterministic and maybe some
ORDER BYs are missing somewhere, resulting in the order of the data in the
system tables affecting the output.
If somebody can point me at the part of the pg_dump code that determines
the table order, I'd be interested in taking a look and trying to make the
order more deterministic. Any other hints welcome as well. I am working
with Postgres 18.1.
Isaac Morland <isaac.morland@gmail.com> writes:
So I have 2 identical schemas that produce different dumps. This makes me
wonder if the pg_dump order is not entirely deterministic and maybe some
ORDER BYs are missing somewhere, resulting in the order of the data in the
system tables affecting the output.
That's not supposed to happen (and we have regression tests that rely
on it not happening). Are you sure that the databases are really
identical? One thing I'd check is if the database locales are the
same. The primary sort key inside pg_dump is usually object name,
so you could conceivably get an ordering that depends on locale,
especially if there's any non-ASCII letters in the object names.
If somebody can point me at the part of the pg_dump code that determines
the table order, I'd be interested in taking a look and trying to make the
order more deterministic. Any other hints welcome as well. I am working
with Postgres 18.1.
See sortDumpableObjectsByTypeName (the initial name-based sort)
and sortDumpableObjects (fixes up any dependency problems).
regards, tom lane
On Fri, 15 May 2026 at 15:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Isaac Morland <isaac.morland@gmail.com> writes:
So I have 2 identical schemas that produce different dumps. This makes me
wonder if the pg_dump order is not entirely deterministic and maybe some
ORDER BYs are missing somewhere, resulting in the order of the data inthe
system tables affecting the output.
That's not supposed to happen (and we have regression tests that rely
on it not happening). Are you sure that the databases are really
identical? One thing I'd check is if the database locales are the
same. The primary sort key inside pg_dump is usually object name,
so you could conceivably get an ordering that depends on locale,
especially if there's any non-ASCII letters in the object names.
Thanks for the confirmation.
I checked that none of my table names match [^a-z0-9_] and I always use
UTF-8 encoding with C locale.
However, I didn't think to mention that the database is running 14.22 while
the client pg_dump is 18.1.
I tried to run a test in which I restored the two dumps, then dumped the
result, to see if the order of the table definitions in the dumps mattered.
Unfortunately I found what seems to be a bug in pg_dump relating to the
dumping of inheritance hierarchies (see below).
I did verify that the dumps were identical except for the order of the 3
affected tables, so whatever is causing the difference is not something
that appears in the dumped objects; it only affects their order. This is
why I suspected dependencies from another schema not included in the dump
but there are none. However, I noticed that there are dependencies from the
same tables that trigger the bug in pg_dump. So I suspect the re-ordering
of being another symptom of the bug.
The bug is that inheritance hierarchies can be dumped wrong.
I created a test case:
odyssey=> begin read write;
BEGIN
odyssey=*> create table test_parent (f integer);
CREATE TABLE
odyssey=*> create table test_child (primary key (f)) inherits (test_parent);
CREATE TABLE
odyssey=*> commit;
COMMIT
This dumps as follows:
[ omit header stuff ]
CREATE TABLE ijmorlan.test_parent (
f integer
);
[ omit ownership ]
CREATE TABLE ijmorlan.test_child (
NOT NULL f
)
INHERITS (ijmorlan.test_parent);
[ omit more stuff ]
The CREATE TABLE for test_child is not syntactically valid.
Note that there is no primary key for the parent. There are also no rows in
that table. So it's a bit unusual.
If somebody can point me at the part of the pg_dump code that determines
the table order, I'd be interested in taking a look and trying to make
the
order more deterministic. Any other hints welcome as well. I am working
with Postgres 18.1.See sortDumpableObjectsByTypeName (the initial name-based sort)
and sortDumpableObjects (fixes up any dependency problems).
Thanks, I will take a look.
Isaac Morland <isaac.morland@gmail.com> writes:
However, I didn't think to mention that the database is running 14.22 while
the client pg_dump is 18.1.
Hmm, that might be relevant. However, I tried your example
both with current-HEAD server + current-HEAD pg_dump and
with 14.23 server + current-HEAD pg_dump. For me, the output
is stable across dump/reload except for the random \restrict keys.
I know that our handling of NOT NULL constraints changed significantly
in v18 and there have been multiple bug fixes in that area, so maybe
you'd have better luck with 18.4? I didn't spot any smoking guns in
a quick trawl of the commit log back to 18.1, but I might've been
searching for the wrong keywords.
CREATE TABLE ijmorlan.test_child (
NOT NULL f
)
INHERITS (ijmorlan.test_parent);
The CREATE TABLE for test_child is not syntactically valid.
Yeah it is. The standalone constraint clause is perfectly fine,
the column declaration comes from the parent, and the primary
key constraint gets added on later (after loading data). In
any case, I think you'd have a hard time demonstrating a dump order
inconsistency with only two tables. If there is a problem here,
it'd likely require three or more somehow-related tables.
regards, tom lane
On Sat, 16 May 2026 at 13:25, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Isaac Morland <isaac.morland@gmail.com> writes:
However, I didn't think to mention that the database is running 14.22
while
the client pg_dump is 18.1.
Hmm, that might be relevant. However, I tried your example
both with current-HEAD server + current-HEAD pg_dump and
with 14.23 server + current-HEAD pg_dump. For me, the output
is stable across dump/reload except for the random \restrict keys.I know that our handling of NOT NULL constraints changed significantly
in v18 and there have been multiple bug fixes in that area, so maybe
you'd have better luck with 18.4? I didn't spot any smoking guns in
a quick trawl of the commit log back to 18.1, but I might've been
searching for the wrong keywords.
Thanks for this effort. I'll see if I can make any progress on identifying
what triggers my problem to occur.
CREATE TABLE ijmorlan.test_child (
NOT NULL f
)
INHERITS (ijmorlan.test_parent);The CREATE TABLE for test_child is not syntactically valid.
Yeah it is. The standalone constraint clause is perfectly fine,
the column declaration comes from the parent, and the primary
key constraint gets added on later (after loading data). In
any case, I think you'd have a hard time demonstrating a dump order
inconsistency with only two tables. If there is a problem here,
it'd likely require three or more somehow-related tables.
My apologies for the confusion, I now see that one of the syntax
definitions of table_constraint in 18 is "NOT NULL column_name". This form
doesn't exist in 14.
The 2 sample table definitions were just to show what I had misunderstood
to be a bug in pg_dump, not to illustrate anything about my table order
problem, which involves how 3 tables get ordered relative to the rest of
the tables.
Import Notes
Reply to msg id not found: 1508563.1778952329@sss.pgh.pa.us