BUG #14192: pg_dump/pg_restore omit setting search_path in restored db
The following bug has been logged on the website:
Bug reference: 14192
Logged by: Hans Buschmann
Email address: buschmann@nidsa.net
PostgreSQL version: 9.5.3
Operating system: Windows 10 64bit
Description:
To transport a whole database from Production to test/development I dump the
whole database with pg_dump -d xxxdb -Fc -f mydumpfile.dmp.
After creating the database on the target system with createdb I restore the
database with pg_restore -d xxxdb mydumpfile.dmp. (no other flags except
login information involved)
All Information is correctly restored (2 schemas and their content) except
the search path.
On the origin database I had entered
ALTER DATABASE xxxdb SET search_path TO public,partition2;
The restore omits setting the search path, it must be entered again with
ALTER DATABASE on the target machine.
The same occurs naturally also in the case of an emergency restore in case
of failure, which breaks the restore from running correctly without
expertise of the underlying database in time of restoration.
With my understanding, that the search path is an inherent part of the
origin database structure, the pg_dump/pg_restore sequence does not
completely reproduce the original database. I think this is a bug.
All is done on 9.5.3, 64bit Windows edition on Windows 64bit platforms, but
I suppose it is platform independent. I haven't tested it on different
versions or platforms.
I haven't tested whether pg_dump or pg_restore omits the setting (mostly
because of not using an SQL dump format).
Thanks for looking
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
buschmann@nidsa.net writes:
On the origin database I had entered
ALTER DATABASE xxxdb SET search_path TO public,partition2;
The restore omits setting the search path, it must be entered again with
ALTER DATABASE on the target machine.
This is operating as designed: pg_dump dumps the content of a database,
not settings attached to the database object as such. You can use
pg_dumpall (possibly with the -g switch) to capture that info.
There've been repeated discussions about refactoring this division of
labor, but nothing's been agreed to yet.
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Wed, Jun 15, 2016 at 11:49 AM, Hans Buschmann <buschmann@nidsa.net>
wrote:
Thank you for your quick reply (my first post/bug report).
When changeing my database partly to partitions, I introduced two schemas
to separate current and archive data.According to Postgres DOC chapter 5.8.3 it is generally not advisable to
use schema qualified names for any objects but to use search_path instead.In my opinion this encouraged naming of objects without explicit schema is
semantically part of the application (e.g. functions) even when not written
by words.When setting the search_path altered for the database it becomes
semantically a part of the database and the application. This implies it
should be dumped with the content of the database to preserve the
consistency of the application.The same applies to cases with only one schema with no standard name (when
public becomes myapplication).My point is the logical consistency of what consists a database and how to
transport all information in one container (a dump).Even the syntax (ALTER DATABASE xxxdb SET SEARCH PATH) suggests this to be
part of the database and not of a session or the cluster.These are my 2 cents as being relatively new to PostgreSQL.
I suspect most people would agree with this sentiment. But the status-quo,
while less than ideal, allows for the end result to be realized, if
imperfectly, and thus the motivation to donate ones time to improving it is
not that great.
The underlying point is that users are global - but there is no "global
shell" to operate from so every command has to be executed while within a
specific database. The specific mapping you are making when you do "ALTER
DATABASE" has an optional user component (which if not specified simply
means all users) and so it too ends up being global.
David J.
Import Notes
Reply to msg id not found: D2B9F2A20670C84685EF7D183F2949E2373D5F@gigant.nidsa.net
On 6/15/2016 8:49 AM, Hans Buschmann wrote:
Even the syntax (ALTER DATABASE xxxdb SET SEARCH PATH) suggests this
to be part of the database and not of a session or the cluster.
that syntax is indeed stored with the database schema, and would be part
of a database dump.
you also can simply SET search_path 'somevalue'; and this applies
only to the current session, and wouldn't be part of the database
--
john r pierce, recycling bits in santa cruz
Import Notes
Reply to msg id not found: D2B9F2A20670C84685EF7D183F2949E2373D5F@gigant.nidsa.net
Thank you for your quick reply (my first post/bug report).
When changeing my database partly to partitions, I introduced two schemas to separate current and archive data.
According to Postgres DOC chapter 5.8.3 it is generally not advisable to use schema qualified names for any objects but to use search_path instead.
In my opinion this encouraged naming of objects without explicit schema is semantically part of the application (e.g. functions) even when not written by words.
When setting the search_path altered for the database it becomes semantically a part of the database and the application. This implies it should be dumped with the content of the database to preserve the consistency of the application.
The same applies to cases with only one schema with no standard name (when public becomes myapplication).
My point is the logical consistency of what consists a database and how to transport all information in one container (a dump).
Even the syntax (ALTER DATABASE xxxdb SET SEARCH PATH) suggests this to be part of the database and not of a session or the cluster.
These are my 2 cents as being relatively new to PostgreSQL.
Thanks
Hans Buschmann
On Wed, Jun 15, 2016 at 1:26 PM, Hans Buschmann <buschmann@nidsa.net> wrote:
On 6/15/2016 John R Pierce wrote:
that syntax is indeed stored with the database schema, and would be part
of a database dump.you also can simply SET search_path 'somevalue'; and this applies
only to the current session, and wouldn't be part of the databaseThis is the point I don't understand.
You said, the syntax ALTER DATABASE xxxdb SET SEARCH PATH is stored in the
database and would be part of a database dump.But to my observation, this is not the case. I certainly set the search
path to the database (not the session) on the dumping machine, but it is
not restored with pg_restore.I verified this with a little test database and checked the resulting SQL
dump. (You may do it yourself).The databse search path appeared not in the dump.
Not sure if John just mis-understood you, mis-spoke, or doesn't understand
what "ALTER DATABASE SET" does but indeed his comment was incorrect insofar
as it was used to answer the question you posed. ALTER DATABASE SET
installs a cluster record that points to a specific database in the same
cluster. The cluster contains multiple databases and pg_dumpall is used to
obtain those items that are cluster-oriented (even if the specific values
only would pertain to a single database in the cluster).
David J.
Import Notes
Reply to msg id not found: D2B9F2A20670C84685EF7D183F2949E2373D60@gigant.nidsa.net
On 6/15/2016 John R Pierce wrote:
that syntax is indeed stored with the database schema, and would be part
of a database dump.you also can simply SET search_path 'somevalue'; and this applies
only to the current session, and wouldn't be part of the database
This is the point I don't understand.
You said, the syntax ALTER DATABASE xxxdb SET SEARCH PATH is stored in the database and would be part of a database dump.
But to my observation, this is not the case. I certainly set the search path to the database (not the session) on the dumping machine, but it is not restored with pg_restore.
I verified this with a little test database and checked the resulting SQL dump. (You may do it yourself).
The databse search path appeared not in the dump.
To test:
postgres=# create database test1 template=template0 encoding 'UTF8';
CREATE DATABASE
postgres=# \c test1
test1=# create schema test2;
CREATE SCHEMA
test1=# show search_path;
search_path
-----------------
"$user", public
(1 Zeile)
test1=# alter database test1 set search_path TO public,test2;
ALTER DATABASE
test1=# create table pubtable (id int);
CREATE TABLE
test1=# create table test2.p2table (id2 int);
CREATE TABLE
(after reconnect to database to start a new session)
test1=# \dt
Liste der Relationen
Schema | Name | Typ | Eigent³mer
--------+----------+---------+------------
public | pubtable | Tabelle | postgres
test2 | p2table | Tabelle | postgres
(2 Zeilen)
test1=# \q
pg_dump -U postgres -d test1 -f test1_search_path.sql
I hope I have been more clear now
Hans Buschmann
I understand the differences between cluster and database and pg_dumpall and pg_dump.
In my opinion a pg_dump of a database should pack all informations of the application (the database) in the dumpfile in one container, to be able to restore it full functional at a different place.
Because the search_path is a crucial information for the application to work correctly (like any other object inside the database) it should be packed into this container called pg_dump dumpfile.
This should be independent of the current implementation, where we store the search_path in a cluster record: The informatation belongs semantically to the content of the database, even if it is stored elsewhere.
My concern with promoting this suggestion is to avoid trouble in emergency cases, logical consistency, intuitive usage of pg_dump and smooth experience for non-expert people.
Hans B.
On Thursday, June 16, 2016, Hans Buschmann <buschmann@nidsa.net> wrote:
I understand the differences between cluster and database and pg_dumpall
and pg_dump.In my opinion a pg_dump of a database should pack all informations of the
application (the database) in the dumpfile in one container, to be able to
restore it full functional at a different place.Because the search_path is a crucial information for the application to
work correctly (like any other object inside the database) it should be
packed into this container called pg_dump dumpfile.This should be independent of the current implementation, where we store
the search_path in a cluster record: The informatation belongs semantically
to the content of the database, even if it is stored elsewhere.My concern with promoting this suggestion is to avoid trouble in
emergency cases, logical consistency, intuitive usage of pg_dump and smooth
experience for non-expert people.
You've made your opinion quite clear and likely others share it. But
someone needs to design and write a patch - we can't commit opinions. I'd
suggest, at least pushing for this after 9.6 is released. No one is really
interested in pondering this days before beta2 is to be released. I'd
suggest researching past discussions on the topic, I'm sure there are some,
in the meantime.
David J.
Thank you for investigating. (I am not able to contribute with patches, at least in the foreseable future...)
By casually following pgsql-hackers I am aware of the current release cycle.
I really don't expect this to be addressed immediately with 9.6 b2 around the corner.
To have a firm opinion should not be misundertood as a push for a fast fix, but seen more as part of quality assurance.
Thanks
Hans B.