sequences and pg_upgrade
I was toying with a couple of ideas that would involve changing the
storage of sequences. (Say, for the sake of discussion, removing the
problematic/useless sequence_name field.) This would cause problems for
pg_upgrade, because pg_upgrade copies the "heap" storage of sequences
like it does for normal tables, and we have no facilities for effecting
any changes during that.
There was a previous discussion in the early days of pg_migrator, which
resulted in the current behavior:
/messages/by-id/20090713220112.GF7933@klana.box
This also alluded to what I think was the last change in the sequence
storage format (10a3471bed7b57fb986a5be8afdee5f0dda419de) between
versions 8.3 and 8.4. How did pg_upgrade handle that?
I think the other solution mentioned in that thread would also work:
Have pg_upgrade treat sequences more like system catalogs, whose format
changes between major releases, and transferred them via the
dump/restore route. So instead of copying the disk files, issue a
setval call, and the sequence should be all set up.
Am I missing anything?
Attached is a rough patch set that would implement that.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-pg_dump-Separate-table-data-and-sequence-data-object.patchtext/x-patch; name=0001-pg_dump-Separate-table-data-and-sequence-data-object.patchDownload+15-5
0002-pg_dump-Add-sequence-data-option.patchtext/x-patch; name=0002-pg_dump-Add-sequence-data-option.patchDownload+29-2
0003-pg_upgrade-Skip-copying-sequence-files.patchtext/x-patch; name=0003-pg_upgrade-Skip-copying-sequence-files.patchDownload+2-3
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
I was toying with a couple of ideas that would involve changing the
storage of sequences. (Say, for the sake of discussion, removing the
problematic/useless sequence_name field.) This would cause problems for
pg_upgrade, because pg_upgrade copies the "heap" storage of sequences
like it does for normal tables, and we have no facilities for effecting
any changes during that.
There was a previous discussion in the early days of pg_migrator, which
resulted in the current behavior:
/messages/by-id/20090713220112.GF7933@klana.box
This also alluded to what I think was the last change in the sequence
storage format (10a3471bed7b57fb986a5be8afdee5f0dda419de) between
versions 8.3 and 8.4. How did pg_upgrade handle that?
I think it probably never did handle that. pg_upgrade doesn't currently
claim to support migrating from 8.3, and the thread you mention shows that
the original attempt at 8.3->8.4 migration crashed-and-burned for numerous
unrelated reasons. We may not have ever got to the point of noticing that
10a3471be also created a problem.
I think the other solution mentioned in that thread would also work:
Have pg_upgrade treat sequences more like system catalogs, whose format
changes between major releases, and transferred them via the
dump/restore route. So instead of copying the disk files, issue a
setval call, and the sequence should be all set up.
Seems reasonable.
If you're proposing to expose --sequence-data as a user-visible option,
the patch set lacks documentation. But I wonder whether it shouldn't
simply be a side-effect of --binary-upgrade. It seems a tad
non-orthogonal for a user switch.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Aug 30, 2016 at 08:46:48AM -0400, Peter Eisentraut wrote:
I think the other solution mentioned in that thread would also work:
Have pg_upgrade treat sequences more like system catalogs, whose format
changes between major releases, and transferred them via the
dump/restore route. So instead of copying the disk files, issue a
setval call, and the sequence should be all set up.Am I missing anything?
Looks straight-forward to me.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2016-08-30 08:46:48 -0400, Peter Eisentraut wrote:
I was toying with a couple of ideas that would involve changing the
storage of sequences. (Say, for the sake of discussion, removing the
problematic/useless sequence_name field.)
I'd be quite interested to know what changes that are...
I think the other solution mentioned in that thread would also work:
Have pg_upgrade treat sequences more like system catalogs, whose format
changes between major releases, and transferred them via the
dump/restore route. So instead of copying the disk files, issue a
setval call, and the sequence should be all set up.
+1.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, failed
Thank you for the patch.
As I see there are no objections in the discussion, all the patches look clear.
Could you clarify, please, why do we dump sequence in schemaOnly mode?
+ if (dopt.schemaOnly && dopt.sequence_data)
+ getSequenceData(&dopt, tblinfo, numTables, dopt.oids);
Example:
postgres=# create table t(i serial, data text);
postgres=# insert into t(data) values ('aaa');
pg_dump -d postgres --sequence-data --schema-only > ../reviews/dump_pg
Then restore it into newdb and add new value.
newdb=# insert into t(data) values ('aaa');
INSERT 0 1
newdb=# select * from t;
i | data
---+------
2 | aaa
I'm not an experienced user, but I thought that while doing dump/restore
of schema of database we reset all the data. Why should the table in newly
created (via pg_restore) database have non-default sequence value?
I also did some other tests and all of them were passed.
One more thing to do is a documentation for the new option.
You should update help() function in pg_dump.c and also add some
notes to pg_dump.sgml and probably to pgupgrade.sgml.
The new status of this patch is: Waiting on Author
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 9/14/16 8:52 AM, Anastasia Lubennikova wrote:
Could you clarify, please, why do we dump sequence in schemaOnly mode? + if (dopt.schemaOnly && dopt.sequence_data) + getSequenceData(&dopt, tblinfo, numTables, dopt.oids);
The point of this patch is that with the new option, you can dump
sequence data (but not table data) alongside with the schema. This
would be used by pg_upgrade for the reasons described at the beginning
of the thread.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
15.09.2016 15:29, Peter Eisentraut:
On 9/14/16 8:52 AM, Anastasia Lubennikova wrote:
Could you clarify, please, why do we dump sequence in schemaOnly mode? + if (dopt.schemaOnly && dopt.sequence_data) + getSequenceData(&dopt, tblinfo, numTables, dopt.oids);The point of this patch is that with the new option, you can dump
sequence data (but not table data) alongside with the schema. This
would be used by pg_upgrade for the reasons described at the beginning
of the thread.
Oh, thank you. Now I see.
Somewhy I thought that it *always* dumps sequence data in schemaOnly mode.
Which is definitely not true.
--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Here is an updated patch set. Compared to the initial set, I have
changed pg_dump's sorting priorities so that sequence data is always
after table data. This would otherwise have introduced a problem
because sortDataAndIndexObjectsBySize() only considers consecutive
DO_TABLE_DATA entries. Also, I have removed the separate
--sequence-data switch from pg_dump and made it implicit in
--binary-upgrade. (So the previous patches 0002 and 0003 have been
combined, because it's no longer a separate feature.)
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
v2-0001-pg_dump-Separate-table-and-sequence-data-object-t.patchinvalid/octet-stream; name=v2-0001-pg_dump-Separate-table-and-sequence-data-object-t.patchDownload+37-27
v2-0002-pg_upgrade-Upgrade-sequence-data-via-pg_dump.patchinvalid/octet-stream; name=v2-0002-pg_upgrade-Upgrade-sequence-data-via-pg_dump.patchDownload+24-7
23.09.2016 21:06, Peter Eisentraut:
Here is an updated patch set. Compared to the initial set, I have
changed pg_dump's sorting priorities so that sequence data is always
after table data. This would otherwise have introduced a problem
because sortDataAndIndexObjectsBySize() only considers consecutive
DO_TABLE_DATA entries. Also, I have removed the separate
--sequence-data switch from pg_dump and made it implicit in
--binary-upgrade. (So the previous patches 0002 and 0003 have been
combined, because it's no longer a separate feature.)
The patches are good, no complaints.
But again, I have the same question.
I was confused, why do we always dump sequence data,
because I'd overlooked the --sequence-data key. I'd rather leave this
option,
because it's quite non intuitive behaviour...
/* dump sequence data even in schema-only mode */
--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Oct 1, 2016 at 1:50 AM, Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:
23.09.2016 21:06, Peter Eisentraut:
Here is an updated patch set. Compared to the initial set, I have
changed pg_dump's sorting priorities so that sequence data is always
after table data. This would otherwise have introduced a problem
because sortDataAndIndexObjectsBySize() only considers consecutive
DO_TABLE_DATA entries. Also, I have removed the separate
--sequence-data switch from pg_dump and made it implicit in
--binary-upgrade. (So the previous patches 0002 and 0003 have been
combined, because it's no longer a separate feature.)The patches are good, no complaints.
But again, I have the same question.
I was confused, why do we always dump sequence data,
because I'd overlooked the --sequence-data key. I'd rather leave this
option,
because it's quite non intuitive behaviour...
/* dump sequence data even in schema-only mode */
Moved to next CF. This is fresh.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 9/30/16 12:50 PM, Anastasia Lubennikova wrote:
The patches are good, no complaints.
But again, I have the same question.
I was confused, why do we always dump sequence data,
because I'd overlooked the --sequence-data key. I'd rather leave this
option,
because it's quite non intuitive behaviour...
/* dump sequence data even in schema-only mode */
Here are rebased patches.
Regarding your question: The initial patch had a separate option for
this behavior, which was then used by pg_upgrade. It was commented that
this option is not useful outside of pg_upgrade, so it doesn't need to
be exposed as a user-facing option. I agreed with that and removed the
option. We can always add the option back easily if someone really
wants it, but so far no use case has been presented. So I suggest we
proceed with this proposal ignoring whether this option is exposed or not.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
v3-0001-pg_dump-Separate-table-and-sequence-data-object-t.patchtext/x-patch; name=v3-0001-pg_dump-Separate-table-and-sequence-data-object-t.patchDownload+25-16
v3-0002-pg_upgrade-Upgrade-sequence-data-via-pg_dump.patchtext/x-patch; name=v3-0002-pg_upgrade-Upgrade-sequence-data-via-pg_dump.patchDownload+24-7
On Mon, Oct 31, 2016 at 9:53 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
On 9/30/16 12:50 PM, Anastasia Lubennikova wrote:
The patches are good, no complaints.
But again, I have the same question.
I was confused, why do we always dump sequence data,
because I'd overlooked the --sequence-data key. I'd rather leave this
option,
because it's quite non intuitive behaviour...
/* dump sequence data even in schema-only mode */Here are rebased patches.
Regarding your question: The initial patch had a separate option for
this behavior, which was then used by pg_upgrade. It was commented that
this option is not useful outside of pg_upgrade, so it doesn't need to
be exposed as a user-facing option. I agreed with that and removed the
option. We can always add the option back easily if someone really
wants it, but so far no use case has been presented. So I suggest we
proceed with this proposal ignoring whether this option is exposed or not.
I had a look at those fresh patches, and 0001 looks like a good thing.
This makes the separation between sequences and table data dump
cleaner. I ran some tests with pg_upgrade and 0002, and things are
clear. And +1 for the way done in the patch, aka no options of pg_dump
exposed to user, still keep the option tracking as a separate value.
One small thing here:
static void
-getTableData(DumpOptions *dopt, TableInfo *tblinfo, int numTables, bool oids)
+getTableData(DumpOptions *dopt, TableInfo *tblinfo, int numTables,
bool oids, char relkind)
{
int i;
for (i = 0; i < numTables; i++)
{
- if (tblinfo[i].dobj.dump & DUMP_COMPONENT_DATA)
+ if (tblinfo[i].dobj.dump & DUMP_COMPONENT_DATA &&
+ (!relkind || tblinfo[i].relkind == relkind))
makeTableDataInfo(dopt, &(tblinfo[i]), oids)
One idea here would be to have an extra routine, getSequenceData and
not extend getTableData() with relkind as extra argument. I am fine
with the way patch does things, so I just switched the patch as ready
for committer.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 11/2/16 2:34 AM, Michael Paquier wrote:
I had a look at those fresh patches, and 0001 looks like a good thing.
This makes the separation between sequences and table data dump
cleaner. I ran some tests with pg_upgrade and 0002, and things are
clear. And +1 for the way done in the patch, aka no options of pg_dump
exposed to user, still keep the option tracking as a separate value.
Committed, thanks.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers