9.2beta1 regression: pg_restore --data-only does not set sequence values any more
Hello all,
while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common
test suite noticed a regression: It seems that pg_restore --data-only
now skips the current value of sequences, so that in the upgraded
database the sequence counter is back to the default. This can lead to
rather serious data errors.
Reproducer:
* Create a db and sequence:
$ createdb test
$ psql test -c "CREATE SEQUENCE odd10 INCREMENT BY 2 MINVALUE 1 MAXVALUE 10 CYCLE"
CREATE SEQUENCE
* Advance it two steps:
$ psql --cluster 9.2/main -Atc "SELECT nextval('odd10')" test
1
$ psql --cluster 9.2/main -Atc "SELECT nextval('odd10')" test
3
* Dump schema and "test" data:
$ pg_dumpall -s > /tmp/schema
$ pg_dump -Fc test > /tmp/test.dump
* Drop db:
$ dropdb test
* Restore DB:
$ psql template1 < /tmp/schema
$ pg_restore --data-only -d postgres /tmp/test.dump
* Check the counter:
$ psql -Atc "SELECT nextval('odd10')" test
1
This is wrong, it should be "5", from the original database. This
worked all the way up to 9.1.
With "pg_restore /tmp/test.dump" (i. e. a full dump), you see
SELECT pg_catalog.setval('odd10', 3, true);
in the dump, but it is not present in the output of
"pg_restore --data-only /tmp/test.dump". However, when I use the 9.1
version, it works:
$ /usr/lib/postgresql/9.1/bin/pg_restore --data-only /tmp/test.dump | grep setval
SELECT pg_catalog.setval('odd10', 3, true);
Thanks!
Martin
--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Martin Pitt <mpitt@debian.org> writes:
while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common
test suite noticed a regression: It seems that pg_restore --data-only
now skips the current value of sequences, so that in the upgraded
database the sequence counter is back to the default.
I believe this is a consequence of commit
a4cd6abcc901c1a8009c62a27f78696717bb8fe1, which introduced the entirely
false assumption that --schema-only and --data-only have something to
do with the order that entries appear in the archive ...
regards, tom lane
On Wed, May 16, 2012 at 9:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Martin Pitt <mpitt@debian.org> writes:
while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common
test suite noticed a regression: It seems that pg_restore --data-only
now skips the current value of sequences, so that in the upgraded
database the sequence counter is back to the default.I believe this is a consequence of commit
a4cd6abcc901c1a8009c62a27f78696717bb8fe1, which introduced the entirely
false assumption that --schema-only and --data-only have something to
do with the order that entries appear in the archive ...
Darn, will investigate.
cheers
andrew
On 05/16/2012 10:23 AM, Andrew Dunstan wrote:
On Wed, May 16, 2012 at 9:08 AM, Tom Lane <tgl@sss.pgh.pa.us
<mailto:tgl@sss.pgh.pa.us>> wrote:Martin Pitt <mpitt@debian.org <mailto:mpitt@debian.org>> writes:
while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common
test suite noticed a regression: It seems that pg_restore--data-only
now skips the current value of sequences, so that in the upgraded
database the sequence counter is back to the default.I believe this is a consequence of commit
a4cd6abcc901c1a8009c62a27f78696717bb8fe1, which introduced the
entirely
false assumption that --schema-only and --data-only have something to
do with the order that entries appear in the archive ...Darn, will investigate.
[cc -hackers]
Well, the trouble is that we have these pesky SECTION_NONE entries for
things like comments, security labels and ACLs that need to be dumped in
the right section, so we can't totally ignore the order. But we could
(and probably should) ignore the order for making decisions about
everything BUT those entries.
So, here's a revised plan:
--section=data will dump exactly TABLE DATA, SEQUENCE SET or BLOBS
entries
--section=pre-data will dump SECTION_PRE_DATA items (other than
SEQUENCE SET) plus any immediately following SECTION_NONE items.
--section=post-data will dump everything else.
Comments?
cheers
andrew
On 05/21/2012 02:59 PM, Andrew Dunstan wrote:
On 05/16/2012 10:23 AM, Andrew Dunstan wrote:
On Wed, May 16, 2012 at 9:08 AM, Tom Lane <tgl@sss.pgh.pa.us
<mailto:tgl@sss.pgh.pa.us>> wrote:Martin Pitt <mpitt@debian.org <mailto:mpitt@debian.org>> writes:
while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common
test suite noticed a regression: It seems that pg_restore--data-only
now skips the current value of sequences, so that in the upgraded
database the sequence counter is back to the default.I believe this is a consequence of commit
a4cd6abcc901c1a8009c62a27f78696717bb8fe1, which introduced the
entirely
false assumption that --schema-only and --data-only have
something to
do with the order that entries appear in the archive ...Darn, will investigate.
[cc -hackers]
Well, the trouble is that we have these pesky SECTION_NONE entries for
things like comments, security labels and ACLs that need to be dumped
in the right section, so we can't totally ignore the order. But we
could (and probably should) ignore the order for making decisions
about everything BUT those entries.So, here's a revised plan:
--section=data will dump exactly TABLE DATA, SEQUENCE SET or BLOBS
entries
--section=pre-data will dump SECTION_PRE_DATA items (other than
SEQUENCE SET) plus any immediately following SECTION_NONE items.
--section=post-data will dump everything else.
It turns out there were some infelicities with pg_dump as well as with
pg_restore.
I think the attached patch does the right thing. I'll keep testing -
I'll be happier if other people bang on it too.
cheers
andrew
Attachments:
sectionfix.patchtext/x-patch; name=sectionfix.patchDownload
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
***************
*** 2341,2354 **** _tocEntryRequired(TocEntry *te, RestoreOptions *ropt, bool include_acls)
if (!ropt->createDB && strcmp(te->desc, "DATABASE") == 0)
return 0;
! /* skip (all but) post data section as required */
! /* table data is filtered if necessary lower down */
if (ropt->dumpSections != DUMP_UNSECTIONED)
{
! if (!(ropt->dumpSections & DUMP_POST_DATA) && te->inPostData)
! return 0;
! if (!(ropt->dumpSections & DUMP_PRE_DATA) && ! te->inPostData && strcmp(te->desc, "TABLE DATA") != 0)
return 0;
}
--- 2341,2365 ----
if (!ropt->createDB && strcmp(te->desc, "DATABASE") == 0)
return 0;
! /*
! * Skip pre and post data section as required
! * Data is filtered if necessary lower down
! * Sequence set operations are in the pre data section for parallel
! * processing purposes, but part of the data section for sectioning
! * purposes.
! * SECTION_NONE items are filtered according to where they are
! * positioned in the list of TOC entries.
! */
if (ropt->dumpSections != DUMP_UNSECTIONED)
{
! if (!(ropt->dumpSections & DUMP_POST_DATA) && /* post data skip */
! ((te->section == SECTION_NONE && te->inPostData) ||
! te->section == SECTION_POST_DATA))
return 0;
+ if (!(ropt->dumpSections & DUMP_PRE_DATA) && /* pre data skip */
+ ((te->section == SECTION_NONE && ! te->inPostData) ||
+ (te->section == SECTION_PRE_DATA && strcmp(te->desc, "SEQUENCE SET") != 0)))
+ return 0;
}
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
***************
*** 7096,7101 **** dumpDumpableObject(Archive *fout, DumpableObject *dobj)
--- 7096,7103 ----
switch (dobj->objType)
{
+ case DO_TABLE:
+ break; /* has its own controls */
case DO_INDEX:
case DO_TRIGGER:
case DO_CONSTRAINT:
***************
*** 12075,12081 **** dumpTable(Archive *fout, TableInfo *tbinfo)
if (tbinfo->relkind == RELKIND_SEQUENCE)
dumpSequence(fout, tbinfo);
! else if (!dataOnly)
dumpTableSchema(fout, tbinfo);
/* Handle the ACL here */
--- 12077,12083 ----
if (tbinfo->relkind == RELKIND_SEQUENCE)
dumpSequence(fout, tbinfo);
! else if (dumpSections & DUMP_PRE_DATA)
dumpTableSchema(fout, tbinfo);
/* Handle the ACL here */
***************
*** 13291,13297 **** dumpSequence(Archive *fout, TableInfo *tbinfo)
*
* Add a 'SETVAL(seq, last_val, iscalled)' as part of a "data" dump.
*/
! if (!dataOnly)
{
/*
* DROP must be fully qualified in case same name appears in
--- 13293,13299 ----
*
* Add a 'SETVAL(seq, last_val, iscalled)' as part of a "data" dump.
*/
! if (dumpSections & DUMP_PRE_DATA)
{
/*
* DROP must be fully qualified in case same name appears in
***************
*** 13412,13418 **** dumpSequence(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId, 0, tbinfo->dobj.dumpId);
}
! if (!schemaOnly)
{
resetPQExpBuffer(query);
appendPQExpBuffer(query, "SELECT pg_catalog.setval(");
--- 13414,13420 ----
tbinfo->dobj.catId, 0, tbinfo->dobj.dumpId);
}
! if (dumpSections & DUMP_DATA)
{
resetPQExpBuffer(query);
appendPQExpBuffer(query, "SELECT pg_catalog.setval(");
Andrew Dunstan <andrew@dunslane.net> writes:
Martin Pitt <mpitt@debian.org <mailto:mpitt@debian.org>> writes:
while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common
test suite noticed a regression: It seems that pg_restore --data-only
now skips the current value of sequences, so that in the upgraded
database the sequence counter is back to the default.
It turns out there were some infelicities with pg_dump as well as with
pg_restore.
I think the attached patch does the right thing. I'll keep testing -
I'll be happier if other people bang on it too.
After looking this over, I think the original patch was just
fundamentally wrong and needs to be largely rewritten. The basic
error was in saying that the existing options --schema-only and
--data-only were equivalent to particular cases of --section, which
they are not. The proposed new patch does not make this better;
it just makes the logic even more spaghetti-ish.
I think what we need is to rip all that out and treat --section as being
a new option that's not tied to the old ones in any way, but is an
entirely orthogonal filter. The right place to implement it (for either
pg_dump or pg_restore) is in the TOC-scanning loops in
pg_backup_archiver.c, which can track which section they are in fairly
easily (probably define it as being the current item's section unless
that's SECTION_NONE, in which case use the previous section value).
BTW, I'm thinking we could make that code simpler and faster if the
_tocEntryRequired logic were done only once in an initial pass, and then
we stored the teReqs result into a work field in the TocEntry struct
for use in later passes.
Andrew told me off-list that he would be unavailable due to travel for
awhile, so I will have a go at fixing this.
regards, tom lane