pg_upgrade

Started by Brian Hirtover 15 years ago10 messagesgeneral

bhirt@me.com

over 15 years ago

I'm testing pg_upgrade out and ran into a couple of problems. First when I did pg_upgrade --check I got the tsearch2 tables preventing the upgrade from happening:

Database: testdatabase
public.pg_ts_dict.dict_init
public.pg_ts_dict.dict_lexize
public.pg_ts_parser.prs_start
public.pg_ts_parser.prs_nexttoken
public.pg_ts_parser.prs_end
public.pg_ts_parser.prs_headline
public.pg_ts_parser.prs_lextype

For testing, at this point I really didn't care about tsearch, so I simply dropped those tables so I could revisit them later -- however, I'm confused about these tables in general, both pg_catalog.pg_ts_parser and public.pg_ts_parser exist with different, albeit similar, schemas. I think that the table in public is no longer used and was a remnant from pre-8.3 when tsearch2 wasn't part of the distribution, can anyone confirm this?

Anyway, after removing the tsearch tables, I did pg_upgrade --check again and it said the clusters were compatible. I proceeded to run the upgrade command and it bombed out in the "Restoring user relation files" section. I've included some output below, any advice on what is going on? It seems something is messed up in either the check logic or actual migration code.

root@ubuntu:~# /usr/pg-8.4/bin/oid2name
All databases:
Oid Database Name Tablespace
-------------------------------------------
...
11564 postgres pg_default
...
root@ubuntu:~# /usr/pg-8.4/bin/oid2name -o 2683

From database "postgres":

Filenode Table Name
----------------------------------------
2683 pg_largeobject_loid_pn_index

postgres@ubuntu:~$ /usr/pg-9.0/bin/pg_upgrade
Performing Consistency Checks
-----------------------------
Checking old data directory (/moby/pgdb-8.4) ok
Checking old bin directory (/usr/pg-8.4/bin) ok
Checking new data directory (/moby/pgdb-9.0) ok
Checking new bin directory (/usr/pg-9.0/bin) ok
Checking for reg* system oid user data types ok
Checking for /contrib/isn with bigint-passing mismatch ok
Checking for large objects ok
Creating catalog dump ok
Checking for presence of required libraries ok

| If pg_upgrade fails after this point, you must
| re-initdb the new cluster before continuing.
| You will also need to remove the ".old" suffix
| from /moby/pgdb-8.4/global/pg_control.old.

Performing Migration
--------------------
Adding ".old" suffix to old global/pg_control ok
Analyzing all rows in the new cluster ok
Freezing all rows on the new cluster ok
Deleting new commit clogs ok
Copying old commit clogs to new server ok
Setting next transaction id for new cluster ok
Resetting WAL archives ok
Setting frozenxid counters in new cluster ok
Creating databases in the new cluster ok
Adding support functions to new cluster ok
Restoring database schema to new cluster ok
Removing support functions from new cluster ok
Restoring user relation files
/moby/pgdb-8.4/base/11564/2683
Could not find pg_toast.pg_toast_2147483647 in new cluster

Tom Lane

tgl@sss.pgh.pa.us

over 15 years ago

In reply to: Brian Hirt (#1)

Re: pg_upgrade

Brian Hirt <bhirt@me.com> writes:

I'm testing pg_upgrade out and ran into a couple of problems. First when I did pg_upgrade --check I got the tsearch2 tables preventing the upgrade from happening:
Database: testdatabase
public.pg_ts_dict.dict_init
public.pg_ts_dict.dict_lexize
public.pg_ts_parser.prs_start
public.pg_ts_parser.prs_nexttoken
public.pg_ts_parser.prs_end
public.pg_ts_parser.prs_headline
public.pg_ts_parser.prs_lextype

For testing, at this point I really didn't care about tsearch, so I simply dropped those tables so I could revisit them later -- however, I'm confused about these tables in general, both pg_catalog.pg_ts_parser and public.pg_ts_parser exist with different, albeit similar, schemas. I think that the table in public is no longer used and was a remnant from pre-8.3 when tsearch2 wasn't part of the distribution, can anyone confirm this?

Correct, you should just drop the ones that aren't in pg_catalog.

Anyway, after removing the tsearch tables, I did pg_upgrade --check again and it said the clusters were compatible. I proceeded to run the upgrade command and it bombed out in the "Restoring user relation files" section.

That sure looks like a bug, but there's not enough info here to
diagnose. Is there actually a pg_toast.pg_toast_2147483647 table
in the 8.4 cluster? (I'm betting not.) Could you try extracting
a test case? I wonder whether "pg_dump -s" from the 8.4 database,
loaded into a fresh 8.4 database, would be enough to reproduce.

regards, tom lane

Brian Hirt

bhirt@me.com

over 15 years ago

In reply to: Tom Lane (#2)

Re: pg_upgrade

Looks like pg_upgrade is using 32bit oids. 2147483647 is the max signed 32 bit int, but the oids for my tables are clearly larger than that.

== output from pg_upgrade ==
Database: basement84_dev
relname: mit.company: reloid: 2147483647 reltblspace:
relname: mit.company_history: reloid: 2147483647 reltblspace:

== output from catalog query ==
basement84_dev=# select c.oid,c.relname from pg_catalog.pg_namespace n, pg_catalog.pg_class c where n.oid = c.relnamespace and n.nspname = 'mit';
oid | relname
------------+--------------------
3000767630 | company
3000767633 | company_history
(22 rows)

On Sep 28, 2010, at 10:51 AM, Tom Lane wrote:

Show quoted text

Brian Hirt <bhirt@me.com> writes:

I'm testing pg_upgrade out and ran into a couple of problems. First when I did pg_upgrade --check I got the tsearch2 tables preventing the upgrade from happening:
Database: testdatabase
public.pg_ts_dict.dict_init
public.pg_ts_dict.dict_lexize
public.pg_ts_parser.prs_start
public.pg_ts_parser.prs_nexttoken
public.pg_ts_parser.prs_end
public.pg_ts_parser.prs_headline
public.pg_ts_parser.prs_lextype

For testing, at this point I really didn't care about tsearch, so I simply dropped those tables so I could revisit them later -- however, I'm confused about these tables in general, both pg_catalog.pg_ts_parser and public.pg_ts_parser exist with different, albeit similar, schemas. I think that the table in public is no longer used and was a remnant from pre-8.3 when tsearch2 wasn't part of the distribution, can anyone confirm this?

Correct, you should just drop the ones that aren't in pg_catalog.

Anyway, after removing the tsearch tables, I did pg_upgrade --check again and it said the clusters were compatible. I proceeded to run the upgrade command and it bombed out in the "Restoring user relation files" section.

That sure looks like a bug, but there's not enough info here to
diagnose. Is there actually a pg_toast.pg_toast_2147483647 table
in the 8.4 cluster? (I'm betting not.) Could you try extracting
a test case? I wonder whether "pg_dump -s" from the 8.4 database,
loaded into a fresh 8.4 database, would be enough to reproduce.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Bruce Momjian

bruce@momjian.us

over 15 years ago

In reply to: Brian Hirt (#3)

Re: pg_upgrade

Brian Hirt wrote:

Looks like pg_upgrade is using 32bit oids. 2147483647 is the max signed 32 bit int, but the oids for my tables are clearly larger than that.

== output from pg_upgrade ==
Database: basement84_dev
relname: mit.company: reloid: 2147483647 reltblspace:
relname: mit.company_history: reloid: 2147483647 reltblspace:

== output from catalog query ==
basement84_dev=# select c.oid,c.relname from pg_catalog.pg_namespace n, pg_catalog.pg_class c where n.oid = c.relnamespace and n.nspname = 'mit';
oid | relname
------------+--------------------
3000767630 | company
3000767633 | company_history
(22 rows)

Interesting. Odd it would report the max 32-bit signed int. I wonder
if it somehow is getting set to -1. I looked briefly at the pg_upgrade
code and it appears to put all oids in unsigned ints.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Tom Lane

tgl@sss.pgh.pa.us

over 15 years ago

In reply to: Bruce Momjian (#4)

Re: pg_upgrade

Bruce Momjian <bruce@momjian.us> writes:

Interesting. Odd it would report the max 32-bit signed int. I wonder
if it somehow is getting set to -1. I looked briefly at the pg_upgrade
code and it appears to put all oids in unsigned ints.

On some platforms, that's what you'll get if you feed a value larger
than 2^31 to atoi() and related functions. I will bet lunch that this
behavior reflects an attempt to use signed-integer input functions on
OID values. You need to check the string conversion code itself, not
just the declared type of the result variables.

regards, tom lane

Brian Hirt

bhirt@me.com

over 15 years ago

In reply to: Bruce Momjian (#4)

Re: pg_upgrade

It looks like it's related to atol

$ cat test-atol.c
#include <stdlib.h>
#include <stdio.h>

int
main(int argc, char **argv)
{
unsigned int test1;
long test2;
long long test3;
unsigned int test4;

test1 = (unsigned int)atol("3000767169");
test2 = (long)atol("3000767169");
test3 = atoll("3000767169");
test4 = (unsigned int)atoll("3000767169");

fprintf(stderr,"%u %ld %lld %u\n",test1,test2,test3,test4);
}

$ make test-atol
cc test-atol.c -o test-atol
$ ./test-atol
2147483647 2147483647 3000767169 3000767169

I think C90 and C99 specify different behaviors with atol

Is there some standard way postgresql parses integer strings? Maybe that method should be used instead of duplicating the functionality so at least the two behave consistently.

--brian

On Sep 28, 2010, at 2:00 PM, Bruce Momjian wrote:

Show quoted text

Brian Hirt wrote:

Looks like pg_upgrade is using 32bit oids. 2147483647 is the max signed 32 bit int, but the oids for my tables are clearly larger than that.

== output from pg_upgrade ==
Database: basement84_dev
relname: mit.company: reloid: 2147483647 reltblspace:
relname: mit.company_history: reloid: 2147483647 reltblspace:

== output from catalog query ==
basement84_dev=# select c.oid,c.relname from pg_catalog.pg_namespace n, pg_catalog.pg_class c where n.oid = c.relnamespace and n.nspname = 'mit';
oid | relname
------------+--------------------
3000767630 | company
3000767633 | company_history
(22 rows)

Interesting. Odd it would report the max 32-bit signed int. I wonder
if it somehow is getting set to -1. I looked briefly at the pg_upgrade
code and it appears to put all oids in unsigned ints.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Bruce Momjian

bruce@momjian.us

over 15 years ago

In reply to: Brian Hirt (#6)

Re: pg_upgrade

Brian Hirt wrote:

It looks like it's related to atol

Yep, I found the use of atol in the pg_upgrade code too. Working on a
patch now.

---------------------------------------------------------------------------

$ cat test-atol.c
#include <stdlib.h>
#include <stdio.h>

int
main(int argc, char **argv)
{
unsigned int test1;
long test2;
long long test3;
unsigned int test4;

test1 = (unsigned int)atol("3000767169");
test2 = (long)atol("3000767169");
test3 = atoll("3000767169");
test4 = (unsigned int)atoll("3000767169");

fprintf(stderr,"%u %ld %lld %u\n",test1,test2,test3,test4);
}

$ make test-atol
cc test-atol.c -o test-atol
$ ./test-atol
2147483647 2147483647 3000767169 3000767169

I think C90 and C99 specify different behaviors with atol

Is there some standard way postgresql parses integer strings? Maybe that method should be used instead of duplicating the functionality so at least the two behave consistently.

--brian

On Sep 28, 2010, at 2:00 PM, Bruce Momjian wrote:

Brian Hirt wrote:

Looks like pg_upgrade is using 32bit oids. 2147483647 is the max signed 32 bit int, but the oids for my tables are clearly larger than that.

== output from pg_upgrade ==
Database: basement84_dev
relname: mit.company: reloid: 2147483647 reltblspace:
relname: mit.company_history: reloid: 2147483647 reltblspace:

== output from catalog query ==
basement84_dev=# select c.oid,c.relname from pg_catalog.pg_namespace n, pg_catalog.pg_class c where n.oid = c.relnamespace and n.nspname = 'mit';
oid | relname
------------+--------------------
3000767630 | company
3000767633 | company_history
(22 rows)

Interesting. Odd it would report the max 32-bit signed int. I wonder
if it somehow is getting set to -1. I looked briefly at the pg_upgrade
code and it appears to put all oids in unsigned ints.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Bruce Momjian

bruce@momjian.us

over 15 years ago

In reply to: Bruce Momjian (#7)

Re: pg_upgrade

Bruce Momjian wrote:

Brian Hirt wrote:

It looks like it's related to atol

Yep, I found the use of atol in the pg_upgrade code too. Working on a
patch now.

I have applied the attached patch to HEAD and 9.0.X. Odd I had never
received a bug report about this before. Good thing it didn't silently
fail, but it is designed to be very picky.

This patch will appear in the next 9.0.X release. Thanks for the
report.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Brian Hirt

bhirt@me.com

over 15 years ago

In reply to: Bruce Momjian (#8)

Re: pg_upgrade

Bruce,

The applied patch has the same behavior on i686 Ubuntu 10.04. It looks like atol() is just a macro for strtol() in stdio.h. I think you want strtoul() instead of strtol()

when i change str2uint() to use strtoul() pg_upgrade completes without a problem (I still haven't tested the upgrade database, but I expect that will be just fine).

I think it's pretty uncommon for the OID to be that big which is why nobody stumbled onto this. This particular installation has pretty much been reloading development databases non stop for the last year. Also, people tend to initdb a lot when testing and doing development which will keep resetting the oid low.

Thanks for getting this one fixed

--brian

Show quoted text

On Sep 28, 2010, at 3:49 PM, Bruce Momjian wrote:

I have applied the attached patch to HEAD and 9.0.X. Odd I had never
received a bug report about this before. Good thing it didn't silently
fail, but it is designed to be very picky.

#10

Bruce Momjian

bruce@momjian.us

over 15 years ago

In reply to: Brian Hirt (#9)

Re: pg_upgrade

Brian Hirt wrote:

Bruce,

The applied patch has the same behavior on i686 Ubuntu 10.04. It
looks like atol() is just a macro for strtol() in stdio.h. I think
you want strtoul() instead of strtol()

Yes, thanks. I have now applied that fix in HEAD and 9.0.X.

when i change str2uint() to use strtoul() pg_upgrade completes without
a problem (I still haven't tested the upgrade database, but I expect
that will be just fine).

Yep.

I think it's pretty uncommon for the OID to be that big which is why
nobody stumbled onto this. This particular installation has pretty
much been reloading development databases non stop for the last year.
Also, people tend to initdb a lot when testing and doing development
which will keep resetting the oid low.

Yes, seems > 2^31 oids are rarer than I thought.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

pg_upgrade

Attachments: