pg_migrator issue with contrib

Started by Brad Nicholsonover 16 years ago87 messages
#1Brad Nicholson
bnichols@ca.afilias.info

I've been kicking the tires on this a bit, and I've found an issue when
dealing with contrib/ (specifically dblink, although I haven't looked
around anymore).

dblink_current_query() is not in the 8.4 version - when I run
pg_migrator on an 8.3 cluster that has dblink installed, I get the
following:

Restoring database schema
psql:/home/postgres/pg_migrator_dump_db.sql:271: ERROR: could not find
function "dblink_current_query" in file
"/opt/dbs/pgsql84-beta2/lib/dblink.so"

There were problems executing "/opt/dbs/pgsql84-beta2/bin/psql" --set
ON_ERROR_STOP=on --port 5432 -f "/home/postgres/pg_migrator_dump_db.sql"
--dbname template1 >> "/dev/null"

pg_migrator exits leaving me with a corrupted 8.3 instance.

At the very least, a mention in the documentation of incompatible
contrib module(s) would be nice. Even better would be a sanity check
added to prevent this.
--
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.

#2Bruce Momjian
bruce@momjian.us
In reply to: Brad Nicholson (#1)
Re: pg_migrator issue with contrib

Brad Nicholson wrote:

I've been kicking the tires on this a bit, and I've found an issue when
dealing with contrib/ (specifically dblink, although I haven't looked
around anymore).

dblink_current_query() is not in the 8.4 version - when I run
pg_migrator on an 8.3 cluster that has dblink installed, I get the
following:

Restoring database schema
psql:/home/postgres/pg_migrator_dump_db.sql:271: ERROR: could not find
function "dblink_current_query" in file
"/opt/dbs/pgsql84-beta2/lib/dblink.so"

There were problems executing "/opt/dbs/pgsql84-beta2/bin/psql" --set
ON_ERROR_STOP=on --port 5432 -f "/home/postgres/pg_migrator_dump_db.sql"
--dbname template1 >> "/dev/null"

Yep, pg_migrator will exit on any restore error. This really relates to
the problem of how we handle /contrib schema migration from one
release to the other. Good thing you posted to hackers because that is
really the group that can address this. pg_migrator is really just
restoring the schema that pg_dump is producing.

pg_migrator exits leaving me with a corrupted 8.3 instance.

When you say "corrupted", I assume you mean you have remove the _old
suffixes to restart your 8.3 instance, right? I hope that is the only
corruption issue --- please confirm.

At the very least, a mention in the documentation of incompatible
contrib module(s) would be nice. Even better would be a sanity check
added to prevent this.

OK, I am looking to the hackers group for recommentations on this. I
wonder if I should recommend uninstalling /contrib modules before the
upgrade, but that is not possible for custom data types that have
columns already defined in the old cluster.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#3Andrew Dunstan
andrew@dunslane.net
In reply to: Bruce Momjian (#2)
Re: pg_migrator issue with contrib

Bruce Momjian wrote:

At the very least, a mention in the documentation of incompatible
contrib module(s) would be nice. Even better would be a sanity check
added to prevent this.

OK, I am looking to the hackers group for recommentations on this. I
wonder if I should recommend uninstalling /contrib modules before the
upgrade, but that is not possible for custom data types that have
columns already defined in the old cluster.

There is no nice answer to this. It goes way beyond data types: you
could be using the module stuff in indexes, functions, views etc. You
can't just drop the stuff. The best I have been able to do in similar
cases is to install the updated module in the database before restoring,
and ignore any restoration errors about "foo already exists" or "foo not
found in .so file". Not sure how well that translates to pg_migrator,
though.

cheers

andrew

#4Brad Nicholson
bnichols@ca.afilias.info
In reply to: Bruce Momjian (#2)
Re: pg_migrator issue with contrib

On Fri, 2009-06-05 at 15:50 -0400, Bruce Momjian wrote:

Brad Nicholson wrote:

I've been kicking the tires on this a bit, and I've found an issue when
dealing with contrib/ (specifically dblink, although I haven't looked
around anymore).

dblink_current_query() is not in the 8.4 version - when I run
pg_migrator on an 8.3 cluster that has dblink installed, I get the
following:

Restoring database schema
psql:/home/postgres/pg_migrator_dump_db.sql:271: ERROR: could not find
function "dblink_current_query" in file
"/opt/dbs/pgsql84-beta2/lib/dblink.so"

There were problems executing "/opt/dbs/pgsql84-beta2/bin/psql" --set
ON_ERROR_STOP=on --port 5432 -f "/home/postgres/pg_migrator_dump_db.sql"
--dbname template1 >> "/dev/null"

Yep, pg_migrator will exit on any restore error. This really relates to
the problem of how we handle /contrib schema migration from one
release to the other. Good thing you posted to hackers because that is
really the group that can address this. pg_migrator is really just
restoring the schema that pg_dump is producing.

pg_migrator exits leaving me with a corrupted 8.3 instance.

When you say "corrupted", I assume you mean you have remove the _old
suffixes to restart your 8.3 instance, right? I hope that is the only
corruption issue --- please confirm.

Unfortunately no - when I try and start the old version, I get:

pg_ctl -D pgsql83/ start
postgres: could not find the database system
Expected to find it in the directory "/opt/DATA/pgsql83",
but could not open file "/opt/DATA/pgsql83/global/pg_control": No such
file or directory

I am specifying both new and old paths to pg_migrator, as I don't have
my data directories in standard locations.

Test case is pretty straight forward.

Create an 8.3 instance
Add dblink
follow through the pg_migrator upgrade path
try and start your 8.3 instance after the failure

--
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#2)
Re: pg_migrator issue with contrib

Bruce Momjian <bruce@momjian.us> writes:

Brad Nicholson wrote:

At the very least, a mention in the documentation of incompatible
contrib module(s) would be nice. Even better would be a sanity check
added to prevent this.

OK, I am looking to the hackers group for recommentations on this.

One thing I was going to suggest is that pg_migrator check that all the
.so's in the old installation also exist in the new one. However that
could be overkill since you don't know for sure if the old .so is
referenced anywhere in the database.

As for the specific problem at hand, it might've been a mistake to
replace dblink_current_query() with a SQL function instead of changing
the internal implementation of the C function. We could still fix that.

regards, tom lane

#6Bruce Momjian
bruce@momjian.us
In reply to: Brad Nicholson (#4)
Re: pg_migrator issue with contrib

Brad Nicholson wrote:

When you say "corrupted", I assume you mean you have remove the _old
suffixes to restart your 8.3 instance, right? I hope that is the only
corruption issue --- please confirm.

Unfortunately no - when I try and start the old version, I get:

pg_ctl -D pgsql83/ start
postgres: could not find the database system
Expected to find it in the directory "/opt/DATA/pgsql83",
but could not open file "/opt/DATA/pgsql83/global/pg_control": No such
file or directory

I am specifying both new and old paths to pg_migrator, as I don't have
my data directories in standard locations.

Test case is pretty straight forward.

Create an 8.3 instance
Add dblink
follow through the pg_migrator upgrade path
try and start your 8.3 instance after the failure

Uh, I assume you read all of the INSTALL instructions, including the
last item:

10. Reverting to old cluster

...

--> If you ran pg_migrator _without_ --link or did not start the new server,
the old cluster was not modified except that an ".old" suffix was
appended to $PGDATA/global/pg_control and tablespaces directories. To
reuse the old cluster, remove the tablespace directories created by the
new cluster and remove the ".old" suffix from the old cluster tablespace
directory names and $PGDATA/global/pg_control; then you can restart the
old cluster.

I just modified the arrow text to be clearer. The rename of
pg_controldata is done to prevent accidental starting of the old cluster
in case of migration success.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#7Bruce Momjian
bruce@momjian.us
In reply to: Andrew Dunstan (#3)
Re: pg_migrator issue with contrib

Andrew Dunstan wrote:

Bruce Momjian wrote:

At the very least, a mention in the documentation of incompatible
contrib module(s) would be nice. Even better would be a sanity check
added to prevent this.

OK, I am looking to the hackers group for recommentations on this. I
wonder if I should recommend uninstalling /contrib modules before the
upgrade, but that is not possible for custom data types that have
columns already defined in the old cluster.

There is no nice answer to this. It goes way beyond data types: you
could be using the module stuff in indexes, functions, views etc. You
can't just drop the stuff. The best I have been able to do in similar
cases is to install the updated module in the database before restoring,
and ignore any restoration errors about "foo already exists" or "foo not
found in .so file". Not sure how well that translates to pg_migrator,
though.

I suspected this answer but figured I would get a definative answer
rather than guessing.

Based on the way pg_migrator works I am going to suggest that if
/contrib restore generates an error that they uninstall the /contrib
from the old cluster and retry.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#8Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#5)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Brad Nicholson wrote:

At the very least, a mention in the documentation of incompatible
contrib module(s) would be nice. Even better would be a sanity check
added to prevent this.

OK, I am looking to the hackers group for recommentations on this.

One thing I was going to suggest is that pg_migrator check that all the
.so's in the old installation also exist in the new one. However that
could be overkill since you don't know for sure if the old .so is
referenced anywhere in the database.

Or in this case that the new *.so has the same functions.

As for the specific problem at hand, it might've been a mistake to
replace dblink_current_query() with a SQL function instead of changing
the internal implementation of the C function. We could still fix that.

I am afraid /contrib is going to be a mine field for this type of
problem so I am going to recommend uninstaling the /contrib module if
possible and retry the migration. That should work in this case.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#9Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#7)
Re: pg_migrator issue with contrib

Bruce Momjian wrote:

There is no nice answer to this. It goes way beyond data types: you
could be using the module stuff in indexes, functions, views etc. You
can't just drop the stuff. The best I have been able to do in similar
cases is to install the updated module in the database before restoring,
and ignore any restoration errors about "foo already exists" or "foo not
found in .so file". Not sure how well that translates to pg_migrator,
though.

I suspected this answer but figured I would get a definative answer
rather than guessing.

Based on the way pg_migrator works I am going to suggest that if
/contrib restore generates an error that they uninstall the /contrib
from the old cluster and retry.

I have added the following paragraph to the pg_migrator INSTALL docs:

If an error occurs while restoring the database schema, pg_migrator will
exit and you will have to revert to the old cluster as outlined in step
#10 below. To try pg_migrator again, you will need to modify the old
cluster so the pg_migrator schema restore succeeds. If the problem is a
/contrib module, you might need to uninstall the /contrib module from
the old cluster and install it in the new cluster after migration.

The only other idea I have would be to add a switch that allows
schema restore errors to be ignored, but I would like to see if the
above text is enough first.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#10Dimitri Fontaine
dfontaine@hi-media.com
In reply to: Bruce Momjian (#8)
Re: pg_migrator issue with contrib

Hi,

Ah, we need module/extension/package/plugin so badly...

Le 5 juin 09 à 22:19, Bruce Momjian a écrit :

I am afraid /contrib is going to be a mine field for this type of
problem so I am going to recommend uninstaling the /contrib module if
possible and retry the migration. That should work in this case.

You can't seriously recommend that, I'm afraid.
As Andrew (Dunstan) was saying up-thread, the faulty module (from
contrib or pgfoundry) could hold some indexes (btree, gist, gin) and/
or data types in the cluster relations.

So if you uninstall the module (drop type cascade, drop operator
class, ...) you lose data.

Some example modules that I can think of and are wildspread in the
field, as far as I know, are ip4r (data type and indexes), orafce
(functions, views, tables), and some less spread are prefix (data type
and indexes) or temporal (period data type, indexes).

Please do not recommend people to lose their precious data to be able
to upgrade. You could tell them pg_migrator isn't an option in their
case, though. At least we're left with a faster (multi-threaded)
pg_restore :)

Regards,
--
dim

#11Josh Berkus
josh@agliodbs.com
In reply to: Bruce Momjian (#9)
Re: pg_migrator issue with contrib

Bruce,

Assuming a contrib module *hasn't* changed its API, does pg_migrator
link against the 8.4 version of the module's .so, or the old 8.3 version?

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

#12David Blewett
david@dawninglight.net
In reply to: Dimitri Fontaine (#10)
Re: pg_migrator issue with contrib

On Fri, Jun 5, 2009 at 6:11 PM, Dimitri Fontaine <dfontaine@hi-media.com> wrote:

Some example modules that I can think of and are wildspread in the field, as
far as I know, are ip4r (data type and indexes), orafce (functions, views,
tables), and some less spread are prefix (data type and indexes) or temporal
(period data type, indexes).

And hstore...

David Blewett

#13Bruce Momjian
bruce@momjian.us
In reply to: Josh Berkus (#11)
Re: pg_migrator issue with contrib

Josh Berkus wrote:

Bruce,

Assuming a contrib module *hasn't* changed its API, does pg_migrator
link against the 8.4 version of the module's .so, or the old 8.3 version?

8.4 version, or whatever is in the 8.4 lib, which should be 8.4.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#14Bruce Momjian
bruce@momjian.us
In reply to: Dimitri Fontaine (#10)
Re: pg_migrator issue with contrib

Dimitri Fontaine wrote:

Hi,

Ah, we need module/extension/package/plugin so badly...

Le 5 juin 09 ? 22:19, Bruce Momjian a ?crit :

I am afraid /contrib is going to be a mine field for this type of
problem so I am going to recommend uninstaling the /contrib module if
possible and retry the migration. That should work in this case.

You can't seriously recommend that, I'm afraid.
As Andrew (Dunstan) was saying up-thread, the faulty module (from
contrib or pgfoundry) could hold some indexes (btree, gist, gin) and/
or data types in the cluster relations.

So if you uninstall the module (drop type cascade, drop operator
class, ...) you lose data.

Some example modules that I can think of and are wildspread in the
field, as far as I know, are ip4r (data type and indexes), orafce
(functions, views, tables), and some less spread are prefix (data type
and indexes) or temporal (period data type, indexes).

Please do not recommend people to lose their precious data to be able
to upgrade. You could tell them pg_migrator isn't an option in their
case, though. At least we're left with a faster (multi-threaded)
pg_restore :)

Very good point, and something I had not considered. I tried
uninstalling hstore and it gladly dropped any hstore columns!

I have updated the INSTALL instructions:

If an error occurs while restoring the database schema, pg_migrator will
exit and you will have to revert to the old cluster as outlined in step
#10 below. To try pg_migrator again, you will need to modify the old
cluster so the pg_migrator schema restore succeeds. If the problem is a
/contrib module, you might need to uninstall the /contrib module from
the old cluster and install it in the new cluster after the migration,
assuming the module is not being used to store user data.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#15Josh Berkus
josh@agliodbs.com
In reply to: Bruce Momjian (#13)
Re: pg_migrator issue with contrib

On 6/5/09 6:27 PM, Bruce Momjian wrote:

Josh Berkus wrote:

Bruce,

Assuming a contrib module *hasn't* changed its API, does pg_migrator
link against the 8.4 version of the module's .so, or the old 8.3 version?

8.4 version, or whatever is in the 8.4 lib, which should be 8.4.

So, here's what we need for 8.3 --> 8.4 for contrib modules:

1) make a list of contrib modules which do not convert cleanly (testers?)
2) document these.
3) give pg_migrator some hackish way to install 8.4 contrib modules from
source before copying over the database.
4) set pg_migrator to ignore duplicate object warnings if it does the above.

Note that we expect NOT to have this issue for 8.4-->8.5, since we'll
have a full module infrastructure by then. Really!

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#15)
Re: pg_migrator issue with contrib

Josh Berkus <josh@agliodbs.com> writes:

So, here's what we need for 8.3 --> 8.4 for contrib modules:

1) make a list of contrib modules which do not convert cleanly (testers?)
2) document these.
3) give pg_migrator some hackish way to install 8.4 contrib modules from
source before copying over the database.
4) set pg_migrator to ignore duplicate object warnings if it does the above.

1A) consider whether we can reduce/eliminate the conversion problems.
We have five days.

I don't think we need testing, per se. The first step should be to diff
the 8.3 and 8.4 versions of the various contrib .sql.in files and
determine what changed. Any module whose .sql.in file hasn't changed
is definitely safe.

regards, tom lane

#17Grzegorz Jaskiewicz
gj@pointblue.com.pl
In reply to: Tom Lane (#16)
Re: pg_migrator issue with contrib

On 6 Jun 2009, at 19:50, Tom Lane wrote:

We have five days.

I don't think we need testing, per se. The first step should be to
diff
the 8.3 and 8.4 versions of the various contrib .sql.in files and
determine what changed. Any module whose .sql.in file hasn't changed
is definitely safe.

I can tell you already that dblink has changed, and I had to drop it
before migration, otherwise everything went fine. Migration of 57GB
data in place took about 1 minute.

#18Dimitri Fontaine
dfontaine@hi-media.com
In reply to: Josh Berkus (#15)
Re: pg_migrator issue with contrib

Le 6 juin 09 à 20:45, Josh Berkus a écrit :

So, here's what we need for 8.3 --> 8.4 for contrib modules:

That does nothing for external modules whose code isn't in PostgreSQL
control. I'm thinking of those examples I cited up-thread --- and some
more. (ip4r, temporal, prefix, hstore-new, oracfe, etc).

Could pg_migrator detect usage of "objects" oids (data types in
relation, index opclass, ...) that are unknown to be in the standard -
core + contrib distribution, and quit trying to upgrade the cluster in
this case, telling the user his database is not supported?

Note that we expect NOT to have this issue for 8.4-->8.5, since
we'll have a full module infrastructure by then. Really!

Note: added in-place upgrade to requirements of the first version of
the feature.
--
dim

#19Bruce Momjian
bruce@momjian.us
In reply to: Dimitri Fontaine (#18)
Re: pg_migrator issue with contrib

Dimitri Fontaine wrote:

Le 6 juin 09 ? 20:45, Josh Berkus a ?crit :

So, here's what we need for 8.3 --> 8.4 for contrib modules:

That does nothing for external modules whose code isn't in PostgreSQL
control. I'm thinking of those examples I cited up-thread --- and some
more. (ip4r, temporal, prefix, hstore-new, oracfe, etc).

Agreed, that's why the new INSTALL text addresses this clearly. You
might want to read my blog entry about why the INSTALL file is so
important for pg_migrator:

http://momjian.us/main/blogs/pgblog.html#June_6_2009

Could pg_migrator detect usage of "objects" oids (data types in
relation, index opclass, ...) that are unknown to be in the standard -
core + contrib distribution, and quit trying to upgrade the cluster in
this case, telling the user his database is not supported?

Well, they will get an error and see the INSTALL file --- I don't see
having pg_migrator go around looking for things as a fruitful effort.

Note that we expect NOT to have this issue for 8.4-->8.5, since
we'll have a full module infrastructure by then. Really!

Note: added in-place upgrade to requirements of the first version of
the feature.

Yes, this will certainly spur development of a better /contrib install
system.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#20Bruce Momjian
bruce@momjian.us
In reply to: Grzegorz Jaskiewicz (#17)
Re: pg_migrator issue with contrib

Grzegorz Jaskiewicz wrote:

On 6 Jun 2009, at 19:50, Tom Lane wrote:

We have five days.

I don't think we need testing, per se. The first step should be to
diff
the 8.3 and 8.4 versions of the various contrib .sql.in files and
determine what changed. Any module whose .sql.in file hasn't changed
is definitely safe.

I can tell you already that dblink has changed, and I had to drop it
before migration, otherwise everything went fine. Migration of 57GB
data in place took about 1 minute.

The good news is that the INSTALL instructions where clear enough for
the tester to understand that uninstalling dblink from the old cluster
and reinstalling it in the new cluster would work.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#21Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#16)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

So, here's what we need for 8.3 --> 8.4 for contrib modules:

1) make a list of contrib modules which do not convert cleanly (testers?)
2) document these.
3) give pg_migrator some hackish way to install 8.4 contrib modules from
source before copying over the database.
4) set pg_migrator to ignore duplicate object warnings if it does the above.

1A) consider whether we can reduce/eliminate the conversion problems.
We have five days.

I don't think we need testing, per se. The first step should be to diff
the 8.3 and 8.4 versions of the various contrib .sql.in files and
determine what changed. Any module whose .sql.in file hasn't changed
is definitely safe.

I am a little hesistant to do one-off adjustments for /contrib changes
between versions. The INSTALL instructions are pretty clear of how to
fix problems like dblink.

A more complex case would be if hstore couldn't be migrated, and there
is no way to drop that type and reinstall on the new system without
dropping your data too, but again, I don't see how we have any way to
fix that for 8.4 anyway.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#22Bruce Momjian
bruce@momjian.us
In reply to: Josh Berkus (#15)
Re: pg_migrator issue with contrib

Josh Berkus wrote:

On 6/5/09 6:27 PM, Bruce Momjian wrote:

Josh Berkus wrote:

Bruce,

Assuming a contrib module *hasn't* changed its API, does pg_migrator
link against the 8.4 version of the module's .so, or the old 8.3 version?

8.4 version, or whatever is in the 8.4 lib, which should be 8.4.

So, here's what we need for 8.3 --> 8.4 for contrib modules:

1) make a list of contrib modules which do not convert cleanly (testers?)
2) document these.
3) give pg_migrator some hackish way to install 8.4 contrib modules from
source before copying over the database.
4) set pg_migrator to ignore duplicate object warnings if it does the above.

Note that we expect NOT to have this issue for 8.4-->8.5, since we'll
have a full module infrastructure by then. Really!

I think the cleanest solution is to document that these issues might
happen and suggest solutions. This has already worked for our tester
today so I am hopeful people will just figure out how to fix this.

I have added dblink as a specific migration example to the INSTALL file:

---------------------------------------------------------------------------

If an error occurs while restoring the database schema, pg_migrator will
exit and you will have to revert to the old cluster as outlined in step
#10 below. To try pg_migrator again, you will need to modify the old
cluster so the pg_migrator schema restore succeeds. If the problem is a
/contrib module, you might need to uninstall the /contrib module from
the old cluster and install it in the new cluster after the migration,
assuming the module is not being used to store user data.

For example, /contrib/dblink changed its API from Postgres 8.3 to 8.4 so
sites using it must uninstall dblink from their Postgres 8.3, cluster,
do the migration, then install dblink in Postgres 8.4. More complex
cases might require individual objects to be migrated manually.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#22)
Re: pg_migrator issue with contrib

Bruce Momjian <bruce@momjian.us> writes:

I think the cleanest solution is to document that these issues might
happen and suggest solutions.

No, the cleanest solution is to fix it before people ever see a
problem. This is trivial to do in the case of dblink and I don't
see why you think it would be better to make users work around it.

Also, dblink is one of the easiest cases because (a) it doesn't have
anything but functions, and thus it's possible to drop it from the
old DB without data loss, and (b) the inconsistency that it has is
something that will result in a clean, readily understandable failure
during migration. As opposed to some other cases that will migrate
just fine and then dump core during use.

I've just finished running through a diff of 8.3 vs 8.4 contrib
SQL files. It looks like we have these cases:

* Quite a few modules have differences in the signatures of GIST and
GIN opclass support functions. As was discussed earlier, we don't
really have to fix those, because the SQL-level signatures don't
affect anything at runtime. Most of these modules also have RECHECK
flags that were removed, but we have a solution for that too.

* dblink has an entirely unnecessary change in the definition of
dblink_current_query(), which will cause it to fail to migrate.

* fuzzystrmatch has added a new levenshtein() function. The worst
case if this is migrated without any extra effort is the user
doesn't have access to that function. But it's a functions-only
module, so the user could fix it by just rerunning the SQL file
afterwards.

* intarray has removed its internal @> and <@ operators. As I was
mentioning the other day, it might be the best thing to just revert
that change anyway, until we can get a better handle on the behavior
of GIN for empty arrays.

* isn has got the nastiest problems of the lot: since it piggybacks
on type bigint, a migrated database might work, or might crash
miserably, depending on whether bigint has become pass-by-value
in the new database. I'm not very sure if we can fix this reasonably.

* pageinspect has changed the ABI of get_raw_page() in a way that will
likely make it dump core if the function definition is migrated from
an old DB. This needs to be fixed. It's also added a new function
fsm_page_contents(bytea), which is no big problem, since the SQL
file can be re-executed without harm.

* pg_buffercache has changed the view pg_buffercache, which is
definitely going to be a migration issue. Need to test whether
it represents a crash risk if the old definition is migrated.

* pg_freespacemap has made *major* changes in its ABI. There's
probably no hope of this working either, but we need to be sure
it's not a crash risk.

* pgstattuple has made changes in the output types of its functions.
This is a serious crash risk, and I'm not immediately sure how to
fix it. Note that simply migrating the module will not draw any
errors.

* seg has added a new function seg_center(seg). This can be fixed
by reloading the SQL file, but not by dropping the module beforehand
since that would risk data loss.

* tsearch2 has opclass support function changes, but unlike other
cases of this, it will fail to migrate to 8.4 because the functions
are references to core functions instead of being declared in the
module. Also, "drop it first" isn't a very useful recommendation
since the domains it defines might be used somewhere.

So we've definitely got some work here. If we do nothing, several
of these issues are going to end up with CVE numbers.

regards, tom lane

#24Grzegorz Jaskiewicz
gj@pointblue.com.pl
In reply to: Bruce Momjian (#20)
Re: pg_migrator issue with contrib

On 7 Jun 2009, at 03:27, Bruce Momjian wrote:

Grzegorz Jaskiewicz wrote:

On 6 Jun 2009, at 19:50, Tom Lane wrote:

We have five days.

I don't think we need testing, per se. The first step should be to
diff
the 8.3 and 8.4 versions of the various contrib .sql.in files and
determine what changed. Any module whose .sql.in file hasn't
changed
is definitely safe.

I can tell you already that dblink has changed, and I had to drop it
before migration, otherwise everything went fine. Migration of 57GB
data in place took about 1 minute.

The good news is that the INSTALL instructions where clear enough for
the tester to understand that uninstalling dblink from the old cluster
and reinstalling it in the new cluster would work.

Yes, but I forgot about one database, and had to do it all over again,
luckily I first tested it without --link...

#25Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#23)
Re: pg_migrator issue with contrib

On Sun, Jun 7, 2009 at 12:11 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

* intarray has removed its internal @> and <@ operators.  As I was
mentioning the other day, it might be the best thing to just revert
that change anyway, until we can get a better handle on the behavior
of GIN for empty arrays.

+1.

So we've definitely got some work here.  If we do nothing, several
of these issues are going to end up with CVE numbers.

I think it's becoming increasingly clear that pg_migrator is not for
the faint of heart; in fact, I think we should hedge references to it
with words like "experimental". If we set an expectation that this
tool has the same level of maturity and reliability that people
associate with PostgreSQL in general, we are going to have a lot of
disappointed users. Just to recall the history, the first pgfoundry
commit messages for this tool were on February 9th, three months after
the start of the final CommitFest and feature freeze for 8.4. Since
then, development has proceeded at an amazingly rapid pace, but
there's only so much you can do in four months, especially when it's
too late to rearchitect previously-committed 8.4 features to play more
nicely with the new feature being added.

It seems to me that if we keep plugging at this problem, 8.4->8.5
migration has the potential to be considerably smoother than 8.3 to
8.4 migration (and, yes, a good module facility will help), but it
wouldn't be surprising to me if we're well into 9.x territory before
we really get all of the issues hammered out. I don't think even a
major feature like Hot Standby has the far-reaching implications on
how the system needs to be designed that upgrade-in-place does.

So while I agree with Tom that we should fix as many of these issues
as we reasonably well can for 8.4, I also agree with Bruce that a lot
of this comes down to setting appropriate expectations for our users:
this is a potentially useful tool, especially for users with huge
databases, but it's new, and it's not perfect, and use with caution.

...Robert

#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#25)
Re: pg_migrator issue with contrib

Robert Haas <robertmhaas@gmail.com> writes:

I think it's becoming increasingly clear that pg_migrator is not for
the faint of heart; in fact, I think we should hedge references to it
with words like "experimental".

Probably ...

Just to recall the history, the first pgfoundry
commit messages for this tool were on February 9th, three months after
the start of the final CommitFest and feature freeze for 8.4. Since
then, development has proceeded at an amazingly rapid pace, but
there's only so much you can do in four months,

... but the above is a *complete* misrepresentation of the thing's
history (apparently you failed to look in the Attic?). EDB have been
using previous versions of this tool for some years, and the basic
premise is the same as contrib/pg_upgrade that existed as far back as
7.1/7.2 timeframe.

regards, tom lane

#27Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#26)
Re: pg_migrator issue with contrib

On Jun 7, 2009, at 10:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I think it's becoming increasingly clear that pg_migrator is not for
the faint of heart; in fact, I think we should hedge references to it
with words like "experimental".

Probably ...

Just to recall the history, the first pgfoundry
commit messages for this tool were on February 9th, three months
after
the start of the final CommitFest and feature freeze for 8.4. Since
then, development has proceeded at an amazingly rapid pace, but
there's only so much you can do in four months,

... but the above is a *complete* misrepresentation of the thing's
history (apparently you failed to look in the Attic?). EDB have been
using previous versions of this tool for some years, and the basic
premise is the same as contrib/pg_upgrade that existed as far back as
7.1/7.2 timeframe.

I did know that EDB had been using the tool for a while, but I admit
I'm not familiar with the whole history. I had the impression that
we'd gotten a lot more serious about really making this rock solid
since Bruce took it in February. But maybe that's not the case?

...Robert

#28Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Tom Lane (#26)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I think it's becoming increasingly clear that pg_migrator is not for
the faint of heart; in fact, I think we should hedge references to it
with words like "experimental".

Probably ...

I'm with Robert on that one - while pg_migrator is extremely inmportant
for us to go forward. I really think we need to tag it "experimental"
for this release at least. pg_migrator is complex and we are still
discovering new issues every day I don't think rushing it as _THE_
solution will do any good for our reputation. A lot of our code(as
software in general) took years to mature and pg_migrator is likely not
an exception.

Just to recall the history, the first pgfoundry
commit messages for this tool were on February 9th, three months after
the start of the final CommitFest and feature freeze for 8.4. Since
then, development has proceeded at an amazingly rapid pace, but
there's only so much you can do in four months,

... but the above is a *complete* misrepresentation of the thing's
history (apparently you failed to look in the Attic?). EDB have been
using previous versions of this tool for some years, and the basic
premise is the same as contrib/pg_upgrade that existed as far back as
7.1/7.2 timeframe.

well - how much field exposure has pg_migrator/edb_migrator seen
actually? given the large number of "breaks your database" bugs and
issues that got found during the last few weeks I have a hard time
imaging that edb really used the current code for their customers.

Stefan

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#27)
Re: pg_migrator issue with contrib

Robert Haas <robertmhaas@gmail.com> writes:

I did know that EDB had been using the tool for a while, but I admit
I'm not familiar with the whole history. I had the impression that
we'd gotten a lot more serious about really making this rock solid
since Bruce took it in February. But maybe that's not the case?

I don't actually know the EDB end of the history either; maybe someone
can educate us about that. But it's true that the core developers,
at least, weren't taking it seriously until this year. That's because
it really can only handle catalog changes, not changes to the contents
of user tables; and it's been quite a long time since we've had a
release where we didn't change tuple header layout or packing rules or
something that made it a nonstarter. It wasn't clear till early this
year that 8.3->8.4 would be a cycle where pg_migrator had a chance of
being useful in production ... so we got serious about it.

(I do not know whether EDB ever really used it in production. If they
did, it must have been for private updates that changed catalogs and
not user data.)

regards, tom lane

#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#23)
Re: pg_migrator issue with contrib

I wrote:

* pageinspect has changed the ABI of get_raw_page() in a way that will
likely make it dump core if the function definition is migrated from
an old DB. This needs to be fixed.
[ and similarly for some other contrib modules ]

After thinking about this some more, I think that there is a fairly
simple coding rule we can adopt to prevent post-migration crashes
of the sort I'm worrying about above. That is:

* If you change the ABI of a C-language function, change its C name.

This will ensure that if someone tries to migrate the old function
definition from an old database, they will get a pg_migrator failure,
or at worst a clean runtime failure when they attempt to use the old
definition. They won't get a core dump or some worse form of security
problem.

As an example, the problem in pageinspect is this diff:

***************
*** 6,16 ****
--
-- get_raw_page()
--
! CREATE OR REPLACE FUNCTION get_raw_page(text, int4)
RETURNS bytea
AS 'MODULE_PATHNAME', 'get_raw_page'
LANGUAGE C STRICT;

  --
  -- page_header()
  --
--- 6,21 ----
  --
  -- get_raw_page()
  --
! CREATE OR REPLACE FUNCTION get_raw_page(text, text, int4)
  RETURNS bytea
  AS 'MODULE_PATHNAME', 'get_raw_page'
  LANGUAGE C STRICT;
+ CREATE OR REPLACE FUNCTION get_raw_page(text, int4) 
+ RETURNS bytea
+ AS $$ SELECT get_raw_page($1, 'main', $2); $$
+ LANGUAGE SQL STRICT;
+ 
  --
  -- page_header()
  --
***************

The underlying C-level get_raw_page function is still there, but
it now expects three arguments not two, and will crash if it's
passed an int4 where it's expecting a text argument. But the old
function definition will migrate without error --- there's no way
for pg_migrator to realize it's installing a security hazard.

The way we should have done this, which I intend to go change it to,
is something like

CREATE OR REPLACE FUNCTION get_raw_page(text, int4)
RETURNS bytea
AS 'MODULE_PATHNAME', 'get_raw_page'
LANGUAGE C STRICT;

CREATE OR REPLACE FUNCTION get_raw_page(text, text, int4)
RETURNS bytea
AS 'MODULE_PATHNAME', 'get_raw_page_3'
LANGUAGE C STRICT;

so that the old function's ABI is preserved. Migration of the old
contrib module will then lead to the 3-argument function not being
immediately available, but the 2-argument one still works. Had we not
wanted to keep the 2-argument form for some reason, we would have
provided only get_raw_page_3 in the .so file, and attempts to use the
old function definition would fail safely.

(We have actually seen similar problems before with people trying
to dump and reload database containing contrib modules. pg_migrator
is not creating a problem that wasn't there before, it's just making
it worse.)

Comments, better ideas?

regards, tom lane

#31Bruce Momjian
bruce@momjian.us
In reply to: Grzegorz Jaskiewicz (#24)
Re: pg_migrator issue with contrib

Grzegorz Jaskiewicz wrote:

On 7 Jun 2009, at 03:27, Bruce Momjian wrote:

Grzegorz Jaskiewicz wrote:

On 6 Jun 2009, at 19:50, Tom Lane wrote:

We have five days.

I don't think we need testing, per se. The first step should be to
diff
the 8.3 and 8.4 versions of the various contrib .sql.in files and
determine what changed. Any module whose .sql.in file hasn't
changed
is definitely safe.

I can tell you already that dblink has changed, and I had to drop it
before migration, otherwise everything went fine. Migration of 57GB
data in place took about 1 minute.

The good news is that the INSTALL instructions where clear enough for
the tester to understand that uninstalling dblink from the old cluster
and reinstalling it in the new cluster would work.

Yes, but I forgot about one database, and had to do it all over again,
luckily I first tested it without --link...

Any failure would have allowed you to revert to the old server. It is
only when you _start_ the new server that the old server cannot be used;
I have clarified the INSTALL file on that point.

If you use linking, the migration will be much faster (no data copying),
but you will no longer be able to access your old cluster once you start
the new cluster after the upgrade.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#32Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#29)
Re: pg_migrator issue with contrib

On Jun 7, 2009, at 12:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I did know that EDB had been using the tool for a while, but I admit
I'm not familiar with the whole history. I had the impression that
we'd gotten a lot more serious about really making this rock solid
since Bruce took it in February. But maybe that's not the case?

I don't actually know the EDB end of the history either; maybe someone
can educate us about that. But it's true that the core developers,
at least, weren't taking it seriously until this year. That's because
it really can only handle catalog changes, not changes to the contents
of user tables; and it's been quite a long time since we've had a
release where we didn't change tuple header layout or packing rules or
something that made it a nonstarter. It wasn't clear till early this
year that 8.3->8.4 would be a cycle where pg_migrator had a chance of
being useful in production ... so we got serious about it.

OK, that's more or less what I thought, and what I intended to convey
upthread. As far as core Postgres is concerned this is a new feature,
and we haven't worked out all the kinks yet. As long as we set
expectations accordingly, I think that's OK. You mention CVEs for
these contrib issues, but will CVEs still be issued if we make clear
that this is experimental only? I would hope not, since that would
amount to a policy that any half-baked code anywhere on pgfoundry is
just as much our responsibility as core Postgres. Surely we're
allowed to say "good progress has been made here, but we harbor no
illusions that it's bullet-proof."

...Robert

#33Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#32)
Re: pg_migrator issue with contrib

On 6/7/09 10:56 AM, Robert Haas wrote:

OK, that's more or less what I thought, and what I intended to convey
upthread. As far as core Postgres is concerned this is a new feature,
and we haven't worked out all the kinks yet.

Yes, I'm calling it "pg_migrator beta" in any advocacy/PR about it.
AFAIC, until we have these sorts of issues worked out, it's still a beta.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

#34Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Josh Berkus (#33)
Re: pg_migrator issue with contrib

Josh Berkus wrote:

On 6/7/09 10:56 AM, Robert Haas wrote:

OK, that's more or less what I thought, and what I intended to convey
upthread. As far as core Postgres is concerned this is a new feature,
and we haven't worked out all the kinks yet.

Yes, I'm calling it "pg_migrator beta" in any advocacy/PR about it.
AFAIC, until we have these sorts of issues worked out, it's still a beta.

afaiks bruce stated he is going to remove the BETA tag from pg_migrator
soon so I guess calling it beta in the main project docs will confuse
the hell out of people(or causing them to think that it is not beta any
more).
So maybe calling it experimental(from the POV of the main project) or
something similiar might still be the better solution.

Stefan

#35Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#23)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

I think the cleanest solution is to document that these issues might
happen and suggest solutions.

No, the cleanest solution is to fix it before people ever see a
problem. This is trivial to do in the case of dblink and I don't
see why you think it would be better to make users work around it.

Also, dblink is one of the easiest cases because (a) it doesn't have
anything but functions, and thus it's possible to drop it from the
old DB without data loss, and (b) the inconsistency that it has is
something that will result in a clean, readily understandable failure
during migration. As opposed to some other cases that will migrate
just fine and then dump core during use.

I've just finished running through a diff of 8.3 vs 8.4 contrib
SQL files. It looks like we have these cases:

[ list removed]

Certainly if you can fix /contrib problems at the source, please do so.
I was commenting on the idea of having pg_migrator somehow skip specific
items to try to make it more failure-proof. While I can do that for a
few cases, such as suppress the creation of specific functions by
filtering the schema dump file, I will never be able to get them all,
and doing it extensively could destabilize pg_migrator.

You are suggesting improving /contrib itself, which is better.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#36Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#30)
Re: pg_migrator issue with contrib

Tom Lane wrote:

The underlying C-level get_raw_page function is still there, but
it now expects three arguments not two, and will crash if it's
passed an int4 where it's expecting a text argument. But the old
function definition will migrate without error --- there's no way
for pg_migrator to realize it's installing a security hazard.

FYI, there is nothing pg_migrator specific here. Someone doing a
dump/reload from 8.3 to 8.4 would have the same security issue.
pg_migrator is using the same pg_dump output as a dump restore, except
it uses --schema. pg_migrator would actually be more secure because it
will exit on the restore error rather than having the error possibly
ignored by the user.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#37Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#32)
Re: pg_migrator issue with contrib

Robert Haas wrote:

OK, that's more or less what I thought, and what I intended to convey
upthread. As far as core Postgres is concerned this is a new feature,
and we haven't worked out all the kinks yet. As long as we set
expectations accordingly, I think that's OK. You mention CVEs for
these contrib issues, but will CVEs still be issued if we make clear
that this is experimental only? I would hope not, since that would
amount to a policy that any half-baked code anywhere on pgfoundry is
just as much our responsibility as core Postgres. Surely we're
allowed to say "good progress has been made here, but we harbor no
illusions that it's bullet-proof."

Again, there is nothing pg_migrator-specific about these security
issues.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#38Bruce Momjian
bruce@momjian.us
In reply to: Stefan Kaltenbrunner (#34)
Re: pg_migrator issue with contrib

Stefan Kaltenbrunner wrote:

Josh Berkus wrote:

On 6/7/09 10:56 AM, Robert Haas wrote:

OK, that's more or less what I thought, and what I intended to convey
upthread. As far as core Postgres is concerned this is a new feature,
and we haven't worked out all the kinks yet.

Yes, I'm calling it "pg_migrator beta" in any advocacy/PR about it.
AFAIC, until we have these sorts of issues worked out, it's still a beta.

afaiks bruce stated he is going to remove the BETA tag from pg_migrator
soon so I guess calling it beta in the main project docs will confuse
the hell out of people(or causing them to think that it is not beta any
more).
So maybe calling it experimental(from the POV of the main project) or
something similar might still be the better solution.

This all sounds very discouraging. It is like, "Oh, my, there is a
migration tool and it might have bugs. How do we prevent people from
using it?"

Right now nothing in the project is referring to pg_migrator except in
the press release, and it is marked as beta there. How do you want to
deemphasize it more than that? Why did I bother working on this if the
community reaction is to try to figure out how to make people avoid
using it?

I am now thinking I need to my own PR for pg_migrator because obviously
the community is only worried it might have a bug. Instead of testing
it, looking at the code, submitting bug reports, or anything
constructive, you sit around figuring out how to put a disparaging label
on it!

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#39Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#29)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I did know that EDB had been using the tool for a while, but I admit
I'm not familiar with the whole history. I had the impression that
we'd gotten a lot more serious about really making this rock solid
since Bruce took it in February. But maybe that's not the case?

I don't actually know the EDB end of the history either; maybe someone
can educate us about that. But it's true that the core developers,
at least, weren't taking it seriously until this year. That's because
it really can only handle catalog changes, not changes to the contents
of user tables; and it's been quite a long time since we've had a
release where we didn't change tuple header layout or packing rules or
something that made it a nonstarter. It wasn't clear till early this
year that 8.3->8.4 would be a cycle where pg_migrator had a chance of
being useful in production ... so we got serious about it.

(I do not know whether EDB ever really used it in production. If they
did, it must have been for private updates that changed catalogs and
not user data.)

pg_migrator verion 0.5 is still on the pg_migrator web site, and that is
the version I started from. It had this line in the intro:

PG_migrator is a tool (not a complete solution) that performs an
in-place upgrade of existing data.

Of course no one wants a toolkit, they want a full solution, so I
modified the code to be easier to use and more robust. I am not sure
how much EDB used it.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#40Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#38)
Re: pg_migrator issue with contrib

On Sun, Jun 7, 2009 at 11:50 PM, Bruce Momjian<bruce@momjian.us> wrote:

Stefan Kaltenbrunner wrote:

Josh Berkus wrote:

On 6/7/09 10:56 AM, Robert Haas wrote:

OK, that's more or less what I thought, and what I intended to convey
upthread.  As far as core Postgres is concerned this is a new feature,
and we haven't worked out all the kinks yet.

Yes, I'm calling it "pg_migrator beta" in any advocacy/PR about it.
AFAIC, until we have these sorts of issues worked out, it's still a beta.

afaiks bruce stated he is going to remove the BETA tag from pg_migrator
soon so I guess calling it beta in the main project docs will confuse
the hell out of people(or causing them to think that it is not beta any
more).
So maybe calling it experimental(from the POV of the main project) or
something similar might still be the better solution.

This all sounds very discouraging.  It is like, "Oh, my, there is a
migration tool and it might have bugs.  How do we prevent people from
using it?"

Right now nothing in the project is referring to pg_migrator except in
the press release, and it is marked as beta there.  How do you want to
deemphasize it more than that?  Why did I bother working on this if the
community reaction is to try to figure out how to make people avoid
using it?

Because Rome wasn't built in a day.

It seems to me that you yourself placed a far more disparaging label
on it than anything that anyone has proposed today; this was a week
ago:

http://archives.postgresql.org/pgsql-hackers/2009-05/msg01470.php

I don't think it's anyone's intention to disparage your work on this
tool. It certainly isn't mine. But it seems obvious to me that it
has some pretty severe limitations and warts. The fact that those
limitations and warts are well-documented doesn't negate their
existence. I also don't think calling the software "beta" or
"experimental" is a way of deemphasizing it. I think it's a way of
being clear that this software is not the bullet-proof, rock-solid,
handles-all-cases-and-keeps-on-trucking level of robustness that
people have come to expect from PostgreSQL.

FWIW, I have no problem at all with mentioning pg_migrator in the
release notes or the documentation; my failure to respond to your last
emails on this topic was due to being busy and having already spent
too much time responding to other emails, not due to thinking it was a
bad idea. I actually think it's a good idea. But I also think those
references should describe it as experimental, because I think it is.
I really hope it won't remain experimental forever, but I think that's
an accurate characterization of where it is now.

You, or others, may disagree, of course.

...Robert

#41Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Robert Haas (#40)
Re: pg_migrator issue with contrib

Robert Haas wrote:

On Sun, Jun 7, 2009 at 11:50 PM, Bruce Momjian<bruce@momjian.us> wrote:

Stefan Kaltenbrunner wrote:

Josh Berkus wrote:

On 6/7/09 10:56 AM, Robert Haas wrote:

OK, that's more or less what I thought, and what I intended to convey
upthread. As far as core Postgres is concerned this is a new feature,
and we haven't worked out all the kinks yet.

Yes, I'm calling it "pg_migrator beta" in any advocacy/PR about it.
AFAIC, until we have these sorts of issues worked out, it's still a beta.

afaiks bruce stated he is going to remove the BETA tag from pg_migrator
soon so I guess calling it beta in the main project docs will confuse
the hell out of people(or causing them to think that it is not beta any
more).
So maybe calling it experimental(from the POV of the main project) or
something similar might still be the better solution.

This all sounds very discouraging. It is like, "Oh, my, there is a
migration tool and it might have bugs. How do we prevent people from
using it?"

Right now nothing in the project is referring to pg_migrator except in
the press release, and it is marked as beta there. How do you want to
deemphasize it more than that? Why did I bother working on this if the
community reaction is to try to figure out how to make people avoid
using it?

Because Rome wasn't built in a day.

indeed

It seems to me that you yourself placed a far more disparaging label
on it than anything that anyone has proposed today; this was a week
ago:

http://archives.postgresql.org/pgsql-hackers/2009-05/msg01470.php

well that is way more discouraging than what I wanted to say :)

I don't think it's anyone's intention to disparage your work on this
tool. It certainly isn't mine. But it seems obvious to me that it
has some pretty severe limitations and warts. The fact that those
limitations and warts are well-documented doesn't negate their
existence. I also don't think calling the software "beta" or
"experimental" is a way of deemphasizing it. I think it's a way of
being clear that this software is not the bullet-proof, rock-solid,
handles-all-cases-and-keeps-on-trucking level of robustness that
people have come to expect from PostgreSQL.

Exactly my point. pg_migrator gained a lot of momentum in the last weeks
an months, but imho it has still way to go. I do think that binary
upgrades are extremely important for us(that's why I did a fair amount
of testing on it) but I don't think that we should go too far for this
release.
A lot of the code that makes postgresql what it is now took years to
mature on pgfoundry or in contrib. So some of the questions to ask would be:
* is pg_migrator ready for contrib/? Probably not - it is still a way
too moving target so pgfoundry is good
* is pg_migrator ready for /src/bin?

Realistically I think we need to get at least one full cycle to see what
happens in the field with something as complex as pg_migrator to
really get a grasp on what else comes up.

FWIW, I have no problem at all with mentioning pg_migrator in the
release notes or the documentation; my failure to respond to your last
emails on this topic was due to being busy and having already spent
too much time responding to other emails, not due to thinking it was a
bad idea. I actually think it's a good idea. But I also think those
references should describe it as experimental, because I think it is.
I really hope it won't remain experimental forever, but I think that's
an accurate characterization of where it is now.

yep - I was not against mentioning it either. We just should do it in a
sane way(ie it is not part of the core project yet but endorsed and
might get added in the future or such) so we don't confuse people(like
we call it beta, the homepage does not) and yet get valuable feedback
which we certainly need to go forward.

Stefan

#42Magnus Hagander
magnus@hagander.net
In reply to: Dimitri Fontaine (#18)
Re: pg_migrator issue with contrib

Dimitri Fontaine wrote:

Le 6 juin 09 � 20:45, Josh Berkus a �crit :

So, here's what we need for 8.3 --> 8.4 for contrib modules:

That does nothing for external modules whose code isn't in PostgreSQL
control. I'm thinking of those examples I cited up-thread --- and some
more. (ip4r, temporal, prefix, hstore-new, oracfe, etc).

For me and I know several others, the big question would be PostGIS.
Unfortunately I haven't had the time to run any tests myself, so I'll
join the line of people being worried, but I have a number of customers
with somewhere between pretty and very large PostGIS databases that
could really benefit from pg_migrator.

As long as PostGIS is the same version in both of them, is pg_migrator
is likely to work? (one can always run the PostGIS upgrade as a separate
step)

Could pg_migrator detect usage of "objects" oids (data types in
relation, index opclass, ...) that are unknown to be in the standard
-core + contrib distribution, and quit trying to upgrade the cluster in
this case, telling the user his database is not supported?

+1 on this.

Or at least, have it exit and say "if you know that these things are
reasonably safe, run pg_migrator again with --force" or something like that.

--
Magnus Hagander
Self: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#43Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#30)
Re: pg_migrator issue with contrib

On Sun, Jun 7, 2009 at 12:36 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

I wrote:

* pageinspect has changed the ABI of get_raw_page() in a way that will
likely make it dump core if the function definition is migrated from
an old DB.  This needs to be fixed.
[ and similarly for some other contrib modules ]

After thinking about this some more, I think that there is a fairly
simple coding rule we can adopt to prevent post-migration crashes
of the sort I'm worrying about above.  That is:

* If you change the ABI of a C-language function, change its C name.

This will ensure that if someone tries to migrate the old function
definition from an old database, they will get a pg_migrator failure,
or at worst a clean runtime failure when they attempt to use the old
definition.  They won't get a core dump or some worse form of security
problem.

As an example, the problem in pageinspect is this diff:

***************
*** 6,16 ****
 --
 -- get_raw_page()
 --
! CREATE OR REPLACE FUNCTION get_raw_page(text, int4)
 RETURNS bytea
 AS 'MODULE_PATHNAME', 'get_raw_page'
 LANGUAGE C STRICT;

 --
 -- page_header()
 --
--- 6,21 ----
 --
 -- get_raw_page()
 --
! CREATE OR REPLACE FUNCTION get_raw_page(text, text, int4)
 RETURNS bytea
 AS 'MODULE_PATHNAME', 'get_raw_page'
 LANGUAGE C STRICT;
+ CREATE OR REPLACE FUNCTION get_raw_page(text, int4)
+ RETURNS bytea
+ AS $$ SELECT get_raw_page($1, 'main', $2); $$
+ LANGUAGE SQL STRICT;
+
 --
 -- page_header()
 --
***************

The underlying C-level get_raw_page function is still there, but
it now expects three arguments not two, and will crash if it's
passed an int4 where it's expecting a text argument.  But the old
function definition will migrate without error --- there's no way
for pg_migrator to realize it's installing a security hazard.

The way we should have done this, which I intend to go change it to,
is something like

CREATE OR REPLACE FUNCTION get_raw_page(text, int4)
RETURNS bytea
AS 'MODULE_PATHNAME', 'get_raw_page'
LANGUAGE C STRICT;

CREATE OR REPLACE FUNCTION get_raw_page(text, text, int4)
RETURNS bytea
AS 'MODULE_PATHNAME', 'get_raw_page_3'
LANGUAGE C STRICT;

so that the old function's ABI is preserved.  Migration of the old
contrib module will then lead to the 3-argument function not being
immediately available, but the 2-argument one still works.  Had we not
wanted to keep the 2-argument form for some reason, we would have
provided only get_raw_page_3 in the .so file, and attempts to use the
old function definition would fail safely.

(We have actually seen similar problems before with people trying
to dump and reload database containing contrib modules.  pg_migrator
is not creating a problem that wasn't there before, it's just making
it worse.)

Comments, better ideas?

maybe, get_raw_page_v2, etc? I suppose you could run into situation
with multiple versions of the function w/same # of arguments?

merlin

#44Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#42)
Re: pg_migrator issue with contrib

Magnus Hagander <magnus@hagander.net> writes:

As long as PostGIS is the same version in both of them, is pg_migrator
is likely to work? (one can always run the PostGIS upgrade as a separate
step)

There was just some discussion about that on postgis-devel. I think the
conclusion was that you would have to do the PostGIS update as a
separate step. They intend to support both 1.3.x and 1.4.x on current
versions of Postgres for some time, so in principle you could do it in
either order.

Could pg_migrator detect usage of "objects" oids (data types in
relation, index opclass, ...) that are unknown to be in the standard
-core + contrib distribution, and quit trying to upgrade the cluster in
this case, telling the user his database is not supported?

+1 on this.

Or at least, have it exit and say "if you know that these things are
reasonably safe, run pg_migrator again with --force" or something like that.

I don't think that anything in that line is going to be helpful.
What it will lead to is people mindlessly using --force (cf our
bad experiences with -i for pg_dump). If you can't give a *useful*
ie trustworthy warning/error, issuing a useless one is not a good
substitute.

regards, tom lane

#45Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#44)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Magnus Hagander <magnus@hagander.net> writes:

As long as PostGIS is the same version in both of them, is pg_migrator
is likely to work? (one can always run the PostGIS upgrade as a separate
step)

There was just some discussion about that on postgis-devel. I think the
conclusion was that you would have to do the PostGIS update as a
separate step. They intend to support both 1.3.x and 1.4.x on current
versions of Postgres for some time, so in principle you could do it in
either order.

Doing them as two steps is totally fine with me, because IIRC the
PostGIS upgrades generally don't require hours and hours of downtime.

Could pg_migrator detect usage of "objects" oids (data types in
relation, index opclass, ...) that are unknown to be in the standard
-core + contrib distribution, and quit trying to upgrade the cluster in
this case, telling the user his database is not supported?

+1 on this.

Or at least, have it exit and say "if you know that these things are
reasonably safe, run pg_migrator again with --force" or something like that.

I don't think that anything in that line is going to be helpful.
What it will lead to is people mindlessly using --force (cf our
bad experiences with -i for pg_dump). If you can't give a *useful*
ie trustworthy warning/error, issuing a useless one is not a good
substitute.

Well, in that case, error out would be a better option than doing it and
probably fail later. And have a --force option available, but don't
suggest it.

//Magnus

#46Dimitri Fontaine
dfontaine@hi-media.com
In reply to: Magnus Hagander (#45)
Re: pg_migrator issue with contrib

Magnus Hagander <magnus@hagander.net> writes:

Tom Lane wrote:

Magnus Hagander <magnus@hagander.net> writes:

Could pg_migrator detect usage of "objects" oids (data types in
relation, index opclass, ...) that are unknown to be in the standard
-core + contrib distribution, and quit trying to upgrade the cluster in
this case, telling the user his database is not supported?

+1 on this.

Or at least, have it exit and say "if you know that these things are
reasonably safe, run pg_migrator again with --force" or something like that.

I don't think that anything in that line is going to be helpful.
What it will lead to is people mindlessly using --force (cf our
bad experiences with -i for pg_dump). If you can't give a *useful*
ie trustworthy warning/error, issuing a useless one is not a good
substitute.

Well, in that case, error out would be a better option than doing it and
probably fail later. And have a --force option available, but don't
suggest it.

My vote would go to detect and error out without recovering option. If
the tool is not able to handle a situation and knows it, I don't see
what would be good about it leting the user lose data on purpose.

The --force option should be for the user to manually drop his columns
and indexes (etc) and try pg_migrator again, which will stop listing
faulty objects but care about the now compatible cluster.

Restoring the lost data is not the job of pg_migrator, of course.

Regards,
--
dim

#47Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dimitri Fontaine (#46)
Re: pg_migrator issue with contrib

Dimitri Fontaine <dfontaine@hi-media.com> writes:

My vote would go to detect and error out without recovering option. If
the tool is not able to handle a situation and knows it, I don't see
what would be good about it leting the user lose data on purpose.

No, that's not what's being discussed. The proposal is to have it error
out when it *does not* know whether there is a real problem; and, in
fact, when there's only a rather low probability of there being a real
problem. My view is that that's basically counterproductive. It leads
directly to having to have a --force switch and then to people using
that switch carelessly.

regards, tom lane

#48Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#40)
Re: pg_migrator issue with contrib

Robert Haas wrote:

Right now nothing in the project is referring to pg_migrator except in
the press release, and it is marked as beta there. ?How do you want to
deemphasize it more than that? ?Why did I bother working on this if the
community reaction is to try to figure out how to make people avoid
using it?

Because Rome wasn't built in a day.

It seems to me that you yourself placed a far more disparaging label
on it than anything that anyone has proposed today; this was a week
ago:

http://archives.postgresql.org/pgsql-hackers/2009-05/msg01470.php

I don't think it's anyone's intention to disparage your work on this
tool. It certainly isn't mine. But it seems obvious to me that it
has some pretty severe limitations and warts. The fact that those
limitations and warts are well-documented doesn't negate their
existence. I also don't think calling the software "beta" or
"experimental" is a way of deemphasizing it. I think it's a way of
being clear that this software is not the bullet-proof, rock-solid,
handles-all-cases-and-keeps-on-trucking level of robustness that
people have come to expect from PostgreSQL.

FWIW, I have no problem at all with mentioning pg_migrator in the
release notes or the documentation; my failure to respond to your last
emails on this topic was due to being busy and having already spent
too much time responding to other emails, not due to thinking it was a
bad idea. I actually think it's a good idea. But I also think those
references should describe it as experimental, because I think it is.
I really hope it won't remain experimental forever, but I think that's
an accurate characterization of where it is now.

pg_migrator should be looked at critically here, and I agree we should
avoid letting pg_migrator failures reflect badly on Postgres.

Let me list the problems with pg_migrator:

o /contrib and plugin migration (not unique to pg_migrator)
o you must read/follow the install instructions
o might require post-migration table/index rebuilds
o new so serious bugs might exist

and let me list its benefits:

o first in-place upgrade capability in years
o tested by some users, all successful (since late alpha)
o removes major impediment to adoption
o includes extensive error checking and reporting
o contains detailed installation/usage instructions

So let's look at pg_migrator as an opportunity and a risk. As far as I
know, only Hiroshi Saito and I have have looked at the code. Why don't
others read the pg_migrator source code looking for bugs? Why have more
people not test it?

I think "experimental" is the wrong label. Experimental assumes its
usefulness is uncertain and that it is still being researched ---
neither is true. Once I release pg_migrator 8.4 final at the end of
this week (assuming no bugs are reported), I consider it done, or at
least advanced as far as I can go until I get more feedback from users.

I think we can say: "pg_migrator is designed for experienced users with
large databases, for whom the typical dump/restore required for major
version upgrades is a hardship".

I assume this will be the same adoption pattern we had with the Win32
port, where it was a new platform in 8.0 and we dealt with some issues
as it was deployed, and that people who want it will find it and
hopefully it will be useful for them.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#49Bruce Momjian
bruce@momjian.us
In reply to: Magnus Hagander (#42)
Re: pg_migrator issue with contrib

Magnus Hagander wrote:

Dimitri Fontaine wrote:

Le 6 juin 09 ? 20:45, Josh Berkus a ?crit :

So, here's what we need for 8.3 --> 8.4 for contrib modules:

That does nothing for external modules whose code isn't in PostgreSQL
control. I'm thinking of those examples I cited up-thread --- and some
more. (ip4r, temporal, prefix, hstore-new, oracfe, etc).

For me and I know several others, the big question would be PostGIS.
Unfortunately I haven't had the time to run any tests myself, so I'll
join the line of people being worried, but I have a number of customers
with somewhere between pretty and very large PostGIS databases that
could really benefit from pg_migrator.

As long as PostGIS is the same version in both of them, is pg_migrator
is likely to work? (one can always run the PostGIS upgrade as a separate
step)

Yes, it should work with the same version of PostGIS, but I have not
tested it. There is nothing special about PostGIS that would cause it
not work work --- we use the same pg_dump as you would for a major
upgrade --- we just move the files around instead of dumping the data.

Could pg_migrator detect usage of "objects" oids (data types in
relation, index opclass, ...) that are unknown to be in the standard
-core + contrib distribution, and quit trying to upgrade the cluster in
this case, telling the user his database is not supported?

+1 on this.

Or at least, have it exit and say "if you know that these things are
reasonably safe, run pg_migrator again with --force" or something like that.

Right now pg_migrator throws an error if the schema load doesn't work.
Assuming you use the same version on the old and new clusters, it should
work fine. I am unclear what checking oids would do.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#50Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#44)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Magnus Hagander <magnus@hagander.net> writes:

As long as PostGIS is the same version in both of them, is pg_migrator
is likely to work? (one can always run the PostGIS upgrade as a separate
step)

There was just some discussion about that on postgis-devel. I think the
conclusion was that you would have to do the PostGIS update as a
separate step. They intend to support both 1.3.x and 1.4.x on current
versions of Postgres for some time, so in principle you could do it in
either order.

Oh, yea, you can't go from PostGIS version 1.3 to 1.4 _while_ you do the
pg_migrator upgrade. It has to be done either before or after
pg_migrator is run. I wonder how I could prevent someone from trying
that trick.

Could pg_migrator detect usage of "objects" oids (data types in
relation, index opclass, ...) that are unknown to be in the standard
-core + contrib distribution, and quit trying to upgrade the cluster in
this case, telling the user his database is not supported?

+1 on this.

Or at least, have it exit and say "if you know that these things are
reasonably safe, run pg_migrator again with --force" or something like that.

I don't think that anything in that line is going to be helpful.
What it will lead to is people mindlessly using --force (cf our
bad experiences with -i for pg_dump). If you can't give a *useful*
ie trustworthy warning/error, issuing a useless one is not a good
substitute.

Yep. The install instructions explain how you have to get around this,
and if they don't understand it, they shouldn't be using pg_migrator and
should just do the traditional dump/restore. It is too tempting to give
them a force flag.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#51Bruce Momjian
bruce@momjian.us
In reply to: Magnus Hagander (#45)
Re: pg_migrator issue with contrib

Magnus Hagander wrote:

Could pg_migrator detect usage of "objects" oids (data types in
relation, index opclass, ...) that are unknown to be in the standard
-core + contrib distribution, and quit trying to upgrade the cluster in
this case, telling the user his database is not supported?

+1 on this.

Or at least, have it exit and say "if you know that these things are
reasonably safe, run pg_migrator again with --force" or something like that.

I don't think that anything in that line is going to be helpful.
What it will lead to is people mindlessly using --force (cf our
bad experiences with -i for pg_dump). If you can't give a *useful*
ie trustworthy warning/error, issuing a useless one is not a good
substitute.

Well, in that case, error out would be a better option than doing it and
probably fail later. And have a --force option available, but don't
suggest it.

Uh, what doesn't error out now in pg_migrator?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#52Dimitri Fontaine
dfontaine@hi-media.com
In reply to: Tom Lane (#47)
Re: pg_migrator issue with contrib

Tom Lane <tgl@sss.pgh.pa.us> writes:

Dimitri Fontaine <dfontaine@hi-media.com> writes:

My vote would go to detect and error out without recovering option. If
the tool is not able to handle a situation and knows it, I don't see
what would be good about it leting the user lose data on purpose.

No, that's not what's being discussed. The proposal is to have it error
out when it *does not* know whether there is a real problem; and, in
fact, when there's only a rather low probability of there being a real
problem. My view is that that's basically counterproductive. It leads
directly to having to have a --force switch and then to people using
that switch carelessly.

True, it could be that the data type representation has not changed
between 8.3 and 8.4, nor the index content format. In this case
pg_migrator will work fine on the cluster as soon as you installed the
new .so...

So the case where pg_migrator still fails is when the .sql file of the
module has changed in a way that restoring what pg_dump gives no longer
match what the .so exposes, or if the new .so is non backward
compatible?

Ok, maybe there's a way it'll just work. I withdraw my vote.

Thanks for your patience, regards,
--
dim

#53Bruce Momjian
bruce@momjian.us
In reply to: Dimitri Fontaine (#46)
Re: pg_migrator issue with contrib

Dimitri Fontaine wrote:

I don't think that anything in that line is going to be helpful.
What it will lead to is people mindlessly using --force (cf our
bad experiences with -i for pg_dump). If you can't give a *useful*
ie trustworthy warning/error, issuing a useless one is not a good
substitute.

Well, in that case, error out would be a better option than doing it and
probably fail later. And have a --force option available, but don't
suggest it.

My vote would go to detect and error out without recovering option. If
the tool is not able to handle a situation and knows it, I don't see
what would be good about it leting the user lose data on purpose.

The --force option should be for the user to manually drop his columns
and indexes (etc) and try pg_migrator again, which will stop listing
faulty objects but care about the now compatible cluster.

Restoring the lost data is not the job of pg_migrator, of course.

Agreed. Right now pg_migrator never modifies the old cluster except for
renaming pg_control (documented) so the old cluster is not accidentally
restarted. I don't want to change that behavior.

I would filter the dump --schema file, but again, it is best to let the
administrator do it, and if they can't, they should just do
dump/restore.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#54Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#48)
Re: pg_migrator issue with contrib

Bruce Momjian <bruce@momjian.us> writes:

Let me list the problems with pg_migrator:

o /contrib and plugin migration (not unique to pg_migrator)
o you must read/follow the install instructions
o might require post-migration table/index rebuilds
o new so serious bugs might exist

I think that #1 and #4 could be substantially alleviated if the
instructions recommended doing a trial run with a schema-only dump
of the database. That is,

* pg_dumpall -s
* load that into a test installation (of the *old* PG version)
* migrate the test installation to new PG version
* do the same sorts of applications compatibility checks you'd want to
do anyway before a major version upgrade

This would certainly catch migration-time failures caused by plugins,
and the followup testing would probably catch any large post-migration
issue.

Somebody who is not willing to do this type of testing should not be
using pg_migrator (yet), and probably has not got a database large
enough to need it anyway.

regards, tom lane

#55Bruce Momjian
bruce@momjian.us
In reply to: Dimitri Fontaine (#52)
Re: pg_migrator issue with contrib

Dimitri Fontaine wrote:

Tom Lane <tgl@sss.pgh.pa.us> writes:

Dimitri Fontaine <dfontaine@hi-media.com> writes:

My vote would go to detect and error out without recovering option. If
the tool is not able to handle a situation and knows it, I don't see
what would be good about it leting the user lose data on purpose.

No, that's not what's being discussed. The proposal is to have it error
out when it *does not* know whether there is a real problem; and, in
fact, when there's only a rather low probability of there being a real
problem. My view is that that's basically counterproductive. It leads
directly to having to have a --force switch and then to people using
that switch carelessly.

True, it could be that the data type representation has not changed
between 8.3 and 8.4, nor the index content format. In this case
pg_migrator will work fine on the cluster as soon as you installed the
new .so...

Yes.

So the case where pg_migrator still fails is when the .sql file of the
module has changed in a way that restoring what pg_dump gives no longer
match what the .so exposes, or if the new .so is non backward
compatible?

Yes, that is a problem. It is not a pg_migrator-specific problem
because people traditionally bring the /contrib schema over from the old
install (I think). The only pg_migrator-specific failure is when the
data format changed and dump/restore would fix it, but pg_migrator would
migrate corrupt data. :-(

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#56Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#50)
Re: pg_migrator issue with contrib

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

There was just some discussion about that on postgis-devel. I think the
conclusion was that you would have to do the PostGIS update as a
separate step. They intend to support both 1.3.x and 1.4.x on current
versions of Postgres for some time, so in principle you could do it in
either order.

Oh, yea, you can't go from PostGIS version 1.3 to 1.4 _while_ you do the
pg_migrator upgrade. It has to be done either before or after
pg_migrator is run. I wonder how I could prevent someone from trying
that trick.

You don't need to, because it will fail automatically. They are using
version-numbered .so files, so C-language functions referencing the 1.3
.so will fail to load if only the 1.4 .so is in the new installation.

regards, tom lane

#57Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dimitri Fontaine (#52)
Re: pg_migrator issue with contrib

Dimitri Fontaine <dfontaine@hi-media.com> writes:

So the case where pg_migrator still fails is when the .sql file of the
module has changed in a way that restoring what pg_dump gives no longer
match what the .so exposes, or if the new .so is non backward
compatible?

Exactly. And note that this is not pg_migrator's fault: a pg_dump
dump and reload of the database exposes the user to the same risks,
if the module author has not been careful about compatibility.

regards, tom lane

#58Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#54)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Let me list the problems with pg_migrator:

o /contrib and plugin migration (not unique to pg_migrator)
o you must read/follow the install instructions
o might require post-migration table/index rebuilds
o new so serious bugs might exist

I think that #1 and #4 could be substantially alleviated if the
instructions recommended doing a trial run with a schema-only dump
of the database. That is,

* pg_dumpall -s
* load that into a test installation (of the *old* PG version)
* migrate the test installation to new PG version
* do the same sorts of applications compatibility checks you'd want to
do anyway before a major version upgrade

But you have no data in the database --- can any meaningful testing be
done?

FYI, pg_migrator will do the schema load pretty early (even before the
file copy) and fail on errors. Retrying pg_migrator is pretty easy and
is now well documented in the INSTALL file.

This would certainly catch migration-time failures caused by plugins,
and the followup testing would probably catch any large post-migration
issue.

Somebody who is not willing to do this type of testing should not be
using pg_migrator (yet), and probably has not got a database large
enough to need it anyway.

Agreed.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#59Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#56)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

There was just some discussion about that on postgis-devel. I think the
conclusion was that you would have to do the PostGIS update as a
separate step. They intend to support both 1.3.x and 1.4.x on current
versions of Postgres for some time, so in principle you could do it in
either order.

Oh, yea, you can't go from PostGIS version 1.3 to 1.4 _while_ you do the
pg_migrator upgrade. It has to be done either before or after
pg_migrator is run. I wonder how I could prevent someone from trying
that trick.

You don't need to, because it will fail automatically. They are using
version-numbered .so files, so C-language functions referencing the 1.3
.so will fail to load if only the 1.4 .so is in the new installation.

Nice. Something to be learned there. ;-)

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#60Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#57)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Dimitri Fontaine <dfontaine@hi-media.com> writes:

So the case where pg_migrator still fails is when the .sql file of the
module has changed in a way that restoring what pg_dump gives no longer
match what the .so exposes, or if the new .so is non backward
compatible?

Exactly. And note that this is not pg_migrator's fault: a pg_dump
dump and reload of the database exposes the user to the same risks,
if the module author has not been careful about compatibility.

Agreed. In many ways pg_migrator failures will be caused by failures of
the many tools it relies upon, as I mentioned a few days ago.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#61Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#55)
Re: pg_migrator issue with contrib

Bruce Momjian <bruce@momjian.us> writes:

Dimitri Fontaine wrote:

So the case where pg_migrator still fails is when the .sql file of the
module has changed in a way that restoring what pg_dump gives no longer
match what the .so exposes, or if the new .so is non backward
compatible?

Yes, that is a problem. It is not a pg_migrator-specific problem
because people traditionally bring the /contrib schema over from the old
install (I think). The only pg_migrator-specific failure is when the
data format changed and dump/restore would fix it, but pg_migrator would
migrate corrupt data. :-(

There is a different problem though: sometimes the recommended fix for
contrib-module incompatibilities is to load the new contrib module into
the target database before trying to load your old dump file. (We told
people to do that for 8.2->8.3 tsearch2, for example.) In the
pg_migrator context there is no way to insert the new contrib module
first, and also no way to ignore the duplicate-object errors that you
typically get while loading the old dump.

It would probably be a relatively simple feature addition to teach
pg_migrator to load such-and-such modules into the new databases before
loading the old dump. But I'm still scared to death by the idea of
letting it ignore errors, so there doesn't seem to be any good solution
to this type of migration scenario.

regards, tom lane

#62Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#58)
Re: pg_migrator issue with contrib

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

I think that #1 and #4 could be substantially alleviated if the
instructions recommended doing a trial run with a schema-only dump
of the database. That is,

* pg_dumpall -s
* load that into a test installation (of the *old* PG version)
* migrate the test installation to new PG version
* do the same sorts of applications compatibility checks you'd want to
do anyway before a major version upgrade

But you have no data in the database --- can any meaningful testing be
done?

Well, you'd have to insert some. But this is something you'd have to do
*anyway*, unless you are willing to just pray that your apps don't need
any changes. The only new thing I'm suggesting here is incorporating
use of pg_migrator into your normal compatibility testing.

regards, tom lane

#63Dimitri Fontaine
dfontaine@hi-media.com
In reply to: Tom Lane (#57)
Re: pg_migrator issue with contrib

Tom Lane <tgl@sss.pgh.pa.us> writes:

Exactly. And note that this is not pg_migrator's fault: a pg_dump
dump and reload of the database exposes the user to the same risks,
if the module author has not been careful about compatibility.

It seems to me the dump will contain text string representation of data
and pg_restore will run the input function of the type on this, so that
maintaining backward compatibility of the data type doesn't sound
hard. As far as the specific index support goes, pg_restore creates the
index from scratch.

So, from my point of view, supporting backward compatibility by means of
dump and restore is the easy way. Introducing pg_migrator in the game,
the data type and index internals upgrade are now faced to the same
problem as the -core in-place upgrade project.

Maybe we'll be able to provide the extension authors (not only contrib)
a specialized API to trigger in case of in-place upgrade of PG version
or the extension itself, ala Erlang code upgrade facility e.g.:
http://erlang.org/doc/reference_manual/code_loading.html#12.3

This part of the extension design will need help from C dynamic module
experts around, because it's terra incognita as far as I'm concerned.

Regards,
--
dim

#64Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#23)
Re: pg_migrator issue with contrib

I wrote:

* pg_buffercache has changed the view pg_buffercache, which is
definitely going to be a migration issue. Need to test whether
it represents a crash risk if the old definition is migrated.

I checked this, and there is not a crash risk: the function successfully
creates its result tuplestore, and then the main executor notices it's
not compatible with the old view. So you get

regression=# select * from pg_buffercache;
ERROR: function return row and query-specified return row do not match
DETAIL: Returned row contains 8 attributes, but query expects 7.

You can fix it by dropping and recreating the view (eg, run the module's
uninstall and then install scripts). I suppose that might be a bit
annoying if you've built additional views atop this one, but overall
it doesn't sound too bad.

So I don't plan to do anything about this module. Still working on
the others.

regards, tom lane

#65Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#61)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Dimitri Fontaine wrote:

So the case where pg_migrator still fails is when the .sql file of the
module has changed in a way that restoring what pg_dump gives no longer
match what the .so exposes, or if the new .so is non backward
compatible?

Yes, that is a problem. It is not a pg_migrator-specific problem
because people traditionally bring the /contrib schema over from the old
install (I think). The only pg_migrator-specific failure is when the
data format changed and dump/restore would fix it, but pg_migrator would
migrate corrupt data. :-(

There is a different problem though: sometimes the recommended fix for
contrib-module incompatibilities is to load the new contrib module into
the target database before trying to load your old dump file. (We told
people to do that for 8.2->8.3 tsearch2, for example.) In the
pg_migrator context there is no way to insert the new contrib module
first, and also no way to ignore the duplicate-object errors that you
typically get while loading the old dump.

It would probably be a relatively simple feature addition to teach
pg_migrator to load such-and-such modules into the new databases before
loading the old dump. But I'm still scared to death by the idea of
letting it ignore errors, so there doesn't seem to be any good solution
to this type of migration scenario.

Ah, OK, interesting. We could have pg_migrator detect this issue and
fail right away with a message indicating pg_migrator cannot be used
unless those objects are dumped manually and the /contrib removed.

As long as pg_migrator is clear, I don't think people will complain.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#66Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#62)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

I think that #1 and #4 could be substantially alleviated if the
instructions recommended doing a trial run with a schema-only dump
of the database. That is,

* pg_dumpall -s
* load that into a test installation (of the *old* PG version)
* migrate the test installation to new PG version
* do the same sorts of applications compatibility checks you'd want to
do anyway before a major version upgrade

But you have no data in the database --- can any meaningful testing be
done?

Well, you'd have to insert some. But this is something you'd have to do
*anyway*, unless you are willing to just pray that your apps don't need
any changes. The only new thing I'm suggesting here is incorporating
use of pg_migrator into your normal compatibility testing.

Ah, I see. Interesting. I have added the second sentence to the
pg_migrator README:

Installation
------------

See the INSTALL file for detailed installation instructions. For
deployment testing, create a schema-only copy of the old cluster, insert
dummy data, and migrate that.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#67Bruce Momjian
bruce@momjian.us
In reply to: Dimitri Fontaine (#63)
Re: pg_migrator issue with contrib

Dimitri Fontaine wrote:

Tom Lane <tgl@sss.pgh.pa.us> writes:

Exactly. And note that this is not pg_migrator's fault: a pg_dump
dump and reload of the database exposes the user to the same risks,
if the module author has not been careful about compatibility.

It seems to me the dump will contain text string representation of data
and pg_restore will run the input function of the type on this, so that
maintaining backward compatibility of the data type doesn't sound
hard. As far as the specific index support goes, pg_restore creates the
index from scratch.

So, from my point of view, supporting backward compatibility by means of
dump and restore is the easy way. Introducing pg_migrator in the game,
the data type and index internals upgrade are now faced to the same
problem as the -core in-place upgrade project.

Maybe we'll be able to provide the extension authors (not only contrib)
a specialized API to trigger in case of in-place upgrade of PG version
or the extension itself, ala Erlang code upgrade facility e.g.:
http://erlang.org/doc/reference_manual/code_loading.html#12.3

This part of the extension design will need help from C dynamic module
experts around, because it's terra incognita as far as I'm concerned.

At a minimum it would be great if items could mark themselves as
non-binary-upgradable. Perhaps the existence of a symbol in the *.so
file could indicate that, or a function call could be made and you pass
in the old and new major version numbers and it would return true/false
based on binary upgradeability.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#68Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#65)
Re: pg_migrator issue with contrib

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

There is a different problem though: sometimes the recommended fix for
contrib-module incompatibilities is to load the new contrib module into
the target database before trying to load your old dump file. (We told
people to do that for 8.2->8.3 tsearch2, for example.) In the
pg_migrator context there is no way to insert the new contrib module
first, and also no way to ignore the duplicate-object errors that you
typically get while loading the old dump.

Ah, OK, interesting. We could have pg_migrator detect this issue and
fail right away with a message indicating pg_migrator cannot be used
unless those objects are dumped manually and the /contrib removed.

How would it detect it? I think the only thing you could do is
hard-wire tests for specific objects, which is klugy and doesn't
extend to third-party modules that you don't know about.

In most cases where there are major incompatibilities, attempting to
load the old dump file would fail anyway, so I don't think pg_migrator
really needs any hard-wired test. It's the minor changes that are
risky. I think ultimately the burden for those has to be on the module
author: he has to either avoid cross-version incompatibilities or make
sure they will fail safely.

regards, tom lane

#69Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#48)
Re: pg_migrator issue with contrib

On Mon, Jun 8, 2009 at 11:08 AM, Bruce Momjian<bruce@momjian.us> wrote:

Robert Haas wrote:

Right now nothing in the project is referring to pg_migrator except in
the press release, and it is marked as beta there. ?How do you want to
deemphasize it more than that? ?Why did I bother working on this if the
community reaction is to try to figure out how to make people avoid
using it?

Because Rome wasn't built in a day.

It seems to me that you yourself placed a far more disparaging label
on it than anything that anyone has proposed today; this was a week
ago:

http://archives.postgresql.org/pgsql-hackers/2009-05/msg01470.php

I don't think it's anyone's intention to disparage your work on this
tool.  It certainly isn't mine.  But it seems obvious to me that it
has some pretty severe limitations and warts.  The fact that those
limitations and warts are well-documented doesn't negate their
existence.  I also don't think calling the software "beta" or
"experimental" is a way of deemphasizing it.  I think it's a way of
being clear that this software is not the bullet-proof, rock-solid,
handles-all-cases-and-keeps-on-trucking level of robustness that
people have come to expect from PostgreSQL.

FWIW, I have no problem at all with mentioning pg_migrator in the
release notes or the documentation; my failure to respond to your last
emails on this topic was due to being busy and having already spent
too much time responding to other emails, not due to thinking it was a
bad idea.  I actually think it's a good idea.  But I also think those
references should describe it as experimental, because I think it is.
I really hope it won't remain experimental forever, but I think that's
an accurate characterization of where it is now.

pg_migrator should be looked at critically here, and I agree we should
avoid letting pg_migrator failures reflect badly on Postgres.

Let me list the problems with pg_migrator:

       o  /contrib and plugin migration (not unique to pg_migrator)
       o  you must read/follow the install instructions
       o  might require post-migration table/index rebuilds
       o  new so serious bugs might exist

I pretty much agree with this list. With respect to #2, I don't think
that it's asking a lot for people to read/follow the install
instructions, so I don't consider that a serious problem.

and let me list its benefits:

       o  first in-place upgrade capability in years
       o  tested by some users, all successful (since late alpha)
       o  removes major impediment to adoption
       o  includes extensive error checking and reporting
       o  contains detailed installation/usage instructions

So let's look at pg_migrator as an opportunity and a risk.  As far as I
know, only Hiroshi Saito and I have have looked at the code.  Why don't
others read the pg_migrator source code looking for bugs?  Why have more
people not test it?

No reason at all - I very much hope that happens.

I think "experimental" is the wrong label.  Experimental assumes its
usefulness is uncertain and that it is still being researched ---
neither is true.  Once I release pg_migrator 8.4 final at the end of
this week (assuming no bugs are reported), I consider it done, or at
least advanced as far as I can go until I get more feedback from users.

Oh, to me "experimental" does not imply that usefulness is uncertain;
rather, it implies that usefulness has been established but that the
code is new (item #4 above) and may be not be 100% feature-complete
(items #1 and #3 above).

I think we can say:  "pg_migrator is designed for experienced users with
large databases, for whom the typical dump/restore required for major
version upgrades is a hardship".

Precisely. In other words, if you are an INEXPERIENCED user (that is
to say, most of them) or you don't have a particular large database,
dump + reload is probably the safest option. We're not discouraging
you from use pg_migrator, but please be careful and observe that it is
new and has some limitations.

I assume this will be the same adoption pattern we had with the Win32
port, where it was a new platform in 8.0 and we dealt with some issues
as it was deployed, and that people who want it will find it and
hopefully it will be useful for them.

Completely agree. And like the Windows port, hopefully after a
release or two, we'll figure out what we can improve and do so. I am
interested in this problem but all of my free time lately has been
going into the EXPLAIN patch I'm working on, so I haven't had time to
dig into it much. The problems of being a hobbyist...

...Robert

#70Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#68)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

There is a different problem though: sometimes the recommended fix for
contrib-module incompatibilities is to load the new contrib module into
the target database before trying to load your old dump file. (We told
people to do that for 8.2->8.3 tsearch2, for example.) In the
pg_migrator context there is no way to insert the new contrib module
first, and also no way to ignore the duplicate-object errors that you
typically get while loading the old dump.

Ah, OK, interesting. We could have pg_migrator detect this issue and
fail right away with a message indicating pg_migrator cannot be used
unless those objects are dumped manually and the /contrib removed.

How would it detect it? I think the only thing you could do is
hard-wire tests for specific objects, which is klugy and doesn't
extend to third-party modules that you don't know about.

Yep, that is the only way. The good news is that we are not modifying
any data so it is just a detection issue, i.e. it would not destabilize
pg_migrator's operation.

In most cases where there are major incompatibilities, attempting to
load the old dump file would fail anyway, so I don't think pg_migrator
really needs any hard-wired test. It's the minor changes that are
risky. I think ultimately the burden for those has to be on the module
author: he has to either avoid cross-version incompatibilities or make
sure they will fail safely.

Yep.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#71Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#67)
Re: pg_migrator issue with contrib

Bruce Momjian <bruce@momjian.us> writes:

At a minimum it would be great if items could mark themselves as
non-binary-upgradable.

It's hardly difficult to make that happen --- just change the C name of
some function, or the name of the whole .so file.

regards, tom lane

#72Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#69)
Re: pg_migrator issue with contrib

Robert Haas wrote:

Let me list the problems with pg_migrator:

? ? ? ?o ?/contrib and plugin migration (not unique to pg_migrator)
? ? ? ?o ?you must read/follow the install instructions
? ? ? ?o ?might require post-migration table/index rebuilds
? ? ? ?o ?new so serious bugs might exist

I pretty much agree with this list. With respect to #2, I don't think
that it's asking a lot for people to read/follow the install
instructions, so I don't consider that a serious problem.

My point was that I think someday pg_migrator will be point-and-click,
but it is not now.

Oh, to me "experimental" does not imply that usefulness is uncertain;
rather, it implies that usefulness has been established but that the
code is new (item #4 above) and may be not be 100% feature-complete
(items #1 and #3 above).

I think we can say: ?"pg_migrator is designed for experienced users with
large databases, for whom the typical dump/restore required for major
version upgrades is a hardship".

Precisely. In other words, if you are an INEXPERIENCED user (that is
to say, most of them) or you don't have a particular large database,
dump + reload is probably the safest option. We're not discouraging
you from use pg_migrator, but please be careful and observe that it is
new and has some limitations.

Agreed. There is no reason for most users to need pg_migrator; it is
not worth the risk for them, however small. There are some people who
really need it, and hopefully they are experienced users, while there is
a larger group who want to know such an option _exists_, so if they ever
need it, it is available.

I assume this will be the same adoption pattern we had with the Win32
port, where it was a new platform in 8.0 and we dealt with some issues
as it was deployed, and that people who want it will find it and
hopefully it will be useful for them.

Completely agree. And like the Windows port, hopefully after a
release or two, we'll figure out what we can improve and do so. I am
interested in this problem but all of my free time lately has been
going into the EXPLAIN patch I'm working on, so I haven't had time to
dig into it much. The problems of being a hobbyist...

One difference in risk is that the Windows port usually had _new_ data
meaning you were not risking as much as using pg_migrator on an
estabilished database installation.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#73Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#71)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

At a minimum it would be great if items could mark themselves as
non-binary-upgradable.

It's hardly difficult to make that happen --- just change the C name of
some function, or the name of the whole .so file.

Yes, but it needs to happen. ;-) PostGIS has done this, which is good.
The problem is that if they don't do it it is out of the control of
pg_migrator.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#74Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#72)
Re: pg_migrator issue with contrib

Bruce Momjian wrote:

Oh, to me "experimental" does not imply that usefulness is uncertain;
rather, it implies that usefulness has been established but that the
code is new (item #4 above) and may be not be 100% feature-complete
(items #1 and #3 above).

I think we can say: ?"pg_migrator is designed for experienced users with
large databases, for whom the typical dump/restore required for major
version upgrades is a hardship".

Precisely. In other words, if you are an INEXPERIENCED user (that is
to say, most of them) or you don't have a particular large database,
dump + reload is probably the safest option. We're not discouraging
you from use pg_migrator, but please be careful and observe that it is
new and has some limitations.

Agreed. There is no reason for most users to need pg_migrator; it is
not worth the risk for them, however small. There are some people who
really need it, and hopefully they are experienced users, while there is
a larger group who want to know such an option _exists_, so if they ever
need it, it is available.

I think this "larger group" is where my problem with the word
"experimental" come in. I think pg_migrator is far enough along that we
know it works, and that it will probably work for future major releases.
By calling it "experimental" we are not conveying confidence in the tool
for people who are making deployment decisions based on the existence of
such tool, even if they aren't going to use it initially. And by not
conveying confidence, we will lose the adoption advantage we can get
from pg_migrator.

Now, we don't want to over-promise, but at the same time we shouldn't
downplay the tool either. For a sufficiently-experienced administrator,
pg_migrator is a useful migration tool, and we need to convey that.
Even if you have to hire a consultant to manage the migration, if it
saves days of downtime, it is worth it. Consultants don't often use
experimental tools, but they do use complex, powerful tools that are
often rough around the edges in terms of usability, e.g. read the
INSTALL file carefully.

For longterm strategy, let me list the challenges for pg_migrator from
any major upgrade (listed in the DEVELOPERS file):

Change Conversion Method
------------------------------------------------------------------------------
clog none
heap page header, including bitmask convert to new page format on read
tuple header, including bitmask convert to new page format on read
data value format create old data type in new cluster
index page format reindex, or recreate old index methods

These are the issue we will have to address for 8.5 and beyond if
pg_migrator is to remain useful.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#75Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#73)
Re: pg_migrator issue with contrib

On Mon, Jun 8, 2009 at 1:06 PM, Bruce Momjian<bruce@momjian.us> wrote:

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

At a minimum it would be great if items could mark themselves as
non-binary-upgradable.

It's hardly difficult to make that happen --- just change the C name of
some function, or the name of the whole .so file.

Yes, but it needs to happen.  ;-)  PostGIS has done this, which is good.
The problem is that if they don't do it it is out of the control of
pg_migrator.

I think it might be possible to implement a system that can't be
broken by accident. Firefox (at least AIUI) requires plugin authors
to explicitly flag their modules as compatible with new versions of
Firefox. When you upgrade your firefox installation in place (heh,
heh) it goes off to the web site and checks whether all of your
extensions have been so flagged. Any that have not been get disabled
automatically.

Obviously we don't want to get into connecting to a web site, but we
could probably come up with some other API for .so files to indicate
which versions of PG they're compatible with. If they don't implement
that API, we assume the predate its introduction and are not
upgradeable. I'm fuzzy on the details here but the point is that if
you implement an opt-in system rather than an opt-out system then
people have to deliberately circumvent it to break things, rather than
just needing to be lazy.

...Robert

#76Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#74)
Re: pg_migrator issue with contrib

On Mon, Jun 8, 2009 at 1:20 PM, Bruce Momjian<bruce@momjian.us> wrote:

Bruce Momjian wrote:

Oh, to me "experimental" does not imply that usefulness is uncertain;
rather, it implies that usefulness has been established but that the
code is new (item #4 above) and may be not be 100% feature-complete
(items #1 and #3 above).

I think we can say: ?"pg_migrator is designed for experienced users with
large databases, for whom the typical dump/restore required for major
version upgrades is a hardship".

Precisely.  In other words, if you are an INEXPERIENCED user (that is
to say, most of them) or you don't have a particular large database,
dump + reload is probably the safest option.  We're not discouraging
you from use pg_migrator, but please be careful and observe that it is
new and has some limitations.

Agreed.  There is no reason for most users to need pg_migrator;  it is
not worth the risk for them, however small.  There are some people who
really need it, and hopefully they are experienced users, while there is
a larger group who want to know such an option _exists_, so if they ever
need it, it is available.

I think this "larger group" is where my problem with the word
"experimental" come in.  I think pg_migrator is far enough along that we
know it works, and that it will probably work for future major releases.
By calling it "experimental" we are not conveying confidence in the tool
for people who are making deployment decisions based on the existence of
such tool, even if they aren't going to use it initially.  And by not
conveying confidence, we will lose the adoption advantage we can get
from pg_migrator.

Now, we don't want to over-promise, but at the same time we shouldn't
downplay the tool either.  For a sufficiently-experienced administrator,
pg_migrator is a useful migration tool, and we need to convey that.
Even if you have to hire a consultant to manage the migration, if it
saves days of downtime, it is worth it.  Consultants don't often use
experimental tools, but they do use complex, powerful tools that are
often rough around the edges in terms of usability, e.g. read the
INSTALL file carefully.

Fair enough. I'm game to use a different word. I spent approximately
30 seconds coming up with that suggestion. :-)

For longterm strategy, let me list the challenges for pg_migrator from
any major upgrade (listed in the DEVELOPERS file):

 Change                                  Conversion Method
 ------------------------------------------------------------------------------
 clog                                    none
 heap page header, including bitmask     convert to new page format on read
 tuple header, including bitmask         convert to new page format on read
 data value format                       create old data type in new cluster
 index page format                       reindex, or recreate old index methods

These are the issue we will have to address for 8.5 and beyond if
pg_migrator is to remain useful.

No arguments here, sounds like interesting stuff.

...Robert

#77Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#75)
Re: pg_migrator issue with contrib

Robert Haas <robertmhaas@gmail.com> writes:

Obviously we don't want to get into connecting to a web site, but we
could probably come up with some other API for .so files to indicate
which versions of PG they're compatible with.

You mean like PG_MODULE_MAGIC?

regards, tom lane

#78Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#77)
Re: pg_migrator issue with contrib

On Mon, Jun 8, 2009 at 1:32 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Obviously we don't want to get into connecting to a web site, but we
could probably come up with some other API for .so files to indicate
which versions of PG they're compatible with.

You mean like PG_MODULE_MAGIC?

Hey, how about that. Why doesn't that solve our problem here?

[ thinks ... ] I guess it's because there's no guarantee that the
function is installed on the SQL level with the signature that is
appropriate on the C level. To fix that, I suppose we'd need to
version the contents of the .sql file that installs the definitions,
which gets back to the problem of building a general-purpose module
facility.

...Robert

#79Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#75)
Re: pg_migrator issue with contrib

Robert Haas wrote:

On Mon, Jun 8, 2009 at 1:06 PM, Bruce Momjian<bruce@momjian.us> wrote:

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

At a minimum it would be great if items could mark themselves as
non-binary-upgradable.

It's hardly difficult to make that happen --- just change the C name of
some function, or the name of the whole .so file.

Yes, but it needs to happen. ?;-) ?PostGIS has done this, which is good.
The problem is that if they don't do it it is out of the control of
pg_migrator.

I think it might be possible to implement a system that can't be
broken by accident. Firefox (at least AIUI) requires plugin authors
to explicitly flag their modules as compatible with new versions of
Firefox. When you upgrade your firefox installation in place (heh,
heh) it goes off to the web site and checks whether all of your
extensions have been so flagged. Any that have not been get disabled
automatically.

Interesting it allows the flagging to happen in real-time, rather than
requiring the system to know at install time whether it is compatible
with future verions (almost an impossibility). I am afraid we would
need some kind of real-time check, or at least have major versions flag
which external stuff cannot be upgraded, which we have discussed here
already.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#80Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#76)
Re: pg_migrator issue with contrib

Robert Haas wrote:

Now, we don't want to over-promise, but at the same time we shouldn't
downplay the tool either. ?For a sufficiently-experienced administrator,
pg_migrator is a useful migration tool, and we need to convey that.
Even if you have to hire a consultant to manage the migration, if it
saves days of downtime, it is worth it. ?Consultants don't often use
experimental tools, but they do use complex, powerful tools that are
often rough around the edges in terms of usability, e.g. read the
INSTALL file carefully.

Fair enough. I'm game to use a different word. I spent approximately
30 seconds coming up with that suggestion. :-)

I think the text I already posted is appropriate:

pg_migrator is designed for experienced users with large databases, for
whom the typical dump/restore required for major version upgrades is a
hardship.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#81Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#23)
Re: pg_migrator issue with contrib

I wrote:

[ concerning various migration hazards in contrib/ ]

* isn has got the nastiest problems of the lot: since it piggybacks
on type bigint, a migrated database might work, or might crash
miserably, depending on whether bigint has become pass-by-value
in the new database. I'm not very sure if we can fix this reasonably.

I've spent some time thinking about possible workarounds for this, and
not really come up with any. The only feasible thing I can think of
to do is teach pg_migrator to refuse to migrate if (a) the old DB
contains contrib/isn, and (b) the new DB has FLOAT8PASSYVAL (which
can be checked in pg_control). One question here is how you decide
if the old DB contains contrib/isn. I don't think looking for the
type name per se is a hot idea. The best plan that has come to mind
is to look through pg_proc to see if there are any C-language functions
that reference "$libdir/isn".

* pg_freespacemap has made *major* changes in its ABI. There's
probably no hope of this working either, but we need to be sure
it's not a crash risk.

This turns out not to be a problem: the set of exposed C function names
changed, so the function definitions will fail to migrate. Dropping
the module before migrating and reinstalling it afterwards is an easy
workaround.

* pgstattuple has made changes in the output types of its functions.
This is a serious crash risk, and I'm not immediately sure how to
fix it. Note that simply migrating the module will not draw any
errors.

This doesn't seem to create serious problems either. pgstatindex() has
changed some output columns from int4 to int8, but because it creates
the result tuple from text strings, it manages to just work anyway.
(In principle you could get some overflow problems with very large
indexes, but I doubt that's an issue in practice; and it couldn't cause
a crash anyway.) pg_relpages() likewise changed its return type, but
in this particular case you could only get garbage answers not a crash.
So I think we can just tell people to reinstall the SQL file after
migration.

* tsearch2 has opclass support function changes, but unlike other
cases of this, it will fail to migrate to 8.4 because the functions
are references to core functions instead of being declared in the
module. Also, "drop it first" isn't a very useful recommendation
since the domains it defines might be used somewhere.

It would be nice if this migrated cleanly, but it doesn't and there's
not much we can do about it. At least it will fail safely.

So, other than the suggested pg_migrator hack for contrib/isn, the only
thing left to do here is fix dblink_current_query(). I'll take care of
that, but not till after Joe commits his remaining patch, so as not to
risk creating merge hazards for him.

regards, tom lane

#82Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#78)
Re: pg_migrator issue with contrib

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Jun 8, 2009 at 1:32 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

You mean like PG_MODULE_MAGIC?

Hey, how about that. Why doesn't that solve our problem here?

[ thinks ... ] I guess it's because there's no guarantee that the
function is installed on the SQL level with the signature that is
appropriate on the C level.

Yeah. And it's more than just the function itself. For example,
in the contrib/isn mess, the function definitions didn't change.
The problem is the passbyval flag (or lack of it) on the type
definition.

I think we've speculated in the past about having ways of embedding
per-function data into the .so libraries so that these sorts of
things could be caught automatically. But it'd be a lot of work
for rather limited reward.

regards, tom lane

#83Alvaro Herrera
alvherre@commandprompt.com
In reply to: Bruce Momjian (#74)
Re: pg_migrator issue with contrib

Bruce Momjian escribi�:

For longterm strategy, let me list the challenges for pg_migrator from
any major upgrade (listed in the DEVELOPERS file):

Change Conversion Method
------------------------------------------------------------------------------
clog none
heap page header, including bitmask convert to new page format on read
tuple header, including bitmask convert to new page format on read
data value format create old data type in new cluster
index page format reindex, or recreate old index methods

TOAST changes?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#84Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#81)
Re: pg_migrator issue with contrib

Tom Lane wrote:

I wrote:

[ concerning various migration hazards in contrib/ ]

* isn has got the nastiest problems of the lot: since it piggybacks
on type bigint, a migrated database might work, or might crash
miserably, depending on whether bigint has become pass-by-value
in the new database. I'm not very sure if we can fix this reasonably.

I've spent some time thinking about possible workarounds for this, and
not really come up with any. The only feasible thing I can think of
to do is teach pg_migrator to refuse to migrate if (a) the old DB
contains contrib/isn, and (b) the new DB has FLOAT8PASSYVAL (which
can be checked in pg_control). One question here is how you decide
if the old DB contains contrib/isn. I don't think looking for the
type name per se is a hot idea. The best plan that has come to mind
is to look through pg_proc to see if there are any C-language functions
that reference "$libdir/isn".

Sure, pg_migrator is good at checking. Please confirm you want this
added to pg_migrator.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#85Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#84)
Re: pg_migrator issue with contrib

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

I've spent some time thinking about possible workarounds for this, and
not really come up with any. The only feasible thing I can think of
to do is teach pg_migrator to refuse to migrate if (a) the old DB
contains contrib/isn, and (b) the new DB has FLOAT8PASSYVAL (which
can be checked in pg_control). One question here is how you decide
if the old DB contains contrib/isn. I don't think looking for the
type name per se is a hot idea. The best plan that has come to mind
is to look through pg_proc to see if there are any C-language functions
that reference "$libdir/isn".

Sure, pg_migrator is good at checking. Please confirm you want this
added to pg_migrator.

Yeah, I'd suggest it. Even if we later come up with a workaround for
contrib/isn, you're going to want to have the infrastructure in place
for this type of check, because there will surely be cases that need it.

Note that I think the FLOAT8PASSYVAL check is a must. There is no
reason to forbid migrating isn on 32-bit machines, for example.

regards, tom lane

#86Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#83)
Re: pg_migrator issue with contrib

Alvaro Herrera wrote:

Bruce Momjian escribi?:

For longterm strategy, let me list the challenges for pg_migrator from
any major upgrade (listed in the DEVELOPERS file):

Change Conversion Method
------------------------------------------------------------------------------
clog none
heap page header, including bitmask convert to new page format on read
tuple header, including bitmask convert to new page format on read
data value format create old data type in new cluster
index page format reindex, or recreate old index methods

TOAST changes?

Good point, added.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#87Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#85)
1 attachment(s)
Re: pg_migrator issue with contrib

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

I've spent some time thinking about possible workarounds for this, and
not really come up with any. The only feasible thing I can think of
to do is teach pg_migrator to refuse to migrate if (a) the old DB
contains contrib/isn, and (b) the new DB has FLOAT8PASSYVAL (which
can be checked in pg_control). One question here is how you decide
if the old DB contains contrib/isn. I don't think looking for the
type name per se is a hot idea. The best plan that has come to mind
is to look through pg_proc to see if there are any C-language functions
that reference "$libdir/isn".

Sure, pg_migrator is good at checking. Please confirm you want this
added to pg_migrator.

Yeah, I'd suggest it. Even if we later come up with a workaround for
contrib/isn, you're going to want to have the infrastructure in place
for this type of check, because there will surely be cases that need it.

Note that I think the FLOAT8PASSYVAL check is a must. There is no
reason to forbid migrating isn on 32-bit machines, for example.

Done, with patch attached, and pg_migrator beta6 released.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Attachments:

/rtmp/difftext/x-diffDownload
? tools
? log
? src/pg_migrator
Index: src/controldata.c
===================================================================
RCS file: /cvsroot/pg-migrator/pg_migrator/src/controldata.c,v
retrieving revision 1.14
diff -c -r1.14 controldata.c
*** src/controldata.c	13 May 2009 15:19:16 -0000	1.14
--- src/controldata.c	8 Jun 2009 21:29:31 -0000
***************
*** 23,29 ****
   */
  void
  get_control_data(migratorContext *ctx, const char *bindir,
! 				 const char *datadir, ControlData *ctrl)
  {
  	char		cmd[MAXPGPATH];
  	char		bufin[MAX_STRING];
--- 23,30 ----
   */
  void
  get_control_data(migratorContext *ctx, const char *bindir,
! 				 const char *datadir, ControlData *ctrl,
! 				 const uint32 pg_version)
  {
  	char		cmd[MAXPGPATH];
  	char		bufin[MAX_STRING];
***************
*** 43,48 ****
--- 44,50 ----
  	bool		got_index = false;
  	bool		got_toast = false;
  	bool		got_date_is_int = false;
+ 	bool		got_float8_pass_by_value = false;
  	char	   *lang = NULL;
  
  	/* Because we test the pg_resetxlog output strings, it has to be in English. */
***************
*** 65,71 ****
  	/* Only pre-8.4 has these so if they are not set below we will check later */
  	ctrl->lc_collate = NULL;
  	ctrl->lc_ctype = NULL;
! 	
  	/* we have the result of cmd in "output". so parse it line by line now */
  	while (fgets(bufin, sizeof(bufin), output))
  	{
--- 67,80 ----
  	/* Only pre-8.4 has these so if they are not set below we will check later */
  	ctrl->lc_collate = NULL;
  	ctrl->lc_ctype = NULL;
! 
! 	/* Not in pre-8.4 */
! 	if (pg_version < 80400)
! 	{
! 		ctrl->float8_pass_by_value = false;
! 		got_float8_pass_by_value = true;
! 	}
! 
  	/* we have the result of cmd in "output". so parse it line by line now */
  	while (fgets(bufin, sizeof(bufin), output))
  	{
***************
*** 249,254 ****
--- 258,275 ----
  			ctrl->date_is_int = strstr(p, "64-bit integers") != NULL;
  			got_date_is_int = true;
  		}
+ 		else if ((p = strstr(bufin, "Float8 argument passing:")) != NULL)
+ 		{
+ 			p = strchr(p, ':');
+ 
+ 			if (p == NULL || strlen(p) <= 1)
+ 				pg_log(ctx, PG_FATAL, "%d: pg_resetxlog  problem\n", __LINE__);
+ 
+ 			p++;				/* removing ':' char */
+ 			/* used later for /contrib check */
+ 			ctrl->float8_pass_by_value = strstr(p, "by value") != NULL;
+ 			got_float8_pass_by_value = true;
+ 		}
  		/* In pre-8.4 only */
  		else if ((p = strstr(bufin, "LC_COLLATE:")) != NULL)
  		{
***************
*** 305,311 ****
  	if (!got_xid || !got_oid || !got_log_id || !got_log_seg || !got_tli ||
  		!got_align || !got_blocksz || !got_largesz || !got_walsz ||
  		!got_walseg || !got_ident || !got_index || !got_toast ||
! 		!got_date_is_int)
  	{
  		pg_log(ctx, PG_REPORT,
  			"Some required control information is missing;  cannot find:\n");
--- 326,332 ----
  	if (!got_xid || !got_oid || !got_log_id || !got_log_seg || !got_tli ||
  		!got_align || !got_blocksz || !got_largesz || !got_walsz ||
  		!got_walseg || !got_ident || !got_index || !got_toast ||
! 		!got_date_is_int || !got_float8_pass_by_value)
  	{
  		pg_log(ctx, PG_REPORT,
  			"Some required control information is missing;  cannot find:\n");
***************
*** 352,357 ****
--- 373,382 ----
  		if (!got_date_is_int)
  			pg_log(ctx, PG_REPORT, "  dates/times are integers?\n");
  
+ 		/* value added in Postgres 8.4 */
+ 		if (!got_float8_pass_by_value)
+ 			pg_log(ctx, PG_REPORT, "  float8 argument passing method\n");
+ 
  		pg_log(ctx, PG_FATAL,
  			   "Unable to continue without required control information, terminating\n");
  	}
Index: src/pg_migrator.c
===================================================================
RCS file: /cvsroot/pg-migrator/pg_migrator/src/pg_migrator.c,v
retrieving revision 1.42
diff -c -r1.42 pg_migrator.c
*** src/pg_migrator.c	4 Jun 2009 22:01:55 -0000	1.42
--- src/pg_migrator.c	8 Jun 2009 21:29:31 -0000
***************
*** 71,76 ****
--- 71,77 ----
  	{
  		v8_3_check_for_name_data_type_usage(&ctx, CLUSTER_OLD);
  		v8_3_check_for_tsquery_usage(&ctx, CLUSTER_OLD);
+ 		v8_3_check_for_isn_and_int8_passing_mismatch(&ctx, CLUSTER_OLD);
  		if (ctx.check)
  		{
  			v8_3_rebuild_tsvector_tables(&ctx, true, CLUSTER_OLD);
***************
*** 560,568 ****
  
  	/* get/check pg_control data of servers */
  	get_control_data(ctx, ctx->old.bindir, ctx->old.pgdata,
! 					 &ctx->old.controldata);
  	get_control_data(ctx, ctx->new.bindir, ctx->new.pgdata,
! 					 &ctx->new.controldata);
  	check_control_data(ctx, &ctx->old.controldata, &ctx->new.controldata);
  }
  
--- 561,569 ----
  
  	/* get/check pg_control data of servers */
  	get_control_data(ctx, ctx->old.bindir, ctx->old.pgdata,
! 					 &ctx->old.controldata, ctx->old.pg_version);
  	get_control_data(ctx, ctx->new.bindir, ctx->new.pgdata,
! 					 &ctx->new.controldata, ctx->new.pg_version);
  	check_control_data(ctx, &ctx->old.controldata, &ctx->new.controldata);
  }
  
Index: src/pg_migrator.h
===================================================================
RCS file: /cvsroot/pg-migrator/pg_migrator/src/pg_migrator.h,v
retrieving revision 1.51
diff -c -r1.51 pg_migrator.h
*** src/pg_migrator.h	8 Jun 2009 18:33:32 -0000	1.51
--- src/pg_migrator.h	8 Jun 2009 21:29:31 -0000
***************
*** 145,150 ****
--- 145,151 ----
  	uint32		index;
  	uint32		toast;
  	bool		date_is_int;
+ 	bool		float8_pass_by_value;
  	char	   *lc_collate;
  	char	   *lc_ctype;
  	char	   *encoding;
***************
*** 244,250 ****
  /* controldata.c */
  
  void get_control_data(migratorContext *ctx, const char *bindir,
! 				 const char *datadir, ControlData *ctrl);
  void check_control_data(migratorContext *ctx, ControlData *oldctrl,
  				   ControlData *newctrl);
  void set_locale_and_encoding(migratorContext *ctx, Cluster whichCluster);
--- 245,252 ----
  /* controldata.c */
  
  void get_control_data(migratorContext *ctx, const char *bindir,
! 				 const char *datadir, ControlData *ctrl,
! 				 const uint32 pg_version);
  void check_control_data(migratorContext *ctx, ControlData *oldctrl,
  				   ControlData *newctrl);
  void set_locale_and_encoding(migratorContext *ctx, Cluster whichCluster);
***************
*** 375,380 ****
--- 377,384 ----
  							Cluster whichCluster);
  void		v8_3_check_for_tsquery_usage(migratorContext *ctx,
  							Cluster whichCluster);
+ void		v8_3_check_for_isn_and_int8_passing_mismatch(migratorContext *ctx,
+ 							Cluster whichCluster);
  void		v8_3_rebuild_tsvector_tables(migratorContext *ctx,
  							bool check_mode, Cluster whichCluster);
  void		v8_3_invalidate_hash_gin_indexes(migratorContext *ctx,
Index: src/version.c
===================================================================
RCS file: /cvsroot/pg-migrator/pg_migrator/src/version.c,v
retrieving revision 1.17
diff -c -r1.17 version.c
*** src/version.c	31 May 2009 20:25:48 -0000	1.17
--- src/version.c	8 Jun 2009 21:29:31 -0000
***************
*** 89,95 ****
  				"| Your installation uses the \"name\" data type in user tables.\n"
  				"| This data type changed its internal alignment between your old and\n"
  				"| new clusters so this cluster cannot currently be upgraded.\n"
! 				"| You can remove the problem tables and restart this program.\n"
  				"| Simply dropping the offending columns will not work.\n"
  				"| A list of the problem columns is in the file\n"
  				"| \"%s\".\n\n", output_path);
--- 89,95 ----
  				"| Your installation uses the \"name\" data type in user tables.\n"
  				"| This data type changed its internal alignment between your old and\n"
  				"| new clusters so this cluster cannot currently be upgraded.\n"
! 				"| You can remove the problem tables and restart the migration.\n"
  				"| Simply dropping the offending columns will not work.\n"
  				"| A list of the problem columns is in the file\n"
  				"| \"%s\".\n\n", output_path);
***************
*** 178,184 ****
  				"| Your installation uses the \"tsquery\" data type.\n"
  				"| This data type added a new internal field between your old and\n"
  				"| new clusters so this cluster cannot currently be upgraded.\n"
! 				"| You can remove the problem columns and restart this program.\n"
  				"| A list of the problem columns is in the file\n"
  				"| \"%s\".\n\n", output_path);
  	}
--- 178,184 ----
  				"| Your installation uses the \"tsquery\" data type.\n"
  				"| This data type added a new internal field between your old and\n"
  				"| new clusters so this cluster cannot currently be upgraded.\n"
! 				"| You can remove the problem columns and restart the migration.\n"
  				"| A list of the problem columns is in the file\n"
  				"| \"%s\".\n\n", output_path);
  	}
***************
*** 188,193 ****
--- 188,286 ----
  
  
  /*
+  * v8_3_check_for_isn_and_int8_passing_mismatch()
+  *
+  *	/contrib/isn relies on data type bigint, and the CREATE TYPE
+  *  PASSEDBYVALUE setting might not match in the old and new servers,
+  *	so the new shared object file would work unreliably.  Passing int8
+  *	by value is new in Postgres 8.4.
+  */
+ void
+ v8_3_check_for_isn_and_int8_passing_mismatch(migratorContext *ctx, Cluster whichCluster)
+ {
+ 	ClusterInfo	*active_cluster = (whichCluster == CLUSTER_OLD) ?
+ 					&ctx->old : &ctx->new;
+ 	int			dbnum;
+ 	FILE		*script = NULL;
+ 	bool		found = false;
+ 	char		output_path[MAXPGPATH];
+ 
+ 	prep_status(ctx, "Checking for /contrib/isn with bigint-passing mismatch");
+ 
+ 	if (ctx->old.controldata.float8_pass_by_value ==
+ 		ctx->new.controldata.float8_pass_by_value)
+ 	{
+ 		/* no mismatch */
+ 		check_ok(ctx);
+ 		return;
+ 	}
+ 
+ 	snprintf(output_path, sizeof(output_path), "%s/contrib_isn_and_int8_pass_by_value.txt",
+ 			ctx->home_dir);
+ 	
+ 	for (dbnum = 0; dbnum < active_cluster->dbarr.ndbs; dbnum++)
+ 	{
+ 		PGresult   *res;
+  		bool		db_used = false;
+ 		int			ntups;
+ 		int			rowno;
+ 		int			i_nspname, i_proname;
+ 		DbInfo	   *active_db = &active_cluster->dbarr.dbs[dbnum];
+ 		PGconn	   *conn = connectToServer(ctx, active_db->db_name, whichCluster);
+ 		
+ 		/* Find any user-defined tsquery columns */
+ 		res = executeQueryOrDie(ctx, conn,
+ 								"SELECT n.nspname, p.proname "
+ 								"FROM	pg_catalog.pg_proc p, "
+ 								"		pg_catalog.pg_namespace n "
+ 								"WHERE	p.pronamespace = n.oid AND "
+ 								"		p.probin = '$libdir/isn' AND "
+ 								"		n.nspname != 'pg_catalog' AND "
+ 								"		n.nspname != 'information_schema'");
+ 
+ 		ntups = PQntuples(res);
+ 		i_nspname = PQfnumber(res, "nspname");
+ 		i_proname = PQfnumber(res, "proname");
+ 		for (rowno = 0; rowno < ntups; rowno++)
+ 		{
+ 			found = true;
+ 			if (script == NULL && (script = fopen(output_path, "w")) == NULL)
+ 					pg_log(ctx, PG_FATAL, "Could not create necessary file:  %s\n", output_path);
+ 			if (!db_used)
+ 			{
+ 				fprintf(script, "Database:  %s\n", active_db->db_name);
+ 				db_used = true;
+ 			}
+ 			fprintf(script, "  %s.%s\n",
+ 					PQgetvalue(res, rowno, i_nspname),
+ 					PQgetvalue(res, rowno, i_proname));
+ 		}
+ 
+ 		PQclear(res);
+ 
+ 		PQfinish(conn);
+ 	}
+ 
+ 	if (found)
+ 	{
+ 		fclose(script);
+ 		pg_log(ctx, PG_REPORT, "fatal\n");
+ 		pg_log(ctx, PG_FATAL,
+ 				"| Your installation uses \"/contrib/isn\" functions\n"
+ 				"| which rely on the bigint data type.  Your old and new\n"
+ 				"| clusters pass bigint values differently so this cluster\n"
+ 				"| cannot currently be upgraded.  You can manually migrate\n"
+ 				"| data that use \"/contrib/isn\" facilities and remove\n"
+ 				"| \"/contrib/isn\" from the old cluster and restart the\n"
+ 				"| migration.  A list of the problem functions is in the\n"
+ 				"| file \"%s\".\n\n", output_path);
+ 	}
+ 	else
+ 		check_ok(ctx);
+ }
+ 
+ 
+ /*
   * v8_3_rebuild_tsvector_tables()
   *
   * 8.3 sorts lexemes by its length and if lengths are the same then it uses