SSI patch renumbered existing 2PC resource managers??

Started by Tom Laneover 14 years ago13 messages
#1Tom Lane
tgl@sss.pgh.pa.us

So I finally started actually reading the SSI changes, and I am a tad
distressed by this:

diff --git a/src/include/access/twophase_rmgr.h b/src/include/access/twophase_rmgr.h
index a541d0f..1c7d8bb 100644
--- a/src/include/access/twophase_rmgr.h
+++ b/src/include/access/twophase_rmgr.h
@@ -23,8 +23,9 @@ typedef uint8 TwoPhaseRmgrId;
  */
 #define TWOPHASE_RM_END_ID			0
 #define TWOPHASE_RM_LOCK_ID			1
-#define TWOPHASE_RM_PGSTAT_ID		2
-#define TWOPHASE_RM_MULTIXACT_ID	3
+#define TWOPHASE_RM_PREDICATELOCK_ID	2
+#define TWOPHASE_RM_PGSTAT_ID		3
+#define TWOPHASE_RM_MULTIXACT_ID	4
 #define TWOPHASE_RM_MAX_ID			TWOPHASE_RM_MULTIXACT_ID

extern const TwoPhaseCallback twophase_recover_callbacks[];

What was the rationale for changing the assignments of existing 2PC IDs?
So far as I can tell, that breaks pg_upgrade (if there are any open
prepared transactions) for no redeeming social benefit. Is there a
reason why TWOPHASE_RM_PREDICATELOCK_ID has to be 2 and not at the end?

regards, tom lane

#2Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tom Lane (#1)
Re: SSI patch renumbered existing 2PC resource managers??

Tom Lane <tgl@sss.pgh.pa.us> wrote:

What was the rationale for changing the assignments of existing
2PC IDs? So far as I can tell, that breaks pg_upgrade (if there
are any open prepared transactions) for no redeeming social
benefit. Is there a reason why TWOPHASE_RM_PREDICATELOCK_ID has
to be 2 and not at the end?

I'm sure that Dan will jump in if this guess is wrong, but since the
predicate lock code is largely derived from the heavyweight locking
code, it probably seemed to have a minor cosmetic benefit to put it
adjacent to that. It didn't occur to me when the SSI 2PC code went
in, but I can see the problem now that you point it out. The new
entry should be moved to the end for compatibility. Would you like
me to submit a patch to fix this, or should I stay out of your way?

-Kevin

#3Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#1)
Re: SSI patch renumbered existing 2PC resource managers??

On 13.06.2011 21:31, Tom Lane wrote:

So I finally started actually reading the SSI changes, and I am a tad
distressed by this:

diff --git a/src/include/access/twophase_rmgr.h b/src/include/access/twophase_rmgr.h
index a541d0f..1c7d8bb 100644
--- a/src/include/access/twophase_rmgr.h
+++ b/src/include/access/twophase_rmgr.h
@@ -23,8 +23,9 @@ typedef uint8 TwoPhaseRmgrId;
*/
#define TWOPHASE_RM_END_ID			0
#define TWOPHASE_RM_LOCK_ID			1
-#define TWOPHASE_RM_PGSTAT_ID		2
-#define TWOPHASE_RM_MULTIXACT_ID	3
+#define TWOPHASE_RM_PREDICATELOCK_ID	2
+#define TWOPHASE_RM_PGSTAT_ID		3
+#define TWOPHASE_RM_MULTIXACT_ID	4
#define TWOPHASE_RM_MAX_ID			TWOPHASE_RM_MULTIXACT_ID

extern const TwoPhaseCallback twophase_recover_callbacks[];

What was the rationale for changing the assignments of existing 2PC IDs?

As far as I can tell it was for purely cosmetic reasons, to have lock
and predicate lock lines together.

So far as I can tell, that breaks pg_upgrade (if there are any open
prepared transactions) for no redeeming social benefit.

Surely pg_upgrade can't work anyway if there's any open prepared
transactions in the database. We're not going to guarantee to keep all
the data structures we write in two-phase state files unchanged over
major releases. If pg_upgrade is not checking for prepared transcations
at the moment, such a check should probably should be added.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#4Dan Ports
drkp@csail.mit.edu
In reply to: Heikki Linnakangas (#3)
Re: SSI patch renumbered existing 2PC resource managers??

On Mon, Jun 13, 2011 at 10:22:19PM +0300, Heikki Linnakangas wrote:

As far as I can tell it was for purely cosmetic reasons, to have lock
and predicate lock lines together.

Yes, that is the only reason.

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#3)
Re: SSI patch renumbered existing 2PC resource managers??

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 13.06.2011 21:31, Tom Lane wrote:

So far as I can tell, that breaks pg_upgrade (if there are any open
prepared transactions) for no redeeming social benefit.

Surely pg_upgrade can't work anyway if there's any open prepared
transactions in the database. We're not going to guarantee to keep all
the data structures we write in two-phase state files unchanged over
major releases. If pg_upgrade is not checking for prepared transcations
at the moment, such a check should probably should be added.

No, pg_upgrade should not be unilaterally refusing that. The correct
way to deal with this consideration is to change the TWOPHASE_MAGIC
number when we make a change in on-disk 2PC state. Which wasn't done
in the SSI patch. We can either change that now, or undo the
unnecessary change in existing RM IDs. I vote for the latter.

regards, tom lane

#6Dan Ports
drkp@csail.mit.edu
In reply to: Tom Lane (#5)
Re: SSI patch renumbered existing 2PC resource managers??

On Mon, Jun 13, 2011 at 03:33:24PM -0400, Tom Lane wrote:

We can either change that now, or undo the
unnecessary change in existing RM IDs. I vote for the latter.

Sounds good to me. I'd offer a patch, but it'd probably take you longer
to apply than to make the change yourself.

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

#7Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#5)
Re: SSI patch renumbered existing 2PC resource managers??

Tom Lane wrote:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 13.06.2011 21:31, Tom Lane wrote:

So far as I can tell, that breaks pg_upgrade (if there are any open
prepared transactions) for no redeeming social benefit.

Surely pg_upgrade can't work anyway if there's any open prepared
transactions in the database. We're not going to guarantee to keep all
the data structures we write in two-phase state files unchanged over
major releases. If pg_upgrade is not checking for prepared transcations
at the moment, such a check should probably should be added.

No, pg_upgrade should not be unilaterally refusing that. The correct
way to deal with this consideration is to change the TWOPHASE_MAGIC
number when we make a change in on-disk 2PC state. Which wasn't done
in the SSI patch. We can either change that now, or undo the
unnecessary change in existing RM IDs. I vote for the latter.

Uh, isn't there some physical files in pg_twophase/ that stick around to
keep prepared transactions --- if so, pg_upgrade does not copy them from
the old cluster to the new one. I am also hesistant to do so because
there might be data in there that isn't portable. I like the idea of
adding a check, I assume by reading pg_prepared_xact().

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#7)
Re: SSI patch renumbered existing 2PC resource managers??

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

No, pg_upgrade should not be unilaterally refusing that.

Uh, isn't there some physical files in pg_twophase/ that stick around to
keep prepared transactions --- if so, pg_upgrade does not copy them from
the old cluster to the new one. I am also hesistant to do so because
there might be data in there that isn't portable.

This argument seems a tad peculiar, since the *entire* *point* of
pg_upgrade is to push physical files from one installation into another
even though compatibility isn't guaranteed. It is the program's duty to
understand enough to know whether it can transport the cluster's state
safely. Not to arbitrarily discard state because it might possibly not
be transportable.

regards, tom lane

#9Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#5)
Re: SSI patch renumbered existing 2PC resource managers??

On 13.06.2011 22:33, Tom Lane wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

On 13.06.2011 21:31, Tom Lane wrote:

So far as I can tell, that breaks pg_upgrade (if there are any open
prepared transactions) for no redeeming social benefit.

Surely pg_upgrade can't work anyway if there's any open prepared
transactions in the database. We're not going to guarantee to keep all
the data structures we write in two-phase state files unchanged over
major releases. If pg_upgrade is not checking for prepared transcations
at the moment, such a check should probably should be added.

No, pg_upgrade should not be unilaterally refusing that. The correct
way to deal with this consideration is to change the TWOPHASE_MAGIC
number when we make a change in on-disk 2PC state. Which wasn't done
in the SSI patch. We can either change that now, or undo the
unnecessary change in existing RM IDs. I vote for the latter.

Ok, I've renumbered the existing RMs back the way they were.

I nevertheless don't think it's worthwhile to try to migrate 2pc state
files in pg_upgrade. More than likely, it's a mistake on part of the
admin anyway if there is a prepared transaction open at that point.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#10Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Heikki Linnakangas (#9)
1 attachment(s)
Re: SSI patch renumbered existing 2PC resource managers??

Heikki Linnakangas wrote:

Ok, I've renumbered the existing RMs back the way they were.

Don't we also need something like the attached?

-Kevin

Attachments:

ssi-twophase-c.patchapplication/octet-stream; name=ssi-twophase-c.patchDownload
*** a/src/backend/access/transam/twophase_rmgr.c
--- b/src/backend/access/transam/twophase_rmgr.c
***************
*** 25,58 **** const TwoPhaseCallback twophase_recover_callbacks[TWOPHASE_RM_MAX_ID + 1] =
  {
  	NULL,						/* END ID */
  	lock_twophase_recover,		/* Lock */
- 	predicatelock_twophase_recover,		/* PredicateLock */
  	NULL,						/* pgstat */
! 	multixact_twophase_recover	/* MultiXact */
  };
  
  const TwoPhaseCallback twophase_postcommit_callbacks[TWOPHASE_RM_MAX_ID + 1] =
  {
  	NULL,						/* END ID */
  	lock_twophase_postcommit,	/* Lock */
- 	NULL,						/* PredicateLock */
  	pgstat_twophase_postcommit, /* pgstat */
! 	multixact_twophase_postcommit		/* MultiXact */
  };
  
  const TwoPhaseCallback twophase_postabort_callbacks[TWOPHASE_RM_MAX_ID + 1] =
  {
  	NULL,						/* END ID */
  	lock_twophase_postabort,	/* Lock */
- 	NULL,						/* PredicateLock */
  	pgstat_twophase_postabort,	/* pgstat */
! 	multixact_twophase_postabort	/* MultiXact */
  };
  
  const TwoPhaseCallback twophase_standby_recover_callbacks[TWOPHASE_RM_MAX_ID + 1] =
  {
  	NULL,						/* END ID */
  	lock_twophase_standby_recover,		/* Lock */
- 	NULL,						/* PredicateLock */
  	NULL,						/* pgstat */
! 	NULL						/* MultiXact */
  };
--- 25,58 ----
  {
  	NULL,						/* END ID */
  	lock_twophase_recover,		/* Lock */
  	NULL,						/* pgstat */
! 	multixact_twophase_recover, /* MultiXact */
! 	predicatelock_twophase_recover		/* PredicateLock */
  };
  
  const TwoPhaseCallback twophase_postcommit_callbacks[TWOPHASE_RM_MAX_ID + 1] =
  {
  	NULL,						/* END ID */
  	lock_twophase_postcommit,	/* Lock */
  	pgstat_twophase_postcommit, /* pgstat */
! 	multixact_twophase_postcommit,		/* MultiXact */
! 	NULL						/* PredicateLock */
  };
  
  const TwoPhaseCallback twophase_postabort_callbacks[TWOPHASE_RM_MAX_ID + 1] =
  {
  	NULL,						/* END ID */
  	lock_twophase_postabort,	/* Lock */
  	pgstat_twophase_postabort,	/* pgstat */
! 	multixact_twophase_postabort,		/* MultiXact */
! 	NULL						/* PredicateLock */
  };
  
  const TwoPhaseCallback twophase_standby_recover_callbacks[TWOPHASE_RM_MAX_ID + 1] =
  {
  	NULL,						/* END ID */
  	lock_twophase_standby_recover,		/* Lock */
  	NULL,						/* pgstat */
! 	NULL,						/* MultiXact */
! 	NULL						/* PredicateLock */
  };
#11Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Kevin Grittner (#10)
Re: SSI patch renumbered existing 2PC resource managers??

On 14.06.2011 15:14, Kevin Grittner wrote:

Heikki Linnakangas wrote:

Ok, I've renumbered the existing RMs back the way they were.

Don't we also need something like the attached?

Yes. I just committed a fix for that after noticing that the buildfarm
didn't like it. Sorry..

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#12Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#8)
Re: SSI patch renumbered existing 2PC resource managers??

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

No, pg_upgrade should not be unilaterally refusing that.

Uh, isn't there some physical files in pg_twophase/ that stick around to
keep prepared transactions --- if so, pg_upgrade does not copy them from
the old cluster to the new one. I am also hesistant to do so because
there might be data in there that isn't portable.

This argument seems a tad peculiar, since the *entire* *point* of
pg_upgrade is to push physical files from one installation into another
even though compatibility isn't guaranteed. It is the program's duty to
understand enough to know whether it can transport the cluster's state
safely. Not to arbitrarily discard state because it might possibly not
be transportable.

Well, pg_upgrade succeeds because it does as little as necessary to do
the migration, relying on pg_dump to do much of the migration work at
the catalog level. pg_upgrade tries to be involved as little as
possible with the Postgres code so it doesn't have to be changed
regularly between major versions.

The prepared transaction case seems ugly enough that we don't want
pg_upgrade to have to check every major release if anything changed
about the data stored in prepared transactions. This is the same reason
pg_upgrade doesn't transfer WAL files from the old cluster, just pg_clog
files (which rarely changes its format).

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#13Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#12)
1 attachment(s)
Re: SSI patch renumbered existing 2PC resource managers??

Bruce Momjian wrote:

This argument seems a tad peculiar, since the *entire* *point* of
pg_upgrade is to push physical files from one installation into another
even though compatibility isn't guaranteed. It is the program's duty to
understand enough to know whether it can transport the cluster's state
safely. Not to arbitrarily discard state because it might possibly not
be transportable.

Well, pg_upgrade succeeds because it does as little as necessary to do
the migration, relying on pg_dump to do much of the migration work at
the catalog level. pg_upgrade tries to be involved as little as
possible with the Postgres code so it doesn't have to be changed
regularly between major versions.

The prepared transaction case seems ugly enough that we don't want
pg_upgrade to have to check every major release if anything changed
about the data stored in prepared transactions. This is the same reason
pg_upgrade doesn't transfer WAL files from the old cluster, just pg_clog
files (which rarely changes its format).

I have applied the attached pg_upgrade patch to head and 9.1 to fail if
prepared transactions are in the old or new cluster.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachments:

/rtmp/pg_upgradetext/x-diffDownload
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index fdec6e3..376d25a
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*************** static void check_old_cluster_has_new_cl
*** 16,21 ****
--- 16,22 ----
  static void check_locale_and_encoding(ControlData *oldctrl,
  						  ControlData *newctrl);
  static void check_is_super_user(ClusterInfo *cluster);
+ static void check_for_prepared_transactions(ClusterInfo *cluster);
  static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
  static void check_for_reg_data_type_usage(ClusterInfo *cluster);
  
*************** check_old_cluster(bool live_check,
*** 65,70 ****
--- 66,72 ----
  	 * Check for various failure cases
  	 */
  	check_is_super_user(&old_cluster);
+ 	check_for_prepared_transactions(&old_cluster);
  	check_for_reg_data_type_usage(&old_cluster);
  	check_for_isn_and_int8_passing_mismatch(&old_cluster);
  
*************** check_new_cluster(void)
*** 117,122 ****
--- 119,125 ----
  	get_db_and_rel_infos(&new_cluster);
  
  	check_new_cluster_is_empty();
+ 	check_for_prepared_transactions(&new_cluster);
  	check_old_cluster_has_new_cluster_dbs();
  
  	check_loadable_libraries();
*************** check_is_super_user(ClusterInfo *cluster
*** 507,512 ****
--- 510,545 ----
  
  
  /*
+  *	check_for_prepared_transactions()
+  *
+  *	Make sure there are no prepared transactions because the storage format
+  *	might have changed.
+  */
+ static void
+ check_for_prepared_transactions(ClusterInfo *cluster)
+ {
+ 	PGresult   *res;
+ 	PGconn	   *conn = connectToServer(cluster, "template1");
+ 
+ 	prep_status("Checking for prepared transactions");
+ 
+ 	res = executeQueryOrDie(conn,
+ 							"SELECT * "
+ 							"FROM pg_catalog.pg_prepared_xact()");
+ 
+ 	if (PQntuples(res) != 0)
+ 		pg_log(PG_FATAL, "The %s cluster contains prepared transactions\n",
+ 			   CLUSTER_NAME(cluster));
+ 
+ 	PQclear(res);
+ 
+ 	PQfinish(conn);
+ 
+ 	check_ok();
+ }
+ 
+ 
+ /*
   *	check_for_isn_and_int8_passing_mismatch()
   *
   *	contrib/isn relies on data type int8, and in 8.4 int8 can now be passed