large object does not exist after pg_migrator

Started by Jamie Fox over 16 years ago; 17 messages
#1 Jamie Fox
jfox@directcommerce.com

Hi -
After what seemed to be a normal successful pg_migrator migration from 8.3.7
to 8.4.0, in either link or copy mode, vacuumlo fails on both our production
and qa databases:

Jul 1 11:17:03 db2 postgres[9321]: [14-1] LOG: duration: 175.563 ms
statement: DELETE FROM vacuum_l WHERE lo IN (SELECT "xml_data" FROM
"public"."xml_user")
Jul 1 11:17:03 db2 postgres[9321]: [15-1] ERROR: large object 17919608
does not exist
Jul 1 11:17:03 db2 postgres[9321]: [16-1] ERROR: current transaction is
aborted, commands ignored until end of transaction block

I migrated our qa database using pg_dump/pg_restore and vacuumlo has no
problem with it. When I try querying the two databases for large objects
manually I see the same error in the one that was migrated with pg_migrator:

select loread(lo_open(xml_data,262144),1073741819) from xml_user where id =
'10837246';
ERROR: large object 24696063 does not exist
SQL state: 42704
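
(In that call, 262144 is 0x40000, the INV_READ flag from libpq/libpq-fs.h, and
1073741819 is just a very large "read the whole object" byte count.)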

I can also see that the pg_largeobject table is different, in the pg_restore
version the Rows (estimated) is 316286 and Rows (counted) is the same, in
the pg_migrator version the Rows (counted) is only 180507.
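
For reference, the "Rows (estimated)" and "Rows (counted)" numbers above come
from our admin GUI; roughly the same comparison by hand would be:

select reltuples from pg_class where relname = 'pg_largeobject'; -- estimated
select count(*) from pg_largeobject;                             -- counted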

Any advice on what I might look for to try and track down this problem?
pg_restore on our production database takes too long so it would be really
nice to use pg_migrator instead.

Thanks,

Jamie

#2 Jamie Fox
jfox@directcommerce.com
In reply to: Jamie Fox (#1)
Re: large object does not exist after pg_migrator

Hi -
This is probably more helpful - the pg_largeobject table only changed after
vacuumlo, not before. When comparing pre- and post- pg_migrator databases
(no vacuum or vacuumlo):

select * from pg_largeobject where loid = '24696063';

in the pre- there are three rows, having pageno 0 through 3, in the post-
database there are no results.
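
Each large object is stored in pg_largeobject as a sequence of pages of up to
LOBLKSIZE bytes (2KB by default), so a per-page view of a single object is just:

select loid, pageno, length(data) from pg_largeobject
where loid = 24696063 order by pageno;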

Thanks for any advice,

Jamie

On Mon, Jul 13, 2009 at 2:13 PM, Jamie Fox <jfox@directcommerce.com> wrote:

Hi -
After what seemed to be a normal successful pg_migrator migration from
8.3.7 to 8.4.0, in either link or copy mode, vacuumlo fails on both our
production and qa databases:

Jul 1 11:17:03 db2 postgres[9321]: [14-1] LOG: duration: 175.563 ms
statement: DELETE FROM vacuum_l WHERE lo IN (SELECT "xml_data" FROM
"public"."xml_user")
Jul 1 11:17:03 db2 postgres[9321]: [15-1] ERROR: large object 17919608
does not exist
Jul 1 11:17:03 db2 postgres[9321]: [16-1] ERROR: current transaction is
aborted, commands ignored until end of transaction block

I migrated our qa database using pg_dump/pg_restore and vacuumlo has no
problem with it. When I try querying the two databases for large objects
manually I see the same error in the one that was migrated with pg_migrator:

select loread(lo_open(xml_data,262144),1073741819) from xml_user where id =
'10837246';
ERROR: large object 24696063 does not exist
SQL state: 42704

I can also see that the pg_largeobject table is different, in the
pg_restore version the Rows (estimated) is 316286 and Rows (counted) is the
same, in the pg_migrator version the Rows (counted) is only 180507.

Any advice on what I might look for to try and track down this problem?
pg_restore on our production database takes too long so it would be really
nice to use pg_migrator instead.

Thanks,

Jamie

#3 Bruce Momjian
bruce@momjian.us
In reply to: Jamie Fox (#1)
Re: [GENERAL] large object does not exist after pg_migrator

Jamie Fox wrote:

Hi -
After what seemed to be a normal successful pg_migrator migration from 8.3.7
to 8.4.0, in either link or copy mode, vacuumlo fails on both our production
and qa databases:

Jul 1 11:17:03 db2 postgres[9321]: [14-1] LOG: duration: 175.563 ms
statement: DELETE FROM vacuum_l WHERE lo IN (SELECT "xml_data" FROM
"public"."xml_user")
Jul 1 11:17:03 db2 postgres[9321]: [15-1] ERROR: large object 17919608
does not exist
Jul 1 11:17:03 db2 postgres[9321]: [16-1] ERROR: current transaction is
aborted, commands ignored until end of transaction block

I migrated our qa database using pg_dump/pg_restore and vacuumlo has no
problem with it. When I try querying the two databases for large objects
manually I see the same error in the one that was migrated with pg_migrator:

select loread(lo_open(xml_data,262144),1073741819) from xml_user where id =
'10837246';
ERROR: large object 24696063 does not exist
SQL state: 42704

I can also see that the pg_largeobject table is different, in the pg_restore
version the Rows (estimated) is 316286 and Rows (counted) is the same, in
the pg_migrator version the Rows (counted) is only 180507.

Any advice on what I might look for to try and track down this problem?
pg_restore on our production database takes too long so it would be really
nice to use pg_migrator instead.

[ Email moved to hackers list.]

Wow, I didn't test large objects specifically, and I am confused why
there would be a count discrepancy. I will need to do some research
unless someone else can guess about the cause.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#4 Bruce Momjian
bruce@momjian.us
In reply to: Jamie Fox (#2)
Re: [GENERAL] large object does not exist after pg_migrator

Forwarded to hackers.

---------------------------------------------------------------------------

Jamie Fox wrote:

Hi -
This is probably more helpful - the pg_largeobject table only changed after
vacuumlo, not before. When comparing pre- and post- pg_migrator databases
(no vacuum or vacuumlo):

select * from pg_largeobject where loid = '24696063';

in the pre- there are three rows, having pageno 0 through 3, in the post-
database there are no results.

Thanks for any advice,

Jamie

On Mon, Jul 13, 2009 at 2:13 PM, Jamie Fox <jfox@directcommerce.com> wrote:

Hi -
After what seemed to be a normal successful pg_migrator migration from
8.3.7 to 8.4.0, in either link or copy mode, vacuumlo fails on both our
production and qa databases:

Jul 1 11:17:03 db2 postgres[9321]: [14-1] LOG: duration: 175.563 ms
statement: DELETE FROM vacuum_l WHERE lo IN (SELECT "xml_data" FROM
"public"."xml_user")
Jul 1 11:17:03 db2 postgres[9321]: [15-1] ERROR: large object 17919608
does not exist
Jul 1 11:17:03 db2 postgres[9321]: [16-1] ERROR: current transaction is
aborted, commands ignored until end of transaction block

I migrated our qa database using pg_dump/pg_restore and vacuumlo has no
problem with it. When I try querying the two databases for large objects
manually I see the same error in the one that was migrated with pg_migrator:

select loread(lo_open(xml_data,262144),1073741819) from xml_user where id =
'10837246';
ERROR: large object 24696063 does not exist
SQL state: 42704

I can also see that the pg_largeobject table is different, in the
pg_restore version the Rows (estimated) is 316286 and Rows (counted) is the
same, in the pg_migrator version the Rows (counted) is only 180507.

Any advice on what I might look for to try and track down this problem?
pg_restore on our production database takes too long so it would be really
nice to use pg_migrator instead.

Thanks,

Jamie

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#5 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Bruce Momjian (#3)
Re: [GENERAL] large object does not exist after pg_migrator

Bruce Momjian wrote:

Jamie Fox wrote:

I can also see that the pg_largeobject table is different, in the pg_restore
version the Rows (estimated) is 316286 and Rows (counted) is the same, in
the pg_migrator version the Rows (counted) is only 180507.

Wow, I didn't test large objects specifically, and I am confused why
there would be a count discrepancy. I will need to do some research
unless someone else can guess about the cause.

Maybe pg_largeobject is not getting frozen?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#6 Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#5)
Re: [GENERAL] large object does not exist after pg_migrator

Alvaro Herrera wrote:

Bruce Momjian wrote:

Jamie Fox wrote:

I can also see that the pg_largeobject table is different, in the pg_restore
version the Rows (estimated) is 316286 and Rows (counted) is the same, in
the pg_migrator version the Rows (counted) is only 180507.

Wow, I didn't test large objects specifically, and I am confused why
there would be a count discrepancy. I will need to do some research
unless someone else can guess about the cause.

Maybe pg_largeobject is not getting frozen?

That would explain the change in count, but I thought we froze
_everything_, and had to.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#7 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Bruce Momjian (#6)
Re: [GENERAL] large object does not exist after pg_migrator

Bruce Momjian wrote:

Alvaro Herrera wrote:

Bruce Momjian wrote:

Jamie Fox wrote:

I can also see that the pg_largeobject table is different, in the pg_restore
version the Rows (estimated) is 316286 and Rows (counted) is the same, in
the pg_migrator version the Rows (counted) is only 180507.

Wow, I didn't test large objects specifically, and I am confused why
there would be a count discrepancy. I will need to do some research
unless someone else can guess about the cause.

Maybe pg_largeobject is not getting frozen?

That would explain the change in count, but I thought we froze
_everything_, and had to.

After a quick chat with Bruce it was determined that we don't freeze
anything (it would be horrid for downtime if we did so in pg_migrator;
and it would be useless if run anywhere else). What we do is migrate
pg_clog from the old cluster to the new. So never mind that hypothesis.

Bruce noticed that the pg_dump/pg_migrator combo is failing to restore
pg_largeobject's relfrozenxid. We're not sure how this is causing the
errors Jamie is seeing, because what I think should happen is that scans
of the table should fail with errors opening pg_clog files
such-and-such, but not missing tuples ...
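
For anyone following along, the value in question is easy to inspect:

select relname, relfrozenxid from pg_catalog.pg_class
where relname = 'pg_largeobject';
select datname, datfrozenxid from pg_catalog.pg_database;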

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#8 Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#7)
Re: [GENERAL] large object does not exist after pg_migrator

Alvaro Herrera wrote:

Bruce Momjian wrote:

Alvaro Herrera wrote:

Bruce Momjian wrote:

Jamie Fox wrote:

I can also see that the pg_largeobject table is different, in the pg_restore
version the Rows (estimated) is 316286 and Rows (counted) is the same, in
the pg_migrator version the Rows (counted) is only 180507.

Wow, I didn't test large objects specifically, and I am confused why
there would be a count discrepancy. I will need to do some research
unless someone else can guess about the cause.

Maybe pg_largeobject is not getting frozen?

That would explain the change in count, but I thought we froze
_everything_, and had to.

After a quick chat with Bruce it was determined that we don't freeze
anything (it would be horrid for downtime if we did so in pg_migrator;
and it would be useless if run anywhere else). What we do is migrate
pg_clog from the old cluster to the new. So never mind that hypothesis.

FYI, we do freeze the new cluster that has only schema definitions, no
data.

Bruce noticed that the pg_dump/pg_migrator combo is failing to restore
pg_largeobject's relfrozenxid. We're not sure how this is causing the
errors Jamie is seeing, because what I think should happen is that scans
of the table should fail with errors opening pg_clog files
such-and-such, but not missing tuples ...

Yea, I can fix that in PG 8.4.1, but that doesn't seem like the cause of
the missing rows. Alvaro and I are still investigating.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#9 Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#7)
1 attachment(s)
Re: [GENERAL] large object does not exist after pg_migrator

Alvaro Herrera wrote:

Bruce Momjian wrote:

Alvaro Herrera wrote:

Bruce Momjian wrote:

Jamie Fox wrote:

I can also see that the pg_largeobject table is different, in the pg_restore
version the Rows (estimated) is 316286 and Rows (counted) is the same, in
the pg_migrator version the Rows (counted) is only 180507.

Wow, I didn't test large objects specifically, and I am confused why
there would be a count discrepancy. I will need to do some research
unless someone else can guess about the cause.

Maybe pg_largeobject is not getting frozen?

That would explain the change in count, but I thought we froze
_everything_, and had to.

After a quick chat with Bruce it was determined that we don't freeze
anything (it would be horrid for downtime if we did so in pg_migrator;
and it would be useless if run anywhere else). What we do is migrate
pg_clog from the old cluster to the new. So never mind that hypothesis.

Bruce noticed that the pg_dump/pg_migrator combo is failing to restore
pg_largeobject's relfrozenxid. We're not sure how this is causing the
errors Jamie is seeing, because what I think should happen is that scans
of the table should fail with errors opening pg_clog files
such-and-such, but not missing tuples ...

Jamie, is it possible for you to apply the attached patch to the 8.4
server, install the new pg_dump, and run the test again to see if
pg_largeobject is fixed? This patch properly sets the relfrozenxid in
the system tables for each database.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Attachments:

/pgpatches/largeobject (text/x-diff)
Index: src/bin/pg_dump/pg_dump.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_dump.c,v
retrieving revision 1.540
diff -c -c -r1.540 pg_dump.c
*** src/bin/pg_dump/pg_dump.c	2 Jul 2009 21:34:32 -0000	1.540
--- src/bin/pg_dump/pg_dump.c	14 Jul 2009 02:58:32 -0000
***************
*** 34,39 ****
--- 34,40 ----
  #include "access/sysattr.h"
  #include "catalog/pg_cast.h"
  #include "catalog/pg_class.h"
+ #include "catalog/pg_largeobject.h"
  #include "catalog/pg_proc.h"
  #include "catalog/pg_trigger.h"
  #include "catalog/pg_type.h"
***************
*** 1732,1737 ****
--- 1733,1743 ----
  
  	if (binary_upgrade)
  	{
+ 		PGresult   *lo_res;
+ 		PQExpBuffer loFrozenQry = createPQExpBuffer();
+ 		PQExpBuffer loOutQry = createPQExpBuffer();
+ 		int			i_relfrozenxid;
+ 		
  		appendPQExpBuffer(creaQry, "\n-- For binary upgrade, set datfrozenxid.\n");
  		appendPQExpBuffer(creaQry, "UPDATE pg_catalog.pg_database\n"
  						  "SET datfrozenxid = '%u'\n"
***************
*** 1739,1744 ****
--- 1745,1788 ----
  						  frozenxid);
  		appendStringLiteralAH(creaQry, datname, AH);
  		appendPQExpBuffer(creaQry, ";\n");
+ 
+ 		/*
+ 		 *	pg_largeobject comes from the old system intact, so set
+ 		 *	its relfrozenxid.
+ 		 */
+ 
+ 		appendPQExpBuffer(loFrozenQry, "SELECT relfrozenxid\n"
+ 							"FROM pg_catalog.pg_class\n"
+ 							"WHERE oid = %d;\n",
+ 							LargeObjectRelationId);
+ 
+ 		lo_res = PQexec(g_conn, loFrozenQry->data);
+ 		check_sql_result(lo_res, g_conn, loFrozenQry->data, PGRES_TUPLES_OK);
+ 
+ 		if (PQntuples(lo_res) != 1)
+ 		{
+ 			write_msg(NULL, "dumpDatabase(): could not find pg_largeobject.relfrozenxid\n");
+ 			exit_nicely();
+ 		}
+ 
+ 		i_relfrozenxid = PQfnumber(lo_res, "relfrozenxid");
+ 
+ 		appendPQExpBuffer(loOutQry, "\n-- For binary upgrade, set pg_largeobject relfrozenxid.\n");
+ 		appendPQExpBuffer(loOutQry, "UPDATE pg_catalog.pg_class\n"
+ 						  "SET relfrozenxid = '%u'\n"
+ 						  "WHERE oid = %d;\n",
+ 						  atoi(PQgetvalue(lo_res, 0, i_relfrozenxid)),
+ 						  LargeObjectRelationId);
+ 		ArchiveEntry(AH, nilCatalogId, createDumpId(),
+ 					 "pg_largeobject", NULL, NULL, "",
+ 					 false, "pg_largeobject", SECTION_PRE_DATA,
+ 					 loOutQry->data, "", NULL,
+ 					 NULL, 0,
+ 					 NULL, NULL);
+ 						  
+ 		PQclear(lo_res);
+ 		destroyPQExpBuffer(loFrozenQry);
+ 		destroyPQExpBuffer(loOutQry);
  	}
  
  	appendPQExpBuffer(delQry, "DROP DATABASE %s;\n",
#10 Jamie Fox
jfox@directcommerce.com
In reply to: Bruce Momjian (#9)
Re: [GENERAL] large object does not exist after pg_migrator

On Mon, Jul 13, 2009 at 8:03 PM, Bruce Momjian <bruce@momjian.us> wrote:

Alvaro Herrera wrote:

Bruce Momjian wrote:

Alvaro Herrera wrote:

Bruce Momjian wrote:

Jamie Fox wrote:

I can also see that the pg_largeobject table is different, in the
pg_restore version the Rows (estimated) is 316286 and Rows (counted) is the
same, in the pg_migrator version the Rows (counted) is only 180507.

Wow, I didn't test large objects specifically, and I am confused why
there would be a count discrepancy. I will need to do some research
unless someone else can guess about the cause.

Maybe pg_largeobject is not getting frozen?

That would explain the change in count, but I thought we froze
_everything_, and had to.

After a quick chat with Bruce it was determined that we don't freeze
anything (it would be horrid for downtime if we did so in pg_migrator;
and it would be useless if run anywhere else). What we do is migrate
pg_clog from the old cluster to the new. So never mind that hypothesis.

Bruce noticed that the pg_dump/pg_migrator combo is failing to restore
pg_largeobject's relfrozenxid. We're not sure how this is causing the
errors Jamie is seeing, because what I think should happen is that scans
of the table should fail with errors opening pg_clog files
such-and-such, but not missing tuples ...

Jamie, is it possible for you to apply the attached patch to the 8.4
server, install the new pg_dump, and run the test again to see if
pg_largeobject is fixed? This patch properly sets the relfrozenxid in
the system tables for each database.

Sorry for the confusion, an addendum meant to be helpful fell out of this
thread during the move from -general. I will try this patch now, but to be
clear - the number of rows in pg_largeobject changed after I ran vacuumlo
(that eventually failed).

Here's what I have found that got broken during pg_migrator: In two side by
side databases (an 8.3.7 copy and 8.4.0 migrated with pg_migrator) the
pg_largeobject table has the same number of rows. However, in the 8.4
database any select for an loid in pg_largeobject returns zero rows. If I
select all loids to a file, and compare to select all loids from 8.3.7
they're the same. When I select != an loid it seems to exclude the one and
return the rest, but all other comparisons <, > or = return zero rows. Or
I'm completely batty. Dereferencing via lo_open of blob_data (an oid) in
other tables fails in the 8.4 database with 'large object xxxxid does not
exist'.
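
A sketch of the comparison I mean (24696063 is one loid I know exists), first
forcing a sequential scan and then letting the planner use the index:

set enable_indexscan = off;
set enable_bitmapscan = off;
select loid, pageno from pg_largeobject where loid = 24696063; -- returns rows
reset enable_indexscan;
reset enable_bitmapscan;
select loid, pageno from pg_largeobject where loid = 24696063; -- returns nothing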

Thanks again,

Jamie

#11 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Jamie Fox (#10)
Re: [GENERAL] large object does not exist after pg_migrator

Jamie Fox wrote:

Here's what I have found that got broken during pg_migrator: In two side by
side databases (an 8.3.7 copy and 8.4.0 migrated with pg_migrator) the
pg_largeobject table has the same number of rows. However, in the 8.4
database any select for an loid in pg_largeobject returns zero rows. If I
select all loids to a file, and compare to select all loids from 8.3.7
they're the same. When I select != an loid it seems to exclude the one and
return the rest, but all other comparisons <, > or = return zero rows. Or
I'm completely batty. Dereferencing via lo_open of blob_data (an oid) in
other tables fails in the 8.4 database with 'large object xxxxid does not
exist'.

Oh, so maybe it's pg_largeobject's index that's borked ... Did you try
reindexing it?

How are we transferring pg_largeobject, and are we transferring its
index too?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#12 Jamie Fox
jfox@directcommerce.com
In reply to: Alvaro Herrera (#11)
Re: [GENERAL] large object does not exist after pg_migrator

Here's what I have found that got broken during pg_migrator: In two side by
side databases (an 8.3.7 copy and 8.4.0 migrated with pg_migrator) the
pg_largeobject table has the same number of rows. However, in the 8.4
database any select for an loid in pg_largeobject returns zero rows. If I
select all loids to a file, and compare to select all loids from 8.3.7
they're the same. When I select != an loid it seems to exclude the one and
return the rest, but all other comparisons <, > or = return zero rows. Or
I'm completely batty. Dereferencing via lo_open of blob_data (an oid) in
other tables fails in the 8.4 database with 'large object xxxxid does not
exist'.

Oh, so maybe it's pg_largeobject's index that's borked ... Did you try
reindexing it?

How are we transferring pg_largeobject, and are we transferring its
index too?

Hi -
REINDEX INDEX pg_largeobject_loid_pn_index;

This seems to have fixed the problem, lo_open of lob data is working again -
now to see how vacuumlo likes it.
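
(Which makes sense: REINDEX rebuilds the index from scratch from the heap
contents, so index lookups line up with the table again.)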

Thanks,

Jamie

#13 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Jamie Fox (#12)
Re: [GENERAL] large object does not exist after pg_migrator

Jamie Fox wrote:

Hi -
REINDEX INDEX pg_largeobject_loid_pn_index;

This seems to have fixed the problem, lo_open of lob data is working again -
now to see how vacuumlo likes it.

So did it work?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#14 Jamie Fox
jfox@directcommerce.com
In reply to: Alvaro Herrera (#13)
Re: [GENERAL] large object does not exist after pg_migrator

Worked great, vacuumlo finished, a VACUUM FULL finished amazingly quickly,
very exciting. We're pointing qa apps at it now for testing.
For some reason though, that index has to be rebuilt after running
pg_migrator.

I'll be testing on our 100GB+ prod copy shortly and will let you know if you
want.

Thanks,

Jamie

On Wed, Jul 15, 2009 at 9:28 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Jamie Fox wrote:

Hi -
REINDEX INDEX pg_largeobject_loid_pn_index;

This seems to have fixed the problem, lo_open of lob data is working again -
now to see how vacuumlo likes it.

So did it work?

--
Alvaro Herrera
http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#15 Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#13)
pg_migrator 8.4.1 alpha 1 released with bug mention

To more clearly identify that pg_migrator now has known bugs, I have
released pg_migrator 8.4.1 alpha1, and mentioned in the README that
there are known bugs related to migrating sequences and large objects.
I have removed the 8.4 source file from pgfoundry.

---------------------------------------------------------------------------

Alvaro Herrera wrote:

Jamie Fox wrote:

Hi -
REINDEX INDEX pg_largeobject_loid_pn_index;

This seems to have fixed the problem, lo_open of lob data is working again -
now to see how vacuumlo likes it.

So did it work?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#16 Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#9)
1 attachment(s)
Re: [GENERAL] large object does not exist after pg_migrator

Bruce Momjian wrote:

After a quick chat with Bruce it was determined that we don't freeze
anything (it would be horrid for downtime if we did so in pg_migrator;
and it would be useless if run anywhere else). What we do is migrate
pg_clog from the old cluster to the new. So never mind that hypothesis.

Bruce noticed that the pg_dump/pg_migrator combo is failing to restore
pg_largeobject's relfrozenxid. We're not sure how this is causing the
errors Jamie is seeing, because what I think should happen is that scans
of the table should fail with errors opening pg_clog files
such-and-such, but not missing tuples ...

Jamie, is it possible for you to apply the attached patch to the 8.4
server, install the new pg_dump, and run the test again to see if
pg_largeobject is fixed? This patch properly sets the relfrozenxid in
the system tables for each database.

I have applied the attached patch to have pg_dump restore
pg_largeobject.relfrozenxid in binary upgrade mode; backpatched to
8.4.X.

This doesn't fix the reported problem, but it is still a bug.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Attachments:

/rtmp/diff (text/x-diff)
Index: src/bin/pg_dump/pg_dump.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_dump.c,v
retrieving revision 1.540
diff -c -c -r1.540 pg_dump.c
*** src/bin/pg_dump/pg_dump.c	2 Jul 2009 21:34:32 -0000	1.540
--- src/bin/pg_dump/pg_dump.c	20 Jul 2009 20:51:34 -0000
***************
*** 34,39 ****
--- 34,40 ----
  #include "access/sysattr.h"
  #include "catalog/pg_cast.h"
  #include "catalog/pg_class.h"
+ #include "catalog/pg_largeobject.h"
  #include "catalog/pg_proc.h"
  #include "catalog/pg_trigger.h"
  #include "catalog/pg_type.h"
***************
*** 1739,1744 ****
--- 1740,1746 ----
  						  frozenxid);
  		appendStringLiteralAH(creaQry, datname, AH);
  		appendPQExpBuffer(creaQry, ";\n");
+ 
  	}
  
  	appendPQExpBuffer(delQry, "DROP DATABASE %s;\n",
***************
*** 1764,1769 ****
--- 1766,1816 ----
  				 NULL,			/* Dumper */
  				 NULL);			/* Dumper Arg */
  
+ 	/*
+ 	 *	pg_largeobject comes from the old system intact, so set
+ 	 *	its relfrozenxid.
+ 	 */
+ 	if (binary_upgrade)
+ 	{
+ 		PGresult   *lo_res;
+ 		PQExpBuffer loFrozenQry = createPQExpBuffer();
+ 		PQExpBuffer loOutQry = createPQExpBuffer();
+ 		int			i_relfrozenxid;
+ 		
+ 		appendPQExpBuffer(loFrozenQry, "SELECT relfrozenxid\n"
+ 							"FROM pg_catalog.pg_class\n"
+ 							"WHERE oid = %d;\n",
+ 							LargeObjectRelationId);
+ 
+ 		lo_res = PQexec(g_conn, loFrozenQry->data);
+ 		check_sql_result(lo_res, g_conn, loFrozenQry->data, PGRES_TUPLES_OK);
+ 
+ 		if (PQntuples(lo_res) != 1)
+ 		{
+ 			write_msg(NULL, "dumpDatabase(): could not find pg_largeobject.relfrozenxid\n");
+ 			exit_nicely();
+ 		}
+ 
+ 		i_relfrozenxid = PQfnumber(lo_res, "relfrozenxid");
+ 
+ 		appendPQExpBuffer(loOutQry, "\n-- For binary upgrade, set pg_largeobject relfrozenxid.\n");
+ 		appendPQExpBuffer(loOutQry, "UPDATE pg_catalog.pg_class\n"
+ 						  "SET relfrozenxid = '%u'\n"
+ 						  "WHERE oid = %d;\n",
+ 						  atoi(PQgetvalue(lo_res, 0, i_relfrozenxid)),
+ 						  LargeObjectRelationId);
+ 		ArchiveEntry(AH, nilCatalogId, createDumpId(),
+ 					 "pg_largeobject", NULL, NULL, "",
+ 					 false, "pg_largeobject", SECTION_PRE_DATA,
+ 					 loOutQry->data, "", NULL,
+ 					 NULL, 0,
+ 					 NULL, NULL);
+ 						  
+ 		PQclear(lo_res);
+ 		destroyPQExpBuffer(loFrozenQry);
+ 		destroyPQExpBuffer(loOutQry);
+ 	}
+ 
  	/* Dump DB comment if any */
  	if (g_fout->remoteVersion >= 80200)
  	{
#17 Bruce Momjian
bruce@momjian.us
In reply to: Jamie Fox (#12)
1 attachment(s)
Re: [GENERAL] large object does not exist after pg_migrator

Jamie Fox wrote:

Here's what I have found that got broken during pg_migrator: In two side by
side databases (an 8.3.7 copy and 8.4.0 migrated with pg_migrator) the
pg_largeobject table has the same number of rows. However, in the 8.4
database any select for an loid in pg_largeobject returns zero rows. If I
select all loids to a file, and compare to select all loids from 8.3.7
they're the same. When I select != an loid it seems to exclude the one and
return the rest, but all other comparisons <, > or = return zero rows. Or
I'm completely batty. Dereferencing via lo_open of blob_data (an oid) in
other tables fails in the 8.4 database with 'large object xxxxid does not
exist'.

Oh, so maybe it's pg_largeobject's index that's borked ... Did you try
reindexing it?

How are we transferring pg_largeobject, and are we transferring its
index too?

Hi -
REINDEX INDEX pg_largeobject_loid_pn_index;

This seems to have fixed the problem, lo_open of lob data is working again -
now to see how vacuumlo likes it.

I have applied the attached patch to pg_migrator to properly migrate the
pg_largeobject index. I have added large object comment migration as a
TODO item.

This eliminates the last known bug in pg_migrator.
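
For reference, the relfilenodes that now get transferred for both relations can
be seen with:

select relname, relfilenode from pg_catalog.pg_class
where relname in ('pg_largeobject', 'pg_largeobject_loid_pn_index');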

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Attachments:

/rtmp/diff (text/x-diff)
? tools
? log
? src/pg_migrator
Index: TODO
===================================================================
RCS file: /cvsroot/pg-migrator/pg_migrator/TODO,v
retrieving revision 1.15
diff -c -r1.15 TODO
*** TODO	2 Jun 2009 15:15:38 -0000	1.15
--- TODO	20 Jul 2009 21:55:58 -0000
***************
*** 1,3 ****
--- 1,4 ----
  o  support migration from Postgres 8.2 to 8.4?
  o  create pg_dump custom format for rebuilds so it can be done in parallel
  o  remove copy_dir code, or use it
+ o  handle large object comments
Index: src/pg_migrator.h
===================================================================
RCS file: /cvsroot/pg-migrator/pg_migrator/src/pg_migrator.h,v
retrieving revision 1.64
diff -c -r1.64 pg_migrator.h
*** src/pg_migrator.h	20 Jul 2009 18:57:12 -0000	1.64
--- src/pg_migrator.h	20 Jul 2009 21:55:58 -0000
***************
*** 202,207 ****
--- 202,208 ----
  	char	   *pg_version_str;	/* string PG_VERSION of cluster */
  	Oid			pg_database_oid;		/* OID of pg_database relation */
  	Oid			pg_largeobject_oid;	/* OID of pg_largeobject relation */
+ 	Oid			pg_largeobject_index_oid;	/* OID of pg_largeobject index */
  	char	   *libpath;	/* pathname for cluster's pkglibdir */
  } ClusterInfo;
  
Index: src/relfilenode.c
===================================================================
RCS file: /cvsroot/pg-migrator/pg_migrator/src/relfilenode.c,v
retrieving revision 1.26
diff -c -r1.26 relfilenode.c
*** src/relfilenode.c	2 Jul 2009 23:30:27 -0000	1.26
--- src/relfilenode.c	20 Jul 2009 21:55:58 -0000
***************
*** 164,184 ****
  	{
  		DbInfo	   *new_db = &newdb_arr->dbs[dbnum];
  		DbInfo	   *old_db = dbarr_lookup_db(olddb_arr, new_db->db_name);
  		FileNameMap *mappings;
  		int			n_maps;
  		pageCnvCtx *pageConverter = NULL;
  
- 		assert(old_db);
- 
  		n_maps = 0;
  		mappings = gen_db_file_maps(ctx, old_db, new_db, &n_maps, old_pgdata,
  								   new_pgdata);
  
  		if (n_maps)
  		{
- 			char		old_file[MAXPGPATH];
- 			char		new_file[MAXPGPATH];
- 
  			print_maps(ctx, mappings, n_maps, new_db->db_name);
  
  #ifdef PAGE_CONVERSION
--- 164,181 ----
  	{
  		DbInfo	   *new_db = &newdb_arr->dbs[dbnum];
  		DbInfo	   *old_db = dbarr_lookup_db(olddb_arr, new_db->db_name);
+ 		char		old_file[MAXPGPATH];
+ 		char		new_file[MAXPGPATH];
  		FileNameMap *mappings;
  		int			n_maps;
  		pageCnvCtx *pageConverter = NULL;
  
  		n_maps = 0;
  		mappings = gen_db_file_maps(ctx, old_db, new_db, &n_maps, old_pgdata,
  								   new_pgdata);
  
  		if (n_maps)
  		{
  			print_maps(ctx, mappings, n_maps, new_db->db_name);
  
  #ifdef PAGE_CONVERSION
***************
*** 187,206 ****
  			transfer_single_new_db(ctx, pageConverter, mappings, n_maps);
  
  			pg_free(mappings);
- 
- 			/*
- 			 * The pg_largeobject system table is treated as a user table.
- 			 * Since we already know its OID we simply link it
- 			 */
- 			snprintf(old_file, sizeof(old_file), "%s/base/%u/%u", old_pgdata,
- 					 old_db->db_oid, ctx->old.pg_largeobject_oid);
- 			snprintf(new_file, sizeof(new_file), "%s/base/%u/%u", new_pgdata,
- 					 new_db->db_oid, ctx->new.pg_largeobject_oid);
- 
- 			unlink(new_file);
- 			transfer_relfile(ctx, pageConverter, old_file, new_file, "pg_catalog",
- 						"pg_largeobject", "pg_catalog", "pg_largeobject");
  		}
  	}
  
  	return msg;
--- 184,214 ----
  			transfer_single_new_db(ctx, pageConverter, mappings, n_maps);
  
  			pg_free(mappings);
  		}
+ 		
+ 		/*
+ 		 * The pg_largeobject system table is treated as a user table.
+ 		 * Since we already know its OID we simply link it
+ 		 */
+ 		snprintf(old_file, sizeof(old_file), "%s/base/%u/%u", old_pgdata,
+ 				 old_db->db_oid, ctx->old.pg_largeobject_oid);
+ 		snprintf(new_file, sizeof(new_file), "%s/base/%u/%u", new_pgdata,
+ 				 new_db->db_oid, ctx->new.pg_largeobject_oid);
+ 
+ 		unlink(new_file);
+ 		transfer_relfile(ctx, pageConverter, old_file, new_file, "pg_catalog",
+ 					"pg_largeobject", "pg_catalog", "pg_largeobject");
+ 
+ 		/* do the pg_largeobject index too */
+ 		snprintf(old_file, sizeof(old_file), "%s/base/%u/%u", old_pgdata,
+ 				 old_db->db_oid, ctx->old.pg_largeobject_index_oid);
+ 		snprintf(new_file, sizeof(new_file), "%s/base/%u/%u", new_pgdata,
+ 				 new_db->db_oid, ctx->new.pg_largeobject_index_oid);
+ 
+ 		unlink(new_file);
+ 		transfer_relfile(ctx, pageConverter, old_file, new_file, "pg_catalog",
+ 					"pg_largeobject_loid_pn_index", "pg_catalog",
+ 					"pg_largeobject_loid_pn_index");
  	}
  
  	return msg;
***************
*** 218,224 ****
  {
  	PGconn	   *conn = connectToServer(ctx, "template1", whichCluster);
  	PGresult   *res;
! 	int			relfile_fnum;
  
  	res = executeQueryOrDie(ctx, conn,
  							"SELECT c.relname, c.relfilenode "
--- 226,232 ----
  {
  	PGconn	   *conn = connectToServer(ctx, "template1", whichCluster);
  	PGresult   *res;
! 	int			i_relfile;
  
  	res = executeQueryOrDie(ctx, conn,
  							"SELECT c.relname, c.relfilenode "
***************
*** 227,245 ****
  							"WHERE 	c.relnamespace = n.oid AND "
  							"		n.nspname = 'pg_catalog' AND "
  							"		c.relname IN "
! 							"		('pg_database', 'pg_largeobject') "
  							"ORDER BY c.relname");
  
! 	relfile_fnum = PQfnumber(res, "relfilenode");
  	if (whichCluster == CLUSTER_OLD)
  	{
! 		ctx->old.pg_database_oid = atol(PQgetvalue(res, 0, relfile_fnum));
! 		ctx->old.pg_largeobject_oid = atol(PQgetvalue(res, 1, relfile_fnum));
  	}
  	else
  	{
! 		ctx->new.pg_database_oid = atol(PQgetvalue(res, 0, relfile_fnum));
! 		ctx->new.pg_largeobject_oid = atol(PQgetvalue(res, 1, relfile_fnum));
  	}
  
  	PQclear(res);
--- 235,256 ----
  							"WHERE 	c.relnamespace = n.oid AND "
  							"		n.nspname = 'pg_catalog' AND "
  							"		c.relname IN "
! 							"		('pg_database', 'pg_largeobject', "
! 							"		 'pg_largeobject_loid_pn_index') "
  							"ORDER BY c.relname");
  
! 	i_relfile = PQfnumber(res, "relfilenode");
  	if (whichCluster == CLUSTER_OLD)
  	{
! 		ctx->old.pg_database_oid = atol(PQgetvalue(res, 0, i_relfile));
! 		ctx->old.pg_largeobject_oid = atol(PQgetvalue(res, 1, i_relfile));
! 		ctx->old.pg_largeobject_index_oid = atol(PQgetvalue(res, 2, i_relfile));
  	}
  	else
  	{
! 		ctx->new.pg_database_oid = atol(PQgetvalue(res, 0, i_relfile));
! 		ctx->new.pg_largeobject_oid = atol(PQgetvalue(res, 1, i_relfile));
! 		ctx->new.pg_largeobject_index_oid = atol(PQgetvalue(res, 2, i_relfile));
  	}
  
  	PQclear(res);