pg_restore oddity?

Started by Mario Weilguniover 18 years ago4 messages

mweilguni@sime.com

over 18 years ago

There's a IMO a problem with pg_restore, it should be easy to fix (I
hope - and I could try to fix it and send a patch).
* I've a dump taken from a 8.1 database
* I'm using gist and ltree
* I'm restoring to a 8.2 database

Problem:
I cannot use "-1" for performance, because some gist stuff has changed
and the restore fails. But there seems to be no option for pg_restore to
use transactions for data restore, so it's very very slow (one million
records, each obviously in it's own transaction - because a separate
session "select count(1) from logins" shows a growing number).

It would be nice to use transactions for the data stuff itself, but not
for schema changes or functions. I know I can use separate pg_restore
runs for schema and data, but it's complicated IMHO.

I see several options:
* Use transactions for data, maybe with a separate command line option
* Use transactions everytime, and place savepoints to recover from errors?

Any ideas what I could do?

Regards
Mario

Heikki Linnakangas

heikki@enterprisedb.com

over 18 years ago

In reply to: Mario Weilguni (#1)

Re: pg_restore oddity?

Mario Weilguni wrote:

I cannot use "-1" for performance, because some gist stuff has changed
and the restore fails. But there seems to be no option for pg_restore to
use transactions for data restore, so it's very very slow (one million
records, each obviously in it's own transaction - because a separate
session "select count(1) from logins" shows a growing number).

By default, pg_dump/pg_restore uses a COPY command for each table, and
each COPY executes as a single transaction, so you shouldn't see the row
count growing like that. Is the dump file in --inserts format?

It would be nice to use transactions for the data stuff itself, but not
for schema changes or functions. I know I can use separate pg_restore
runs for schema and data, but it's complicated IMHO.

pg_restore -s foo
pg_restore -a -1 foo

doesn't seem too complicated to me. Am I missing something?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

mweilguni@sime.com

over 18 years ago

In reply to: Heikki Linnakangas (#2)

Re: pg_restore oddity?

Heikki Linnakangas schrieb:

Mario Weilguni wrote:

I cannot use "-1" for performance, because some gist stuff has changed
and the restore fails. But there seems to be no option for pg_restore to
use transactions for data restore, so it's very very slow (one million
records, each obviously in it's own transaction - because a separate
session "select count(1) from logins" shows a growing number).

By default, pg_dump/pg_restore uses a COPY command for each table, and
each COPY executes as a single transaction, so you shouldn't see the row
count growing like that. Is the dump file in --inserts format?

You are right, it was my fault. I was confused about the pg_dump syntax,
and used "-d" (the "-d" because pg_restore needs it for the destination
database, not the dump itself), so it was using "--inserts".

Everything is working fine. I've done dump/restores cycles a hundreds
times, and now such a mistake. I can't believe it.
Seems like I need to take some vacations.

Thanks for the help!

Florian G. Pflug

fgp@phlo.org

over 18 years ago

In reply to: Heikki Linnakangas (#2)

Re: pg_restore oddity?

Heikki Linnakangas wrote:

Mario Weilguni wrote:

I cannot use "-1" for performance, because some gist stuff has changed
and the restore fails. But there seems to be no option for pg_restore to
use transactions for data restore, so it's very very slow (one million
records, each obviously in it's own transaction - because a separate
session "select count(1) from logins" shows a growing number).

By default, pg_dump/pg_restore uses a COPY command for each table, and
each COPY executes as a single transaction, so you shouldn't see the row
count growing like that. Is the dump file in --inserts format?

It would be nice to use transactions for the data stuff itself, but not
for schema changes or functions. I know I can use separate pg_restore
runs for schema and data, but it's complicated IMHO.

pg_restore -s foo
pg_restore -a -1 foo

doesn't seem too complicated to me. Am I missing something?

Doesn't pg_restore create the indices *after* loading the data if you let it
restore the schema *and* the data in one step? The above workaround would
disable that optimization, thereby making the data-restore phase much more costly.

Now that I think about it, I remember that I've often whished that we not only
had --schema-only and --data-only, but also --schema-unconstrained-only and
--constraints-only.

regards, Florian Pflug