pg_dump and thousands of schemas

Started by Hugo <Nabble> · almost 14 years ago · 66 messages · hackers
#1 Hugo <Nabble>
hugo.tech@gmail.com

Hi everyone,

We have a production database (postgresql 9.0) with more than 20,000 schemas
and 40Gb size. In the past we had all that information in just one schema
and pg_dump used to work just fine (2-3 hours to dump everything). Then we
decided to split the database into schemas, which makes a lot of sense for
the kind of information we store and the plans we have for the future. The
problem now is that pg_dump takes forever to finish (more than 24 hours) and
we just can't have consistent daily backups like we had in the past. When I
try to dump just one schema with almost nothing in it, it takes 12 minutes.
When I try to dump a big schema with lots of information, it takes 14
minutes. So pg_dump is clearly lost in the middle of so many schemas. The
load on the machine is low (it is a hot standby replica db) and we have good
configurations for memory, cache, shared_buffers and everything else. The
performance of the database itself is good, it is only pg_dump that is
inefficient for the task. I have found an old discussion back in 2007 that
seems to be quite related to this problem:

http://postgresql.1045698.n5.nabble.com/5-minutes-to-pg-dump-nothing-tp1888814.html

It seems that pg_dump hasn't been tested with a huge number of schemas like
that. Does anyone have a solution or suggestions? Do you know if there are
patches specific for this case?

Thanks in advance,
Hugo

-----
Official Nabble Administrator - we never ask for passwords.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/pg-dump-and-thousands-of-schemas-tp5709766.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

#2 Craig James
cjames@emolecules.com
In reply to: Hugo <Nabble> (#1)
Re: pg_dump and thousands of schemas

On Thu, May 24, 2012 at 12:06 AM, Hugo <Nabble> <hugo.tech@gmail.com> wrote:

[...]

How many total relations do you have? I don't know if there is a limit to
the number of schemas, but I suspect when you went from one schema to
20,000 schemas, you also went from N relations to 20000*N relations.

Somewhere between 100,000 and 1 million total relations, Postgres starts to
have trouble. See this thread:

http://permalink.gmane.org/gmane.comp.db.postgresql.performance/33254

(Why is it that Google can't find these archives on postgresql.org?)

Craig
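As a sketch of how to answer Craig's question (hypothetical: "mydb" is a placeholder database name; adjust connection options as needed), the total relation count can be read from the pg_class catalog:

```shell
# pg_class holds one row per relation: tables, indexes, sequences,
# views, TOAST tables, etc. "mydb" is a placeholder.
psql -d mydb -Atc "SELECT count(*) FROM pg_class;"
```

This requires a running server, so it is only a sketch of the check, not output from Hugo's database.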


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

#3 Jeff Janes
jeff.janes@gmail.com
In reply to: Craig James (#2)
Re: pg_dump and thousands of schemas

On Thu, May 24, 2012 at 8:21 AM, Craig James <cjames@emolecules.com> wrote:

On Thu, May 24, 2012 at 12:06 AM, Hugo <Nabble> <hugo.tech@gmail.com> wrote:

[...]

Sorry, your original did not show up here, so I'm piggy-backing on
Craig's reply.

Is dumping just one schema out of thousands an actual use case, or is
it just an attempt to find a faster way to dump all the schemata
through a back door?

pg_dump itself seems to have a lot of quadratic portions (plus another
one on the server which it hits pretty heavily), and it's hard to know
where to start addressing them. It seems like addressing the overall
quadratic nature might be a globally better option, but addressing
just the problem with dumping one schema might be easier to kluge
together.

[...]

How many total relations do you have?  I don't know if there is a limit to
the number of schemas, but I suspect when you went from one schema to 20,000
schemas, you also went from N relations to 20000*N relations.

Yes, that might be important to know--whether the total number of
relations changed, or just their distribution amongst the schemata.

Cheers,

Jeff

#4 Bruce Momjian
bruce@momjian.us
In reply to: Jeff Janes (#3)
Re: pg_dump and thousands of schemas

On Thu, May 24, 2012 at 08:20:34PM -0700, Jeff Janes wrote:

On Thu, May 24, 2012 at 8:21 AM, Craig James <cjames@emolecules.com> wrote:

[...]

Sorry, your original did not show up here, so I'm piggy-backing on
Craig's reply.

Is dumping just one schema out of thousands an actual use case, or is
it just an attempt to find a faster way to dump all the schemata
through a back door?

pg_dump itself seems to have a lot of quadratic portions (plus another
one on the server which it hits pretty heavily), and it's hard to know
where to start addressing them. It seems like addressing the overall
quadratic nature might be a globally better option, but addressing
just the problem with dumping one schema might be easier to kluge
together.

Postgres 9.2 will have some speedups for pg_dump scanning large
databases --- that might help.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#5 Hugo <Nabble>
hugo.tech@gmail.com
In reply to: Bruce Momjian (#4)
Re: pg_dump and thousands of schemas

Thanks for the replies. The number of relations in the database is really
high (~500,000) and I don't think we can shrink that. The truth is that
schemas bring a lot of advantages to our system and postgresql doesn't show
signs of stress with them. So I believe it should also be possible for
pg_dump to handle them with the same elegance.

Dumping just one schema out of thousands was indeed an attempt to find a
faster way to backup the database. I don't mind creating a shell script or
program that dumps every schema individually as long as each dump is fast
enough to keep the total time within a few hours. But since each dump
currently takes at least 12 minutes, that just doesn't work. I have been
looking at the source of pg_dump in order to find possible improvements, but
this will certainly take days or even weeks. We will probably have to use
'tar' to compress the postgresql folder as the backup solution for now until
we can fix pg_dump or wait for postgresql 9.2 to become the official version
(as long as I don't need a dump and restore to upgrade the db).
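A sketch of the per-schema dump loop Hugo describes (hypothetical: "mydb" is a placeholder database name, connection options are assumed to be set via the environment, and non-system schemas are listed from pg_namespace):

```shell
DB=mydb   # placeholder database name

# List non-system schemas, then dump each one into its own
# custom-format archive file.
psql -At -d "$DB" \
     -c "SELECT nspname FROM pg_namespace
         WHERE nspname !~ '^pg_' AND nspname <> 'information_schema'" |
while IFS= read -r schema; do
    pg_dump -Fc -n "$schema" -f "backup_${schema}.dump" "$DB"
done
```

As Hugo notes, this only becomes practical once the per-invocation overhead of pg_dump is fixed: with each run taking 12+ minutes, 20,000 iterations cannot finish within a daily backup window.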

If anyone has more suggestions, I would like to hear them. Thank you!

Regards,
Hugo


#6 Ondrej Ivanič
ondrej.ivanic@gmail.com
In reply to: Hugo <Nabble> (#5)
Re: pg_dump and thousands of schemas

Hi,

On 25 May 2012 14:54, Hugo <Nabble> <hugo.tech@gmail.com> wrote:

[...]

If anyone has more suggestions, I would like to hear them. Thank you!

Maybe a filesystem-level backup could solve this issue:
http://www.postgresql.org/docs/9.1/static/continuous-archiving.html#BACKUP-BASE-BACKUP

but keep in mind that:
- it preserves bloat in your database, so the backup might need more space
- you can't restore to a different PG version
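For reference, on 9.1 and later a base backup along those lines can be taken with pg_basebackup (a sketch; the target directory is a placeholder and the server must accept a replication connection):

```shell
# Filesystem-level base backup of the whole cluster:
# one tar file per tablespace, gzip-compressed, with progress reporting.
# /path/to/backupdir is a placeholder.
pg_basebackup -D /path/to/backupdir -Ft -z -P
```

Unlike pg_dump, this copies data files directly, so its run time scales with database size rather than with the number of catalog objects.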

--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hugo <Nabble> (#5)
Re: pg_dump and thousands of schemas

"Hugo <Nabble>" <hugo.tech@gmail.com> writes:

If anyone has more suggestions, I would like to hear them. Thank you!

Provide a test case?

We recently fixed a couple of O(N^2) loops in pg_dump, but those covered
extremely specific cases that might or might not have anything to do
with what you're seeing. The complainant was extremely helpful about
tracking down the problems:
http://archives.postgresql.org/pgsql-general/2012-03/msg00957.php
http://archives.postgresql.org/pgsql-committers/2012-03/msg00225.php
http://archives.postgresql.org/pgsql-committers/2012-03/msg00230.php

regards, tom lane

#8 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#7)
Re: pg_dump and thousands of schemas

On Fri, May 25, 2012 at 10:41:23AM -0400, Tom Lane wrote:

"Hugo <Nabble>" <hugo.tech@gmail.com> writes:

If anyone has more suggestions, I would like to hear them. Thank you!

Provide a test case?

We recently fixed a couple of O(N^2) loops in pg_dump, but those covered
extremely specific cases that might or might not have anything to do
with what you're seeing. The complainant was extremely helpful about
tracking down the problems:
http://archives.postgresql.org/pgsql-general/2012-03/msg00957.php
http://archives.postgresql.org/pgsql-committers/2012-03/msg00225.php
http://archives.postgresql.org/pgsql-committers/2012-03/msg00230.php

Yes, please help us improve this! At this point pg_upgrade is limited
by the time to dump/restore the database schema, but I can't get users
to give me any way to debug the speed problems.

Someone reported pg_upgrade took 45 minutes because of pg_dumpall --schema,
which is quite long.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#9 Jeff Janes
jeff.janes@gmail.com
In reply to: Bruce Momjian (#4)
Re: pg_dump and thousands of schemas

On Thu, May 24, 2012 at 8:54 PM, Bruce Momjian <bruce@momjian.us> wrote:

On Thu, May 24, 2012 at 08:20:34PM -0700, Jeff Janes wrote:

pg_dump itself seems to have a lot of quadratic portions (plus another
one on the server which it hits pretty heavily), and it's hard to know
where to start addressing them.  It seems like addressing the overall
quadratic nature might be a globally better option, but addressing
just the problem with dumping one schema might be easier to kluge
together.

Postgres 9.2 will have some speedups for pg_dump scanning large
databases --- that might help.

Those speed ups don't seem to apply here, though. I get the same
performance in 9.0.7 as 9.2.beta1.

There is an operation in pg_dump which is O(#_of_schemata_in_db *
#_of_table_in_db), or something like that.

The attached very crude patch reduces that to
O(log_of_#_of_schemata_in_db * #_of_table_in_db)

I was hoping this would be a general improvement. It doesn't seem to be.
But it is a very substantial improvement in the specific case of
dumping one small schema out of a very large database.

It seems like dumping one schema would be better optimized by not
loading up the entire database catalog, but rather by restricting to
just that schema at the catalog stage. But I haven't dug into those
details.

For dumping entire databases, it looks like the biggest problem is
going to be LockReassignCurrentOwner in the server. And that doesn't
seem to be easy to fix, as any change to it to improve pg_dump will
risk degrading normal use cases.

If we want to be able to efficiently dump entire databases in a
scalable way, it seems like there should be some way to obtain a
database-wide AccessShare lock, which blocks AccessExclusive locks on
any object in the database, and turns ordinary object-level
AccessShare lock requests into no-ops. I don't think you can get
hundreds of thousands of simultaneously held and individually recorded
AccessShare locks without causing bad things to happen.

Cheers,

Jeff

Attachments:

pg_dump_ns_search.patch (application/octet-stream, +24 -2)

#10 Jeff Janes
jeff.janes@gmail.com
In reply to: Bruce Momjian (#8)
Re: pg_dump and thousands of schemas

On Fri, May 25, 2012 at 8:18 AM, Bruce Momjian <bruce@momjian.us> wrote:

On Fri, May 25, 2012 at 10:41:23AM -0400, Tom Lane wrote:

[...]

Yes, please help us improve this!  At this point pg_upgrade is limited
by the time to dump/restore the database schema, but I can't get users
to give me any way to debug the speed problems.

For dumping one small schema from a large database, look at the time
progression of this:

dropdb foo; createdb foo;

for f in `seq 0 10000 1000000`; do
  perl -le 'print "create schema foo$_; create table foo$_.foo (k integer, v integer);" foreach $ARGV[0]..$ARGV[0]+9999' $f | psql -d foo > /dev/null ;
  time pg_dump foo -Fc -n foo1 | wc -c;
done >& dump_one_schema_timing

To show the overall dump speed problem, drop the "-n foo1", and change
the step size from 10000/9999 down to 1000/999

Cheers,

Jeff

#11 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Janes (#9)
Re: pg_dump and thousands of schemas

Jeff Janes <jeff.janes@gmail.com> writes:

There is an operation in pg_dump which is O(#_of_schemata_in_db *
#_of_table_in_db), or something like that.
The attached very crude patch reduces that to
O(log_of_#_of_schemata_in_db * #_of_table_in_db)

I was hoping this would be a general improvement. It doesn't seem to be.
But it is a very substantial improvement in the specific case of
dumping one small schema out of a very large database.

Your test case in
<CAMkU=1zedM4VyLVyLuVmoekUnUXkXfnGPer+3bvPm-A_9CNYSA@mail.gmail.com>
shows pretty conclusively that findNamespace is a time sink for large
numbers of schemas, so that seems worth fixing. I don't like this
patch though: we already have infrastructure for this in pg_dump,
namely buildIndexArray/findObjectByOid, so what we should do is use
that not invent something new. I will go see about doing that.

It seems like dumping one schema would be better optimized by not
loading up the entire database catalog, but rather by restricting to
just that schema at the catalog stage.

The reason pg_dump is not built that way is that considerations like
dump order dependencies are not going to work at all if it only looks
at a subset of the database. Of course, dependency chains involving
objects not dumped might be problematic anyway, but I'd still want it
to do the best it could.

For dumping entire databases, It looks like the biggest problem is
going to be LockReassignCurrentOwner in the server. And that doesn't
seem to be easy to fix, as any change to it to improve pg_dump will
risk degrading normal use cases.

I didn't try profiling the server side, but pg_dump doesn't use
subtransactions so it's not clear to me why LockReassignCurrentOwner
would get called at all ...

regards, tom lane

#12 Jeff Janes
jeff.janes@gmail.com
In reply to: Tom Lane (#11)
Re: pg_dump and thousands of schemas

On Fri, May 25, 2012 at 9:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jeff Janes <jeff.janes@gmail.com> writes:

For dumping entire databases, It looks like the biggest problem is
going to be LockReassignCurrentOwner in the server.  And that doesn't
seem to be easy to fix, as any change to it to improve pg_dump will
risk degrading normal use cases.

I didn't try profiling the server side, but pg_dump doesn't use
subtransactions so it's not clear to me why LockReassignCurrentOwner
would get called at all ...

I thought that every select statement in a repeatable read transaction
ran in a separate "portal", and that a portal is a flavor of
subtransaction. Anyway, it does show up at the top of a profile of
the server, so it is getting called somehow.

Cheers,

Jeff

#13 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Janes (#9)
Re: pg_dump and thousands of schemas

Jeff Janes <jeff.janes@gmail.com> writes:

For dumping entire databases, it looks like the biggest problem is
going to be LockReassignCurrentOwner in the server. And that doesn't
seem to be easy to fix, as any change to it to improve pg_dump will
risk degrading normal use cases.

If we want to be able to efficiently dump entire databases in a
scalable way, it seems like there should be some way to obtain a
database-wide AccessShare lock, which blocks AccessExclusive locks on
any object in the database, and turns ordinary object-level
AccessShare lock requests into no-ops.

I thought a little bit about that, but it seems fairly unworkable.
In the first place, pg_dump doesn't necessarily want lock on every table
in the database. In the second, such a lock mechanism would have
logical difficulties, notably whether it would be considered to apply to
tables created after the lock request occurs. If it does, then it would
effectively block all such creations (since creation takes exclusive
locks that ought to conflict). If it doesn't, how would you implement
that? In any case, we'd be adding significant cost and complexity to
lock acquisition operations, for something that only whole-database
pg_dump operations could conceivably make use of.

As far as the specific problem at hand goes, I think there might be a
less invasive solution. I poked into the behavior with gdb (and you're
right, LockReassignCurrentOwner does get called during portal drop)
and noted that although pg_dump is indeed holding thousands of locks,
any given statement that it issues touches only a few of them. So the
loop in LockReassignCurrentOwner iterates over the whole lock table but
does something useful at only a few entries.

We could fix things for this usage pattern with what seems to me to
be a pretty low-overhead method: add a fixed-size array to
ResourceOwners in which we can remember up to N LOCALLOCKs, for N around
10 or so. Add a LOCALLOCK to that array when we add the ResourceOwner to
that LOCALLOCK, so long as the array hasn't overflowed. (If the array
does overflow, we mark it as overflowed and stop adding entries.) Then,
in LockReassignCurrentOwner, we only iterate over the whole hash table
if the ResourceOwner's array has overflowed. If it hasn't, use the
array to visit just the LOCALLOCKs that need work.

Comments?

regards, tom lane

#14 Jeff Janes
jeff.janes@gmail.com
In reply to: Tom Lane (#13)
Re: pg_dump and thousands of schemas

On Fri, May 25, 2012 at 1:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jeff Janes <jeff.janes@gmail.com> writes:

For dumping entire databases, it looks like the biggest problem is
going to be LockReassignCurrentOwner in the server.  And that doesn't
seem to be easy to fix, as any change to it to improve pg_dump will
risk degrading normal use cases.

If we want to be able to efficiently dump entire databases in a
scalable way, it seems like there should be some way to obtain a
database-wide AccessShare lock, which blocks AccessExclusive locks on
any object in the database, and turns ordinary object-level
AccessShare lock requests into no-ops.

I thought a little bit about that, but it seems fairly unworkable.
In the first place, pg_dump doesn't necessarily want lock on every table
in the database.

The database-wide method could be invoked only when there are no
options given to pg_dump that limit to a subset. Or does that not
resolve the objection?

 In the second, such a lock mechanism would have
logical difficulties, notably whether it would be considered to apply to
tables created after the lock request occurs.  If it does, then it would
effectively block all such creations (since creation takes exclusive
locks that ought to conflict).

That seems acceptable to me. With unrestricted dump, almost all other
DDL is locked out already, I don't know that locking out one more
thing is that big a deal. Especially if there is some way to
circumvent the use of that feature.

If it doesn't, how would you implement
that?  In any case, we'd be adding significant cost and complexity to
lock acquisition operations, for something that only whole-database
pg_dump operations could conceivably make use of.

Before Robert's fast-path locks were developed, I wanted a way to put
the server into 'stable schema' mode where AccessExclusive locks were
forbidden and AccessShared were no-ops, just for performance reasons.
Now with fast-path, that might no longer be a meaningful feature.

If databases scale out a lot, won't max_locks_per_transaction, and the
amount of shared memory it would require to keep increasing it, become
a substantial problem?

As far as the specific problem at hand goes, I think there might be a
less invasive solution.  I poked into the behavior with gdb (and you're
right, LockReassignCurrentOwner does get called during portal drop)
and noted that although pg_dump is indeed holding thousands of locks,
any given statement that it issues touches only a few of them.  So the
loop in LockReassignCurrentOwner iterates over the whole lock table but
does something useful at only a few entries.

We could fix things for this usage pattern with what seems to me to
be a pretty low-overhead method: add a fixed-size array to
ResourceOwners in which we can remember up to N LOCALLOCKs, for N around
10 or so.

I had thought along these lines too. I think 10 would capture most of
the gain. With pg_dump, so far I see a huge number of resource owners
whose maximum number of locks is 0, 2 or 4, and only a handful with
more than 4. Of course I haven't looked at all use cases.

The reason we want to limit at all is not memory, but rather so that
explicitly removing locks doesn't have to dig through a large list to
find the specific one to remove, and therefore become quadratic when
many locks are explicitly removed, right? Does anyone ever add a bunch
of locks and then afterward explicitly remove them all in FIFO order?
I think most users would either remove them LIFO, or drop them in
bulk. But better safe than sorry.

 Add a LOCALLOCK to that array when we add the ResourceOwner to
that LOCALLOCK, so long as the array hasn't overflowed.  (If the array
does overflow, we mark it as overflowed and stop adding entries.)  Then,
in LockReassignCurrentOwner, we only iterate over the whole hash table
if the ResourceOwner's array has overflowed.  If it hasn't, use the
array to visit just the LOCALLOCKs that need work.

Comments?

I have some basic parts of this already coded up. I can try to finish
coding it up for CF next or next+1. I'm not yet sure how to avoid
weakening the boundary between resowner.c and lock.c; my original code
was pretty ugly there, as it was just a proof of concept.

What would be a situation that might be adversely affected by the
overhead of such a change? I think pgbench -S except implemented in a
plpgsql loop would probably do it.

Cheers,

Jeff

#15 Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tom Lane (#7)
Re: pg_dump and thousands of schemas

"Hugo <Nabble>" <hugo.tech@gmail.com> writes:

If anyone has more suggestions, I would like to hear them. Thank you!

Provide a test case?

We recently fixed a couple of O(N^2) loops in pg_dump, but those covered
extremely specific cases that might or might not have anything to do
with what you're seeing. The complainant was extremely helpful about
tracking down the problems:
http://archives.postgresql.org/pgsql-general/2012-03/msg00957.php
http://archives.postgresql.org/pgsql-committers/2012-03/msg00225.php
http://archives.postgresql.org/pgsql-committers/2012-03/msg00230.php

I'm wondering if these fixes (or today's commit) cover the case of a
database with ~100 thousand tables and indexes. One of my customers
has had trouble with pg_dump on such a database; it takes over 10
hours.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

#16 Hugo <Nabble>
hugo.tech@gmail.com
In reply to: Tom Lane (#7)
Re: pg_dump and thousands of schemas

Here is a sample dump that takes a long time to be written by pg_dump:
http://postgresql.1045698.n5.nabble.com/file/n5710183/test.dump.tar.gz
(the file above is 2.4MB; the dump itself is 66MB)

This database has 2,311 schemas similar to those in my production database.
All schemas are empty, but pg_dump still takes 3 hours to finish it on my
computer. So now you can imagine my production database with more than
20,000 schemas like that. Can you guys take a look and see if the code has
room for improvements? I generated this dump with postgresql 9.1 (which is
what I have on my local computer), but my production database uses
postgresql 9.0. So it would be great if improvements could be delivered to
version 9.0 as well.

Thanks a lot for all the help!

Hugo


#17 Jeff Janes
jeff.janes@gmail.com
In reply to: Hugo <Nabble> (#16)
Re: pg_dump and thousands of schemas

On Sat, May 26, 2012 at 9:12 PM, Hugo <Nabble> <hugo.tech@gmail.com> wrote:

Here is a sample dump that takes a long time to be written by pg_dump:
http://postgresql.1045698.n5.nabble.com/file/n5710183/test.dump.tar.gz
test.dump.tar.gz
(the file above has 2.4Mb, the dump itself has 66Mb)

This database has 2,311 schemas similar to those in my production database.
All schemas are empty,

This dump does not reload cleanly. It uses many roles which it
doesn't create. Also, the schemata are not empty; they have about 20
tables apiece.

I created the missing roles with all default options.

Doing a default pg_dump took 66 minutes.

but pg_dump still takes 3 hours to finish it on my
computer. So now you can imagine my production database with more than
20,000 schemas like that. Can you guys take a look and see if the code has
room for improvements?

There is a quadratic behavior in pg_dump's "mark_create_done". This
should probably be fixed, but in the meantime it can be circumvented
by using -Fc rather than -Fp for the dump format. Doing that removed
17 minutes from the run time.
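Concretely, the workaround is just a format switch (a sketch; "mydb" and "mydb_copy" are placeholder database names):

```shell
# Plain-text output (-Fp, the default) goes through the quadratic
# mark_create_done path at dump time:
#   pg_dump -Fp mydb > mydb.sql
# Custom format sidesteps that at dump time (the work moves to restore):
pg_dump -Fc -f mydb.dump mydb
# ...restore later with:
#   pg_restore -d mydb_copy mydb.dump
```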

I'm working on a patch to reduce the LockReassignCurrentOwner problem
in the server when using pg_dump with lots of objects. Using a
preliminary version for this, in conjunction with -Fc, reduced the
dump time to 3.5 minutes.

Cheers,

Jeff

#18 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Janes (#17)
Re: pg_dump and thousands of schemas

Jeff Janes <jeff.janes@gmail.com> writes:

There is a quadratic behavior in pg_dump's "mark_create_done". This
should probably be fixed, but in the meantime it can be circumvented
by using -Fc rather than -Fp for the dump format. Doing that removed
17 minutes from the run time.

Hmm, that would just amount to postponing the work from pg_dump to
pg_restore --- although I suppose it could be a win if the dump is for
backup purposes and you probably won't ever have to restore it.
inhibit_data_for_failed_table() has the same issue, though perhaps it's
less likely to be exercised; and there is a previously noted O(N^2)
behavior for the loop around repoint_table_dependencies.

We could fix these things by setting up index arrays that map dump ID
to TocEntry pointer and dump ID of a table to dump ID of its TABLE DATA
TocEntry. The first of these already exists (tocsByDumpId) but is
currently built only if doing parallel restore. We'd have to build it
all the time to use it for fixing mark_create_done. Still, the extra
space is small compared to the size of the TocEntry data structures,
so I don't see that that's a serious objection.

I have nothing else to do right now so am a bit tempted to go fix this.

I'm working on a patch to reduce the LockReassignCurrentOwner problem
in the server when using pg_dump with lots of objects.

Cool.

regards, tom lane

#19 Hugo <Nabble>
hugo.tech@gmail.com
In reply to: Tom Lane (#18)
Re: pg_dump and thousands of schemas

Thanks again for the hard work, guys.

When I said that the schemas were empty, I was talking about data, not
tables. So you are right that each schema has ~20 tables (plus indices,
sequences, etc.), but pretty much no data (maybe one or two rows at most).
Data doesn't seem to be so important in this case (I may be wrong though),
so the sample database should be enough to find the weak spots that need
attention.

but in the mean time it can be circumvented
by using -Fc rather than -Fp for the dump format.
Doing that removed 17 minutes from the run time.

We do use -Fc in our production server, but it doesn't help much (dump time
still > 24 hours). Actually, I tried several different dump options without
success. It seems that you guys are very close to great improvements here.
Thanks for everything!

Best,
Hugo

--
View this message in context: http://postgresql.1045698.n5.nabble.com/pg-dump-and-thousands-of-schemas-tp5709766p5710341.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

#20Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tatsuo Ishii (#15)
Re: pg_dump and thousands of schemas

We recently fixed a couple of O(N^2) loops in pg_dump, but those covered
extremely specific cases that might or might not have anything to do
with what you're seeing. The complainant was extremely helpful about
tracking down the problems:
http://archives.postgresql.org/pgsql-general/2012-03/msg00957.php
http://archives.postgresql.org/pgsql-committers/2012-03/msg00225.php
http://archives.postgresql.org/pgsql-committers/2012-03/msg00230.php

I'm wondering if these fixes (or today's commit) cover the case of a
database with ~100 thousand tables and indexes. One of my customers
has had trouble with pg_dump on such a database; it takes over 10
hours.

So I did a quick test with old PostgreSQL 9.0.2 and current (as of
commit 2755abf386e6572bad15cb6a032e504ad32308cc). In a freshly initdb-ed
database I created 100,000 tables, each with two integer
attributes, one of them a primary key. Creating the tables was
reasonably fast, as expected (18-20 minutes). This produced a 1.4GB
database cluster.

pg_dump dbname >/dev/null took 188 minutes on 9.0.2, which is as long
as the customer complained about. And current? Well, it took 125
minutes. ps showed that most of the time was spent in the backend.

Below is the script to create tables.

cnt=100000
while [ $cnt -gt 0 ]
do
    psql -e -p 5432 -c "create table t$cnt(i int primary key, j int);" test
    cnt=`expr $cnt - 1`
done

p.s. You need to increase max_locks_per_transaction before running
pg_dump (I raised it to 640 in my case).
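For reference, that setting lives in postgresql.conf and takes effect only after a server restart; a minimal fragment matching the value used in this test:

```
# postgresql.conf -- enlarge the shared lock table so pg_dump can hold
# an AccessShareLock on every table it dumps (value used in this test)
max_locks_per_transaction = 640
```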
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
