pg_upgrade with -j
Hello!
We are moving from 10 to 15 and are in testing now.
Our development database is about 1400G and takes 12 minutes to complete
a pg_upgrade with the -k (hard-links) version. This is on a CentOS 7
server with 80 cores.
Adding -j 40 to use half of those cores also finishes in 12 minutes and
ps / top / htop never show more than a single process at a time in use.
Bumping that to -j 80 to use them all also finishes in 12 minutes and
still only a single process.
Running the suggested vacuum analyze after pg_upgrade completes takes
about 19 minutes. Adding -j 40 takes that time down to around 5
minutes, jumps the server load up over 30 and htop shows 40 processes.
If -j 40 helps there--why not with pg_upgrade?
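For context, the analyze step is just the stock vacuumdb client, whose
-j option parallelizes across tables within each database -- roughly the
line below (a sketch; the exact flags are an assumption, not a copy of
our command):
time /usr/pgsql-15/bin/vacuumdb --all --analyze -j 40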
The full commands we are using for pg_upgrade are pretty stock:
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 40
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 80
Our production database is closer to 1900G. If we're looking at a 30
minute pg_upgrade window we'll be okay, but if there is anything we can
do to knock that time down we will, and any suggestions to do so would be
greatly appreciated.
Jeff Ross
On 5/22/23 16:20, Jeff Ross wrote:
Hello!
We are moving from 10 to 15 and are in testing now.
Our development database is about 1400G and takes 12 minutes to complete
a pg_upgrade with the -k (hard-links) version. This is on a CentOS 7
server with 80 cores.
Adding -j 40 to use half of those cores also finishes in 12 minutes and
ps / top / htop never show more than a single process at a time in use.
Bumping that to -j 80 to use them all also finishes in 12 minutes and
still only a single process.
Running the suggested vacuum analyze after pg_upgrade completes takes
about 19 minutes. Adding -j 40 takes that time down to around 5
minutes, jumps the server load up over 30 and htop shows 40 processes.
If -j 40 helps there--why not with pg_upgrade?
From docs:
https://www.postgresql.org/docs/current/pgupgrade.html
The --jobs option allows multiple CPU cores to be used for
copying/linking of files and to dump and restore database schemas in
parallel; a good place to start is the maximum of the number of CPU
cores and tablespaces. This option can dramatically reduce the time to
upgrade a multi-database server running on a multiprocessor machine.
So is the 1400G mostly in one database in the cluster?
The full commands we are using for pg_upgrade are pretty stock:
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 40
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 80
Our production database is closer to 1900G. If we're looking at a 30
minute pg_upgrade window we'll be okay, but if there is anything we can
do to knock that time down we will, and any suggestions to do so would be
greatly appreciated.
Jeff Ross
--
Adrian Klaver
adrian.klaver@aklaver.com
On 5/22/23 5:24 PM, Adrian Klaver wrote:
On 5/22/23 16:20, Jeff Ross wrote:
Hello!
We are moving from 10 to 15 and are in testing now.
Our development database is about 1400G and takes 12 minutes to
complete a pg_upgrade with the -k (hard-links) version. This is on a
CentOS 7 server with 80 cores.
Adding -j 40 to use half of those cores also finishes in 12 minutes
and ps / top / htop never show more than a single process at a time in
use.
Bumping that to -j 80 to use them all also finishes in 12 minutes and
still only a single process.
Running the suggested vacuum analyze after pg_upgrade completes takes
about 19 minutes. Adding -j 40 takes that time down to around 5
minutes, jumps the server load up over 30 and htop shows 40 processes.
If -j 40 helps there--why not with pg_upgrade?
From docs:
https://www.postgresql.org/docs/current/pgupgrade.html
The --jobs option allows multiple CPU cores to be used for
copying/linking of files and to dump and restore database schemas in
parallel; a good place to start is the maximum of the number of CPU
cores and tablespaces. This option can dramatically reduce the time to
upgrade a multi-database server running on a multiprocessor machine.
So is the 1400G mostly in one database in the cluster?
The full commands we are using for pg_upgrade are pretty stock:
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up
-k -j 40
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up
-k -j 80
Our production database is closer to 1900G. If we're looking at a 30
minute pg_upgrade window we'll be okay, but if there is anything we
can do to knock that time down we will, and any suggestions to do so
would be greatly appreciated.
Jeff Ross
Yes, one big database with about 80 schemas and several other smaller
databases so -j should help, right?
Jeff
Jeff Ross <jross@openvistas.net> writes:
On 5/22/23 5:24 PM, Adrian Klaver wrote:
So is the 1400G mostly in one database in the cluster?
Yes, one big database with about 80 schemas and several other smaller
databases so -j should help, right?
AFAICT from a quick look at the code, you won't get any meaningful
parallelism unless you have several large DBs and/or several large
tablespaces. It looks like the assumption was that issuing link()
requests in parallel wouldn't help much but just swamp your disk
if they're all on the same filesystem. Maybe that could use
rethinking, not sure.
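A quick way to check how the data is spread across databases is a
standard catalog query, e.g. (illustrative; connection flags omitted):
psql -c "SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database ORDER BY pg_database_size(datname) DESC;"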
regards, tom lane
On 5/22/23 16:29, Jeff Ross wrote:
On 5/22/23 5:24 PM, Adrian Klaver wrote:
On 5/22/23 16:20, Jeff Ross wrote:
Hello!
From docs:
https://www.postgresql.org/docs/current/pgupgrade.html
The --jobs option allows multiple CPU cores to be used for
copying/linking of files and to dump and restore database schemas in
parallel; a good place to start is the maximum of the number of CPU
cores and tablespaces. This option can dramatically reduce the time to
upgrade a multi-database server running on a multiprocessor machine.
So is the 1400G mostly in one database in the cluster?
The full commands we are using for pg_upgrade are pretty stock:
Yes, one big database with about 80 schemas and several other smaller
databases so -j should help, right?
As I understand it, no. The parallelism is between databases, not
within a database. Further, 'database schemas' refers to the schema in
the sense of the overall database object definitions, not the
namespaces known as schemas in the database.
Jeff
--
Adrian Klaver
adrian.klaver@aklaver.com
On 5/22/23 18:42, Tom Lane wrote:
Jeff Ross <jross@openvistas.net> writes:
On 5/22/23 5:24 PM, Adrian Klaver wrote:
So is the 1400G mostly in one database in the cluster?
Yes, one big database with about 80 schemas and several other smaller
databases so -j should help, right?
AFAICT from a quick look at the code, you won't get any meaningful
parallelism unless you have several large DBs and/or several large
tablespaces.
Hmm. I'm glad I'm reading this now.
It looks like the assumption was that issuing link()
requests in parallel wouldn't help much but just swamp your disk
if they're all on the same filesystem.
Maybe that could use rethinking, not sure.
It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15 is
going to be on a different LUN from /var/lib/pgsql/9.6 just like
/var/lib/pgsql/backups is on a different LUN.
--
Born in Arizona, moved to Babylonia.
On 5/22/23 5:42 PM, Tom Lane wrote:
Jeff Ross <jross@openvistas.net> writes:
On 5/22/23 5:24 PM, Adrian Klaver wrote:
So is the 1400G mostly in one database in the cluster?
Yes, one big database with about 80 schemas and several other smaller
databases so -j should help, right?
AFAICT from a quick look at the code, you won't get any meaningful
parallelism unless you have several large DBs and/or several large
tablespaces. It looks like the assumption was that issuing link()
requests in parallel wouldn't help much but just swamp your disk
if they're all on the same filesystem. Maybe that could use
rethinking, not sure.
regards, tom lane
Thanks Tom. These are all smokingly fast SSDs so it would be
interesting to see how well they'd hold up under that load.
Jeff
On 5/22/23 5:43 PM, Adrian Klaver wrote:
From docs:
https://www.postgresql.org/docs/current/pgupgrade.html
The --jobs option allows multiple CPU cores to be used for
copying/linking of files and to dump and restore database schemas in
parallel; a good place to start is the maximum of the number of CPU
cores and tablespaces. This option can dramatically reduce the time
to upgrade a multi-database server running on a multiprocessor machine.
So is the 1400G mostly in one database in the cluster?
The full commands we are using for pg_upgrade are pretty stock:
Yes, one big database with about 80 schemas and several other smaller
databases so -j should help, right?
As I understand it, no. The parallelism is between databases, not
within a database. Further, 'database schemas' refers to the schema in
the sense of the overall database object definitions, not the
namespaces known as schemas in the database.
Thanks Adrian. That "restore database schemas in parallel" phrase seems
like it would be really easy to read the way we did and expect it to
work with one database and multiple schemas.
Maybe it should be changed to "restore multiple databases in parallel"
instead?
Jeff
On 2023-05-22 21:10:48 -0500, Ron wrote:
On 5/22/23 18:42, Tom Lane wrote:
It looks like the assumption was that issuing link()
requests in parallel wouldn't help much but just swamp your disk
if they're all on the same filesystem.
Maybe that could use rethinking, not sure.
It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15 is
going to be on a different LUN from /var/lib/pgsql/9.6
You can't hardlink between different file systems.
Even if you could assign single directories to specific LUNs (does any
file system allow this?) this would at best spread the updates across
two LUNs (the inodes would presumably stay on the source LUN and the
target directory would be on the target LUN).
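Easy to demonstrate (hypothetical paths, assuming /data and /backup are
separate mounts):
ln /data/somefile /backup/somefile   # fails with EXDEV ("Invalid cross-device link")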
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
On 5/23/23 12:19, Peter J. Holzer wrote:
On 2023-05-22 21:10:48 -0500, Ron wrote:
On 5/22/23 18:42, Tom Lane wrote:
It looks like the assumption was that issuing link()
requests in parallel wouldn't help much but just swamp your disk
if they're all on the same filesystem.
Maybe that could use rethinking, not sure.
It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15 is
going to be on a different LUN from /var/lib/pgsql/9.6
You can't hardlink between different file systems.
We'd never hardlink. Eliminates the ability to return to the old system if
something goes wrong.
--
Born in Arizona, moved to Babylonia.
## Ron (ronljohnsonjr@gmail.com):
We'd never hardlink. Eliminates the ability to return to the old
system if something goes wrong.
That's why you get yourself a recent XFS and use clone mode (still
sticks you to the same filesystem, but gets you up and running much
faster).
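That is pg_upgrade's --clone option (PostgreSQL 12 and later; needs a
filesystem with reflink support, e.g. XFS created with reflink=1) --
the same invocation as before with --clone in place of -k, along the
lines of:
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up --clone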
Regards,
Christoph
--
Spare Space
On 2023-05-23 13:17:24 -0500, Ron wrote:
On 5/23/23 12:19, Peter J. Holzer wrote:
On 2023-05-22 21:10:48 -0500, Ron wrote:
On 5/22/23 18:42, Tom Lane wrote:
It looks like the assumption was that issuing link()
^^^^^^
requests in parallel wouldn't help much but just swamp your disk
if they're all on the same filesystem.
Maybe that could use rethinking, not sure.
It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15 is
going to be on a different LUN from /var/lib/pgsql/9.6
You can't hardlink between different file systems.
We'd never hardlink.
But that was what Jeff and Tom were talking about. If you are changing
the subject you should at least make it explicit.
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
On 5/23/23 13:58, Christoph Moench-Tegeder wrote:
## Ron (ronljohnsonjr@gmail.com):
We'd never hardlink. Eliminates the ability to return to the old
system if something goes wrong.
That's why you get yourself a recent XFS and use clone mode (still
sticks you to the same filesystem, but gets you up and running much
faster).
Sadly, our 9.6 systems (the ones we /really/ need to get off of) are RHEL
6.10 and ext4.
--
Born in Arizona, moved to Babylonia.
On 2023-May-23, Ron wrote:
We'd never hardlink. Eliminates the ability to return to the old system if
something goes wrong.
If you'd never hardlink, then you should run your test without the -k
option. Otherwise, the timings are meaningless.
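That is, time the default copy mode too -- the same command with -k
dropped (sketch):
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up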
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"La vida es para el que se aventura"