pg_upgrade with -j
Hello!
We are moving from 10 to 15 and are in testing now.
Our development database is about 1400G and takes 12 minutes to complete
a pg_upgrade with the -k (hard-links) version. This is on a CentOS 7
server with 80 cores.
Adding -j 40 to use half of those cores also finishes in 12 minutes and
ps / top / htop never show more than a single process at a time in use.
Bumping that to -j 80 to use them all also finishes in 12 minutes and
still only a single process.
Running the suggested vacuum analyze after pg_upgrade completes takes
about 19 minutes. Adding -j 40 takes that time down to around 5
minutes, jumps the server load up over 30 and htop shows 40 processes.
If -j 40 helps there--why not with pg_upgrade?
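For context, the analyze step is just the stock vacuumdb client, whose
-j option parallelizes across tables within each database -- roughly the
line below (a sketch; the exact flags are an assumption, not a copy of
our command):
time /usr/pgsql-15/bin/vacuumdb --all --analyze -j 40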
The full commands we are using for pg_upgrade are pretty stock:
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 40
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 80
Our production database is closer to 1900G. If we're looking at a 30
minute pg_upgrade window we'll be okay, but if there is anything we can
do to knock that time down we will, and any suggestions to do so would be
greatly appreciated.
Jeff Ross
On 5/22/23 16:20, Jeff Ross wrote:
Hello!
We are moving from 10 to 15 and are in testing now.
Our development database is about 1400G and takes 12 minutes to complete
a pg_upgrade with the -k (hard-links) version. This is on a CentOS 7
server with 80 cores.
Adding -j 40 to use half of those cores also finishes in 12 minutes and
ps / top / htop never show more than a single process at a time in use.
Bumping that to -j 80 to use them all also finishes in 12 minutes and
still only a single process.
Running the suggested vacuum analyze after pg_upgrade completes takes
about 19 minutes. Adding -j 40 takes that time down to around 5
minutes, jumps the server load up over 30 and htop shows 40 processes.
If -j 40 helps there--why not with pg_upgrade?
From docs:
https://www.postgresql.org/docs/current/pgupgrade.html
The --jobs option allows multiple CPU cores to be used for
copying/linking of files and to dump and restore database schemas in
parallel; a good place to start is the maximum of the number of CPU
cores and tablespaces. This option can dramatically reduce the time to
upgrade a multi-database server running on a multiprocessor machine.
So is the 1400G mostly in one database in the cluster?
The full commands we are using for pg_upgrade are pretty stock:
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 40
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 80
Our production database is closer to 1900G. If we're looking at a 30
minute pg_upgrade window we'll be okay, but if there is anything we can
do to knock that time down we will, and any suggestions to do so would be
greatly appreciated.
Jeff Ross
--
Adrian Klaver
adrian.klaver@aklaver.com
On 5/22/23 5:24 PM, Adrian Klaver wrote:
On 5/22/23 16:20, Jeff Ross wrote:
Hello!
We are moving from 10 to 15 and are in testing now.
Our development database is about 1400G and takes 12 minutes to
complete a pg_upgrade with the -k (hard-links) version. This is on a
CentOS 7 server with 80 cores.
Adding -j 40 to use half of those cores also finishes in 12 minutes
and ps / top / htop never show more than a single process at a time in
use.
Bumping that to -j 80 to use them all also finishes in 12 minutes and
still only a single process.
Running the suggested vacuum analyze after pg_upgrade completes takes
about 19 minutes. Adding -j 40 takes that time down to around 5
minutes, jumps the server load up over 30 and htop shows 40 processes.
If -j 40 helps there--why not with pg_upgrade?
From docs:
https://www.postgresql.org/docs/current/pgupgrade.html
The --jobs option allows multiple CPU cores to be used for
copying/linking of files and to dump and restore database schemas in
parallel; a good place to start is the maximum of the number of CPU
cores and tablespaces. This option can dramatically reduce the time to
upgrade a multi-database server running on a multiprocessor machine.
So is the 1400G mostly in one database in the cluster?
The full commands we are using for pg_upgrade are pretty stock:
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up
-k -j 40
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B
/usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up
-k -j 80
Our production database is closer to 1900G. If we're looking at a 30
minute pg_upgrade window we'll be okay, but if there is anything we
can do to knock that time down we will, and any suggestions to do so
would be greatly appreciated.
Jeff Ross
Yes, one big database with about 80 schemas and several other smaller
databases so -j should help, right?
Jeff
Jeff Ross <jross@openvistas.net> writes:
On 5/22/23 5:24 PM, Adrian Klaver wrote:
So is the 1400G mostly in one database in the cluster?
Yes, one big database with about 80 schemas and several other smaller
databases so -j should help, right?
AFAICT from a quick look at the code, you won't get any meaningful
parallelism unless you have several large DBs and/or several large
tablespaces. It looks like the assumption was that issuing link()
requests in parallel wouldn't help much but just swamp your disk
if they're all on the same filesystem. Maybe that could use
rethinking, not sure.
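A quick way to check how the data is spread across databases is a
standard catalog query, e.g. (illustrative; connection flags omitted):
psql -c "SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database ORDER BY pg_database_size(datname) DESC;"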
regards, tom lane
On 5/22/23 16:29, Jeff Ross wrote:
On 5/22/23 5:24 PM, Adrian Klaver wrote:
On 5/22/23 16:20, Jeff Ross wrote:
Hello!
From docs:
https://www.postgresql.org/docs/current/pgupgrade.html
The --jobs option allows multiple CPU cores to be used for
copying/linking of files and to dump and restore database schemas in
parallel; a good place to start is the maximum of the number of CPU
cores and tablespaces. This option can dramatically reduce the time to
upgrade a multi-database server running on a multiprocessor machine.
So is the 1400G mostly in one database in the cluster?
The full commands we are using for pg_upgrade are pretty stock:
Yes, one big database with about 80 schemas and several other smaller
databases so -j should help, right?
As I understand it, no. The parallelism is between databases, not
within a database. Further, 'database schemas' refers to the schema in
the sense of the overall database object definitions, not the
namespaces known as schemas in the database.
Jeff
--
Adrian Klaver
adrian.klaver@aklaver.com
On 5/22/23 18:42, Tom Lane wrote:
Jeff Ross <jross@openvistas.net> writes:
On 5/22/23 5:24 PM, Adrian Klaver wrote:
So is the 1400G mostly in one database in the cluster?
Yes, one big database with about 80 schemas and several other smaller
databases so -j should help, right?
AFAICT from a quick look at the code, you won't get any meaningful
parallelism unless you have several large DBs and/or several large
tablespaces.
Hmm. I'm glad I'm reading this now.
It looks like the assumption was that issuing link()
requests in parallel wouldn't help much but just swamp your disk
if they're all on the same filesystem.
Maybe that could use rethinking, not sure.
It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15 is
going to be on a different LUN from /var/lib/pgsql/9.6 just like
/var/lib/pgsql/backups is on a different LUN.
--
Born in Arizona, moved to Babylonia.
On 5/22/23 5:42 PM, Tom Lane wrote:
Jeff Ross <jross@openvistas.net> writes:
On 5/22/23 5:24 PM, Adrian Klaver wrote:
So is the 1400G mostly in one database in the cluster?
Yes, one big database with about 80 schemas and several other smaller
databases so -j should help, right?
AFAICT from a quick look at the code, you won't get any meaningful
parallelism unless you have several large DBs and/or several large
tablespaces. It looks like the assumption was that issuing link()
requests in parallel wouldn't help much but just swamp your disk
if they're all on the same filesystem. Maybe that could use
rethinking, not sure.
regards, tom lane
Thanks Tom. These are all smokingly fast SSDs so it would be
interesting to see how well they'd hold up under that load.
Jeff
On 5/22/23 5:43 PM, Adrian Klaver wrote:
From docs:
https://www.postgresql.org/docs/current/pgupgrade.html
The --jobs option allows multiple CPU cores to be used for
copying/linking of files and to dump and restore database schemas in
parallel; a good place to start is the maximum of the number of CPU
cores and tablespaces. This option can dramatically reduce the time
to upgrade a multi-database server running on a multiprocessor machine.
So is the 1400G mostly in one database in the cluster?
The full commands we are using for pg_upgrade are pretty stock:
Yes, one big database with about 80 schemas and several other smaller
databases so -j should help, right?
As I understand it, no. The parallelism is between databases, not
within a database. Further, 'database schemas' refers to the schema in
the sense of the overall database object definitions, not the
namespaces known as schemas in the database.
Thanks Adrian. That "restore database schemas in parallel" phrase seems
like it would be really easy to read the way we did and expect it to
work with one database and multiple schemas.
Maybe it should be changed to "restore multiple databases in parallel"
instead?
Jeff
On 2023-05-22 21:10:48 -0500, Ron wrote:
On 5/22/23 18:42, Tom Lane wrote:
It looks like the assumption was that issuing link()
requests in parallel wouldn't help much but just swamp your disk
if they're all on the same filesystem.
Maybe that could use rethinking, not sure.
It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15 is
going to be on a different LUN from /var/lib/pgsql/9.6
You can't hardlink between different file systems.
Even if you could assign single directories to specific LUNs (does any
file system allow this?) this would at best spread the updates across
two LUNs (the inodes would presumably stay on the source LUN and the
target directory would be on the target LUN).
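Easy to demonstrate (hypothetical paths, assuming /data and /backup are
separate mounts):
ln /data/somefile /backup/somefile   # fails with EXDEV ("Invalid cross-device link")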
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
On 5/23/23 12:19, Peter J. Holzer wrote:
On 2023-05-22 21:10:48 -0500, Ron wrote:
On 5/22/23 18:42, Tom Lane wrote:
It looks like the assumption was that issuing link()
requests in parallel wouldn't help much but just swamp your disk
if they're all on the same filesystem.
Maybe that could use rethinking, not sure.
It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15 is
going to be on a different LUN from /var/lib/pgsql/9.6
You can't hardlink between different file systems.
We'd never hardlink. Eliminates the ability to return to the old system if
something goes wrong.
--
Born in Arizona, moved to Babylonia.
## Ron (ronljohnsonjr@gmail.com):
We'd never hardlink. Eliminates the ability to return to the old
system if something goes wrong.
That's why you get yourself a recent XFS and use clone mode (still
sticks you to the same filesystem, but gets you up and running much
faster).
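That is pg_upgrade's --clone option (PostgreSQL 12 and later; needs a
filesystem with reflink support, e.g. XFS created with reflink=1) --
the same invocation as before with --clone in place of -k, along the
lines of:
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up --clone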
Regards,
Christoph
--
Spare Space
On 2023-05-23 13:17:24 -0500, Ron wrote:
On 5/23/23 12:19, Peter J. Holzer wrote:
On 2023-05-22 21:10:48 -0500, Ron wrote:
On 5/22/23 18:42, Tom Lane wrote:
It looks like the assumption was that issuing link()
^^^^^^
requests in parallel wouldn't help much but just swamp your disk
if they're all on the same filesystem.
Maybe that could use rethinking, not sure.
It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15 is
going to be on a different LUN from /var/lib/pgsql/9.6
You can't hardlink between different file systems.
We'd never hardlink.
But that was what Jeff and Tom were talking about. If you are changing
the subject you should at least make it explicit.
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
On 5/23/23 13:58, Christoph Moench-Tegeder wrote:
## Ron (ronljohnsonjr@gmail.com):
We'd never hardlink. Eliminates the ability to return to the old
system if something goes wrong.
That's why you get yourself a recent XFS and use clone mode (still
sticks you to the same filesystem, but gets you up and running much
faster).
Sadly, our 9.6 systems (the ones we /really/ need to get off of) are RHEL
6.10 and ext4.
--
Born in Arizona, moved to Babylonia.
On 2023-May-23, Ron wrote:
We'd never hardlink. Eliminates the ability to return to the old system if
something goes wrong.
If you'd never hardlink, then you should run your test without the -k
option. Otherwise, the timings are meaningless.
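That is, time the default copy mode too -- the same command with -k
dropped (sketch):
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up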
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"La vida es para el que se aventura"