Trouble Upgrading Postgres

Started by Charles Martinover 7 years ago35 messagesgeneral
Jump to latest
#1Charles Martin
ssappeals@gmail.com

I'd be grateful for some help. I am trying to move a large database from
PostgreSQL 9.6 on Centos 6 to a different server using PostgreSQL 11 on
Centos 7. I can't do a pg_dump because it always fails on the largest
table. So tried to do pb_basebackup and copy that to the new PG 11 server.
Except that pg_upgrade expects the new and old versions of PG to be
side-by-side. So I installed 9.6 on the new server, ran initdb, verified
that it started, then stopped it and edited postgresql.conf data path to
the location of the pg_basebackup files. Then 9.6 would no longer start. So
how can I get my PG 9.6 data into a new PG 11 database?

Probably related to my troubles are my attempts to get replication set up.
But before I dive back into that, I thought I'd better try getting my 9.6
data into the new 9.6 server, then run PG 11's pg_upgrade and mount the
data in PG 11. Then maybe I can get replication started.

I've read that logical replication can be used to migrate from 9.6 to 11,
but haven't found any documentation on doing that.

Chuck Martin

#2Ron
ronljohnsonjr@gmail.com
In reply to: Charles Martin (#1)
Re: Trouble Upgrading Postgres

On 11/03/2018 02:57 PM, Charles Martin wrote:

I'd be grateful for some help. I am trying to move a large database from
PostgreSQL 9.6 on Centos 6 to a different server using PostgreSQL 11 on
Centos 7. I can't do a pg_dump because it always fails on the largest table.

What error message?

--
Angular momentum makes the world go 'round.

#3Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Charles Martin (#1)
Re: Trouble Upgrading Postgres

On 11/3/18 12:57 PM, Charles Martin wrote:

I'd be grateful for some help. I am trying to move a large database from
PostgreSQL 9.6 on Centos 6 to a different server using PostgreSQL 11 on
Centos 7. I can't do a pg_dump because it always fails on the largest
table.

I would answer Ron's question on this first as solving it would be the
easiest fix.

So tried to do pb_basebackup and copy that to the new PG 11
server. Except that pg_upgrade expects the new and old versions of PG to
be side-by-side. So I installed 9.6 on the new server, ran initdb,

The is probably the issue, you now have two 9.6 data directory
instances, the one you created with initdb and the one that came over
with pg_basebackup. I am guessing the editing below has left the server
in a confused state about which directory to use. The error messages you
got when trying to restart the server would be helpful.

verified that it started, then stopped it and edited postgresql.conf
data path to the location of the pg_basebackup files. Then 9.6 would no
longer start. So how can I get my PG 9.6 data into a new PG 11 database?

Probably related to my troubles are my attempts to get replication set
up. But before I dive back into that, I thought I'd better try getting
my 9.6 data into the new 9.6 server, then run PG 11's pg_upgrade and
mount the data in PG 11. Then maybe I can get replication started.

I've read that logical replication can be used to migrate from 9.6 to
11, but haven't found any documentation on doing that.

Chuck Martin

--
Adrian Klaver
adrian.klaver@aklaver.com

#4Charles Martin
ssappeals@gmail.com
In reply to: Charles Martin (#1)
Re: Trouble Upgrading Postgres

When I do a pg_dump using PG 9.6, I got this:

pg_dump: Dumping the contents of table "docfile" failed: PQgetCopyData()
failed.

pg_dump: Error message from server: server closed the connection
unexpectedly

This probably means the server terminated abnormally

before or while processing the request.

pg_dump: The command was: COPY public.docfile (docfile_pkey,
docfileoriginalname, ordernumber, versionnum, docfilecontents, docfilepath,
docfileextension, enddatetime, endby, editnum, insby, insdatetime, modby,
moddatetime, active, doc_fkey) TO stdout;

I've looked and been unable to find where Centos 7, or Postgres 9.6, stores
the path to the config/data directory outside the data/postgresql.conf
file. But I agree there must be something somewhere.

Chuck

On Sat, Nov 3, 2018 at 6:06 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

Show quoted text

On 11/3/18 2:56 PM, Charles Martin wrote:

Please reply to list also.
Ccing list.

Yes, if I could get pg_dump to work, I think I'd be fine. Maybe. But it
doesn't.

Post the error you got to the list and we maybe able to help.
Also which version of Postgres where you using to take the dump?

I agree that I've confused Postgres, but I don't know how to resolve the
confusion. It is complicated by the fact that my original Centos 7
install included Postgres 9.2, so those files are hanging around, along
with 9.6 and 11.

I posted the error messages I got when postgresql.conf had the data
directory set to my basebackup data:

*postgresql-9.6.service: main process exited, code=exited,

status=1/FAILURE*

*
*

Not very helpful.

systemctl status postgresql-9.6 provided a bit more info:

*●*postgresql-9.6.service - PostgreSQL 9.6 database server

Loaded: loaded (/usr/lib/systemd/system/postgresql-9.6.service;
disabled; vendor preset: disabled)

Active: *failed*(Result: exit-code) since Sat 2018-11-03 15:05:30
EDT; 15s ago

Docs: https://www.postgresql.org/docs/9.6/static/

Process: 32570 ExecStart=/usr/pgsql-9.6/bin/postmaster -D ${PGDATA}
*(code=exited, status=1/FAILURE)*

Process: 32563
ExecStartPre=/usr/pgsql-9.6/bin/postgresql96-check-db-dir ${PGDATA}
(code=exited, status=0/SUCCESS)

Main PID: 32570 (code=exited, status=1/FAILURE)

Yet this went away, and PG 9.6 started, when I changed postgresql.conf
to point to the new (empty) data directory, which is confusing.

No not confusing. Not that familiar with RPM packaging as I am with the
Debian/Ubunto packaging. Still if I remember correctly it also allows
multiple instances of Postgres to run. To do that it has its own system
of tracking the data directories. Where you created the new data
directory is obviously where the package scripts expect to find it. The
pg_basebackup directory is not.

Chuck

On Sat, Nov 3, 2018 at 5:17 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 11/3/18 12:57 PM, Charles Martin wrote:

I'd be grateful for some help. I am trying to move a large

database from

PostgreSQL 9.6 on Centos 6 to a different server using PostgreSQL

11 on

Centos 7. I can't do a pg_dump because it always fails on the

largest

table.

I would answer Ron's question on this first as solving it would be

the

easiest fix.

So tried to do pb_basebackup and copy that to the new PG 11
server. Except that pg_upgrade expects the new and old versions

of PG to

be side-by-side. So I installed 9.6 on the new server, ran initdb,

The is probably the issue, you now have two 9.6 data directory
instances, the one you created with initdb and the one that came over
with pg_basebackup. I am guessing the editing below has left the

server

in a confused state about which directory to use. The error messages
you
got when trying to restart the server would be helpful.

verified that it started, then stopped it and edited

postgresql.conf

data path to the location of the pg_basebackup files. Then 9.6

would no

longer start. So how can I get my PG 9.6 data into a new PG 11

database?

Probably related to my troubles are my attempts to get

replication set

up. But before I dive back into that, I thought I'd better try

getting

my 9.6 data into the new 9.6 server, then run PG 11's pg_upgrade

and

mount the data in PG 11. Then maybe I can get replication started.

I've read that logical replication can be used to migrate from

9.6 to

11, but haven't found any documentation on doing that.

Chuck Martin

--
Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

--
Adrian Klaver
adrian.klaver@aklaver.com

#5Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Charles Martin (#4)
Re: Trouble Upgrading Postgres

On 11/3/18 3:47 PM, Charles Martin wrote:

When I do a pg_dump using PG 9.6, I got this:

pg_dump: Dumping the contents of table "docfile" failed:
PQgetCopyData() failed.

pg_dump: Error message from server: server closed the connection
unexpectedly

Is this error the client reporting?

Is this the same that is showing up in the server log?

This probably means the server terminated abnormally

So where is the server located relative to the pg_dump client?

On the same machine?

If so is it a virtual machine e.g AWS?

Across a local or remote network?

before or while processing the request.

pg_dump: The command was: COPY public.docfile (docfile_pkey,
docfileoriginalname, ordernumber, versionnum, docfilecontents,
docfilepath, docfileextension, enddatetime, endby, editnum, insby,
insdatetime, modby, moddatetime, active, doc_fkey) TO stdout;

I've looked and been unable to find where Centos 7, or Postgres 9.6,
stores the path to the config/data directory outside the
data/postgresql.conf file. But I agree there must be something somewhere.

Chuck

On Sat, Nov 3, 2018 at 6:06 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 11/3/18 2:56 PM, Charles Martin wrote:

Please reply to list also.
Ccing list.

Yes, if I could get pg_dump to work, I think I'd be fine. Maybe.

But it

doesn't.

Post the error you got to the list and we maybe able to help.
Also which version of Postgres where you using to take the dump?

I agree that I've confused Postgres, but I don't know how to

resolve the

confusion. It is complicated by the fact that my original Centos 7
install included Postgres 9.2, so those files are hanging around,

along

with 9.6 and 11.

I posted the error messages I got when postgresql.conf had the data
directory set to my basebackup data:

*postgresql-9.6.service: main process exited, code=exited,

status=1/FAILURE*

*
*

Not very helpful.

systemctl status postgresql-9.6 provided a bit more info:

*●*postgresql-9.6.service - PostgreSQL 9.6 database server

     Loaded: loaded (/usr/lib/systemd/system/postgresql-9.6.service;
disabled; vendor preset: disabled)

     Active: *failed*(Result: exit-code) since Sat 2018-11-03

15:05:30

EDT; 15s ago

       Docs: https://www.postgresql.org/docs/9.6/static/

    Process: 32570 ExecStart=/usr/pgsql-9.6/bin/postmaster -D

${PGDATA}

*(code=exited, status=1/FAILURE)*

    Process: 32563
ExecStartPre=/usr/pgsql-9.6/bin/postgresql96-check-db-dir ${PGDATA}
(code=exited, status=0/SUCCESS)

   Main PID: 32570 (code=exited, status=1/FAILURE)

Yet this went away, and PG 9.6 started, when I changed

postgresql.conf

to point to the new (empty) data directory, which is confusing.

No not confusing. Not that familiar  with RPM packaging as I am with
the
Debian/Ubunto packaging. Still if I remember correctly it also allows
multiple instances of Postgres to run. To do that it has its own system
of tracking the data directories. Where you created the new data
directory is obviously where the package scripts expect to find it.
The
pg_basebackup directory is not.

Chuck

On Sat, Nov 3, 2018 at 5:17 PM Adrian Klaver

<adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

<mailto:adrian.klaver@aklaver.com

<mailto:adrian.klaver@aklaver.com>>> wrote:

     On 11/3/18 12:57 PM, Charles Martin wrote:
      > I'd be grateful for some help. I am trying to move a large
     database from
      > PostgreSQL 9.6 on Centos 6 to a different server using

PostgreSQL

     11 on
      > Centos 7. I can't do a pg_dump because it always fails on the
     largest
      > table.

     I would answer Ron's question on this first as solving it

would be the

     easiest fix.

      >So tried to do pb_basebackup and copy that to the new PG 11
      > server. Except that pg_upgrade expects the new and old

versions

     of PG to
      > be side-by-side. So I installed 9.6 on the new server, ran

initdb,

     The is probably the issue, you now have two 9.6 data directory
     instances, the one you created with initdb and the one that

came over

     with pg_basebackup. I am guessing the editing below has left

the server

     in a confused state about which directory to use. The error

messages

     you
     got when trying to restart the server would be helpful.

      > verified that it started, then stopped it and edited

postgresql.conf

      > data path to the location of the pg_basebackup files. Then 9.6
     would no
      > longer start. So how can I get my PG 9.6 data into a new PG 11
     database?
      >
      > Probably related to my troubles are my attempts to get
     replication set
      > up. But before I dive back into that, I thought I'd better try
     getting
      > my 9.6 data into the new 9.6 server, then run PG 11's

pg_upgrade and

      > mount the data in PG 11. Then maybe I can get replication

started.

      >
      > I've read that logical replication can be used to migrate from
     9.6 to
      > 11, but haven't found any documentation on doing that.
      >
      > Chuck Martin

     --
     Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

<mailto:adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>>

--
Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

--
Adrian Klaver
adrian.klaver@aklaver.com

#6Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Charles Martin (#4)
Re: Trouble Upgrading Postgres

Charles Martin wrote:

When I do a pg_dump using PG 9.6, I got this:

pg_dump: Dumping the contents of table "docfile" failed: PQgetCopyData() failed.
pg_dump: Error message from server: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
pg_dump: The command was: COPY public.docfile (docfile_pkey, docfileoriginalname,
ordernumber, versionnum, docfilecontents, docfilepath, docfileextension, enddatetime,
endby, editnum, insby, insdatetime, modby, moddatetime, active, doc_fkey) TO stdout;

You probably have a corrupted database.

You should get that fixed first, then you can upgrade.

Maybe you should hire a professional for that.

The other alternative is that you hava a buggy extension installed
that causes PostgreSQL to crash.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#7Andreas Kretschmer
andreas@a-kretschmer.de
In reply to: Charles Martin (#4)
Re: Trouble Upgrading Postgres

Am 03.11.2018 um 23:47 schrieb Charles Martin:

When I do a pg_dump using PG 9.6, I got this:

pg_dump: Dumping the contents of table "docfile" failed:
PQgetCopyData() failed.

pg_dump: Error message from server: server closed the connection
unexpectedly

This probably means the server terminated abnormally

before or while processing the request.

pg_dump: The command was: COPY public.docfile (docfile_pkey,
docfileoriginalname, ordernumber, versionnum, docfilecontents,
docfilepath, docfileextension, enddatetime, endby, editnum, insby,
insdatetime, modby, moddatetime, active, doc_fkey) TO stdout;

which exact minor version please?

Regards, Andreas
--

2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com

#8Charles Martin
ssappeals@gmail.com
In reply to: Adrian Klaver (#5)
Re: Trouble Upgrading Postgres

Adtrian said:

pg_dump: Error message from server: server closed the connection
unexpectedly

Is this error the client reporting?
Is this the same that is showing up in the server log?

Yes, that's the client message, i.e. what appeared in the terminal window
that gave the command. The server log shows:

2018-11-02 20:30:46 EDT [20405]: [4-1] user=,db= LOG: server process (PID
30438) was terminated by signal 9: Killed

2018-11-02 20:30:46 EDT [20405]: [5-1] user=,db= DETAIL: Failed process
was running: COPY public.docfile (docfile_pkey, docfileoriginalname,
ordernumber, versionnum, docfilecontents, docfilepath, d$

2018-11-02 20:30:46 EDT [20405]: [6-1] user=,db= LOG: terminating any
other active server processes

2018-11-02 20:30:46 EDT [20415]: [10-1] user=,db= WARNING: terminating
connection because of crash of another server process

2018-11-02 20:30:46 EDT [20415]: [11-1] user=,db= DETAIL: The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnor$

So where is the server located relative to the pg_dump client?
On the same machine?
If so is it a virtual machine e.g AWS?
Across a local or remote network?

I gave the command in a terminal session after SSHing to the server from
the same network. It is not a virtual machine.

Lsaurenz said:

You probably have a corrupted database.
You should get that fixed first, then you can upgrade.
Maybe you should hire a professional for that.

I suspect this is is correct, both that there is corruption in the table
and that I need a professional to help. If someone here is available, I'm
interested.

Andreas said:

which exact minor version please?

PostgreSQL 9.6.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
20120313 (Red Hat 4.4.7-23), 64-bit

#9Andreas Kretschmer
andreas@a-kretschmer.de
In reply to: Charles Martin (#8)
Re: Trouble Upgrading Postgres

Am 04.11.2018 um 17:38 schrieb Charles Martin:

Andreas said:

which exact minor version please?

PostgreSQL 9.6.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
20120313 (Red Hat 4.4.7-23), 64-bit

okay, i asked to just rule out a stale version. 9.6.7, for instance,
contains some importand bug fixes.

Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com

#10Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Charles Martin (#8)
Re: Trouble Upgrading Postgres

On 11/4/18 8:38 AM, Charles Martin wrote:

Adtrian said:

pg_dump: Error message from server: server closed the connection
unexpectedly

Is this error the client reporting?
Is this the same that is showing up in the server log?

Yes, that's the client message, i.e. what appeared in the terminal
window that gave the command. The server log shows:

2018-11-02 20:30:46 EDT [20405]: [4-1] user=,db= LOG:  server process
(PID 30438) was terminated by signal 9: Killed

2018-11-02 20:30:46 EDT [20405]: [5-1] user=,db= DETAIL:  Failed process
was running: COPY public.docfile (docfile_pkey, docfileoriginalname,
ordernumber, versionnum, docfilecontents, docfilepath, d$

2018-11-02 20:30:46 EDT [20405]: [6-1] user=,db= LOG:  terminating any
other active server processes

2018-11-02 20:30:46 EDT [20415]: [10-1] user=,db= WARNING:  terminating
connection because of crash of another server process

2018-11-02 20:30:46 EDT [20415]: [11-1] user=,db= DETAIL:  The
postmaster has commanded this server process to roll back the current
transaction and exit, because another server process exited abnor$

So where is the server located relative to the pg_dump client?
On the same machine?
If so is it a virtual machine e.g AWS?
Across a local or remote network?

 I gave the command in a terminal session after SSHing to the server
from the same network. It is not a virtual machine.

Lsaurenz said:

You probably have a corrupted database.
You should get that fixed first, then you can upgrade.
Maybe you should hire a professional for that.

I suspect this is is correct, both that there is corruption in the table
and that I need a professional to help. If someone here is available,
I'm interested.

Given that this involves your largest table I would confirm that the
signal 9 kill was not coming from the system OOM killer. Take a look at
the system logs to see what they show over the same time period.

Andreas said:

which exact minor version please?

PostgreSQL 9.6.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
20120313 (Red Hat 4.4.7-23), 64-bit

--
Adrian Klaver
adrian.klaver@aklaver.com

#11Charles Martin
ssappeals@gmail.com
In reply to: Adrian Klaver (#10)
Re: Trouble Upgrading Postgres

Yep, you called it:

Nov 2 20:30:45 localhost kernel: Out of memory: Kill process 30438
(postmaster) score 709 or sacrifice child
Nov 2 20:30:45 localhost kernel: Killed process 30438, UID 26,
(postmaster) total-vm:3068900kB, anon-rss:1695392kB, file-rss:1074692kB

So it's running out of memory when trying to dump this table. The "old"
server has 4GB of ram, the "new" server 20GB.

On Sun, Nov 4, 2018 at 3:13 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

Show quoted text

On 11/4/18 8:38 AM, Charles Martin wrote:

Adtrian said:

pg_dump: Error message from server: server closed the connection
unexpectedly

Is this error the client reporting?
Is this the same that is showing up in the server log?

Yes, that's the client message, i.e. what appeared in the terminal
window that gave the command. The server log shows:

2018-11-02 20:30:46 EDT [20405]: [4-1] user=,db= LOG: server process
(PID 30438) was terminated by signal 9: Killed

2018-11-02 20:30:46 EDT [20405]: [5-1] user=,db= DETAIL: Failed process
was running: COPY public.docfile (docfile_pkey, docfileoriginalname,
ordernumber, versionnum, docfilecontents, docfilepath, d$

2018-11-02 20:30:46 EDT [20405]: [6-1] user=,db= LOG: terminating any
other active server processes

2018-11-02 20:30:46 EDT [20415]: [10-1] user=,db= WARNING: terminating
connection because of crash of another server process

2018-11-02 20:30:46 EDT [20415]: [11-1] user=,db= DETAIL: The
postmaster has commanded this server process to roll back the current
transaction and exit, because another server process exited abnor$

So where is the server located relative to the pg_dump client?
On the same machine?
If so is it a virtual machine e.g AWS?
Across a local or remote network?

I gave the command in a terminal session after SSHing to the server
from the same network. It is not a virtual machine.

Lsaurenz said:

You probably have a corrupted database.
You should get that fixed first, then you can upgrade.
Maybe you should hire a professional for that.

I suspect this is is correct, both that there is corruption in the table
and that I need a professional to help. If someone here is available,
I'm interested.

Given that this involves your largest table I would confirm that the
signal 9 kill was not coming from the system OOM killer. Take a look at
the system logs to see what they show over the same time period.

Andreas said:

which exact minor version please?

PostgreSQL 9.6.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
20120313 (Red Hat 4.4.7-23), 64-bit

--
Adrian Klaver
adrian.klaver@aklaver.com

#12Ron
ronljohnsonjr@gmail.com
In reply to: Charles Martin (#11)
Re: Trouble Upgrading Postgres

Not enough swap space?

On 11/04/2018 04:55 PM, Charles Martin wrote:

Yep, you called it:

Nov  2 20:30:45 localhost kernel: Out of memory: Kill process 30438
(postmaster) score 709 or sacrifice child
Nov  2 20:30:45 localhost kernel: Killed process 30438, UID 26,
(postmaster) total-vm:3068900kB, anon-rss:1695392kB, file-rss:1074692kB

So it's running out of memory when trying to dump this table. The "old"
server has 4GB of ram, the "new" server 20GB.

On Sun, Nov 4, 2018 at 3:13 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 11/4/18 8:38 AM, Charles Martin wrote:

Adtrian said:

pg_dump: Error message from server: server closed the connection
unexpectedly

  >Is this error the client reporting?
  >Is this the same that is showing up in the server log?

Yes, that's the client message, i.e. what appeared in the terminal
window that gave the command. The server log shows:

2018-11-02 20:30:46 EDT [20405]: [4-1] user=,db= LOG: server process
(PID 30438) was terminated by signal 9: Killed

2018-11-02 20:30:46 EDT [20405]: [5-1] user=,db= DETAIL: Failed process
was running: COPY public.docfile (docfile_pkey, docfileoriginalname,
ordernumber, versionnum, docfilecontents, docfilepath, d$

2018-11-02 20:30:46 EDT [20405]: [6-1] user=,db= LOG: terminating any
other active server processes

2018-11-02 20:30:46 EDT [20415]: [10-1] user=,db= WARNING:  terminating
connection because of crash of another server process

2018-11-02 20:30:46 EDT [20415]: [11-1] user=,db= DETAIL:  The
postmaster has commanded this server process to roll back the current
transaction and exit, because another server process exited abnor$

So where is the server located relative to the pg_dump client?
On the same machine?
If so is it a virtual machine e.g AWS?
Across a local or remote network?

   I gave the command in a terminal session after SSHing to the server
from the same network. It is not a virtual machine.

Lsaurenz said:

You probably have a corrupted database.
You should get that fixed first, then you can upgrade.
Maybe you should hire a professional for that.

I suspect this is is correct, both that there is corruption in the

table

and that I need a professional to help. If someone here is available,
I'm interested.

Given that this involves your largest table I would confirm that the
signal 9 kill was not coming from the system OOM killer. Take a look at
the system logs to see what they show over the same time period.

Andreas said:

  >which exact minor version please?

PostgreSQL 9.6.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
20120313 (Red Hat 4.4.7-23), 64-bit

--
Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

--
Angular momentum makes the world go 'round.

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Charles Martin (#11)
Re: Trouble Upgrading Postgres

Charles Martin <ssappeals@gmail.com> writes:

Yep, you called it:
Nov 2 20:30:45 localhost kernel: Out of memory: Kill process 30438
(postmaster) score 709 or sacrifice child
Nov 2 20:30:45 localhost kernel: Killed process 30438, UID 26,
(postmaster) total-vm:3068900kB, anon-rss:1695392kB, file-rss:1074692kB

So it's running out of memory when trying to dump this table. The "old"
server has 4GB of ram, the "new" server 20GB.

That's kind of odd: a COPY shouldn't really consume very much working
memory. I suspect that much of the process's apparent VM consumption may
be shared buffers ... what have you got shared_buffers set to on the old
server? If it's more than half a GB or so, maybe reducing it would help.

regards, tom lane

#14Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Charles Martin (#11)
Re: Trouble Upgrading Postgres

On 11/4/18 2:55 PM, Charles Martin wrote:

Yep, you called it:

Nov  2 20:30:45 localhost kernel: Out of memory: Kill process 30438
(postmaster) score 709 or sacrifice child
Nov  2 20:30:45 localhost kernel: Killed process 30438, UID 26,
(postmaster) total-vm:3068900kB, anon-rss:1695392kB, file-rss:1074692kB

So it's running out of memory when trying to dump this table. The "old"
server has 4GB of ram, the "new" server 20GB.

In addition to the other suggestions, what is the exact pg_dump command
you are using?

--
Adrian Klaver
adrian.klaver@aklaver.com

#15Charles Martin
ssappeals@gmail.com
In reply to: Adrian Klaver (#14)
Re: Trouble Upgrading Postgres

Tom said:

That's kind of odd: a COPY shouldn't really consume very much working
memory. I suspect that much of the process's apparent VM consumption may
be shared buffers ... what have you got shared_buffers set to on the old
server? If it's more than half a GB or so, maybe reducing it would help.

This is not a VM, but hardware.

Adrian said:

In addition to the other suggestions, what is the exact pg_dump command
you are using?

The last time:

[postgres@mandj tmp]$ pg_dump martinandjones >
/mnt/4tbB/pgbackup/2018-11-02/mandj.bak

On Sun, Nov 4, 2018 at 8:16 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

Show quoted text

On 11/4/18 2:55 PM, Charles Martin wrote:

Yep, you called it:

Nov 2 20:30:45 localhost kernel: Out of memory: Kill process 30438
(postmaster) score 709 or sacrifice child
Nov 2 20:30:45 localhost kernel: Killed process 30438, UID 26,
(postmaster) total-vm:3068900kB, anon-rss:1695392kB, file-rss:1074692kB

So it's running out of memory when trying to dump this table. The "old"
server has 4GB of ram, the "new" server 20GB.

In addition to the other suggestions, what is the exact pg_dump command
you are using?

--
Adrian Klaver
adrian.klaver@aklaver.com

#16Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Charles Martin (#15)
Re: Trouble Upgrading Postgres

On 11/5/18 5:56 AM, Charles Martin wrote:

Tom said:

That's kind of odd: a COPY shouldn't really consume very much working
memory.  I suspect that much of the process's apparent VM consumption may
be shared buffers ... what have you got shared_buffers set to on the old
server?  If it's more than half a GB or so, maybe reducing it would help.

This is not a VM, but hardware.

Tom was referring to this from your previous post:

(postmaster) total-vm:3068900kB,

where vm(VM) is Virtual Memory:

https://stackoverflow.com/questions/18845857/what-does-anon-rss-and-total-vm-mean#22326766

So what is your shared_buffers:

https://www.postgresql.org/docs/10/static/runtime-config-resource.html

set to?

Adrian said:

In addition to the other suggestions, what is the exact pg_dump command
you are using?

The last time:

[postgres@mandj tmp]$ pg_dump martinandjones >
/mnt/4tbB/pgbackup/2018-11-02/mandj.bak

On Sun, Nov 4, 2018 at 8:16 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 11/4/18 2:55 PM, Charles Martin wrote:

Yep, you called it:

Nov  2 20:30:45 localhost kernel: Out of memory: Kill process 30438
(postmaster) score 709 or sacrifice child
Nov  2 20:30:45 localhost kernel: Killed process 30438, UID 26,
(postmaster) total-vm:3068900kB, anon-rss:1695392kB,

file-rss:1074692kB

So it's running out of memory when trying to dump this table. The

"old"

server has 4GB of ram, the "new" server 20GB.

In addition to the other suggestions, what is the exact pg_dump command
you are using?

--
Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

--
Adrian Klaver
adrian.klaver@aklaver.com

#17Daniel Verite
daniel@manitou-mail.org
In reply to: Charles Martin (#8)
Re: Trouble Upgrading Postgres

Charles Martin wrote:

So where is the server located relative to the pg_dump client?
On the same machine?
If so is it a virtual machine e.g AWS?
Across a local or remote network?

I gave the command in a terminal session after SSHing to the server from
the same network. It is not a virtual machine.

That means that pg_dump runs on the same machine as the server.

It's plausible that, with only 4GB of RAM, the table that fails
to dump has some very large rows that can't be allocated, especially
since both the backend and pg_dump need to have it simultaneously
in memory.

pg_dump: The command was: COPY public.docfile (docfile_pkey,
docfileoriginalname, ordernumber, versionnum, docfilecontents,
docfilepath, docfileextension, enddatetime, endby, editnum, insby,
insdatetime, modby, moddatetime, active, doc_fkey) TO stdout;

The "docfilecontents" column suggests that it might contain
large contents. If its type is bytea, it's going to be expanded
to twice its size to build the hex representation.

You may get a sense on how big is the biggest row expressed
as text with this query:
SELECT max(length(contents.*::text)) FROM public.docfile;

If it's big enough that it might cause the OOM issue,
try to run pg_dump remotely through an SSH tunnel [1]https://www.postgresql.org/docs/current/static/ssh-tunnels.html, which you
can already do in terms of network permissions since you log in with
SSH, so pg_dump itself does not use any memory on the server.

Also, if the machine doesn't have swap space, it might be
that just adding a few GB's of swap would make the operation
succeed.

[1]: https://www.postgresql.org/docs/current/static/ssh-tunnels.html

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#18Charles Martin
ssappeals@gmail.com
In reply to: Adrian Klaver (#16)
Re: Trouble Upgrading Postgres

Adrian said:

Tom was referring to this from your previous post:
(postmaster) total-vm:3068900kB,
where vm(VM) is Virtual Memory:

https://stackoverflow.com/questions/18845857/what-does-anon-rss-and-total-vm-mean#22326766

So what is your shared_buffers:
https://www.postgresql.org/docs/10/static/runtime-config-resource.html
set to?

Ok, thanks for explaining this. Here is the current value:
"shared_buffers" "131072" "8kB"

On Mon, Nov 5, 2018 at 9:06 AM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

Show quoted text

On 11/5/18 5:56 AM, Charles Martin wrote:

Tom said:

That's kind of odd: a COPY shouldn't really consume very much working
memory. I suspect that much of the process's apparent VM consumption

may

be shared buffers ... what have you got shared_buffers set to on the

old

server? If it's more than half a GB or so, maybe reducing it would

help.

This is not a VM, but hardware.

Tom was referring to this from your previous post:

(postmaster) total-vm:3068900kB,

where vm(VM) is Virtual Memory:

https://stackoverflow.com/questions/18845857/what-does-anon-rss-and-total-vm-mean#22326766

So what is your shared_buffers:

https://www.postgresql.org/docs/10/static/runtime-config-resource.html

set to?

Adrian said:

In addition to the other suggestions, what is the exact pg_dump command
you are using?

The last time:

[postgres@mandj tmp]$ pg_dump martinandjones >
/mnt/4tbB/pgbackup/2018-11-02/mandj.bak

On Sun, Nov 4, 2018 at 8:16 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 11/4/18 2:55 PM, Charles Martin wrote:

Yep, you called it:

Nov 2 20:30:45 localhost kernel: Out of memory: Kill process

30438

(postmaster) score 709 or sacrifice child
Nov 2 20:30:45 localhost kernel: Killed process 30438, UID 26,
(postmaster) total-vm:3068900kB, anon-rss:1695392kB,

file-rss:1074692kB

So it's running out of memory when trying to dump this table. The

"old"

server has 4GB of ram, the "new" server 20GB.

In addition to the other suggestions, what is the exact pg_dump

command

you are using?

--
Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

--
Adrian Klaver
adrian.klaver@aklaver.com

#19Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Charles Martin (#18)
Re: Trouble Upgrading Postgres

On 11/5/18 7:04 AM, Charles Martin wrote:

Adrian said:

Tom was referring to this from your previous post:
(postmaster) total-vm:3068900kB,
where vm(VM) is Virtual Memory:
https://stackoverflow.com/questions/18845857/what-does-anon-rss-and-total-vm-mean#22326766
So what is your shared_buffers:
https://www.postgresql.org/docs/10/static/runtime-config-resource.html
set to?

Ok, thanks for explaining this. Here is the current value:
"shared_buffers""131072""8kB"

It should be a single value something like this for the default:

shared_buffers = 128MB

--
Adrian Klaver
adrian.klaver@aklaver.com

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Charles Martin (#18)
Re: Trouble Upgrading Postgres

Charles Martin <ssappeals@gmail.com> writes:

Ok, thanks for explaining this. Here is the current value:
"shared_buffers" "131072" "8kB"

Well, that's 1GB, which might be ambitious inside a VM with a hard
restriction to 4GB total RAM. Postgres can get by with a *lot* less.
Try knocking it down to a tenth of that and see if it makes a difference.

regards, tom lane

#21Charles Martin
ssappeals@gmail.com
In reply to: Adrian Klaver (#19)
#22Ron
ronljohnsonjr@gmail.com
In reply to: Charles Martin (#21)
#23Charles Martin
ssappeals@gmail.com
In reply to: Ron (#22)
#24Ron
ronljohnsonjr@gmail.com
In reply to: Charles Martin (#23)
#25Daniel Verite
daniel@manitou-mail.org
In reply to: Charles Martin (#21)
#26Charles Martin
ssappeals@gmail.com
In reply to: Daniel Verite (#25)
#27Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Charles Martin (#21)
#28Daniel Verite
daniel@manitou-mail.org
In reply to: Charles Martin (#26)
#29Charles Martin
ssappeals@gmail.com
In reply to: Daniel Verite (#28)
#30Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Daniel Verite (#28)
#31Daniel Verite
daniel@manitou-mail.org
In reply to: Adrian Klaver (#30)
#32Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Daniel Verite (#31)
#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Adrian Klaver (#32)
#34Noname
bend@linux4ms.net
In reply to: Tom Lane (#33)
#35Charles Martin
ssappeals@gmail.com
In reply to: Noname (#34)