Trouble Upgrading Postgres
I'd be grateful for some help. I am trying to move a large database from
PostgreSQL 9.6 on CentOS 6 to a different server using PostgreSQL 11 on
CentOS 7. I can't do a pg_dump because it always fails on the largest
table. So I tried pg_basebackup and copied that to the new PG 11 server.
Except that pg_upgrade expects the new and old versions of PG to be
side-by-side. So I installed 9.6 on the new server, ran initdb, verified
that it started, then stopped it and edited postgresql.conf data path to
the location of the pg_basebackup files. Then 9.6 would no longer start. So
how can I get my PG 9.6 data into a new PG 11 database?
Probably related to my troubles are my attempts to get replication set up.
But before I dive back into that, I thought I'd better try getting my 9.6
data into the new 9.6 server, then run PG 11's pg_upgrade and mount the
data in PG 11. Then maybe I can get replication started.
I've read that logical replication can be used to migrate from 9.6 to 11,
but haven't found any documentation on doing that.
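For what it's worth, one documented route for a 9.6-to-11 migration via logical replication is the pglogical extension (native logical replication publications only exist from PG 10 onward, so plain 9.6 can't publish on its own). A minimal sketch, not tested against this database, assuming pglogical is installed on both servers, wal_level = logical and shared_preload_libraries = 'pglogical' are set, and the host names are placeholders:

```sql
-- On the old 9.6 server (provider):
CREATE EXTENSION pglogical;
SELECT pglogical.create_node(
    node_name := 'provider',
    dsn := 'host=old-server dbname=martinandjones');
-- Add all tables in schema public to the default replication set.
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);

-- On the new PG 11 server (subscriber), after creating the schema there:
CREATE EXTENSION pglogical;
SELECT pglogical.create_node(
    node_name := 'subscriber',
    dsn := 'host=new-server dbname=martinandjones');
SELECT pglogical.create_subscription(
    subscription_name := 'upgrade_sub',
    provider_dsn := 'host=old-server dbname=martinandjones');
```

Note that tables need a primary key (or replica identity) for logical replication to pick them up.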
Chuck Martin
On 11/03/2018 02:57 PM, Charles Martin wrote:
I'd be grateful for some help. I am trying to move a large database from
PostgreSQL 9.6 on Centos 6 to a different server using PostgreSQL 11 on
Centos 7. I can't do a pg_dump because it always fails on the largest table.
What error message?
--
Angular momentum makes the world go 'round.
On 11/3/18 12:57 PM, Charles Martin wrote:
I'd be grateful for some help. I am trying to move a large database from
PostgreSQL 9.6 on Centos 6 to a different server using PostgreSQL 11 on
Centos 7. I can't do a pg_dump because it always fails on the largest
table.
I would answer Ron's question on this first as solving it would be the
easiest fix.
So tried to do pg_basebackup and copy that to the new PG 11
server. Except that pg_upgrade expects the new and old versions of PG to
be side-by-side. So I installed 9.6 on the new server, ran initdb,
This is probably the issue: you now have two 9.6 data directory
instances, the one you created with initdb and the one that came over
with pg_basebackup. I am guessing the editing below has left the server
in a confused state about which directory to use. The error messages you
got when trying to restart the server would be helpful.
verified that it started, then stopped it and edited postgresql.conf
data path to the location of the pg_basebackup files. Then 9.6 would no
longer start. So how can I get my PG 9.6 data into a new PG 11 database?

Probably related to my troubles are my attempts to get replication set
up. But before I dive back into that, I thought I'd better try getting
my 9.6 data into the new 9.6 server, then run PG 11's pg_upgrade and
mount the data in PG 11. Then maybe I can get replication started.

I've read that logical replication can be used to migrate from 9.6 to
11, but haven't found any documentation on doing that.

Chuck Martin
--
Adrian Klaver
adrian.klaver@aklaver.com
When I did a pg_dump using PG 9.6, I got this:
pg_dump: Dumping the contents of table "docfile" failed: PQgetCopyData()
failed.
pg_dump: Error message from server: server closed the connection
unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
pg_dump: The command was: COPY public.docfile (docfile_pkey,
docfileoriginalname, ordernumber, versionnum, docfilecontents, docfilepath,
docfileextension, enddatetime, endby, editnum, insby, insdatetime, modby,
moddatetime, active, doc_fkey) TO stdout;
I've looked and been unable to find where Centos 7, or Postgres 9.6, stores
the path to the config/data directory outside the data/postgresql.conf
file. But I agree there must be something somewhere.
Chuck
On Sat, Nov 3, 2018 at 6:06 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:
On 11/3/18 2:56 PM, Charles Martin wrote:
Please reply to list also.
Ccing list.

Yes, if I could get pg_dump to work, I think I'd be fine. Maybe. But it
doesn't.

Post the error you got to the list and we may be able to help.
Also, which version of Postgres were you using to take the dump?

I agree that I've confused Postgres, but I don't know how to resolve the
confusion. It is complicated by the fact that my original CentOS 7
install included Postgres 9.2, so those files are hanging around, along
with 9.6 and 11.

I posted the error messages I got when postgresql.conf had the data
directory set to my basebackup data:

postgresql-9.6.service: main process exited, code=exited, status=1/FAILURE

Not very helpful.
systemctl status postgresql-9.6 provided a bit more info:
● postgresql-9.6.service - PostgreSQL 9.6 database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql-9.6.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sat 2018-11-03 15:05:30 EDT; 15s ago
     Docs: https://www.postgresql.org/docs/9.6/static/
  Process: 32570 ExecStart=/usr/pgsql-9.6/bin/postmaster -D ${PGDATA} (code=exited, status=1/FAILURE)
  Process: 32563 ExecStartPre=/usr/pgsql-9.6/bin/postgresql96-check-db-dir ${PGDATA} (code=exited, status=0/SUCCESS)
 Main PID: 32570 (code=exited, status=1/FAILURE)
Yet this went away, and PG 9.6 started, when I changed postgresql.conf
to point to the new (empty) data directory, which is confusing.

No, not confusing. I am not as familiar with RPM packaging as I am with
Debian/Ubuntu packaging. Still, if I remember correctly it also allows
multiple instances of Postgres to run, and to do that it has its own
system of tracking the data directories. Where you created the new data
directory is obviously where the package scripts expect to find it. The
pg_basebackup directory is not.

Chuck
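Concretely, the PGDG RPM packaging records the data directory in the systemd unit (Environment=PGDATA=...), not in postgresql.conf, so to point the 9.6 service at the basebackup directory you would override PGDATA rather than edit the config file. A sketch, with the basebackup path as a placeholder assumption:

```ini
; Created via "systemctl edit postgresql-9.6"; lands in
; /etc/systemd/system/postgresql-9.6.service.d/override.conf
[Service]
Environment=PGDATA=/var/lib/pgsql/9.6/basebackup-data
```

followed by systemctl daemon-reload and systemctl start postgresql-9.6. The stock unit sets PGDATA=/var/lib/pgsql/9.6/data, which is why the freshly-initdb'd directory started and the basebackup one did not.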
On 11/3/18 3:47 PM, Charles Martin wrote:
When I do a pg_dump using PG 9.6, I got this:
pg_dump: Dumping the contents of table "docfile" failed: PQgetCopyData() failed.
pg_dump: Error message from server: server closed the connection
unexpectedly
Is this the error the client is reporting?
Is this the same as what is showing up in the server log?
This probably means the server terminated abnormally
So where is the server located relative to the pg_dump client?
On the same machine?
If so, is it a virtual machine, e.g. AWS?
Across a local or remote network?
before or while processing the request.
pg_dump: The command was: COPY public.docfile (docfile_pkey,
docfileoriginalname, ordernumber, versionnum, docfilecontents,
docfilepath, docfileextension, enddatetime, endby, editnum, insby,
insdatetime, modby, moddatetime, active, doc_fkey) TO stdout;

I've looked and been unable to find where CentOS 7, or Postgres 9.6,
stores the path to the config/data directory outside the
data/postgresql.conf file. But I agree there must be something somewhere.

Chuck
--
Adrian Klaver
adrian.klaver@aklaver.com
Charles Martin wrote:
When I do a pg_dump using PG 9.6, I got this:
pg_dump: Dumping the contents of table "docfile" failed: PQgetCopyData() failed.
pg_dump: Error message from server: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
pg_dump: The command was: COPY public.docfile (docfile_pkey, docfileoriginalname,
ordernumber, versionnum, docfilecontents, docfilepath, docfileextension, enddatetime,
endby, editnum, insby, insdatetime, modby, moddatetime, active, doc_fkey) TO stdout;
You probably have a corrupted database.
You should get that fixed first, then you can upgrade.
Maybe you should hire a professional for that.
The other alternative is that you have a buggy extension installed
that causes PostgreSQL to crash.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
On 03.11.2018 at 23:47, Charles Martin wrote:
When I do a pg_dump using PG 9.6, I got this:
pg_dump: Dumping the contents of table "docfile" failed: PQgetCopyData() failed.
pg_dump: Error message from server: server closed the connection
unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
pg_dump: The command was: COPY public.docfile (docfile_pkey,
docfileoriginalname, ordernumber, versionnum, docfilecontents,
docfilepath, docfileextension, enddatetime, endby, editnum, insby,
insdatetime, modby, moddatetime, active, doc_fkey) TO stdout;
which exact minor version please?
Regards, Andreas
--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com
Adrian said:
pg_dump: Error message from server: server closed the connection
unexpectedly
Is this the error the client is reporting?
Is this the same as what is showing up in the server log?
Yes, that's the client message, i.e. what appeared in the terminal window
that gave the command. The server log shows:
2018-11-02 20:30:46 EDT [20405]: [4-1] user=,db= LOG: server process (PID
30438) was terminated by signal 9: Killed
2018-11-02 20:30:46 EDT [20405]: [5-1] user=,db= DETAIL: Failed process
was running: COPY public.docfile (docfile_pkey, docfileoriginalname,
ordernumber, versionnum, docfilecontents, docfilepath, d$
2018-11-02 20:30:46 EDT [20405]: [6-1] user=,db= LOG: terminating any
other active server processes
2018-11-02 20:30:46 EDT [20415]: [10-1] user=,db= WARNING: terminating
connection because of crash of another server process
2018-11-02 20:30:46 EDT [20415]: [11-1] user=,db= DETAIL: The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnor$
So where is the server located relative to the pg_dump client?
On the same machine?
If so is it a virtual machine e.g AWS?
Across a local or remote network?
I gave the command in a terminal session after SSHing to the server from
the same network. It is not a virtual machine.
Laurenz said:
You probably have a corrupted database.
You should get that fixed first, then you can upgrade.
Maybe you should hire a professional for that.
I suspect this is correct, both that there is corruption in the table
and that I need a professional to help. If someone here is available, I'm
interested.
Andreas said:
which exact minor version please?
PostgreSQL 9.6.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
20120313 (Red Hat 4.4.7-23), 64-bit
On 04.11.2018 at 17:38, Charles Martin wrote:
Andreas said:
which exact minor version please?
PostgreSQL 9.6.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
20120313 (Red Hat 4.4.7-23), 64-bit
Okay, I asked just to rule out a stale version. 9.6.7, for instance,
contains some important bug fixes.
Regards, Andreas
--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com
On 11/4/18 8:38 AM, Charles Martin wrote:
I suspect this is correct, both that there is corruption in the table
and that I need a professional to help. If someone here is available,
I'm interested.
Given that this involves your largest table I would confirm that the
signal 9 kill was not coming from the system OOM killer. Take a look at
the system logs to see what they show over the same time period.
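A quick way to check, assuming kernel messages land in /var/log/messages as they do on stock CentOS (the optional path argument is just for testing against another file):

```shell
# Scan a syslog-style log for OOM-killer activity around the crash time.
logfile="${1:-/var/log/messages}"
if grep -iE 'out of memory|killed process' "$logfile" 2>/dev/null; then
    echo "OOM killer fired; see the lines above."
else
    echo "No OOM-killer entries found in $logfile."
fi
```

`dmesg | grep -i 'out of memory'` is an alternative if the syslog has already rotated.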
--
Adrian Klaver
adrian.klaver@aklaver.com
Yep, you called it:
Nov 2 20:30:45 localhost kernel: Out of memory: Kill process 30438
(postmaster) score 709 or sacrifice child
Nov 2 20:30:45 localhost kernel: Killed process 30438, UID 26,
(postmaster) total-vm:3068900kB, anon-rss:1695392kB, file-rss:1074692kB
So it's running out of memory when trying to dump this table. The "old"
server has 4GB of ram, the "new" server 20GB.
Not enough swap space?
--
Angular momentum makes the world go 'round.
Charles Martin <ssappeals@gmail.com> writes:
So it's running out of memory when trying to dump this table. The "old"
server has 4GB of ram, the "new" server 20GB.
That's kind of odd: a COPY shouldn't really consume very much working
memory. I suspect that much of the process's apparent VM consumption may
be shared buffers ... what have you got shared_buffers set to on the old
server? If it's more than half a GB or so, maybe reducing it would help.
regards, tom lane
On 11/4/18 2:55 PM, Charles Martin wrote:
Yep, you called it:
Nov 2 20:30:45 localhost kernel: Out of memory: Kill process 30438
(postmaster) score 709 or sacrifice child
Nov 2 20:30:45 localhost kernel: Killed process 30438, UID 26,
(postmaster) total-vm:3068900kB, anon-rss:1695392kB, file-rss:1074692kB

So it's running out of memory when trying to dump this table. The "old"
server has 4GB of ram, the "new" server 20GB.
In addition to the other suggestions, what is the exact pg_dump command
you are using?
--
Adrian Klaver
adrian.klaver@aklaver.com
Tom said:
That's kind of odd: a COPY shouldn't really consume very much working
memory. I suspect that much of the process's apparent VM consumption may
be shared buffers ... what have you got shared_buffers set to on the old
server? If it's more than half a GB or so, maybe reducing it would help.
This is not a VM, but hardware.
Adrian said:
In addition to the other suggestions, what is the exact pg_dump command
you are using?
The last time:
[postgres@mandj tmp]$ pg_dump martinandjones >
/mnt/4tbB/pgbackup/2018-11-02/mandj.bak
On 11/5/18 5:56 AM, Charles Martin wrote:
Tom said:
That's kind of odd: a COPY shouldn't really consume very much working
memory. I suspect that much of the process's apparent VM consumption may
be shared buffers ... what have you got shared_buffers set to on the old
server? If it's more than half a GB or so, maybe reducing it would help.

This is not a VM, but hardware.
Tom was referring to this from your previous post:
(postmaster) total-vm:3068900kB,
where vm(VM) is Virtual Memory:
https://stackoverflow.com/questions/18845857/what-does-anon-rss-and-total-vm-mean#22326766
So what is your shared_buffers:
https://www.postgresql.org/docs/10/static/runtime-config-resource.html
set to?
--
Adrian Klaver
adrian.klaver@aklaver.com
Charles Martin wrote:
So where is the server located relative to the pg_dump client?
On the same machine?
If so is it a virtual machine e.g AWS?
Across a local or remote network?

I gave the command in a terminal session after SSHing to the server from
the same network. It is not a virtual machine.
That means that pg_dump runs on the same machine as the server.
It's plausible that, with only 4GB of RAM, the table that fails
to dump has some very large rows that can't be allocated, especially
since both the backend and pg_dump need to have it simultaneously
in memory.
pg_dump: The command was: COPY public.docfile (docfile_pkey,
docfileoriginalname, ordernumber, versionnum, docfilecontents,
docfilepath, docfileextension, enddatetime, endby, editnum, insby,
insdatetime, modby, moddatetime, active, doc_fkey) TO stdout;
The "docfilecontents" column suggests that it might contain
large contents. If its type is bytea, it's going to be expanded
to twice its size to build the hex representation.
You may get a sense of how big the biggest row is, expressed
as text, with this query:

SELECT max(length(docfile::text)) FROM public.docfile;
If it's big enough that it might cause the OOM issue,
try to run pg_dump remotely through an SSH tunnel [1], which you
can already do in terms of network permissions since you log in with
SSH, so pg_dump itself does not use any memory on the server.
Also, if the machine doesn't have swap space, it might be
that just adding a few GB's of swap would make the operation
succeed.
[1]: https://www.postgresql.org/docs/current/static/ssh-tunnels.html
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
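Daniel's tunnel suggestion might look like the sketch below; the host name, user, and output path are illustrative assumptions, and by default the commands are only printed (dry run) so they can be reviewed before executing:

```shell
# Run pg_dump on a machine with more RAM, tunneled to the old server,
# so pg_dump's own memory use does not land on the 4GB box.
# DRY_RUN=1 just prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi; }

# Forward local port 5433 to PostgreSQL (5432) on the old server.
run ssh -f -N -L 5433:localhost:5432 postgres@old-server
# Dump through the tunnel; only the backend's memory is used on the old server.
run pg_dump -h localhost -p 5433 -f /mnt/4tbB/pgbackup/mandj.bak martinandjones
```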
Adrian said:
Tom was referring to this from your previous post:
(postmaster) total-vm:3068900kB,
where vm(VM) is Virtual Memory:
https://stackoverflow.com/questions/18845857/what-does-anon-rss-and-total-vm-mean#22326766
So what is your shared_buffers:
https://www.postgresql.org/docs/10/static/runtime-config-resource.html
set to?
Ok, thanks for explaining this. Here is the current value:
"shared_buffers" "131072" "8kB"
On 11/5/18 7:04 AM, Charles Martin wrote:
Adrian said:
Tom was referring to this from your previous post:
(postmaster) total-vm:3068900kB,
where vm(VM) is Virtual Memory:
https://stackoverflow.com/questions/18845857/what-does-anon-rss-and-total-vm-mean#22326766
So what is your shared_buffers:
https://www.postgresql.org/docs/10/static/runtime-config-resource.html
set to?

Ok, thanks for explaining this. Here is the current value:

"shared_buffers" "131072" "8kB"
It should be a single value something like this for the default:
shared_buffers = 128MB
--
Adrian Klaver
adrian.klaver@aklaver.com
Charles Martin <ssappeals@gmail.com> writes:
Ok, thanks for explaining this. Here is the current value:
"shared_buffers" "131072" "8kB"
Well, that's 1GB, which might be ambitious inside a VM with a hard
restriction to 4GB total RAM. Postgres can get by with a *lot* less.
Try knocking it down to a tenth of that and see if it makes a difference.
regards, tom lane
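Tom's suggestion could be applied either by editing shared_buffers in postgresql.conf or, on 9.6, with ALTER SYSTEM; the 128MB value here is a placeholder for "roughly a tenth of 1GB", and a server restart is required for shared_buffers to take effect:

```sql
-- As a superuser; this writes postgresql.auto.conf.
ALTER SYSTEM SET shared_buffers = '128MB';
-- After restarting the server (e.g. systemctl restart postgresql-9.6), verify:
SHOW shared_buffers;
```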