dbase restore

Started by Bradley Brown · 14 messages · general
#1 Bradley Brown
bradley@segrestfarms.com

Hello all,
I am attempting to restore a database from a file created by pg_dump. I
used "pg_dump > file.dump" as my dump command.
I am using "psql -e dbase_name < file.dump" as my restore method. I
am running PostgreSQL 7.0 on Red Hat 7.0.
When I try to restore the database I get the following on screen:

"...NOTICE: CREATE TABLE/PRIMARY KEY will create implicit index
'countries_pkey' for table 'countries' pqReadData() -- backend closed
the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
connection to server was lost".

The following is in the pg_err.log:

"...020320.15:20:48.197 [16231] NOTICE: CREATE TABLE/PRIMARY KEY will
create implicit index 'advertisements_pkey' for table 'advertisements'
020320.15:20:48.390 [16231] NOTICE: CREATE TABLE/PRIMARY KEY will
create implicit index 'affiliate_pkey' for table 'affiliate'
020320.15:20:48.569 [16231] NOTICE: CREATE TABLE/PRIMARY KEY will
create implicit index 'affiliatestatus_pkey' for table 'affiliatestatus'

020320.15:20:48.713 [16231] NOTICE: CREATE TABLE/PRIMARY KEY will
create implicit index 'category_pkey' for table 'category'
020320.15:20:48.872 [16231] NOTICE: CREATE TABLE/PRIMARY KEY will
create implicit index 'config_pkey' for table 'config'
020320.15:20:49.056 [16231] NOTICE: CREATE TABLE/PRIMARY KEY will
create implicit index 'countries_pkey' for table 'countries'
Server process (pid 16231) exited with status 139 at Wed Mar 20 15:20:49
2002
Terminating any active server processes...
Server processes were terminated at Wed Mar 20 15:20:49 2002
Reinitializing shared memory and semaphores
020320.15:20:49.261 [16232] DEBUG: Data Base System is starting up at
Wed Mar 20 15:20:49 2002
020320.15:20:49.262 [16232] DEBUG: Data Base System was interrupted
being in production at Wed Mar 20 15:14:43 2002
020320.15:20:49.263 [16232] DEBUG: Data Base System is in production
state at Wed Mar 20 15:20:49 2002"

I get the same errors if I try to restore the schema and data or just
the schema. I know that the dump file is ok because I immediately took
it to another machine running postgresql and had it working in less than
a minute.

Can anyone tell me what exactly is going on and how I might fix whatever
problems might exist? Thanks in advance and I greatly appreciate it.
Bradley Brown

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bradley Brown (#1)
Re: dbase restore

Bradley Brown <bradley@segrestfarms.com> writes:

When I try to restore th dbase I get the following on screen:

"...NOTICE: CREATE TABLE/PRIMARY KEY will create implicit index
'countries_pkey' for table 'countries' pqReadData() -- backend closed
the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
connection to server was lost".

Hmm, could you provide a debugger backtrace from the core file that
(hopefully) was left by the crashing backend?

Also, you might try running the dump script with debug_print_query
turned on, so that the log will show exactly what query triggers
the crash.
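For anyone unsure how to get that backtrace, a minimal sketch (the paths are hypothetical and depend on your installation; with 7.0, the core file from a crashed backend typically lands in the crashed database's subdirectory of the data directory):

```shell
# Hypothetical paths; adjust for your installation.
gdb /usr/local/pgsql/bin/postgres /usr/local/pgsql/data/base/dbase_name/core
# then, at the (gdb) prompt, print the stack trace of the crash:
#   bt
```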

Or, you could just update to PG 7.2 and see if the problem goes away ;-)
7.0 is quite a ways back (especially if it's really 7.0 and not
7.0.something).

regards, tom lane

#3 Joshua Hoover
joshuahoover@revivalhealth.com
In reply to: Tom Lane (#2)
Postmaster processes running out of control?

I'm running PostgreSQL 7.1.3 on Red Hat Linux 7.1 and believe there is a
problem with my PostgreSQL server. I have a PHP application on a separate
server accessing the PostgreSQL server. The PostgreSQL server seems to be
getting hammered, as even simple queries on indexed columns are taking
FOREVER. When I run top, I normally see at least 50 entries similar to
these for postmaster:

19336 postgres 9 0 92960 90M 92028 S 0.0 9.0 0:18 postmaster
19341 postgres 9 0 87996 85M 87140 S 0.0 8.5 0:09 postmaster
19355 postgres 9 0 87984 85M 87112 S 11.6 8.5 0:09 postmaster
19337 postgres 9 0 87952 85M 87092 S 0.0 8.5 0:09 postmaster

The server info from top reads:

57 processes: 54 sleeping, 3 running, 0 zombie, 0 stopped
CPU states: 54.3% user, 3.7% system, 0.0% nice, 41.8% idle
Mem: 1028908K av, 346760K used, 682148K free, 93812K shrd, 8640K buff
Swap: 1048536K av, 0K used, 1048536K free, 131796K cached

My postgresql.conf has the following changes/additions to it that change
from the default:

max_connections = 512
sort_mem = 1024
shared_buffers = 12800

I've run a vacuum on all the tables and still see the same slow responses
from the database. Does anyone have any suggestions and/or pointers?

Thank you,

Joshua Hoover

#4 Martijn van Oosterhout
kleptog@svana.org
In reply to: Joshua Hoover (#3)
Re: Postmaster processes running out of control?

On Wed, Mar 20, 2002 at 06:04:17PM -0500, Joshua Hoover wrote:

I'm running PostgreSQL 7.1.3 on Red Hat Linux 7.1 and believe there is a
problem with my PostgreSQL server. I have a PHP application on a separate
server accessing the PostgreSQL server. The PostgreSQL server seems to be
getting hammered, as even simple queries on indexed columns are taking
FOREVER. When I run top, here I normally see at least 50 entries similar to
these for postmaster:

19336 postgres 9 0 92960 90M 92028 S 0.0 9.0 0:18 postmaster
19341 postgres 9 0 87996 85M 87140 S 0.0 8.5 0:09 postmaster
19355 postgres 9 0 87984 85M 87112 S 11.6 8.5 0:09 postmaster
19337 postgres 9 0 87952 85M 87092 S 0.0 8.5 0:09 postmaster

90MB per process? wow. Can you look in the server logs to see which query is
taking all the time?

I know PHP has persistant and non-persistant connections. I don't know what
criteria it uses to determine when the connection can be closed. Do you know
what your server is doing?

Lastly, is there a lot of disk activity?
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


Ignorance continues to thrive when intelligent people choose to do
nothing. Speaking out against censorship and ignorance is the imperative
of all intelligent people.

#5 Jan Wieck
JanWieck@Yahoo.com
In reply to: Joshua Hoover (#3)
Re: Postmaster processes running out of control?

Joshua Hoover wrote:

I'm running PostgreSQL 7.1.3 on Red Hat Linux 7.1 and believe there is a
problem with my PostgreSQL server. I have a PHP application on a separate
server accessing the PostgreSQL server. The PostgreSQL server seems to be
getting hammered, as even simple queries on indexed columns are taking
FOREVER. When I run top, here I normally see at least 50 entries similar to
these for postmaster:

19336 postgres 9 0 92960 90M 92028 S 0.0 9.0 0:18 postmaster
19341 postgres 9 0 87996 85M 87140 S 0.0 8.5 0:09 postmaster
19355 postgres 9 0 87984 85M 87112 S 11.6 8.5 0:09 postmaster
19337 postgres 9 0 87952 85M 87092 S 0.0 8.5 0:09 postmaster

Looks pretty good to me, assuming it's a top(1) output as I
see it on my Red Hat 7.1 system.

Nearly all of the virtual memory of the processes is shared
memory and everything seems to be sucked in (the 90M etc.
column, resident set size).

The server info from top reads:

57 processes: 54 sleeping, 3 running, 0 zombie, 0 stopped
CPU states: 54.3% user, 3.7% system, 0.0% nice, 41.8% idle
Mem: 1028908K av, 346760K used, 682148K free, 93812K shrd, 8640K buff
Swap: 1048536K av, 0K used, 1048536K free, 131796K cached

Confirms my above statement, 0K used Swap, so that server has
plenty of unused RAM.

My postgresql.conf has the following changes/additions to it that change
from the default:

max_connections = 512
sort_mem = 1024
shared_buffers = 12800

1 Gig of physical RAM on a dedicated database server (that's
what you describe at the top, since your PHP app runs on a
different system, so PostgreSQL is the only thing here,
right?).

Why don't you make a serious amount of that memory available
for shared buffers? I would start with 65536 (1/2 GB). Don't
forget to adjust SHMMAX during boot!
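A sketch of what Jan suggests, with illustrative values (65536 buffers × 8KB pages = 512MB; the exact SHMMAX figure is an assumption with headroom for per-buffer overhead):

```shell
# In postgresql.conf (illustrative values for a 1GB dedicated server):
#   shared_buffers = 65536      # 65536 * 8KB pages = 512MB
# On Linux, the kernel must allow a shared memory segment that large.
# This takes effect immediately but does not survive a reboot, so also
# put it in /etc/sysctl.conf or a boot script:
sysctl -w kernel.shmmax=600000000
```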

I've run a vacuum on all the tables and still see the same slow responses
from the database. Does anyone have any suggestions and/or pointers?

Did you run VACUUM only or did you also allow it to ANALYZE
the tables? This is the main pointer, but don't miss the
above optimization hints.
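In 7.1, the analyze step is an option of the VACUUM command itself; a minimal sketch (the database name is hypothetical):

```shell
# VACUUM alone reclaims dead-tuple space; VACUUM ANALYZE also refreshes
# the statistics the planner needs to choose good plans.
psql -d mydb -c "VACUUM ANALYZE;"
```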

Thank you,

You're welcome.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #


#6 Nigel J. Andrews
nandrews@investsystems.co.uk
In reply to: Martijn van Oosterhout (#4)
Re: Postmaster processes running out of control?

On Thu, 21 Mar 2002, Martijn van Oosterhout wrote:

On Wed, Mar 20, 2002 at 06:04:17PM -0500, Joshua Hoover wrote:

I'm running PostgreSQL 7.1.3 on Red Hat Linux 7.1 and believe there is a
problem with my PostgreSQL server. I have a PHP application on a separate
server accessing the PostgreSQL server. The PostgreSQL server seems to be
getting hammered, as even simple queries on indexed columns are taking
FOREVER. When I run top, here I normally see at least 50 entries similar to
these for postmaster:

19336 postgres 9 0 92960 90M 92028 S 0.0 9.0 0:18 postmaster
19341 postgres 9 0 87996 85M 87140 S 0.0 8.5 0:09 postmaster
19355 postgres 9 0 87984 85M 87112 S 11.6 8.5 0:09 postmaster
19337 postgres 9 0 87952 85M 87092 S 0.0 8.5 0:09 postmaster

90MB per process? wow. Can you look in the server logs to see which query is
taking all the time?

I can't help with the problem but is 90MB such a shock? I can get towards that
just by running something like:

SELECT * FROM big_table
WHERE time > 'sometime'
AND time < 'someothertime'
AND name IN ('first', 'second', 'third', 'fourth', 'fifth')
ORDER BY time

Indeed, I got blasé about running such a thing, and the last time, rather than
the backend instance dying, it froze my kernel. I haven't done it again.

BTW, the killer bit was the fifth name, up to that point things got large but
stayed within capabilities of the machine. I tried all I could think of to get
limits applied to the backend processes (short of editing and recompiling from
source) but nothing worked. There wasn't any change when switching from an IN
test to a string of ORs.

(6.5.1 I think postgres, since upgraded to 7.2 on FreeBSD 3.3-STABLE)

Why am I saying this? No idea. Just not sure why a 90MB footprint for a DB
backend would be so shocking.

Nigel J. Andrews
Logictree Systems Limited

#7 Martijn van Oosterhout
kleptog@svana.org
In reply to: Nigel J. Andrews (#6)
Re: Postmaster processes running out of control?

On Thu, Mar 21, 2002 at 02:10:08PM +0000, Nigel J. Andrews wrote:

On Thu, 21 Mar 2002, Martijn van Oosterhout wrote:

90MB per process? wow. Can you look in the server logs to see which query is
taking all the time?

I can't help with the problem but is 90MB such a shock? I can get towards that
just by running something like:

We have tables here running into the hundreds of megabytes and if a backend
process goes over 40MB, it generally means a bug in the query.

SELECT * FROM big_table
WHERE time > 'sometime'
AND time < 'someothertime'
AND name IN ('first', 'second', 'third', 'fourth', 'fifth')
ORDER BY time

Indeed I got blase about running such a thing and rather than the backend
instance dying the last time it froze my kernel. I haven't done it again.

That basically means you gave the backend enough memory to hang the machine.
I think that means you went too far.

Given the above query, do you have any form of index on "time"? How big is
your table? If the backend process is using a lot of memory, that's
generally an indication that it's doing a sort on a large resultset. An
index can solve that problem.

From here it becomes hard to suggest anything without an indication of the
EXPLAIN output of a query. Sometimes with proper jiggling, a 10 second query
can become a sub-second one.
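A sketch of checking this (table and column names follow Nigel's query; the index and database names are hypothetical, and the placeholder times are left as in the thread):

```shell
# Does the planner use an index on "time", or fall back to seqscan + sort?
psql -d mydb -c "CREATE INDEX big_table_time_idx ON big_table (time);"
psql -d mydb -c "EXPLAIN SELECT * FROM big_table
                 WHERE time > 'sometime' AND time < 'someothertime'
                 ORDER BY time;"
```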

BTW, the killer bit was the fifth name, up to that point things got large
but stayed within capabilities of the machine. I tried all I could think
of to get limits applied to the backend processes (short of editing and
recompiling from source) but nothing worked. There wasn't any change when
switching from a IN test to a string of ORs.

I believe the parser converts the IN to ORs anyway.

(6.5.1 I think postgres, since upgraded to 7.2 on FreeBSD 3.3-STABLE)

Why am I saying this? No idea. Just not sure why a 90MB footprint for a DB
backend would be so shocking.

I think, as someone else pointed out, it's probably all shared memory anyway and
so may not be a problem. That doesn't solve your basic problem though.
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


Ignorance continues to thrive when intelligent people choose to do
nothing. Speaking out against censorship and ignorance is the imperative
of all intelligent people.

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#7)
Re: Postmaster processes running out of control?

Martijn van Oosterhout <kleptog@svana.org> writes:

On Thu, Mar 21, 2002 at 02:10:08PM +0000, Nigel J. Andrews wrote:

(6.5.1 I think postgres, since upgraded to 7.2 on FreeBSD 3.3-STABLE)

Why am I saying this? No idea. Just not sure why a 90MB footprint for a DB
backend would be so shocking.

I think as someone else pointed out, it's probably all shared memory any and
so may not be a problem. That doesn't solve your basic problem though.

Yeah, the number reported by ps should be viewed with suspicion until
you know for certain whether it counts the shared memory segment or not.
(In my experience, on some platforms it does and on some it doesn't.)

However, I think the real issue here is probably just 6.5's well known
problems with intra-query memory leaks. If Nigel can reproduce the
difficulty on 7.2 then I'd be more interested in looking into it...

regards, tom lane

#9 Nigel J. Andrews
nandrews@investsystems.co.uk
In reply to: Tom Lane (#8)
Re: Postmaster processes running out of control?

On Thu, 21 Mar 2002, Tom Lane wrote:

Martijn van Oosterhout <kleptog@svana.org> writes:

On Thu, Mar 21, 2002 at 02:10:08PM +0000, Nigel J. Andrews wrote:

(6.5.1 I think postgres, since upgraded to 7.2 on FreeBSD 3.3-STABLE)

Why am I saying this? No idea. Just not sure why a 90MB footprint for a DB
backend would be so shocking.

I think as someone else pointed out, it's probably all shared memory any and
so may not be a problem. That doesn't solve your basic problem though.

Yeah, the number reported by ps should be viewed with suspicion until
you know for certain whether it counts the shared memory segment or not.
(In my experience, on some platforms it does and on some it doesn't.)

That I considered somewhat immaterial since the process was using the
memory. That's what normally killed the process, using up all available memory
including swap and still requiring more.

However, I think the real issue here is probably just 6.5's well known
problems with intra-query memory leaks. If Nigel can reproduce the
difficulty on 7.2 then I'd be more interested in looking into it...

Good point, I hadn't tried it since the upgrade because that wasn't why I
upgraded (don't worry, I've got a _long_ post on that subject waiting to be
sent); I tightened up limits for the generation of the SQL string in the
application before then. However, I have just tried it with 7 poster_names
listed and top never reported even 8MB for the postgres footprint. I won't give
the EXPLAIN output because it's not interesting and it would almost overlap
with the contents of my long, pending post.

FWIW, the table has >1 million rows and the list of names I just gave the query
includes some of the highest volume posters, including the top one with 55,000
rows in the table. There is an index on the poster_name and one on the time
columns.

Thanks for the comments, I didn't even know about the 6.5 memory leak.

Nigel Andrews
Logictree Systems Limited

#10 Martijn van Oosterhout
kleptog@svana.org
In reply to: Nigel J. Andrews (#9)
Re: Postmaster processes running out of control?

On Fri, Mar 22, 2002 at 01:11:11AM +0000, Nigel J. Andrews wrote:

Good point, I hadn't tried it since the upgrade becuase that wasn't why I
upgraded (don't worry I've got a _long_ post on that subject waiting to be
sent), I tightened up limits for the generation of the SQL string in the
application before then. However, I have just tried it with 7 poster_names
listed and top never reported even 8MB for the postgres footprint. I won't give
the EXPLAIN output because it's not interesting and it would almost be an
overlap with the contents of my long, pending post.

Hmm, with something that matches most of the table, it would likely choose a
sequential scan which can take quite a while over a large table.

FWIW, the table has >1 million rows and the list of names I just gave the query
includes some of the highest volume posters, including the top one with 55,000
rows in the table. There is an index on the poster_name and one on the time
columns.

If you're selecting on one column (poster_name) and sorting on another
(time) it may help to have an index on both (time,poster_name) since that
avoids the sort step. (I hope 7.2 estimates sort costs better than earlier
versions).
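Martijn's two-column index, as a sketch (names follow the thread; the index name and database are hypothetical, and whether the planner actually uses it depends on the version and statistics):

```shell
psql -d mydb -c "CREATE INDEX big_table_time_name_idx
                 ON big_table (time, poster_name);"
# A scan of this index returns rows already ordered by time,
# which can let the planner skip the explicit sort step.
```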

However, if you really want the whole output (rather than say the first 100
lines) and that really involves trawling a majority of the table, then you
are simply bound by disk transfer speed.

Alternatively, using a cursor may allow you to start outputting data before
the query has finished.
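A cursor sketch (cursors must run inside a transaction; table and database names are hypothetical):

```shell
psql -d mydb <<'SQL'
BEGIN;
DECLARE c CURSOR FOR
    SELECT * FROM big_table ORDER BY time;
FETCH 100 FROM c;   -- first batch of rows
CLOSE c;
COMMIT;
SQL
```

Note that if the plan still requires a full sort, the first FETCH waits for that sort to finish; a cursor helps most when the plan can stream rows, e.g. via an index scan.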

Thanks for the comments, I didn't even know about the 6.5 memory leak.

Earlier versions of postgres tended to leak a lot within queries (not
between queries). 7.0 fixed most of them, though some issues with
functions remained, and I don't believe it's a problem anymore.

HTH,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


Ignorance continues to thrive when intelligent people choose to do
nothing. Speaking out against censorship and ignorance is the imperative
of all intelligent people.

#11 Andrew Bartley
abartley@evolvosystems.com
In reply to: Martijn van Oosterhout (#4)
Re: Postmaster processes running out of control?

2002-03-22 16:44:28 [383] DEBUG: SIInsertDataEntry: table is 70% full,
signaling postmaster
2002-03-22 16:44:28 [383] NOTICE: RegisterSharedInvalid: SI buffer
overflow
2002-03-22 16:44:28 [315] NOTICE: InvalidateSharedInvalid: cache state
reset
2002-03-22 16:44:28 [279] NOTICE: InvalidateSharedInvalid: cache state
reset
2002-03-22 16:44:28 [278] NOTICE: InvalidateSharedInvalid: cache state
reset
2002-03-22 16:44:28 [277] NOTICE: InvalidateSharedInvalid: cache state
reset
2002-03-22 16:44:28 [276] NOTICE: InvalidateSharedInvalid: cache state
reset

This is happening once a rather large batch process commits, i.e. a PL/pgSQL
function run from cron.

I have had a good look through groups and most of the time it seems to be
related to a vacuum.

In this circumstance it is causing a peer process job that uses
persistent connections to receive a SIGPIPE signal.

Can someone please advise us: can we stop the table filling? Or can we stop
the SIGPIPE?

Thanks

Andrew Bartley

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Bartley (#11)
Re: Postmaster processes running out of control?

"Andrew Bartley" <abartley@evolvosystems.com> writes:

2002-03-22 16:44:28 [383] DEBUG: SIInsertDataEntry: table is 70% full,
signaling postmaster
2002-03-22 16:44:28 [383] NOTICE: RegisterSharedInvalid: SI buffer
overflow
2002-03-22 16:44:28 [315] NOTICE: InvalidateSharedInvalid: cache state
reset

AFAIK these notices are completely harmless (and they have in fact been
downgraded to DEBUG level in 7.2). SI buffer overrun is an expected
condition when a large number of system catalog updates are committed
at once.

In this circumstance it is causing a peer process job that useses
persistant connections to receive a SIGPIPE signal.

I think the SIGPIPE has little or no direct connection to the SI buffer
overruns --- what *exactly* are you doing and what are you seeing?

regards, tom lane

#13 Nigel J. Andrews
nandrews@investsystems.co.uk
In reply to: Martijn van Oosterhout (#10)
Re: Postmaster processes running out of control?

On Fri, 22 Mar 2002, Martijn van Oosterhout wrote:

On Fri, Mar 22, 2002 at 01:11:11AM +0000, Nigel J. Andrews wrote:

Good point, I hadn't tried it since the upgrade becuase that wasn't why I
upgraded (don't worry I've got a _long_ post on that subject waiting to be
sent), ...
[snipped]
...
the EXPLAIN output because it's not interesting and it would almost be
an overlap with the contents of my long, pending post.

Hmm, with something that matches most of the table, it would likely choose a
sequential scan which can take quite a while over a large table

Yep, doing a seqscan does take a while and I can see why it does one most of
the time. This is the subject of my long, pending post, which is no longer
pending as I have recently sent it to the list.

[snipped]

If you're selecting on one column (poster_name) and sorting on another
(time) it may help to have an index on both (time,poster_name) since that
avoids the sort step. (I hope 7.2 estimates sort costs better than earlier
versions).

Thank you, it didn't occur to me that the two column index would ease the
sorting. Would the order of the columns specified in the index creation be
significant?

However, if you really want the whole output (rather than say the first 100
lines) and that really involves trawling a majority of the table, then you
are simply bound by disk transfer speed.

I have coded the application to use all the data, caching the pages (it's a web
site) not requested but which use the results of the query, e.g. pages 2 to xxx
of the results when page 1 has been requested by the user. I'm happy this way
since the query hit is taken only once instead of several dozen times and I
have a fancy caching system that fits in nicely with data update frequency. In
fact, that was one of the driving forces behind my fancy caching system. The
I/O is definitely a bottleneck. There's almost continuous disk activity, with
a change in its 'character' marking the switch from backend query processing
to page cache filling.

Alternativly, using a cursor may allow you to start outputting data before
the query has finished.

At the moment the application just issues the query and waits for the results.
I had considered changing this, more to reduce the memory usage within the
client than to speed things up. I think such a change makes the split between
content and presentation harder to maintain though; possible, but harder, or
shall we say more sophisticated.

--
Nigel J. Andrews
Logictree Systems Limited

#14 Martijn van Oosterhout
kleptog@svana.org
In reply to: Nigel J. Andrews (#13)
Re: Postmaster processes running out of control?

On Fri, Mar 22, 2002 at 02:09:16PM +0000, Nigel J. Andrews wrote:

If you're selecting on one column (poster_name) and sorting on another
(time) it may help to have an index on both (time,poster_name) since that
avoids the sort step. (I hope 7.2 estimates sort costs better than earlier
versions).

Thank you, it didn't occur to me that the two column index would ease the
sorting. Would the order of the columns specified in the index creation be
significant?

Yes. If you build an index on the columns (a,b), the index becomes a tree
where each value of a is ordered. At each node there is a subtree with each
value of b in ordered format. So if you do a sequential scan on an index it
comes out ordered by a then b.
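Martijn's point about column order, sketched with a hypothetical table t(a, b): an index can only replace a sort whose leading key matches the index's first column.

```shell
psql -d mydb -c "CREATE INDEX t_ab_idx ON t (a, b);"
psql -d mydb -c "EXPLAIN SELECT * FROM t ORDER BY a, b;"  # may walk t_ab_idx
psql -d mydb -c "EXPLAIN SELECT * FROM t ORDER BY b, a;"  # needs an explicit sort
```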

Of course, after identifying a tuple in the index, the database then has to
go back to the main table to check that it's valid in the current
transaction (and to get the data, of course). (There is a good reason
somewhere why transaction information is not stored within the index but I
don't remember it right now.)

So it comes down to a comparison between index scan over most of the table
vs. sequential scan + sort of whole table. I'm not sure which would win...

HTH,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


Ignorance continues to thrive when intelligent people choose to do
nothing. Speaking out against censorship and ignorance is the imperative
of all intelligent people.