testing cvs HEAD - HS/SR - PANIC: cannot make new WAL entries during recovery
When accidentally running pg_dump from an 8.4.2 instance into a 9.0devel (cvs as of 2010.02.17)
slave as below, this causes a PANIC. The slave can be restarted.
localhost:55432 => 8.4.2 instance (ssh tunnel)
/tmp:7575 => a 9.0devel standby
time pg_dump -h localhost -p 55432 -t public.tab_jobs --clean --no-owner --no-privileges ms | psql
-q -h /tmp -p 7575 -d replicas
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
PANIC: cannot make new WAL entries during recovery
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lost
real 0m5.569s
After re-starting the slave, its logfile shows this:
[...]
STATEMENT: ALTER TABLE ONLY tab_jobs ALTER COLUMN db SET STATISTICS 10000;
ERROR: transaction is read-only
STATEMENT: CREATE SEQUENCE tab_jobs_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
ERROR: transaction is read-only
STATEMENT: ALTER SEQUENCE tab_jobs_id_seq OWNED BY tab_jobs.id;
PANIC: cannot make new WAL entries during recovery
STATEMENT: SELECT pg_catalog.setval('tab_jobs_id_seq', 31907, true);
LOG: server process (PID 25753) was terminated by signal 6: Aborted
LOG: terminating any other active server processes
LOG: database system was interrupted while in recovery at log time 2010-02-18 16:38:39 CET
HINT: If this has occurred more than once some data might be corrupted and you might need to
choose an earlier recovery target.
cp: cannot stat `/var/data1/pg_stuff/dump/hotslave/replication_archive/0000000100000088000000E7':
No such file or directory
LOG: entering standby mode
LOG: redo starts at 88/E500E4E8
LOG: consistent recovery state reached at 89/2C800000
LOG: database system is ready to accept read only connections
[...]
It doesn't happen with just any table (I tried that), so maybe it has some connection with this
specific table, sequence, or sequence name.
There is another (500 GB) replication connection running from a 9.0devel primary to this same
9.0devel slave; the slave seems to pick up that work OK after the restart. I want to mention that
here, but I don't think it causes/influences this panic.
And to be clear: I realize that one cannot expect a pg_dump at a slave to work: I only did this
accidentally. Nevertheless, ISTM it shouldn't PANIC...
hth,
Erik Rijkers
Erik Rijkers wrote:
And to be clear: I realize that one cannot expect a pg_dump at a slave to work:
Hmm, why not?
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Thu, February 18, 2010 23:08, Alvaro Herrera wrote:
Erik Rijkers wrote:
And to be clear: I realize that one cannot expect a pg_dump at a slave to work:
Hmm, why not?
Because the slave is readonly.
My case was:
pg_dump $pg8.4.2 | psql $slave
I ran this by accident.
I suppose I should have written more clearly:
"one cannot expect throwing a pg_dump at a slave to work"
Erik Rijkers
On Thursday 18 February 2010 22:25:35 Erik Rijkers wrote:
When accidentally running pg_dump from an 8.4.2 instance into a 9.0devel
(cvs as of 2010.02.17) slave as below, this causes a PANIC. The slave can
be restarted.localhost:55432 => 8.4.2 instance (ssh tunnel)
/tmp:7575 => a 9.0devel standbytime pg_dump -h localhost -p 55432 -t public.tab_jobs --clean --no-owner
--no-privileges ms | psql -q -h /tmp -p 7575 -d replicas
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
ERROR: transaction is read-only
PANIC: cannot make new WAL entries during recovery
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lostreal 0m5.569s
After re-starting the slave, its logfile shows this:
[...]
STATEMENT: ALTER TABLE ONLY tab_jobs ALTER COLUMN db SET STATISTICS 10000;
ERROR: transaction is read-only
STATEMENT: CREATE SEQUENCE tab_jobs_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
ERROR: transaction is read-only
STATEMENT: ALTER SEQUENCE tab_jobs_id_seq OWNED BY tab_jobs.id;
PANIC: cannot make new WAL entries during recovery
STATEMENT: SELECT pg_catalog.setval('tab_jobs_id_seq', 31907, true);
Hm, yea. setval() simply misses a check there (it was added for nextval
though).
I wonder if there are other functions bypassing the layers like
setval/nextval?
Trivial patch attached.
Andres
Attachments:
prevent-setval-during-recovery.patchtext/x-patch; charset=UTF-8; name=prevent-setval-during-recovery.patchDownload
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 0f9dcfe..d696203 100644
*** a/src/backend/commands/sequence.c
--- b/src/backend/commands/sequence.c
*************** do_setval(Oid relid, int64 next, bool is
*** 738,743 ****
--- 738,746 ----
Buffer buf;
Form_pg_sequence seq;
+ /* setval() writes to database and must be prevented during recovery */
+ PreventCommandDuringRecovery();
+
/* open and AccessShareLock sequence */
init_sequence(relid, &elm, &seqrel);
Andres Freund wrote:
Trivial patch attached.
Thanks, committed.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Thu, 2010-02-18 at 23:13 +0100, Erik Rijkers wrote:
I ran this by accident
Good test, thanks for running it.
--
Simon Riggs www.2ndQuadrant.com
On Thu, February 18, 2010 23:32, Andres Freund wrote:
On Thursday 18 February 2010 22:25:35 Erik Rijkers wrote:
localhost:55432 => 8.4.2 instance (ssh tunnel)
/tmp:7575 => a 9.0devel standbytime pg_dump -h localhost -p 55432 -t public.tab_jobs --clean --no-owner
--no-privileges ms | psql -q -h /tmp -p 7575 -d replicas
ERROR: transaction is read-only
[...]
PANIC: cannot make new WAL entries during recovery
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lost
[...]
STATEMENT: ALTER SEQUENCE tab_jobs_id_seq OWNED BY tab_jobs.id;
PANIC: cannot make new WAL entries during recovery
STATEMENT: SELECT pg_catalog.setval('tab_jobs_id_seq', 31907, true);Hm, yea. setval() simply misses a check there (it was added for nextval
though).
I wonder if there are other functions bypassing the layers like
setval/nextval?Trivial patch attached.
That worked - and I see has already been committed, thanks.
There is another small issue related to this dumping into a readonly slave:
With multiline columns (at least, I think that's the reason),
there are on STDERR endless occurrences of:
invalid command \N
and
invalid command \
There can be millions of these lines (obviously).
On top of that, the logfile gets appended
with 'syntax error', follwed by the whole dump:
ERROR: syntax error at or near "18726" at character 1
STATEMENT: 18726 16883 [...] [whole dump follows here]
This does maybe not really qualify as a bug, but it does
seem unnecessary to spam a readonly standby with all
this. (come to think of it, 'syntax error' is a bit off in any case)
Thanks,
Erik Rijkers