PostgreSQL Server Crash using plPHP or PL/Perl
We are writing a multi-master replication process for our Electronic
Medical Records product. We have written triggers in plPHP and then in
PL/Perl to keep an audit trail of the changes as well as flags so the data
can be replicated. We started with plPHP, but then server started
crashing, which reset all connections to the database (requiring our
application to be restarted). We then tried to rewrite the code using
PL/Perl, but the same problem has occurred. The code for the triggers are
available at:
http://medical.bmaenterprises.com/audit.plphp
http://medical.bmaenterprises.com/audit.plperl
We create the triggers by running the follow SQL statement for each table:
CREATE TRIGGER config_audit AFTER INSERT OR UPDATE OR DELETE ON config FOR
EACH ROW EXECUTE PROCEDURE audit();
Any ideas of what is causing the server to crash will be helpful.
Below are the lines from the PostgreSQL serverlog file when the crash occurs:
----------------------------------------------------------------------------
LOG: server process (PID 29153) exited with exit code 255
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
LOG: all server processes terminated; reinitializing
LOG: database system was interrupted at 2006-07-11 16:01:32 EDT
LOG: checkpoint record is at 1/F413F26C
LOG: redo record is at 1/F413F26C; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 7628670; next OID: 693120
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system was not properly shut down; automatic recovery in
progress
LOG: redo starts at 1/F413F2B0
LOG: record with zero length at 1/F4186D3C
LOG: redo done at 1/F4186D14
LOG: database system is ready
LOG: transaction ID wrap limit is 2147484146, limited by database "postgres"
Thank you,
Carl M. Nasal II
BMA Enterprises, Inc.
Are you certain that it is the trigger that is crashing the process? If
that is true, then there may be a bug in plperl.
To debug, you could use gdb, but try this first:
Use the strict pragma. To do this in plperl (instead of plperlu), use:
BEGIN { strict->import(); }
or set strict mode to on in postgresql.conf [I don't understand why
this isn't the default.]
You will need to declare all your variables using my $var. [You are
already half-way there because you declare a lot of empty strings.]
You already pepper your code with elog(NOTICE,"") so you can tell us
how far the code gets right? You can use more elogs to hone in on the
line that crashes.
Are you aware that your code will be very costly to execute?
On Jul 11, 2006, at 5:43 PM, Carl M. Nasal II wrote:
We are writing a multi-master replication process for our Electronic
Medical Records product. We have written triggers in plPHP and then
in PL/Perl to keep an audit trail of the changes as well as flags so
the data can be replicated. We started with plPHP, but then server
started crashing, which reset all connections to the database
(requiring our application to be restarted). We then tried to rewrite
the code using PL/Perl, but the same problem has occurred. The code
for the triggers are available at:http://medical.bmaenterprises.com/audit.plphp
http://medical.bmaenterprises.com/audit.plperlWe create the triggers by running the follow SQL statement for each
table:CREATE TRIGGER config_audit AFTER INSERT OR UPDATE OR DELETE ON config
FOR EACH ROW EXECUTE PROCEDURE audit();Any ideas of what is causing the server to crash will be helpful.
Below are the lines from the PostgreSQL serverlog file when the crash
occurs:
-----------------------------------------------------------------------
-----
LOG: server process (PID 29153) exited with exit code 255
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server
process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
LOG: all server processes terminated; reinitializing
LOG: database system was interrupted at 2006-07-11 16:01:32 EDT
LOG: checkpoint record is at 1/F413F26C
LOG: redo record is at 1/F413F26C; undo record is at 0/0; shutdown
FALSE
LOG: next transaction ID: 7628670; next OID: 693120
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system was not properly shut down; automatic recovery
in progress
LOG: redo starts at 1/F413F2B0
LOG: record with zero length at 1/F4186D3C
LOG: redo done at 1/F4186D14
LOG: database system is ready
LOG: transaction ID wrap limit is 2147484146, limited by database
"postgres"Thank you,
Carl M. Nasal II
BMA Enterprises, Inc.---------------------------(end of
broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬
AgentM
agentm@themactionfaction.com
¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬
2006/7/11, Carl M. Nasal II <cnasal@bmaenterprises.com>:
(...)
Any ideas of what is causing the server to crash will be helpful.
Below are the lines from the PostgreSQL serverlog file when the crash occurs:
----------------------------------------------------------------------------
LOG: server process (PID 29153) exited with exit code 255
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
(...)
It would be helpful if you provided the PostgreSQL version you're
using as well as details of the OS and possibly the hardware.
I've seen this kind of error in connection with hardware errors
(typically bad RAM or severe hard disk errors). Have you attempted
replicating this problem on another system to confirm / exclude this
as a possible cause?
Ian Barwick
Thanks for the tips. I will try to use the strict pragma as you suggested
and see if it helps.
The code runs fine most of the time. The issue is as time elapses, the
server ends up crashing. Generally, it will crash within a day of use
(with only 3 clients hitting it). The reason we believe it is the triggers
is that once we remove the triggers, the server works fine for weeks and
months with no problems. Once we re-enable the triggers, the server
crashes within 1 day.
Yes, we have seen quickly the code is very costly to execute, especially on
queries that update many rows. We have this home-grown solution because we
need the ability to have two sites have their own copy of the data to
improve performance. We tried to access data over T1 lines, but it was too
slow, and we would prefer not to be out of service if the T1 is down.
Thanks for your quick response and assistance.
Thank you,
Carl M. Nasal II
BMA Enterprises, Inc.
Agent M wrote:
Show quoted text
Are you certain that it is the trigger that is crashing the process? If
that is true, then there may be a bug in plperl.To debug, you could use gdb, but try this first:
Use the strict pragma. To do this in plperl (instead of plperlu), use:
BEGIN { strict->import(); }
or set strict mode to on in postgresql.conf [I don't understand why
this isn't the default.]
You will need to declare all your variables using my $var. [You are
already half-way there because you declare a lot of empty strings.]You already pepper your code with elog(NOTICE,"") so you can tell us how
far the code gets right? You can use more elogs to hone in on the line
that crashes.Are you aware that your code will be very costly to execute?
On Jul 11, 2006, at 5:43 PM, Carl M. Nasal II wrote:
We are writing a multi-master replication process for our Electronic
Medical Records product. We have written triggers in plPHP and then
in PL/Perl to keep an audit trail of the changes as well as flags so
the data can be replicated. We started with plPHP, but then server
started crashing, which reset all connections to the database
(requiring our application to be restarted). We then tried to rewrite
the code using PL/Perl, but the same problem has occurred. The code
for the triggers are available at:http://medical.bmaenterprises.com/audit.plphp
http://medical.bmaenterprises.com/audit.plperlWe create the triggers by running the follow SQL statement for each
table:CREATE TRIGGER config_audit AFTER INSERT OR UPDATE OR DELETE ON config
FOR EACH ROW EXECUTE PROCEDURE audit();Any ideas of what is causing the server to crash will be helpful.
Below are the lines from the PostgreSQL serverlog file when the crash
occurs:
----------------------------------------------------------------------------LOG: server process (PID 29153) exited with exit code 255
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server
process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process exited
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
LOG: all server processes terminated; reinitializing
LOG: database system was interrupted at 2006-07-11 16:01:32 EDT
LOG: checkpoint record is at 1/F413F26C
LOG: redo record is at 1/F413F26C; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 7628670; next OID: 693120
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system was not properly shut down; automatic recovery
in progress
LOG: redo starts at 1/F413F2B0
LOG: record with zero length at 1/F4186D3C
LOG: redo done at 1/F4186D14
LOG: database system is ready
LOG: transaction ID wrap limit is 2147484146, limited by database
"postgres"Thank you,
Carl M. Nasal II
BMA Enterprises, Inc.---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly� � � � � � � � � � � � � � � � � �
AgentM
agentm@themactionfaction.com
� � � � � � � � � � � � � � � � � �---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
I apologize for not including the server information in the last e-mail. I
knew I would forget something. :)
PostgreSQL 8.1.4 (compiled from source)
Debian GNU/Linux 3.1
Intel(R) Pentium(R) D CPU 3.00GHz
1GB RAM
We have not be able to test the triggers anywhere else that has any load on
it. If we run it on a test set of data, it works fine because it is not
being hit by that many users. However, if we disable the triggers,
everything on the live server appears to work fine and PostgreSQL never dies.
Also, we are using the PostgreSQL ODBC driver from a Windows application we
have developed. It seems that if the user tries to access the server while
it restarts after this error, they get an error. However, if the user is
not doing anything when the server restarts and then tries to access data
later, it appears to automatically do the reconnect and everything is fine.
Thanks again for your assistance.
Thank you,
Carl M. Nasal II
BMA Enterprises, Inc.
Ian Barwick wrote:
Show quoted text
2006/7/11, Carl M. Nasal II <cnasal@bmaenterprises.com>:
(...)Any ideas of what is causing the server to crash will be helpful.
Below are the lines from the PostgreSQL serverlog file when the crash
occurs:
----------------------------------------------------------------------------LOG: server process (PID 29153) exited with exit code 255
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server
process(...)
It would be helpful if you provided the PostgreSQL version you're
using as well as details of the OS and possibly the hardware.I've seen this kind of error in connection with hardware errors
(typically bad RAM or severe hard disk errors). Have you attempted
replicating this problem on another system to confirm / exclude this
as a possible cause?Ian Barwick
On Wed, 2006-07-12 at 07:32, Carl M. Nasal II wrote:
I apologize for not including the server information in the last e-mail. I
knew I would forget something. :)PostgreSQL 8.1.4 (compiled from source)
Debian GNU/Linux 3.1
Intel(R) Pentium(R) D CPU 3.00GHz
1GB RAMWe have not be able to test the triggers anywhere else that has any load on
it. If we run it on a test set of data, it works fine because it is not
being hit by that many users. However, if we disable the triggers,
everything on the live server appears to work fine and PostgreSQL never dies.Also, we are using the PostgreSQL ODBC driver from a Windows application we
have developed. It seems that if the user tries to access the server while
it restarts after this error, they get an error. However, if the user is
not doing anything when the server restarts and then tries to access data
later, it appears to automatically do the reconnect and everything is fine.
a poor man's way to apply lots of parallel load is to build a few web
pages that can individually apply a bit of load, and use apache's ab
program to beat on the web server and, by extension, the database.