Problems with 8.3
I'm getting the back end closing connections early for some reason.
Here is an exception report from my servlet. This first started
happening with my instance of Trac, but now it's happening to my Java
apps too. I hope someone can shed some light on what is going on
here.
Alex
HTTP Status 500 -
type Exception report
message
description The server encountered an internal error () that prevented
it from fulfilling this request.
exception
org.apache.jasper.JasperException: javax.servlet.ServletException:
javax.servlet.jsp.JspException:
select venue,
city, a.state from venue a, city b, subscriber c where
a.city_id=b.city_id and
distance(deg2rad(c.latitude),deg2rad(c.longitude), deg2rad(a.latitude),
deg2rad(a.longitude))<60.0 and c.subscriber_id=?
: An I/O
error occured while sending to the backend.
org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:541)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:417)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:320)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:266)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
root cause
javax.servlet.ServletException: javax.servlet.jsp.JspException:
select venue,
city, a.state from venue a, city b, subscriber c where
a.city_id=b.city_id and
distance(deg2rad(c.latitude),deg2rad(c.longitude), deg2rad(a.latitude),
deg2rad(a.longitude))<60.0 and c.subscriber_id=?
: An I/O
error occured while sending to the backend.
org.apache.jasper.runtime.PageContextImpl.doHandlePageException(PageContextImpl.java:846)
org.apache.jasper.runtime.PageContextImpl.handlePageException(PageContextImpl.java:779)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspService(find_005fsubscriber_jsp.java:420)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:393)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:320)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:266)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
root cause
org.postgresql.util.PSQLException: An I/O error occured while
sending to the backend.
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:217)
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:452)
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:351)
org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:255)
org.apache.taglibs.standard.tag.common.sql.QueryTagSupport.doEndTag(QueryTagSupport.java:215)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_sql_005fquery_005f2(find_005fsubscriber_jsp.java:1388)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_c_005fif_005f6(find_005fsubscriber_jsp.java:933)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_c_005fforEach_005f0(find_005fsubscriber_jsp.java:752)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_c_005fif_005f2(find_005fsubscriber_jsp.java:543)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspService(find_005fsubscriber_jsp.java:220)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:393)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:320)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:266)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
root cause
java.io.EOFException
org.postgresql.core.PGStream.ReceiveChar(PGStream.java:257)
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1165)
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:191)
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:452)
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:351)
org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:255)
org.apache.taglibs.standard.tag.common.sql.QueryTagSupport.doEndTag(QueryTagSupport.java:215)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_sql_005fquery_005f2(find_005fsubscriber_jsp.java:1388)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_c_005fif_005f6(find_005fsubscriber_jsp.java:933)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_c_005fforEach_005f0(find_005fsubscriber_jsp.java:752)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_c_005fif_005f2(find_005fsubscriber_jsp.java:543)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspService(find_005fsubscriber_jsp.java:220)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:393)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:320)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:266)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
note The full stack trace of the root cause is available in the Apache
Tomcat/6.0.14 logs.
Ok - lookint at the pg log, it appears that the server process is seg
faulting :(. This might conceivably be my fault. I have 3 stored
procedures written in C, but they've been on the server for months,
and unless I didn't deploy them correctly originally I don't think
it's them.
I'm thinking I need a core file. I'm guessing I just restart
postgresql from a user whos core file size limit is set to non zero?
Alex
Show quoted text
On Thu, Mar 6, 2008 at 3:49 PM, Alex Turner <armtuk@gmail.com> wrote:
I'm getting the back end closing connections early for some reason.
Here is an exception report from my servlet. This first started
happening with my instance of Trac, but now it's happening to my Java
apps too. I hope someone can shed some light on what is going on
here.Alex
HTTP Status 500 -
type Exception report
message
description The server encountered an internal error () that prevented
it from fulfilling this request.exception
org.apache.jasper.JasperException: javax.servlet.ServletException:
javax.servlet.jsp.JspException:select venue,
city, a.state from venue a, city b, subscriber c where
a.city_id=b.city_id and
distance(deg2rad(c.latitude),deg2rad(c.longitude), deg2rad(a.latitude),
deg2rad(a.longitude))<60.0 and c.subscriber_id=?: An I/O
error occured while sending to the backend.org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:541)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:417)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:320)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:266)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
root cause
javax.servlet.ServletException: javax.servlet.jsp.JspException:
select venue,
city, a.state from venue a, city b, subscriber c where
a.city_id=b.city_id and
distance(deg2rad(c.latitude),deg2rad(c.longitude), deg2rad(a.latitude),
deg2rad(a.longitude))<60.0 and c.subscriber_id=?: An I/O
error occured while sending to the backend.org.apache.jasper.runtime.PageContextImpl.doHandlePageException(PageContextImpl.java:846)
org.apache.jasper.runtime.PageContextImpl.handlePageException(PageContextImpl.java:779)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspService(find_005fsubscriber_jsp.java:420)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:393)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:320)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:266)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
root cause
org.postgresql.util.PSQLException: An I/O error occured while
sending to the backend.org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:217)
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:452)
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:351)
org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:255)
org.apache.taglibs.standard.tag.common.sql.QueryTagSupport.doEndTag(QueryTagSupport.java:215)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_sql_005fquery_005f2(find_005fsubscriber_jsp.java:1388)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_c_005fif_005f6(find_005fsubscriber_jsp.java:933)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_c_005fforEach_005f0(find_005fsubscriber_jsp.java:752)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_c_005fif_005f2(find_005fsubscriber_jsp.java:543)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspService(find_005fsubscriber_jsp.java:220)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:393)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:320)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:266)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
root cause
java.io.EOFException
org.postgresql.core.PGStream.ReceiveChar(PGStream.java:257)
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1165)
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:191)
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:452)
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:351)
org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:255)
org.apache.taglibs.standard.tag.common.sql.QueryTagSupport.doEndTag(QueryTagSupport.java:215)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_sql_005fquery_005f2(find_005fsubscriber_jsp.java:1388)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_c_005fif_005f6(find_005fsubscriber_jsp.java:933)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_c_005fforEach_005f0(find_005fsubscriber_jsp.java:752)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspx_meth_c_005fif_005f2(find_005fsubscriber_jsp.java:543)
org.apache.jsp.admin.find_005fsubscriber_jsp._jspService(find_005fsubscriber_jsp.java:220)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:393)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:320)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:266)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
note The full stack trace of the root cause is available in the Apache
Tomcat/6.0.14 logs.
On 3/6/08, Alex Turner <armtuk@gmail.com> wrote:
I'm getting the back end closing connections early for some reason.
Here is an exception report from my servlet. This first started
happening with my instance of Trac, but now it's happening to my Java
apps too. I hope someone can shed some light on what is going on
here.
You need to look in the server logs and see why it's crashing--all the
exception traces tell us is that the server went away unexpectedly.
-Doug
On 3/6/08, Alex Turner <armtuk@gmail.com> wrote:
Ok - lookint at the pg log, it appears that the server process is seg
faulting :(. This might conceivably be my fault. I have 3 stored
procedures written in C, but they've been on the server for months,
and unless I didn't deploy them correctly originally I don't think
it's them.I'm thinking I need a core file. I'm guessing I just restart
postgresql from a user whos core file size limit is set to non zero?
Or just put a 'ulimit' statement in the script that starts PG, or edit
/etc/security/limits.conf if you're on RedHat.
-Doug
Ok - the connection closed thing is happening a lot, but not much is
going into pg_log...
Alex
Show quoted text
On Thu, Mar 6, 2008 at 4:03 PM, Douglas McNaught <doug@mcnaught.org> wrote:
On 3/6/08, Alex Turner <armtuk@gmail.com> wrote:
Ok - lookint at the pg log, it appears that the server process is seg
faulting :(. This might conceivably be my fault. I have 3 stored
procedures written in C, but they've been on the server for months,
and unless I didn't deploy them correctly originally I don't think
it's them.I'm thinking I need a core file. I'm guessing I just restart
postgresql from a user whos core file size limit is set to non zero?Or just put a 'ulimit' statement in the script that starts PG, or edit
/etc/security/limits.conf if you're on RedHat.-Doug
Alex Turner wrote:
I'm getting the back end closing connections early for some reason.
Here is an exception report from my servlet. This first started
happening with my instance of Trac, but now it's happening to my Java
apps too. I hope someone can shed some light on what is going on
here.
What do postgresql's server logs show?
--
Richard Huxton
Archonet Ltd
Sometimes I'm getting LOG: unexptected EOF on client connection
Alex
Show quoted text
On Thu, Mar 6, 2008 at 4:06 PM, Alex Turner <armtuk@gmail.com> wrote:
Ok - the connection closed thing is happening a lot, but not much is
going into pg_log...Alex
On Thu, Mar 6, 2008 at 4:03 PM, Douglas McNaught <doug@mcnaught.org> wrote:
On 3/6/08, Alex Turner <armtuk@gmail.com> wrote:
Ok - lookint at the pg log, it appears that the server process is seg
faulting :(. This might conceivably be my fault. I have 3 stored
procedures written in C, but they've been on the server for months,
and unless I didn't deploy them correctly originally I don't think
it's them.I'm thinking I need a core file. I'm guessing I just restart
postgresql from a user whos core file size limit is set to non zero?Or just put a 'ulimit' statement in the script that starts PG, or edit
/etc/security/limits.conf if you're on RedHat.-Doug
Ok - more info
I turned on connection logging a few other things, and normaly I get
something like this:
LOG: 00000: connection authorized: user=postgres database=trend
LOCATION: BackendInitialize, postmaster.c:3097
LOG: 00000: disconnection: session time: 0:00:00.004 user=postgres
database=trend host=localhost port=51586
but when it fails, I get nothing in the log at all...
ALex
Show quoted text
On Thu, Mar 6, 2008 at 4:09 PM, Alex Turner <armtuk@gmail.com> wrote:
Sometimes I'm getting LOG: unexptected EOF on client connection
Alex
On Thu, Mar 6, 2008 at 4:06 PM, Alex Turner <armtuk@gmail.com> wrote:
Ok - the connection closed thing is happening a lot, but not much is
going into pg_log...Alex
On Thu, Mar 6, 2008 at 4:03 PM, Douglas McNaught <doug@mcnaught.org> wrote:
On 3/6/08, Alex Turner <armtuk@gmail.com> wrote:
Ok - lookint at the pg log, it appears that the server process is seg
faulting :(. This might conceivably be my fault. I have 3 stored
procedures written in C, but they've been on the server for months,
and unless I didn't deploy them correctly originally I don't think
it's them.I'm thinking I need a core file. I'm guessing I just restart
postgresql from a user whos core file size limit is set to non zero?Or just put a 'ulimit' statement in the script that starts PG, or edit
/etc/security/limits.conf if you're on RedHat.-Doug
Alex Turner wrote:
Ok - lookint at the pg log, it appears that the server process is seg
faulting :(. This might conceivably be my fault. I have 3 stored
procedures written in C, but they've been on the server for months,
and unless I didn't deploy them correctly originally I don't think
it's them.
You've had stored procedures written in C deployed on 8.3 for months?
Or you had stored procedures written for 8.2 and have re-deployed onto 8.3?
--
Richard Huxton
Archonet Ltd
"Alex Turner" <armtuk@gmail.com> writes:
I'm thinking I need a core file.
Yeah.
I'm guessing I just restart
postgresql from a user whos core file size limit is set to non zero?
Probably won't help if your start script does "su postgres" as most do.
What I'd try is putting the ulimit command in ~postgres/.profile or
whatever init files your shell happens to like.
regards, tom lane
I did a pg_ctl start from the postgres user... is that gonna work, or
does pg_ctl do an su?
Alex
Show quoted text
On Thu, Mar 6, 2008 at 4:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Alex Turner" <armtuk@gmail.com> writes:
I'm thinking I need a core file.
Yeah.
I'm guessing I just restart
postgresql from a user whos core file size limit is set to non zero?Probably won't help if your start script does "su postgres" as most do.
What I'd try is putting the ulimit command in ~postgres/.profile or
whatever init files your shell happens to like.regards, tom lane
Alex Turner wrote:
I had stored procs in C on 8.2 for months, and I moved them over to
8.3 when we upgraded.
And re-compiled them, yes?
The thing is that it's happening on a database
that doesn't have the stored procs. It seems to be happening worst on
a system that keeps connections open between requests. It's failing
about half the time.
Hmm - if it's individual backends dying, and not those with the C
functions then that should rule them out. I think that core dump will be
your quickest way of figuring this out.
--
Richard Huxton
Archonet Ltd
Import Notes
Reply to msg id not found: 33c6269f0803061328j1d8bb167i60a1ca0b2ff455c8@mail.gmail.com
It was core dumping on the 5th of March, but it hasn't since. It's
just failing with the connection closed problem. It seems to happen
worst with queries that are going to do updates and with connections
that are persistent between http requests...
I downgraded to 8.2, but it hasn't made any difference :(
Alex
Show quoted text
On Fri, Mar 7, 2008 at 3:00 AM, Richard Huxton <dev@archonet.com> wrote:
Alex Turner wrote:
I had stored procs in C on 8.2 for months, and I moved them over to
8.3 when we upgraded.And re-compiled them, yes?
The thing is that it's happening on a database
that doesn't have the stored procs. It seems to be happening worst on
a system that keeps connections open between requests. It's failing
about half the time.Hmm - if it's individual backends dying, and not those with the C
functions then that should rule them out. I think that core dump will be
your quickest way of figuring this out.--
Richard Huxton
Archonet Ltd
Alex Turner wrote:
It was core dumping on the 5th of March, but it hasn't since. It's
just failing with the connection closed problem. It seems to happen
worst with queries that are going to do updates and with connections
that are persistent between http requests...I downgraded to 8.2, but it hasn't made any difference :(
It seems odd that nothing is being logged.
Is there nothing in the system logs for the machine as a whole? You're
not running out of memory and PG is being killed? Even then, you'd
expect something to pop up in the logs.
--
Richard Huxton
Archonet Ltd
Nothin worth mentioning in /var/log/messages
The wierd thing I do see is there are a number of sockets in
CLOSE_WAIT when doing a netstat -an | grep 5432
I think maybe I'll just reboot and see if that fixes it.
Alex
Show quoted text
On Fri, Mar 7, 2008 at 11:47 AM, Richard Huxton <dev@archonet.com> wrote:
Alex Turner wrote:
It was core dumping on the 5th of March, but it hasn't since. It's
just failing with the connection closed problem. It seems to happen
worst with queries that are going to do updates and with connections
that are persistent between http requests...I downgraded to 8.2, but it hasn't made any difference :(
It seems odd that nothing is being logged.
Is there nothing in the system logs for the machine as a whole? You're
not running out of memory and PG is being killed? Even then, you'd
expect something to pop up in the logs.--
Richard Huxton
Archonet Ltd
I didn't. And after the reboot, I still see 8 new sockets stuck in
CLOSE_WAIT - I'm wondering if this is a hardware/kernel problem...
Alex
Show quoted text
On Fri, Mar 7, 2008 at 11:50 AM, Alex Turner <armtuk@gmail.com> wrote:
Nothin worth mentioning in /var/log/messages
The wierd thing I do see is there are a number of sockets in
CLOSE_WAIT when doing a netstat -an | grep 5432I think maybe I'll just reboot and see if that fixes it.
Alex
On Fri, Mar 7, 2008 at 11:47 AM, Richard Huxton <dev@archonet.com> wrote:
Alex Turner wrote:
It was core dumping on the 5th of March, but it hasn't since. It's
just failing with the connection closed problem. It seems to happen
worst with queries that are going to do updates and with connections
that are persistent between http requests...I downgraded to 8.2, but it hasn't made any difference :(
It seems odd that nothing is being logged.
Is there nothing in the system logs for the machine as a whole? You're
not running out of memory and PG is being killed? Even then, you'd
expect something to pop up in the logs.--
Richard Huxton
Archonet Ltd
On Fri, Mar 7, 2008 at 11:17 AM, Alex Turner <armtuk@gmail.com> wrote:
I didn't. And after the reboot, I still see 8 new sockets stuck in
CLOSE_WAIT - I'm wondering if this is a hardware/kernel problem...
Having sockets in CLOSE_WAIT is actually pretty normal
Well - I think it might be that some of my servlets weren't closing
their database connections properly.
I do have some new evidence though:
I did an strace of the tomcat processes, and I noticed something that
might be odd, but I'm not really qualified to say. I notice that
every time a socket sends a request to Postgresql it gets some kind of
reply. This is true in all cases EXCEPT when the application crashes.
Here is the segment of the strace right before it throws a wobbly:
[pid 4565] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 156
[pid 4565] bind(156, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 4565] getsockname(156, {sa_family=AF_INET,
sin_port=htons(56550), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
[pid 4565] connect(156, {sa_family=AF_INET, sin_port=htons(5432),
sin_addr=inet_addr("127.0.0.1")}, 16) = 0
[pid 4565] setsockopt(156, SOL_TCP, TCP_NODELAY, [1], 4) = 0
[pid 4565] send(156, "\0\0\0W\0\3\0\0user\0postgres\0database\0t"...,
87, 0) = 87
[pid 4565] recv(156,
"R\0\0\0\10\0\0\0\0S\0\0\0\34client_encoding\0UN"..., 8192, 0) = 279
[pid 4565] gettimeofday({1204948966, 386187}, NULL) = 0
[pid 4565] send(156, "P\0\0\1\35\0\r\n \t\tselect"...,
334, 0) = 334
[pid 4565] recv(156, "", 8192, 0) = 0
[pid 4565] send(156, "X\0\0\0\4", 5, 0) = 5
[pid 4565] dup2(11, 156) = 156
[pid 4565] close(156) = 0
Notice that the recv(156,... after sending the query comes back blank
which seems odd given that we just sent a query to the database.
I'm really in bind with this one. It started happening a couple of
days ago at this point, and all our admin applications are basically
down :(, people can't even log the bugs that this is generating
because the bugtrac (trac) is running on this postgresql and is
throwing errors too.
I also caught something else that seemed wierd on another trace:
[pid 3553] send(28, "P\0\0\0H\0delete from result_cache w"..., 108, 0) = 108
[pid 3553] recv(28, "N\0\0\1\202SWARNING\0C57P02\0Mterminatin"...,
8192, 0) = 387
[pid 3553] gettimeofday({1204946902, 977641}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977682}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977766}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977902}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977973}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 978012}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 978053}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 978091}, NULL) = 0
[pid 3553] recv(28, "", 8192, 0) = 0
[pid 3553] send(28, "X\0\0\0\4", 5, 0) = -1 EPIPE (Broken pipe)
[pid 3553] --- SIGPIPE (Broken pipe) @ 0 (0) ---
[pid 3553] rt_sigreturn(0x9) = -1 EPIPE (Broken pipe)
I couldn't reproduce this though. It just randomly throws a SIGPIPE
after the query. The other wierd thing is that this process also
throws a SIGSEGV at another point. I wasn't expecting tomcat to
crash, so alas I didn't capture a core file. I guess I should set the
system default up.
Alex
Show quoted text
On Fri, Mar 7, 2008 at 2:28 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Fri, Mar 7, 2008 at 11:17 AM, Alex Turner <armtuk@gmail.com> wrote:
I didn't. And after the reboot, I still see 8 new sockets stuck in
CLOSE_WAIT - I'm wondering if this is a hardware/kernel problem...Having sockets in CLOSE_WAIT is actually pretty normal
Well - I know that my stored proc is segfaulting based on a strace of
postgresql. Don't know how that affects trac which isn't using that
stored proc... the mystery continues. Either way I didn't get a
corefile, and ulimit -a show I have unlimited core file size :(
Alex
Show quoted text
On Fri, Mar 7, 2008 at 11:42 PM, Alex Turner <armtuk@gmail.com> wrote:
Well - I think it might be that some of my servlets weren't closing
their database connections properly.I do have some new evidence though:
I did an strace of the tomcat processes, and I noticed something that
might be odd, but I'm not really qualified to say. I notice that
every time a socket sends a request to Postgresql it gets some kind of
reply. This is true in all cases EXCEPT when the application crashes.
Here is the segment of the strace right before it throws a wobbly:[pid 4565] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 156
[pid 4565] bind(156, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 4565] getsockname(156, {sa_family=AF_INET,
sin_port=htons(56550), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
[pid 4565] connect(156, {sa_family=AF_INET, sin_port=htons(5432),
sin_addr=inet_addr("127.0.0.1")}, 16) = 0
[pid 4565] setsockopt(156, SOL_TCP, TCP_NODELAY, [1], 4) = 0
[pid 4565] send(156, "\0\0\0W\0\3\0\0user\0postgres\0database\0t"...,
87, 0) = 87
[pid 4565] recv(156,
"R\0\0\0\10\0\0\0\0S\0\0\0\34client_encoding\0UN"..., 8192, 0) = 279
[pid 4565] gettimeofday({1204948966, 386187}, NULL) = 0
[pid 4565] send(156, "P\0\0\1\35\0\r\n \t\tselect"...,
334, 0) = 334
[pid 4565] recv(156, "", 8192, 0) = 0
[pid 4565] send(156, "X\0\0\0\4", 5, 0) = 5
[pid 4565] dup2(11, 156) = 156
[pid 4565] close(156) = 0Notice that the recv(156,... after sending the query comes back blank
which seems odd given that we just sent a query to the database.I'm really in bind with this one. It started happening a couple of
days ago at this point, and all our admin applications are basically
down :(, people can't even log the bugs that this is generating
because the bugtrac (trac) is running on this postgresql and is
throwing errors too.I also caught something else that seemed wierd on another trace:
[pid 3553] send(28, "P\0\0\0H\0delete from result_cache w"..., 108, 0) = 108
[pid 3553] recv(28, "N\0\0\1\202SWARNING\0C57P02\0Mterminatin"...,
8192, 0) = 387
[pid 3553] gettimeofday({1204946902, 977641}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977682}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977766}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977902}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977973}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 978012}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 978053}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 978091}, NULL) = 0
[pid 3553] recv(28, "", 8192, 0) = 0
[pid 3553] send(28, "X\0\0\0\4", 5, 0) = -1 EPIPE (Broken pipe)
[pid 3553] --- SIGPIPE (Broken pipe) @ 0 (0) ---
[pid 3553] rt_sigreturn(0x9) = -1 EPIPE (Broken pipe)I couldn't reproduce this though. It just randomly throws a SIGPIPE
after the query. The other wierd thing is that this process also
throws a SIGSEGV at another point. I wasn't expecting tomcat to
crash, so alas I didn't capture a core file. I guess I should set the
system default up.Alex
On Fri, Mar 7, 2008 at 2:28 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Fri, Mar 7, 2008 at 11:17 AM, Alex Turner <armtuk@gmail.com> wrote:
I didn't. And after the reboot, I still see 8 new sockets stuck in
CLOSE_WAIT - I'm wondering if this is a hardware/kernel problem...Having sockets in CLOSE_WAIT is actually pretty normal
On 3/8/08, Alex Turner <armtuk@gmail.com> wrote:
Well - I know that my stored proc is segfaulting based on a strace of
postgresql. Don't know how that affects trac which isn't using that
stored proc... the mystery continues. Either way I didn't get a
corefile, and ulimit -a show I have unlimited core file size :(
* Are you absolutely sure that ulimit applies to the actual running
postmaster process?
* Are you sure you're looking in the right place for core files?
-Doug