Postgres connection errors

Started by Tim Uckun over 15 years ago, 3 messages, general
#1 Tim Uckun
timuckun@gmail.com

Hello.

I have lots of ruby daemons running connected to postgres. Some of
them start getting connection errors after about a day or two of
running. The odd thing is that they don't all get the same error.

Some get this error:
    PGError: lost synchronization with server: got message type "T"
Others get this:
    PGError: lost synchronization with server: got message type "e"
And sometimes this:
    PGError: lost synchronization with server: got message type ""

What is postgres trying to tell me here? This error is most likely
coming out of libpq I would think.

Thanks.

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim Uckun (#1)
Re: Postgres connection errors

Tim Uckun <timuckun@gmail.com> writes:

> I have lots of ruby daemons running connected to postgres. Some of
> them start getting connection errors after about a day or two of
> running. The odd thing is that they don't all get the same error.
>
> Some get this error: PGError: lost synchronization with server: got
> message type "T"
> Others get this PGError: lost synchronization with server:
> got message type "e"
> And sometimes this PGError: lost synchronization with server: got
> message type ""
>
> What is postgres trying to tell me here?

Most of the cases we've seen like that have been because multiple
threads in the client application were trying to use the same PGconn
connection object concurrently. There's no cross-thread synchronization
built into libpq, so you have to provide the interlocks yourself if
there's any possibility of multiple threads touching the same PGconn
concurrently. And it will not support more than one query at a time
in any case.

But having said that ... usually apps that have made this type of
mistake start falling over almost immediately. Maybe you have a case
where it's mostly interlocked correctly, and you just missed one
infrequent code path?

regards, tom lane
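
For readers wondering what "provide the interlocks yourself" could look like in a Ruby daemon, here is a minimal sketch. It assumes the shared connection object responds to `exec(sql)` (as `PG::Connection` from the pg gem does). `SerializedConnection` and `FakeConnection` are hypothetical names made up for illustration; the fake stands in for a real connection so the sketch runs without a database. It is not necessarily how any particular daemon in this thread should be fixed.

```ruby
# Hypothetical wrapper: every call on the shared connection goes
# through one Mutex, so only one query is in flight at a time.
class SerializedConnection
  def initialize(conn)
    @conn = conn
    @lock = Mutex.new
  end

  # Other threads block here until the current query has finished.
  def exec(sql)
    @lock.synchronize { @conn.exec(sql) }
  end
end

# Stand-in for a real connection (assumption: the real object would be
# something like a PG::Connection). It records how many callers were
# inside exec at the same moment.
class FakeConnection
  attr_reader :max_concurrent

  def initialize
    @active = 0
    @max_concurrent = 0
    @guard = Mutex.new
  end

  def exec(sql)
    @guard.synchronize do
      @active += 1
      @max_concurrent = [@max_concurrent, @active].max
    end
    sleep 0.001 # pretend the server takes a moment to answer
    @guard.synchronize { @active -= 1 }
    sql
  end
end

fake = FakeConnection.new
shared = SerializedConnection.new(fake)

# Eight threads hammer the same connection through the wrapper.
threads = 8.times.map do |i|
  Thread.new { 5.times { shared.exec("SELECT #{i}") } }
end
threads.each(&:join)

puts "max concurrent callers: #{fake.max_concurrent}"
```

The wrapper only helps if every code path goes through it; a single path that touches the raw connection directly would bring back the kind of infrequent failure described above. Giving each thread its own connection avoids the lock entirely and may be the simpler choice.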

#3 Tim Uckun
timuckun@gmail.com
In reply to: Tom Lane (#2)
Re: Postgres connection errors

> Most of the cases we've seen like that have been because multiple
> threads in the client application were trying to use the same PGconn
> connection object concurrently.  There's no cross-thread synchronization
> built into libpq, so you have to provide the interlocks yourself if
> there's any possibility of multiple threads touching the same PGconn
> concurrently.  And it will not support more than one query at a time
> in any case.

These are not threaded daemons, but this does give me some sort of a
clue to work on. I noticed that there is a call to clear stale
connections, which might be the culprit because in the case of these
workers there is only one connection.