"incomplete startup packet" on SGI
I have a working 8.1 server running on Linux and I can connect to it
from other Linux clients. I built PostgreSQL 8.1 on an SGI (using
--without-readline but otherwise stock) and it compiled OK and installed
fine. But when I try to connect from the SGI to the Linux server I get
"could not send startup packet: transport endpoint is not connected" on
the client end and "incomplete startup packet" on the server end.
Connectivity between the two machines is working.
I could find basically no useful references to the former and the only
references to the latter were portscans and the like.
Browsing the source, I see a couple of places that message could come
from. One relates to SSL, which the output from configure says is
turned off on both client and server. The other is just a generic comm
error--but what could cause a partial failure like that?
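For readers unfamiliar with what the server is waiting for: the first thing a client sends on a new connection is a length-prefixed startup packet, and the "incomplete startup packet" message appears to be logged when the connection ends before the server has read all of it. The following is a minimal sketch of the protocol 3.0 layout (4-byte length, 4-byte protocol version, NUL-terminated name/value pairs, one final NUL). The function names build_startup_packet and put_cstr are made up for illustration and are not libpq functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl */

/* Copy s, including its terminating NUL, to buf+off; return the new offset. */
static size_t put_cstr(unsigned char *buf, size_t off, const char *s)
{
    size_t n = strlen(s) + 1;
    memcpy(buf + off, s, n);
    return off + n;
}

/*
 * Build a protocol 3.0 startup packet carrying only "user" and "database".
 * Returns the packet length in bytes, or 0 if buf is too small.
 * (Illustrative helper, not part of libpq.)
 */
size_t build_startup_packet(unsigned char *buf, size_t cap,
                            const char *user, const char *database)
{
    size_t need, off = 8;
    uint32_t word;

    assert(buf != NULL && user != NULL && database != NULL);

    /* length word + version word + two name/value pairs + final NUL */
    need = 8 + (strlen("user") + 1) + (strlen(user) + 1)
             + (strlen("database") + 1) + (strlen(database) + 1) + 1;
    if (need > cap)
        return 0;

    word = htonl((uint32_t) need);        /* length counts itself */
    memcpy(buf, &word, 4);
    word = htonl((uint32_t) 3 << 16);     /* protocol version 3.0 */
    memcpy(buf + 4, &word, 4);

    off = put_cstr(buf, off, "user");
    off = put_cstr(buf, off, user);
    off = put_cstr(buf, off, "database");
    off = put_cstr(buf, off, database);
    buf[off++] = '\0';                    /* terminator after the last pair */

    assert(off == need);
    return need;
}
```

If the client's send fails partway, or never happens at all, the server sees fewer bytes than the length word promises (or none), which fits the pairing of errors reported above.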
Just finished building and installing on *Sun* (also
"--without-readline", not that I think that could be the issue): Works
fine. So it's something to do with the SGI build in particular.
David Rysdam <drysdam@ll.mit.edu> writes:
Just finished building and installing on *Sun* (also
"--without-readline", not that I think that could be the issue): Works
fine. So it's something to do with the SGI build in particular.
IRIX buggy, film at 11. :)
-Doug
David Rysdam <drysdam@ll.mit.edu> writes:
Just finished building and installing on *Sun* (also
"--without-readline", not that I think that could be the issue): Works
fine. So it's something to do with the SGI build in particular.
More likely it's something to do with weird behavior of the SGI kernel's
TCP stack. I did a little googling for "transport endpoint is not
connected" without turning up anything obviously related, but that or
ENOTCONN is probably what you need to search on.
regards, tom lane
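To make the ENOTCONN suggestion concrete: the error means the kernel does not consider the socket to have a connected peer at the moment of the call. The sketch below shows one simple way to ask the kernel that question, using getpeername(). The helper name getpeername_errno is hypothetical, and this only illustrates where the error code comes from; it does not reproduce the IRIX behavior discussed in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Ask the kernel whether fd has a connected peer.
 * Returns 0 if it does, otherwise the errno value from getpeername()
 * (ENOTCONN for a socket that is not connected).
 * (Illustrative helper, not part of libpq.)
 */
int getpeername_errno(int fd)
{
    struct sockaddr_storage peer;
    socklen_t len = sizeof peer;

    assert(fd >= 0);

    if (getpeername(fd, (struct sockaddr *) &peer, &len) == 0)
        return 0;
    return errno;
}
```

Called on a freshly created TCP socket that has never been connected, this should return ENOTCONN; passing that value to strerror() gives the platform's text for it, which on Linux is the "transport endpoint is not connected" wording seen in the client error.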
It's acting like a race condition or pointer problem. When I add random
debug printfs/PQflushs to libpq it sometimes works.
Not a race condition: no threads.
Not a memory or pointer problem: Electric Fence reports nothing. And it
works when Electric Fence is linked in, whereas a binary that uses the
same libpq without linking efence does not work.
I know nobody is interested in this, but I think I should document the
"solution" for anyone who finds this thread in the archives: My theory
is that IRIX is unable to keep up with how fast the PostgreSQL client is
going, and that the debug statements/efence stuff slow the client down
enough that IRIX can catch up and make sure the socket really is there,
connected and working. To that end, I inserted a sleep(1) in
fe-connect.c just before the pqPacketSend(...startpacket...) stuff.
It's stupid and hacky, but it gets me where I need to be, and maybe this
hint will inspire somebody who knows (and cares) about IRIX to find a
real fix.
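As a starting point for such a fix, the usual alternative to a fixed sleep is to wait until the kernel itself reports the connection as established before sending anything. The sketch below polls for writability, checks SO_ERROR, and then confirms with getpeername() that a peer really exists. It is a sketch of the general technique under the theory described above, not libpq's actual code; the name wait_for_connect and the timeout parameter are made up for illustration, and it has not been tested on IRIX.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Wait up to timeout_ms for a connect() on fd to complete.
 * Returns 0 once the socket is writable, has no pending error, and has
 * a connected peer. Returns -1 otherwise, with errno set (ETIMEDOUT if
 * the wait expired, ENOTCONN if no peer is attached).
 * (Illustrative helper, not part of libpq.)
 */
int wait_for_connect(int fd, int timeout_ms)
{
    struct pollfd pfd;
    struct sockaddr_storage peer;
    socklen_t len;
    int rc, soerr = 0;

    assert(timeout_ms >= 0);

    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;

    /* Retry if a signal interrupts the wait. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    if (rc == 0) {
        errno = ETIMEDOUT;
        return -1;
    }

    /* A failed connect attempt leaves its error code in SO_ERROR. */
    len = sizeof soerr;
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &soerr, &len) < 0)
        return -1;
    if (soerr != 0) {
        errno = soerr;
        return -1;
    }

    /* Final check: fails with ENOTCONN if no peer is attached yet. */
    len = sizeof peer;
    if (getpeername(fd, (struct sockaddr *) &peer, &len) < 0)
        return -1;

    return 0;
}
```

A caller would invoke something like wait_for_connect(sock, 1000) at the point where the sleep(1) was inserted and only send the startup packet if it returns 0. Whether this is enough on IRIX is an open question; it only replaces the guess about timing with an explicit check.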