[hensa22@yahoo.es: Re: [pgsql-es-ayuda] SLL error 100% cpu]

Started by Alvaro Herrera about 18 years ago, 6 messages
#1Alvaro Herrera
alvherre@alvh.no-ip.org

Hi,

Here's another problem report on Windows. This time it is usage of SSL
connections and NOTIFY. I talked to Magnus on IRC and he directed me to
bug #2829:
http://archives.postgresql.org/pgsql-bugs/2006-12/msg00122.php

This report seems to be a little different, if only because the reported
error string from SSL mentions an "Unknown winsock error 10004".

This guy is using 8.2.5. SSL seems to be able to fill his log files at
full speed.

Is this an issue we can do something about?

----- Forwarded message from Henry <hensa22@yahoo.es> -----

From: Henry <hensa22@yahoo.es>
To: Alvaro Herrera <alvherre@alvh.no-ip.org>
Cc: Postgres <pgsql-es-ayuda@postgresql.org>
Date: Wed, 12 Dec 2007 03:34:04 +0100 (CET)
Subject: Re: [pgsql-es-ayuda] SLL error 100% cpu
Message-ID: <744138.71684.qm@web30802.mail.mud.yahoo.com>

--- Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Henry wrote:

Hello to everyone on the list.

I have put SSL into production with PostgreSQL, and performance
degrades the longer it is in use: the processes take 100% of the
CPU, and when I stop the service some postgres.exe processes are
left hanging. I disabled log writing because too many log files
were being created with the text SYSCALL ERROR............... ;
oddly, a 14MB file was even created (strange, since it is
configured for at most 10MB).

---------------------------------

Can you send an excerpt of that giant file? A few lines of that
SYSCALL ERROR.

----------------------------------

Here it is:
LOG: SSL SYSCALL error: Unknown winsock error 10004

Regards


----- End forwarded message -----

--
Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC
<Schwern> It does it in a really, really complicated way
<crab> why does it need to be complicated?
<Schwern> Because it's MakeMaker.

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#1)
Re: [hensa22@yahoo.es: Re: [pgsql-es-ayuda] SLL error 100% cpu]

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

This guy is using 8.2.5. SSL seems to be able to fill his log files at
full speed.

Are you *sure* the server is 8.2.5? 8.2.5 shouldn't emit duplicate
messages, but 8.2.4 and before would:

2007-05-17 21:20 tgl

* src/backend/libpq/: be-secure.c (REL7_4_STABLE), be-secure.c
(REL8_1_STABLE), be-secure.c (REL8_0_STABLE), be-secure.c
(REL8_2_STABLE), be-secure.c: Remove redundant logging of send
failures when SSL is in use. While pqcomm.c had been taught not to
do that ages ago, the SSL code was helpfully bleating anyway.
Resolves some recent reports such as bug #3266; however the
underlying cause of the related bug #2829 is still unclear.

Furthermore, it looks to me like "SSL SYSCALL error: %m" doesn't
exist anymore since that patch, so my bogometer is buzzing loudly.

I dunno anything about how to fix the real problem (what's winsock error
10004?), but I don't think he'd be seeing full speed log filling in
8.2.5.

regards, tom lane

#3Trevor Talbot
quension@gmail.com
In reply to: Tom Lane (#2)
Re: [hensa22@yahoo.es: Re: [pgsql-es-ayuda] SLL error 100% cpu]

On 12/11/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

I dunno anything about how to fix the real problem (what's winsock error
10004?), but I don't think he'd be seeing full speed log filling in
8.2.5.

WSAEINTR, "A blocking operation was interrupted by a call to
WSACancelBlockingCall."

Offhand I'd take it as either not entirely sane usage of a network
API, or one of the so very many broken software firewalls / network
security products.

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Trevor Talbot (#3)
Re: [hensa22@yahoo.es: Re: [pgsql-es-ayuda] SLL error 100% cpu]

"Trevor Talbot" <quension@gmail.com> writes:

On 12/11/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I dunno anything about how to fix the real problem (what's winsock error
10004?),

WSAEINTR, "A blocking operation was interrupted by a call to
WSACancelBlockingCall."

Oh, then it's exactly the same thing as our bug #2829.

I opined in that thread that OpenSSL was broken because it failed to
treat this as a retryable case like EINTR. But not being much of a
Windows person, that might be mere hot air. Someone with a Windows
build environment should try patching OpenSSL to treat WSAEINTR
the same as Unix EINTR and see what happens ...

regards, tom lane
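
The idea of treating WSAEINTR like Unix EINTR can be illustrated with a minimal sketch. This is not OpenSSL or PostgreSQL code: `SOCK_EINTR`, `io_write_fn`, `write_retrying`, and `flaky_write` are hypothetical names invented here, and the bounded retry count is an assumption made only to keep the example from looping forever.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifdef _WIN32
#include <winsock2.h>
#define SOCK_EINTR WSAEINTR     /* winsock error 10004 */
#else
#define SOCK_EINTR EINTR
#endif

/* Hypothetical write callback: returns bytes written, or -1 and sets *err. */
typedef long (*io_write_fn)(void *ctx, const void *buf, size_t len, int *err);

/* Returns 1 if the error means "interrupted, safe to try again". */
static int is_interrupted(int err)
{
    return err == SOCK_EINTR;
}

/*
 * Call fn, retrying while it reports an interruption. Gives up after
 * max_retries retries (an assumption of this sketch) or on any other error.
 */
static long write_retrying(io_write_fn fn, void *ctx, const void *buf,
                           size_t len, int max_retries, int *err)
{
    int attempt;

    assert(fn != NULL && err != NULL);
    for (attempt = 0;; attempt++)
    {
        long n = fn(ctx, buf, len, err);

        if (n >= 0)
            return n;           /* success */
        if (!is_interrupted(*err) || attempt >= max_retries)
            return -1;          /* hard failure, or too many interruptions */
    }
}

/*
 * Demo stand-in for a socket write: fails with SOCK_EINTR as many times as
 * the counter in ctx says, then succeeds.
 */
static long flaky_write(void *ctx, const void *buf, size_t len, int *err)
{
    int *remaining = ctx;

    (void) buf;
    if (*remaining > 0)
    {
        (*remaining)--;
        *err = SOCK_EINTR;
        return -1;
    }
    *err = 0;
    return (long) len;
}
```

With `flaky_write` as the callback, a write that is interrupted twice still succeeds, whereas code that treats the interruption as fatal would report an error on every attempt, which matches the kind of log flooding described in the report.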

#5Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#4)
Re: [hensa22@yahoo.es: Re: [pgsql-es-ayuda] SLL error 100% cpu]

On Wed, Dec 12, 2007 at 12:30:50AM -0500, Tom Lane wrote:

"Trevor Talbot" <quension@gmail.com> writes:

On 12/11/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I dunno anything about how to fix the real problem (what's winsock error
10004?),

WSAEINTR, "A blocking operation was interrupted by a call to
WSACancelBlockingCall."

Oh, then it's exactly the same thing as our bug #2829.

I opined in that thread that OpenSSL was broken because it failed to
treat this as a retryable case like EINTR. But not being much of a
Windows person, that might be mere hot air. Someone with a Windows
build environment should try patching OpenSSL to treat WSAEINTR
the same as Unix EINTR and see what happens ...

When I last looked at this (and this was some time ago), I suspected (and
still do) that the problem is in the interaction between our
socket-emulation-stuff (for signals) and openssl. I'm not entirely sure,
but I wanted to rewrite the SSL code so that *our* code is responsible for
calling the actual send()/recv(), and not OpenSSL. This would also fix the
fact that if an OpenSSL network operation ends up blocking, that process
can't receive any signals...

I didn't have time to get this done before feature-freeze though, and I
believe the changes are large enough to qualify as such..

//Magnus
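
The split described above, where the SSL layer only transforms bytes and the application performs the actual send, might be sketched as follows. Everything here is hypothetical: `app_transport`, `toy_protect`, `secure_write`, and `capture_send` are illustrative names, and the XOR step is only a placeholder for record encryption, not real cryptography and not how OpenSSL or PostgreSQL implement it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical application-owned transport: the caller supplies the function
 * that touches the socket, so it can apply its own signal handling there.
 */
typedef struct app_transport
{
    long (*send_fn)(void *ctx, const unsigned char *buf, size_t len);
    void *ctx;
} app_transport;

/*
 * Placeholder for the SSL layer's record protection. XOR is NOT encryption;
 * it only marks where the library would transform the plaintext in memory.
 */
static size_t toy_protect(const unsigned char *in, size_t len,
                          unsigned char *out, size_t cap)
{
    size_t n = len < cap ? len : cap;
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = in[i] ^ 0x5A;
    return n;
}

/*
 * Transform in memory, then hand the bytes to the application's send
 * function. Returns plaintext bytes consumed, or -1 on send failure.
 * Assumes send_fn sends the whole buffer or fails (no partial writes).
 */
static long secure_write(app_transport *t, const unsigned char *plain,
                         size_t len)
{
    unsigned char record[256];
    size_t n;
    long sent;

    assert(t != NULL && t->send_fn != NULL);
    n = toy_protect(plain, len, record, sizeof record);
    sent = t->send_fn(t->ctx, record, n);
    return sent < 0 ? -1 : (long) n;
}

/* Demo sink standing in for a socket: appends the bytes to a buffer. */
typedef struct capture
{
    unsigned char data[256];
    size_t len;
} capture;

static long capture_send(void *ctx, const unsigned char *buf, size_t len)
{
    capture *c = ctx;
    size_t i;

    for (i = 0; i < len && c->len < sizeof c->data; i++)
        c->data[c->len++] = buf[i];
    return (long) i;
}
```

Because the only code that reaches the network is `send_fn`, the application decides what to do when that call blocks or is interrupted; the transform step never waits on a socket.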

#6Bruce Momjian
bruce@momjian.us
In reply to: Magnus Hagander (#5)
Re: [hensa22@yahoo.es: Re: [pgsql-es-ayuda] SLL error 100% cpu]

Added to TODO:

o Prevent SSL from sending network packets to avoid interference
with Win32 signal emulation

http://archives.postgresql.org/pgsql-hackers/2007-12/msg00455.php



--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +