Bug in window xp

Started by Wang Haiyong · about 20 years ago · 17 messages · bugs
#1Wang Haiyong
wanghaiyong@neusoft.com

Version: 8.1.3
Bug on Windows XP:

C:\Documents and Settings\openbase>pg_ctl start
LOG: database system was shut down at 2006-4-04 15:54:43 China Standard Time
LOG: checkpoint record is at 0/38C2E0
LOG: redo record is at 0/38C2E0; undo record is at 0/0; shutdown TRUE
LOG: next transaction ID: 569; next OID: 24576
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system is ready
LOG: transaction ID wrap limit is 2147484146, limited by database "postgres"

C:\Documents and Settings\openbase>
C:\Documents and Settings\openbase>
C:\Documents and Settings\openbase>
C:\Documents and Settings\openbase>psql
Welcome to psql 8.1.3, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit

openbase=# SELECT (-2147483648) / (-1);
LOG: server process (PID 3760) was terminated by signal 21
LOG: terminating any other active server processes
LOG: all server processes terminated; reinitializing
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: LOG: database system was interrupted at 2006-0-05 08:39:56 China Standard Time
LOG: checkpoint record is at 0/38C2E0
LOG: redo record is at 0/38C2E0; undo record is at 0/0; shutdown TRUE
LOG: next transaction ID: 569; next OID: 24576
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system was not properly shut down; automatic recovery in progress

FATAL: the database system is starting up
Failed.
!> LOG: record with zero length at 0/38C328
LOG: redo is not required
LOG: database system is ready
LOG: transaction ID wrap limit is 2147484146, limited by database "postgres"

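Wang Haiyong
Neusoft Group, Software Products Division

Address: Building A1, Neusoft Software Park, Hunnan High-Tech Industrial Development Zone, Shenyang
Postal code: 110179
Tel: 024-83661905
Website: www.neusoft.com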

----------------------------------------------------------------------------------------------
Confidentiality Notice: The information contained in this e-mail and any accompanying attachment(s) is intended only for the use of the intended recipient and may be confidential and/or privileged of Neusoft Group Ltd., its subsidiaries and/or its affiliates. If any reader of this communication is not the intended recipient, unauthorized use, forwarding, printing, storing, disclosure or copying is strictly prohibited, and may be unlawful. If you have received this communication in error, please immediately notify the sender by return e-mail, and delete the original message and all copies from your system. Thank you.
-----------------------------------------------------------------------------------------------

#2Magnus Hagander
magnus@hagander.net
In reply to: Wang Haiyong (#1)
Re: Bug in window xp

Confirmed here.

What we get is Integer Overflow, on the instruction "idiv esi" in postgres!int4div+0x1f. (Per windows debugger.) Same does not happen on Linux.

Tom - hints? ;-) Any idea why this happens on win32 but not linux?

//Magnus


#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#2)
Re: Bug in window xp

"Magnus Hagander" <mha@sollentuna.net> writes:

> What we get is Integer Overflow, on the instruction "idiv esi" in postgres!int4div+0x1f. (Per windows debugger.) Same does not happen on Linux.
>
> Tom - hints? ;-) Any idea why this happens on win32 but not linux?

Perhaps there's some process-wide setting that enables or disables that?

It seems fairly inconsistent to have a machine trap on divide overflow
when it doesn't on any other integer overflow, so I'd rather turn it off
than work around it.

regards, tom lane

#4Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#3)
Re: Bug in window xp

"Magnus Hagander" <mha@sollentuna.net> writes:

> > What we get is Integer Overflow, on the instruction "idiv esi" in
> > postgres!int4div+0x1f. (Per windows debugger.) Same does not happen
> > on Linux.
> >
> > Tom - hints? ;-) Any idea why this happens on win32 but not linux?
>
> Perhaps there's some process-wide setting that enables or
> disables that?
>
> It seems fairly inconsistent to have a machine trap on divide
> overflow when it doesn't on any other integer overflow, so
> I'd rather turn it off than work around it.

Been doing some more research on this one. Seems that since this is a
hardware exception, there is no way to ignore it :-( What you can do is
create a structured exception filter that will get called, and can
detect it. At this point, you can "do your magic" and then have the
processor re-execute the instruction that failed - with any registers
modified per your preference.

So what we'd do in this case is, from what I can tell, to manipulate EIP
to make it point past the exception itself and then return
EXCEPTION_CONTINUE_EXECUTION.

However, this seems like a lot more of a kludge than putting in a check
in the code. And we'd need to know it's *always* safe to advance EIP
once on integer overflows, which I certainly can't speak for :-)

(If we just say continue execution, the program gets stuck in an
infinite loop because the exception just happens over and over again -
no surprise there)

So given that, I think I'm for putting in the check in the code.

As a sidenote, I noticed I never followed through on an old discussion
about crashing. Right now, when a postgres backend crashes it pops up a
GUI window to let the user know. Only when the user has dismissed
this window does the postmaster notice. The attached patch changes this so
we don't provide GUI notification on crash, but instead just crash and
let the postmaster deal with it.
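
The attached patch is not reproduced in this archive. As a hedged illustration only, one common way to get this behaviour on Windows is the `SetErrorMode` API. The sketch below assumes that approach; `suppress_crash_dialogs` and `backend_startup_example` are hypothetical names, not PostgreSQL functions. On non-Windows builds the helper compiles to a no-op.

```c
#include <assert.h>

#ifdef _WIN32
#include <windows.h>
#endif

/*
 * Ask Windows not to show a crash dialog for this process, so that a
 * fault terminates it immediately and the parent can notice.
 * Returns 1 if the error mode was changed, 0 where there is nothing to do.
 */
int suppress_crash_dialogs(void)
{
#ifdef _WIN32
    /* SEM_NOGPFAULTERRORBOX: no error box when the process faults.
     * SEM_FAILCRITICALERRORS: no critical-error box; the error is
     * returned to the calling process instead. */
    SetErrorMode(SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX);
    return 1;
#else
    return 0;
#endif
}

/* Usage: call once at process start, before any work that might crash. */
void backend_startup_example(void)
{
    int applied = suppress_crash_dialogs();

    assert(applied == 0 || applied == 1);
    (void) applied;
}
```

The error mode set this way applies to the whole process, which is why a single call early in startup would be enough under this assumption.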

//Magnus

Attachments:

gpf_box.patch (application/octet-stream, +3 -0)

#5Martijn van Oosterhout
kleptog@svana.org
In reply to: Tom Lane (#3)
Re: bug in windows xp

Re: SIGFPE on integer divide.

This signal does appear on linux also. On my 7.4.7 installation it
doesn't because of some problem with the integer conversion code but on
8.2devel it gets a SIGFPE.

SELECT (-2147483648) / (-1);

ERROR: floating-point exception
DETAIL: An invalid floating-point operation was signaled. This
probably means an out-of-range result or an invalid operation, such as
division by zero.

A simple C program shows the same. Why isn't it being caught on
windows?
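
The "simple C program" is not included in the message. A minimal sketch of what such a probe might look like on a POSIX system follows; the function name `int_min_div_traps` is hypothetical. The C standard leaves `INT_MIN / -1` undefined, so the sketch only reports what the machine did and promises nothing about the result.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for sigaction and sigsetjmp */
#endif

#include <assert.h>
#include <limits.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf probe_env;

/* Jump out of the handler: returning normally from a hardware fault
 * would re-run the faulting instruction and raise the signal again. */
static void on_sigfpe(int signo)
{
    (void) signo;
    siglongjmp(probe_env, 1);
}

/*
 * Returns 1 if INT_MIN / -1 raised SIGFPE on this machine,
 * 0 if the division completed without a signal.
 */
int int_min_div_traps(void)
{
    /* volatile keeps the compiler from folding the division away */
    volatile int numerator = INT_MIN;
    volatile int denominator = -1;
    volatile int quotient = 0;
    struct sigaction sa, old_sa;
    int trapped;
    int rc;

    sa.sa_handler = on_sigfpe;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    rc = sigaction(SIGFPE, &sa, &old_sa);
    assert(rc == 0);

    if (sigsetjmp(probe_env, 1) == 0)
    {
        quotient = numerator / denominator;   /* may fault here */
        trapped = 0;
    }
    else
        trapped = 1;                          /* arrived via siglongjmp */

    rc = sigaction(SIGFPE, &old_sa, NULL);    /* restore previous handler */
    assert(rc == 0);
    (void) rc;
    (void) quotient;
    return trapped;
}
```

On x86 the `idiv` instruction traps on this operand pair, so the function would be expected to return 1 there; other architectures may return 0.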

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#5)
Re: bug in windows xp

Martijn van Oosterhout <kleptog@svana.org> writes:

> Re: SIGFPE on integer divide.
> This signal does appear on linux also.

Hmm, it seems to depend on the hardware you're using. I just tried it
on four different machines:

x86 (Pentium 4): SIGFPE
x86_64 (Xeon EM64T): SIGFPE
HPPA: "ERROR: integer out of range" (the intended behavior)
PPC (Mac OS X): no error, returns zero

So the overflow test in int4div is definitely broken and needs to be
changed. However, this is also a good question:

> A simple C program shows the same. Why isn't it being caught on
> windows?

That still looks like a failure to trap something we should trap.
I'd suggest fixing that first, because if we fix int4div first,
we won't have a simple test case for it.

regards, tom lane

#7Martijn van Oosterhout
kleptog@svana.org
In reply to: Tom Lane (#6)
Re: bug in windows xp

On Sat, Apr 08, 2006 at 12:27:19PM -0400, Tom Lane wrote:

> Hmm, it seems to depend on the hardware you're using. I just tried it
> on four different machines:

<no consistency whatsoever>

> > A simple C program shows the same. Why isn't it being caught on
> > windows?
>
> That still looks like a failure to trap something we should trap.
> I'd suggest fixing that first, because if we fix int4div first,
> we won't have a simple test case for it.

Well, we should at least add a regression test for this divide thing
since obviously people assumed it was working when it wasn't.

However, would it be possible to add a test_sigfpe() to regress.c that
simply tries to do 1/0? There appears to be no regression test that
generates a floating-point exception, which appears to be a serious
omission. There are any number of ways to force one: ln(0) for example
(hmm, looks like we protect against that). I suppose one floating point
and one integer example should suffice.

Are there any other signals we should be watching for?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

#8Bruce Momjian
bruce@momjian.us
In reply to: Martijn van Oosterhout (#7)
Re: bug in windows xp

Is anyone working on this?

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#9Martijn van Oosterhout
kleptog@svana.org
In reply to: Bruce Momjian (#8)
Re: bug in windows xp

[Re: Uncaught exception when dividing integers]

On Tue, Apr 18, 2006 at 10:50:24PM -0400, Bruce Momjian wrote:

> Is anyone working on this?

Not that I know of. However, the first step is to add this regression
test for SIGFPE [-patches CCed]. Note that this will probably redline
windows on the buildfarm. Once this has been added and all
architectures are in compliance, we can deal with the integer overflow
problem.

Triggering a SIGFPE is a bit tricky. On my i386 system the integer
divide will do it, but the rest just return +inf. Given there are
systems that don't SIGFPE the integer divide, I hope one of the others
will trigger... For UNIX systems I've made it try kill() first, that
seems the most reliable.
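
The attached sigfpe.diff is not shown in this archive. As a rough sketch of the "send the signal directly" idea, the following standalone C uses `raise()`, which in a single-threaded process has the same effect as `kill(getpid(), SIGFPE)`. The names `record_sigfpe` and `sigfpe_handler_fires` are hypothetical, and this is not PostgreSQL's handler.

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t sigfpe_seen = 0;

/* Only records that the signal arrived. Returning normally is safe
 * here because the signal was raised in software, not by a faulting
 * instruction. */
static void record_sigfpe(int signo)
{
    (void) signo;
    sigfpe_seen = 1;
}

/* Returns 1 if a software-raised SIGFPE reached the handler. */
int sigfpe_handler_fires(void)
{
    void (*old_handler)(int);

    sigfpe_seen = 0;
    old_handler = signal(SIGFPE, record_sigfpe);
    assert(old_handler != SIG_ERR);

    raise(SIGFPE);                  /* handler runs before raise() returns */

    signal(SIGFPE, old_handler);    /* restore whatever was installed */
    return sigfpe_seen ? 1 : 0;
}
```

A test of this kind shows only that an installed handler runs when the signal is delivered. It says nothing about which arithmetic operations cause the hardware to deliver it.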

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

From each according to his ability. To each according to his ability to litigate.

Attachments:

sigfpe.diff (text/plain; charset=us-ascii, +50 -1)

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#9)
Re: bug in windows xp

Martijn van Oosterhout <kleptog@svana.org> writes:

> Not that I know of. However, the first step is to add this regression
> test for SIGFPE [-patches CCed].

This seems completely pointless. The question is not about whether the
SIGFPE catcher works when fired, it's about what conditions trigger it.

regards, tom lane

#11Martijn van Oosterhout
kleptog@svana.org
In reply to: Tom Lane (#10)
Re: bug in windows xp

On Wed, Apr 19, 2006 at 10:15:54AM -0400, Tom Lane wrote:

> Martijn van Oosterhout <kleptog@svana.org> writes:
> > Not that I know of. However, the first step is to add this regression
> > test for SIGFPE [-patches CCed].
>
> This seems completely pointless. The question is not about whether the
> SIGFPE catcher works when fired, it's about what conditions trigger it.

Well, depends how you look at it. The original bug report was about a
backend crash, which is what happens if you don't catch the SIGFPE. Can
we guarantee that we know every situation that might generate a SIGFPE?

Besides, isn't this what you were referring to here:

http://archives.postgresql.org/pgsql-bugs/2006-04/msg00091.php

Otherwise we should just fix int4div.
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#11)
Re: bug in windows xp

Martijn van Oosterhout <kleptog@svana.org> writes:

> Well, depends how you look at it. The original bug report was about a
> backend crash, which is what happens if you don't catch the SIGFPE. Can
> we guarantee that we know every situation that might generate a SIGFPE?

The point here is that under Windows int4div seems to be generating
something other than a SIGFPE --- if it were actually generating that
particular signal then the existing SIGFPE catcher would catch it.

It's barely possible that int4div *is* generating a SIGFPE and there's
some other breakage preventing FloatExceptionHandler from catching it,
but that's a question that deserves a one-shot test, not permanent
memorialization in a regression test. Besides, if that's the situation
then testing that the handler catches kill(SIGFPE) proves exactly zero
about what the int4div problem is.

regards, tom lane

#13Bruce Momjian
bruce@momjian.us
In reply to: Magnus Hagander (#2)
Fix for Win32 division involving INT_MIN

With no Win32 exception detection code in sight, I propose the following
patch to prevent server crashes for unusual INT_MIN integer division.

One interesting thing is that int4div already has code that checks for
a similar division on all platforms, but _after_ the division, rather
than before.
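
To illustrate why the position of the check matters, here is a minimal sketch in standalone C. It is not the attached patch: `checked_int32_div` is a hypothetical name, and the sketch assumes the fixed-width types from `<stdint.h>` instead of PostgreSQL's own types and error reporting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Divide two 32-bit integers, refusing the cases that cannot be
 * computed. Returns true and stores the quotient on success.
 */
bool checked_int32_div(int32_t dividend, int32_t divisor, int32_t *quotient)
{
    assert(quotient != NULL);

    if (divisor == 0)
        return false;               /* division by zero */

    /* Test before dividing: the only overflow case is INT32_MIN / -1,
     * whose true result (2147483648) does not fit in an int32_t. */
    if (dividend == INT32_MIN && divisor == -1)
        return false;

    *quotient = dividend / divisor; /* safe: cannot trap or overflow now */
    return true;
}
```

A test placed after the division cannot help on hardware that traps, because the process has already been signalled by the time the test would run.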

Attachments:

/pgpatches/int_min (text/x-diff, +11 -0)

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#13)
Re: Fix for Win32 division involving INT_MIN

Bruce Momjian <pgman@candle.pha.pa.us> writes:

> With no Win32 exception detection code in sight, I propose the following
> patch to prevent server crashes for unusual INT_MIN integer division.

The overflow code tries hard to avoid assuming it knows what INT_MIN and
INT_MAX are --- this is maybe not so important for int4 but it is for
int8 (because of our support for int8-less machines). I don't
immediately see how to make this test without assuming you know the
value of INT_MIN, but we ought to try to come up with one.
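
One possible way to write such a test without naming the constant, offered as an assumption and not as what was eventually committed: on a two's-complement machine, zero and the most negative value are the only integers equal to their own negation. The sketch below does the negation in unsigned arithmetic, where wraparound is well defined. All function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if value is the most negative int32_t, detected without
 * referring to INT32_MIN.
 */
bool is_most_negative_int32(int32_t value)
{
    uint32_t bits = (uint32_t) value;   /* conversion is well defined */

    /* Unsigned negation wraps; only 0 and 2^31 map to themselves,
     * and the sign test rules out 0. */
    return value < 0 && bits == (uint32_t) (0u - bits);
}

/* True if dividend / divisor would overflow a 32-bit result. */
bool int32_div_would_overflow(int32_t dividend, int32_t divisor)
{
    return divisor == -1 && is_most_negative_int32(dividend);
}

/* Usage: the constant appears only here, to exercise the boundary. */
void overflow_check_examples(void)
{
    assert(int32_div_would_overflow(INT32_MIN, -1));
    assert(!int32_div_would_overflow(INT32_MIN, 1));
    assert(!int32_div_would_overflow(-1, -1));
}
```

The same pattern carries over to a wider type by swapping in the matching signed and unsigned types, provided the platform has them.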

We do see funny behavior on Intel chips even without Windows, so it'd
be better to not #ifdef WIN32 but use the same overflow test for
everyone.

I would imagine the same problem arises with int8, has anyone checked?
Also, the overflow tests in the intNmul routines seem vulnerable.

regards, tom lane

#15Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#14)
Re: Fix for Win32 division involving INT_MIN

Tom Lane wrote:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > With no Win32 exception detection code in sight, I propose the following
> > patch to prevent server crashes for unusual INT_MIN integer division.
>
> The overflow code tries hard to avoid assuming it knows what INT_MIN and
> INT_MAX are --- this is maybe not so important for int4 but it is for
> int8 (because of our support for int8-less machines). I don't
> immediately see how to make this test without assuming you know the
> value of INT_MIN, but we ought to try to come up with one.
>
> We do see funny behavior on Intel chips even without Windows, so it'd
> be better to not #ifdef WIN32 but use the same overflow test for
> everyone.
>
> I would imagine the same problem arises with int8, has anyone checked?

Seems int8 is OK on Win32:

postgres=# SELECT (-9223372036854775808) / (-1);
ERROR: bigint out of range

> Also, the overflow tests in the intNmul routines seem vulnerable.

I reproduced the crash using int4 multiplication. Again int8
multiplication seemed OK.
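
The thread does not say why int4 multiplication crashes. One plausible explanation, assumed here, is that an overflow test which verifies a product by dividing it back can itself execute INT_MIN / -1. The sketch below avoids any division by widening to 64 bits. It is not the attached patch, `checked_int32_mul` is a hypothetical name, and it assumes a 64-bit integer type is available, which, as noted earlier in the thread, not every supported machine provided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Multiply two 32-bit integers. Returns true and stores the product
 * if it fits in an int32_t, false if it would overflow.
 */
bool checked_int32_mul(int32_t a, int32_t b, int32_t *product)
{
    /* The exact product of two 32-bit values always fits in 64 bits. */
    int64_t wide = (int64_t) a * (int64_t) b;

    assert(product != NULL);

    if (wide < INT32_MIN || wide > INT32_MAX)
        return false;

    *product = (int32_t) wide;
    return true;
}
```

Because the range test is a pair of comparisons on the wide result, no operand combination can reach a trapping instruction.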

I tried int2 and that seemed OK.

Updated patch attached.

Attachments:

/pgpatches/int_min (text/x-diff, +22 -0)

#16Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#15)
Re: Fix for Win32 division involving INT_MIN

Patch applied. Backpatch to 8.1.X.

#17Bruce Momjian
bruce@momjian.us
In reply to: Magnus Hagander (#4)
Re: [BUGS] Bug in window xp

Patch applied. Thanks.

---------------------------------------------------------------------------

Magnus Hagander wrote:

> As a sidenote, I noticed I never followed through on an old discussion
> about crashing. Right now, when a postgres backend crashes it pops up a
> GUI window to let the user know. Only when the user has dismissed
> this window does the postmaster notice. The attached patch changes this so
> we don't provide GUI notification on crash, but instead just crash and
> let the postmaster deal with it.
