Win32 WEXITSTATUS too simplistic
win32.h says
/*
* Signal stuff
* WIN32 doesn't have wait(), so the return value for children
* is simply the return value specified by the child, without
* any additional information on whether the child terminated
* on its own or via a signal. These macros are also used
* to interpret the return value of system().
*/
#define WEXITSTATUS(w) (w)
#define WIFEXITED(w) (true)
#define WIFSIGNALED(w) (false)
#define WTERMSIG(w) (0)
I think this supposition has been pretty much proven false by recent
reports of silly "exit code" numbers from Win32 users, as for instance
here
http://archives.postgresql.org/pgsql-bugs/2006-12/msg00163.php
where the postmaster reports
server process exited with exit code -1073741819
from what I suspect is really the equivalent of a SIGSEGV trap,
ie, attempted access to already-deallocated memory. My calculator
says the above is equivalent to hex C0000005, and I say that this
makes it pretty clear that *some* parts of Windows put flag bits into
the process exit code. Anyone want to run down what we should really
be using instead of the above macros?
regards, tom lane
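The decimal-to-hex step above can be checked with a few lines of C. This is only a sketch: the helper names exit_code_as_unsigned() and print_exit_code() are made up for illustration and are not part of the PostgreSQL sources.

```c
#include <stdint.h>
#include <stdio.h>

/* Reinterpret a signed 32-bit exit code as unsigned. The conversion is
 * well defined in C: the value is reduced modulo 2^32. */
static uint32_t
exit_code_as_unsigned(int32_t code)
{
    return (uint32_t) code;
}

/* Print the code both ways, roughly as a log line might show it. */
static void
print_exit_code(int32_t code)
{
    printf("exit code %ld = 0x%08lX\n",
           (long) code, (unsigned long) exit_code_as_unsigned(code));
}
```

Calling print_exit_code(-1073741819) prints "exit code -1073741819 = 0xC0000005", which matches the calculator result quoted above.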
The exit code is apparently what is reported from GetExitCodeProcess().
For info on that see
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/getexitcodeprocess.asp
I think we are possibly seeing the third case, i.e. the code from an
unhandled exception. I haven't managed to find an API to handle them
though ...
cheers
andrew
"Andrew Dunstan" <andrew@dunslane.net> writes:
Tom Lane wrote:
Anyone want to run down what we should really
be using instead of the above macros?
The exit code is apparently what is reported from GetExitCodeProcess().
For info on that see
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/getexitcodeprocess.asp
I think we are possibly seeing the third case, i.e. the code from an
unhandled exception. I haven't managed to find an API to handle them
though ...
Right ... but I don't think we want to "handle the exception". The
right question to be asking is "what is the encoding of these 'exception
values' it's talking about?"
regards, tom lane
Yes, sorry for my loose expression. That's what I meant - I didn't find an
API that would translate the exception values.
cheers
andrew
Tom Lane <tgl@sss.pgh.pa.us> wrote:
server process exited with exit code -1073741819
from what I suspect is really the equivalent of a SIGSEGV trap,
ie, attempted access to already-deallocated memory. My calculator
says the above is equivalent to hex C0000005, and I say that this
makes it pretty clear that *some* parts of Windows put flag bits into
the process exit code. Anyone want to run down what we should really
be using instead of the above macros?
C0000005 is EXCEPTION_ACCESS_VIOLATION. The value returned by
GetExceptionCode() seems to become the exit code in unhandled exception cases.
AFAICS, all EXCEPTION_xxx (or STATUS_xxx) values are defined as 0xCxxxxxxx.
I think we can use the second-highest bit to distinguish exit by exception
from normal exits.
#define WEXITSTATUS(w) ((int) ((w) & 0x40000000))
#define WIFEXITED(w) (((w) & 0x40000000) == 0)
#define WIFSIGNALED(w) (((w) & 0x40000000) != 0)
#define WTERMSIG(w) (w) // or ((w) & 0x3FFFFFFF)
However, this comes from reverse engineering the Windows headers.
I cannot find any official documentation.
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
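For reference, the bit pattern inferred above appears to correspond to the severity field of the NTSTATUS encoding, which as far as I know occupies the top two bits of the 32-bit value (0 = success, 1 = informational, 2 = warning, 3 = error). The sketch below decodes that field. The enum and function names are hypothetical, chosen only for this example.

```c
#include <stdint.h>

/* Severity field of an NTSTATUS value: bits 31..30. */
enum nt_severity
{
    NT_SEV_SUCCESS = 0,
    NT_SEV_INFORMATIONAL = 1,
    NT_SEV_WARNING = 2,
    NT_SEV_ERROR = 3
};

/* Extract the severity by shifting the top two bits down. */
static enum nt_severity
nt_severity_of(uint32_t status)
{
    return (enum nt_severity) (status >> 30);
}
```

Under this reading, 0xC0000005 has severity 3 (error). Note that a test on bit 0x40000000 alone would treat a warning-severity code such as 0x80000003 as a normal exit, which is one reason to be careful with a single-bit check.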
I did some research on this, and found a nice Win32 list of STATUS_
error values. Looking at the list, I think the non-exit() return values
are much larger than just the second high bit.
I am proposing the attached patch, which basically treats all system()
return values < 0x100 as exit() calls, and everything from 0x100 up as
signal exits. I also think it is too risky to backpatch to 8.2.X.
Also, should we print Win32 WTERMSIG() values as hex because they are so
large? I have added that to the patch.
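A minimal sketch of the threshold rule described above, not the text of the attached patch. The DEMO_ prefix is hypothetical and is used only so the example does not clash with the real macros from <sys/wait.h> when compiled on a non-Windows system.

```c
/* Values below 0x100 are treated as exit() codes; anything with a bit
 * set above the low byte is treated as termination by exception.
 * w is the raw value returned by system() or reported for a child. */
#define DEMO_WIFEXITED(w)    ((((unsigned int) (w)) & 0xFFFFFF00u) == 0)
#define DEMO_WIFSIGNALED(w)  (!DEMO_WIFEXITED(w))
#define DEMO_WEXITSTATUS(w)  (w)
#define DEMO_WTERMSIG(w)     (w)
```

With these definitions an exit code of 255 counts as a normal exit, while -1073741819 (0xC0000005) counts as a signal-style termination; printing DEMO_WTERMSIG() with %X then gives C0000005 rather than a large negative number.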
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Attachments:
/pgpatches/win32 (text/x-diff, +35 -28)
I have applied a modified version of this patch. We now print the
exception value in hex, and give a URL where the exception can be looked
up.
--
Bruce Momjian bruce@momjian.us
Attachments:
/rtmp/diff (text/x-diff, +57 -42)
Bruce Momjian wrote:
I have applied a modified version of this patch. We now print the
exception value in hex, and give a URL where the exception can be looked
up.
Humm, wouldn't it be more appropriate to put the URL in a errhint()
instead?
+ ereport(lev,
+
+ /*------
+   translator: %s is a noun phrase describing a child process, such as
+   "server process" */
+ (errmsg("%s (PID %d) was terminated by exception %X\nSee http://source.winehq.org/source/include/ntstatus.h for a description\nof the hex value.",
+ procname, pid, WTERMSIG(exitstatus))));
+ #endif
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Oops, forgot to mention that detail. We are using log_error() in one
case, and ereport() in another. Let me do the hint in the report case,
but I have to leave the log_error case alone because it takes only three
arguments.
--
Bruce Momjian bruce@momjian.us
Alvaro Herrera <alvherre@commandprompt.com> writes:
Bruce Momjian wrote:
I have applied a modified version of this patch. We now print the
exception value in hex, and give a URL where the exception can be looked
up.
Humm, wouldn't it be more appropriate to put the URL in a errhint()
instead?
It should not be there at all. Do you see URLs in any of our other
error messages?
regards, tom lane
Sure, ideally, but how else can we give information about that hex
value?
--
Bruce Momjian bruce@momjian.us
Bruce Momjian <bruce@momjian.us> writes:
Tom Lane wrote:
It should not be there at all. Do you see URLs in any of our other
error messages?
Sure, ideally, but how else can we give information about that hex
value?
It's not the responsibility of that error message to tell someone to
go look up the error number in Microsoft documentation. If they're
clueful enough to make any sense of the number beyond the strerror
translation we already provide, then they already know where to look.
Even if it were the responsibility of the error message to suggest this,
a URL seems far too transient.
regards, tom lane
Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
Tom Lane wrote:
It should not be there at all. Do you see URLs in any of our other
error messages?
Sure, ideally, but how else can we give information about that hex
value?
It's not the responsibility of that error message to tell someone to
go look up the error number in Microsoft documentation. If they're
clueful enough to make any sense of the number beyond the strerror
translation we already provide, then they already know where to look.
Well, it took me like 25 minutes to find that list, so it isn't obvious.
Search for STATUS_CARDBUS_NOT_SUPPORTED, and you get only 75 hits on
Google, and our URL is #7. One idea Andrew Dunstan had was to print
descriptions for the most popular values. I asked him to give it a try
once I applied this patch.
Even if it were the responsibility of the error message to suggest this,
a URL seems far too transient.
It is a URL to the Wine CVS repository, so I assume it will be around for
a while. One thing we could do is copy that file to a URL on our web
site and point error messages to that. We could put the file in our CVS
and point to that too.
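The idea of printing descriptions for the most common values could look roughly like the sketch below. It is an illustration only: the table and the exception_name() function are hypothetical, the selection of entries is arbitrary, and the numeric values are the ones I understand ntstatus.h to define for these names.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per exception code we choose to describe. */
struct exc_name
{
    uint32_t    code;
    const char *name;
};

static const struct exc_name common_exceptions[] = {
    {0xC0000005u, "STATUS_ACCESS_VIOLATION"},
    {0xC000001Du, "STATUS_ILLEGAL_INSTRUCTION"},
    {0xC000008Eu, "STATUS_FLOAT_DIVIDE_BY_ZERO"},
    {0xC0000094u, "STATUS_INTEGER_DIVIDE_BY_ZERO"},
    {0xC00000FDu, "STATUS_STACK_OVERFLOW"},
};

/* Return the symbolic name for a known code, or NULL so the caller can
 * fall back to printing the raw hex value. */
static const char *
exception_name(uint32_t code)
{
    size_t      i;

    for (i = 0; i < sizeof(common_exceptions) / sizeof(common_exceptions[0]); i++)
    {
        if (common_exceptions[i].code == code)
            return common_exceptions[i].name;
    }
    return NULL;
}
```

A caller would print the name when exception_name() returns one and the hex value otherwise, so unknown codes are still reported.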
--
Bruce Momjian bruce@momjian.us
bruce wrote:
Tom Lane wrote:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Bruce Momjian wrote:
OK, maybe /doc or src/tools. A more radical approach would be to put
the list in our documentation, or have initdb install it.
Why not put it in techdocs or some such?
I think we've learned by now that putting copies of other people's code
in our tree isn't such a hot idea; what is going to cause it to be
updated when things change? How do you know the values are even the
same across all the Windows versions we support?
Basically this whole idea is misconceived. Just print the number and
have done.
And how do people interpret that number?
Ah, I found something:
http://support.microsoft.com/kb/259693
Someone on IRC says that is kernel mode only, and is looking for a
user-mode version, so we would be able to print out a meaningful message
rather than a hex value that has to be looked up.
--
Bruce Momjian bruce@momjian.us
OK, I have tested on MinGW and found I can use FormatMessage() to print
a description for all ERROR* system() failures, rather than print a hex
value. This removes the need for a URL or lookup of hex values.
Attached and applied.
--
Bruce Momjian bruce@momjian.us
Attachments:
/rtmp/diff (text/x-diff, +71 -69)
From: "Bruce Momjian" <bruce@momjian.us>
OK, I have tested on MinGW and found I can use FormatMessage() to print
a description for all ERROR* system() failures, rather than print a hex
value. This removes the need for a URL or lookup of hex values.
Attached and applied.
Excuse me if I'm misunderstanding, but I'm afraid you are mixing up
Win32 error codes and exception codes. I saw the following fragment
in your patch:
! * On MinGW, system() returns STATUS_* values. MSVC might be
! * different. To test, create a binary that does *(NULL), and
! * then create a second binary that calls it via system(),
! * and check the return value of system(). On MinGW, it is
! * 0xC0000005 == STATUS_ACCESS_VIOLATION, and 0x5 is a value
! * FormatMessage() can look up. GetLastError() does not work;
! * always zero.
Exception codes and error codes are different and unrelated. In the
above test, 0xC0000005 is an "exception code". On the other hand, what
FormatMessage() accepts is an error code. Error codes can't be derived
from exception codes. Stripping the 0xC prefix off an exception code
does not convert it to an error code.
I suspect the reason for the confusion is that the descriptions
are similar:
The description for exception 0xC0000005 (STATUS_ACCESS_VIOLATION) is
"access violation" (though the text can't be obtained). This is
caused by an illegal memory access. This is a program bug.
The description for 0x5 (ERROR_ACCESS_DENIED) is "Access is denied."
This is caused by permission checks. This is not a bug, and can
happen normally.
Try "1.0 / 0.0" (divide by zero) instead of (*NULL). What would your
patch display? The exception would be 0xC000008E
(STATUS_FLOAT_DIVIDE_BY_ZERO), I think. 0x8E is ERROR_BUSY_DRIVE.
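The collision described above can be shown with plain arithmetic. This is a sketch only: the DEMO_ constants and nt_low_bits() are hypothetical names that mirror the values quoted in this message, and nothing here calls the Windows API.

```c
#include <stdint.h>

/* Exception (NTSTATUS) codes quoted in this thread. */
#define DEMO_STATUS_ACCESS_VIOLATION      0xC0000005u
#define DEMO_STATUS_FLOAT_DIVIDE_BY_ZERO  0xC000008Eu

/* Win32 error codes that happen to share the same low bits. */
#define DEMO_ERROR_ACCESS_DENIED  0x5u
#define DEMO_ERROR_BUSY_DRIVE     0x8Eu

/* Low 16 bits of an exception code. The result is NOT a Win32 error
 * code, even though it may equal one numerically. */
static uint32_t
nt_low_bits(uint32_t status)
{
    return status & 0xFFFFu;
}
```

The masked access-violation code equals ERROR_ACCESS_DENIED, which looks plausible only by coincidence; the masked float-divide code equals ERROR_BUSY_DRIVE, which has nothing to do with division. That is why passing the stripped value to FormatMessage() gives misleading text.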
Yes, you are 100% correct that I had exceptions and errors confused. I
have backed out the patch that used FormatMessage(), and instead of
using a URL, the message is now:
child process was terminated by exception %X
See /include/ntstatus.h for a description of the hex value.
When I search for /include/ntstatus.h, I get the Wine page first, so
hopefully we can mark this item as completed.
--
Bruce Momjian bruce@momjian.us
Thank you, Bruce-san. I agree.
----- Original Message -----
From: "Bruce Momjian" <bruce@momjian.us>
To: "Takayuki Tsunakawa" <tsunakawa.takay@jp.fujitsu.com>
Cc: "PostgreSQL-patches" <pgsql-patches@postgresql.org>; "Tom Lane"
<tgl@sss.pgh.pa.us>; "Alvaro Herrera" <alvherre@commandprompt.com>;
"Magnus Hagander" <magnus@hagander.net>; "ITAGAKI Takahiro"
<itagaki.takahiro@oss.ntt.co.jp>
Sent: Tuesday, January 23, 2007 12:35 PM
Subject: Re: [pgsql-patches] [HACKERS] Win32 WEXITSTATUS too
Takayuki Tsunakawa wrote:
From: "Bruce Momjian" <bruce@momjian.us>
Yes, you are 100% correct that I had exceptions and errors confused. I
have backed out the patch that used FormatMessage(), and instead of
using a URL, the message is now:
child process was terminated by exception %X
See /include/ntstatus.h for a description of the hex value.
When I search for /include/ntstatus.h, I get the Wine page first, so
hopefully we can mark this item as completed.
Thank you, Bruce-san. I agree.
The Win32 port has always been done in small steps, sometimes to the
left or right, but eventually forward.
--
Bruce Momjian bruce@momjian.us
Bruce Momjian wrote:
The Win32 port has always been done in small steps, sometimes to the
left or right, but eventually forward.
He feints to the left, he feints to the right, he ducks and POW!
--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/