ecpg assertion on windows

Started by Andres Freundover 3 years ago6 messageshackers
Jump to latest
#1Andres Freund
andres@anarazel.de

Hi,

I occasionally

Running the ecpg regression tests interactively (to try to find a different
issue), triggered a crash on windows due to an uninitialized variable (after
pressing "ignore" in that stupid gui window that we've only disabled for the
backend).

"The variable 'replace_val' is being used without being initialized."

Child-SP RetAddr Call Site
000000b3`3bcfe140 00007ff9`03f9cd74 libpgtypes!failwithmessage(
void * retaddr = 0x00007ff9`03f96133,
int crttype = 0n1,
int errnum = 0n3,
char * msg = 0x000000b3`3bcff050 "The variable 'replace_val' is being used without being initialized.")+0x234 [d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\rtc\error.cpp @ 213]
000000b3`3bcff030 00007ff9`03f96133 libpgtypes!_RTC_UninitUse(
char * varname = 0x00007ff9`03fa8a90 "replace_val")+0xa4 [d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\rtc\error.cpp @ 362]
000000b3`3bcff470 00007ff9`03f94acd libpgtypes!dttofmtasc_replace(
int64 * ts = 0x000000b3`3bcff778,
long dDate = 0n0,
int dow = 0n6,
struct tm * tm = 0x000000b3`3bcff598,
char * output = 0x0000026e`9c223de0 "abc-00:00:00",
int * pstr_len = 0x000000b3`3bcff620,
char * fmtstr = 0x00007ff7`b01ae5c0 "abc-%X-def-%x-ghi%%")+0xe53 [C:\dev\postgres-meson\src\interfaces\ecpg\pgtypeslib\timestamp.c @ 759]
*** WARNING: Unable to verify checksum for C:\dev\postgres-meson\build-msbuild\src\interfaces\ecpg\test\pgtypeslib\dt_test.exe
000000b3`3bcff550 00007ff7`b01a23c9 libpgtypes!PGTYPEStimestamp_fmt_asc(
int64 * ts = 0x000000b3`3bcff778,
char * output = 0x0000026e`9c223de0 "abc-00:00:00",
int str_len = 0n19,
char * fmtstr = 0x00007ff7`b01ae5c0 "abc-%X-def-%x-ghi%%")+0xed [C:\dev\postgres-meson\src\interfaces\ecpg\pgtypeslib\timestamp.c @ 794]
000000b3`3bcff610 00007ff7`b01a4499 dt_test!main(void)+0xe59 [C:\dev\postgres-meson\src\interfaces\ecpg\test\pgtypeslib\dt_test.pgc @ 200]
000000b3`3bcff860 00007ff7`b01a433e dt_test!invoke_main(void)+0x39 [d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 79]
000000b3`3bcff8b0 00007ff7`b01a41fe dt_test!__scrt_common_main_seh(void)+0x12e [d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288]
000000b3`3bcff920 00007ff7`b01a452e dt_test!__scrt_common_main(void)+0xe [d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 331]
000000b3`3bcff950 00007ff9`1d987034 dt_test!mainCRTStartup(
void * __formal = 0x000000b3`3bbe8000)+0xe [d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp @ 17]
000000b3`3bcff980 00007ff9`1f842651 KERNEL32!BaseThreadInitThunk+0x14
000000b3`3bcff9b0 00000000`00000000 ntdll!RtlUserThreadStart+0x21

I haven't analyzed this further.

CI also shows ecpg itself occasionally crashing, but I haven't managed to
catch it in the act.

Greetings,

Andres Freund

#2Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#1)
Re: ecpg assertion on windows

Hi,

On 2022-08-23 20:36:55 -0700, Andres Freund wrote:

Running the ecpg regression tests interactively (to try to find a different
issue), triggered a crash on windows due to an uninitialized variable (after
pressing "ignore" in that stupid gui window that we've only disabled for the
backend).

"The variable 'replace_val' is being used without being initialized."

Looks to me like that's justified. The paths in dttofmtasc_replace using
PGTYPES_TYPE_NOTHING don't set replace_val, but call pgtypes_fmt_replace() -
with replace_val passed by value. If that's the first replacement, an
unitialized variable is passed...

Seems either the caller should skip calling pgtypes_fmt_replace() in the
NOTHING case, or replace_val should be zero initialized?

- Andres

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#2)
Re: ecpg assertion on windows

Andres Freund <andres@anarazel.de> writes:

On 2022-08-23 20:36:55 -0700, Andres Freund wrote:

Running the ecpg regression tests interactively (to try to find a different
issue), triggered a crash on windows due to an uninitialized variable (after
pressing "ignore" in that stupid gui window that we've only disabled for the
backend).
"The variable 'replace_val' is being used without being initialized."

Looks to me like that's justified.

Hmm ... that message sounded like it is a run-time detection not from
static analysis. But if the regression tests are triggering use of
uninitialized values, how could we have failed to detect that?
Either valgrind or unstable behavior should have found this ages ago.

Seeing that replace_val is a union of differently-sized types,
I was wondering if this message is a false positive based on
struct assignment transferring a few uninitialized bytes, or
something like that.

regards, tom lane

#4Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#3)
Re: ecpg assertion on windows

Hi,

On 2022-08-24 00:18:27 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2022-08-23 20:36:55 -0700, Andres Freund wrote:

Running the ecpg regression tests interactively (to try to find a different
issue), triggered a crash on windows due to an uninitialized variable (after
pressing "ignore" in that stupid gui window that we've only disabled for the
backend).
"The variable 'replace_val' is being used without being initialized."

Looks to me like that's justified.

Hmm ... that message sounded like it is a run-time detection not from
static analysis.

Yes, it's a runtime error.

But if the regression tests are triggering use of uninitialized values, how
could we have failed to detect that? Either valgrind or unstable behavior
should have found this ages ago.

I think it's just different criteria for when to report issues. Valgrind
reports uninitialized memory only when there's a conditional branch depending
on it or such. Whereas this seems to trigger when passing an uninitialized
value to a function by value, even if it's then not relied upon.

I don't think we regularly test all client tests with valgrind, btw. Skink
only runs the server under valgrind at least.

Seeing that replace_val is a union of differently-sized types,
I was wondering if this message is a false positive based on
struct assignment transferring a few uninitialized bytes, or
something like that.

I think it's genuinely uninitialized - if you track what happens if the first
parameter is e.g. %X: It'll not initialize replace_val, but then call
pgtypes_fmt_replace(). So an uninit value is passed.

Greetings,

Andres Freund

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#4)
Re: ecpg assertion on windows

Andres Freund <andres@anarazel.de> writes:

On 2022-08-24 00:18:27 -0400, Tom Lane wrote:

But if the regression tests are triggering use of uninitialized values, how
could we have failed to detect that? Either valgrind or unstable behavior
should have found this ages ago.

I think it's just different criteria for when to report issues. Valgrind
reports uninitialized memory only when there's a conditional branch depending
on it or such. Whereas this seems to trigger when passing an uninitialized
value to a function by value, even if it's then not relied upon.

If the value is not actually relied on, then it's a false positive.

I don't say we shouldn't fix it, because we routinely jump through
hoops to silence various sorts of functionally-harmless warnings.
But let's be clear about whether there's a real bug here.

regards, tom lane

#6Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#5)
Re: ecpg assertion on windows

Hi,

On 2022-08-24 00:32:53 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2022-08-24 00:18:27 -0400, Tom Lane wrote:

But if the regression tests are triggering use of uninitialized values, how
could we have failed to detect that? Either valgrind or unstable behavior
should have found this ages ago.

I think it's just different criteria for when to report issues. Valgrind
reports uninitialized memory only when there's a conditional branch depending
on it or such. Whereas this seems to trigger when passing an uninitialized
value to a function by value, even if it's then not relied upon.

If the value is not actually relied on, then it's a false positive.

My understanding is that formally speaking passing an undefined value by value
to a function is "relying on it" and undefined behaviour. Hard to believe
it'll cause any compiler go haywire and eat the computer, but ...

I don't say we shouldn't fix it, because we routinely jump through
hoops to silence various sorts of functionally-harmless warnings.
But let's be clear about whether there's a real bug here.

Yea.

Greetings,

Andres Freund