Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX

Started by Michael Paquier over 11 years ago, 11 messages
#1Michael Paquier
michael.paquier@gmail.com

Hi,

This morning while running make check-world on my OSX Mavericks laptop, I
found the following failure:
test pgtypeslib/dt_test2 ... stderr FAILED (test process was
terminated by signal 6: Abort trap)
(lldb) bt
* thread #1: tid = 0x0000, 0x00007fff8052c866
libsystem_kernel.dylib`__pthread_kill + 10, stop reason = signal SIGSTOP
* frame #0: 0x00007fff8052c866 libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x00007fff83cb035c libsystem_pthread.dylib`pthread_kill + 92
frame #2: 0x00007fff81899bba libsystem_c.dylib`__abort + 145
frame #3: 0x00007fff8189a46d libsystem_c.dylib`__stack_chk_fail + 196
frame #4: 0x000000010f7cb3bb
libpgtypes.3.dylib`PGTYPESdate_from_asc(str=0x000000010f6a2d6c,
endptr=0x00007fff5055e488) + 635 at datetime.c:104
frame #5: 0x000000010f6a260f dt_test2`main + 255 at dt_test2.pgc:91
frame #6: 0x00007fff87acc5fd libdyld.dylib`start + 1
frame #7: 0x00007fff87acc5fd libdyld.dylib`start + 1
Bisecting shows that this failure was introduced by 4318dae, and it is
reproducible on all the active branches, down to REL9_0_STABLE.

Note that this problem has been introduced after discussing a separate
issue here:
/messages/by-id/1399399313.27807.28.camel@sussancws0025
Regards,
--
Michael

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#1)
Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX

Michael Paquier <michael.paquier@gmail.com> writes:

This morning while running make check-world on my OSX Mavericks laptop, I
found the following failure:

[ scratches head... ] Doesn't reproduce on my OSX Mavericks laptop,
either with or without --disable-integer-datetimes. What compiler
are you using exactly? Any special build options?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Michael Paquier
michael.paquier@gmail.com
In reply to: Tom Lane (#2)
1 attachment(s)
Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX

On Mon, Oct 6, 2014 at 12:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michael Paquier <michael.paquier@gmail.com> writes:

This morning while running make check-world on my OSX Mavericks laptop, I
found the following failure:

[ scratches head... ] Doesn't reproduce on my OSX Mavericks laptop,
either with or without --disable-integer-datetimes.
What compiler are you using exactly?

clang from developer tools 6.0 of September 2014, even if configure points
to "gcc" in /usr/bin/:
$ which gcc
/usr/bin/gcc
$ gcc --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr
--with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang-600.0.51) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix
$ clang --version
Apple LLVM version 6.0 (clang-600.0.51) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix

Any special build options?

Nothing really fancy:
$ ./configure --enable-depend --enable-debug --disable-rpath
--enable-cassert --prefix=/to/path/bin/pgsql --with-libxml
CFLAGS=
I am attaching config.log in case. Btw that's 10.9.5, and I have been able
to reproduce it on a second machine running 10.9.5 as well.
Regards,
--
Michael

Attachments:

config.tar.gz (application/x-gzip)
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#3)
Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX

Michael Paquier <michael.paquier@gmail.com> writes:

On Mon, Oct 6, 2014 at 12:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

[ scratches head... ] Doesn't reproduce on my OSX Mavericks laptop,
either with or without --disable-integer-datetimes.
What compiler are you using exactly?

clang from developer tools 6.0 of September 2014, even if configure points
to "gcc" in /usr/bin/:
$ which gcc
/usr/bin/gcc
$ gcc --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr
--with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang-600.0.51) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix

Exact same here, so that's not it. (I think ... my Xcode says it's 6.0.1,
but the compiler --version report is just the same as you show.)

Any special build options?

Nothing really fancy:
$ ./configure --enable-depend --enable-debug --disable-rpath
--enable-cassert --prefix=/to/path/bin/pgsql --with-libxml

That looks about like mine too, though I'm not using --disable-rpath
... what's the reason for that?

I am attaching config.log in case. Btw that's 10.9.5, and I have been able
to reproduce it on a second machine running 10.9.5 as well.

10.9.5 here as well. We're running out of explanations ...

regards, tom lane


#5Michael Paquier
michael.paquier@gmail.com
In reply to: Tom Lane (#4)
Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX

On Mon, Oct 6, 2014 at 1:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michael Paquier <michael.paquier@gmail.com> writes:

Nothing really fancy:
$ ./configure --enable-depend --enable-debug --disable-rpath
--enable-cassert --prefix=/to/path/bin/pgsql --with-libxml

That looks about like mine too, though I'm not using --disable-rpath
... what's the reason for that?

No real reason. That was only some old remnant in a build script that was
here for ages :)
--
Michael

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#5)
Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX

Michael Paquier <michael.paquier@gmail.com> writes:

On Mon, Oct 6, 2014 at 1:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

That looks about like mine too, though I'm not using --disable-rpath
... what's the reason for that?

No real reason. That was only some old remnant in a build script that was
here for ages :)

Hm. Grasping at straws here ... what's your locale environment?

regards, tom lane


#7Michael Paquier
michael.paquier@gmail.com
In reply to: Tom Lane (#6)
Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX

On Mon, Oct 6, 2014 at 10:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michael Paquier <michael.paquier@gmail.com> writes:

On Mon, Oct 6, 2014 at 1:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

That looks about like mine too, though I'm not using --disable-rpath
... what's the reason for that?

No real reason. That was only some old remnant in a build script that was
here for ages :)

Hm. Grasping at straws here ... what's your locale environment?

The system locales have nothing really special...
$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
But now that you mention it, I also have this:
$ defaults read -g AppleLocale
en_JP
--
Michael

#8Michael Paquier
michael.paquier@gmail.com
In reply to: Michael Paquier (#7)
Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX

On Tue, Oct 7, 2014 at 8:14 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:

The system locales have nothing really special...
$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
But now that you mention it, I also have this:
$ defaults read -g AppleLocale
en_JP

Hm... I have tried changing the system locales (to en_US for example) and
time format but I can still trigger the issue all the time. I'll try to
have a closer look.. It looks like this test does not like some settings at
the OS level.
--
Michael

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#8)
Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX

Michael Paquier <michael.paquier@gmail.com> writes:

Hm... I have tried changing the system locales (to en_US for example) and
time format but I can still trigger the issue all the time. I'll try to
have a closer look.. It looks like this test does not like some settings at
the OS level.

I eventually realized that the critical difference was you'd added
"CFLAGS=" to the configure call. On this platform that has the net
effect of removing -O2 from the compiler flags, and apparently that
shifts around the stack layout enough to expose the clobber.

The fix is simple enough: ecpg's version of ParseDateTime is failing
to check for overrun of the field[] array until *after* it's already
clobbered the stack:

*** a/src/interfaces/ecpg/pgtypeslib/dt_common.c
--- b/src/interfaces/ecpg/pgtypeslib/dt_common.c
*************** ParseDateTime(char *timestr, char *lowst
*** 1695,1703 ****
    while (*(*endstr) != '\0')
    {
        /* Record start of current field */
-       field[nf] = lp;
        if (nf >= MAXDATEFIELDS)
            return -1;
        /* leading digit? then date or time */
        if (isdigit((unsigned char) *(*endstr)))
--- 1695,1703 ----
    while (*(*endstr) != '\0')
    {
        /* Record start of current field */
        if (nf >= MAXDATEFIELDS)
            return -1;
+       field[nf] = lp;

        /* leading digit? then date or time */
        if (isdigit((unsigned char) *(*endstr)))

Kind of astonishing that nobody else has reported this, given that
there's been a regression test specifically meant to catch such a
problem since 4318dae. The stack layout in PGTYPESdate_from_asc
must happen to avoid the issue on practically all platforms.
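
To make the ordering problem concrete, here is a minimal sketch, not the ecpg code itself. The names split_fields, guard_clobbered and MAXFIELDS are hypothetical, and the array is deliberately given one extra "guard" slot so that the off-by-one store can be observed without undefined behavior. In the real function the slot after field[] is whatever the compiler happened to place next on the stack, which would explain why the clobber only shows up with some stack layouts.

```c
#include <ctype.h>
#include <stddef.h>

#define MAXFIELDS 4				/* hypothetical stand-in for MAXDATEFIELDS */

/*
 * Split str on whitespace and record the start of each field in slots[].
 * slots[] has MAXFIELDS usable entries plus one guard entry at index
 * MAXFIELDS.  With check_first != 0 the bound is tested before the store
 * (the fixed order); with check_first == 0 the store happens first (the
 * old order).  Returns the number of fields, or -1 if there are too many.
 */
int
split_fields(const char *str, const char *slots[MAXFIELDS + 1], int check_first)
{
	int			nf = 0;
	const char *p = str;

	while (*p != '\0')
	{
		/* skip leading whitespace */
		while (isspace((unsigned char) *p))
			p++;
		if (*p == '\0')
			break;

		if (check_first)
		{
			if (nf >= MAXFIELDS)
				return -1;		/* rejected before anything is written */
			slots[nf] = p;
		}
		else
		{
			slots[nf] = p;		/* when nf == MAXFIELDS this hits the guard */
			if (nf >= MAXFIELDS)
				return -1;		/* too late, the store already happened */
		}
		nf++;

		/* advance to the end of the current field */
		while (*p != '\0' && !isspace((unsigned char) *p))
			p++;
	}
	return nf;
}

/* Returns 1 if parsing input overwrote the guard slot, 0 otherwise. */
int
guard_clobbered(const char *input, int check_first)
{
	static const char guard[] = "guard";
	const char *slots[MAXFIELDS + 1];
	int			i;

	for (i = 0; i < MAXFIELDS; i++)
		slots[i] = NULL;
	slots[MAXFIELDS] = guard;

	(void) split_fields(input, slots, check_first);
	return slots[MAXFIELDS] != guard;
}
```

With five fields in the input, guard_clobbered("a b c d e", 0) reports the guard as overwritten while guard_clobbered("a b c d e", 1) does not; with four fields or fewer the two orderings behave the same, so only over-long input exposes the difference.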

regards, tom lane


#10Michael Paquier
michael.paquier@gmail.com
In reply to: Tom Lane (#9)
Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX

On Tue, Oct 7, 2014 at 9:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michael Paquier <michael.paquier@gmail.com> writes:

Hm... I have tried changing the system locales (to en_US for example) and
time format but I can still trigger the issue all the time. I'll try to
have a closer look.. It looks like this test does not like some settings at
the OS level.

I eventually realized that the critical difference was you'd added
"CFLAGS=" to the configure call. On this platform that has the net
effect of removing -O2 from the compiler flags, and apparently that
shifts around the stack layout enough to expose the clobber.

At least my scripts are weird enough to trigger such behaviors. The funny
part is that it's really a coincidence: CFLAGS was being set from an empty
variable, one that was removed from this script some time ago.

The fix is simple enough: ecpg's version of ParseDateTime is failing
to check for overrun of the field[] array until *after* it's already
clobbered the stack:

Kind of astonishing that nobody else has reported this, given that
there's been a regression test specifically meant to catch such a
problem since 4318dae. The stack layout in PGTYPESdate_from_asc
must happen to avoid the issue on practically all platforms.

Yes, thanks. That's it. At least I am not going crazy.
Regards,
--
Michael

#11Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#9)
Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX

On Mon, Oct 06, 2014 at 08:57:54PM -0400, Tom Lane wrote:

I eventually realized that the critical difference was you'd added
"CFLAGS=" to the configure call. On this platform that has the net
effect of removing -O2 from the compiler flags, and apparently that
shifts around the stack layout enough to expose the clobber.

The fix is simple enough: ecpg's version of ParseDateTime is failing
to check for overrun of the field[] array until *after* it's already
clobbered the stack:

Thanks for tracking that down. Oops.
