Odd 9.4, 9.3 buildfarm failure on s390x
Hi,
It seems Mark started a new buildfarm animal on s390x. It shows a pretty
odd failure on 9.3 and 9.4, but *not* on newer animals:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lumpsucker&dt=2018-09-26%2020%3A30%3A58
================== pgsql.build/src/test/regress/regression.diffs ===================
*** /home/linux1/build-farm-8-clang/buildroot/REL9_4_STABLE/pgsql.build/src/test/regress/expected/uuid.out Mon Sep 24 17:49:30 2018
--- /home/linux1/build-farm-8-clang/buildroot/REL9_4_STABLE/pgsql.build/src/test/regress/results/uuid.out Wed Sep 26 16:31:31 2018
***************
*** 64,72 ****
SELECT guid_field FROM guid1 ORDER BY guid_field DESC;
guid_field
--------------------------------------
- 3f3e3c3b-3a30-3938-3736-353433a2313e
- 22222222-2222-2222-2222-222222222222
11111111-1111-1111-1111-111111111111
(3 rows)
-- = operator test
--- 64,72 ----
SELECT guid_field FROM guid1 ORDER BY guid_field DESC;
guid_field
--------------------------------------
11111111-1111-1111-1111-111111111111
+ 22222222-2222-2222-2222-222222222222
+ 3f3e3c3b-3a30-3938-3736-353433a2313e
(3 rows)
-- = operator test
======================================================================
Mark, is there anything odd for specific branches?
I don't see anything immediately suspicious in the relevant comparator
code...
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
It seems Mark started a new buildfarm animal on s390x. It shows a pretty
odd failure on 9.3 and 9.4, but *not* on newer animals:
No, lumpsucker is showing the same failure on 9.5 as well. I suspect
that the reason 9.6 and up are OK is that 9.6 is where we introduced
the abbreviated-sort-key machinery. IOW, the problem exists in the
old-style UUID sort comparator but not the new one. Which is pretty
darn odd, because the old-style comparator is just memcmp(). How
could that be broken without causing lots more issues?
regards, tom lane
On Fri, Sep 28, 2018 at 11:52:15AM -0700, Andres Freund wrote:
Mark, is there anything odd for specific branches?
No... I don't have anything in the config that would be applied to
specific branches...
Regards,
Mark
--
Mark Wong http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services
Hi,
On 2018-09-28 15:22:23 -0700, Mark Wong wrote:
On Fri, Sep 28, 2018 at 11:52:15AM -0700, Andres Freund wrote:
Mark, is there anything odd for specific branches?
No... I don't have anything in the config that would be applied to
specific branches...
Could you perhaps do some manual debugging on that machine?
Maybe starting with manually running something like:
SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '22222222-2222-2222-2222-222222222222'::uuid);
SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '11111111-1111-1111-1111-111111111111'::uuid);
SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '11111113-1111-1111-1111-111111111111'::uuid);
SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '11111110-1111-1111-1111-111111111111'::uuid);
on both master and one of the failing branches?
Greetings,
Andres Freund
Hi Andres,
On Fri, Sep 28, 2018 at 03:41:27PM -0700, Andres Freund wrote:
On 2018-09-28 15:22:23 -0700, Mark Wong wrote:
On Fri, Sep 28, 2018 at 11:52:15AM -0700, Andres Freund wrote:
Mark, is there anything odd for specific branches?
No... I don't have anything in the config that would be applied to
specific branches...
Could you perhaps do some manual debugging on that machine?
Maybe starting with manually running something like:
SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '22222222-2222-2222-2222-222222222222'::uuid);
SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '11111111-1111-1111-1111-111111111111'::uuid);
SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '11111113-1111-1111-1111-111111111111'::uuid);
SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '11111110-1111-1111-1111-111111111111'::uuid);
on both master and one of the failing branches?
I've attached the output for head and the 9.4 stable branch. It appears
they are returning the same results.
I built them both by:
CC=/usr/bin/clang ./configure --enable-cassert --enable-debug \
--enable-nls --with-perl --with-python --with-tcl \
--with-tclconfig=/usr/lib64 --with-gssapi --with-openssl \
--with-ldap --with-libxml --with-libxslt
What should I try next?
Regards,
Mark
--
Mark Wong http://www.2ndQuadrant.com/
"Mark" == Mark Wong <mark@2ndQuadrant.com> writes:
Mark> What should I try next?
What is the size of a C "int" on this platform?
--
Andrew (irc:RhodiumToad)
On 09/29/2018 01:36 AM, Andrew Gierth wrote:
Mark> What should I try next?
What is the size of a C "int" on this platform?
4.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
"Andrew" == Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
What is the size of a C "int" on this platform?
Andrew> 4.
Hmm.
Because int being more than 32 bits is the simplest explanation for this
difference.
How about the output of this query:
with d(a) as (values ('11111111-1111-1111-1111-111111111111'::uuid),
('22222222-2222-2222-2222-222222222222'::uuid),
('3f3e3c3b-3a30-3938-3736-353433a2313e'::uuid))
select d1.a, d2.a, uuid_cmp(d1.a,d2.a) from d d1, d d2
order by d1.a asc, d2.a desc;
--
Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Because int being more than 32 bits is the simplest explanation for this
difference.
Curious to hear your reasoning behind that statement? I hadn't gotten
further than "memcmp is broken" ... and neither of those theories is
tenable, because if they were true then a lot more things besides uuid
sorting would be falling over.
regards, tom lane
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Because int being more than 32 bits is the simplest explanation for
this difference.
Tom> Curious to hear your reasoning behind that statement? I hadn't
Tom> gotten further than "memcmp is broken" ... and neither of those
Tom> theories is tenable, because if they were true then a lot more
Tom> things besides uuid sorting would be falling over.
memcmp() returns an int, and guarantees only the sign of the result, so
((int32) memcmp()) may have the wrong value if int is wider than int32.
But yeah, it seems unlikely that it would break for uuid but not bytea
(or text in collate C).
--
Andrew (irc:RhodiumToad)
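To illustrate the narrowing concern raised above: only the sign of memcmp()'s result is specified, so squeezing a wider integer into 32 bits can lose that sign. The sketch below is hypothetical. It uses int64_t as a stand-in for a platform whose int is wider than 32 bits (lumpsucker's is not, per the answer above), and the function names are made up for illustration. Converting an out-of-range value to int32_t is implementation-defined before C23; the assertion on narrow_naive() assumes the usual wraparound that common compilers implement.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for memcmp()'s result on a hypothetical platform whose "int"
 * is 64 bits wide.  Only the sign of the value is meaningful. */
typedef int64_t wide_int;

/* Naive narrowing: keeps only the low 32 bits, so the sign can be lost. */
static int32_t
narrow_naive(wide_int cmp)
{
	return (int32_t) cmp;
}

/* Sign-preserving narrowing: collapse to -1, 0 or +1 first. */
static int32_t
narrow_safe(wide_int cmp)
{
	return (cmp > 0) - (cmp < 0);
}

static void
demo_narrowing(void)
{
	wide_int	positive = (wide_int) 1 << 32;	/* low 32 bits are all zero */

	assert(narrow_naive(positive) == 0);	/* "greater" turned into "equal" */
	assert(narrow_safe(positive) == 1);
	assert(narrow_safe(-positive) == -1);
}
```

As the thread goes on to show, this turns out not to be what is happening on this machine, but the same sign-only contract is what matters later.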
On Sun, Sep 30, 2018 at 12:38:46AM +0100, Andrew Gierth wrote:
"Andrew" == Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
What is the size of a C "int" on this platform?
Andrew> 4.
Hmm.
Because int being more than 32 bits is the simplest explanation for this
difference.
How about the output of this query:
with d(a) as (values ('11111111-1111-1111-1111-111111111111'::uuid),
('22222222-2222-2222-2222-222222222222'::uuid),
('3f3e3c3b-3a30-3938-3736-353433a2313e'::uuid))
select d1.a, d2.a, uuid_cmp(d1.a,d2.a) from d d1, d d2
order by d1.a asc, d2.a desc;
That also appears to produce the same results:
With 9.4:
postgres=# select version();
version
-------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.4.19 on s390x-ibm-linux-gnu, compiled by clang version 5.0.1 (tags/RELEASE_501/final 312548), 64-bit
(1 row)
...
a | a | uuid_cmp
--------------------------------------+--------------------------------------+-------------
11111111-1111-1111-1111-111111111111 | 11111111-1111-1111-1111-111111111111 | 0
11111111-1111-1111-1111-111111111111 | 22222222-2222-2222-2222-222222222222 | -2147483648
11111111-1111-1111-1111-111111111111 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648
22222222-2222-2222-2222-222222222222 | 11111111-1111-1111-1111-111111111111 | 1
22222222-2222-2222-2222-222222222222 | 22222222-2222-2222-2222-222222222222 | 0
22222222-2222-2222-2222-222222222222 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648
3f3e3c3b-3a30-3938-3736-353433a2313e | 11111111-1111-1111-1111-111111111111 | 1
3f3e3c3b-3a30-3938-3736-353433a2313e | 22222222-2222-2222-2222-222222222222 | 1
3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e | 0
(9 rows)
Then with HEAD:
postgres=# select version();
version
--------------------------------------------------------------------------------------------------------------------
PostgreSQL 12devel on s390x-ibm-linux-gnu, compiled by clang version 5.0.1 (tags/RELEASE_501/final 312548), 64-bit
(1 row)
...
a | a | uuid_cmp
--------------------------------------+--------------------------------------+-------------
11111111-1111-1111-1111-111111111111 | 11111111-1111-1111-1111-111111111111 | 0
11111111-1111-1111-1111-111111111111 | 22222222-2222-2222-2222-222222222222 | -2147483648
11111111-1111-1111-1111-111111111111 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648
22222222-2222-2222-2222-222222222222 | 11111111-1111-1111-1111-111111111111 | 1
22222222-2222-2222-2222-222222222222 | 22222222-2222-2222-2222-222222222222 | 0
22222222-2222-2222-2222-222222222222 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648
3f3e3c3b-3a30-3938-3736-353433a2313e | 11111111-1111-1111-1111-111111111111 | 1
3f3e3c3b-3a30-3938-3736-353433a2313e | 22222222-2222-2222-2222-222222222222 | 1
3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e | 0
(9 rows)
Regards,
Mark
--
Mark Wong http://www.2ndQuadrant.com/
Mark Wong <mark@2ndQuadrant.com> writes:
a | a | uuid_cmp
--------------------------------------+--------------------------------------+-------------
11111111-1111-1111-1111-111111111111 | 11111111-1111-1111-1111-111111111111 | 0
11111111-1111-1111-1111-111111111111 | 22222222-2222-2222-2222-222222222222 | -2147483648
11111111-1111-1111-1111-111111111111 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648
22222222-2222-2222-2222-222222222222 | 11111111-1111-1111-1111-111111111111 | 1
22222222-2222-2222-2222-222222222222 | 22222222-2222-2222-2222-222222222222 | 0
22222222-2222-2222-2222-222222222222 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648
3f3e3c3b-3a30-3938-3736-353433a2313e | 11111111-1111-1111-1111-111111111111 | 1
3f3e3c3b-3a30-3938-3736-353433a2313e | 22222222-2222-2222-2222-222222222222 | 1
3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e | 0
(9 rows)
Oooh ... apparently, on that platform, memcmp() is willing to produce
INT_MIN in some cases. That's not a safe value for a sort comparator
to produce --- we explicitly say that somewhere, IIRC. I think we
implement DESC by negating the comparator's result, which explains
why only the DESC case fails.
regards, tom lane
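Why INT_MIN is special here: in two's complement there is no positive int of the same magnitude, so flipping a comparator's sign by plain negation cannot represent the result, and negating INT_MIN is undefined behavior in C. The sketch below does the arithmetic in 64 bits so it can show this without triggering that undefined behavior. The helper name is hypothetical, and the code assumes a 32-bit int, as reported for this platform.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if -cmp can be represented as an int.
 * Assumes int is narrower than 64 bits, so the negation cannot overflow. */
static bool
negation_fits_in_int(int cmp)
{
	int64_t		negated = -(int64_t) cmp;

	return negated >= INT_MIN && negated <= INT_MAX;
}

static void
demo_negation(void)
{
	assert(negation_fits_in_int(1));
	assert(negation_fits_in_int(-255));
	assert(negation_fits_in_int(INT_MAX));
	assert(!negation_fits_in_int(INT_MIN));	/* the value seen in the output above */
}
```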
On 2018-10-01 11:58:51 -0400, Tom Lane wrote:
Mark Wong <mark@2ndQuadrant.com> writes:
a | a | uuid_cmp
--------------------------------------+--------------------------------------+-------------
11111111-1111-1111-1111-111111111111 | 11111111-1111-1111-1111-111111111111 | 0
11111111-1111-1111-1111-111111111111 | 22222222-2222-2222-2222-222222222222 | -2147483648
11111111-1111-1111-1111-111111111111 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648
22222222-2222-2222-2222-222222222222 | 11111111-1111-1111-1111-111111111111 | 1
22222222-2222-2222-2222-222222222222 | 22222222-2222-2222-2222-222222222222 | 0
22222222-2222-2222-2222-222222222222 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648
3f3e3c3b-3a30-3938-3736-353433a2313e | 11111111-1111-1111-1111-111111111111 | 1
3f3e3c3b-3a30-3938-3736-353433a2313e | 22222222-2222-2222-2222-222222222222 | 1
3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e | 0
(9 rows)
Oooh ... apparently, on that platform, memcmp() is willing to produce
INT_MIN in some cases. That's not a safe value for a sort comparator
to produce --- we explicitly say that somewhere, IIRC.
Hm, that'd be pretty painful - memcmp() isn't guaranteed to return
anything smaller. And we use memcmp in a fair number of comparators.
I think we implement DESC by negating the comparator's result, which
explains why only the DESC case fails.
That makes sense.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2018-10-01 11:58:51 -0400, Tom Lane wrote:
Oooh ... apparently, on that platform, memcmp() is willing to produce
INT_MIN in some cases. That's not a safe value for a sort comparator
to produce --- we explicitly say that somewhere, IIRC.
Hm, that'd be pretty painful - memcmp() isn't guaranteed to return
anything smaller. And we use memcmp in a fair number of comparators.
Yeah. So our choices are
(1) Retain the current restriction on what sort comparators can
produce. Find all the places where memcmp's result is returned
directly, and fix them. (I wonder if strcmp has same issue.)
(2) Drop the restriction. This'd require at least changing the
DESC correction, and maybe other things. I'm not sure what the
odds would be of finding everyplace we need to check.
Neither one is sounding very pleasant, or maintainable.
regards, tom lane
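The two options might look roughly like this in C. This is only a sketch with hypothetical helper names, not code from the PostgreSQL tree. Option (1) clamps a memcmp() result at each call site so it can never be INT_MIN; option (2) leaves the comparators alone and makes the DESC inversion itself safe for any input.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Option (1): normalize at the comparator, returning only -1, 0 or +1. */
static int
clamped_memcmp(const void *a, const void *b, size_t len)
{
	int			r = memcmp(a, b, len);

	return (r > 0) - (r < 0);
}

/* Option (2): invert a comparator result without ever negating INT_MIN. */
static int
invert_compare(int cmp)
{
	if (cmp < 0)
		return 1;				/* covers INT_MIN */
	return -cmp;				/* cmp >= 0, so negation cannot overflow */
}

static void
demo_options(void)
{
	assert(clamped_memcmp("abc", "abd", 3) == -1);
	assert(clamped_memcmp("abd", "abc", 3) == 1);
	assert(clamped_memcmp("abc", "abc", 3) == 0);
	assert(invert_compare(INT_MIN) > 0);
	assert(invert_compare(INT_MAX) < 0);
	assert(invert_compare(0) == 0);
}
```

The trade-off matches the one described above: the first helper has to be applied everywhere memcmp() is returned directly, while the second has to be applied everywhere a result is inverted.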
On 2018-10-01 12:13:57 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2018-10-01 11:58:51 -0400, Tom Lane wrote:
Oooh ... apparently, on that platform, memcmp() is willing to produce
INT_MIN in some cases. That's not a safe value for a sort comparator
to produce --- we explicitly say that somewhere, IIRC.
Hm, that'd be pretty painful - memcmp() isn't guaranteed to return
anything smaller. And we use memcmp in a fair number of comparators.
Yeah. So our choices are
(1) Retain the current restriction on what sort comparators can
produce. Find all the places where memcmp's result is returned
directly, and fix them. (I wonder if strcmp has same issue.)
(2) Drop the restriction. This'd require at least changing the
DESC correction, and maybe other things. I'm not sure what the
odds would be of finding everyplace we need to check.
Neither one is sounding very pleasant, or maintainable.
(2) seems more maintainable to me (or perhaps less unmaintainable). It's
infrastructure, rather than every datatype + support out there...
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2018-10-01 12:13:57 -0400, Tom Lane wrote:
Yeah. So our choices are
(1) Retain the current restriction on what sort comparators can
produce. Find all the places where memcmp's result is returned
directly, and fix them. (I wonder if strcmp has same issue.)
(2) Drop the restriction. This'd require at least changing the
DESC correction, and maybe other things. I'm not sure what the
odds would be of finding everyplace we need to check.
Neither one is sounding very pleasant, or maintainable.
(2) seems more maintainable to me (or perhaps less unmaintainable). It's
infrastructure, rather than every datatype + support out there...
I guess we could set up some testing infrastructure: hack int4cmp
and/or a couple other popular comparators so that they *always*
return INT_MIN, 0, or INT_MAX, and then see what falls over.
I'm fairly sure that btree, as well as the sort code proper,
has got an issue here.
regards, tom lane
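A standalone approximation of that stress test, outside PostgreSQL: make an integer comparator return only INT_MIN, 0, or INT_MAX, then check that a descending sort built on top of it still orders correctly. The function names are hypothetical, and qsort() merely stands in for the real sort and btree code. A naive `-cmp` inversion would be undefined behavior for INT_MIN, so this sketch inverts by sign instead.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Comparator that returns only the extreme values a real one may produce. */
static int
extreme_int_cmp(const void *a, const void *b)
{
	int			x = *(const int *) a;
	int			y = *(const int *) b;

	if (x < y)
		return INT_MIN;
	if (x > y)
		return INT_MAX;
	return 0;
}

/* Descending order: invert by sign, never by negation. */
static int
extreme_int_cmp_desc(const void *a, const void *b)
{
	int			cmp = extreme_int_cmp(a, b);

	return (cmp < 0) ? 1 : (cmp > 0) ? -1 : 0;
}

/* Sorts vals in place and reports whether the result is descending. */
static bool
sorts_descending(int *vals, size_t n)
{
	qsort(vals, n, sizeof(int), extreme_int_cmp_desc);
	for (size_t i = 1; i < n; i++)
		if (vals[i - 1] < vals[i])
			return false;
	return true;
}

static void
demo_stress(void)
{
	int			vals[] = {3, -7, 42, 0, 3, INT_MIN, INT_MAX};

	assert(sorts_descending(vals, sizeof(vals) / sizeof(vals[0])));
	assert(vals[0] == INT_MAX);
}
```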
On 10/01/2018 11:58 AM, Tom Lane wrote:
Mark Wong <mark@2ndQuadrant.com> writes:
a | a | uuid_cmp
--------------------------------------+--------------------------------------+-------------
11111111-1111-1111-1111-111111111111 | 11111111-1111-1111-1111-111111111111 | 0
11111111-1111-1111-1111-111111111111 | 22222222-2222-2222-2222-222222222222 | -2147483648
11111111-1111-1111-1111-111111111111 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648
22222222-2222-2222-2222-222222222222 | 11111111-1111-1111-1111-111111111111 | 1
22222222-2222-2222-2222-222222222222 | 22222222-2222-2222-2222-222222222222 | 0
22222222-2222-2222-2222-222222222222 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648
3f3e3c3b-3a30-3938-3736-353433a2313e | 11111111-1111-1111-1111-111111111111 | 1
3f3e3c3b-3a30-3938-3736-353433a2313e | 22222222-2222-2222-2222-222222222222 | 1
3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e | 0
(9 rows)
Oooh ... apparently, on that platform, memcmp() is willing to produce
INT_MIN in some cases. That's not a safe value for a sort comparator
to produce --- we explicitly say that somewhere, IIRC. I think we
implement DESC by negating the comparator's result, which explains
why only the DESC case fails.
Is there a standard that forbids this, or have we just been lucky up to now?
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
On 10/01/2018 11:58 AM, Tom Lane wrote:
Oooh ... apparently, on that platform, memcmp() is willing to produce
INT_MIN in some cases. That's not a safe value for a sort comparator
to produce --- we explicitly say that somewhere, IIRC. I think we
implement DESC by negating the comparator's result, which explains
why only the DESC case fails.
Is there a standard that forbids this, or have we just been lucky up to now?
We've been lucky; POSIX just says the value is less than, equal to,
or greater than zero.
In practice, a memcmp that operates byte-at-a-time would not likely
return anything outside +-255. But on a big-endian machine you could
easily optimize to use word-wide operations to compare 4 bytes at a
time, and I suspect that's what's happening here. Or maybe there's
just some weird architecture-specific reason that makes it cheap
for them to return INT_MIN rather than some other value?
regards, tom lane
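To make the byte-at-a-time versus word-at-a-time distinction concrete, here are two toy comparison routines. Neither is the platform's actual memcmp(), whose implementation this thread does not show. The first returns the difference of the first mismatching bytes, so it stays within +-255. The second compares 4 bytes at a time as big-endian words and returns whatever is convenient, which is all the standard requires; having it return INT_MIN or 1 is purely illustrative, chosen to mirror the values in the output above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Byte-at-a-time: result is a byte difference, always within [-255, 255]. */
static int
bytewise_cmp(const unsigned char *a, const unsigned char *b, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (a[i] != b[i])
			return (int) a[i] - (int) b[i];
	return 0;
}

/* Assemble 4 bytes as a big-endian word, independent of host byte order. */
static uint32_t
load_be32(const unsigned char *p)
{
	return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
		((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Word-at-a-time: len must be a multiple of 4 in this toy version. */
static int
wordwise_cmp(const unsigned char *a, const unsigned char *b, size_t len)
{
	assert(len % 4 == 0);
	for (size_t i = 0; i < len; i += 4)
	{
		uint32_t	wa = load_be32(a + i);
		uint32_t	wb = load_be32(b + i);

		if (wa != wb)
			return (wa < wb) ? INT_MIN : 1;
	}
	return 0;
}
```

Both routines agree on the sign for any pair of equal-length inputs, which is the only property a caller may rely on.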
On Mon, Oct 01, 2018 at 05:11:02PM -0400, Tom Lane wrote:
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
On 10/01/2018 11:58 AM, Tom Lane wrote:
Oooh ... apparently, on that platform, memcmp() is willing to produce
INT_MIN in some cases. That's not a safe value for a sort comparator
to produce --- we explicitly say that somewhere, IIRC. I think we
implement DESC by negating the comparator's result, which explains
why only the DESC case fails.
Is there a standard that forbids this, or have we just been lucky up to now?
We've been lucky; POSIX just says the value is less than, equal to,
or greater than zero.
In practice, a memcmp that operates byte-at-a-time would not likely
return anything outside +-255. But on a big-endian machine you could
easily optimize to use word-wide operations to compare 4 bytes at a
time, and I suspect that's what's happening here. Or maybe there's
just some weird architecture-specific reason that makes it cheap
for them to return INT_MIN rather than some other value?
As a former S3[79]x assembler programmer, I'd guess they do it in
registers or using TRT, all of which could be word-wise.
regards, tom lane
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: ler@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106
On 10/01/2018 12:50 PM, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2018-10-01 12:13:57 -0400, Tom Lane wrote:
Yeah. So our choices are
(1) Retain the current restriction on what sort comparators can
produce. Find all the places where memcmp's result is returned
directly, and fix them. (I wonder if strcmp has same issue.)
(2) Drop the restriction. This'd require at least changing the
DESC correction, and maybe other things. I'm not sure what the
odds would be of finding everyplace we need to check.
Neither one is sounding very pleasant, or maintainable.
(2) seems more maintainable to me (or perhaps less unmaintainable). It's
infrastructure, rather than every datatype + support out there...
I guess we could set up some testing infrastructure: hack int4cmp
and/or a couple other popular comparators so that they *always*
return INT_MIN, 0, or INT_MAX, and then see what falls over.
I'm fairly sure that btree, as well as the sort code proper,
has got an issue here.
I agree option 2 seems less unmaintainable. (Nice use of litotes there?)
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com