Server crash on RHEL 9/s390x platform against PG16

Started by Suraj Kharage, over 2 years ago, 7 messages
#1Suraj Kharage
suraj.kharage@enterprisedb.com

Hi,

I found a server crash on the RHEL 9/s390x platform with the test case below.

*Machine details:*

[edb@9428da9d2137 postgres]$ cat /etc/redhat-release
AlmaLinux release 9.2 (Turquoise Kodkod)
[edb@9428da9d2137 postgres]$ lscpu
Architecture: s390x
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Big Endian

*Configure command:*
./configure --prefix=/home/edb/postgres/ --with-lz4 --with-zstd --with-llvm
--with-perl --with-python --with-tcl --with-openssl --enable-nls
--with-libxml --with-libxslt --with-systemd --with-libcurl --without-icu
--enable-debug --enable-cassert --with-pgport=5414

*Test case:*
CREATE TABLE rm32044_t1
(
pkey integer,
val text
);
CREATE TABLE rm32044_t2
(
pkey integer,
label text,
hidden boolean
);
CREATE TABLE rm32044_t3
(
pkey integer,
val integer
);
CREATE TABLE rm32044_t4
(
pkey integer
);
insert into rm32044_t1 values ( 1 , 'row1');
insert into rm32044_t1 values ( 2 , 'row2');
insert into rm32044_t2 values ( 1 , 'hidden', true);
insert into rm32044_t2 values ( 2 , 'visible', false);
insert into rm32044_t3 values (1 , 1);
insert into rm32044_t3 values (2 , 1);

postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey
= rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey =
rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.

*backtrace:*
[edb@9428da9d2137 postgres]$ gdb bin/postgres
data/qemu_postgres_20230911-140628_65620.core
Core was generated by `postgres: edb postgres [local] SELECT '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000000010a8366 in heap_compute_data_size
(tupleDesc=tupleDesc@entry=0x1ba3d10,
values=values@entry=0x1ba4168, isnull=isnull@entry=0x1ba41a8) at
heaptuple.c:227
227 VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
[Current thread is 1 (LWP 65597)]
Missing separate debuginfos, use: dnf debuginfo-install
glibc-2.34-60.el9.s390x libcap-2.48-8.el9.s390x
libedit-3.1-37.20210216cvs.el9.s390x libffi-3.4.2-7.el9.s390x
libgcc-11.3.1-4.3.el9.alma.s390x libgcrypt-1.10.0-10.el9_2.s390x
libgpg-error-1.42-5.el9.s390x libstdc++-11.3.1-4.3.el9.alma.s390x
libxml2-2.9.13-3.el9_2.1.s390x libzstd-1.5.1-2.el9.s390x
llvm-libs-15.0.7-1.el9.s390x lz4-libs-1.9.3-5.el9.s390x
ncurses-libs-6.2-8.20210508.el9.s390x openssl-libs-3.0.7-17.el9_2.s390x
systemd-libs-252-14.el9_2.3.s390x xz-libs-5.2.5-8.el9_0.s390x
(gdb) bt
#0 0x00000000010a8366 in heap_compute_data_size
(tupleDesc=tupleDesc@entry=0x1ba3d10,
values=values@entry=0x1ba4168, isnull=isnull@entry=0x1ba41a8) at
heaptuple.c:227
#1 0x00000000010a9bb0 in heap_form_minimal_tuple
(tupleDescriptor=0x1ba3d10, values=0x1ba4168, isnull=0x1ba41a8) at
heaptuple.c:1484
#2 0x00000000016553fa in ExecCopySlotMinimalTuple (slot=<optimized out>)
at ../../../../src/include/executor/tuptable.h:472
#3 tuplesort_puttupleslot (state=state@entry=0x1be4d18,
slot=slot@entry=0x1ba4120)
at tuplesortvariants.c:610
#4 0x00000000012dc0e0 in ExecIncrementalSort (pstate=0x1acb4d8) at
nodeIncrementalSort.c:716
#5 0x00000000012b32c6 in ExecProcNode (node=0x1acb4d8) at
../../../src/include/executor/executor.h:273
#6 ExecutePlan (execute_once=<optimized out>, dest=0x1ade698,
direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>,
operation=CMD_SELECT, use_parallel_mode=<optimized out>,
planstate=0x1acb4d8, estate=0x1acb258) at execMain.c:1670
#7 standard_ExecutorRun (queryDesc=0x19ad338, direction=<optimized out>,
count=0, execute_once=<optimized out>) at execMain.c:365
#8 0x00000000014a6ae2 in PortalRunSelect (portal=portal@entry=0x1a63558,
forward=forward@entry=true, count=0, count@entry=9223372036854775807,
dest=dest@entry=0x1ade698) at pquery.c:924
#9 0x00000000014a84e0 in PortalRun (portal=portal@entry=0x1a63558,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true,
run_once=run_once@entry=true, dest=dest@entry=0x1ade698, altdest=0x1ade698,
qc=0x40007ff7b0) at pquery.c:768
#10 0x00000000014a3c1c in exec_simple_query (
query_string=0x19ea0e8 "SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2
ON rm32044_t1.pkey = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON
rm32044_t3.pkey = rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;")
at postgres.c:1274
#11 0x00000000014a57aa in PostgresMain (dbname=<optimized out>,
username=<optimized out>) at postgres.c:4637
#12 0x00000000013fdaf6 in BackendRun (port=0x1a132c0, port=0x1a132c0) at
postmaster.c:4464
#13 BackendStartup (port=0x1a132c0) at postmaster.c:4192
#14 ServerLoop () at postmaster.c:1782
#15 0x00000000013fec34 in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x19a59a0)
at postmaster.c:1466
#16 0x0000000001096faa in main (argc=<optimized out>, argv=0x19a59a0) at
main.c:198

(gdb) p val
$1 = 0

Does anybody have any idea about this?

--

Thanks & Regards,
Suraj kharage,

edbpostgres.com

#2Suraj Kharage
suraj.kharage@enterprisedb.com
In reply to: Suraj Kharage (#1)
Re: Server crash on RHEL 9/s390x platform against PG16

A few more details on this:

(gdb) p val
$1 = 0
(gdb) p i
$2 = 3
(gdb) f 3
#3 0x0000000001a1ef70 in ExecCopySlotMinimalTuple (slot=0x202e4f8) at
../../../../src/include/executor/tuptable.h:472
472 return slot->tts_ops->copy_minimal_tuple(slot);
(gdb) p *slot
$3 = {type = T_TupleTableSlot, tts_flags = 16, tts_nvalid = 8, tts_ops =
0x1b6dcc8 <TTSOpsVirtual>, tts_tupleDescriptor = 0x202e0e8, tts_values =
0x202e540, tts_isnull = 0x202e580, tts_mcxt = 0x1f54550, tts_tid =
{ip_blkid = {bi_hi = 65535,
bi_lo = 65535}, ip_posid = 0}, tts_tableOid = 0}
(gdb) p *slot->tts_tupleDescriptor
$2 = {natts = 8, tdtypeid = 2249, tdtypmod = -1, tdrefcount = -1, constr =
0x0, attrs = 0x202cd28}

(gdb) p slot.tts_values[3]
$4 = 0
(gdb) p slot.tts_values[2]
$5 = 1
(gdb) p slot.tts_values[1]
$6 = 34027556

According to the result slot, the datum at index 3 (column label) is 0.
I'm testing this in a Docker container and am having some issues with gdb,
so I could not debug it further.
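
To make the failure mode concrete, here is a minimal sketch in C. It is not PostgreSQL code: ToyDatum, ToyAttr and find_null_byref_datum are hypothetical names made up for illustration, and the model only assumes that a non-null, pass-by-reference attribute is later dereferenced through its datum. Under that assumption, a zero datum on such an attribute is exactly the kind of value that would fault at the crash site in the backtrace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a datum: wide enough to hold a pointer or a small value.
 * This is NOT PostgreSQL's actual Datum definition. */
typedef uintptr_t ToyDatum;

/* Hypothetical, simplified attribute metadata (not PostgreSQL's struct). */
typedef struct ToyAttr
{
    const char *name;
    bool        byval;  /* true: the value lives in the datum itself */
} ToyAttr;

/*
 * Return the index of the first attribute that is flagged non-null and is
 * passed by reference, yet carries a zero (NULL-pointer) datum.
 * Return -1 if no such attribute exists.
 */
int
find_null_byref_datum(const ToyAttr *attrs, const ToyDatum *values,
                      const bool *isnull, int natts)
{
    assert(attrs != NULL && values != NULL && isnull != NULL);

    for (int i = 0; i < natts; i++)
    {
        if (isnull[i])
            continue;           /* SQL NULL: the datum is never inspected */
        if (!attrs[i].byval && values[i] == 0)
            return i;           /* dereferencing this pointer would fault */
    }
    return -1;
}
```

Fed with the first four columns of the slot (integer columns treated as by-value, text columns as by-reference, the values at indexes 1 to 3 as printed by gdb, and all four assumed to be flagged non-null), the function would report index 3, which is the label column.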

Here is the EXPLAIN plan:

postgres=# explain (verbose, costs off) SELECT * FROM rm32044_t1 LEFT JOIN
rm32044_t2 ON rm32044_t1.pkey = rm32044_t2.pkey, rm32044_t3 LEFT JOIN
rm32044_t4 ON rm32044_t3.pkey = rm32044_t4.pkey order by
rm32044_t1.pkey,label,hidden;

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Incremental Sort
   Output: rm32044_t1.pkey, rm32044_t1.val, rm32044_t2.pkey, rm32044_t2.label, rm32044_t2.hidden, rm32044_t3.pkey, rm32044_t3.val, rm32044_t4.pkey
   Sort Key: rm32044_t1.pkey, rm32044_t2.label, rm32044_t2.hidden
   Presorted Key: rm32044_t1.pkey
   ->  Merge Left Join
         Output: rm32044_t1.pkey, rm32044_t1.val, rm32044_t2.pkey, rm32044_t2.label, rm32044_t2.hidden, rm32044_t3.pkey, rm32044_t3.val, rm32044_t4.pkey
         Merge Cond: (rm32044_t1.pkey = rm32044_t2.pkey)
         ->  Sort
               Output: rm32044_t3.pkey, rm32044_t3.val, rm32044_t4.pkey, rm32044_t1.pkey, rm32044_t1.val
               Sort Key: rm32044_t1.pkey
               ->  Nested Loop
                     Output: rm32044_t3.pkey, rm32044_t3.val, rm32044_t4.pkey, rm32044_t1.pkey, rm32044_t1.val
                     ->  Merge Left Join
                           Output: rm32044_t3.pkey, rm32044_t3.val, rm32044_t4.pkey
                           Merge Cond: (rm32044_t3.pkey = rm32044_t4.pkey)
                           ->  Sort
                                 Output: rm32044_t3.pkey, rm32044_t3.val
                                 Sort Key: rm32044_t3.pkey
                                 ->  Seq Scan on public.rm32044_t3
                                       Output: rm32044_t3.pkey, rm32044_t3.val
                           ->  Sort
                                 Output: rm32044_t4.pkey
                                 Sort Key: rm32044_t4.pkey
                                 ->  Seq Scan on public.rm32044_t4
                                       Output: rm32044_t4.pkey
                     ->  Materialize
                           Output: rm32044_t1.pkey, rm32044_t1.val
                           ->  Seq Scan on public.rm32044_t1
                                 Output: rm32044_t1.pkey, rm32044_t1.val
         ->  Sort
               Output: rm32044_t2.pkey, rm32044_t2.label, rm32044_t2.hidden
               Sort Key: rm32044_t2.pkey
               ->  Seq Scan on public.rm32044_t2
                     Output: rm32044_t2.pkey, rm32044_t2.label, rm32044_t2.hidden
(34 rows)

It seems that, while building the inner slot for the merge join, the value
for attnum 1 is not fetched correctly.

--

Thanks & Regards,
Suraj kharage,

edbpostgres.com

#3Suraj Kharage
suraj.kharage@enterprisedb.com
In reply to: Suraj Kharage (#2)
Re: Server crash on RHEL 9/s390x platform against PG16

It looks like an issue with JIT. If I disable the JIT then the above query
runs successfully.

postgres=# set jit to off;
SET
postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey
= rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey =
rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;
 pkey | val  | pkey |  label  | hidden | pkey | val | pkey
------+------+------+---------+--------+------+-----+------
    1 | row1 |    1 | hidden  | t      |    1 |   1 |
    1 | row1 |    1 | hidden  | t      |    2 |   1 |
    2 | row2 |    2 | visible | f      |    1 |   1 |
    2 | row2 |    2 | visible | f      |    2 |   1 |
(4 rows)

Any idea on this?

--

Thanks & Regards,
Suraj kharage,

edbpostgres.com

#4Suraj Kharage
suraj.kharage@enterprisedb.com
In reply to: Suraj Kharage (#3)
Re: Server crash on RHEL 9/s390x platform against PG16

Here is the clang version:

[edb@9428da9d2137]$ clang --version
clang version 15.0.7 (Red Hat 15.0.7-2.el9)
Target: s390x-ibm-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Let me know if any further information is needed.

--

Thanks & Regards,
Suraj kharage,

edbpostgres.com

#5Robert Haas
robertmhaas@gmail.com
In reply to: Suraj Kharage (#3)
Re: Server crash on RHEL 9/s390x platform against PG16

On Sun, Oct 8, 2023 at 10:55 PM Suraj Kharage <
suraj.kharage@enterprisedb.com> wrote:

It looks like an issue with JIT. If I disable the JIT then the above query
runs successfully.

postgres=# set jit to off;
SET
postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON
rm32044_t1.pkey = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON
rm32044_t3.pkey = rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;
 pkey | val  | pkey |  label  | hidden | pkey | val | pkey
------+------+------+---------+--------+------+-----+------
    1 | row1 |    1 | hidden  | t      |    1 |   1 |
    1 | row1 |    1 | hidden  | t      |    2 |   1 |
    2 | row2 |    2 | visible | f      |    1 |   1 |
    2 | row2 |    2 | visible | f      |    2 |   1 |
(4 rows)

Any idea on this?

No, but I found a few previous threads complaining about JIT not working on
s390x.

/messages/by-id/4106722.1616177001@sss.pgh.pa.us
/messages/by-id/3ba50664-56a2-bcf4-2b24-05a3e0a75829@enterprisedb.com
/messages/by-id/20200715091509.GA3354074@msg.df7cb.de

The most interesting email I found in those threads was this one:

/messages/by-id/3358505.1594912112@sss.pgh.pa.us

The backtrace there is different from the one you posted here in
significant ways, but it seems like both that case and this one involve a
null pointer showing up for a non-null pass-by-reference datum. That
doesn't seem like a whole lot to go on, but maybe somebody who understands
the JIT stuff better than I do will have an idea.
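
To make the symptom concrete: for a pass-by-reference type such as text, the datum is a pointer to the value, so a datum that is flagged non-null but holds 0 (as `p val` showed in the backtrace above) will crash as soon as something tries to measure or copy the value. What follows is only a minimal sketch of that invariant in plain C. The `Datum` typedef, the `AttrInfo` struct, and the `find_bogus_byref_datum` function are simplified, hypothetical stand-ins made up for illustration; they are not PostgreSQL's actual definitions or code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: a pointer-sized value, as a Datum is in spirit. */
typedef uintptr_t Datum;

/* Hypothetical per-attribute metadata; only the by-value flag matters here. */
typedef struct AttrInfo
{
    bool byval;     /* true: the value itself is stored in the Datum */
} AttrInfo;

/*
 * Return the index of the first attribute that is marked non-null and is
 * pass-by-reference, yet carries a zero (null pointer) datum.
 * Return -1 if every attribute is consistent.
 */
int
find_bogus_byref_datum(const AttrInfo *attrs, const Datum *values,
                       const bool *isnull, int natts)
{
    assert(natts >= 0);

    for (int i = 0; i < natts; i++)
    {
        if (isnull[i])
            continue;       /* SQL NULL: the datum is never dereferenced */
        if (attrs[i].byval)
            continue;       /* by-value: zero is a legitimate value */
        if (values[i] == 0)
            return i;       /* non-null by-reference datum with null pointer */
    }
    return -1;
}
```

For a three-column row (integer, text, text) where the last column has its null flag cleared but a zero datum, this returns 2; once that column's null flag is set, it returns -1. A check of this shape only detects the inconsistency; it says nothing about where the bad value was produced.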

--
Robert Haas
EDB: http://www.enterprisedb.com

#6Andres Freund
andres@anarazel.de
In reply to: Suraj Kharage (#1)
Re: Server crash on RHEL 9/s390x platform against PG16

Hi,

On 2023-09-12 15:27:21 +0530, Suraj Kharage wrote:

*[edb@9428da9d2137 postgres]$ cat /etc/redhat-release AlmaLinux release 9.2
(Turquoise Kodkod)[edb@9428da9d2137 postgres]$ lscpuArchitecture:
s390x CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits

Can you provide the rest of the lscpu output? There have been issues with Z14
vs Z15:
https://github.com/llvm/llvm-project/issues/53009

You're apparently not hitting that, but given that fact, you either are on a
slightly older CPU, or you have applied a patch to work around it. Because
otherwise your build instructions below would hit that problem, I think.

physical, 48 bits virtual Byte Order: Big Endian*
*Configure command:*
./configure --prefix=/home/edb/postgres/ --with-lz4 --with-zstd --with-llvm
--with-perl --with-python --with-tcl --with-openssl --enable-nls
--with-libxml --with-libxslt --with-systemd --with-libcurl --without-icu
--enable-debug --enable-cassert --with-pgport=5414

Hm, based on "--with-libcurl" this isn't upstream postgres, correct? Have you
verified the issue reproduces on upstream postgres?

*Test case:*
CREATE TABLE rm32044_t1
(
pkey integer,
val text
);
CREATE TABLE rm32044_t2
(
pkey integer,
label text,
hidden boolean
);
CREATE TABLE rm32044_t3
(
pkey integer,
val integer
);
CREATE TABLE rm32044_t4
(
pkey integer
);
insert into rm32044_t1 values ( 1 , 'row1');
insert into rm32044_t1 values ( 2 , 'row2');
insert into rm32044_t2 values ( 1 , 'hidden', true);
insert into rm32044_t2 values ( 2 , 'visible', false);
insert into rm32044_t3 values (1 , 1);
insert into rm32044_t3 values (2 , 1);

postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey
= rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey =
rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.

I tried this on both master and 16, without hitting this issue.

If you can reproduce the issue on upstream postgres, can you share more about
your configuration?

Greetings,

Andres Freund

#7Suraj Kharage
suraj.kharage@enterprisedb.com
In reply to: Andres Freund (#6)
Re: Server crash on RHEL 9/s390x platform against PG16

On Sat, Oct 21, 2023 at 5:17 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2023-09-12 15:27:21 +0530, Suraj Kharage wrote:

*[edb@9428da9d2137 postgres]$ cat /etc/redhat-release AlmaLinux release 9.2
(Turquoise Kodkod)[edb@9428da9d2137 postgres]$ lscpuArchitecture:
s390x CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits

Can you provide the rest of the lscpu output? There have been issues with Z14
vs Z15:
https://github.com/llvm/llvm-project/issues/53009

You're apparently not hitting that, but given that fact, you either are on a
slightly older CPU, or you have applied a patch to work around it. Because
otherwise your build instructions below would hit that problem, I think.

physical, 48 bits virtual Byte Order: Big Endian*
*Configure command:*
./configure --prefix=/home/edb/postgres/ --with-lz4 --with-zstd --with-llvm
--with-perl --with-python --with-tcl --with-openssl --enable-nls
--with-libxml --with-libxslt --with-systemd --with-libcurl --without-icu
--enable-debug --enable-cassert --with-pgport=5414

Hm, based on "--with-libcurl" this isn't upstream postgres, correct? Have you
verified the issue reproduces on upstream postgres?

Yes, I can reproduce this on upstream postgres master and v16 branch.

Here are details:

./configure --prefix=/home/edb/postgres/ --with-zstd --with-llvm
--with-perl --with-python --with-tcl --with-openssl --enable-nls
--with-libxml --with-libxslt --with-systemd --without-icu --enable-debug
--enable-cassert --with-pgport=5414 CFLAGS="-g -O0"

[edb@9428da9d2137 postgres]$ cat /etc/redhat-release
AlmaLinux release 9.2 (Turquoise Kodkod)

[edb@9428da9d2137 edbas]$ lscpu
Architecture:         s390x
CPU op-mode(s):       32-bit, 64-bit
Address sizes:        39 bits physical, 48 bits virtual
Byte Order:           Big Endian
CPU(s):               9
On-line CPU(s) list:  0-8
Vendor ID:            GenuineIntel
Model name:           Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
CPU family:           6
Model:                158
Thread(s) per core:   1
Core(s) per socket:   1
Socket(s):            9
Stepping:             10
BogoMIPS:             5200.00
Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
                      pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss
                      ht pbe syscall nx pdpe1gb lm constant_tsc rep_good
                      nopl xtopology nonstop_tsc cpuid pni pclmulqdq dtes64
                      ds_cpl ssse3 sdbg fma cx16 xtpr pcid sse4_1 sse4_2
                      movbe popcnt aes xsave avx f16c rdrand hypervisor
                      lahf_lm abm 3dnowprefetch fsgsbase bmi1 avx2 bmi2
                      erms xsaveopt arat
Caches (sum of all):
  L1d:                288 KiB (9 instances)
  L1i:                288 KiB (9 instances)
  L2:                 2.3 MiB (9 instances)
  L3:                 108 MiB (9 instances)
Vulnerabilities:
  Itlb multihit:      KVM: Mitigation: VMX unsupported
  L1tf:               Mitigation; PTE Inversion
  Mds:                Vulnerable; SMT Host state unknown
  Meltdown:           Vulnerable
  Mmio stale data:    Vulnerable
  Spec store bypass:  Vulnerable
  Spectre v1:         Vulnerable: __user pointer sanitization and usercopy
                      barriers only; no swapgs barriers
  Spectre v2:         Vulnerable, STIBP: disabled
  Srbds:              Unknown: Dependent on hypervisor status
  Tsx async abort:    Not affected

[edb@9428da9d2137 postgres]$ clang --version
clang version 15.0.7 (Red Hat 15.0.7-2.el9)
Target: s390x-ibm-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

[edb@9428da9d2137 postgres]$ rpm -qa | grep llvm
llvm-libs-15.0.7-1.el9.s390x
llvm-15.0.7-1.el9.s390x
llvm-test-15.0.7-1.el9.s390x
llvm-static-15.0.7-1.el9.s390x
llvm-devel-15.0.7-1.el9.s390x

Please let me know if any further information is required.

*Test case:*
CREATE TABLE rm32044_t1
(
pkey integer,
val text
);
CREATE TABLE rm32044_t2
(
pkey integer,
label text,
hidden boolean
);
CREATE TABLE rm32044_t3
(
pkey integer,
val integer
);
CREATE TABLE rm32044_t4
(
pkey integer
);
insert into rm32044_t1 values ( 1 , 'row1');
insert into rm32044_t1 values ( 2 , 'row2');
insert into rm32044_t2 values ( 1 , 'hidden', true);
insert into rm32044_t2 values ( 2 , 'visible', false);
insert into rm32044_t3 values (1 , 1);
insert into rm32044_t3 values (2 , 1);

postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey
= rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey =
rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.

I tried this on both master and 16, without hitting this issue.

If you can reproduce the issue on upstream postgres, can you share more about
your configuration?

Greetings,

Andres Freund

--
--

Thanks & Regards,
Suraj kharage,

edbpostgres.com