pg11.1 jit segv

Started by Justin Pryzbyabout 7 years ago21 messages
#1Justin Pryzby
pryzby@telsasoft.com

Crash is reproducible but only when JIT=on.

postgresql11-llvmjit-11.1-1PGDG.rhel7.x86_64

[2769871.453033] postmaster[8582]: segfault at 7f083bddb780 ip 00007f08127e814e sp 00007ffe463506e0 error 4
[2770774.470600] postmaster[29410]: segfault at 7f0812eeb6c8 ip 00007f08127eb4f0 sp 00007ffe463506e0 error 4

Core was generated by `postgres: telsasoft ts 192.168.122.11(41908) SELECT '.
Program terminated with signal 11, Segmentation fault.

(gdb) bt
#0 0x00007f08127e814e in ?? ()
#1 0x0000000000000000 in ?? ()

[pryzbyj@database ~]$ sudo -u postgres valgrind /usr/pgsql-11/bin/postgres --single -D /var/lib/pgsql/11/data ts <tmp/sql-jit-crash-2018-11-15

==26448== Conditional jump or move depends on uninitialised value(s)
==26448== at 0x1B510F09: getAdjustedPtr(llvm::IRBuilder<llvm::ConstantFolder, (anonymous namespace)::IRBuilderPrefixedInserter>&, llvm::DataLayout const&, llvm::Value*, llvm::APInt, llvm::Type*, llvm::Twine) (SROA.cpp:1531)
==26448== by 0x1B511C52: llvm::sroa::AllocaSliceRewriter::getNewAllocaSlicePtr(llvm::IRBuilder<llvm::ConstantFolder, (anonymous namespace)::IRBuilderPrefixedInserter>&, llvm::Type*) (SROA.cpp:2313)
==26448== by 0x1B516BA0: llvm::sroa::AllocaSliceRewriter::visitIntrinsicInst(llvm::IntrinsicInst&) (SROA.cpp:2921)
==26448== by 0x1B5190CC: visitCall (Instruction.def:190)
==26448== by 0x1B5190CC: llvm::InstVisitor<llvm::sroa::AllocaSliceRewriter, bool>::visit(llvm::Instruction&) (Instruction.def:190)
==26448== by 0x1B51947A: visit (InstVisitor.h:114)
==26448== by 0x1B51947A: llvm::sroa::AllocaSliceRewriter::visit((anonymous namespace)::Slice const*) (SROA.cpp:2262)
==26448== by 0x1B51D426: llvm::SROA::rewritePartition(llvm::AllocaInst&, llvm::sroa::AllocaSlices&, llvm::sroa::Partition&) (SROA.cpp:3921)
==26448== by 0x1B51E630: llvm::SROA::splitAlloca(llvm::AllocaInst&, llvm::sroa::AllocaSlices&) (SROA.cpp:4029)
==26448== by 0x1B51F25D: llvm::SROA::runOnAlloca(llvm::AllocaInst&) (SROA.cpp:4156)
==26448== by 0x1B52048A: llvm::SROA::runImpl(llvm::Function&, llvm::DominatorTree&, llvm::AssumptionCache&) (SROA.cpp:4243)
==26448== by 0x1B520E40: llvm::sroa::SROALegacyPass::runOnFunction(llvm::Function&) (SROA.cpp:4296)
==26448== by 0x1AC49C31: llvm::FPPassManager::runOnFunction(llvm::Function&) (LegacyPassManager.cpp:1514)
==26448== by 0x1B6A805E: RunPassOnSCC (Timer.h:149)
==26448== by 0x1B6A805E: RunAllPassesOnSCC (CallGraphSCCPass.cpp:419)
==26448== by 0x1B6A805E: (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) (CallGraphSCCPass.cpp:474)
==26448==
==26448== Conditional jump or move depends on uninitialised value(s)
==26448== at 0x1B510F09: getAdjustedPtr(llvm::IRBuilder<llvm::ConstantFolder, (anonymous namespace)::IRBuilderPrefixedInserter>&, llvm::DataLayout const&, llvm::Value*, llvm::APInt, llvm::Type*, llvm::Twine) (SROA.cpp:1531)
==26448== by 0x1B511C52: llvm::sroa::AllocaSliceRewriter::getNewAllocaSlicePtr(llvm::IRBuilder<llvm::ConstantFolder, (anonymous namespace)::IRBuilderPrefixedInserter>&, llvm::Type*) (SROA.cpp:2313)
==26448== by 0x1B515EF0: llvm::sroa::AllocaSliceRewriter::visitMemSetInst(llvm::MemSetInst&) (SROA.cpp:2656)
==26448== by 0x1B5190CC: visitCall (Instruction.def:190)
==26448== by 0x1B5190CC: llvm::InstVisitor<llvm::sroa::AllocaSliceRewriter, bool>::visit(llvm::Instruction&) (Instruction.def:190)
==26448== by 0x1B51947A: visit (InstVisitor.h:114)
==26448== by 0x1B51947A: llvm::sroa::AllocaSliceRewriter::visit((anonymous namespace)::Slice const*) (SROA.cpp:2262)
==26448== by 0x1B51D426: llvm::SROA::rewritePartition(llvm::AllocaInst&, llvm::sroa::AllocaSlices&, llvm::sroa::Partition&) (SROA.cpp:3921)
==26448== by 0x1B51E630: llvm::SROA::splitAlloca(llvm::AllocaInst&, llvm::sroa::AllocaSlices&) (SROA.cpp:4029)
==26448== by 0x1B51F25D: llvm::SROA::runOnAlloca(llvm::AllocaInst&) (SROA.cpp:4156)
==26448== by 0x1B52048A: llvm::SROA::runImpl(llvm::Function&, llvm::DominatorTree&, llvm::AssumptionCache&) (SROA.cpp:4243)
==26448== by 0x1B520E40: llvm::sroa::SROALegacyPass::runOnFunction(llvm::Function&) (SROA.cpp:4296)
==26448== by 0x1AC49C31: llvm::FPPassManager::runOnFunction(llvm::Function&) (LegacyPassManager.cpp:1514)
==26448== by 0x1B6A805E: RunPassOnSCC (Timer.h:149)
==26448== by 0x1B6A805E: RunAllPassesOnSCC (CallGraphSCCPass.cpp:419)
==26448== by 0x1B6A805E: (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) (CallGraphSCCPass.cpp:474)
==26448==

==26448== Invalid write of size 8
==26448== at 0x4C2E0C3: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1022)
==26448== by 0x824075: UnknownInlinedFun (string3.h:51)
==26448== by 0x824075: varstrfastcmp_locale (varlena.c:2135)
==26448== by 0x87C6F3: ApplySortComparator (sortsupport.h:224)
==26448== by 0x87C6F3: comparetup_heap (tuplesort.c:3560)
==26448== by 0x8793F4: qsort_tuple (qsort_tuple.c:112)
==26448== by 0x87EE53: tuplesort_performsort (tuplesort.c:1811)
==26448== by 0x628CE2: ExecSort (nodeSort.c:118)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x610D48: UnknownInlinedFun (executor.h:237)
==26448== by 0x610D48: fetch_input_tuple (nodeAgg.c:406)
==26448== by 0x61277F: agg_retrieve_direct (nodeAgg.c:1736)
==26448== by 0x61277F: ExecAgg (nodeAgg.c:1551)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x60A764: ExecScanFetch (execScan.c:95)
==26448== by 0x60A764: ExecScan (execScan.c:162)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== Address 0x1f77c1c0 is 0 bytes after a block of size 8,192 alloc'd
==26448== at 0x4C29C23: malloc (vg_replace_malloc.c:299)
==26448== by 0x86D5AD: AllocSetContextCreateExtended (aset.c:477)
==26448== by 0x87AA3D: tuplesort_begin_common (tuplesort.c:697)
==26448== by 0x87DAA9: tuplesort_begin_heap (tuplesort.c:812)
==26448== by 0x628C93: ExecSort (nodeSort.c:89)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x610D48: UnknownInlinedFun (executor.h:237)
==26448== by 0x610D48: fetch_input_tuple (nodeAgg.c:406)
==26448== by 0x61277F: agg_retrieve_direct (nodeAgg.c:1736)
==26448== by 0x61277F: ExecAgg (nodeAgg.c:1551)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x60A764: ExecScanFetch (execScan.c:95)
==26448== by 0x60A764: ExecScan (execScan.c:162)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x61B364: UnknownInlinedFun (executor.h:237)
==26448== by 0x61B364: MultiExecPrivateHash (nodeHash.c:164)
==26448== by 0x61B364: MultiExecHash (nodeHash.c:114)
==26448==
==26448== Invalid read of size 8
==26448== at 0x4C2E0CE: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1022)
==26448== by 0x824075: UnknownInlinedFun (string3.h:51)
==26448== by 0x824075: varstrfastcmp_locale (varlena.c:2135)
==26448== by 0x87C6F3: ApplySortComparator (sortsupport.h:224)
==26448== by 0x87C6F3: comparetup_heap (tuplesort.c:3560)
==26448== by 0x8793F4: qsort_tuple (qsort_tuple.c:112)
==26448== by 0x87EE53: tuplesort_performsort (tuplesort.c:1811)
==26448== by 0x628CE2: ExecSort (nodeSort.c:118)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x610D48: UnknownInlinedFun (executor.h:237)
==26448== by 0x610D48: fetch_input_tuple (nodeAgg.c:406)
==26448== by 0x61277F: agg_retrieve_direct (nodeAgg.c:1736)
==26448== by 0x61277F: ExecAgg (nodeAgg.c:1551)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x60A764: ExecScanFetch (execScan.c:95)
==26448== by 0x60A764: ExecScan (execScan.c:162)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== Address 0x1f77e200 is 0 bytes after a block of size 8,192 alloc'd
==26448== at 0x4C29C23: malloc (vg_replace_malloc.c:299)
==26448== by 0x86D5AD: AllocSetContextCreateExtended (aset.c:477)
==26448== by 0x87AA5A: tuplesort_begin_common (tuplesort.c:710)
==26448== by 0x87DAA9: tuplesort_begin_heap (tuplesort.c:812)
==26448== by 0x628C93: ExecSort (nodeSort.c:89)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x610D48: UnknownInlinedFun (executor.h:237)
==26448== by 0x610D48: fetch_input_tuple (nodeAgg.c:406)
==26448== by 0x61277F: agg_retrieve_direct (nodeAgg.c:1736)
==26448== by 0x61277F: ExecAgg (nodeAgg.c:1551)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x60A764: ExecScanFetch (execScan.c:95)
==26448== by 0x60A764: ExecScan (execScan.c:162)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x61B364: UnknownInlinedFun (executor.h:237)
==26448== by 0x61B364: MultiExecPrivateHash (nodeHash.c:164)
==26448== by 0x61B364: MultiExecHash (nodeHash.c:114)
==26448==
==26448== Invalid read of size 8
==26448== at 0x4C2E0C0: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1022)
==26448== by 0x824075: UnknownInlinedFun (string3.h:51)
==26448== by 0x824075: varstrfastcmp_locale (varlena.c:2135)
==26448== by 0x87C6F3: ApplySortComparator (sortsupport.h:224)
==26448== by 0x87C6F3: comparetup_heap (tuplesort.c:3560)
==26448== by 0x8793F4: qsort_tuple (qsort_tuple.c:112)
==26448== by 0x87EE53: tuplesort_performsort (tuplesort.c:1811)
==26448== by 0x628CE2: ExecSort (nodeSort.c:118)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x610D48: UnknownInlinedFun (executor.h:237)
==26448== by 0x610D48: fetch_input_tuple (nodeAgg.c:406)
==26448== by 0x61277F: agg_retrieve_direct (nodeAgg.c:1736)
==26448== by 0x61277F: ExecAgg (nodeAgg.c:1551)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x60A764: ExecScanFetch (execScan.c:95)
==26448== by 0x60A764: ExecScan (execScan.c:162)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== Address 0x1f77e210 is 16 bytes after a block of size 8,192 alloc'd
==26448== at 0x4C29C23: malloc (vg_replace_malloc.c:299)
==26448== by 0x86D5AD: AllocSetContextCreateExtended (aset.c:477)
==26448== by 0x87AA5A: tuplesort_begin_common (tuplesort.c:710)
==26448== by 0x87DAA9: tuplesort_begin_heap (tuplesort.c:812)
==26448== by 0x628C93: ExecSort (nodeSort.c:89)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x610D48: UnknownInlinedFun (executor.h:237)
==26448== by 0x610D48: fetch_input_tuple (nodeAgg.c:406)
==26448== by 0x61277F: agg_retrieve_direct (nodeAgg.c:1736)
==26448== by 0x61277F: ExecAgg (nodeAgg.c:1551)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x60A764: ExecScanFetch (execScan.c:95)
==26448== by 0x60A764: ExecScan (execScan.c:162)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x61B364: UnknownInlinedFun (executor.h:237)
==26448== by 0x61B364: MultiExecPrivateHash (nodeHash.c:164)
==26448== by 0x61B364: MultiExecHash (nodeHash.c:114)
==26448==
==26448==
==26448== More than 10000000 total errors detected. I'm not reporting any more.
==26448== Final error counts will be inaccurate. Go fix your program!
==26448== Rerun with --error-limit=no to disable this cutoff. Note
==26448== that errors may occur in your program without prior warning from
==26448== Valgrind, because errors are no longer being displayed.
==26448==
==26448==
==26448== Process terminating with default action of signal 11 (SIGSEGV)
==26448== Access not within mapped region at address 0x23337000
==26448== at 0x4C2E0CE: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1022)
==26448== by 0x824075: UnknownInlinedFun (string3.h:51)
==26448== by 0x824075: varstrfastcmp_locale (varlena.c:2135)
==26448== by 0x87C6F3: ApplySortComparator (sortsupport.h:224)
==26448== by 0x87C6F3: comparetup_heap (tuplesort.c:3560)
==26448== by 0x8793F4: qsort_tuple (qsort_tuple.c:112)
==26448== by 0x87EE53: tuplesort_performsort (tuplesort.c:1811)
==26448== by 0x628CE2: ExecSort (nodeSort.c:118)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x610D48: UnknownInlinedFun (executor.h:237)
==26448== by 0x610D48: fetch_input_tuple (nodeAgg.c:406)
==26448== by 0x61277F: agg_retrieve_direct (nodeAgg.c:1736)
==26448== by 0x61277F: ExecAgg (nodeAgg.c:1551)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== by 0x60A764: ExecScanFetch (execScan.c:95)
==26448== by 0x60A764: ExecScan (execScan.c:162)
==26448== by 0x609067: ExecProcNodeInstr (execProcnode.c:461)
==26448== If you believe this happened as a result of a stack
==26448== overflow in your program's main thread (unlikely but
==26448== possible), you can try to increase the size of the
==26448== main thread stack using the --main-stacksize= flag.
==26448== The main thread stack size used in this run was 8388608.
=
Let me know if there's anything else I can provide.

Justin

#2Andres Freund
andres@anarazel.de
In reply to: Justin Pryzby (#1)
Re: pg11.1 jit segv

Hi,

On 2018-11-15 16:39:59 -0600, Justin Pryzby wrote:

Crash is reproducible but only when JIT=on.

postgresql11-llvmjit-11.1-1PGDG.rhel7.x86_64

[2769871.453033] postmaster[8582]: segfault at 7f083bddb780 ip 00007f08127e814e sp 00007ffe463506e0 error 4
[2770774.470600] postmaster[29410]: segfault at 7f0812eeb6c8 ip 00007f08127eb4f0 sp 00007ffe463506e0 error 4

Core was generated by `postgres: telsasoft ts 192.168.122.11(41908) SELECT '.
Program terminated with signal 11, Segmentation fault.

Thanks for reporting!

(gdb) bt
#0 0x00007f08127e814e in ?? ()
#1 0x0000000000000000 in ?? ()

Could you enable jit_debugging_support and reproduce? That should give
a more useful backtrace.

It'd also be interesting if toggling jit_inline_above_cost=-1,
jit_optimize_above_cost=-1 and jit_tuple_deforming=false, each set
individually, make a difference.

Let me know if there's anything else I can provide.

Any way for me to reproduce this locally? I.e. can you narrow this down
to a reproducible test case?

Greetings,

Andres Freund

#3Justin Pryzby
pryzby@telsasoft.com
In reply to: Andres Freund (#2)
Re: pg11.1 jit segv

On Thu, Nov 15, 2018 at 02:47:55PM -0800, Andres Freund wrote:

(gdb) bt
#0 0x00007f08127e814e in ?? ()
#1 0x0000000000000000 in ?? ()

Could you enable jit_debugging_support and reproduce? That should give
a more useful backtrace.

Core was generated by `postgres: pryzbyj ts [local] EXPLAIN '.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f819e227cb9 in __memcpy_ssse3_back () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install audit-libs-2.8.1-3.el7_5.1.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 cyrus-sasl-lib-2.1.26-23.el7.x86_64 elfutils-libelf-0.170-4.el7.x86_64 elfutils-libs-0.170-4.el7.x86_64 glib4
(gdb) bt
#0 0x00007f819e227cb9 in __memcpy_ssse3_back () from /lib64/libc.so.6
#1 0x0000000000824076 in memcpy (__len=18446744073709551612, __src=0x6c35818, __dest=<optimized out>) at /usr/include/bits/string3.h:51
#2 varstrfastcmp_locale (x=113465364, y=113465636, ssup=<optimized out>) at varlena.c:2135
#3 0x000000000087c6f4 in ApplySortComparator (ssup=0x6c33c40, isNull2=<optimized out>, datum2=<optimized out>, isNull1=false, datum1=<optimized out>) at ../../../../src/include/utils/sortsupport.h:224
#4 comparetup_heap (a=<optimized out>, b=<optimized out>, state=0x6c337f0) at tuplesort.c:3560
#5 0x00000000008793f5 in qsort_tuple (a=0x8311958, n=5272, cmp_tuple=0x87c2d0 <comparetup_heap>, state=state@entry=0x6c337f0) at qsort_tuple.c:112
#6 0x000000000087d2db in tuplesort_sort_memtuples (state=state@entry=0x6c337f0) at tuplesort.c:3320
#7 0x000000000087ee54 in tuplesort_performsort (state=state@entry=0x6c337f0) at tuplesort.c:1811
#8 0x0000000000628ce3 in ExecSort (pstate=0x5e50ed8) at nodeSort.c:118
#9 0x0000000000609068 in ExecProcNodeInstr (node=0x5e50ed8) at execProcnode.c:461
#10 0x0000000000610d49 in ExecProcNode (node=0x5e50ed8) at ../../../src/include/executor/executor.h:237
#11 fetch_input_tuple (aggstate=aggstate@entry=0x5e50aa0) at nodeAgg.c:406
#12 0x0000000000612780 in agg_retrieve_direct (aggstate=0x5e50aa0) at nodeAgg.c:1736
#13 ExecAgg (pstate=0x5e50aa0) at nodeAgg.c:1551
#14 0x0000000000609068 in ExecProcNodeInstr (node=0x5e50aa0) at execProcnode.c:461
#15 0x000000000060a765 in ExecScanFetch (recheckMtd=0x62be50 <SubqueryRecheck>, accessMtd=0x62be70 <SubqueryNext>, node=0x5e508e0) at execScan.c:95
#16 ExecScan (node=0x5e508e0, accessMtd=0x62be70 <SubqueryNext>, recheckMtd=0x62be50 <SubqueryRecheck>) at execScan.c:162
#17 0x0000000000609068 in ExecProcNodeInstr (node=0x5e508e0) at execProcnode.c:461
#18 0x000000000061b365 in ExecProcNode (node=0x5e508e0) at ../../../src/include/executor/executor.h:237
#19 MultiExecPrivateHash (node=0x5e50720) at nodeHash.c:164
#20 MultiExecHash (node=node@entry=0x5e50720) at nodeHash.c:114
#21 0x0000000000609610 in MultiExecProcNode (node=node@entry=0x5e50720) at execProcnode.c:501
#22 0x000000000061bd98 in ExecHashJoinImpl (parallel=false, pstate=0x5e4f620) at nodeHashjoin.c:290
#23 ExecHashJoin (pstate=0x5e4f620) at nodeHashjoin.c:565
#24 0x0000000000609068 in ExecProcNodeInstr (node=0x5e4f620) at execProcnode.c:461
#25 0x000000000061b365 in ExecProcNode (node=0x5e4f620) at ../../../src/include/executor/executor.h:237
#26 MultiExecPrivateHash (node=0x5e4f460) at nodeHash.c:164
#27 MultiExecHash (node=node@entry=0x5e4f460) at nodeHash.c:114
#28 0x0000000000609610 in MultiExecProcNode (node=node@entry=0x5e4f460) at execProcnode.c:501
#29 0x000000000061bd98 in ExecHashJoinImpl (parallel=false, pstate=0x38344f8) at nodeHashjoin.c:290
#30 ExecHashJoin (pstate=0x38344f8) at nodeHashjoin.c:565
#31 0x0000000000609068 in ExecProcNodeInstr (node=0x38344f8) at execProcnode.c:461
#32 0x000000000061b365 in ExecProcNode (node=0x38344f8) at ../../../src/include/executor/executor.h:237
#33 MultiExecPrivateHash (node=0x38343c8) at nodeHash.c:164
#34 MultiExecHash (node=node@entry=0x38343c8) at nodeHash.c:114
#35 0x0000000000609610 in MultiExecProcNode (node=node@entry=0x38343c8) at execProcnode.c:501
#36 0x000000000061bd98 in ExecHashJoinImpl (parallel=false, pstate=0x3833148) at nodeHashjoin.c:290
#37 ExecHashJoin (pstate=0x3833148) at nodeHashjoin.c:565
#38 0x0000000000609068 in ExecProcNodeInstr (node=0x3833148) at execProcnode.c:461
#39 0x0000000000628cd6 in ExecProcNode (node=0x3833148) at ../../../src/include/executor/executor.h:237
#40 ExecSort (pstate=0x3832bd0) at nodeSort.c:107
#41 0x0000000000609068 in ExecProcNodeInstr (node=0x3832bd0) at execProcnode.c:461
#42 0x0000000000610d49 in ExecProcNode (node=0x3832bd0) at ../../../src/include/executor/executor.h:237
#43 fetch_input_tuple (aggstate=aggstate@entry=0x3832e30) at nodeAgg.c:406
#44 0x0000000000612780 in agg_retrieve_direct (aggstate=0x3832e30) at nodeAgg.c:1736
#45 ExecAgg (pstate=0x3832e30) at nodeAgg.c:1551
#46 0x0000000000609068 in ExecProcNodeInstr (node=0x3832e30) at execProcnode.c:461
#47 0x00000000006025ba in ExecProcNode (node=0x3832e30) at ../../../src/include/executor/executor.h:237
#48 ExecutePlan (execute_once=<optimized out>, dest=0xced3a0 <donothingDR>, direction=<optimized out>, numberTuples=0, sendTuples=true, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x3832e30,
estate=0x3832970) at execMain.c:1723
#49 standard_ExecutorRun (queryDesc=0x5c3ec68, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:364
#50 0x00007f8191ade08d in pgss_ExecutorRun (queryDesc=0x5c3ec68, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:892
#51 0x00007f80eb8bf5e1 in explain_ExecutorRun (queryDesc=0x5c3ec68, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at auto_explain.c:268
#52 0x00000000005a497c in ExplainOnePlan (plannedstmt=plannedstmt@entry=0x5c32cd8, into=into@entry=0x0, es=es@entry=0x207aaf8,
queryString=queryString@entry=0x1ec5d58 "explain(analyze)\nSELECT site_office as site_gran, site_location AS bs, sect_mscid || ' (' || gsm_site_name_to_sect_name (site_name, sect_num, sect_name) || ')' AS sitename, data_u5
#53 0x00000000005a4c87 in ExplainOneQuery (query=<optimized out>, cursorOptions=<optimized out>, into=0x0, es=0x207aaf8,
queryString=0x1ec5d58 "explain(analyze)\nSELECT site_office as site_gran, site_location AS bs, sect_mscid || ' (' || gsm_site_name_to_sect_name (site_name, sect_num, sect_name) || ')' AS sitename, data_utrancell.period as1
#54 0x00000000005a5174 in ExplainQuery (pstate=pstate@entry=0x2078dc8, stmt=stmt@entry=0x1fe1df8,
queryString=queryString@entry=0x1ec5d58 "explain(analyze)\nSELECT site_office as site_gran, site_location AS bs, sect_mscid || ' (' || gsm_site_name_to_sect_name (site_name, sect_num, sect_name) || ')' AS sitename, data_u4
#55 0x00000000007469dd in standard_ProcessUtility (pstmt=pstmt@entry=0x1fe1ea8,
queryString=queryString@entry=0x1ec5d58 "explain(analyze)\nSELECT site_office as site_gran, site_location AS bs, sect_mscid || ' (' || gsm_site_name_to_sect_name (site_name, sect_num, sect_name) || ')' AS sitename, data_u5
#56 0x00007f8191ae0025 in pgss_ProcessUtility (pstmt=0x1fe1ea8,
queryString=0x1ec5d58 "explain(analyze)\nSELECT site_office as site_gran, site_location AS bs, sect_mscid || ' (' || gsm_site_name_to_sect_name (site_name, sect_num, sect_name) || ')' AS sitename, data_utrancell.period as5
#57 0x0000000000743e86 in PortalRunUtility (portal=0x1f12be0, pstmt=0x1fe1ea8, isTopLevel=<optimized out>, setHoldSnapshot=<optimized out>, dest=0x2078d38, completionTag=0x7ffc304d0e20 "") at pquery.c:1178
#58 0x0000000000744bc2 in FillPortalStore (portal=portal@entry=0x1f12be0, isTopLevel=isTopLevel@entry=true) at pquery.c:1038
#59 0x000000000074564f in PortalRun (portal=<optimized out>, count=9223372036854775807, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x7f81a09e2dc8, altdest=0x7f81a09e2dc8, completionTag=0x7ffc304d1040 "")
at pquery.c:768
#60 0x0000000000741577 in exec_simple_query (query_string=<optimized out>) at postgres.c:1145
#61 0x0000000000742842 in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:4182
#62 0x00000000004801dd in BackendRun (port=0x1ec3f70) at postmaster.c:4361
#63 BackendStartup (port=0x1ec3f70) at postmaster.c:4033
#64 ServerLoop () at postmaster.c:1706
#65 0x00000000006d5d49 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1e98280) at postmaster.c:1379
#66 0x000000000048103f in main (argc=3, argv=0x1e98280) at main.c:228
(

(gdb) bt f
#0 0x00007f819e227cb9 in __memcpy_ssse3_back () from /lib64/libc.so.6
No symbol table info available.
#1 0x0000000000824076 in memcpy (__len=18446744073709551612, __src=0x6c35818, __dest=<optimized out>) at /usr/include/bits/string3.h:51
No locals.
#2 varstrfastcmp_locale (x=113465364, y=113465636, ssup=<optimized out>) at varlena.c:2135
arg1 = 0x6c35814
arg2 = 0x6c35924
arg1_match = false
sss = 0x6c33d60
a1p = 0x6c35818 ""
a2p = 0x6c35928 ""
len1 = -4
len2 = -4
result = <optimized out>
#3 0x000000000087c6f4 in ApplySortComparator (ssup=0x6c33c40, isNull2=<optimized out>, datum2=<optimized out>, isNull1=false, datum1=<optimized out>) at ../../../../src/include/utils/sortsupport.h:224
compare = <optimized out>
isNull2 = false
isNull1 = false
ssup = 0x6c33c40
datum2 = <optimized out>
compare = <optimized out>
#4 comparetup_heap (a=<optimized out>, b=<optimized out>, state=0x6c337f0) at tuplesort.c:3560
sortKey = 0x6c33c40
ltup = {t_len = 248, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 42884}, t_tableOid = 0, t_data = 0x6c357f8}
rtup = {t_len = 248, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 1}, t_tableOid = 0, t_data = 0x6c35908}
tupDesc = 0x5ede208
nkey = 1
attno = 2
datum1 = <optimized out>
datum2 = <optimized out>
isnull1 = false
isnull2 = false
#5 0x00000000008793f5 in qsort_tuple (a=0x8311958, n=5272, cmp_tuple=0x87c2d0 <comparetup_heap>, state=state@entry=0x6c337f0) at qsort_tuple.c:112
pa = <optimized out>
pb = <optimized out>
pc = <optimized out>
pd = <optimized out>
pl = <optimized out>
pm = 0x8311970
pn = <optimized out>
d1 = <optimized out>
d2 = <optimized out>
r = <optimized out>
presorted = 1
#6 0x000000000087d2db in tuplesort_sort_memtuples (state=state@entry=0x6c337f0) at tuplesort.c:3320
No locals.
#7 0x000000000087ee54 in tuplesort_performsort (state=state@entry=0x6c337f0) at tuplesort.c:1811
oldcontext = 0x3832860
__func__ = "tuplesort_performsort"
#8 0x0000000000628ce3 in ExecSort (pstate=0x5e50ed8) at nodeSort.c:118
plannode = <optimized out>
outerNode = 0x5e50fe8
tupDesc = <optimized out>
node = 0x5e50ed8
estate = 0x3832970
dir = ForwardScanDirection
tuplesortstate = 0x6c337f0
slot = <optimized out>
#9 0x0000000000609068 in ExecProcNodeInstr (node=0x5e50ed8) at execProcnode.c:461
result = <optimized out>
#10 0x0000000000610d49 in ExecProcNode (node=0x5e50ed8) at ../../../src/include/executor/executor.h:237
No locals.
#11 fetch_input_tuple (aggstate=aggstate@entry=0x5e50aa0) at nodeAgg.c:406
slot = <optimized out>
#12 0x0000000000612780 in agg_retrieve_direct (aggstate=0x5e50aa0) at nodeAgg.c:1736
econtext = 0x5e50e28
firstSlot = 0x674fa50
numGroupingSets = 1
node = 0x5c03788
tmpcontext = 0x5e50cc8
peragg = 0x6754918
outerslot = <optimized out>
nextSetSize = <optimized out>
pergroups = 0x6755928
result = <optimized out>
hasGroupingSets = false
currentSet = <optimized out>
numReset = 1
i = <optimized out>
#13 ExecAgg (pstate=0x5e50aa0) at nodeAgg.c:1551
node = 0x5e50aa0
result = 0x0
#14 0x0000000000609068 in ExecProcNodeInstr (node=0x5e50aa0) at execProcnode.c:461
result = <optimized out>
#15 0x000000000060a765 in ExecScanFetch (recheckMtd=0x62be50 <SubqueryRecheck>, accessMtd=0x62be70 <SubqueryNext>, node=0x5e508e0) at execScan.c:95
estate = 0x3832970
#16 ExecScan (node=0x5e508e0, accessMtd=0x62be70 <SubqueryNext>, recheckMtd=0x62be50 <SubqueryRecheck>) at execScan.c:162
econtext = 0x5e509f0
qual = 0x0
projInfo = 0x675dbb8
#17 0x0000000000609068 in ExecProcNodeInstr (node=0x5e508e0) at execProcnode.c:461
result = <optimized out>
#18 0x000000000061b365 in ExecProcNode (node=0x5e508e0) at ../../../src/include/executor/executor.h:237
No locals.
#19 MultiExecPrivateHash (node=0x5e50720) at nodeHash.c:164
outerNode = <optimized out>
hashkeys = 0x67662f0
econtext = 0x5e50830
hashvalue = 0
hashtable = 0x6a7d468
#20 MultiExecHash (node=node@entry=0x5e50720) at nodeHash.c:114
No locals.
#21 0x0000000000609610 in MultiExecProcNode (node=node@entry=0x5e50720) at execProcnode.c:501
__func__ = "MultiExecProcNode"
#22 0x000000000061bd98 in ExecHashJoinImpl (parallel=false, pstate=0x5e4f620) at nodeHashjoin.c:290
outerNode = 0x5e4f8e0
hashNode = <optimized out>
econtext = <optimized out>
node = 0x5e4f620
joinqual = <optimized out>
otherqual = <optimized out>
hashtable = 0x6a7d468
hashvalue = 0
batchno = 48
parallel_state = <optimized out>
#23 ExecHashJoin (pstate=0x5e4f620) at nodeHashjoin.c:565
No locals.
#24 0x0000000000609068 in ExecProcNodeInstr (node=0x5e4f620) at execProcnode.c:461
result = <optimized out>
#25 0x000000000061b365 in ExecProcNode (node=0x5e4f620) at ../../../src/include/executor/executor.h:237
No locals.
#26 MultiExecPrivateHash (node=0x5e4f460) at nodeHash.c:164
outerNode = <optimized out>
hashkeys = 0x676ef80
econtext = 0x5e4f570
hashvalue = 0
hashtable = 0x6a7cf48
#27 MultiExecHash (node=node@entry=0x5e4f460) at nodeHash.c:114
No locals.
#28 0x0000000000609610 in MultiExecProcNode (node=node@entry=0x5e4f460) at execProcnode.c:501
__func__ = "MultiExecProcNode"
#29 0x000000000061bd98 in ExecHashJoinImpl (parallel=false, pstate=0x38344f8) at nodeHashjoin.c:290
outerNode = 0x5c4b4b8
hashNode = <optimized out>
econtext = <optimized out>
node = 0x38344f8
joinqual = <optimized out>
otherqual = <optimized out>
hashtable = 0x6a7cf48
hashvalue = 0
batchno = 0
parallel_state = <optimized out>
#30 ExecHashJoin (pstate=0x38344f8) at nodeHashjoin.c:565
No locals.
#31 0x0000000000609068 in ExecProcNodeInstr (node=0x38344f8) at execProcnode.c:461
result = <optimized out>
#32 0x000000000061b365 in ExecProcNode (node=0x38344f8) at ../../../src/include/executor/executor.h:237
No locals.
#33 MultiExecPrivateHash (node=0x38343c8) at nodeHash.c:164
outerNode = <optimized out>
hashkeys = 0x6778c40
econtext = 0x3834048
hashvalue = 0
hashtable = 0x6a7ce38
#34 MultiExecHash (node=node@entry=0x38343c8) at nodeHash.c:114
No locals.
#35 0x0000000000609610 in MultiExecProcNode (node=node@entry=0x38343c8) at execProcnode.c:501
__func__ = "MultiExecProcNode"
#36 0x000000000061bd98 in ExecHashJoinImpl (parallel=false, pstate=0x3833148) at nodeHashjoin.c:290
outerNode = 0x3833408
hashNode = <optimized out>
econtext = <optimized out>
node = 0x3833148
joinqual = <optimized out>
otherqual = <optimized out>
hashtable = 0x6a7ce38
hashvalue = 0
batchno = 96457208
parallel_state = <optimized out>
#37 ExecHashJoin (pstate=0x3833148) at nodeHashjoin.c:565
No locals.
#38 0x0000000000609068 in ExecProcNodeInstr (node=0x3833148) at execProcnode.c:461
result = <optimized out>
#39 0x0000000000628cd6 in ExecProcNode (node=0x3833148) at ../../../src/include/executor/executor.h:237
No locals.
#40 ExecSort (pstate=0x3832bd0) at nodeSort.c:107
plannode = <optimized out>
outerNode = 0x3833148
tupDesc = <optimized out>
node = 0x3832bd0
estate = 0x3832970
dir = ForwardScanDirection
tuplesortstate = 0x6c0f6f0
slot = <optimized out>
#41 0x0000000000609068 in ExecProcNodeInstr (node=0x3832bd0) at execProcnode.c:461
result = <optimized out>
#42 0x0000000000610d49 in ExecProcNode (node=0x3832bd0) at ../../../src/include/executor/executor.h:237
No locals.
#43 fetch_input_tuple (aggstate=aggstate@entry=0x3832e30) at nodeAgg.c:406
slot = <optimized out>
#44 0x0000000000612780 in agg_retrieve_direct (aggstate=0x3832e30) at nodeAgg.c:1736
econtext = 0x3833098
firstSlot = 0x677a550
numGroupingSets = 1
node = 0x5c06950
tmpcontext = 0x3832ce0
peragg = 0x67873d0
outerslot = <optimized out>
nextSetSize = <optimized out>
pergroups = 0x6a752e8
result = <optimized out>
hasGroupingSets = false
currentSet = <optimized out>
numReset = 1
i = <optimized out>
#45 ExecAgg (pstate=0x3832e30) at nodeAgg.c:1551
node = 0x3832e30
result = 0x0
#46 0x0000000000609068 in ExecProcNodeInstr (node=0x3832e30) at execProcnode.c:461
result = <optimized out>
#47 0x00000000006025ba in ExecProcNode (node=0x3832e30) at ../../../src/include/executor/executor.h:237
No locals.
#48 ExecutePlan (execute_once=<optimized out>, dest=0xced3a0 <donothingDR>, direction=<optimized out>, numberTuples=0, sendTuples=true, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x3832e30,
estate=0x3832970) at execMain.c:1723
slot = <optimized out>
current_tuple_count = 0
#49 standard_ExecutorRun (queryDesc=0x5c3ec68, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:364
estate = 0x3832970
operation = CMD_SELECT
dest = 0xced3a0 <donothingDR>
sendTuples = <optimized out>
oldcontext = 0x2078b00
__func__ = "standard_ExecutorRun"
#50 0x00007f8191ade08d in pgss_ExecutorRun (queryDesc=0x5c3ec68, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:892
save_exception_stack = 0x7ffc304d0890
save_context_stack = 0x0
local_sigjmp_buf = {{__jmpbuf = {34056952, -4317634825662262871, 140721118841376, 96726120, 0, 0, 4318053775260225961, 4258026755233780137}, __mask_was_saved = 0, __saved_mask = {__val = {33258448, 2, 8603255,
33258448, 5392503, 16, 0, 72057594037927936, 336, 1, 58927200, 58927472, 8861374, 7, 1, 58928688}}}}
#51 0x00007f80eb8bf5e1 in explain_ExecutorRun (queryDesc=0x5c3ec68, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at auto_explain.c:268
save_exception_stack = 0x7ffc304d0cf0
save_context_stack = 0x0
local_sigjmp_buf = {{__jmpbuf = {34056952, -4317634825662262871, 140721118841376, 96726120, 0, 0, 4318053775153271209, 4257441557765001641}, __mask_was_saved = 0, __saved_mask = {__val = {336, 1, 58927200, 96726120,
8861374, 2147483647, 1, 34048768, 6356030, 96726120, 96726120, 13907488, 140191684360137, 13554592, 810355232, 34056952}}}}
#52 0x00000000005a497c in ExplainOnePlan (plannedstmt=plannedstmt@entry=0x5c32cd8, into=into@entry=0x0, es=es@entry=0x207aaf8,
queryString=queryString@entry=0x1ec5d58 "explain(analyze)\nSELECT site_office as site_gran, site_location AS bs, sect_mscid || ' (' || gsm_site_name_to_sect_name (site_name, sect_num, sect_name) || ')' AS sitename, data_u5
dir = <optimized out>
dest = <optimized out>
queryDesc = 0x5c3ec68
starttime = {tv_sec = 2774013, tv_nsec = 519809227}
totaltime = 0
eflags = <optimized out>
instrument_option = <optimized out>
#53 0x00000000005a4c87 in ExplainOneQuery (query=<optimized out>, cursorOptions=<optimized out>, into=0x0, es=0x207aaf8,
queryString=0x1ec5d58 "explain(analyze)\nSELECT site_office as site_gran, site_location AS bs, sect_mscid || ' (' || gsm_site_name_to_sect_name (site_name, sect_num, sect_name) || ')' AS sitename, data_utrancell.period as1
plan = 0x5c32cd8
planstart = {tv_sec = 2774013, tv_nsec = 111256886}
planduration = {tv_sec = 0, tv_nsec = 408551920}
params = 0x0
queryString = 0x1ec5d58 "explain(analyze)\nSELECT site_office as site_gran, site_location AS bs, sect_mscid || ' (' || gsm_site_name_to_sect_name (site_name, sect_num, sect_name) || ')' AS sitename, data_utrancell.per.
es = 0x207aaf8
into = 0x0
queryEnv = 0x0
cursorOptions = <optimized out>
query = <optimized out>
#54 0x00000000005a5174 in ExplainQuery (pstate=pstate@entry=0x2078dc8, stmt=stmt@entry=0x1fe1df8,
queryString=queryString@entry=0x1ec5d58 "explain(analyze)\nSELECT site_office as site_gran, site_location AS bs, sect_mscid || ' (' || gsm_site_name_to_sect_name (site_name, sect_num, sect_name) || ')' AS sitename, data_u4
l = 0x20e15a8
es = 0x207aaf8
tstate = <optimized out>
rewritten = <optimized out>
lc = <optimized out>
timing_set = <optimized out>
summary_set = <optimized out>
__func__ = "ExplainQuery"
#55 0x00000000007469dd in standard_ProcessUtility (pstmt=pstmt@entry=0x1fe1ea8,
queryString=queryString@entry=0x1ec5d58 "explain(analyze)\nSELECT site_office as site_gran, site_location AS bs, sect_mscid || ' (' || gsm_site_name_to_sect_name (site_name, sect_num, sect_name) || ')' AS sitename, data_u5
parsetree = 0x1fe1df8
isTopLevel = true
isAtomicContext = false
pstate = 0x2078dc8
__func__ = "standard_ProcessUtility"
#56 0x00007f8191ae0025 in pgss_ProcessUtility (pstmt=0x1fe1ea8,
queryString=0x1ec5d58 "explain(analyze)\nSELECT site_office as site_gran, site_location AS bs, sect_mscid || ' (' || gsm_site_name_to_sect_name (site_name, sect_num, sect_name) || ')' AS sitename, data_utrancell.period as5
save_exception_stack = 0x7ffc304d0f10
save_context_stack = 0x0
local_sigjmp_buf = {{__jmpbuf = {32582624, -4317636606361366103, 34049336, 32848032, 32582624, 2, 4318053775111328169, 4258026755403256233}, __mask_was_saved = 0, __saved_mask = {__val = {0, 0, 8860127,
140721118842400, 16, 32088528, 32847712, 24, 34049336, 140721118842400, 8860127, 32848032, 32582624, 32848032, 8926208, 2}}}}
duration = {tv_sec = 33430760, tv_nsec = 33985024}
bufusage = {shared_blks_hit = -140721118842111, shared_blks_read = 1028, shared_blks_dirtied = 257, shared_blks_written = 32, local_blks_hit = 528280977713, local_blks_read = 140721118842112, local_blks_dirtied = 4,
local_blks_written = 0, temp_blks_read = 511101108338, temp_blks_written = 0, blk_read_time = {tv_sec = 140721118842111, tv_nsec = 140721118842160}, blk_write_time = {tv_sec = 2, tv_nsec = 0}}
start = {tv_sec = 2774013, tv_nsec = 108551544}
rows = <optimized out>
bufusage_start = {shared_blks_hit = 900, shared_blks_read = 0, shared_blks_dirtied = 0, shared_blks_written = 0, local_blks_hit = <optimized out>, local_blks_read = <optimized out>,
local_blks_dirtied = <optimized out>, local_blks_written = <optimized out>, temp_blks_read = <optimized out>, temp_blks_written = <optimized out>, blk_read_time = {tv_sec = <optimized out>,
tv_nsec = <optimized out>}, blk_write_time = {tv_sec = <optimized out>, tv_nsec = <optimized out>}}
parsetree = <optimized out>
#57 0x0000000000743e86 in PortalRunUtility (portal=0x1f12be0, pstmt=0x1fe1ea8, isTopLevel=<optimized out>, setHoldSnapshot=<optimized out>, dest=0x2078d38, completionTag=0x7ffc304d0e20 "") at pquery.c:1178
utilityStmt = <optimized out>
snapshot = 0x1f538a0
#58 0x0000000000744bc2 in FillPortalStore (portal=portal@entry=0x1f12be0, isTopLevel=isTopLevel@entry=true) at pquery.c:1038
treceiver = 0x2078d38
completionTag = "\000\377\377\377\000\000\000\000\260z\372\001\000\000\000\000\360\240\351\001\000\000\000\000\245c\207\000\000\000\000\000\020\214\a\002\000\000\000\000\260z\372\001\000\000\000\000\a\000\000\000\000\"
__func__ = "FillPortalStore"
#59 0x000000000074564f in PortalRun (portal=<optimized out>, count=9223372036854775807, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x7f81a09e2dc8, altdest=0x7f81a09e2dc8, completionTag=0x7ffc304d1040 "")
at pquery.c:768
save_exception_stack = 0x7ffc304d1220
save_context_stack = 0x0
local_sigjmp_buf = {{__jmpbuf = {0, -4317636606344588887, 33431128, 140721118842944, 32582624, 2, 4318053774943556009, -4317633223726624343}, __mask_was_saved = 0, __saved_mask = {__val = {480, 10132060, 8746404, 1,
32590832, 10476062, 34048768, 2, 1, 2, 112, 33431080, 32101840, 10233096, 8861374, 2}}}}
result = <optimized out>
nprocessed = <optimized out>
saveTopTransactionResourceOwner = 0x1ed5a08
saveTopTransactionContext = 0x1f53760
saveActivePortal = 0x0
saveResourceOwner = 0x1ed5a08
savePortalContext = 0x0
saveMemoryContext = 0x1f53760
__func__ = "PortalRun"
#60 0x0000000000741577 in exec_simple_query (query_string=<optimized out>) at postgres.c:1145
dest = <optimized out>
oldcontext = <optimized out>
parsetree_list = <optimized out>
parsetree_item = <optimized out>
save_log_statement_stats = <optimized out>
was_logged = <optimized out>
use_implicit_block = <optimized out>
msec_str = "\000 \000\000\000\000\000\000?\252c\000\000\000\000\000\247\235\354\001\000\000\000\000p?\354\001\002\000\000"
__func__ = "exec_simple_query"
#61 0x0000000000742842 in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:4182
firstchar = 32267608
input_message = {
data = 0x1ec5d58 "explain(analyze)\nSELECT site_office as site_gran, site_location AS bs, sect_mscid || ' (' || gsm_site_name_to_sect_name (site_name, sect_num, sect_name) || ')' AS sitename, data_utrancell.period a}
local_sigjmp_buf = {{__jmpbuf = {32292536, -4317633050175762007, 32259952, 0, 0, 32260608, 4318053775948091817, -4317633221742456407}, __mask_was_saved = 1, __saved_mask = {__val = {0, 32260608, 140194680307352, 0,
206158430256, 140721118843760, 140721118843552, 32102304, 32080672, 16, 66, 32102304, 32292536, 0, 32292616, 32292784}}}}
send_ready_for_query = false
disable_idle_in_transaction_timeout = <optimized out>
__func__ = "PostgresMain"
#62 0x00000000004801dd in BackendRun (port=0x1ec3f70) at postmaster.c:4361
ac = 1
secs = 595637552
usecs = 441639
i = 1
av = 0x1ecbfe8
maxac = <optimized out>
#63 BackendStartup (port=0x1ec3f70) at postmaster.c:4033
bn = 0x1ec4200
pid = 0
#64 ServerLoop () at postmaster.c:1706
rmask = {fds_bits = {32, 0 <repeats 15 times>}}
selres = <optimized out>
now = <optimized out>
readmask = {fds_bits = {120, 0 <repeats 15 times>}}
last_lockfile_recheck_time = <optimized out>
last_touch_time = 1542322298
__func__ = "ServerLoop"
#65 0x00000000006d5d49 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1e98280) at postmaster.c:1379
opt = <optimized out>
status = <optimized out>
userDoption = <optimized out>
listen_addr_saved = true
i = <optimized out>
output_config_variable = <optimized out>
__func__ = "PostmasterMain"
#66 0x000000000048103f in main (argc=3, argv=0x1e98280) at main.c:228
No locals.
(gdb)

It'd also be interesting if toggling jit_inline_above_cost=-1,
jit_optimize_above_cost=-1 and jit_tuple_deforming=false, each set
individually, make a difference.

Crashes with jit_tuple_deforming=true (others off).

Doesn't crash with other combinations:
Options: Inlining false, Optimization true, Expressions true, Deforming false
Options: Inlining true, Optimization false, Expressions true, Deforming false

Any way for me to reproduce this locally? I.e. can you narrow this down
to a reproducible test case?

Will work on it.

Justin

#4Andres Freund
andres@anarazel.de
In reply to: Justin Pryzby (#3)
Re: pg11.1 jit segv

Hi,

On 2018-11-15 17:03:35 -0600, Justin Pryzby wrote:

On Thu, Nov 15, 2018 at 02:47:55PM -0800, Andres Freund wrote:

(gdb) bt
#0 0x00007f08127e814e in ?? ()
#1 0x0000000000000000 in ?? ()

Could you enable jit_debugging_support and reproduce? That should give
a more useful backtrace.

Core was generated by `postgres: pryzbyj ts [local] EXPLAIN '.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f819e227cb9 in __memcpy_ssse3_back () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install audit-libs-2.8.1-3.el7_5.1.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 cyrus-sasl-lib-2.1.26-23.el7.x86_64 elfutils-libelf-0.170-4.el7.x86_64 elfutils-libs-0.170-4.el7.x86_64 glib4
(gdb) bt
#0 0x00007f819e227cb9 in __memcpy_ssse3_back () from /lib64/libc.so.6
...

Huh, that's the same crash? Because I don't see any evalexpr functions
in the stack, and without those the above bt should have worked...

It'd also be interesting if toggling jit_inline_above_cost=-1,
jit_optimize_above_cost=-1 and jit_tuple_deforming=false, each set
individually, make a difference.

Crashes with jit_tuple_deforming=true (others off).

Doesn't crash with other combinations:
Options: Inlining false, Optimization true, Expressions true, Deforming false
Options: Inlining true, Optimization false, Expressions true, Deforming false

Oh, interesting. That helps. Could you perhaps show an EXPLAIN VERBOSE
of the query plan?

Any way for me to reproduce this locally? I.e. can you narrow this down
to a reproducible test case?

Will work on it.

Thanks!

Greetings,

Andres Freund

#5Justin Pryzby
pryzby@telsasoft.com
In reply to: Andres Freund (#4)
Re: pg11.1 jit segv

On Thu, Nov 15, 2018 at 03:14:01PM -0800, Andres Freund wrote:

Huh, that's the same crash? Because I don't see any evalexpr functions
in the stack, and without those the above bt should have worked...

TTBOMK it's the same ..

Is it odd if i'm seeing this: (odd because of "ANOTHER") ?
I guess that's maybe because it's running parallel query ?
psql:tmp/sql-jit-crash-2018-11-15.jtp:16: WARNING: terminating connection because of crash of another server process

I'm started trying to minimize the query. Here's what's left:

SELECT * FROM
daily_eric_umts_rnc_utrancell_view t2
JOIN (SELECT * FROM eric_umts_rbs_sector_carrier_view t1 WHERE t1.start_time >= '2018-07-01 00:00:00' AND t1.start_time < '2018-07-09 00:00:00') AS t1
USING(start_time)
WHERE (t2.start_time >= '2018-07-01 00:00:00' AND t2.start_time < '2018-07-09 00:00:00')

Verbose plan, munged for brevity/sanity due to joining wide tables, and
redacted since the view probably has to be considered proprietary. Hopefully
the remaining bits are still useful. I replaced column names with x.

Gather (cost=481961.60..482102.41 rows=1180 width=8068)
Output: [ there are 1700+ columns here... ]
Workers Planned: 3
-> Merge Join (cost=480961.60..480984.41 rows=381 width=8068)
Output: ARRAY[x, x, ..., COALESCE(x, 0), COALESCE(x, 0)], ARRAY[..., COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COA
LESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALE
SCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESC
E(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0), COALESCE(x, 0)], ((x)::numeric / 10.0), CASE WHEN (x IS NULL) THEN ARRAY[..., eric_umts_rbs_sector_carrier_201807.pmaveragerssi_064] ELSE ARRAY[...] END, CASE WHEN (x IS NULL) THEN '{-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-60.0}'::numeric[] ELSE '{-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-x,-39.5}'::numeric[] END, ARRAY[x], x, x
Merge Cond: (eric_umts_rbs_sector_carrier_201807.start_time = t2.start_time)
-> Sort (cost=480421.29..480428.89 rows=3039 width=1652)
Output: [...]
Sort Key: eric_umts_rbs_sector_carrier_201807.start_time
-> Parallel Append (cost=0.00..480245.49 rows=3039 width=1652)
-> Parallel Seq Scan on child.eric_umts_rbs_sector_carrier_201807 (cost=0.00..480230.30 rows=3039 width=1652)
Output: [...]
Filter: ((eric_umts_rbs_sector_carrier_201807.start_time >= '2018-07-01 00:00:00-07'::timestamp with time zone) AND (eric_umts_rbs_sector_carrier_201807.start_time < '2018-07-09 00:00:00-07'::timesta
mp with time zone))
-> Sort (cost=540.31..540.65 rows=139 width=7760)
Output: [...]
Sort Key: t2.start_time
-> Append (cost=0.00..535.36 rows=139 width=7760)
-> Seq Scan on public.daily_eric_umts_rnc_utrancell_view t2 (cost=0.00..0.00 rows=1 width=7760)
Output: [...]
Filter: ((t2.start_time >= '2018-07-01 00:00:00'::timestamp without time zone) AND (t2.start_time < '2018-07-09 00:00:00'::timestamp without time zone))
-> Bitmap Heap Scan on child.daily_eric_umts_rnc_utrancell_view_201807 t2_1 (cost=9.83..534.66 rows=138 width=7760)
Output: [...]
Recheck Cond: ((t2_1.start_time >= '2018-07-01 00:00:00'::timestamp without time zone) AND (t2_1.start_time < '2018-07-09 00:00:00'::timestamp without time zone))
-> Bitmap Index Scan on daily_eric_umts_rnc_utrancell_view_201807_unique_idx (cost=0.00..9.79 rows=138 width=0)
Index Cond: ((t2_1.start_time >= '2018-07-01 00:00:00'::timestamp without time zone) AND (t2_1.start_time < '2018-07-09 00:00:00'::timestamp without time zone))
JIT:
Functions: 19
Options: Inlining false, Optimization false, Expressions true, Deforming true
(28 rows)

Justin

#6Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#5)
Re: pg11.1 jit segv

On Thu, Nov 15, 2018 at 06:03:34PM -0600, Justin Pryzby wrote:

Verbose plan, munged for brevity/sanity due to joining wide tables, and
redacted since the view probably has to be considered proprietary. Hopefully
the remaining bits are still useful. I replaced column names with x.

Actually the view isn't as intricate as I thought, but I'd like to avoid
publishing it for sake of simplicity. I replaced the view with its underlying
table and now I get:

[pryzbyj@database ~]$ time psql ts -f tmp/sql-jit-crash-2018-11-15.jtp
psql:tmp/sql-jit-crash-2018-11-15.jtp:12: ERROR: invalid memory alloc request size 2447889552

It's unclear if that's a useful hint, a separate problem, or a nonissue..

Justin

#7Andres Freund
andres@anarazel.de
In reply to: Justin Pryzby (#6)
Re: pg11.1 jit segv

Hi,

On 2018-11-15 18:11:05 -0600, Justin Pryzby wrote:

On Thu, Nov 15, 2018 at 06:03:34PM -0600, Justin Pryzby wrote:

Verbose plan, munged for brevity/sanity due to joining wide tables, and
redacted since the view probably has to be considered proprietary. Hopefully
the remaining bits are still useful. I replaced column names with x.

Actually the view isn't as intricate as I thought, but I'd like to avoid
publishing it for sake of simplicity. I replaced the view with its underlying
table and now I get:

[pryzbyj@database ~]$ time psql ts -f tmp/sql-jit-crash-2018-11-15.jtp
psql:tmp/sql-jit-crash-2018-11-15.jtp:12: ERROR: invalid memory alloc request size 2447889552

It's unclear if that's a useful hint, a separate problem, or a nonissue..

It's probably the same / closely related issue. ISTM that JITed
deforming either doesn't deform correctly, or skips deforming
erroneously.

I'm about to commit some changes to 12/master that'd possibly make it
easier to find issues like this. Is there any chance that it's easier to
repro this on master than making a reproducible test case?

Greetings,

Andres Freund

#8Justin Pryzby
pryzby@telsasoft.com
In reply to: Andres Freund (#7)
Re: pg11.1 jit segv

On Thu, Nov 15, 2018 at 04:17:51PM -0800, Andres Freund wrote:

I'm about to commit some changes to 12/master that'd possibly make it
easier to find issues like this. Is there any chance that it's easier to
repro this on master than making a reproducible test case?

Yes, very possibly - this is on a customer's server, so their data, plus views,
etc are potentially involved until excluded.

Justin

#9Justin Pryzby
pryzby@telsasoft.com
In reply to: Andres Freund (#7)
1 attachment(s)
Re: pg11.1 jit segv

On Thu, Nov 15, 2018 at 04:17:51PM -0800, Andres Freund wrote:

I'm about to commit some changes to 12/master that'd possibly make it
easier to find issues like this.

Are you referring to this a future commit ?
commit 763f2edd92095b1ca2f4476da073a28505c13820
Rejigger materializing and fetching a HeapTuple from a slot.

I was able to reproduce under HEAD with pg_restored data.

I guess you're right that the "memory alloc failure" is related/same thing,
I've seen it intermittently with queries which also sometimes crash (and also
sometimes don't).

Note that when it crashes, it seems to take a longer time to do so than the
query would normally take. Like we're walking off the end of an array, say.

I've been able to reproduce the crash with a self join of a table (no view, no
expressions, no parallel, directly querying a relkind='r' child). In that
case, enable_bitmapscan=on and jit_tuple_deforming=on are both needed to crash,
and jit_debugging_support=on does not yield a useful bt.

The table is not too special, but was probably ALTERed to add columns a good
number of times by one of our processes. It has ~1100 columns, including
arrays, and some with null_frac=1. I'm trying to come up with a test case
involving column types and order.

(gdb) bt
#0 0x00007f81a08b8b98 in ?? ()
#1 0x0000000000000000 in ?? ()

ts=# SET jit=on;SET jit_above_cost=0;explain(analyze off,verbose off) SELECT a.* FROM child.daily_eric_umts_rnc_utrancell_view_201804 a JOIN child.daily_eric_umts_rnc_utrancell_view_201804 b USING(start_time,sect_id) WHERE a.start_time BETWEEN '2018-04-30' AND '2018-05-04' AND b.start_time BETWEEN '2018-04-30' AND '2018-05-04';
SET
SET
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=527.36..1038.17 rows=1 width=7760)
Hash Cond: ((a.start_time = b.start_time) AND (a.sect_id = b.sect_id))
-> Bitmap Heap Scan on daily_eric_umts_rnc_utrancell_view_201804 a (cost=9.78..515.59 rows=133 width=7760)
Recheck Cond: ((start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
-> Bitmap Index Scan on daily_eric_umts_rnc_utrancell_view_201804_unique_idx (cost=0.00..9.74 rows=133 width=0)
Index Cond: ((start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
-> Hash (cost=515.59..515.59 rows=133 width=12)
-> Bitmap Heap Scan on daily_eric_umts_rnc_utrancell_view_201804 b (cost=9.78..515.59 rows=133 width=12)
Recheck Cond: ((start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
-> Bitmap Index Scan on daily_eric_umts_rnc_utrancell_view_201804_unique_idx (cost=0.00..9.74 rows=133 width=0)
Index Cond: ((start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
JIT:
Functions: 19
Options: Inlining false, Optimization false, Expressions true, Deforming true

BTW find attached patch which I believe corrects some comments.

Justin

Attachments:

deform-comments.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 59e38d2..ab0c6d0 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -93,7 +93,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, int natts)
 	funcname = llvm_expand_funcname(context, "deform");
 
 	/*
-	 * Check which columns do have to exist, so we don't have to check the
+	 * Check which columns have to exist, so we don't have to check the
 	 * rows natts unnecessarily.
 	 */
 	for (attnum = 0; attnum < desc->natts; attnum++)
@@ -252,7 +252,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, int natts)
 	}
 
 	/*
-	 * Check if's guaranteed the all the desired attributes are available in
+	 * Check if it's guaranteed that all the desired attributes are available in
 	 * tuple. If so, we can start deforming. If not, need to make sure to
 	 * fetch the missing columns.
 	 */
@@ -337,7 +337,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, int natts)
 
 		/*
 		 * If this is the first attribute, slot->tts_nvalid was 0. Therefore
-		 * reset offset to 0 to, it be from a previous execution.
+		 * also reset offset to 0, it may be from a previous execution.
 		 */
 		if (attnum == 0)
 		{
@@ -554,7 +554,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, int natts)
 		else if (att->attnotnull && attguaranteedalign && known_alignment >= 0)
 		{
 			/*
-			 * If the offset to the column was previously known a NOT NULL &
+			 * If the offset to the column was previously known, a NOT NULL &
 			 * fixed width column guarantees that alignment is just the
 			 * previous alignment plus column width.
 			 */
#10Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#9)
Re: pg11.1 jit segv

On Fri, Nov 16, 2018 at 08:38:26AM -0600, Justin Pryzby wrote:

Are you referring to this a future commit ?

this OR a future commit

The table is not too special, but was probably ALTERed to add columns a good
number of times by one of our processes. It has ~1100 columns, including
arrays, and some with null_frac=1. I'm trying to come up with a test case
involving column types and order.

I don't have a failing test case yet but here's what the columns look like:

ts=# SELECT attnum, null_frac, atttypid::regtype, attnotnull, attname, attalign , attstorage, attbyval, attlen, attislocal FROM pg_stats s JOIN pg_attribute a USING(attname) JOIN pg_class c ON s.tablename=c.relname WHERE c.oid=a.attrelid AND tablename='daily_eric_umts_rnc_utrancell_view_201804' AND (attnotnull OR null_frac>0.9 OR atttypid::regtype::text LIKE '%[]' OR NOT atttypid::regtype::text~'int$|integer|double|numeric' OR attlen=-1 OR NOT attbyval OR atthasmissing OR attisdropped OR attnum BETWEEN 80 AND 99) ORDER BY 1;

attnum | null_frac | atttypid | attnotnull | attname | attalign | attstorage | attbyval | attlen | attislocal
--------+-----------+-----------------------------+------------+-------------------------------------------------+----------+------------+----------+--------+------------
1 | 0 | timestamp without time zone | t | start_time | d | p | t | 8 | t
2 | 0 | integer | t | site_id | i | p | t | 4 | t
3 | 0 | integer | t | sect_id | i | p | t | 4 | t
4 | 0 | integer | t | rnc_id | i | p | t | 4 | t
5 | 0 | text | t | utrancell | i | x | f | -1 | t
30 | 1 | bigint | f | dl_alt_chcode_alloc | d | p | t | 8 | t
31 | 1 | integer | f | dl_alt_chcode_alloc_min | i | p | t | 4 | t
32 | 1 | integer | f | dl_alt_chcode_alloc_max | i | p | t | 4 | t
45 | 0 | integer[] | f | dch_ul_rlc_user_tput_samples | i | x | f | -1 | t
46 | 0 | integer[] | f | dch_ul_rlc_user_tput_samples_min | i | x | f | -1 | t
47 | 0 | integer[] | f | dch_ul_rlc_user_tput_samples_max | i | x | f | -1 | t
51 | 0 | numeric | f | ps_int_sum_latency_2 | i | m | f | -1 | t
69 | 0 | numeric | f | mbytes_ul_srb_only_eul | i | m | f | -1 | t
[...]
87 | 0 | numeric | f | mbytes_dl_active_cs57 | i | m | f | -1 | t
88 | 0 | numeric | f | mbytes_dl_active_cs57_min | i | m | f | -1 | t
89 | 0 | numeric | f | mbytes_dl_active_cs57_max | i | m | f | -1 | t

If I query for cs57, it doesen't crash (in 500ms), but if I query for the next
column, cs57_min, it does (in 18000ms).

Here's a new error message instead of a crash this time:
ts=# SET jit=on;SET jit_above_cost=0;explain(analyze on,verbose off) SELECT b.mbytes_dl_active_cs57_min FROM child.daily_eric_umts_rnc_utrancell_view_201804 a JOIN child.daily_eric_umts_rnc_utrancell_view_201804 b USING(start_time,sect_id) WHERE a.start_time BETWEEN '2018-04-30' AND '2018-05-04' AND b.start_time BETWEEN '2018-04-30' AND '2018-05-04';
SET
SET
ERROR: out of memory
DETAIL: Failed on request of size 425170160 in memory context "HashBatchContext".

Here's verbose output you requested, sans expressions:

ts=# SET jit=on;SET jit_above_cost=0;explain(analyze off,verbose) SELECT b.mbytes_dl_active_cs57_min FROM child.daily_eric_umts_rnc_utrancell_view_201804 a JOIN child.daily_eric_umts_rnc_utrancell_view_201804 b USING(start_time,sect_id) WHERE a.start_time BETWEEN '2018-04-30' AND '2018-05-04' AND b.start_time BETWEEN '2018-04-30' AND '2018-05-04';
SET
SET
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=2224.45..4491.19 rows=22 width=3)
Output: b.mbytes_dl_active_cs57_min
Hash Cond: ((a.start_time = b.start_time) AND (a.sect_id = b.sect_id))
-> Index Only Scan using daily_eric_umts_rnc_utrancell_view_201804_unique_idx on child.daily_eric_umts_rnc_utrancell_view_201804 a (cost=0.29..2214.33 rows=656 width=12)
Output: a.start_time, a.site_id, a.rnc_id, a.sect_id, a.utrancell
Index Cond: ((a.start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (a.start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
-> Hash (cost=2214.33..2214.33 rows=656 width=15)
Output: b.mbytes_dl_active_cs57_min, b.start_time, b.sect_id
-> Index Scan using daily_eric_umts_rnc_utrancell_view_201804_unique_idx on child.daily_eric_umts_rnc_utrancell_view_201804 b (cost=0.29..2214.33 rows=656 width=15)
Output: b.mbytes_dl_active_cs57_min, b.start_time, b.sect_id
Index Cond: ((b.start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (b.start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
JIT:
Functions: 19
Options: Inlining false, Optimization false, Expressions true, Deforming true

If I query instead for a.mbytes_dl_active_cs57_min, I get no crash (yet).

Justin

#11Andres Freund
andres@anarazel.de
In reply to: Justin Pryzby (#9)
Re: pg11.1 jit segv

Hi,

On 2018-11-16 08:38:26 -0600, Justin Pryzby wrote:

On Thu, Nov 15, 2018 at 04:17:51PM -0800, Andres Freund wrote:

I'm about to commit some changes to 12/master that'd possibly make it
easier to find issues like this.

Are you referring to this a future commit ?
commit 763f2edd92095b1ca2f4476da073a28505c13820
Rejigger materializing and fetching a HeapTuple from a slot.

I was thinking of

commit 15d8f83128e15de97de61430d0b9569f5ebecc26
Author: Andres Freund <andres@anarazel.de>
Date: 2018-11-15 22:00:30 -0800

Verify that expected slot types match returned slot types.

(and then the followup fix in a387a3df)

(will try to catch up later with the rest of what you wrote, after I had
some coffee).

- Andres

#12Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#9)
1 attachment(s)
Re: pg11.1 jit segv

On Fri, Nov 16, 2018 at 08:38:26AM -0600, Justin Pryzby wrote:

BTW find attached patch which I believe corrects some comments.

Updated. Some of the changes may be gratuitous, but changed while I was
already looking.

Also note that I had to remove -flto=thin to compile under RH7.

Justin

Attachments:

v2-deform-comments.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 938dfc7..0663719 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -98,7 +98,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	funcname = llvm_expand_funcname(context, "deform");
 
 	/*
-	 * Check which columns do have to exist, so we don't have to check the
+	 * Check which columns have to exist, so we don't have to check the
 	 * rows natts unnecessarily.
 	 */
 	for (attnum = 0; attnum < desc->natts; attnum++)
@@ -257,7 +257,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	}
 
 	/*
-	 * Check if's guaranteed the all the desired attributes are available in
+	 * Check if it's guaranteed that all the desired attributes are available in
 	 * tuple. If so, we can start deforming. If not, need to make sure to
 	 * fetch the missing columns.
 	 */
@@ -342,7 +342,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 		/*
 		 * If this is the first attribute, slot->tts_nvalid was 0. Therefore
-		 * reset offset to 0 to, it be from a previous execution.
+		 * also reset offset to 0, it may be from a previous execution.
 		 */
 		if (attnum == 0)
 		{
@@ -372,7 +372,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 		/*
 		 * Check for nulls if necessary. No need to take missing attributes
-		 * into account, because in case they're present the heaptuple's natts
+		 * into account, because if they're present, the heaptuple's natts
 		 * would have indicated that a slot_getmissingattrs() is needed.
 		 */
 		if (!att->attnotnull)
@@ -459,13 +459,13 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 			(known_alignment < 0 || known_alignment != TYPEALIGN(alignto, known_alignment)))
 		{
 			/*
-			 * When accessing a varlena field we have to "peek" to see if we
+			 * When accessing a varlena field, we have to "peek" to see if we
 			 * are looking at a pad byte or the first byte of a 1-byte-header
 			 * datum.  A zero byte must be either a pad byte, or the first
-			 * byte of a correctly aligned 4-byte length word; in either case
+			 * byte of a correctly aligned 4-byte length word; in either case,
 			 * we can align safely.  A non-zero byte must be either a 1-byte
 			 * length word, or the first byte of a correctly aligned 4-byte
-			 * length word; in either case we need not align.
+			 * length word; in either case, we need not align.
 			 */
 			if (att->attlen == -1)
 			{
@@ -559,8 +559,8 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 		else if (att->attnotnull && attguaranteedalign && known_alignment >= 0)
 		{
 			/*
-			 * If the offset to the column was previously known a NOT NULL &
-			 * fixed width column guarantees that alignment is just the
+			 * If the offset to the column was previously known, a NOT NULL &
+			 * fixed-width column guarantees that alignment is just the
 			 * previous alignment plus column width.
 			 */
 			Assert(att->attlen > 0);
@@ -601,8 +601,8 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 					   LLVMBuildGEP(b, v_tts_nulls, &l_attno, 1, ""));
 
 		/*
-		 * Store datum. For byval datums copy the value, extend to Datum's
-		 * width, and store. For byref types, store pointer to data.
+		 * Store datum. For byval datums: copy the value, extend to Datum's
+		 * width, and store. For byref types: store pointer to data.
 		 */
 		if (att->attbyval)
 		{
#13Justin Pryzby
pryzby@telsasoft.com
In reply to: Andres Freund (#11)
Re: pg11.1 jit segv

On Fri, Nov 16, 2018 at 08:29:27AM -0800, Andres Freund wrote:

On Thu, Nov 15, 2018 at 04:17:51PM -0800, Andres Freund wrote:

I'm about to commit some changes to 12/master that'd possibly make it

commit 15d8f83128e15de97de61430d0b9569f5ebecc26

I don't think it had to do with your commit, but I recompiled HEAD with debug +
casserts and have this to show.

TRAP: unrecognized TOAST vartag("1", File: "heaptuple.c", Line: 1705)

#0 0x00007faf5fac9277 in raise () from /lib64/libc.so.6
#1 0x00007faf5faca968 in abort () from /lib64/libc.so.6
#2 0x000000000088e347 in ExceptionalCondition (conditionName=conditionName@entry=0xa8a69e "1", errorType=errorType@entry=0x8deda8 "unrecognized TOAST vartag", fileName=fileName@entry=0x8dff82 "heaptuple.c",
lineNumber=lineNumber@entry=1705) at assert.c:54
#3 0x0000000000489830 in varsize_any (p=<optimized out>) at heaptuple.c:1705
#4 0x00007faf60c98560 in ?? ()
#5 0x00000000ffffffff in ?? ()
#6 0x000000000008c21c in ?? ()
#7 0x00000000004897d0 in ?? () at heaptuple.c:1690
#8 0x000000000008c21c in ?? ()
#9 0x000000000008c21c in ?? ()
#10 0x000000000000021c in ?? ()
#11 0x00000000004897d0 in ?? () at heaptuple.c:1690
[...]

#14Andres Freund
andres@anarazel.de
In reply to: Justin Pryzby (#13)
Re: pg11.1 jit segv

Hi,

On 2018-11-16 19:23:44 -0600, Justin Pryzby wrote:

On Fri, Nov 16, 2018 at 08:29:27AM -0800, Andres Freund wrote:

On Thu, Nov 15, 2018 at 04:17:51PM -0800, Andres Freund wrote:

I'm about to commit some changes to 12/master that'd possibly make it

commit 15d8f83128e15de97de61430d0b9569f5ebecc26

I don't think it had to do with your commit, but I recompiled HEAD with debug +
casserts and have this to show.

That's probably just the same issue as before, namely random data
somehow being produced as the result of tuple deforming.

Greetings,

Andres Freund

#15Justin Pryzby
pryzby@telsasoft.com
In reply to: Andres Freund (#14)
Re: pg11.1 jit segv

On Fri, Nov 16, 2018 at 05:47:24PM -0800, Andres Freund wrote:

That's probably just the same issue as before, namely random data
somehow being produced as the result of tuple deforming.

Does this help at all?

ts=# SELECT utrancell FROM child.daily_eric_umts_rnc_utrancell_view_201807 LIMIT 9;
ERROR: invalid memory alloc request size 18446744073709551613

Note that's a table named after a view, and not itself a view..

(gdb) b errfinish
Breakpoint 3 at 0x891a00: file elog.c, line 411.
(gdb) c
Continuing.
SELECT start_time, site_id, sect_id, rnc_id, utrancell, eul_harq_ps_trans_tti2_1 FROM child.daily_eric_umts_rnc_utrancell_view_201801 LIMIT 1;
1: start_time (typeid = 1114, len = 8, typmod = -1, byval = t)
2: site_id (typeid = 23, len = 4, typmod = -1, byval = t)
3: sect_id (typeid = 23, len = 4, typmod = -1, byval = t)
4: rnc_id (typeid = 23, len = 4, typmod = -1, byval = t)
5: utrancell (typeid = 25, len = -1, typmod = -1, byval = f)
6: eul_harq_ps_trans_tti2_1 (typeid = 20, len = 8, typmod = -1, byval = t)
----
1: start_time = "2000-01-01 00:00:00" (typeid = 1114, len = 8, typmod = -1, byval = t)
2: site_id = "0" (typeid = 23, len = 4, typmod = -1, byval = t)
3: sect_id = "0" (typeid = 23, len = 4, typmod = -1, byval = t)
4: rnc_id = "0" (typeid = 23, len = 4, typmod = -1, byval = t)

The error occurs when selecting the 5th column, which is "text not null"; but,
the other fields are also garbage (namely 0).

#0 slot_compile_deform (context=context@entry=0xe184b8, desc=desc@entry=0xe9baf0, ops=ops@entry=0xa21020 <TTSOpsBufferHeapTuple>, natts=5) at llvmjit_deform.c:36
#1 0x00007fffe72535f5 in llvm_compile_expr (state=0xed8458) at llvmjit_expr.c:325
#2 0x000000000061e429 in ExecReadyExpr (state=state@entry=0xed8458) at execExpr.c:628
#3 0x00000000006219b4 in ExecBuildProjectionInfo (targetList=<optimized out>, econtext=<optimized out>, slot=<optimized out>, parent=parent@entry=0xed7d40, inputDesc=inputDesc@entry=0xe9baf0) at execExpr.c:472
#4 0x00000000006370b2 in ExecAssignProjectionInfo (planstate=planstate@entry=0xed7d40, inputDesc=inputDesc@entry=0xe9baf0) at execUtils.c:501
#5 0x00000000006371c6 in ExecConditionalAssignProjectionInfo (planstate=planstate@entry=0xed7d40, inputDesc=0xe9baf0, varno=<optimized out>) at execUtils.c:539
#6 0x0000000000632627 in ExecAssignScanProjectionInfo (node=node@entry=0xed7d40) at execScan.c:240
#7 0x0000000000655867 in ExecInitSeqScan (node=node@entry=0xeecbb8, estate=estate@entry=0xed78d8, eflags=eflags@entry=16) at nodeSeqscan.c:182
#8 0x0000000000630e53 in ExecInitNode (node=0xeecbb8, estate=estate@entry=0xed78d8, eflags=eflags@entry=16) at execProcnode.c:207
#9 0x000000000064c78a in ExecInitLimit (node=node@entry=0xeecc50, estate=estate@entry=0xed78d8, eflags=eflags@entry=16) at nodeLimit.c:368
#10 0x0000000000630b52 in ExecInitNode (node=node@entry=0xeecc50, estate=estate@entry=0xed78d8, eflags=eflags@entry=16) at execProcnode.c:363
#11 0x000000000062ac45 in InitPlan (eflags=16, queryDesc=<optimized out>) at execMain.c:1025
#12 standard_ExecutorStart (queryDesc=<optimized out>, eflags=16) at execMain.c:265
#13 0x0000000000784fdb in PortalStart (portal=portal@entry=0xe48528, params=params@entry=0x0, eflags=eflags@entry=0, snapshot=snapshot@entry=0x0) at pquery.c:520
#14 0x0000000000781218 in exec_simple_query (query_string=0xe56568 "SELECT start_time,site_id,sect_id,rnc_id,utrancell FROM child.daily_eric_umts_rnc_utrancell_view_201801 LIMIT 1\n") at postgres.c:1176
#15 0x00000000007825c2 in PostgresMain (argc=argc@entry=17, argv=argv@entry=0xddbb40, dbname=0xdfcd70 "postgres", dbname@entry=0x0, username=<optimized out>) at postgres.c:4243
#16 0x000000000047d8be in main (argc=17, argv=0xddbb40) at main.c:224

#0 errfinish (dummy=dummy@entry=0) at elog.c:411
#1 0x0000000000895379 in elog_finish (elevel=elevel@entry=20, fmt=fmt@entry=0xaabf68 "invalid memory alloc request size %zu") at elog.c:1365
#2 0x00000000008bb46e in palloc (size=18446744073709551613) at mcxt.c:934
#3 0x000000000086ad3a in text_to_cstring (t=0x7fffe7b83344) at varlena.c:189
#4 0x0000000000897ed5 in FunctionCall1Coll (flinfo=flinfo@entry=0x7fffffffdce0, collation=collation@entry=0, arg1=arg1@entry=140737080996676) at fmgr.c:1123
#5 0x0000000000899206 in OutputFunctionCall (val=140737080996676, flinfo=0x7fffffffdce0) at fmgr.c:1755
#6 OidOutputFunctionCall (functionId=<optimized out>, val=val@entry=140737080996676) at fmgr.c:1838
#7 0x000000000048b57b in debugtup (slot=0xed8330, self=<optimized out>) at printtup.c:625
#8 0x0000000000629a50 in ExecutePlan (execute_once=<optimized out>, dest=0xa66160 <debugtupDR>, direction=<optimized out>, numberTuples=0, sendTuples=true, operation=CMD_SELECT, use_parallel_mode=<optimized out>,
planstate=0xed7ad0, estate=0xed7878) at execMain.c:1748
#9 standard_ExecutorRun (queryDesc=0xe6ac78, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:364
#10 0x000000000078409b in PortalRunSelect (portal=portal@entry=0xe484c8, forward=forward@entry=true, count=0, count@entry=9223372036854775807, dest=dest@entry=0xa66160 <debugtupDR>) at pquery.c:932
#11 0x00000000007855b8 in PortalRun (portal=portal@entry=0xe484c8, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0xa66160 <debugtupDR>,
altdest=altdest@entry=0xa66160 <debugtupDR>, completionTag=completionTag@entry=0x7fffffffdfc0 "") at pquery.c:773
#12 0x000000000078128d in exec_simple_query (query_string=0xe56508 "SELECT start_time,site_id,sect_id,rnc_id,utrancell,eul_harq_ps_trans_tti2_1 FROM child.daily_eric_umts_rnc_utrancell_view_201801 LIMIT 1;\n")
at postgres.c:1215
#13 0x00000000007825c2 in PostgresMain (argc=argc@entry=15, argv=argv@entry=0xddbb40, dbname=0xdfcd30 "postgres", dbname@entry=0x0, username=<optimized out>) at postgres.c:4243
#14 0x000000000047d8be in main (argc=15, argv=0xddbb40) at main.c:224

Justin

#16Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#10)
Re: pg11.1 jit segv

On Fri, Nov 16, 2018 at 10:24:46AM -0600, Justin Pryzby wrote:

On Fri, Nov 16, 2018 at 08:38:26AM -0600, Justin Pryzby wrote:

The table is not too special, but was probably ALTERed to add columns a good
number of times by one of our processes. It has ~1100 columns, including
arrays, and some with null_frac=1. I'm trying to come up with a test case
involving column types and order.

Try this ?

SELECT 'DROP TABLE t; CREATE TABLE t (a3 text, a1 int[], '||array_to_string(array_agg('c'||i||' bigint default 0'),',')||'); INSERT INTO t VALUES(0)' FROM generate_series(1,999) i;
\gexec
SET jit=on; SET jit_above_cost=0; SELECT a3 FROM t LIMIT 9;

That's given all sorts of nice errors:

ERROR: invalid memory alloc request size 18446744073709551613
ERROR: compressed data is corrupted

And occasionally crashes and/or returns unrelated data:

= '0', $21 = '0', $22 = '0', $23 = '0', $24 = '0', $25 = '2741'\x03
n 21782 :constvalue 4 [ 0 0 0 0 0 0 0 0 ]}) :location

Justin

#17Andres Freund
andres@anarazel.de
In reply to: Justin Pryzby (#16)
1 attachment(s)
Re: pg11.1 jit segv

On 2018-11-17 17:37:15 -0600, Justin Pryzby wrote:

On Fri, Nov 16, 2018 at 10:24:46AM -0600, Justin Pryzby wrote:

On Fri, Nov 16, 2018 at 08:38:26AM -0600, Justin Pryzby wrote:

The table is not too special, but was probably ALTERed to add columns a good
number of times by one of our processes. It has ~1100 columns, including
arrays, and some with null_frac=1. I'm trying to come up with a test case
involving column types and order.

Try this ?

SELECT 'DROP TABLE t; CREATE TABLE t (a3 text, a1 int[], '||array_to_string(array_agg('c'||i||' bigint default 0'),',')||'); INSERT INTO t VALUES(0)' FROM generate_series(1,999) i;
\gexec
SET jit=on; SET jit_above_cost=0; SELECT a3 FROM t LIMIT 9;

That's given all sorts of nice errors:

ERROR: invalid memory alloc request size 18446744073709551613
ERROR: compressed data is corrupted

And occasionally crashes and/or returns unrelated data:

= '0', $21 = '0', $22 = '0', $23 = '0', $24 = '0', $25 = '2741'\x03
n 21782 :constvalue 4 [ 0 0 0 0 0 0 0 0 ]}) :location

Ah, hah. The issue is that t_hoff is larger than 128 here (due to the
size of the NULL bitmap), and apparently getelementptr interprets an
i8 > 128 as a signed integer. Which thus yields a negative offset from
the start of the tuple, which predictably doesn't work great.

v_hoff =
l_load_struct_gep(b, v_tuplep,
FIELDNO_HEAPTUPLEHEADERDATA_HOFF,
"t_hoff");
v_tupdata_base =
LLVMBuildGEP(b,
LLVMBuildBitCast(b,
v_tuplep,
l_ptr(LLVMInt8Type()),
""),
&v_hoff, 1,
"v_tupdata_base");

I'd missed the "These integers are treated as signed values where
relevant." bit in the getelementptr docs
http://llvm.org/docs/LangRef.html#getelementptr-instruction

The fix is easy enough, just adding a
v_hoff = LLVMBuildZExt(b, v_hoff, LLVMInt32Type(), "");
fixes the issue for me.

Could you check that the attached patch this also fixes your original
issue? Going through the code to see if there's other occurances of
this.

Greetings,

Andres Freund

Attachments:

hoff-fix.difftext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 4111bf0a54b..0719675d5b8 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -249,9 +249,11 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 							"maxatt");
 
 	v_hoff =
-		l_load_struct_gep(b, v_tuplep,
-						  FIELDNO_HEAPTUPLEHEADERDATA_HOFF,
-						  "t_hoff");
+		LLVMBuildZExt(b,
+					  l_load_struct_gep(b, v_tuplep,
+										FIELDNO_HEAPTUPLEHEADERDATA_HOFF,
+										""),
+					  LLVMInt32Type(), "t_hoff");
 
 	v_tupdata_base =
 		LLVMBuildGEP(b,
#18Justin Pryzby
pryzby@telsasoft.com
In reply to: Andres Freund (#17)
Re: pg11.1 jit segv

On Mon, Nov 26, 2018 at 07:00:35PM -0800, Andres Freund wrote:

Could you check that the attached patch this also fixes your original
issue? Going through the code to see if there's other occurances of
this.

Confirmed that fixes my crash.

Thanks,
Justin

#19Andres Freund
andres@anarazel.de
In reply to: Justin Pryzby (#18)
1 attachment(s)
Re: pg11.1 jit segv

Hi,

On 2018-11-26 22:56:09 -0600, Justin Pryzby wrote:

On Mon, Nov 26, 2018 at 07:00:35PM -0800, Andres Freund wrote:

Could you check that the attached patch this also fixes your original
issue? Going through the code to see if there's other occurances of
this.

Confirmed that fixes my crash.

Thanks a lot for narrowing down your crash to something I can reproduce!

Here's a more complete patch, with a testcase.

Tom, the test creates a 1100k column table (using \set ECHO none +
gexec), but with a small row. Currently it's not dropped after the
table, as I thought it might be worthwhile to be tested by
pg_dump/upgrade etc too. You're probably the person most concerned with
test runtimes, ... Any concerns about that? The table creation is
quick*, on the order of 30ms.

Greetings,

Andres Freund

*at least as long as there's no default columns and the table's not
dropped, seems we have somewhat of an O(N^2) situation going on when
dropping a table with many columns that have default columns - we
re-build the cache entry after each dropped default value. But as the
max is 1600 columns, that's not too bad.

Attachments:

v1-0001-Fix-jit-compilation-issue-on-very-wide-tables.patchtext/x-diff; charset=us-asciiDownload
From fda7a2b41cba265f6dc3212f65a9da5f3518cf7c Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 27 Nov 2018 00:22:02 -0800
Subject: [PATCH v1] Fix jit compilation issue on very wide tables.

Reported-By: Justin Pryzby
Author: Andres Freund
Discussion: https://postgr.es/m/20181115223959.GB10913@telsasoft.com
Backpatch: 11, just as jit compilation was
---
 src/backend/jit/llvm/llvmjit_deform.c      | 12 +++++++++---
 src/test/regress/expected/create_table.out | 10 ++++++++++
 src/test/regress/expected/sanity_check.out |  1 +
 src/test/regress/sql/create_table.sql      | 10 ++++++++++
 4 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 4111bf0a54b..d168eae0873 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -248,10 +248,16 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 							v_infomask2,
 							"maxatt");
 
+	/*
+	 * Need to zext, as getelementptr otherwise treats hoff as a signed 8bit
+	 * integer, which'd yield a negative offset for t_hoff > 127.
+	 */
 	v_hoff =
-		l_load_struct_gep(b, v_tuplep,
-						  FIELDNO_HEAPTUPLEHEADERDATA_HOFF,
-						  "t_hoff");
+		LLVMBuildZExt(b,
+					  l_load_struct_gep(b, v_tuplep,
+										FIELDNO_HEAPTUPLEHEADERDATA_HOFF,
+										""),
+					  LLVMInt32Type(), "t_hoff");
 
 	v_tupdata_base =
 		LLVMBuildGEP(b,
diff --git a/src/test/regress/expected/create_table.out b/src/test/regress/expected/create_table.out
index e92748c1ea0..b26b4e7b6d9 100644
--- a/src/test/regress/expected/create_table.out
+++ b/src/test/regress/expected/create_table.out
@@ -261,6 +261,16 @@ ERROR:  relation "as_select1" already exists
 CREATE TABLE IF NOT EXISTS as_select1 AS SELECT * FROM pg_class WHERE relkind = 'r';
 NOTICE:  relation "as_select1" already exists, skipping
 DROP TABLE as_select1;
+-- create an extra wide table to test for issues related to that
+-- (temporarily hide query, to avoid the long CREATE TABLE stmt)
+\set ECHO none
+INSERT INTO extra_wide_table(firstc, lastc) VALUES('first col', 'last col');
+SELECT firstc, lastc FROM extra_wide_table;
+  firstc   |  lastc   
+-----------+----------
+ first col | last col
+(1 row)
+
 -- check that tables with oids cannot be created anymore
 CREATE TABLE withoid() WITH OIDS;
 ERROR:  syntax error at or near "OIDS"
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index c77060d36c1..009a89fc1ad 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -44,6 +44,7 @@ dupindexcols|t
 e_star|f
 emp|f
 equipment_r|f
+extra_wide_table|f
 f_star|f
 fast_emp4000|t
 float4_tbl|f
diff --git a/src/test/regress/sql/create_table.sql b/src/test/regress/sql/create_table.sql
index 90cc1a578f3..c6f048f8c2a 100644
--- a/src/test/regress/sql/create_table.sql
+++ b/src/test/regress/sql/create_table.sql
@@ -277,6 +277,16 @@ CREATE TABLE as_select1 AS SELECT * FROM pg_class WHERE relkind = 'r';
 CREATE TABLE IF NOT EXISTS as_select1 AS SELECT * FROM pg_class WHERE relkind = 'r';
 DROP TABLE as_select1;
 
+-- create an extra wide table to test for issues related to that
+-- (temporarily hide query, to avoid the long CREATE TABLE stmt)
+\set ECHO none
+SELECT 'CREATE TABLE extra_wide_table(firstc text, '|| array_to_string(array_agg('c'||i||' bool'),',')||', lastc text);'
+FROM generate_series(1, 1100) g(i)
+\gexec
+\set ECHO all
+INSERT INTO extra_wide_table(firstc, lastc) VALUES('first col', 'last col');
+SELECT firstc, lastc FROM extra_wide_table;
+
 -- check that tables with oids cannot be created anymore
 CREATE TABLE withoid() WITH OIDS;
 CREATE TABLE withoid() WITH (oids);
-- 
2.18.0.rc2.dirty

#20Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#19)
Re: pg11.1 jit segv

On 2018-11-27 00:26:55 -0800, Andres Freund wrote:

Hi,

On 2018-11-26 22:56:09 -0600, Justin Pryzby wrote:

On Mon, Nov 26, 2018 at 07:00:35PM -0800, Andres Freund wrote:

Could you check that the attached patch this also fixes your original
issue? Going through the code to see if there's other occurances of
this.

Confirmed that fixes my crash.

Thanks a lot for narrowing down your crash to something I can reproduce!

Here's a more complete patch, with a testcase.

Tom, the test creates a 1100k column table (using \set ECHO none +
gexec), but with a small row. Currently it's not dropped after the
table, as I thought it might be worthwhile to be tested by
pg_dump/upgrade etc too. You're probably the person most concerned with
test runtimes, ... Any concerns about that? The table creation is
quick*, on the order of 30ms.

And pushed. Justin, thanks again for reporting the bug and then
narrowing it down to a reproducible test case! Would've been much harder
to diagnose without that.

I'll look into your comments patch in a bit.

Greetings,

Andres Freund

#21Justin Pryzby
pryzby@telsasoft.com
In reply to: Andres Freund (#20)
1 attachment(s)
jit comments typos (Re: pg11.1 jit segv)

On Tue, Nov 27, 2018 at 10:24:52AM -0800, Andres Freund wrote:

And pushed. Justin, thanks again for reporting the bug and then
narrowing it down to a reproducible test case! Would've been much harder
to diagnose without that.

I'll look into your comments patch in a bit.

Thanks for implementing and patching it :)

And thanks for remembering the patch, and reminding me.

Here's an updated copy with additional hunks.

Justin

Attachments:

v4-deform-comments.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index ec4a250..83e4e05 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -1873,7 +1873,7 @@ CheckOpSlotCompatibility(ExprEvalStep *op, TupleTableSlot *slot)
 
 	/*
 	 * Should probably fixed at some point, but for now it's easier to allow
-	 * buffer and heap tuples to be used interchangably.
+	 * buffer and heap tuples to be used interchangeably.
 	 */
 	if (slot->tts_ops == &TTSOpsBufferHeapTuple &&
 		op->d.fetch.kind == &TTSOpsHeapTuple)
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 4111bf0..ba238f1 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -103,7 +103,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	funcname = llvm_expand_funcname(context, "deform");
 
 	/*
-	 * Check which columns do have to exist, so we don't have to check the
+	 * Check which columns have to exist, so we don't have to check the
 	 * rows natts unnecessarily.
 	 */
 	for (attnum = 0; attnum < desc->natts; attnum++)
@@ -292,7 +292,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	}
 
 	/*
-	 * Check if's guaranteed the all the desired attributes are available in
+	 * Check if it's guaranteed that all the desired attributes are available in
 	 * tuple. If so, we can start deforming. If not, need to make sure to
 	 * fetch the missing columns.
 	 */
@@ -377,7 +377,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 		/*
 		 * If this is the first attribute, slot->tts_nvalid was 0. Therefore
-		 * reset offset to 0 to, it be from a previous execution.
+		 * also reset offset to 0, it may be from a previous execution.
 		 */
 		if (attnum == 0)
 		{
@@ -407,7 +407,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 		/*
 		 * Check for nulls if necessary. No need to take missing attributes
-		 * into account, because in case they're present the heaptuple's natts
+		 * into account, because if they're present, the heaptuple's natts
 		 * would have indicated that a slot_getmissingattrs() is needed.
 		 */
 		if (!att->attnotnull)
@@ -494,13 +494,13 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 			(known_alignment < 0 || known_alignment != TYPEALIGN(alignto, known_alignment)))
 		{
 			/*
-			 * When accessing a varlena field we have to "peek" to see if we
+			 * When accessing a varlena field, we have to "peek" to see if we
 			 * are looking at a pad byte or the first byte of a 1-byte-header
 			 * datum.  A zero byte must be either a pad byte, or the first
-			 * byte of a correctly aligned 4-byte length word; in either case
+			 * byte of a correctly aligned 4-byte length word; in either case,
 			 * we can align safely.  A non-zero byte must be either a 1-byte
 			 * length word, or the first byte of a correctly aligned 4-byte
-			 * length word; in either case we need not align.
+			 * length word; in either case, we need not align.
 			 */
 			if (att->attlen == -1)
 			{
@@ -594,8 +594,8 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 		else if (att->attnotnull && attguaranteedalign && known_alignment >= 0)
 		{
 			/*
-			 * If the offset to the column was previously known a NOT NULL &
-			 * fixed width column guarantees that alignment is just the
+			 * If the offset to the column was previously known, a NOT NULL &
+			 * fixed-width column guarantees that alignment is just the
 			 * previous alignment plus column width.
 			 */
 			Assert(att->attlen > 0);
@@ -636,8 +636,8 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 					   LLVMBuildGEP(b, v_tts_nulls, &l_attno, 1, ""));
 
 		/*
-		 * Store datum. For byval datums copy the value, extend to Datum's
-		 * width, and store. For byref types, store pointer to data.
+		 * Store datum. For byval datums: copy the value, extend to Datum's
+		 * width, and store. For byref types: store pointer to data.
 		 */
 		if (att->attbyval)
 		{
diff --git a/src/backend/jit/llvm/llvmjit_inline.cpp b/src/backend/jit/llvm/llvmjit_inline.cpp
index b33a321..2ad29be 100644
--- a/src/backend/jit/llvm/llvmjit_inline.cpp
+++ b/src/backend/jit/llvm/llvmjit_inline.cpp
@@ -9,7 +9,7 @@
  * for an external function is found - not guaranteed! - the index will then
  * be used to judge their instruction count / inline worthiness. After doing
  * so for all external functions, all the referenced functions (and
- * prerequisites) will be imorted.
+ * prerequisites) will be imported.
  *
  * Copyright (c) 2016-2018, PostgreSQL Global Development Group
  *