crash 11.5~

Started by Justin Pryzby over 6 years ago; 5 messages
#1 Justin Pryzby
pryzby@telsasoft.com

A daily report crashed repeatedly this morning running pg11.4.
I compiled 11.5 and reproduced it there, too, so I'm including a backtrace
from an -O0 build.
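A minimal sketch of that build, assuming the usual --enable-debug alongside
the -O0 (the prefix matches the ~/src/postgresql.bin path that shows up later
in this thread):

./configure --enable-debug CFLAGS=-O0 --prefix=$HOME/src/postgresql.bin
make && make install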

I'm trying to dig further into it; it seems to crash under load, but not
when I narrow things down to a single report, which runs to completion when
run individually.

postmaster[6750]: segfault at 7f4a527545bc ip 00000000004834ae sp 00007ffd547e2760 error 4 in postgres[400000+6e6000]
postmaster[29786]: segfault at 7f49a968d000 ip 00007f49b24abb0f sp 00007ffd547e2268 error 4 in libc-2.12.so (deleted)[7f49b2422000+18b000]

Core was generated by `postgres: telsasoft ts 192.168.122.11(35454) SELECT '.
Program terminated with signal 11, Segmentation fault.

#0 0x00000000004877df in slot_deform_tuple (slot=0xe204638, natts=3) at heaptuple.c:1465
#1 0x0000000000487f5b in slot_getsomeattrs (slot=0xe204638, attnum=3) at heaptuple.c:1673
#2 0x00000000006d5aae in ExecInterpExpr (state=0xe257948, econtext=0xe204578, isnull=0x7ffff5f38077) at execExprInterp.c:443
#3 0x00000000006ed221 in ExecEvalExprSwitchContext (state=0xe257948, econtext=0xe204578, isNull=0x7ffff5f38077) at ../../../src/include/executor/executor.h:313
#4 0x00000000006ed30e in ExecQual (state=0xe257948, econtext=0xe204578) at ../../../src/include/executor/executor.h:382
#5 0x00000000006ed5f4 in ExecScan (node=0x252edc8, accessMtd=0x71c54c <SeqNext>, recheckMtd=0x71c623 <SeqRecheck>) at execScan.c:190
#6 0x000000000071c670 in ExecSeqScan (pstate=0x252edc8) at nodeSeqscan.c:129
#7 0x00000000006eb855 in ExecProcNodeInstr (node=0x252edc8) at execProcnode.c:461
#8 0x00000000006f6555 in ExecProcNode (node=0x252edc8) at ../../../src/include/executor/executor.h:247
#9 0x00000000006f6a05 in ExecAppend (pstate=0xe20e7b0) at nodeAppend.c:294
#10 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20e7b0) at execProcnode.c:461
#11 0x0000000000708a96 in ExecProcNode (node=0xe20e7b0) at ../../../src/include/executor/executor.h:247
#12 0x0000000000709b0d in ExecHashJoinOuterGetTuple (outerNode=0xe20e7b0, hjstate=0xe20e4d8, hashvalue=0x7ffff5f3827c) at nodeHashjoin.c:821
#13 0x0000000000708f8d in ExecHashJoinImpl (pstate=0xe20e4d8, parallel=false) at nodeHashjoin.c:355
#14 0x00000000007094ca in ExecHashJoin (pstate=0xe20e4d8) at nodeHashjoin.c:565
#15 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20e4d8) at execProcnode.c:461
#16 0x000000000071d9da in ExecProcNode (node=0xe20e4d8) at ../../../src/include/executor/executor.h:247
#17 0x000000000071db26 in ExecSort (pstate=0xe20e3c0) at nodeSort.c:107
#18 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20e3c0) at execProcnode.c:461
#19 0x00000000006eb826 in ExecProcNodeFirst (node=0xe20e3c0) at execProcnode.c:445
#20 0x00000000006f74c5 in ExecProcNode (node=0xe20e3c0) at ../../../src/include/executor/executor.h:247
#21 0x00000000006f79bb in fetch_input_tuple (aggstate=0xe20df70) at nodeAgg.c:406
#22 0x00000000006f9d11 in agg_retrieve_direct (aggstate=0xe20df70) at nodeAgg.c:1755
#23 0x00000000006f98fa in ExecAgg (pstate=0xe20df70) at nodeAgg.c:1570
#24 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20df70) at execProcnode.c:461
#25 0x00000000006eb826 in ExecProcNodeFirst (node=0xe20df70) at execProcnode.c:445
#26 0x0000000000708a96 in ExecProcNode (node=0xe20df70) at ../../../src/include/executor/executor.h:247
#27 0x0000000000709b0d in ExecHashJoinOuterGetTuple (outerNode=0xe20df70, hjstate=0x2778580, hashvalue=0x7ffff5f3865c) at nodeHashjoin.c:821
#28 0x0000000000708f8d in ExecHashJoinImpl (pstate=0x2778580, parallel=false) at nodeHashjoin.c:355
#29 0x00000000007094ca in ExecHashJoin (pstate=0x2778580) at nodeHashjoin.c:565
#30 0x00000000006eb855 in ExecProcNodeInstr (node=0x2778580) at execProcnode.c:461
#31 0x00000000006eb826 in ExecProcNodeFirst (node=0x2778580) at execProcnode.c:445
#32 0x0000000000702a9b in ExecProcNode (node=0x2778580) at ../../../src/include/executor/executor.h:247
#33 0x0000000000702d12 in MultiExecPrivateHash (node=0x27783a8) at nodeHash.c:164
#34 0x0000000000702c8c in MultiExecHash (node=0x27783a8) at nodeHash.c:114
#35 0x00000000006eb8f7 in MultiExecProcNode (node=0x27783a8) at execProcnode.c:501
#36 0x0000000000708e2a in ExecHashJoinImpl (pstate=0x2777210, parallel=false) at nodeHashjoin.c:290
#37 0x00000000007094ca in ExecHashJoin (pstate=0x2777210) at nodeHashjoin.c:565
#38 0x00000000006eb855 in ExecProcNodeInstr (node=0x2777210) at execProcnode.c:461
#39 0x00000000006eb826 in ExecProcNodeFirst (node=0x2777210) at execProcnode.c:445
#40 0x000000000071d9da in ExecProcNode (node=0x2777210) at ../../../src/include/executor/executor.h:247
#41 0x000000000071db26 in ExecSort (pstate=0x2776ce0) at nodeSort.c:107
#42 0x00000000006eb855 in ExecProcNodeInstr (node=0x2776ce0) at execProcnode.c:461
#43 0x00000000006eb826 in ExecProcNodeFirst (node=0x2776ce0) at execProcnode.c:445
#44 0x00000000006f74c5 in ExecProcNode (node=0x2776ce0) at ../../../src/include/executor/executor.h:247
#45 0x00000000006f79bb in fetch_input_tuple (aggstate=0x2776ec8) at nodeAgg.c:406
#46 0x00000000006f9d11 in agg_retrieve_direct (aggstate=0x2776ec8) at nodeAgg.c:1755
#47 0x00000000006f98fa in ExecAgg (pstate=0x2776ec8) at nodeAgg.c:1570
#48 0x00000000006eb855 in ExecProcNodeInstr (node=0x2776ec8) at execProcnode.c:461
#49 0x00000000006eb826 in ExecProcNodeFirst (node=0x2776ec8) at execProcnode.c:445
#50 0x00000000006e0129 in ExecProcNode (node=0x2776ec8) at ../../../src/include/executor/executor.h:247
#51 0x00000000006e2abd in ExecutePlan (estate=0x27769d8, planstate=0x2776ec8, use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0xa66bef0, execute_once=true) at execMain.c:1723
#52 0x00000000006e0710 in standard_ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:364
#53 0x00007fd37e757633 in pgss_ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0, execute_once=true) at pg_stat_statements.c:892
#54 0x00007fd37e2d86ef in explain_ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0, execute_once=true) at auto_explain.c:281
#55 0x00000000006e0511 in ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:305
#56 0x00000000008cfebe in PortalRunSelect (portal=0x21d4bf8, forward=true, count=0, dest=0xa66bef0) at pquery.c:932
#57 0x00000000008cfb4c in PortalRun (portal=0x21d4bf8, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0xa66bef0, altdest=0xa66bef0, completionTag=0x7ffff5f39100 "") at pquery.c:773
#58 0x00000000008c9b22 in exec_simple_query (query_string=0xe2fbd80 "--BEGIN SQL\nSELECT site_office AS site_gran,\n\tsite_location||'.'||sect_num AS bs,\n\tgsm_site_name_to_sect_name(site_name, sect_num, sect_name) AS sitename,\n\tperiod AS start_time,\n\t(maybe_div(sum(rrc_su"...)
at postgres.c:1145
#59 0x00000000008cddf3 in PostgresMain (argc=1, argv=0x218fef8, dbname=0x215da48 "ts", username=0x218fed0 "telsasoft") at postgres.c:4182
#60 0x000000000082a098 in BackendRun (port=0x21867d0) at postmaster.c:4358
#61 0x0000000000829806 in BackendStartup (port=0x21867d0) at postmaster.c:4030
#62 0x0000000000825cab in ServerLoop () at postmaster.c:1707
#63 0x00000000008255dd in PostmasterMain (argc=3, argv=0x215b9a0) at postmaster.c:1380
#64 0x000000000074ba30 in main (argc=3, argv=0x215b9a0) at main.c:228

#0 0x00000000004877df in slot_deform_tuple (slot=0xe204638, natts=3) at heaptuple.c:1465
thisatt = 0x5265db0
tuple = 0xe838d40
tupleDesc = 0x5265d20
values = 0xe204698
isnull = 0xe2057e8
tup = 0x7fd3eb1b8058
hasnulls = false
attnum = 1
tp = 0x7fd3eb1b80bd <Address 0x7fd3eb1b80bd out of bounds>
off = 583369983
bp = 0x7fd3eb1b806f <Address 0x7fd3eb1b806f out of bounds>
slow = true
#1 0x0000000000487f5b in slot_getsomeattrs (slot=0xe204638, attnum=3) at heaptuple.c:1673
tuple = 0xe838d40
attno = 3
__func__ = "slot_getsomeattrs"
#2 0x00000000006d5aae in ExecInterpExpr (state=0xe257948, econtext=0xe204578, isnull=0x7ffff5f38077) at execExprInterp.c:443
op = 0xe257530
resultslot = 0x0
innerslot = 0x0
outerslot = 0x0
scanslot = 0xe204638
dispatch_table = {0x6d5a0d, 0x6d5a32, 0x6d5a61, 0x6d5a93, 0x6d5ac5, 0x6d5b74, 0x6d5c23, 0x6d5cd2, 0x6d5da3, 0x6d5e74, 0x6d5f45, 0x6d5f7c, 0x6d6056, 0x6d6130, 0x6d620a, 0x6d627e, 0x6d6347, 0x6d638f, 0x6d640e, 0x6d64ec, 0x6d6523, 0x6d655a, 0x6d6568, 0x6d65e9, 0x6d6657, 0x6d6665,
0x6d66e6, 0x6d6754, 0x6d6790, 0x6d6821, 0x6d6856, 0x6d68b4, 0x6d6915, 0x6d6986, 0x6d69ca, 0x6d6a14, 0x6d6a4b, 0x6d6a82, 0x6d6acb, 0x6d6b3b, 0x6d6bab, 0x6d6bf4, 0x6d6c2b, 0x6d6c62, 0x6d6ca1, 0x6d6dc3, 0x6d6e2e, 0x6d704c, 0x6d7161, 0x6d726f, 0x6d7376, 0x6d73a6, 0x6d73d6, 0x6d7406,
0x6d7436, 0x6d746d, 0x6d749d, 0x6d7625, 0x6d7724, 0x6d7754, 0x6d778b, 0x6d77c2, 0x6d77f9, 0x6d7862, 0x6d7892, 0x6d78c2, 0x6d6d32, 0x6d7959, 0x6d7989, 0x6d78f2, 0x6d7929, 0x6d79b9, 0x6d79e9, 0x6d7a98, 0x6d7ac8, 0x6d7b77, 0x6d7bae, 0x6d7be5, 0x6d7c40, 0x6d7cf5, 0x6d7d9a, 0x6d7ead,
0x6d7f52, 0x6d80b4, 0x6d8261, 0x6d8298, 0x6d82cf}
#3 0x00000000006ed221 in ExecEvalExprSwitchContext (state=0xe257948, econtext=0xe204578, isNull=0x7ffff5f38077) at ../../../src/include/executor/executor.h:313
retDatum = 140737319764160
oldContext = 0x27768c0
#4 0x00000000006ed30e in ExecQual (state=0xe257948, econtext=0xe204578) at ../../../src/include/executor/executor.h:382
ret = 3018996923
isnull = false
#5 0x00000000006ed5f4 in ExecScan (node=0x252edc8, accessMtd=0x71c54c <SeqNext>, recheckMtd=0x71c623 <SeqRecheck>) at execScan.c:190
slot = 0xe204638
econtext = 0xe204578
qual = 0xe257948
projInfo = 0xe257498
#6 0x000000000071c670 in ExecSeqScan (pstate=0x252edc8) at nodeSeqscan.c:129
node = 0x252edc8
#7 0x00000000006eb855 in ExecProcNodeInstr (node=0x252edc8) at execProcnode.c:461
result = 0x702ace
#8 0x00000000006f6555 in ExecProcNode (node=0x252edc8) at ../../../src/include/executor/executor.h:247
No locals.
#9 0x00000000006f6a05 in ExecAppend (pstate=0xe20e7b0) at nodeAppend.c:294
subnode = 0x252edc8
result = 0xe258508
node = 0xe20e7b0
#10 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20e7b0) at execProcnode.c:461
result = 0x20e257058
#11 0x0000000000708a96 in ExecProcNode (node=0xe20e7b0) at ../../../src/include/executor/executor.h:247
No locals.
#12 0x0000000000709b0d in ExecHashJoinOuterGetTuple (outerNode=0xe20e7b0, hjstate=0xe20e4d8, hashvalue=0x7ffff5f3827c) at nodeHashjoin.c:821
hashtable = 0xe8387c8
curbatch = 0
slot = 0x0
#13 0x0000000000708f8d in ExecHashJoinImpl (pstate=0xe20e4d8, parallel=false) at nodeHashjoin.c:355
node = 0xe20e4d8
outerNode = 0xe20e7b0
hashNode = 0xe258720
joinqual = 0x0
otherqual = 0x0
econtext = 0xe20e6f0
hashtable = 0xe8387c8
outerTupleSlot = 0xe257058
hashvalue = 3018996923
batchno = 0
parallel_state = 0x0
__func__ = "ExecHashJoinImpl"
#14 0x00000000007094ca in ExecHashJoin (pstate=0xe20e4d8) at nodeHashjoin.c:565
No locals.
#15 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20e4d8) at execProcnode.c:461
result = 0x20c4
#16 0x000000000071d9da in ExecProcNode (node=0xe20e4d8) at ../../../src/include/executor/executor.h:247
No locals.
#17 0x000000000071db26 in ExecSort (pstate=0xe20e3c0) at nodeSort.c:107
plannode = 0xa6309c0
outerNode = 0xe20e4d8
tupDesc = 0xe2598a0
node = 0xe20e3c0
estate = 0x27769d8
dir = ForwardScanDirection
tuplesortstate = 0x2766998
slot = 0xe25b8b8
#18 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20e3c0) at execProcnode.c:461
result = 0x4785d0
#19 0x00000000006eb826 in ExecProcNodeFirst (node=0xe20e3c0) at execProcnode.c:445
No locals.
#20 0x00000000006f74c5 in ExecProcNode (node=0xe20e3c0) at ../../../src/include/executor/executor.h:247
No locals.
#21 0x00000000006f79bb in fetch_input_tuple (aggstate=0xe20df70) at nodeAgg.c:406
slot = 0xe20e240
#22 0x00000000006f9d11 in agg_retrieve_direct (aggstate=0xe20df70) at nodeAgg.c:1755
node = 0xa6315d0
econtext = 0xe20e300
tmpcontext = 0xe20e1a8
peragg = 0x252e198
pergroups = 0xe7cdd40
outerslot = 0x7ffff5f38520
firstSlot = 0xe263e70
result = 0x0
hasGroupingSets = false
numGroupingSets = 1
currentSet = 222632768
nextSetSize = 0
numReset = 1
i = 1
#23 0x00000000006f98fa in ExecAgg (pstate=0xe20df70) at nodeAgg.c:1570
node = 0xe20df70
result = 0x0
#24 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20df70) at execProcnode.c:461
result = 0x4785d0
#25 0x00000000006eb826 in ExecProcNodeFirst (node=0xe20df70) at execProcnode.c:445
No locals.
#26 0x0000000000708a96 in ExecProcNode (node=0xe20df70) at ../../../src/include/executor/executor.h:247
No locals.
#27 0x0000000000709b0d in ExecHashJoinOuterGetTuple (outerNode=0xe20df70, hjstate=0x2778580, hashvalue=0x7ffff5f3865c) at nodeHashjoin.c:821
#28 0x0000000000708f8d in ExecHashJoinImpl (pstate=0x2778580, parallel=false) at nodeHashjoin.c:355
node = 0x2778580
outerNode = 0xe20df70
hashNode = 0xe7e2560
joinqual = 0x0
otherqual = 0x0
econtext = 0x2778798
hashtable = 0xe837df0
outerTupleSlot = 0x4040
hashvalue = 0
batchno = 10926240
parallel_state = 0x0
__func__ = "ExecHashJoinImpl"
#29 0x00000000007094ca in ExecHashJoin (pstate=0x2778580) at nodeHashjoin.c:565
No locals.
#30 0x00000000006eb855 in ExecProcNodeInstr (node=0x2778580) at execProcnode.c:461
result = 0x4785d0
#31 0x00000000006eb826 in ExecProcNodeFirst (node=0x2778580) at execProcnode.c:445
No locals.
#32 0x0000000000702a9b in ExecProcNode (node=0x2778580) at ../../../src/include/executor/executor.h:247
No locals.
#33 0x0000000000702d12 in MultiExecPrivateHash (node=0x27783a8) at nodeHash.c:164
outerNode = 0x2778580
hashkeys = 0xe7ebf78
hashtable = 0xe837cd8
slot = 0x80000000001
econtext = 0x27784c0
hashvalue = 0
#34 0x0000000000702c8c in MultiExecHash (node=0x27783a8) at nodeHash.c:114
No locals.
#35 0x00000000006eb8f7 in MultiExecProcNode (node=0x27783a8) at execProcnode.c:501
result = 0x4
__func__ = "MultiExecProcNode"
#36 0x0000000000708e2a in ExecHashJoinImpl (pstate=0x2777210, parallel=false) at nodeHashjoin.c:290
node = 0x2777210
outerNode = 0x27774e8
hashNode = 0x27783a8
joinqual = 0x0
otherqual = 0x0
econtext = 0x2777428
hashtable = 0xe837cd8
outerTupleSlot = 0x7ffff5f388ac
hashvalue = 0
batchno = 0
parallel_state = 0x0
__func__ = "ExecHashJoinImpl"
#37 0x00000000007094ca in ExecHashJoin (pstate=0x2777210) at nodeHashjoin.c:565
No locals.
#38 0x00000000006eb855 in ExecProcNodeInstr (node=0x2777210) at execProcnode.c:461
result = 0x4785d0
#39 0x00000000006eb826 in ExecProcNodeFirst (node=0x2777210) at execProcnode.c:445
No locals.
#40 0x000000000071d9da in ExecProcNode (node=0x2777210) at ../../../src/include/executor/executor.h:247
No locals.
#41 0x000000000071db26 in ExecSort (pstate=0x2776ce0) at nodeSort.c:107
plannode = 0xa632c18
outerNode = 0x2777210
tupDesc = 0xe7e6980
node = 0x2776ce0
estate = 0x27769d8
dir = ForwardScanDirection
tuplesortstate = 0x24feec8
slot = 0x219bfa0
#42 0x00000000006eb855 in ExecProcNodeInstr (node=0x2776ce0) at execProcnode.c:461
result = 0x4785d0
#43 0x00000000006eb826 in ExecProcNodeFirst (node=0x2776ce0) at execProcnode.c:445
No locals.
#44 0x00000000006f74c5 in ExecProcNode (node=0x2776ce0) at ../../../src/include/executor/executor.h:247
No locals.
#45 0x00000000006f79bb in fetch_input_tuple (aggstate=0x2776ec8) at nodeAgg.c:406
slot = 0x2776df8
#46 0x00000000006f9d11 in agg_retrieve_direct (aggstate=0x2776ec8) at nodeAgg.c:1755
node = 0xa633eb0
econtext = 0x2777150
tmpcontext = 0x2776bf0
peragg = 0xe824520
pergroups = 0xe832e88
outerslot = 0x7ffff5f38b40
firstSlot = 0xe81f6b8
result = 0x7ffff5f38b10
hasGroupingSets = false
numGroupingSets = 1
currentSet = 243398904
nextSetSize = 0
numReset = 1
i = 1
#47 0x00000000006f98fa in ExecAgg (pstate=0x2776ec8) at nodeAgg.c:1570
node = 0x2776ec8
result = 0x0
#48 0x00000000006eb855 in ExecProcNodeInstr (node=0x2776ec8) at execProcnode.c:461
result = 0x4785d0
#49 0x00000000006eb826 in ExecProcNodeFirst (node=0x2776ec8) at execProcnode.c:445
No locals.
#50 0x00000000006e0129 in ExecProcNode (node=0x2776ec8) at ../../../src/include/executor/executor.h:247
No locals.
#51 0x00000000006e2abd in ExecutePlan (estate=0x27769d8, planstate=0x2776ec8, use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0xa66bef0, execute_once=true) at execMain.c:1723
slot = 0x150
current_tuple_count = 0
#52 0x00000000006e0710 in standard_ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:364
estate = 0x27769d8
operation = CMD_SELECT
dest = 0xa66bef0
sendTuples = true
oldcontext = 0xdeb5e20
__func__ = "standard_ExecutorRun"
#53 0x00007fd37e757633 in pgss_ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0, execute_once=true) at pg_stat_statements.c:892
save_exception_stack = 0x7ffff5f38dd0
save_context_stack = 0x0
local_sigjmp_buf = {{__jmpbuf = {233529144, 9176610699338068304, 4687312, 140737319770032, 0, 0, 9176610699377914192, 9151536278965596496}, __mask_was_saved = 0, __saved_mask = {__val = {4294967302, 243497128, 4295032832, 140737319767408, 7255973, 140737319767408, 68726688595,
41380312, 174276272, 174278840, 41381576, 0, 41381576, 16, 174278840, 140737319767696}}}}
#54 0x00007fd37e2d86ef in explain_ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0, execute_once=true) at auto_explain.c:281
save_exception_stack = 0x7ffff5f38fa0
save_context_stack = 0x0
local_sigjmp_buf = {{__jmpbuf = {233529144, 9176610699451314512, 4687312, 140737319770032, 0, 0, 9176610699340165456, 9151535941362266448}, __mask_was_saved = 0, __saved_mask = {__val = {10926707, 24, 41371824, 41371824, 238009728, 41372168, 8600905475, 35241888, 336, 243497664,
41380032, 243498000, 233528864, 336, 174505712, 140737319767760}}}}
#55 0x00000000006e0511 in ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:305
No locals.
#56 0x00000000008cfebe in PortalRunSelect (portal=0x21d4bf8, forward=true, count=0, dest=0xa66bef0) at pquery.c:932
queryDesc = 0xdeb5f38
direction = ForwardScanDirection
nprocessed = 233528864
__func__ = "PortalRunSelect"
#57 0x00000000008cfb4c in PortalRun (portal=0x21d4bf8, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0xa66bef0, altdest=0xa66bef0, completionTag=0x7ffff5f39100 "") at pquery.c:773
save_exception_stack = 0x7ffff5f39240
save_context_stack = 0x0
local_sigjmp_buf = {{__jmpbuf = {174505712, 9176610698715214160, 4687312, 140737319770032, 0, 0, 9176610699432440144, -9176589804060397232}, __mask_was_saved = 0, __saved_mask = {__val = {233529080, 2405784, 21306251184, 128, 13, 616, 174505656, 140737319768192, 10970883, 480, 112,
174505712, 34999312, 174505824, 233528864, 112}}}}
result = false
nprocessed = 4687312
saveTopTransactionResourceOwner = 0x2197a20
saveTopTransactionContext = 0x27748b0
saveActivePortal = 0x0
saveResourceOwner = 0x2197a20
savePortalContext = 0x0
saveMemoryContext = 0x27748b0
__func__ = "PortalRun"
#58 0x00000000008c9b22 in exec_simple_query (query_string=0xe2fbd80 "--BEGIN SQL\nSELECT site_office AS site_gran,\n\tsite_location||'.'||sect_num AS bs,\n\tgsm_site_name_to_sect_name(site_name, sect_num, sect_name) AS sitename,\n\tperiod AS start_time,\n\t(maybe_div(sum(rrc_su"...)
at postgres.c:1145
parsetree = 0xd4f1c38
portal = 0x21d4bf8
snapshot_set = true
commandTag = 0xc5ed4d "SELECT"
completionTag = "\000\000\000\000\002\000\000\000\n\000\000\000\000\000\000\000P\221\363\365\377\177\000\000\017s\245\000\000\000\000\000\036\020\000\000\000\000\000\000\000\f\026\002\030\020\000\000\200\275/\016\000\000\000\000(\r\026\002\006\000\000"
querytree_list = 0xe230d58
plantree_list = 0xa66beb8
receiver = 0xa66bef0
format = 0
dest = DestRemote
oldcontext = 0x27748b0
parsetree_list = 0xd4f1c98
parsetree_item = 0xd4f1c70
save_log_statement_stats = false
was_logged = false
use_implicit_block = false
msec_str = "\220\221\363\365\377\177\000\000&C\245\000\000\000\000\000\006\000\000\000\030\020\000\000\200\275/\016\000\000\000"
__func__ = "exec_simple_query"
#59 0x00000000008cddf3 in PostgresMain (argc=1, argv=0x218fef8, dbname=0x215da48 "ts", username=0x218fed0 "telsasoft") at postgres.c:4182
query_string = 0xe2fbd80 "--BEGIN SQL\nSELECT site_office AS site_gran,\n\tsite_location||'.'||sect_num AS bs,\n\tgsm_site_name_to_sect_name(site_name, sect_num, sect_name) AS sitename,\n\tperiod AS start_time,\n\t(maybe_div(sum(rrc_su"...
firstchar = 81
input_message = {data = 0xe2fbd80 "--BEGIN SQL\nSELECT site_office AS site_gran,\n\tsite_location||'.'||sect_num AS bs,\n\tgsm_site_name_to_sect_name(site_name, sect_num, sect_name) AS sitename,\n\tperiod AS start_time,\n\t(maybe_div(sum(rrc_su"..., len = 4121, maxlen = 8192,
cursor = 4121}
local_sigjmp_buf = {{__jmpbuf = {35192568, 9176610698734088528, 4687312, 140737319770032, 0, 0, 9176610698788614480, -9176589805133483696}, __mask_was_saved = 1, __saved_mask = {__val = {0, 0, 4494597, 0, 0, 0, 10923965, 1024, 35138464, 140737319768848, 10928134, 35138464, 34999312,
34999312, 35138440, 8}}}}
send_ready_for_query = false
disable_idle_in_transaction_timeout = false
__func__ = "PostgresMain"
#60 0x000000000082a098 in BackendRun (port=0x21867d0) at postmaster.c:4358
av = 0x218fef8
maxac = 2
ac = 1
secs = 618513625
usecs = 637903
i = 1
__func__ = "BackendRun"
#61 0x0000000000829806 in BackendStartup (port=0x21867d0) at postmaster.c:4030
bn = 0x2185ef0
pid = 0
__func__ = "BackendStartup"
#62 0x0000000000825cab in ServerLoop () at postmaster.c:1707
port = 0x21867d0
i = 0
rmask = {fds_bits = {8, 0 <repeats 15 times>}}
selres = 1
now = 1565198425
readmask = {fds_bits = {56, 0 <repeats 15 times>}}
nSockets = 6
last_lockfile_recheck_time = 1565198368
last_touch_time = 1565197325
__func__ = "ServerLoop"
#63 0x00000000008255dd in PostmasterMain (argc=3, argv=0x215b9a0) at postmaster.c:1380
opt = -1
status = 0
userDoption = 0x217d500 "/var/lib/pgsql/11/data"
listen_addr_saved = true
i = 64
output_config_variable = 0x0
__func__ = "PostmasterMain"
#64 0x000000000074ba30 in main (argc=3, argv=0x215b9a0) at main.c:228
do_check_root = true

#2 Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#1)
Re: crash 11.5~ (and 11.4)

I checked that this still happens with max_parallel_workers_per_gather=0.
Now I've also reproduced it using a plain SELECT * FROM that table:
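i.e. something like:

SET max_parallel_workers_per_gather = 0;
SELECT * FROM child.huawei_umts_ucell_201908;  -- the table identified just below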

(gdb) p thisatt->attrelid
$4 = 2015128626

ts=# SELECT 2015128626::regclass;
regclass | child.huawei_umts_ucell_201908

(gdb) p thisatt->attnum
$1 = 2
(gdb) p attnum # For earlier crash this is 1....
$2 = 1

ts=# \dt+ child.huawei_umts_ucell_201908
child | huawei_umts_ucell_201908 | table | telsasoft | 612 MB |

\d+ child.huawei_umts_ucell_201908
bsc6900ucell | text | | not null | | extended | |
ne_name | text | | not null | | extended | |
[...]
Partition of: huawei_umts_ucell_metrics FOR VALUES FROM ('2019-08-01 00:00:00') TO ('2019-09-01 00:00:00')
Partition constraint: ((start_time IS NOT NULL) AND (start_time >= '2019-08-01 00:00:00'::timestamp without time zone) AND (start_time < '2019-09-01 00:00:00'::timestamp without time zone))
Indexes:
"huawei_umts_ucell_201908_unique_idx" UNIQUE, btree (start_time, device_id, bsc6900ucell, ne_name, interval_seconds)
"huawei_umts_ucell_201908_idx" brin (start_time) WITH (autosummarize='true')
"huawei_umts_ucell_201908_site_idx" btree (site_id)
Check constraints:
"huawei_umts_ucell_201908_start_time_check" CHECK (start_time >= '2019-08-01 00:00:00'::timestamp without time zone AND start_time < '2019-09-01 00:00:00'::timestamp without time zone)
Statistics objects:
"child"."huawei_umts_ucell_201908" (ndistinct) ON bsc6900ucell, ne_name, start_time, interval_seconds, device_id FROM child.huawei_umts_ucell_201908
Options: autovacuum_analyze_threshold=2, autovacuum_analyze_scale_factor=0.005

ts=# SELECT COUNT(1) FROM pg_attribute WHERE attrelid='child.huawei_umts_ucell_201908'::regclass;
count | 560

ts=# SELECT * FROM pg_attribute WHERE attrelid='child.huawei_umts_ucell_201908'::regclass AND attnum>=0 ORDER BY attnum;

attrelid | 2015128626
attname | bsc6900ucell
atttypid | 25
attstattarget | -1
attlen | -1
attnum | 1
attndims | 0
attcacheoff | -1
atttypmod | -1
attbyval | f
attstorage | x
attalign | i
attnotnull | t
atthasdef | f
atthasmissing | f
attidentity |
attisdropped | f
attislocal | f
attinhcount | 1
attcollation | 100
attacl |
attoptions |
attfdwoptions |
attmissingval |

The only interesting thing about the parent/siblings is that the previous month
was partitioned with "daily" granularity. I adjusted our threshold for that,
so that August is partitioned monthly:

[...]
child.huawei_umts_ucell_20190730 FOR VALUES FROM ('2019-07-30 00:00:00') TO ('2019-07-31 00:00:00'),
child.huawei_umts_ucell_20190731 FOR VALUES FROM ('2019-07-31 00:00:00') TO ('2019-08-01 00:00:00'),
child.huawei_umts_ucell_201908 FOR VALUES FROM ('2019-08-01 00:00:00') TO ('2019-09-01 00:00:00')
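In DDL terms the only change is the width of the partition bounds; roughly
like this (parent and bounds as shown above; the actual creation script
differs):

CREATE TABLE child.huawei_umts_ucell_20190731
    PARTITION OF huawei_umts_ucell_metrics
    FOR VALUES FROM ('2019-07-31') TO ('2019-08-01');

CREATE TABLE child.huawei_umts_ucell_201908
    PARTITION OF huawei_umts_ucell_metrics
    FOR VALUES FROM ('2019-08-01') TO ('2019-09-01');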

Program terminated with signal 11, Segmentation fault.
#0 0x00000000004877df in slot_deform_tuple (slot=0x22b5860, natts=554) at heaptuple.c:1465
1465 off = att_align_pointer(off, thisatt->attalign, -1,

(gdb) bt
#0 0x00000000004877df in slot_deform_tuple (slot=0x22b5860, natts=554) at heaptuple.c:1465
#1 0x0000000000487e4b in slot_getallattrs (slot=0x22b5860) at heaptuple.c:1626
#2 0x000000000048aab9 in printtup (slot=0x22b5860, self=0x2161b28) at printtup.c:392
#3 0x00000000008d0346 in RunFromStore (portal=0x21d5638, direction=ForwardScanDirection, count=0, dest=0x2161b28) at pquery.c:1106
#4 0x00000000008cfe88 in PortalRunSelect (portal=0x21d5638, forward=true, count=0, dest=0x2161b28) at pquery.c:928
#5 0x00000000008cfb4c in PortalRun (portal=0x21d5638, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x2161b28, altdest=0x2161b28, completionTag=0x7ffff5f39100 "") at pquery.c:773
#6 0x00000000008c9b22 in exec_simple_query (query_string=0x2160d28 "FETCH FORWARD 999 FROM _psql_cursor") at postgres.c:1145
#7 0x00000000008cddf3 in PostgresMain (argc=1, argv=0x218bfb0, dbname=0x218beb0 "ts", username=0x218be90 "pryzbyj") at postgres.c:4182
#8 0x000000000082a098 in BackendRun (port=0x2181ac0) at postmaster.c:4358
#9 0x0000000000829806 in BackendStartup (port=0x2181ac0) at postmaster.c:4030
#10 0x0000000000825cab in ServerLoop () at postmaster.c:1707
#11 0x00000000008255dd in PostmasterMain (argc=3, argv=0x215b9a0) at postmaster.c:1380
#12 0x000000000074ba30 in main (argc=3, argv=0x215b9a0) at main.c:228

#0 0x00000000004877df in slot_deform_tuple (slot=0x22b5860, natts=554) at heaptuple.c:1465
thisatt = 0x22dec50
tuple = 0x22b58a0
tupleDesc = 0x22debc0
values = 0x22b58c0
isnull = 0x22b6a10
tup = 0x263f9a0
hasnulls = false
attnum = 1
tp = 0x263fa05 "\374\023\026\213^s#\347(\235=\326\321\067\032\245\321B\026}܋FS\375\244\003\065\336\277;\252O\006\065\320\353\211}F\237\373B\243\357J~\270\"\230ƣ\024xǍ\334\377\202\277S\031\375\351\003\220{\004"
off = 583369983
bp = 0x263f9b7 "\270\027$U}\232\246\235\004\255\331\033\006Qp\376E\316h\376\363\247\366Նgy7\311E\224~F\274\023ϋ%\216,\221\331@\024\363\233\070\275\004\254L\217t\262X\227\352\346\347\371\070\321ш\221\350fc\316\r\356\351h\275\213\230\360\203\374\023\026\213^s#\347(\235=\326\321\067\032\245\321B\026}܋FS\375\244\003\065\336\277;\252O\006\065\320\353\211}F\237\373B\243\357J~\270\"\230ƣ\024xǍ\334\377\202\277S\031\375\351\003\220{\004"
slow = true
#1 0x0000000000487e4b in slot_getallattrs (slot=0x22b5860) at heaptuple.c:1626
tdesc_natts = 554
attnum = 554
tuple = 0x22b58a0
__func__ = "slot_getallattrs"

#2 0x000000000048aab9 in printtup (slot=0x22b5860, self=0x2161b28) at printtup.c:392
typeinfo = 0x22debc0
myState = 0x2161b28
oldcontext = 0xa8e29a
buf = 0x2161b78
natts = 554
i = 0
#3 0x00000000008d0346 in RunFromStore (portal=0x21d5638, direction=ForwardScanDirection, count=0, dest=0x2161b28) at pquery.c:1106
oldcontext = 0x21931a0
ok = true
forward = true
current_tuple_count = 879
slot = 0x22b5860
#4 0x00000000008cfe88 in PortalRunSelect (portal=0x21d5638, forward=true, count=0, dest=0x2161b28) at pquery.c:928
queryDesc = 0x0
direction = ForwardScanDirection
nprocessed = 35205816
__func__ = "PortalRunSelect"
#5 0x00000000008cfb4c in PortalRun (portal=0x21d5638, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x2161b28, altdest=0x2161b28, completionTag=0x7ffff5f39100 "") at pquery.c:773
save_exception_stack = 0x7ffff5f39240
save_context_stack = 0x0
local_sigjmp_buf = {{__jmpbuf = {35003176, 9176610698715214160, 4687312, 140737319770032, 0, 0, 9176610699432440144, -9176589804060397232}, __mask_was_saved = 0, __saved_mask = {__val = {
36636288, 4352, 35002176, 1112, 1176, 0, 35003120, 140737319768192, 10970883, 35488536, 112, 35003176, 34999312, 35003288, 35205536, 112}}}}
result = false
nprocessed = 4687312
saveTopTransactionResourceOwner = 0x2187b88
saveTopTransactionContext = 0x21f5d90
saveActivePortal = 0x0
saveResourceOwner = 0x2187b88
savePortalContext = 0x0
saveMemoryContext = 0x21f5d90
__func__ = "PortalRun"
#6 0x00000000008c9b22 in exec_simple_query (query_string=0x2160d28 "FETCH FORWARD 999 FROM _psql_cursor") at postgres.c:1145
[...]

#3 Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#2)
Re: crash 11.5~ (and 11.4)

Just found this, although I'm not sure what to do about it. If it's corrupt
table data, I can restore from backup.

ts=# VACUUM FREEZE VERBOSE child.huawei_umts_ucell_201908;
INFO: 00000: aggressively vacuuming "child.huawei_umts_ucell_201908"
LOCATION: lazy_scan_heap, vacuumlazy.c:502
ERROR: XX001: found xmin 73850277 from before relfrozenxid 111408920
LOCATION: heap_prepare_freeze_tuple, heapam.c:6853
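For reference, the relfrozenxid it complains about comes straight from
pg_class; the cross-check is just:

ts=# SELECT relname, relfrozenxid, age(relfrozenxid) FROM pg_class WHERE oid = 'child.huawei_umts_ucell_201908'::regclass;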

I confirmed I updated to 11.4 immediately on its release:
[pryzbyj@database ~]$ ls -ltc /usr/pgsql-11/bin/postgres
-rwxr-xr-x. 1 root root 7291736 Jun 20 07:09 /usr/pgsql-11/bin/postgres

That table would've been created shortly after midnight on Aug 1, when we
loaded the first data for the month. So it was created and processed only on
pg11.4, although the parent has probably been around since pg_upgrade last
October.

Here's a usable-looking backtrace:

#0 heap_prepare_freeze_tuple (tuple=0x7fd3d8fa0058, relfrozenxid=111408920,
relminmxid=1846, cutoff_xid=111658731, cutoff_multi=1846, frz=0x223e930,
totally_frozen_p=0x7ffff5f3848f) at heapam.c:6832
changed = 211
xmax_already_frozen = 127
xmin_frozen = false
freeze_xmax = false
xid = 0
__func__ = "heap_prepare_freeze_tuple"

#1 0x00000000006c8e6e in lazy_scan_heap (onerel=0x7fd37e4f8fa0, options=13,
vacrelstats=0x223e790, Irel=0x223e8d8, nindexes=3, aggressive=true)
at vacuumlazy.c:1151
tuple_totally_frozen = false
itemid = 0x7fd3d8f9e918
buf = 168313
page = 0x7fd3d8f9e900 "\020:"
offnum = 1
maxoff = 3
hastup = true
nfrozen = 0
freespace = 35907664
all_frozen = true
tupgone = false
prev_dead_count = 0
all_visible_according_to_vm = true
all_visible = true
has_dead_tuples = false
visibility_cutoff_xid = 73850277
nblocks = 75494
blkno = 34915
tuple = {t_len = 2212, t_self = {ip_blkid = {bi_hi = 0,
bi_lo = 34915}, ip_posid = 1}, t_tableOid = 2015128626,
t_data = 0x7fd3d8fa0058}
relname = 0x7fd37e4f6ac8 "huawei_umts_ucell_201908"
relfrozenxid = 111408920
relminmxid = 1846
empty_pages = 0
vacuumed_pages = 0
next_fsm_block_to_vacuum = 0
num_tuples = 1
live_tuples = 1
tups_vacuumed = 0
nkeep = 0
nunused = 0
indstats = 0x223e850
i = 32512
ru0 = {tv = {tv_sec = 1565204981, tv_usec = 609217}, ru = {ru_utime = {
tv_sec = 13, tv_usec = 865892}, ru_stime = {tv_sec = 1,
tv_usec = 960701}, ru_maxrss = 136988, ru_ixrss = 0,
ru_idrss = 0, ru_isrss = 0, ru_minflt = 48841, ru_majflt = 1,
ru_nswap = 0, ru_inblock = 196152, ru_oublock = 292656,
ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 849,
ru_nivcsw = 6002}}
vmbuffer = 4267
next_unskippable_block = 34916
skipping_blocks = false
frozen = 0x223e930
buf = {data = 0x0, len = -168590068, maxlen = 32767,
cursor = -168589904}
initprog_index = {0, 1, 5}
initprog_val = {1, 75494, 21968754}
__func__ = "lazy_scan_heap"
#2 0x00000000006c77c7 in lazy_vacuum_rel (onerel=0x7fd37e4f8fa0, options=13,
params=0x7ffff5f38a70, bstrategy=0x22b8cf8) at vacuumlazy.c:265
vacrelstats = 0x223e790
Irel = 0x223e8d8
nindexes = 3
ru0 = {tv = {tv_sec = 140733193388036, tv_usec = 281474985219646},
ru = {ru_utime = {tv_sec = 12933152, tv_usec = 35123496},
ru_stime = {tv_sec = 140737319765996, tv_usec = 4687312},
ru_maxrss = 0, ru_ixrss = -1887226286461968708,
ru_idrss = 35351240, ru_isrss = 35249392,
ru_minflt = 140737319765776, ru_majflt = 35249392, ru_nswap = 20,
ru_inblock = 0, ru_oublock = 140737319765936,
ru_msgsnd = 10990471, ru_msgrcv = 140546333970336,
ru_nsignals = 35249264, ru_nvcsw = 4, ru_nivcsw = 4687312}}
starttime = 0
secs = 72057594037927936
usecs = 2015128626
read_rate = 6.9533474784178538e-310
write_rate = 6.9439115263673584e-310
aggressive = true
scanned_all_unfrozen = false
xidFullScanLimit = 111658735
mxactFullScanLimit = 1846
new_rel_pages = 0
new_rel_allvisible = 36424984
new_live_tuples = 6.9533474784194348e-310
new_frozen_xid = 9080197
new_min_multi = 0
__func__ = "lazy_vacuum_rel"
#3 0x00000000006c72b3 in vacuum_rel (relid=2015128626, relation=0x2161820,
options=13, params=0x7ffff5f38a70) at vacuum.c:1557
lmode = 4
onerel = 0x7fd37e4f8fa0
onerelid = {relId = 2015128626, dbId = 16564}
toast_relid = 2015128673
save_userid = 18712
save_sec_context = 0
save_nestlevel = 2
rel_lock = true
__func__ = "vacuum_rel"
#4 0x00000000006c5939 in vacuum (options=13, relations=0x22b8e70,
params=0x7ffff5f38a70, bstrategy=0x22b8cf8, isTopLevel=true)
at vacuum.c:340
vrel = 0x22b8e10
cur = 0x22b8e48
save_exception_stack = 0x7ffff5f38c20
save_context_stack = 0x0
local_sigjmp_buf = {{__jmpbuf = {35002584, 9176610699604406608,
4687312, 140737319770032, 0, 0, 9176610699520520528,
-9176587885370396336}, __mask_was_saved = 0, __saved_mask = {
__val = {35906192, 140737319766400, 10964897, 6853, 35905936,
208, 35890568, 21474836680, 35890776, 140737319766560,
10927612, 208, 35890288, 35890288, 35890504, 35890544}}}}
in_vacuum = true
stmttype = 0xc03841 "VACUUM"
in_outer_xact = false
use_own_xacts = true
__func__ = "vacuum"
#5 0x00000000006c54e0 in ExecVacuum (vacstmt=0x2161910, isTopLevel=true)
at vacuum.c:141
params = {freeze_min_age = 0, freeze_table_age = 0,
multixact_freeze_min_age = 0, multixact_freeze_table_age = 0,
is_wraparound = false, log_min_duration = -1}
__func__ = "ExecVacuum"

The errfinish one seems not to be useful?

(gdb) bt
#0 errfinish (dummy=0) at elog.c:415
#1 0x000000000054aefd in ShowTransactionStateRec (
str=0xafed12 "StartTransaction", s=0xf9efa0) at xact.c:5157
#2 0x000000000054ad5c in ShowTransactionState (
str=0xafed12 "StartTransaction") at xact.c:5130
#3 0x00000000005470c8 in StartTransaction () at xact.c:1961
#4 0x0000000000547db9 in StartTransactionCommand () at xact.c:2734
#5 0x00000000008cbe65 in start_xact_command () at postgres.c:2500
#6 0x00000000008c9772 in exec_simple_query (
query_string=0x2160d28 "VACUUM FREEZE VERBOSE child.huawei_umts_ucell_201908;") at postgres.c:948
#7 0x00000000008cddf3 in PostgresMain (argc=1, argv=0x218c768,
dbname=0x218c5d0 "ts", username=0x218c5b0 "pryzbyj") at postgres.c:4182
#8 0x000000000082a098 in BackendRun (port=0x21818d0) at postmaster.c:4358
#9 0x0000000000829806 in BackendStartup (port=0x21818d0) at postmaster.c:4030
#10 0x0000000000825cab in ServerLoop () at postmaster.c:1707
#11 0x00000000008255dd in PostmasterMain (argc=3, argv=0x215b9a0)
at postmaster.c:1380
#12 0x000000000074ba30 in main (argc=3, argv=0x215b9a0) at main.c:228

#0 errfinish (dummy=0) at elog.c:415
edata = 0xfe7b80
elevel = 0
oldcontext = 0x215df90
econtext = 0xa4d11a
__func__ = "errfinish"
#1 0x000000000054aefd in ShowTransactionStateRec (
str=0xafed12 "StartTransaction", s=0xf9efa0) at xact.c:5157
buf = {data = 0x22462d8 "", len = 0, maxlen = 1024, cursor = 0}
__func__ = "ShowTransactionStateRec"
#2 0x000000000054ad5c in ShowTransactionState (
str=0xafed12 "StartTransaction") at xact.c:5130
No locals.
#3 0x00000000005470c8 in StartTransaction () at xact.c:1961
s = 0xf9efa0
vxid = {backendId = 5, localTransactionId = 16}
#4 0x0000000000547db9 in StartTransactionCommand () at xact.c:2734
s = 0xf9efa0
__func__ = "StartTransactionCommand"
#5 0x00000000008cbe65 in start_xact_command () at postgres.c:2500
No locals.

I looked at this briefly:

ts=# SELECT * FROM page_header(get_raw_page('child.huawei_umts_ucell_201908', 0));
lsn | 3A49/66F44310
checksum | 0
flags | 4
lower | 36
upper | 1432
special | 8192
pagesize | 8192
version | 4
prune_xid | 0

In case someone wants me to look at bits and pieces, I saved a copy of:
relfilenode | 2015128626
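(Saved with something like the following; 16564 is the OID of the "ts"
database, per the pg_database output later in this thread:)

sudo -u postgres cp /var/lib/pgsql/11/data/base/16564/2015128626* /path/to/safekeeping/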

#4 Andres Freund
andres@anarazel.de
In reply to: Justin Pryzby (#3)
Re: crash 11.5~ (and 11.4)

Hi,

On 2019-08-07 14:29:28 -0500, Justin Pryzby wrote:

Just found this, although I'm not sure what to do about it. If it's corrupt
table data, I can restore from backup.

ts=# VACUUM FREEZE VERBOSE child.huawei_umts_ucell_201908;
INFO: 00000: aggressively vacuuming "child.huawei_umts_ucell_201908"
LOCATION: lazy_scan_heap, vacuumlazy.c:502
ERROR: XX001: found xmin 73850277 from before relfrozenxid 111408920
LOCATION: heap_prepare_freeze_tuple, heapam.c:6853

Ugh :(

We really need to add an error context to vacuumlazy that shows which
block is being processed.
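Roughly this shape, reusing the existing error-context-callback machinery
(just a sketch with made-up names, not a patch):

/* hypothetical argument struct for the callback */
typedef struct VacErrCbArg
{
    const char *relname;    /* relation being vacuumed */
    BlockNumber blkno;      /* block lazy_scan_heap is currently on */
} VacErrCbArg;

static void
vacuum_error_callback(void *arg)
{
    VacErrCbArg *info = (VacErrCbArg *) arg;

    errcontext("while scanning block %u of relation \"%s\"",
               info->blkno, info->relname);
}

/* in lazy_scan_heap(), before the main loop over blocks: */
    ErrorContextCallback errcallback;
    VacErrCbArg cbarg = {relname, InvalidBlockNumber};

    errcallback.callback = vacuum_error_callback;
    errcallback.arg = &cbarg;
    errcallback.previous = error_context_stack;
    error_context_stack = &errcallback;

    /* ... keep cbarg.blkno = blkno up to date inside the loop ... */
    /* ... and pop it afterwards: error_context_stack = errcallback.previous; */

Any ERROR raised while scanning would then carry a CONTEXT: line naming the
block, like the one heap_prepare_freeze_tuple throws here.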

I confirmed I updated to 11.4 immediately on its release:
[pryzbyj@database ~]$ ls -ltc /usr/pgsql-11/bin/postgres
-rwxr-xr-x. 1 root root 7291736 Jun 20 07:09 /usr/pgsql-11/bin/postgres

Any chance you might not have restarted postgres at that time?

That table would've been created shortly after midnight on Aug 1 when we loaded
first data for the month. So it was created and processed only on pg11.4,
although the parent has probably been around since pg_upgrade last October.

I don't think there's a way the parent could play a role.

Here's a usable-looking backtrace:

#0 heap_prepare_freeze_tuple (tuple=0x7fd3d8fa0058, relfrozenxid=111408920,
relminmxid=1846, cutoff_xid=111658731, cutoff_multi=1846, frz=0x223e930,
totally_frozen_p=0x7ffff5f3848f) at heapam.c:6832
changed = 211
xmax_already_frozen = 127
xmin_frozen = false
freeze_xmax = false
xid = 0
__func__ = "heap_prepare_freeze_tuple"

Hm. That's a backtrace to what precisely? Are you sure it's the
erroring call to heap_prepare_freeze_tuple? Because I think that's at
the function's start - which is why some of the stack variables have
bogus contents.

I think you'd need to set the breakpoint at heapam.c:6850 to be sure to
catch the error (while the error message says heapam.c:6853, that's because
the __LINE__ macro in some compilers expands to the end of the statement).
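i.e. something like this, after attaching gdb to the backend running the
VACUUM:

(gdb) break heapam.c:6850
(gdb) continue
# rerun the VACUUM FREEZE; once the breakpoint fires:
(gdb) bt full
(gdb) print *tuple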

The errfinish one seems not to be useful?

(gdb) bt
#0 errfinish (dummy=0) at elog.c:415
#1 0x000000000054aefd in ShowTransactionStateRec (
str=0xafed12 "StartTransaction", s=0xf9efa0) at xact.c:5157
#2 0x000000000054ad5c in ShowTransactionState (
str=0xafed12 "StartTransaction") at xact.c:5130
#3 0x00000000005470c8 in StartTransaction () at xact.c:1961
#4 0x0000000000547db9 in StartTransactionCommand () at xact.c:2734
#5 0x00000000008cbe65 in start_xact_command () at postgres.c:2500
#6 0x00000000008c9772 in exec_simple_query (
query_string=0x2160d28 "VACUUM FREEZE VERBOSE child.huawei_umts_ucell_201908;") at postgres.c:948
#7 0x00000000008cddf3 in PostgresMain (argc=1, argv=0x218c768,
dbname=0x218c5d0 "ts", username=0x218c5b0 "pryzbyj") at postgres.c:4182
#8 0x000000000082a098 in BackendRun (port=0x21818d0) at postmaster.c:4358
#9 0x0000000000829806 in BackendStartup (port=0x21818d0) at postmaster.c:4030
#10 0x0000000000825cab in ServerLoop () at postmaster.c:1707
#11 0x00000000008255dd in PostmasterMain (argc=3, argv=0x215b9a0)
at postmaster.c:1380
#12 0x000000000074ba30 in main (argc=3, argv=0x215b9a0) at main.c:228

That's probably just the errfinish for a debug message. Might be easier
to set a breakpoint to pg_re_throw().

I looked at this briefly:

ts=# SELECT * FROM page_header(get_raw_page('child.huawei_umts_ucell_201908', 0));
lsn | 3A49/66F44310
checksum | 0
flags | 4
lower | 36
upper | 1432
special | 8192
pagesize | 8192
version | 4
prune_xid | 0

That's not necessarily the target block though, right? It'd be useful
to get this for the block with the corruption, and perhaps one
before/after. If the backtrace I commented on at the top is correct,
the relevant tuple was at "blkno = 34915" (visible a frame or two
up).

Also, heap_page_items() for those pages would be useful - you might want
to leave out the t_data column, as it has raw tuple data.

Could you show
* pg_controldata
* SELECT oid, datname, datfrozenxid, datminmxid FROM pg_database;

That should be easy to redact if necessary. It'd be useful to know which
database the corrupted table is in.

It'd also be helpful to get something roughly like
SELECT
oid, oid::regclass,
relkind, reltoastrelid,
relfrozenxid, age(relfrozenxid),
relminmxid, mxid_age(relminmxid),
relpages,
(SELECT txid_current())
FROM pg_class
WHERE relfrozenxid <> 0;

But I'm not sure if including the tablenames would be problematic for
your case. Possibly a bit hard to infer "usage" correlations without,
but probably still worthwhile.

Greetings,

Andres Freund

#5 Justin Pryzby
pryzby@telsasoft.com
In reply to: Andres Freund (#4)
Re: crash 11.5~ (and 11.4)

On Wed, Aug 07, 2019 at 04:51:54PM -0700, Andres Freund wrote:

Hi,

On 2019-08-07 14:29:28 -0500, Justin Pryzby wrote:

Just found this, although I'm not sure what to do about it. If it's corrupt
table data, I can restore from backup.

In the meantime, I've renamed+uninherited the table and restored from backup,
with the broken table preserved.
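(Roughly like this; the name for the preserved broken copy is made up here:)

ALTER TABLE huawei_umts_ucell_metrics
    DETACH PARTITION child.huawei_umts_ucell_201908;
ALTER TABLE child.huawei_umts_ucell_201908
    RENAME TO huawei_umts_ucell_201908_broken;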

However, the previous day's backup did this, so I used Monday's backup
instead:

[pryzbyj@database ~]$ time sudo -u postgres pg_restore /srv/cdrperfbackup/ts/2019-08-06/curtables/ -t huawei_umts_ucell_201908 --verbose -d ts --host /tmp
pg_restore: connecting to database for restore
pg_restore: creating TABLE "child.huawei_umts_ucell_201908"
pg_restore: INFO: partition constraint for table "huawei_umts_ucell_201908" is implied by existing constraints
pg_restore: processing data for table "child.huawei_umts_ucell_201908"
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 62710; 0 2015128626 TABLE DATA huawei_umts_ucell_201908 telsasoft
pg_restore: [archiver (db)] COPY failed for table "huawei_umts_ucell_201908": ERROR: invalid byte sequence for encoding "UTF8": 0xe7 0x28 0x9d
CONTEXT: COPY huawei_umts_ucell_201908, line 104746
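To eyeball the offending bytes, something like this works against the
directory-format dump (the line number is approximate, since pg_restore
emits the COPY statement itself ahead of the data):

pg_restore -a -t huawei_umts_ucell_201908 /srv/cdrperfbackup/ts/2019-08-06/curtables/ \
    | sed -n 104746p | xxd | less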

Also, I found a core dump with this backtrace from my own manual invocation of ANALYZE.

(gdb) bt
#0 0x00000039674324f5 in raise () from /lib64/libc.so.6
#1 0x0000003967433cd5 in abort () from /lib64/libc.so.6
#2 0x0000000000a3c4b2 in ExceptionalCondition (conditionName=0xaad9e7 "!(j > attnum)", errorType=0xaad84c "FailedAssertion", fileName=0xaad840 "heaptuple.c", lineNumber=582) at assert.c:54
#3 0x00000000004856a7 in nocachegetattr (tuple=0x1c116b08, attnum=1, tupleDesc=0x23332b0) at heaptuple.c:582
#4 0x000000000062981d in std_fetch_func (stats=0x22700c8, rownum=54991, isNull=0x7ffff5f3844f) at analyze.c:1718
#5 0x000000000062aa99 in compute_scalar_stats (stats=0x22700c8, fetchfunc=0x62942b <std_fetch_func>, samplerows=120000, totalrows=227970) at analyze.c:2370
#6 0x00000000006270db in do_analyze_rel (onerel=0x7fd37e2afa50, options=2, params=0x7ffff5f38a70, va_cols=0x0, acquirefunc=0x627e9e <acquire_sample_rows>, relpages=75494, inh=false,
in_outer_xact=false, elevel=13) at analyze.c:579
#7 0x0000000000626616 in analyze_rel (relid=2015128626, relation=0x2161820, options=2, params=0x7ffff5f38a70, va_cols=0x0, in_outer_xact=false, bstrategy=0x21f7848) at analyze.c:310
#8 0x00000000006c59b2 in vacuum (options=2, relations=0x21f79c0, params=0x7ffff5f38a70, bstrategy=0x21f7848, isTopLevel=true) at vacuum.c:357
#9 0x00000000006c54e0 in ExecVacuum (vacstmt=0x2161910, isTopLevel=true) at vacuum.c:141
#10 0x00000000008d1d7e in standard_ProcessUtility (pstmt=0x21619d0, queryString=0x2160d28 "ANALYZE child. huawei_umts_ucell_201908;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x2161cc0, completionTag=0x7ffff5f39100 "") at utility.c:670

#3 0x00000000004856a7 in nocachegetattr (tuple=0x1c116b08, attnum=1, tupleDesc=0x23332b0) at heaptuple.c:582
natts = 554
j = 1
tup = 0x1c116b20
tp = 0x1c116b85 "\374\023\026\213^s#\347(\235=\326\321\067\032\245\321B\026}܋FS\375\244\003\065\336\277;\252O\006\065\320\353\211}F\237\373B\243\357J~\270\"\230ƣ\024xǍ\334\377\202\277S\031\375\351\003\220{\004"
bp = 0x1c116b37 "\270\027$U}\232\246\235\004\255\331\033\006Qp\376E\316h\376\363\247\366Նgy7\311E\224~F\274\023ϋ%\216,\221\331@\024\363\233\070\275\004\254L\217t\262X\227\352\346\347\371\070\321ш\221\350fc\316\r\356\351h\275\213\230\360\203\374\023\026\213^s#\347(\235=\326\321\067\032\245\321B\026}܋FS\375\244\003\065\336\277;\252O\006\065\320\353\211}F\237\373B\243\357J~\270\"\230ƣ\024xǍ\334\377\202\277S\031\375\351\003\220{\004"
slow = false
off = -1
#4 0x000000000062981d in std_fetch_func (stats=0x22700c8, rownum=54991, isNull=0x7ffff5f3844f) at analyze.c:1718
attnum = 2
tuple = 0x1c116b08
tupDesc = 0x23332b0
#5 0x000000000062aa99 in compute_scalar_stats (stats=0x22700c8, fetchfunc=0x62942b <std_fetch_func>, samplerows=120000, totalrows=227970) at analyze.c:2370
value = 546120480
isnull = false
i = 54991
null_cnt = 0
nonnull_cnt = 54991
toowide_cnt = 0
total_width = 274955
is_varlena = true
is_varwidth = true
corr_xysum = 3.9525251667299724e-323
ssup = {ssup_cxt = 0x22ae080, ssup_collation = 100, ssup_reverse = false, ssup_nulls_first = false, ssup_attno = 0, ssup_extra = 0x22ae5b0, comparator = 0xa06720 <varstrfastcmp_locale>,
abbreviate = false, abbrev_converter = 0, abbrev_abort = 0, abbrev_full_comparator = 0}
values = 0x204f9970
values_cnt = 54991
tupnoLink = 0x206ce5c0
track = 0x22ae198
track_cnt = 0
num_mcv = 100
num_bins = 100
mystats = 0x2270490

ts=# VACUUM FREEZE VERBOSE child.huawei_umts_ucell_201908;
INFO: 00000: aggressively vacuuming "child.huawei_umts_ucell_201908"
LOCATION: lazy_scan_heap, vacuumlazy.c:502
ERROR: XX001: found xmin 73850277 from before relfrozenxid 111408920
LOCATION: heap_prepare_freeze_tuple, heapam.c:6853

Ugh :(

We really need to add an error context to vacuumlazy that shows which
block is being processed.

I confirmed I updated to 11.4 immediately on its release:
[pryzbyj@database ~]$ ls -ltc /usr/pgsql-11/bin/postgres
-rwxr-xr-x. 1 root root 7291736 Jun 20 07:09 /usr/pgsql-11/bin/postgres

Any chance you might not have restarted postgres at that time?

I don't think so; nagios would've been yelling at me. Also, I found I have a
log of pg_settings, which shows server_version updated at 2019-06-20
07:05:01.425645. Is there a bug in 11.3 which could explain it?

Here's a usable-looking backtrace:

#0 heap_prepare_freeze_tuple (tuple=0x7fd3d8fa0058, relfrozenxid=111408920,
relminmxid=1846, cutoff_xid=111658731, cutoff_multi=1846, frz=0x223e930,
totally_frozen_p=0x7ffff5f3848f) at heapam.c:6832
changed = 211
xmax_already_frozen = 127
xmin_frozen = false
freeze_xmax = false
xid = 0
__func__ = "heap_prepare_freeze_tuple"

Hm. That's a backtrace to what precisely? Are you sure it's the
erroring call to heap_prepare_freeze_tuple? Because I think that's at
the function's start - which is why some of the stack variables have
bogus contents.

Yes, I had just done this: b heap_prepare_freeze_tuple.

I think you'd need to set the breakpoint at heapam.c:6850 to be sure to
catch the error (while the error message says heapam.c:6853, that's because
the __LINE__ macro in some compilers expands to the end of the statement).

I did 6848 and got:

#0 heap_prepare_freeze_tuple (tuple=0x7fd403a02058, relfrozenxid=111408920, relminmxid=1846, cutoff_xid=111665057, cutoff_multi=1846, frz=0x22aa6f8, totally_frozen_p=0x7ffff5f3848f) at heapam.c:6849
changed = false
xmax_already_frozen = false
xmin_frozen = false
freeze_xmax = false
xid = 73850277
__func__ = "heap_prepare_freeze_tuple"

The errfinish one seems not to be useful?

(gdb) bt
#0 errfinish (dummy=0) at elog.c:415
#1 0x000000000054aefd in ShowTransactionStateRec (
str=0xafed12 "StartTransaction", s=0xf9efa0) at xact.c:5157
[...]

That's probably just the errfinish for a debug message. Might be easier
to set a breakpoint to pg_re_throw().

Right, that works.

(gdb) bt
#0 pg_re_throw () at elog.c:1730
#1 0x0000000000a3cb1e in errfinish (dummy=0) at elog.c:467
#2 0x00000000004e72c4 in heap_prepare_freeze_tuple (tuple=0x7fd403a02058, relfrozenxid=111408920, relminmxid=1846, cutoff_xid=111665065, cutoff_multi=1846, frz=0x2260558, totally_frozen_p=0x7ffff5f3848f) at heapam.c:6850
#3 0x00000000006c8e6e in lazy_scan_heap (onerel=0x7fd37d5f99c8, options=13, vacrelstats=0x22603b8, Irel=0x2260500, nindexes=3, aggressive=true) at vacuumlazy.c:1151

I looked at this briefly:

ts=# SELECT * FROM page_header(get_raw_page('child.huawei_umts_ucell_201908', 0));
lsn | 3A49/66F44310
checksum | 0
flags | 4
lower | 36
upper | 1432
special | 8192
pagesize | 8192
version | 4
prune_xid | 0

That's not necessarily the target block though, right? It'd be useful
to get this for the block with the corruption, and perhaps one
before/after. If the backtrace I commented on at the top is correct,
the relevant tuple was at "blkno = 34915" (visible a frame or two
up).

Ah, right.

ts=# SELECT * FROM page_header(get_raw_page('child.huawei_umts_ucell_201908', 34914));
lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
---------------+----------+-------+-------+-------+---------+----------+---------+-----------
3A4A/4F2192B8 | 0 | 0 | 36 | 1520 | 8192 | 8192 | 4 | 0

ts=# SELECT * FROM page_header(get_raw_page('child.huawei_umts_ucell_201908', 34915));
3A4A/4F21C268 | 0 | 0 | 36 | 1520 | 8192 | 8192 | 4 | 0

ts=# SELECT * FROM page_header(get_raw_page('child.huawei_umts_ucell_201908', 34916));
3A4A/4F21DC90 | 0 | 0 | 36 | 1496 | 8192 | 8192 | 4 | 0

Also, heap_page_items() for those pages would be useful - you might want
to leave out the t_data column, as it has raw tuple data.

ts=# SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid, t_infomask2, t_infomask, t_hoff, t_bits, t_oid FROM heap_page_items(get_raw_page('child.huawei_umts_ucell_201908', 34914));
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits ... | t_oid

1 | 5952 | 1 | 2236 | 111659903 | 0 | 0 | (34914,1) | 554 | 2311 | 96 | 11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111001111111111111111111111011111111111111111011111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000 |
2 | 3736 | 1 | 2212 | 111659903 | 0 | 0 | (34914,2) | 554 | 2311 | 96 | 11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100000000011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111000111111111111111111111011111111111111111010111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000 |
3 | 1520 | 1 | 2216 | 111659903 | 0 | 0 | (34914,3) | 554 | 2311 | 96 | 11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100000001011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111000111111111111111111111011111111111111111011111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000 |

ts=# SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid, t_infomask2, t_infomask, t_hoff, t_bits, t_oid FROM heap_page_items(get_raw_page('child.huawei_umts_ucell_201908', 34915));
1 | 5976 | 1 | 2212 | 111659903 | 0 | 0 | (34915,1) | 554 | 2311 | 96 | 11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100000001011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111000111111111111111111111011111111111111111010111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000 |
2 | 3744 | 1 | 2232 | 111659903 | 0 | 0 | (34915,2) | 554 | 2311 | 96 | 11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111001111111111111111111111011111111111111111010111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000 |
3 | 1520 | 1 | 2220 | 111659903 | 0 | 0 | (34915,3) | 554 | 2311 | 96 | 11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000001011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111000111111111111111111111011111111111111111010111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000 |

ts=# SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid, t_infomask2, t_infomask, t_hoff, t_bits, t_oid FROM heap_page_items(get_raw_page('child.huawei_umts_ucell_201908', 34916));
1 | 5952 | 1 | 2236 | 111659903 | 0 | 0 | (34916,1) | 554 | 2311 | 96 | 11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111001111111111111111111111011111111111111111011111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000 |
2 | 3736 | 1 | 2212 | 111659903 | 0 | 0 | (34916,2) | 554 | 2311 | 96 | 11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100000001011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111000111111111111111111111011111111111111111010111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000 |
3 | 1496 | 1 | 2240 | 111659903 | 0 | 0 | (34916,3) | 554 | 2311 | 96 | 11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111001111111111111111111111011111111111111111010111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001100000001111111111111111111111111111111111111111111100111111000000 |

Could you show
* pg_controldata

[pryzbyj@database ~]$ sudo -u postgres ./src/postgresql.bin/bin/pg_controldata -D /var/lib/pgsql/11/data
[sudo] password for pryzbyj:
pg_control version number: 1100
Catalog version number: 201809051
Database system identifier: 6616072377454496350
Database cluster state: in production
pg_control last modified: Wed 07 Aug 2019 06:32:05 PM MDT
Latest checkpoint location: 3A4C/D1BB5BD8
Latest checkpoint's REDO location: 3A4C/CCC1DF90
Latest checkpoint's REDO WAL file: 0000000100003A4C000000CC
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:111664949
Latest checkpoint's NextOID: 2069503852
Latest checkpoint's NextMultiXactId: 1846
Latest checkpoint's NextMultiOffset: 3891
Latest checkpoint's oldestXID: 33682470
Latest checkpoint's oldestXID's DB: 16564
Latest checkpoint's oldestActiveXID: 111664945
Latest checkpoint's oldestMultiXid: 1484
Latest checkpoint's oldestMulti's DB: 16564
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint: Wed 07 Aug 2019 06:31:35 PM MDT
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: replica
wal_log_hints setting: off
max_connections setting: 200
max_worker_processes setting: 8
max_prepared_xacts setting: 0
max_locks_per_xact setting: 128
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0
Mock authentication nonce: ff863744b99b849579d561c5117157c78a8ced0563df824f3035f70810fec534

* SELECT oid, datname, datfrozenxid, datminmxid FROM pg_database;

ts=# SELECT oid, datname, datfrozenxid, datminmxid FROM pg_database;
oid | datname | datfrozenxid | datminmxid
-------+-----------+--------------+------------
13529 | template0 | 103564559 | 1846
16400 | template1 | 52362404 | 1486
16443 | postgres | 46060229 | 1485
16564 | ts | 33682470 | 1484
(4 rows)

That should be easy to redact if necessary. It'd be useful to know which
database the corrupted table is in.

It'd also be helpful to get something roughly like
SELECT oid, oid::regclass, relkind, reltoastrelid, relfrozenxid, age(relfrozenxid), relminmxid, mxid_age(relminmxid), relpages, (SELECT txid_current()) FROM pg_class WHERE relfrozenxid <> 0;

But I'm not sure if including the tablenames would be problematic for
your case. Possibly a bit hard to infer "usage" correlations without,
but probably still worthwhile.

I'll send it to you individually; I don't think it's secret, but it's 60k
lines long... Its OID verifies to me that the table was created on 8/1 and
not 7/1, which was possible. The week-old backup also verifies that.

Thanks,
Justin