assertion failure 9.3.4
With a client's code I have just managed to produce the following
assertion failure on 9.3.4:
2014-04-15 01:02:46 GMT [19854] 76299: LOG: execute <unnamed>:
select * from "asp_ins_event_task_log"( job_id:=1, event_id:=3164,
task_name:='EventUtcComputeTask', task_status_code:='VALID'
, task_start_utc:='04/15/2014 01:02:44.563',
task_end_utc:='04/15/2014 01:02:44.563')
TRAP: FailedAssertion("!(update_xact == ((TransactionId) 0))", File:
"/home/andrew/pgl/pg_9_3/src/backend/access/heap/heapam.c", Line: 5873)
2014-04-15 01:02:46 GMT [11959] 11: LOG: server process (PID 19854)
was terminated by signal 6: Aborted
2014-04-15 01:02:46 GMT [11959] 12: DETAIL: Failed process was
running: select * from "asp_ins_event_task_log"( job_id:=1,
event_id:=3164, task_name:='EventUtcComputeTask', task_status_code
:='VALID', task_start_utc:='04/15/2014 01:02:44.563',
task_end_utc:='04/15/2014 01:02:44.563')
2014-04-15 01:02:46 GMT [11959] 13: LOG: terminating any other
active server processes
When running without assertions, the client reports tables with
duplicate primary keys, among other things. It's apparently quite
reproducible.
I'm digging into this, but it's a nasty bug and any useful thoughts
would be appreciated.
cheers
andrew
On 04/14/2014 09:28 PM, Andrew Dunstan wrote:
With a client's code I have just managed to produce the following
assertion failure on 9.3.4:
2014-04-15 01:02:46 GMT [19854] 76299: LOG: execute <unnamed>:
select * from "asp_ins_event_task_log"( job_id:=1, event_id:=3164,
task_name:='EventUtcComputeTask', task_status_code:='VALID'
, task_start_utc:='04/15/2014 01:02:44.563',
task_end_utc:='04/15/2014 01:02:44.563')
TRAP: FailedAssertion("!(update_xact == ((TransactionId) 0))", File:
"/home/andrew/pgl/pg_9_3/src/backend/access/heap/heapam.c", Line:
5873)
2014-04-15 01:02:46 GMT [11959] 11: LOG: server process (PID 19854)
was terminated by signal 6: Aborted
2014-04-15 01:02:46 GMT [11959] 12: DETAIL: Failed process was
running: select * from "asp_ins_event_task_log"( job_id:=1,
event_id:=3164, task_name:='EventUtcComputeTask', task_status_code
:='VALID', task_start_utc:='04/15/2014 01:02:44.563',
task_end_utc:='04/15/2014 01:02:44.563')
2014-04-15 01:02:46 GMT [11959] 13: LOG: terminating any other
active server processes
When running without assertions, the client reports tables with
duplicate primary keys, among other things. It's apparently quite
reproducible.
I'm digging into this, but it's a nasty bug and any useful thoughts
would be appreciated.
and here the stack trace:
#0 0x000000361ba36285 in __GI_raise (sig=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x000000361ba37b9b in __GI_abort () at abort.c:91
#2 0x000000000075c157 in ExceptionalCondition
(conditionName=<optimized out>, errorType=<optimized out>,
fileName=<optimized out>, lineNumber=<optimized out>)
at /home/andrew/pgl/pg_9_3/src/backend/utils/error/assert.c:54
#3 0x000000000048c2af in MultiXactIdGetUpdateXid (xmax=<optimized
out>, t_infomask=<optimized out>) at
/home/andrew/pgl/pg_9_3/src/backend/access/heap/heapam.c:5873
#4 0x000000000078ad50 in HeapTupleSatisfiesMVCC
(tuple=0x7feb3be8a790, snapshot=0x1025d70, buffer=2172) at
/home/andrew/pgl/pg_9_3/src/backend/utils/time/tqual.c:1221
#5 0x000000000048aad2 in heapgetpage (scan=0x11522f0, page=6) at
/home/andrew/pgl/pg_9_3/src/backend/access/heap/heapam.c:397
#6 0x000000000048aee9 in heapgettup_pagemode (scan=0x11522f0,
dir=<optimized out>, nkeys=0, key=0x0) at
/home/andrew/pgl/pg_9_3/src/backend/access/heap/heapam.c:939
#7 0x000000000048d646 in heap_getnext (scan=0x11522f0,
direction=<optimized out>) at
/home/andrew/pgl/pg_9_3/src/backend/access/heap/heapam.c:1459
#8 0x00000000005becab in SeqNext (node=<optimized out>) at
/home/andrew/pgl/pg_9_3/src/backend/executor/nodeSeqscan.c:66
#9 0x00000000005ac66e in ExecScanFetch (recheckMtd=0x5bec70
<SeqRecheck>, accessMtd=0x5bec80 <SeqNext>, node=0x1151488) at
/home/andrew/pgl/pg_9_3/src/backend/executor/execScan.c:82
#10 ExecScan (node=0x1151488, accessMtd=0x5bec80 <SeqNext>,
recheckMtd=0x5bec70 <SeqRecheck>) at
/home/andrew/pgl/pg_9_3/src/backend/executor/execScan.c:167
#11 0x00000000005a5338 in ExecProcNode (node=0x1151488) at
/home/andrew/pgl/pg_9_3/src/backend/executor/execProcnode.c:400
#12 0x00000000005b9bcf in ExecLockRows (node=0x1151278) at
/home/andrew/pgl/pg_9_3/src/backend/executor/nodeLockRows.c:56
#13 0x00000000005a51d0 in ExecProcNode (node=0x1151278) at
/home/andrew/pgl/pg_9_3/src/backend/executor/execProcnode.c:496
#14 0x00000000005a250a in ExecutePlan (dest=0xb952e0,
direction=<optimized out>, numberTuples=1, sendTuples=1 '\001',
operation=CMD_SELECT, planstate=0x1151278, estate=0x1151038)
at /home/andrew/pgl/pg_9_3/src/backend/executor/execMain.c:1473
#15 standard_ExecutorRun (queryDesc=0x114f230, direction=<optimized
out>, count=1) at
/home/andrew/pgl/pg_9_3/src/backend/executor/execMain.c:307
#16 0x00007feb5cf2f27d in explain_ExecutorRun (queryDesc=0x114f230,
direction=ForwardScanDirection, count=1) at
/home/andrew/pgl/pg_9_3/contrib/auto_explain/auto_explain.c:233
#17 0x00007feb5cd2a235 in pgss_ExecutorRun (queryDesc=0x114f230,
direction=ForwardScanDirection, count=1) at
/home/andrew/pgl/pg_9_3/contrib/pg_stat_statements/pg_stat_statements.c:717
#18 0x00000000005c8b97 in _SPI_pquery (tcount=1, fire_triggers=0
'\000', queryDesc=<optimized out>) at
/home/andrew/pgl/pg_9_3/src/backend/executor/spi.c:2369
#19 _SPI_execute_plan (plan=0x10afac8, paramLI=0x114f028,
snapshot=0x0, crosscheck_snapshot=0x0, read_only=0 '\000',
fire_triggers=0 '\000', tcount=1)
at /home/andrew/pgl/pg_9_3/src/backend/executor/spi.c:2157
#20 0x00000000005c8fd1 in SPI_execute_snapshot (plan=0x10afac8,
Values=0x7fff147d8500, Nulls=0x7fff147d8700 " ", snapshot=0x0,
crosscheck_snapshot=0x0, read_only=0 '\000',
fire_triggers=0 '\000', tcount=1) at
/home/andrew/pgl/pg_9_3/src/backend/executor/spi.c:488
#21 0x0000000000722635 in ri_PerformCheck (riinfo=0xf98168,
qkey=0x7fff147d8a20, qplan=0x10afac8, fk_rel=0x7feb39048048,
pk_rel=0x7feb38ff9b40, old_tuple=0x0, new_tuple=0x7fff147d9010,
detectNewRows=0 '\000', expect_OK=5) at
/home/andrew/pgl/pg_9_3/src/backend/utils/adt/ri_triggers.c:3094
#22 0x0000000000722cfc in RI_FKey_check (trigdata=<optimized out>)
at /home/andrew/pgl/pg_9_3/src/backend/utils/adt/ri_triggers.c:440
#23 0x0000000000588add in ExecCallTriggerFunc
(trigdata=0x7fff147d8fe0, tgindx=4, finfo=0x11409a8, instr=0x0,
per_tuple_context=0x1198d28)
at /home/andrew/pgl/pg_9_3/src/backend/commands/trigger.c:1871
#24 0x0000000000588f6b in AfterTriggerExecute
(per_tuple_context=0x1198d28, instr=0x0, finfo=0x11408e8,
trigdesc=0x11401f8, rel=0x7feb39048048, event=0x114d038)
at /home/andrew/pgl/pg_9_3/src/backend/commands/trigger.c:3454
#25 afterTriggerInvokeEvents (events=0x1025620, firing_id=1,
estate=0x113ffb8, delete_ok=1 '\001') at
/home/andrew/pgl/pg_9_3/src/backend/commands/trigger.c:3633
#26 0x000000000058e873 in AfterTriggerEndQuery (estate=0x113ffb8) at
/home/andrew/pgl/pg_9_3/src/backend/commands/trigger.c:3823
#27 0x00000000005a26f4 in standard_ExecutorFinish
(queryDesc=0x1076d08) at
/home/andrew/pgl/pg_9_3/src/backend/executor/execMain.c:379
#28 0x00007feb5cf2f1a5 in explain_ExecutorFinish
(queryDesc=0x1076d08) at
/home/andrew/pgl/pg_9_3/contrib/auto_explain/auto_explain.c:256
#29 0x00007feb5cd2a163 in pgss_ExecutorFinish (queryDesc=0x1076d08)
at
/home/andrew/pgl/pg_9_3/contrib/pg_stat_statements/pg_stat_statements.c:740
#30 0x00000000005c8be4 in _SPI_pquery (tcount=2, fire_triggers=1
'\001', queryDesc=<optimized out>) at
/home/andrew/pgl/pg_9_3/src/backend/executor/spi.c:2381
#31 _SPI_execute_plan (plan=0x10905a8, paramLI=0x112ab10,
snapshot=0x0, crosscheck_snapshot=0x0, read_only=0 '\000',
fire_triggers=1 '\001', tcount=2)
at /home/andrew/pgl/pg_9_3/src/backend/executor/spi.c:2157
#32 0x00000000005c909f in SPI_execute_plan_with_paramlist
(plan=0x10905a8, params=0x112ab10, read_only=0 '\000', tcount=2) at
/home/andrew/pgl/pg_9_3/src/backend/executor/spi.c:449
#33 0x00007feb3908d877 in exec_stmt_execsql (estate=0x7fff147d9700,
stmt=0x10630f0) at
/home/andrew/pgl/pg_9_3/src/pl/plpgsql/src/pl_exec.c:3293
#34 0x00007feb3908ee9e in exec_stmt (stmt=0x10630f0,
estate=0x7fff147d9700) at
/home/andrew/pgl/pg_9_3/src/pl/plpgsql/src/pl_exec.c:1446
#35 exec_stmts (estate=0x7fff147d9700, stmts=<optimized out>) at
/home/andrew/pgl/pg_9_3/src/pl/plpgsql/src/pl_exec.c:1345
#36 0x00007feb3909157b in exec_stmt_block (estate=0x7fff147d9700,
block=0x1063c68) at
/home/andrew/pgl/pg_9_3/src/pl/plpgsql/src/pl_exec.c:1283
#37 0x00007feb390921dd in plpgsql_exec_function (func=0x106bbe8,
fcinfo=0x7fff147d9940) at
/home/andrew/pgl/pg_9_3/src/pl/plpgsql/src/pl_exec.c:321
#38 0x00007feb39086ff4 in plpgsql_call_handler
(fcinfo=0x7fff147d9940) at
/home/andrew/pgl/pg_9_3/src/pl/plpgsql/src/pl_handler.c:129
#39 0x00000000005aa632 in ExecMakeTableFunctionResult
(funcexpr=0x1122070, econtext=0x1121b98, expectedDesc=0x1121f18,
randomAccess=0 '\000')
at /home/andrew/pgl/pg_9_3/src/backend/executor/execQual.c:2164
#40 0x00000000005be0c1 in FunctionNext (node=0x1121a80) at
/home/andrew/pgl/pg_9_3/src/backend/executor/nodeFunctionscan.c:64
#41 0x00000000005ac74e in ExecScanFetch (recheckMtd=0x5be040
<FunctionRecheck>, accessMtd=0x5be050 <FunctionNext>, node=0x1121a80)
at /home/andrew/pgl/pg_9_3/src/backend/executor/execScan.c:82
#42 ExecScan (node=0x1121a80, accessMtd=0x5be050 <FunctionNext>,
recheckMtd=0x5be040 <FunctionRecheck>) at
/home/andrew/pgl/pg_9_3/src/backend/executor/execScan.c:132
#43 0x00000000005a52d8 in ExecProcNode (node=0x1121a80) at
/home/andrew/pgl/pg_9_3/src/backend/executor/execProcnode.c:426
#44 0x00000000005a250a in ExecutePlan (dest=0xeeaa40,
direction=<optimized out>, numberTuples=0, sendTuples=1 '\001',
operation=CMD_SELECT, planstate=0x1121a80, estate=0x1121968)
at /home/andrew/pgl/pg_9_3/src/backend/executor/execMain.c:1473
#45 standard_ExecutorRun (queryDesc=0x1029e60, direction=<optimized
out>, count=0) at
/home/andrew/pgl/pg_9_3/src/backend/executor/execMain.c:307
#46 0x00007feb5cf2f27d in explain_ExecutorRun (queryDesc=0x1029e60,
direction=ForwardScanDirection, count=0) at
/home/andrew/pgl/pg_9_3/contrib/auto_explain/auto_explain.c:233
#47 0x00007feb5cd2a235 in pgss_ExecutorRun (queryDesc=0x1029e60,
direction=ForwardScanDirection, count=0) at
/home/andrew/pgl/pg_9_3/contrib/pg_stat_statements/pg_stat_statements.c:717
#48 0x000000000068fe77 in PortalRunSelect (portal=0xf2d598,
forward=<optimized out>, count=0, dest=<optimized out>) at
/home/andrew/pgl/pg_9_3/src/backend/tcop/pquery.c:946
#49 0x00000000006913c1 in PortalRun (portal=0xf2d598,
count=9223372036854775807, isTopLevel=1 '\001', dest=0xeeaa40,
altdest=0xeeaa40, completionTag=0x7fff147da470 "")
at /home/andrew/pgl/pg_9_3/src/backend/tcop/pquery.c:790
#50 0x000000000068ec50 in exec_execute_message (max_rows=<optimized
out>, portal_name=0xeea628 "") at
/home/andrew/pgl/pg_9_3/src/backend/tcop/postgres.c:1931
#51 PostgresMain (argc=<optimized out>, argv=<optimized out>,
dbname=0xe3e0c8 "eventjobs", username=<optimized out>) at
/home/andrew/pgl/pg_9_3/src/backend/tcop/postgres.c:4060
#52 0x000000000064465c in BackendRun (port=0xe63490) at
/home/andrew/pgl/pg_9_3/src/backend/postmaster/postmaster.c:4011
#53 BackendStartup (port=0xe63490) at
/home/andrew/pgl/pg_9_3/src/backend/postmaster/postmaster.c:3685
#54 ServerLoop () at
/home/andrew/pgl/pg_9_3/src/backend/postmaster/postmaster.c:1586
#55 PostmasterMain (argc=<optimized out>, argv=<optimized out>) at
/home/andrew/pgl/pg_9_3/src/backend/postmaster/postmaster.c:1253
#56 0x000000000045c897 in main (argc=3, argv=0xe3d270) at
/home/andrew/pgl/pg_9_3/src/backend/main/main.c:226
cheers
andrew
Andrew Dunstan wrote:
and here the stack trace:
#0 0x000000361ba36285 in __GI_raise (sig=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x000000361ba37b9b in __GI_abort () at abort.c:91
#2 0x000000000075c157 in ExceptionalCondition
(conditionName=<optimized out>, errorType=<optimized out>,
fileName=<optimized out>, lineNumber=<optimized out>)
at /home/andrew/pgl/pg_9_3/src/backend/utils/error/assert.c:54
#3 0x000000000048c2af in MultiXactIdGetUpdateXid (xmax=<optimized
out>, t_infomask=<optimized out>) at
/home/andrew/pgl/pg_9_3/src/backend/access/heap/heapam.c:5873
#4 0x000000000078ad50 in HeapTupleSatisfiesMVCC
(tuple=0x7feb3be8a790, snapshot=0x1025d70, buffer=2172) at
/home/andrew/pgl/pg_9_3/src/backend/utils/time/tqual.c:1221
Clearly this is a bug related to multixacts and the related tqual.c
changes. Will look.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 04/14/2014 10:02 PM, Alvaro Herrera wrote:
Andrew Dunstan wrote:
and here the stack trace:
#0 0x000000361ba36285 in __GI_raise (sig=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x000000361ba37b9b in __GI_abort () at abort.c:91
#2 0x000000000075c157 in ExceptionalCondition
(conditionName=<optimized out>, errorType=<optimized out>,
fileName=<optimized out>, lineNumber=<optimized out>)
at /home/andrew/pgl/pg_9_3/src/backend/utils/error/assert.c:54
#3 0x000000000048c2af in MultiXactIdGetUpdateXid (xmax=<optimized
out>, t_infomask=<optimized out>) at
/home/andrew/pgl/pg_9_3/src/backend/access/heap/heapam.c:5873
#4 0x000000000078ad50 in HeapTupleSatisfiesMVCC
(tuple=0x7feb3be8a790, snapshot=0x1025d70, buffer=2172) at
/home/andrew/pgl/pg_9_3/src/backend/utils/time/tqual.c:1221
Clearly this is a bug related to multixacts and the related tqual.c
changes. Will look.
OK. If you need me to test patches I can do so.
cheers
andrew
So, from top to bottom I see the following elements:
* backend is executing a query
* this query is getting captured by pg_stat_statements
* the query is also getting captured by autoexplain, in chain from
pg_stat_statements
* autoexplain runs the query, which invokes a plpgsql function
* this plpgsql function runs another query, and this one is captured by
pg_stat_statements
* and also by autoexplain
* The autoexplain run of this inner query invokes a trigger
* .. which is a FK trigger, and therefore runs ri_PerformCheck which
runs another query
* This FK check query gets captured by pg_stat_statements
* and also by autoexplain, which runs it
* this autoexplain run of the third query invokes SeqNext
* this does a heap access, which uses HeapTupleSatisfiesMVCC
* this one tries to read the update XID and gets an assertion failure.
Oh my.
Would it be possible for you to provide a self-contained setup that
behaves like this? I think a "bt full" would be useful since it'd
provide the queries at each of the three stages.
I'm not quite clear on why the third query, the one in ri_PerformCheck,
is invoking a sequence. AFAICS it's this:
/* ----------
* The query string built is
* SELECT 1 FROM ONLY <pktable> x WHERE pkatt1 = $1 [AND ...]
* FOR KEY SHARE OF x
* The type id's for the $ parameters are those of the
* corresponding FK attributes.
* ----------
*/
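For concreteness, assuming the FK points at an event table keyed on
event_id (a guess for illustration, not the client's actual schema), the
generated statement for the failing call above would be roughly:

    SELECT 1 FROM ONLY public.event x WHERE event_id = 3164
        FOR KEY SHARE OF x;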
so either I'm misreading the whole thing, or there is something very odd
going on here.
Are you aware of something unusual in the FK setups?
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
I'm not quite clear on why the third query, the one in ri_PerformCheck,
is invoking a sequence.
It's not --- SeqNext is the next-tuple function for a sequential scan.
Nothing to do with sequences.
Now, it *is* worth wondering why the heck a query on the table's primary
key is using a seqscan and not an indexscan. Maybe the planner thinks
there are just a few rows in the table? But the stack trace seems
unexceptional other than that.
I'm wondering if the combination of autoexplain and pg_stat_statements
is causing trouble.
Yeah, it would be real nice to see a self-contained test case for this.
regards, tom lane
On 04/16/2014 07:19 PM, Tom Lane wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
I'm not quite clear on why the third query, the one in ri_PerformCheck,
is invoking a sequence.
It's not --- SeqNext is the next-tuple function for a sequential scan.
Nothing to do with sequences.
Now, it *is* worth wondering why the heck a query on the table's primary
key is using a seqscan and not an indexscan. Maybe the planner thinks
there are just a few rows in the table? But the stack trace seems
unexceptional other than that.
I'm wondering if the combination of autoexplain and pg_stat_statements
is causing trouble.
Yeah, it would be real nice to see a self-contained test case for this.
Well, that might be hard to put together, but I did try running without
pg_stat_statements and auto_explain loaded and the error did not occur.
Not sure where that gets us in terms of deciding on a culprit.
cheers
andrew
Andrew Dunstan <andrew@dunslane.net> writes:
On 04/16/2014 07:19 PM, Tom Lane wrote:
Yeah, it would be real nice to see a self-contained test case for this.
Well, that might be hard to put together, but I did try running without
pg_stat_statements and auto_explain loaded and the error did not occur.
Not sure where that gets us in terms of deciding on a culprit.
Could we at least see the exact parameter settings for pg_stat_statements
and auto_explain? (And any other GUCs with nondefault values?)
regards, tom lane
Andrew Dunstan wrote:
On 04/16/2014 07:19 PM, Tom Lane wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
I'm not quite clear on why the third query, the one in ri_PerformCheck,
is invoking a sequence.
It's not --- SeqNext is the next-tuple function for a sequential scan.
Nothing to do with sequences.
Now, it *is* worth wondering why the heck a query on the table's primary
key is using a seqscan and not an indexscan. Maybe the planner thinks
there are just a few rows in the table? But the stack trace seems
unexceptional other than that.
I'm wondering if the combination of autoexplain and pg_stat_statements
is causing trouble.
Yeah, it would be real nice to see a self-contained test case for this.
Well, that might be hard to put together, but I did try running
without pg_stat_statements and auto_explain loaded and the error did
not occur.
Well, can you get us the queries that are being run, and the schemas
involved in those queries?
Not sure where that gets us in terms of deciding on a
culprit.
I suspect, in the blind, that autoexplain is executing a query that
causes a second update XID to be added to a multixact (which already has
another one), perhaps with the same snapshot as the original update.
Why would this be happening in a FK-check query? No idea.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 04/16/2014 10:28 PM, Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
On 04/16/2014 07:19 PM, Tom Lane wrote:
Yeah, it would be real nice to see a self-contained test case for this.
Well, that might be hard to put together, but I did try running without
pg_stat_statements and auto_explain loaded and the error did not occur.
Not sure where that gets us in terms of deciding on a culprit.
Could we at least see the exact parameter settings for pg_stat_statements
and auto_explain? (And any other GUCs with nondefault values?)
Here are all the settings from the run that failed:
listen_addresses = '*'
port = 5432
fsync = on
synchronous_commit = off
checkpoint_segments = 128
checkpoint_completion_target = 0.9
shared_buffers = 512MB
max_connections = 300
work_mem = 128MB
maintenance_work_mem = 32MB
effective_cache_size = 16GB
effective_io_concurrency = 2
logging_collector = on
log_destination = 'stderr'
log_filename = 'postgresql-%a.log'
log_rotation_size = 0
log_truncate_on_rotation = on
log_line_prefix = '%t [%p] %l: '
log_connections = on
log_disconnections = on
log_statement = 'all'
track_activity_query_size = 10240
shared_preload_libraries = 'auto_explain,pg_stat_statements'
As you can see, auto_explain's log_min_duration hasn't been set, so it
shouldn't be doing anything very much, I should think.
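For what it's worth, the effective settings for both modules can be
double-checked on the failing instance with something like:

    select name, setting
    from pg_settings
    where name like 'auto_explain.%'
       or name like 'pg_stat_statements.%';

which should show auto_explain.log_min_duration still at its default of
-1 (i.e. disabled).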
cheers
andrew
All,
So, I've encountered a 2nd report of this issue, or of an issue which
sounds very similar:
- corruption in two "queue" tables
- the tables are written in a high-concurrency, lock-contested environment
- user uses SELECT FOR UPDATE with these tables.
- pg_stat_statements .so is loaded, but the extension is not installed
- four rows were added to the queue tables, while not being added to the
PK index. This allowed duplicate PKs to be added.
Currently the user is testing with pg_stat_statements.so not loaded to
see if the problem goes away. They have a destruction test environment,
so we should be able to confirm or deny this in a couple of days.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On Thu, Apr 17, 2014 at 7:15 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
track_activity_query_size = 10240
shared_preload_libraries = 'auto_explain,pg_stat_statements'
As you can see, auto_explain's log_min_duration hasn't been set, so it
shouldn't be doing anything very much, I should think.
track_activity_query_size is 10240? Not 1024?
--
Peter Geoghegan
On 04/17/2014 09:04 PM, Peter Geoghegan wrote:
On Thu, Apr 17, 2014 at 7:15 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
track_activity_query_size = 10240
shared_preload_libraries = 'auto_explain,pg_stat_statements'
As you can see, auto_explain's log_min_duration hasn't been set, so it
shouldn't be doing anything very much, I should think.
track_activity_query_size is 10240? Not 1024?
yes.
I normally find 1024 is far too small.
cheers
andrew
On 04/17/2014 10:15 AM, Andrew Dunstan wrote:
On 04/16/2014 10:28 PM, Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
On 04/16/2014 07:19 PM, Tom Lane wrote:
Yeah, it would be real nice to see a self-contained test case for
this.
Well, that might be hard to put together, but I did try running without
pg_stat_statements and auto_explain loaded and the error did not occur.
Not sure where that gets us in terms of deciding on a culprit.
Could we at least see the exact parameter settings for
pg_stat_statements
and auto_explain? (And any other GUCs with nondefault values?)
Here are all the settings from the run that failed:
listen_addresses = '*'
port = 5432
fsync = on
synchronous_commit = off
checkpoint_segments = 128
checkpoint_completion_target = 0.9
shared_buffers = 512MB
max_connections = 300
work_mem = 128MB
maintenance_work_mem = 32MB
effective_cache_size = 16GB
effective_io_concurrency = 2
logging_collector = on
log_destination = 'stderr'
log_filename = 'postgresql-%a.log'
log_rotation_size = 0
log_truncate_on_rotation = on
log_line_prefix = '%t [%p] %l: '
log_connections = on
log_disconnections = on
log_statement = 'all'
track_activity_query_size = 10240
shared_preload_libraries = 'auto_explain,pg_stat_statements'
As you can see, auto_explain's log_min_duration hasn't been set, so it
shouldn't be doing anything very much, I should think.
There definitely seems to be something going on involving these two
pre-loaded modules. With both auto_explain and pg_stat_statements
preloaded I can reproduce the error fairly reliably. I have also
reproduced it, but less reliably, with auto_explain alone loaded. I have
not reproduced it with pg_stat_statements alone loaded, but Josh Berkus
has reported that another client has experienced something very similar
(duplicate PK errors on a non-assertion-enabled build) with
pg_stat_statements alone loaded.
cheers
andrew
On 04/18/2014 09:42 AM, Andrew Dunstan wrote:
There definitely seems to be something going on involving these two
pre-loaded modules. With both auto_explain and pg_stat_statements
preloaded I can reproduce the error fairly reliably. I have also
reproduced it, but less reliably, with auto_explain alone loaded. I have
not reproduced it with pg_stat_statements alone loaded, but Josh Berkus
has reported that another client has experienced something very similar
(duplicate PK errors on a non-assertion-enabled build) with
pg_stat_statements alone loaded.
Correct. However, this seems to produce the issue less "reliably";
that is, it takes several hours of load testing for a duplicate PK to
show up, so I suspect that with pg_stat_statements alone it's also a
timing issue. I'm still waiting on the results with
pg_stat_statements.so NOT loaded to confirm that we're not actually
getting two separate bugs with similar symptoms.
The second site does not have any increase in query size. Their only
settings for pg_s_s are:
pg_stat_statements.max = 10000
pg_stat_statements.track = all
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
All,
1) I've confirmed at the 2nd site that the issue doesn't happen if
pg_stat_statements.so is not loaded. So this seems to be confirmation
that either auto_explain, pg_stat_statements, or both need to be loaded
(but not necessarily created as extensions) in order to have the issue.
2) I've been unable to reproduce the issue using a naive test case, but
I'll keep trying.
3) The XID information is interesting.
a) I've confirmed that this is a case of having multiple row versions
rather than allowing a duplicate PK to be inserted.
b) the end result of this corruption is XIDs which go backwards:
xmin | xmax | ctid | step_id | workitem_id | status
---------+---------+-----------+---------+-------------+---------
3362707 | 3362707 | (6177,31) | 1 | 446469 | pending
3362710 | 3362707 | (6177,32) | 1 | 446469 | working
5520953 | 5520953 | (5064,105) | 1 | 727946 | pending
5520954 | 5520953 | (5064,108) | 1 | 727946 | working
What's additionally problematic is that the current snapshot minxid is
in the 9000000 range, so it's not clear why any of the above rows are
visible at all.
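For reference, rows like the ones above can be pulled out together with
their system columns using a query along these lines (the table and key
column names are placeholders, not the real schema):

    select xmin, xmax, ctid, step_id, workitem_id, status
    from work_queue
    where (step_id, workitem_id) in (
            select step_id, workitem_id
            from work_queue
            group by step_id, workitem_id
            having count(*) > 1)
    order by workitem_id, ctid;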
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
* Josh Berkus (josh@agliodbs.com) wrote:
1) I've confirmed at the 2nd site that the issue doesn't happen if
pg_stat_statements.so is not loaded. So this seems to be confirmation
that either auto_explain, pg_stat_statements, or both need to be loaded
(but not necessarily created as extensions) in order to have the issue.
Interesting...
2) I've been unable to reproduce the issue using a naive test case, but
I'll keep trying.
Thanks.
3) The XID information is interesting.
a) I've confirmed that this is a case of having multiple row versions
rather than allowing a duplicate PK to be inserted.
b) the end result of this corruption is XIDs which go backwards:
xmin | xmax | ctid | step_id | workitem_id | status
---------+---------+-----------+---------+-------------+---------
3362707 | 3362707 | (6177,31) | 1 | 446469 | pending
3362710 | 3362707 | (6177,32) | 1 | 446469 | working
5520953 | 5520953 | (5064,105) | 1 | 727946 | pending
5520954 | 5520953 | (5064,108) | 1 | 727946 | working
What's additionally problematic is that the current snapshot minxid is
in the 9000000 range, so it's not clear why any of the above rows are
visible at all.
Can you get the infomask bits? What does pg_controldata report wrt
the MultiXids?
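If the pageinspect extension can be installed there, something along
these lines should show the raw infomask bits for one of the affected
pages (block 5064, taken from the ctids in your output; the table name is
a placeholder):

    create extension if not exists pageinspect;
    select lp, t_ctid, t_xmin, t_xmax,
           to_hex(t_infomask) as infomask,
           to_hex(t_infomask2) as infomask2
    from heap_page_items(get_raw_page('work_queue', 5064));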
Thanks,
Stephen
Can you get the infomask bits? What does pg_controldata report wrt
the MultiXids?
Can't get the infomask bits.
pg_controldata attached, with some redactions. Unfortunately, it
appears that they've continued to do tests on this system, so the XID
counter has advanced somewhat.
pg_control version number: 937
Catalog version number: {redacted}
Database system identifier: {redacted}
Database cluster state: in production
pg_control last modified: Mon 21 Apr 2014 04:28:30 PM UTC
Latest checkpoint location: 8/C8B05F98
Prior checkpoint location: 8/C663A4B0
Latest checkpoint's REDO location: 8/C6DC9A10
Latest checkpoint's REDO WAL file: 0000000300000008000000C6
Latest checkpoint's TimeLineID: 3
Latest checkpoint's PrevTimeLineID: 3
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0/11214375
Latest checkpoint's NextOID: 32769
Latest checkpoint's NextMultiXactId: 28939
Latest checkpoint's NextMultiOffset: 58004
Latest checkpoint's oldestXID: 674
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 11214375
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Time of latest checkpoint: Mon 21 Apr 2014 04:21:02 PM UTC
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
Current wal_level setting: hot_standby
Current max_connections setting: 100
Current max_prepared_xacts setting: 0
Current max_locks_per_xact setting: 64
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes:
1) I've confirmed at the 2nd site that the issue doesn't happen if
pg_stat_statements.so is not loaded. So this seems to be confirmation
that either auto_explain, pg_stat_statements, or both need to be loaded
(but not necessarily created as extensions) in order to have the issue.
I looked a little bit for possible interactions between those two modules,
and almost immediately noticed that pg_stat_statements is not being polite
enough to call any previous occupant of the post_parse_analyze_hook.
That's not the bug we're looking for, since auto_explain does not use
post_parse_analyze_hook, but it's a bug nonetheless.
regards, tom lane
Hi,
I spent the last two hours poking around in the environment Andrew
provided and I was able to reproduce the issue, find an assert to
reproduce it much faster and find a possible root cause.
Since the symptom of the problem seems to be multixacts with more than
one updating xid, I added a check to MultiXactIdCreateFromMembers()
preventing that. That requires moving ISUPDATE_from_mxstatus() to a
header, but I think we should definitely add such an assert.
As it turns out the problem is in the
else if (result == HeapTupleBeingUpdated && wait)
branch in (at least) heap_update(). When the problem is hit the
to-be-updated tuple originally has HEAP_XMIN_COMMITTED |
HEAP_XMAX_LOCK_ONLY | HEAP_XMAX_KEYSHR_LOCK set. So we release the
buffer lock, acquire the tuple lock, and reacquire the buffer lock. But
in between, the locking backend has actually updated the tuple.
The code tries to protect against that with:
    /*
     * recheck the locker; if someone else changed the tuple while
     * we weren't looking, start over.
     */
    if ((oldtup.t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
        !TransactionIdEquals(
            HeapTupleHeaderGetRawXmax(oldtup.t_data),
            xwait))
        goto l2;

    can_continue = true;
    locker_remains = true;
and similar. The problem is that in Andrew's case the infomask changes
from 0x2192 to 0x2102 (i.e. it's a normal update afterwards), while xmax
stays the same. Ooops.
A bit later there's:
result = can_continue ? HeapTupleMayBeUpdated : HeapTupleUpdated;
So, from thereon we happily continue to update the tuple, thinking
there's no previous updater. Which obviously causes problems.
I've hacked^Wfixed this by changing the infomask test above into
infomask != oldtup.t_data->t_infomask in a couple of places. That seems
to be sufficient to survive the testcase a couple of times.
I am too hungry right now to think about a proper fix for this and
whether there are further problematic areas.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes:
I spent the last two hours poking around in the environment Andrew
provided and I was able to reproduce the issue, find an assert to
reproduce it much faster and find a possible root cause.
Hmm ... is this the same thing Josh is reporting? If so, why the
apparent connection with use of pg_stat_statements? Maybe that
just changes the timing to make the race condition more likely?
regards, tom lane
On 04/21/2014 12:26 PM, Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
I spent the last two hours poking around in the environment Andrew
provided and I was able to reproduce the issue, find an assert to
reproduce it much faster and find a possible root cause.
Hmm ... is this the same thing Josh is reporting? If so, why the
apparent connection with use of pg_stat_statements? Maybe that
just changes the timing to make the race condition more likely?
Well, in Andrew's case it only happens if auto_explain is loaded.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On 04/21/2014 03:26 PM, Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
I spent the last two hours poking around in the environment Andrew
provided and I was able to reproduce the issue, find an assert to
reproduce it much faster and find a possible root cause.
Hmm ... is this the same thing Josh is reporting? If so, why the
apparent connection with use of pg_stat_statements? Maybe that
just changes the timing to make the race condition more likely?
That's my theory. Josh's case is very similar indeed to mine.
cheers
andrew
On 2014-04-21 15:26:03 -0400, Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
I spent the last two hours poking around in the environment Andrew
provided and I was able to reproduce the issue, find an assert to
reproduce it much faster and find a possible root cause.
Hmm ... is this the same thing Josh is reporting? If so, why the
apparent connection with use of pg_stat_statements? Maybe that
just changes the timing to make the race condition more likely?
I am not sure. I could only reproduce the problem with both
stat_statements and auto_explain loaded, but I didn't try very
long. It's pretty timing sensitive, so I could very well imagine that
some additional overhead makes it easier to reproduce.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2014-04-21 12:31:09 -0700, Josh Berkus wrote:
On 04/21/2014 12:26 PM, Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
I spent the last two hours poking around in the environment Andrew
provided and I was able to reproduce the issue, find an assert to
reproduce it much faster and find a possible root cause.
Hmm ... is this the same thing Josh is reporting? If so, why the
apparent connection with use of pg_stat_statements? Maybe that
just changes the timing to make the race condition more likely?
Well, in Andrew's case it only happens if auto_explain is loaded.
Josh, how long does it take you to reproduce the issue? And can you
reproduce it outside of a production environment?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Josh, how long does it take you to reproduce the issue?
A couple hours.
And can you
reproduce it outside of a production environment?
Not yet, still working on that.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On 04/21/2014 02:54 PM, Andres Freund wrote:
Hi,
I spent the last two hours poking around in the environment Andrew
provided and I was able to reproduce the issue, find an assert to
reproduce it much faster and find a possible root cause.
What's the assert that makes it happen faster? That might help a lot in
constructing a self-contained test.
cheers
andrew
All,
I've taken a stab at creating a reproducible test case based on the
characteristics of the production issues I'm seeing. But clearly
there's an element I'm missing, because I'm not able to produce the bug
with a pgbench-based test case.
My current test has FKs, updates to both FK'd tables, updates to both
indexed and non-indexed columns, multiple updates in the same
transaction, and lock-blocking.
Files are attached in case someone has better ideas. queue_bench.sql is
the setup file, and then you do:
pgbench -n -T 600 -c 15 -j 5 -f queue_adder.bench -f queue_worker.bench
-f queue_worker.bench -f queue_worker.bench -f queue_worker.bench {dbname}
... or whatever levels of c and j make sense on your hardware.
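A worker script for this kind of test can look roughly like the
following. This is only an illustrative sketch with made-up table and
column names, not the attached queue_worker.bench:

    \setrandom id 1 100000
    begin;
    select * from queue_parent where parent_id = :id for update;
    update queue_item set status = 'working' where parent_id = :id and status = 'pending';
    update queue_item set payload = payload || 'x' where parent_id = :id and status = 'working';
    update queue_parent set last_touched = now() where parent_id = :id;
    commit;

(Each SQL command is kept on a single line, since 9.3's pgbench reads one
command per line.)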
FWIW, this particular test case might make a nice destruction test case
for replication testing, too.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On 2014-04-21 19:43:15 -0400, Andrew Dunstan wrote:
On 04/21/2014 02:54 PM, Andres Freund wrote:
Hi,
I spent the last two hours poking around in the environment Andrew
provided and I was able to reproduce the issue, find an assert to
reproduce it much faster and find a possible root cause.
What's the assert that makes it happen faster? That might help a lot in
constructing a self-contained test.
Assertion and *preliminary*, *hacky* fix attached.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
0001-Error-out-when-creating-a-multixact-with-more-than-o.patch (text/x-patch; charset=us-ascii)
From 40c09d8a85db137caf42c2dfe36adda63a381c75 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 22 Apr 2014 08:42:11 +0200
Subject: [PATCH 1/2] Error out when creating a multixact with more than one
updater.
---
src/backend/access/heap/heapam.c | 3 ---
src/backend/access/transam/multixact.c | 15 +++++++++++++++
src/include/access/multixact.h | 3 +++
3 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 2605f20..bd26b80 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -190,9 +190,6 @@ static const int MultiXactStatusLock[MaxMultiXactStatus + 1] =
/* Get the LockTupleMode for a given MultiXactStatus */
#define TUPLOCK_from_mxstatus(status) \
(MultiXactStatusLock[(status)])
-/* Get the is_update bit for a given MultiXactStatus */
-#define ISUPDATE_from_mxstatus(status) \
- ((status) > MultiXactStatusForUpdate)
/* ----------------------------------------------------------------
* heap support routines
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index d4ad678..4ff1e58 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -713,6 +713,21 @@ MultiXactIdCreateFromMembers(int nmembers, MultiXactMember *members)
return multi;
}
+ {
+ int i;
+ TransactionId update_xact = InvalidTransactionId;
+
+ for (i = 0; i < nmembers; i++)
+ {
+ /* Ignore lockers */
+ if (!ISUPDATE_from_mxstatus(members[i].status))
+ continue;
+ if (update_xact != InvalidTransactionId)
+ elog(ERROR, "new multixact has more than one updating member");
+ update_xact = members[i].xid;
+ }
+ }
+
/*
* Assign the MXID and offsets range to use, and make sure there is space
* in the OFFSETs and MEMBERs files. NB: this routine does
diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h
index 9729f27..95ffe0e 100644
--- a/src/include/access/multixact.h
+++ b/src/include/access/multixact.h
@@ -48,6 +48,9 @@ typedef enum
#define MaxMultiXactStatus MultiXactStatusUpdate
+/* Get the is_update bit for a given MultiXactStatus */
+#define ISUPDATE_from_mxstatus(status) \
+ ((status) > MultiXactStatusForUpdate)
typedef struct MultiXactMember
{
--
1.8.5.rc2.dirty
0002-preliminary-fix-for-corruption-issue.patch (text/x-patch; charset=us-ascii)
From ebd4f834b7cdccb9a9fe336dd981811e26bfc271 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 22 Apr 2014 08:42:44 +0200
Subject: [PATCH 2/2] preliminary fix for corruption issue.
---
src/backend/access/heap/heapam.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index bd26b80..9c075c4 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2722,7 +2722,7 @@ l1:
* update this tuple before we get to this point. Check for xmax
* change, and start over if so.
*/
- if (!(tp.t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
+ if (tp.t_data->t_infomask != infomask ||
!TransactionIdEquals(HeapTupleHeaderGetRawXmax(tp.t_data),
xwait))
goto l1;
@@ -2748,7 +2748,7 @@ l1:
* other xact could update this tuple before we get to this point.
* Check for xmax change, and start over if so.
*/
- if ((tp.t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
+ if (tp.t_data->t_infomask != infomask ||
!TransactionIdEquals(HeapTupleHeaderGetRawXmax(tp.t_data),
xwait))
goto l1;
@@ -3275,7 +3275,7 @@ l2:
* update this tuple before we get to this point. Check for xmax
* change, and start over if so.
*/
- if (!(oldtup.t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
+ if (oldtup.t_data->t_infomask != infomask ||
!TransactionIdEquals(HeapTupleHeaderGetRawXmax(oldtup.t_data),
xwait))
goto l2;
@@ -3329,7 +3329,7 @@ l2:
* recheck the locker; if someone else changed the tuple while
* we weren't looking, start over.
*/
- if ((oldtup.t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
+ if (oldtup.t_data->t_infomask != infomask ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(oldtup.t_data),
xwait))
@@ -3350,7 +3350,7 @@ l2:
* some other xact could update this tuple before we get to
* this point. Check for xmax change, and start over if so.
*/
- if ((oldtup.t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
+ if (oldtup.t_data->t_infomask != infomask ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(oldtup.t_data),
xwait))
@@ -4354,7 +4354,7 @@ l3:
* over.
*/
LockBuffer(*buffer, BUFFER_LOCK_EXCLUSIVE);
- if (!(tuple->t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
+ if (tuple->t_data->t_infomask != infomask ||
!TransactionIdEquals(HeapTupleHeaderGetRawXmax(tuple->t_data),
xwait))
{
@@ -4373,7 +4373,7 @@ l3:
LockBuffer(*buffer, BUFFER_LOCK_EXCLUSIVE);
/* if the xmax changed in the meantime, start over */
- if ((tuple->t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
+ if (tuple->t_data->t_infomask != infomask ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(tuple->t_data),
xwait))
@@ -4441,7 +4441,7 @@ l3:
* could update this tuple before we get to this point. Check
* for xmax change, and start over if so.
*/
- if (!(tuple->t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
+ if (tuple->t_data->t_infomask != infomask ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(tuple->t_data),
xwait))
@@ -4497,7 +4497,7 @@ l3:
* some other xact could update this tuple before we get to
* this point. Check for xmax change, and start over if so.
*/
- if ((tuple->t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
+ if (tuple->t_data->t_infomask != infomask ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(tuple->t_data),
xwait))
--
1.8.5.rc2.dirty
Andres Freund wrote:
On 2014-04-21 19:43:15 -0400, Andrew Dunstan wrote:
On 04/21/2014 02:54 PM, Andres Freund wrote:
Hi,
I spent the last two hours poking around in the environment Andrew
provided and I was able to reproduce the issue, find an assert to
reproduce it much faster and find a possible root cause.
What's the assert that makes it happen faster? That might help a lot in
constructing a self-contained test.
Assertion and *preliminary*, *hacky* fix attached.
Thanks for the analysis and patches. I've been playing with this on my
own a bit, and one thing that I just noticed is that at least for
heap_update I cannot reproduce a problem when the xmax is originally a
multixact, so AFAICT there aren't as many places that need to be
patched.
Some testing later, I think the issue only occurs if we determine that
we don't need to wait for the xid/multi to complete, because otherwise
the wait itself saves us. (It's easy to cause the problem by adding a
breakpoint in heapam.c:3325, i.e. just before re-acquiring the buffer
lock, and then having transaction A lock for key share, then transaction
B update the tuple which stops at the breakpoint, then transaction A
also update the tuple, and finally release transaction B).
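Spelled out as SQL, those steps look roughly like this (a one-row table
t(id int primary key, val int) is assumed; val must not be part of any
key, so session B's update does not conflict with the key share lock and
therefore does not wait):

    -- session A
    begin;
    select * from t where id = 1 for key share;

    -- session B: runs the update and is paused at the gdb breakpoint
    -- described above, just before re-acquiring the buffer lock
    begin;
    update t set val = val + 1 where id = 1;

    -- session A again: also update the tuple
    update t set val = val + 1 where id = 1;

    -- finally, let session B continue past the breakpoint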
For now I offer a cleaned up version of your patch to add the assertion
that multis don't contain multiple updates. I considered the idea of
making this #ifdef USE_ASSERT_CHECKING, because it has to walk the
complete array of members; and then have full elogs in MultiXactIdExpand
and MultiXactIdCreate, which are lighter because they can check more
easily. But on second thoughts I refrained from doing that, because
surely the arrays are not as large anyway, are they.
I think I should push this patch first, so that Andrew and Josh can try
their respective test cases which should start throwing errors, then
push the actual fixes. Does that sound okay?
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
complain-multi-updates.patch (text/x-diff; charset=us-ascii)
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***************
*** 190,198 **** static const int MultiXactStatusLock[MaxMultiXactStatus + 1] =
/* Get the LockTupleMode for a given MultiXactStatus */
#define TUPLOCK_from_mxstatus(status) \
(MultiXactStatusLock[(status)])
- /* Get the is_update bit for a given MultiXactStatus */
- #define ISUPDATE_from_mxstatus(status) \
- ((status) > MultiXactStatusForUpdate)
/* ----------------------------------------------------------------
* heap support routines
--- 190,195 ----
*** a/src/backend/access/transam/multixact.c
--- b/src/backend/access/transam/multixact.c
***************
*** 457,463 **** MultiXactIdExpand(MultiXactId multi, TransactionId xid, MultiXactStatus status)
for (i = 0, j = 0; i < nmembers; i++)
{
if (TransactionIdIsInProgress(members[i].xid) ||
! ((members[i].status > MultiXactStatusForUpdate) &&
TransactionIdDidCommit(members[i].xid)))
{
newMembers[j].xid = members[i].xid;
--- 457,463 ----
for (i = 0, j = 0; i < nmembers; i++)
{
if (TransactionIdIsInProgress(members[i].xid) ||
! (ISUPDATE_from_mxstatus(members[i].status) &&
TransactionIdDidCommit(members[i].xid)))
{
newMembers[j].xid = members[i].xid;
***************
*** 713,718 **** MultiXactIdCreateFromMembers(int nmembers, MultiXactMember *members)
--- 713,734 ----
return multi;
}
+ /* Verify that there is a single update Xid among the given members. */
+ {
+ int i;
+ bool has_update = false;
+
+ for (i = 0; i < nmembers; i++)
+ {
+ if (ISUPDATE_from_mxstatus(members[i].status))
+ {
+ if (has_update)
+ elog(ERROR, "new multixact has more than one updating member");
+ has_update = true;
+ }
+ }
+ }
+
/*
* Assign the MXID and offsets range to use, and make sure there is space
* in the OFFSETs and MEMBERs files. NB: this routine does
*** a/src/include/access/multixact.h
--- b/src/include/access/multixact.h
***************
*** 48,53 **** typedef enum
--- 48,57 ----
#define MaxMultiXactStatus MultiXactStatusUpdate
+ /* does a status value correspond to a tuple update? */
+ #define ISUPDATE_from_mxstatus(status) \
+ ((status) > MultiXactStatusForUpdate)
+
typedef struct MultiXactMember
{
On 04/22/2014 02:01 PM, Alvaro Herrera wrote:
I think I should push this patch first, so that Andrew and Josh can try
their respective test cases which should start throwing errors, then
push the actual fixes. Does that sound okay?
Note that I have a limited ability to actually test my failing test case
-- I have to fire up the user's full application test suite to produce
it, and that needs to be scheduled. However, Andrew is able to rerun
his test case repeatedly.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On 04/22/2014 05:20 PM, Josh Berkus wrote:
On 04/22/2014 02:01 PM, Alvaro Herrera wrote:
I think I should push this patch first, so that Andrew and Josh can try
their respective test cases which should start throwing errors, then
push the actual fixes. Does that sound okay?
Note that I have a limited ability to actually test my failing test case
-- I have to fire up the user's full application test suite to produce
it, and that needs to be scheduled. However, Andrew is able to rerun
his test case repeatedly.
Alvaro has access (as does Andres) to my setup, and instructions on how
to run the test. I am going to be offline most of the time from tonight
until next Tuesday.
cheers
andrew
On 04/22/2014 02:01 PM, Alvaro Herrera wrote:
Some testing later, I think the issue only occurs if we determine that
we don't need to wait for the xid/multi to complete, because otherwise
the wait itself saves us. (It's easy to cause the problem by adding a
breakpoint in heapam.c:3325, i.e. just before re-acquiring the buffer
lock, and then having transaction A lock for key share, then transaction
B update the tuple which stops at the breakpoint, then transaction A
also update the tuple, and finally release transaction B).
So, trying to make my synthetic test work:
In order to encounter this issue, I'd need to have two concurrent
processes update the child records of the same parent record? That is:
A ---> B1
\---> B2
... and the issue should only happen if I update both B1 and B2
concurrently in separate sessions?
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On 2014-04-22 17:36:42 -0400, Andrew Dunstan wrote:
On 04/22/2014 05:20 PM, Josh Berkus wrote:
On 04/22/2014 02:01 PM, Alvaro Herrera wrote:
I think I should push this patch first, so that Andrew and Josh can try
their respective test cases which should start throwing errors, then
push the actual fixes. Does that sound okay?
Note that I have a limited ability to actually test my failing test case
-- I have to fire up the user's full application test suite to produce
it, and that needs to be scheduled. However, Andrew is able to rerun
his test case repeatedly.
Alvaro has access (as does Andres) to my setup, and instructions on how to
run the test. I am going to be offline most of the time from tonight until
next Tuesday.
My preliminary patch seems to have fixed the problem in your setup...
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2014-04-22 14:40:46 -0700, Josh Berkus wrote:
On 04/22/2014 02:01 PM, Alvaro Herrera wrote:
Some testing later, I think the issue only occurs if we determine that
we don't need to wait for the xid/multi to complete, because otherwise
the wait itself saves us. (It's easy to cause the problem by adding a
breakpoint in heapam.c:3325, i.e. just before re-acquiring the buffer
lock, and then having transaction A lock for key share, then transaction
B update the tuple which stops at the breakpoint, then transaction A
also update the tuple, and finally release transaction B).
So, trying to make my synthetic test work:
In order to encounter this issue, I'd need to have two concurrent
processes update the child records of the same parent record? That is:
A ---> B1
\---> B2
... and the issue should only happen if I update both B1 and B2
concurrently in separate sessions?
I don't think that'll trigger it. You need rows that are first key share
locked and then updated by the locking transaction. Under
concurrency. And the timewindow really is rather small..
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2014-04-22 18:01:40 -0300, Alvaro Herrera wrote:
Thanks for the analysis and patches. I've been playing with this on my
own a bit, and one thing that I just noticed is that at least for
heap_update I cannot reproduce a problem when the xmax is originally a
multixact, so AFAICT the number of places that need to be patched isn't as
large.
I am quite uncomfortable with that assumption. I don't immediately see a
problem for some of the cases, but leaving them in a weaker state than
9.2 makes me uncomfortable.
For now I offer a cleaned up version of your patch to add the assertion
that multis don't contain multiple updates. I considered the idea of
making this #ifdef USE_ASSERT_CHECKING, because it has to walk the
complete array of members; and then have full elogs in MultiXactIdExpand
and MultiXactIdCreate, which are lighter because they can check more
easily. But on second thoughts I refrained from doing that, because
surely the arrays are not that large anyway, are they?
Yea, I think it's fine to do it where it's in the proposed patch.
I think I should push this patch first, so that Andrew and Josh can try
their respective test cases which should start throwing errors, then
push the actual fixes. Does that sound okay?
+1
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
In order to encounter this issue, I'd need to have two concurrent
processes update the child records of the same parent record? That is:
A ---> B1
\---> B2
... and the issue should only happen if I update both B1 and B2
concurrently in separate sessions?
I don't think that'll trigger it. You need rows that are first key share
locked and then updated by the locking transaction. Under
concurrency. And the time window really is rather small.
Well, currently I have a test which locks A and B1, then updates B1
(twice, actually), and then updates A. However, since there's a lock on
A, there's no concurrent updating of B1 and B2. This is based on the
behavior of the queue where I originally saw the problem, but it doesn't
reproduce the bug.
I'm thinking I need to just lock B1, update B1, then A, while allowing a
concurrent session to update B2 and A. No?
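I.e., roughly this, as two sessions (just a sketch; the schema is made up and
FOR KEY SHARE is only one way to take the initial lock):

-- assumed schema: a (id text PRIMARY KEY, last_update timestamptz)
--                 b (id text PRIMARY KEY, a_id text REFERENCES a, status text)

-- session 1
BEGIN;
SELECT id FROM b WHERE id = 'B1' FOR KEY SHARE;   -- lock B1
UPDATE b SET status = 'done' WHERE id = 'B1';     -- update B1
UPDATE a SET last_update = now() WHERE id = 'A';  -- then update A
COMMIT;

-- session 2, running at the same time
BEGIN;
UPDATE b SET status = 'done' WHERE id = 'B2';
UPDATE a SET last_update = now() WHERE id = 'A';
COMMIT;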
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On 2014-04-22 14:49:00 -0700, Josh Berkus wrote:
In order to encounter this issue, I'd need to have two concurrent
processes update the child records of the same parent record? That is:
A ---> B1
\---> B2
... and the issue should only happen if I update both B1 and B2
concurrently in separate sessions?
I don't think that'll trigger it. You need rows that are first key share
locked and then updated by the locking transaction. Under
concurrency. And the time window really is rather small.
Well, currently I have a test which locks A and B1, then updates B1
(twice, actually), and then updates A. However, since there's a lock on
A, there's no concurrent updating of B1 and B2. This is based on the
behavior of the queue where I originally saw the problem, but it doesn't
reproduce the bug.
I'm thinking I need to just lock B1, update B1, then A, while allowing a
concurrent session to update B2 and A. No?
I don't think this gets any easier to reproduce by introducing more than
one relation. Have one session acquire key share locks and then update,
and another one just doing updates.
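So, stripped down to one table -- assuming a one-row table t (id int PRIMARY
KEY, val int), which is just an invented example -- the two workloads to run
concurrently are simply:

-- workload 1: key share lock, then update, in one transaction
BEGIN;
SELECT id FROM t WHERE id = 1 FOR KEY SHARE;
UPDATE t SET val = val + 1 WHERE id = 1;
COMMIT;

-- workload 2: plain updates of the same row
UPDATE t SET val = val + 1 WHERE id = 1;

Since the window is so small, both have to be run many times in parallel
(e.g. with pgbench -f, or shell loops around psql) before anything bad
happens.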
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Josh Berkus wrote:
In order to encounter this issue, I'd need to have two concurrent
processes update the child records of the same parent record? That is:
A ---> B1
\---> B2
... and the issue should only happen if I update both B1 and B2
concurrently in separate sessions?
I don't think that'll trigger it. You need rows that are first key share
locked and then updated by the locking transaction. Under
concurrency. And the time window really is rather small.
Well, currently I have a test which locks A and B1, then updates B1
(twice, actually), and then updates A. However, since there's a lock on
A, there's no concurrent updating of B1 and B2. This is based on the
behavior of the queue where I originally saw the problem, but it doesn't
reproduce the bug.
If you want to make it easier to reproduce, you need to insert some
pg_usleep() calls in carefully selected spots. As Andres says, the
window is small normally.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote:
On 2014-04-22 18:01:40 -0300, Alvaro Herrera wrote:
Thanks for the analysis and patches. I've been playing with this on my
own a bit, and one thing that I just noticed is that at least for
heap_update I cannot reproduce a problem when the xmax is originally a
multixact, so AFAICT the number of places that need to be patched isn't as
large.
I am quite uncomfortable with that assumption. I don't immediately see a
problem for some of the cases, but leaving them in a weaker state than
9.2 makes me uncomfortable.
That's true too.
I'm thinking about the comparison of full infomask as you propose
instead of just the bits that we actually care about. I think the only
thing that could cause a spurious failure (causing an extra execution of
the HeapTupleSatisfiesUpdate call and the stuff below) is somebody
setting HEAP_XMIN_COMMITTED concurrently; but that seems infrequent
enough that it should be pretty harmless. However, should we worry about
possible future infomask bit changes that could negatively affect this
behavior?
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 04/22/2014 05:07 PM, Alvaro Herrera wrote:
If you want to make it easier to reproduce, you need to insert some
pg_usleep() calls in carefully selected spots. As Andres says, the
window is small normally.
Yeah, but the whole point of this is that having
pg_stat_statements/auto_explain loaded should make the window much
bigger. In the cases where we're hitting it, we're hitting it with a
fair degree of predictability.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Alvaro Herrera wrote:
I'm thinking about the comparison of full infomask as you propose
instead of just the bits that we actually care about. I think the only
thing that could cause a spurious failure (causing an extra execution of
the HeapTupleSatisfiesUpdate call and the stuff below) is somebody
setting HEAP_XMIN_COMMITTED concurrently; but that seems infrequent
enough that it should be pretty harmless. However, should we worry about
possible future infomask bit changes that could negatively affect this
behavior?
Here's a complete patch illustrating what I mean. This is slightly more
expensive than straight infomask comparison in terms of machine
instructions, but that seems okay to me.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
dont-multi.patch (text/x-diff; charset=us-ascii)
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***************
*** 2586,2591 **** compute_infobits(uint16 infomask, uint16 infomask2)
--- 2586,2612 ----
}
/*
+ * Given two versions of the same t_infomask for a tuple, compare them and
+ * return whether the relevant status for a tuple Xmax has changed. This is
+ * used after a buffer lock has been released and reacquired: we want to ensure
+ * that the tuple state continues to be the same it was when we previously
+ * examined it.
+ *
+ * Note the Xmax field itself must be compared separately.
+ */
+ static inline bool
+ xmax_infomsk_changed(uint16 infomask1, uint16 infomask2)
+ {
+ const uint16 interesting =
+ HEAP_XMAX_IS_MULTI | HEAP_XMAX_LOCK_ONLY | HEAP_LOCK_MASK;
+
+ if ((infomask1 & interesting) != (infomask2 & interesting))
+ return true;
+
+ return false;
+ }
+
+ /*
* heap_delete - delete a tuple
*
* NB: do not call this directly unless you are prepared to deal with
***************
*** 2722,2728 **** l1:
* update this tuple before we get to this point. Check for xmax
* change, and start over if so.
*/
! if (!(tp.t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
!TransactionIdEquals(HeapTupleHeaderGetRawXmax(tp.t_data),
xwait))
goto l1;
--- 2743,2749 ----
* update this tuple before we get to this point. Check for xmax
* change, and start over if so.
*/
! if (xmax_infomsk_changed(tp.t_data->t_infomask, infomask) ||
!TransactionIdEquals(HeapTupleHeaderGetRawXmax(tp.t_data),
xwait))
goto l1;
***************
*** 2748,2754 **** l1:
* other xact could update this tuple before we get to this point.
* Check for xmax change, and start over if so.
*/
! if ((tp.t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
!TransactionIdEquals(HeapTupleHeaderGetRawXmax(tp.t_data),
xwait))
goto l1;
--- 2769,2775 ----
* other xact could update this tuple before we get to this point.
* Check for xmax change, and start over if so.
*/
! if (xmax_infomsk_changed(tp.t_data->t_infomask, infomask) ||
!TransactionIdEquals(HeapTupleHeaderGetRawXmax(tp.t_data),
xwait))
goto l1;
***************
*** 3275,3281 **** l2:
* update this tuple before we get to this point. Check for xmax
* change, and start over if so.
*/
! if (!(oldtup.t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
!TransactionIdEquals(HeapTupleHeaderGetRawXmax(oldtup.t_data),
xwait))
goto l2;
--- 3296,3302 ----
* update this tuple before we get to this point. Check for xmax
* change, and start over if so.
*/
! if (xmax_infomsk_changed(oldtup.t_data->t_infomask, infomask) ||
!TransactionIdEquals(HeapTupleHeaderGetRawXmax(oldtup.t_data),
xwait))
goto l2;
***************
*** 3324,3335 **** l2:
if (HEAP_XMAX_IS_KEYSHR_LOCKED(infomask) && key_intact)
{
LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
-
/*
* recheck the locker; if someone else changed the tuple while
* we weren't looking, start over.
*/
! if ((oldtup.t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(oldtup.t_data),
xwait))
--- 3345,3355 ----
if (HEAP_XMAX_IS_KEYSHR_LOCKED(infomask) && key_intact)
{
LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
/*
* recheck the locker; if someone else changed the tuple while
* we weren't looking, start over.
*/
! if (xmax_infomsk_changed(oldtup.t_data->t_infomask, infomask) ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(oldtup.t_data),
xwait))
***************
*** 3350,3356 **** l2:
* some other xact could update this tuple before we get to
* this point. Check for xmax change, and start over if so.
*/
! if ((oldtup.t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(oldtup.t_data),
xwait))
--- 3370,3376 ----
* some other xact could update this tuple before we get to
* this point. Check for xmax change, and start over if so.
*/
! if (xmax_infomsk_changed(oldtup.t_data->t_infomask, infomask) ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(oldtup.t_data),
xwait))
***************
*** 4354,4360 **** l3:
* over.
*/
LockBuffer(*buffer, BUFFER_LOCK_EXCLUSIVE);
! if (!(tuple->t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
!TransactionIdEquals(HeapTupleHeaderGetRawXmax(tuple->t_data),
xwait))
{
--- 4374,4380 ----
* over.
*/
LockBuffer(*buffer, BUFFER_LOCK_EXCLUSIVE);
! if (xmax_infomsk_changed(tuple->t_data->t_infomask, infomask) ||
!TransactionIdEquals(HeapTupleHeaderGetRawXmax(tuple->t_data),
xwait))
{
***************
*** 4373,4379 **** l3:
LockBuffer(*buffer, BUFFER_LOCK_EXCLUSIVE);
/* if the xmax changed in the meantime, start over */
! if ((tuple->t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(tuple->t_data),
xwait))
--- 4393,4399 ----
LockBuffer(*buffer, BUFFER_LOCK_EXCLUSIVE);
/* if the xmax changed in the meantime, start over */
! if (xmax_infomsk_changed(tuple->t_data->t_infomask, infomask) ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(tuple->t_data),
xwait))
***************
*** 4441,4447 **** l3:
* could update this tuple before we get to this point. Check
* for xmax change, and start over if so.
*/
! if (!(tuple->t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(tuple->t_data),
xwait))
--- 4461,4467 ----
* could update this tuple before we get to this point. Check
* for xmax change, and start over if so.
*/
! if (xmax_infomsk_changed(tuple->t_data->t_infomask, infomask) ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(tuple->t_data),
xwait))
***************
*** 4497,4503 **** l3:
* some other xact could update this tuple before we get to
* this point. Check for xmax change, and start over if so.
*/
! if ((tuple->t_data->t_infomask & HEAP_XMAX_IS_MULTI) ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(tuple->t_data),
xwait))
--- 4517,4523 ----
* some other xact could update this tuple before we get to
* this point. Check for xmax change, and start over if so.
*/
! if (xmax_infomsk_changed(tuple->t_data->t_infomask, infomask) ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(tuple->t_data),
xwait))
Alvaro Herrera wrote:
Alvaro Herrera wrote:
I'm thinking about the comparison of full infomask as you propose
instead of just the bits that we actually care about. I think the only
thing that could cause a spurious failure (causing an extra execution of
the HeapTupleSatisfiesUpdate call and the stuff below) is somebody
setting HEAP_XMIN_COMMITTED concurrently; but that seems infrequent
enough that it should be pretty harmless. However, should we worry about
possible future infomask bit changes that could negatively affect this
behavior?
Here's a complete patch illustrating what I mean. This is slightly more
expensive than straight infomask comparison in terms of machine
instructions, but that seems okay to me.
I have pushed a slightly tweaked version of this, after verifying that
it solves the problem in Andrew's test environment.
Josh, if you could verify that it solves the problem for you too, it'd
be great.
Thanks for the report and test case.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services