pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

Started by Simon Riggs almost 9 years ago (6 messages)
#1 Simon Riggs
simon@2ndQuadrant.com

Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

For normal commits and aborts we already reset PgXact->xmin.
Avoiding touching highly contended shmem improves concurrent
performance.

Simon Riggs

Discussion: CANP8+jJdXE9b+b9F8CQT-LuxxO0PBCB-SZFfMVAdp+akqo4zfg@mail.gmail.com

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/42b4b0b2413b9b472aaf2112a3bbfd80a6ab4dc5

Modified Files
--------------
src/backend/access/transam/xact.c | 6 +++---
src/backend/utils/time/snapmgr.c | 21 ++++++++++++++++++---
src/include/utils/snapmgr.h | 2 +-
3 files changed, 22 insertions(+), 7 deletions(-)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#2 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#1)
Re: [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

For normal commits and aborts we already reset PgXact->xmin.
Avoiding touching highly contended shmem improves concurrent
performance.

Simon Riggs

I'm getting occasional crashes with backtraces that look like this:

#0 0x00007fff9679c286 in __pthread_kill ()
#1 0x00007fff94e1a9f9 in pthread_kill ()
#2 0x00007fff9253a9a3 in abort ()
#3 0x0000000107e0659e in ExceptionalCondition (conditionName=<value
temporarily unavailable, due to optimizations>, errorType=0x6 <Address
0x6 out of bounds>, fileName=<value temporarily unavailable, due to
optimizations>, lineNumber=<value temporarily unavailable, due to
optimizations>) at assert.c:54
#4 0x0000000107e4be2b in AtEOXact_Snapshot (isCommit=<value
temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
snapmgr.c:1154
#5 0x0000000107a76c06 in CleanupTransaction () at xact.c:2643
#6 0x0000000107a76267 in CommitTransactionCommand () at xact.c:2818
#7 0x0000000107cecfc2 in exec_simple_query
(query_string=0x7f975481e640 "ABORT TRANSACTION") at postgres.c:2461
#8 0x0000000107ceabb7 in PostgresMain (argc=<value temporarily
unavailable, due to optimizations>, argv=<value temporarily
unavailable, due to optimizations>, dbname=<value temporarily
unavailable, due to optimizations>, username=<value temporarily
unavailable, due to optimizations>) at postgres.c:4071
#9 0x0000000107c6bb58 in PostmasterMain (argc=<value temporarily
unavailable, due to optimizations>, argv=<value temporarily
unavailable, due to optimizations>) at postmaster.c:4317
#10 0x0000000107be5cdd in main (argc=<value temporarily unavailable,
due to optimizations>, argv=<value temporarily unavailable, due to
optimizations>) at main.c:228

I suspect that is the fault of this patch. Please fix or revert.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3 Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#2)
Re: [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

On Fri, Mar 24, 2017 at 12:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

For normal commits and aborts we already reset PgXact->xmin.
Avoiding touching highly contended shmem improves concurrent
performance.

Simon Riggs

I'm getting occasional crashes with backtraces that look like this:

#4 0x0000000107e4be2b in AtEOXact_Snapshot (isCommit=<value
temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
snapmgr.c:1154
#5 0x0000000107a76c06 in CleanupTransaction () at xact.c:2643

I suspect that is the fault of this patch. Please fix or revert.

Also, the entire buildfarm is turning red.

longfin, spurfowl, and magpie all show this assertion failure in the
log. I haven't checked the others.

TRAP: FailedAssertion("!(MyPgXact->xmin == 0)", File: "snapmgr.c", Line: 1154)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#4 Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#3)
Re: [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

On Fri, Mar 24, 2017 at 12:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 24, 2017 at 12:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

For normal commits and aborts we already reset PgXact->xmin.
Avoiding touching highly contended shmem improves concurrent
performance.

Simon Riggs

I'm getting occasional crashes with backtraces that look like this:

#4 0x0000000107e4be2b in AtEOXact_Snapshot (isCommit=<value
temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
snapmgr.c:1154
#5 0x0000000107a76c06 in CleanupTransaction () at xact.c:2643

I suspect that is the fault of this patch. Please fix or revert.

Also, the entire buildfarm is turning red.

longfin, spurfowl, and magpie all show this assertion failure in the
log. I haven't checked the others.

TRAP: FailedAssertion("!(MyPgXact->xmin == 0)", File: "snapmgr.c", Line: 1154)

Another thing that is interesting is that when I run make -j8
check-world, the overall tests appear to succeed even though there are
failures mid-way through:

test tablefunc ... FAILED (test process exited with exit code 2)

...but then later we end with:

ok
All tests successful.
Files=11, Tests=80, 251 wallclock secs ( 0.07 usr 0.02 sys + 19.77
cusr 14.45 csys = 34.31 CPU)
Result: PASS

real 4m27.421s
user 3m50.047s
sys 1m31.937s

That's unrelated to the current problem of course, but it seems to
suggest that make's -j option doesn't entirely do what you'd expect
when used with make check-world.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#5 Simon Riggs
simon@2ndquadrant.com
In reply to: Robert Haas (#2)
Re: [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

On 24 March 2017 at 16:14, Robert Haas <robertmhaas@gmail.com> wrote:

I suspect that is the fault of this patch. Please fix or revert.

Will revert then fix.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#6 Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#4)
Re: [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

On 2017-03-24 13:50:54 -0400, Robert Haas wrote:

On Fri, Mar 24, 2017 at 12:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 24, 2017 at 12:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

For normal commits and aborts we already reset PgXact->xmin.
Avoiding touching highly contended shmem improves concurrent
performance.

Simon Riggs

I'm getting occasional crashes with backtraces that look like this:

#4 0x0000000107e4be2b in AtEOXact_Snapshot (isCommit=<value
temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
snapmgr.c:1154
#5 0x0000000107a76c06 in CleanupTransaction () at xact.c:2643

I suspect that is the fault of this patch. Please fix or revert.

Also, the entire buildfarm is turning red.

longfin, spurfowl, and magpie all show this assertion failure in the
log. I haven't checked the others.

TRAP: FailedAssertion("!(MyPgXact->xmin == 0)", File: "snapmgr.c", Line: 1154)

Another thing that is interesting is that when I run make -j8
check-world, the overall tests appear to succeed even though there are
failures mid-way through:

test tablefunc ... FAILED (test process exited with exit code 2)

...but then later we end with:

ok
All tests successful.
Files=11, Tests=80, 251 wallclock secs ( 0.07 usr 0.02 sys + 19.77
cusr 14.45 csys = 34.31 CPU)
Result: PASS

real 4m27.421s
user 3m50.047s
sys 1m31.937s

That's unrelated to the current problem of course, but it seems to
suggest that make's -j option doesn't entirely do what you'd expect
when used with make check-world.

That's likely the output of a different test from the one that failed.
It's a lot easier to see the result if you're doing
&& echo success || echo failure

- Andres
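
[Editorial sketch, not part of the original message.] The suggestion above relies on the command's exit status rather than on whatever summary happens to be printed last. The snippet below is a minimal illustration of that idea. It assumes the overall run exits non-zero when any sub-test fails. run_suite is a hypothetical stand-in for a long parallel run such as make -j8 check-world, not a real PostgreSQL target, and its output lines are only modeled on the ones quoted in this thread.

```shell
#!/bin/sh
# Hypothetical stand-in for a long parallel test run: an earlier step fails,
# a later, unrelated step prints a passing summary, and the overall exit
# status is non-zero.
run_suite() {
    echo "test tablefunc ... FAILED (test process exited with exit code 2)"
    echo "All tests successful."
    echo "Result: PASS"
    return 2
}

# Capture the output, but decide success or failure from the exit status only.
if output=$(run_suite); then
    result=success
else
    result=failure
fi

echo "$result"
```

With a real command the same check is usually written on one line, in the form command && echo success || echo failure, so the final line of output states the overall result even when interleaved output from parallel jobs ends on an unrelated passing summary.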
