Deadlock in XLogInsert on AIX
Hi Hackers,
We are running Postgres on AIX and encountered two strange problems:
active zombie processes and a deadlock in the XLOG writer.
The first problem I will explain in a separate mail; for now I am mostly
concerned about the deadlock.
It is irregularly reproduced with standard pgbench launched with 100
connections.
It sometimes happens with the 9.6 stable version of Postgres, but only
when it is compiled with the xlc compiler.
We failed to reproduce the problem with GCC, so it looks like a bug in
the compiler or in the xlc-specific atomics implementation...
But there are a few observations that contradict this hypothesis:
1. The problem is reproduced with Postgres built without optimization.
Usually compiler bugs affect only optimized code.
2. Disabling atomics doesn't help.
3. Without optimization and with LOCK_DEBUG defined, the time needed to
reproduce the problem increases significantly. With optimized code it is
almost always reproduced within a few minutes; with a debug build it
usually takes much longer.
But the most confusing thing is the stack trace:
(dbx) where
semop(??, ??, ??) at 0x9000000001f5790
PGSemaphoreLock(sema = 0x0a00000044b95928), line 387 in "pg_sema.c"
unnamed block in LWLockWaitForVar(lock = 0x0a0000000000d980, valptr =
0x0a0000000000d9a8, oldval = 102067874256, newval = 0x0fffffffffff9c10),
line 1666 in "lwlock.c"
LWLockWaitForVar(lock = 0x0a0000000000d980, valptr = 0x0a0000000000d9a8,
oldval = 102067874256, newval = 0x0fffffffffff9c10), line 1666 in "lwlock.c"
unnamed block in WaitXLogInsertionsToFinish(upto = 102067874328), line
1583 in "xlog.c"
WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"
AdvanceXLInsertBuffer(upto = 102067874256, opportunistic = '\0'), line
1916 in "xlog.c"
unnamed block in GetXLogBuffer(ptr = 102067874256), line 1697 in "xlog.c"
GetXLogBuffer(ptr = 102067874256), line 1697 in "xlog.c"
CopyXLogRecordToWAL(write_len = 70, isLogSwitch = '\0', rdata =
0x000000011007ce10, StartPos = 102067874256, EndPos = 102067874328),
line 1279 in "xlog.c"
XLogInsertRecord(rdata = 0x000000011007ce10, fpw_lsn = 102067718328),
line 1011 in "xlog.c"
unnamed block in XLogInsert(rmid = '\n', info = '@'), line 453 in
"xloginsert.c"
XLogInsert(rmid = '\n', info = '@'), line 453 in "xloginsert.c"
log_heap_update(reln = 0x0000000110273540, oldbuf = 40544, newbuf =
40544, oldtup = 0x0fffffffffffa2a0, newtup = 0x00000001102bb958,
old_key_tuple = (nil), all_visible_cleared = '\0',
new_all_visible_cleared = '\0'), line 7708 in "heapam.c"
unnamed block in heap_update(relation = 0x0000000110273540, otid =
0x0fffffffffffa6f8, newtup = 0x00000001102bb958, cid = 1, crosscheck =
(nil), wait = '^A', hufd = 0x0fffffffffffa5b0, lockmode =
0x0fffffffffffa5c8), line 4212 in "heapam.c"
heap_update(relation = 0x0000000110273540, otid = 0x0fffffffffffa6f8,
newtup = 0x00000001102bb958, cid = 1, crosscheck = (nil), wait = '^A',
hufd = 0x0fffffffffffa5b0, lockmode = 0x0fffffffffffa5c8), line 4212 in
"heapam.c"
unnamed block in ExecUpdate(tupleid = 0x0fffffffffffa6f8, oldtuple =
(nil), slot = 0x00000001102bb308, planSlot = 0x00000001102b4630,
epqstate = 0x00000001102b2cd8, estate = 0x00000001102b29e0, canSetTag =
'^A'), line 937 in "nodeModifyTable.c"
ExecUpdate(tupleid = 0x0fffffffffffa6f8, oldtuple = (nil), slot =
0x00000001102bb308, planSlot = 0x00000001102b4630, epqstate =
0x00000001102b2cd8, estate = 0x00000001102b29e0, canSetTag = '^A'), line
937 in "nodeModifyTable.c"
ExecModifyTable(node = 0x00000001102b2c30), line 1516 in "nodeModifyTable.c"
ExecProcNode(node = 0x00000001102b2c30), line 396 in "execProcnode.c"
ExecutePlan(estate = 0x00000001102b29e0, planstate = 0x00000001102b2c30,
use_parallel_mode = '\0', operation = CMD_UPDATE, sendTuples = '\0',
numberTuples = 0, direction = ForwardScanDirection, dest =
0x00000001102b7520), line 1569 in "execMain.c"
standard_ExecutorRun(queryDesc = 0x00000001102b25c0, direction =
ForwardScanDirection, count = 0), line 338 in "execMain.c"
ExecutorRun(queryDesc = 0x00000001102b25c0, direction =
ForwardScanDirection, count = 0), line 286 in "execMain.c"
ProcessQuery(plan = 0x00000001102b1510, sourceText = "UPDATE
pgbench_tellers SET tbalance = tbalance + 4019 WHERE tid = 6409;",
params = (nil), dest = 0x00000001102b7520, completionTag = ""), line 187
in "pquery.c"
unnamed block in PortalRunMulti(portal = 0x0000000110115e20, isTopLevel
= '^A', setHoldSnapshot = '\0', dest = 0x00000001102b7520, altdest =
0x00000001102b7520, completionTag = ""), line 1303 in "pquery.c"
unnamed block in PortalRunMulti(portal = 0x0000000110115e20, isTopLevel
= '^A', setHoldSnapshot = '\0', dest = 0x00000001102b7520, altdest =
0x00000001102b7520, completionTag = ""), line 1303 in "pquery.c"
PortalRunMulti(portal = 0x0000000110115e20, isTopLevel = '^A',
setHoldSnapshot = '\0', dest = 0x00000001102b7520, altdest =
0x00000001102b7520, completionTag = ""), line 1303 in "pquery.c"
unnamed block in PortalRun(portal = 0x0000000110115e20, count =
9223372036854775807, isTopLevel = '^A', dest = 0x00000001102b7520,
altdest = 0x00000001102b7520, completionTag = ""), line 815 in "pquery.c"
PortalRun(portal = 0x0000000110115e20, count = 9223372036854775807,
isTopLevel = '^A', dest = 0x00000001102b7520, altdest =
0x00000001102b7520, completionTag = ""), line 815 in "pquery.c"
unnamed block in exec_simple_query(query_string = "UPDATE
pgbench_tellers SET tbalance = tbalance + 4019 WHERE tid = 6409;"), line
1094 in "postgres.c"
exec_simple_query(query_string = "UPDATE pgbench_tellers SET tbalance =
tbalance + 4019 WHERE tid = 6409;"), line 1094 in "postgres.c"
unnamed block in PostgresMain(argc = 1, argv = 0x0000000110119f68,
dbname = "postgres", username = "postgres"), line 4076 in "postgres.c"
PostgresMain(argc = 1, argv = 0x0000000110119f68, dbname = "postgres",
username = "postgres"), line 4076 in "postgres.c"
BackendRun(port = 0x0000000110114290), line 4279 in "postmaster.c"
BackendStartup(port = 0x0000000110114290), line 3953 in "postmaster.c"
unnamed block in ServerLoop(), line 1701 in "postmaster.c"
unnamed block in ServerLoop(), line 1701 in "postmaster.c"
unnamed block in ServerLoop(), line 1701 in "postmaster.c"
ServerLoop(), line 1701 in "postmaster.c"
PostmasterMain(argc = 3, argv = 0x00000001100c6190), line 1309 in
"postmaster.c"
main(argc = 3, argv = 0x00000001100c6190), line 228 in "main.c"
As I already mentioned, we built Postgres with LOCK_DEBUG, so we can
inspect the lock owner. The backend is waiting for itself!
Now please look at the two frames in this stack trace marked in red.
XLogInsertRecord takes the WAL insert locks at the beginning of the
function:
if (isLogSwitch)
WALInsertLockAcquireExclusive();
else
WALInsertLockAcquire();
WALInsertLockAcquire just selects an item from the WALInsertLocks array
and locks it exclusively:
if (lockToTry == -1)
lockToTry = MyProc->pgprocno % NUM_XLOGINSERT_LOCKS;
MyLockNo = lockToTry;
immed = LWLockAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE);
Then, following the stack trace, AdvanceXLInsertBuffer calls
WaitXLogInsertionsToFinish:
    /*
     * Now that we have an up-to-date LogwrtResult value, see if we
     * still need to write it or if someone else already did.
     */
    if (LogwrtResult.Write < OldPageRqstPtr)
    {
        /*
         * Must acquire write lock. Release WALBufMappingLock first,
         * to make sure that all insertions that we need to wait for
         * can finish (up to this same position). Otherwise we risk
         * deadlock.
         */
        LWLockRelease(WALBufMappingLock);
        WaitXLogInsertionsToFinish(OldPageRqstPtr);
        LWLockAcquire(WALWriteLock, LW_EXCLUSIVE);
It releases WALBufMappingLock but not the WAL insert locks!
Finally, WaitXLogInsertionsToFinish tries to wait for all the locks:
    for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
    {
        XLogRecPtr  insertingat = InvalidXLogRecPtr;

        do
        {
            /*
             * See if this insertion is in progress. LWLockWait will wait for
             * the lock to be released, or for the 'value' to be set by a
             * LWLockUpdateVar call. When a lock is initially acquired, its
             * value is 0 (InvalidXLogRecPtr), which means that we don't know
             * where it's inserting yet. We will have to wait for it. If
             * it's a small insertion, the record will most likely fit on the
             * same page and the inserter will release the lock without ever
             * calling LWLockUpdateVar. But if it has to sleep, it will
             * advertise the insertion point with LWLockUpdateVar before
             * sleeping.
             */
            if (LWLockWaitForVar(&WALInsertLocks[i].l.lock,
                                 &WALInsertLocks[i].l.insertingAt,
                                 insertingat, &insertingat))
And here we get stuck!
The comment on WaitXLogInsertionsToFinish says:

 * Note: When you are about to write out WAL, you must call this function
 * *before* acquiring WALWriteLock, to avoid deadlocks. This function might
 * need to wait for an insertion to finish (or at least advance to next
 * uninitialized page), and the inserter might need to evict an old WAL buffer
 * to make room for a new one, which in turn requires WALWriteLock.

Which contradicts the observed stack trace.
I wonder whether this is really a synchronization bug in xlog.c, or
whether something is wrong with this stack trace and it cannot happen
during normal operation?
Thanks in advance,
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
More information about the problem - the Postgres log contains several
records:
2017-01-24 19:15:20.272 MSK [19270462] LOG: request to flush past end
of generated WAL; request 6/AAEBE000, currpos 6/AAEBC2B0
and they correspond to the time when the deadlock happens.
There is the following comment in xlog.c concerning this message:

/*
 * No-one should request to flush a piece of WAL that hasn't even been
 * reserved yet. However, it can happen if there is a block with a bogus
 * LSN on disk, for example. XLogFlush checks for that situation and
 * complains, but only after the flush. Here we just assume that to mean
 * that all WAL that has been reserved needs to be finished. In this
 * corner-case, the return value can be smaller than 'upto' argument.
 */
So it looks like it should not happen.
The first thing to suspect is the spinlock implementation, which is
different for GCC and XLC.
But... if I rebuild Postgres without spinlocks, the problem is still
reproduced.
On 24.01.2017 17:47, Konstantin Knizhnik wrote:
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi Konstantin,
We had observed exactly the same issue on a customer system with the
same environment and PostgreSQL 9.5.5. Additionally, we've tested on
Linux with XL/C 12 and 13, with exactly the same deadlock behavior.
So we assumed that this is somehow a compiler issue.
On Tuesday, 24.01.2017 at 19:26 +0300, Konstantin Knizhnik wrote:
More information about the problem - the Postgres log contains several
records:
2017-01-24 19:15:20.272 MSK [19270462] LOG: request to flush past end
of generated WAL; request 6/AAEBE000, currpos 6/AAEBC2B0
and they correspond to the time when the deadlock happens.
Yeah, the same logs here:
LOG: request to flush past end of generated WAL; request 1/1F4C6000,
currpos 1/1F4C40E0
STATEMENT: UPDATE pgbench_accounts SET abalance = abalance + -2653
WHERE aid = 3662494;
The first thing to suspect is the spinlock implementation, which is
different for GCC and XLC.
But ... if I rebuild Postgres without spinlocks, then the problem is
still reproduced.
Before we got the results from XLC on Linux (where Postgres shows the
same behavior) I had a look at the spinlock implementation. If I got
it right, XLC doesn't use the ppc64-specific ones, but the fallback
implementation (system monitoring on AIX has also shown massive calls
to signal(0)...). So I tried the following patch:
diff --git a/src/include/port/atomics/arch-ppc.h b/src/include/port/atomics/arch-ppc.h
new file mode 100644
index f901a0c..028cced
*** a/src/include/port/atomics/arch-ppc.h
--- b/src/include/port/atomics/arch-ppc.h
***************
*** 23,26 ****
--- 23,33 ----
  #define pg_memory_barrier_impl()	__asm__ __volatile__ ("sync" : : : "memory")
  #define pg_read_barrier_impl()	__asm__ __volatile__ ("lwsync" : : : "memory")
  #define pg_write_barrier_impl()	__asm__ __volatile__ ("lwsync" : : : "memory")
+ 
+ #elif defined(__IBMC__) || defined(__IBMCPP__)
+ 
+ #define pg_memory_barrier_impl()	__asm__ __volatile__ (" sync \n" ::: "memory")
+ #define pg_read_barrier_impl()	__asm__ __volatile__ (" lwsync \n" ::: "memory")
+ #define pg_write_barrier_impl()	__asm__ __volatile__ (" lwsync \n" ::: "memory")
+ 
  #endif
This didn't change the picture, though.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 01/24/2017 04:47 PM, Konstantin Knizhnik wrote:
Finally, WaitXLogInsertionsToFinish tries to wait for all the locks:

    for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
    {
        XLogRecPtr  insertingat = InvalidXLogRecPtr;

        do
        {
            /*
             * See if this insertion is in progress. LWLockWait will wait for
             * the lock to be released, or for the 'value' to be set by a
             * LWLockUpdateVar call. When a lock is initially acquired, its
             * value is 0 (InvalidXLogRecPtr), which means that we don't know
             * where it's inserting yet. We will have to wait for it. If
             * it's a small insertion, the record will most likely fit on the
             * same page and the inserter will release the lock without ever
             * calling LWLockUpdateVar. But if it has to sleep, it will
             * advertise the insertion point with LWLockUpdateVar before
             * sleeping.
             */
            if (LWLockWaitForVar(&WALInsertLocks[i].l.lock,
                                 &WALInsertLocks[i].l.insertingAt,
                                 insertingat, &insertingat))

And here we get stuck!
Interesting.. What should happen here is that for the backend's own
insertion slot, the "insertingat" value should be greater than the
requested flush point ('upto' variable). That's because before
GetXLogBuffer() calls AdvanceXLInsertBuffer(), it updates the backend's
insertingat value to the position that it wants to insert to. And
AdvanceXLInsertBuffer() only calls WaitXLogInsertionsToFinish() with a
value smaller than what was passed as the 'upto' argument.
The comment on WaitXLogInsertionsToFinish says:

 * Note: When you are about to write out WAL, you must call this function
 * *before* acquiring WALWriteLock, to avoid deadlocks. This function might
 * need to wait for an insertion to finish (or at least advance to next
 * uninitialized page), and the inserter might need to evict an old WAL buffer
 * to make room for a new one, which in turn requires WALWriteLock.

Which contradicts the observed stack trace.
Not AFAICS. In the stack trace you showed, the backend is not holding
WALWriteLock. It would only acquire it after the
WaitXLogInsertionsToFinish() call finished.
I wonder whether this is really a synchronization bug in xlog.c, or
whether something is wrong with this stack trace and it cannot happen
during normal operation?
Yeah, hard to tell. Something is clearly wrong..
This line in the stack trace is suspicious:
WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"
AdvanceXLInsertBuffer() should only ever call
WaitXLogInsertionsToFinish() with an xlog position that points to a page
boundary, but that upto value points to the middle of a page.
Perhaps the value stored in the stack trace is not what the caller
passed, but it was updated because it was past the 'reserveUpto' value?
That would explain the "request to flush past end of generated WAL"
notices you saw in the log. Now, why would that happen, I have no idea.
If you can and want to provide me access to the system, I could have a
look myself. I'd also like to see if the attached additional Assertions
will fire.
- Heikki
Attachment: extra-asserts-in-AdvanceXLInsertBuffer.patch
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 2f5d603066..a2ea03506a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1940,6 +1940,7 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, bool opportunistic)
 	 * already written out.
 	 */
 	OldPageRqstPtr = XLogCtl->xlblocks[nextidx];
+	Assert(OldPageRqstPtr <= XLogCtl->InitializedUpTo);
 	if (LogwrtResult.Write < OldPageRqstPtr)
 	{
 		/*
@@ -1970,6 +1971,10 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, bool opportunistic)
 	 */
 	LWLockRelease(WALBufMappingLock);
+	/* we should exit the loop before reaching a page beyond the
+	 * requested 'upto' position. */
+	Assert(OldPageRqstPtr <= upto || opportunistic);
+
 	WaitXLogInsertionsToFinish(OldPageRqstPtr);
 	LWLockAcquire(WALWriteLock, LW_EXCLUSIVE);
On 30.01.2017 19:21, Heikki Linnakangas wrote:
On 01/24/2017 04:47 PM, Konstantin Knizhnik wrote:
Not AFAICS. In the stack trace you showed, the backend is not holding
WALWriteLock. It would only acquire it after the
WaitXLogInsertionsToFinish() call finished.
Hmmm, maybe I missed something.
I am not talking about WALBufMappingLock, which is required after the
return from WaitXLogInsertionsToFinish, but about the lock obtained by
WALInsertLockAcquire at line 946 in XLogInsertRecord.
It will be released at line 1021 by WALInsertLockRelease(), but
CopyXLogRecordToWAL is invoked with this lock held.
If you can and want to provide me access to the system, I could have a
look myself. I'd also like to see if the attached additional Assertions
will fire.
I really did get this assertion failure:
ExceptionalCondition(conditionName = "!(OldPageRqstPtr <= upto ||
opportunistic)", errorType = "FailedAssertion", fileName = "xlog.c",
lineNumber = 1917), line 54 in "assert.c"
(dbx) up
unnamed block in AdvanceXLInsertBuffer(upto = 147439056632,
opportunistic = '\0'), line 1917 in "xlog.c"
(dbx) p OldPageRqstPtr
147439058944
(dbx) p upto
147439056632
(dbx) p opportunistic
'\0'
Also, in another run, I encountered yet another assertion failure:
ExceptionalCondition(conditionName = "!((((NewPageBeginPtr) / 8192) %
(XLogCtl->XLogCacheBlck + 1)) == nextidx)", errorType =
"FailedAssertion", fileName = "xlog.c", lineNumber = 1950), line 54 in
"assert.c"
nextidx equals 1456, while the expected value is 1457.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
One more assertion failure:
ExceptionalCondition(conditionName = "!(OldPageRqstPtr <=
XLogCtl->InitializedUpTo)", errorType = "FailedAssertion", fileName =
"xlog.c", lineNumber = 1887), line 54 in "assert.c"
(dbx) p OldPageRqstPtr
153551667200
(dbx) p XLogCtl->InitializedUpTo
153551667200
(dbx) p InitializedUpTo
153551659008
I slightly modified the xlog.c code to store the value of
XLogCtl->InitializedUpTo in a local variable:
1870         LWLockAcquire(WALBufMappingLock, LW_EXCLUSIVE);
1871
1872         /*
1873          * Now that we have the lock, check if someone initialized the page
1874          * already.
1875          */
1876         while (upto >= XLogCtl->InitializedUpTo || opportunistic)
1877         {
1878             XLogRecPtr InitializedUpTo = XLogCtl->InitializedUpTo;
1879             nextidx = XLogRecPtrToBufIdx(InitializedUpTo);
1880
1881             /*
1882              * Get ending-offset of the buffer page we need to replace (this may
1883              * be zero if the buffer hasn't been used yet). Fall through if it's
1884              * already written out.
1885              */
1886             OldPageRqstPtr = XLogCtl->xlblocks[nextidx];
1887             Assert(OldPageRqstPtr <= XLogCtl->InitializedUpTo);
And, as you can see, XLogCtl->InitializedUpTo is not equal to the saved
value InitializedUpTo.
But we are holding WALBufMappingLock exclusively, and InitializedUpTo is
updated only under this lock.
So it means that the LWLocks don't work!
I inspected the code of pg_atomic_compare_exchange_u32_impl, and there
is no sync in the prologue:
(dbx) listi pg_atomic_compare_exchange_u32_impl
0x1000817bc (pg_atomic_compare_exchange_u32_impl+0x1c) e88100b0  ld r4,0xb0(r1)
0x1000817c0 (pg_atomic_compare_exchange_u32_impl+0x20) e86100b8  ld r3,0xb8(r1)
0x1000817c4 (pg_atomic_compare_exchange_u32_impl+0x24) 800100c0  lwz r0,0xc0(r1)
0x1000817c8 (pg_atomic_compare_exchange_u32_impl+0x28) 7c0007b4  extsw r0,r0
0x1000817cc (pg_atomic_compare_exchange_u32_impl+0x2c) e8a30002  lwa r5,0x0(r3)
0x1000817d0 (pg_atomic_compare_exchange_u32_impl+0x30) 7cc02028  lwarx r6,r0,r4,0x0
0x1000817d4 (pg_atomic_compare_exchange_u32_impl+0x34) 7c053040  cmpl cr0,0x0,r5,r6
0x1000817d8 (pg_atomic_compare_exchange_u32_impl+0x38) 4082000c  bne 0x1000817e4 (pg_atomic_compare_exchange_u32_impl+0x44)
0x1000817dc (pg_atomic_compare_exchange_u32_impl+0x3c) 7c00212d  stwcx. r0,r0,r4
0x1000817e0 (pg_atomic_compare_exchange_u32_impl+0x40) 40e2fff0  bne+ 0x1000817d0 (pg_atomic_compare_exchange_u32_impl+0x30)
0x1000817e4 (pg_atomic_compare_exchange_u32_impl+0x44) 60c00000  ori r0,r6,0x0
0x1000817e8 (pg_atomic_compare_exchange_u32_impl+0x48) 90030000  stw r0,0x0(r3)
0x1000817ec (pg_atomic_compare_exchange_u32_impl+0x4c) 7c000026  mfcr r0
0x1000817f0 (pg_atomic_compare_exchange_u32_impl+0x50) 54001ffe  rlwinm r0,r0,0x3,0x1f,0x1f
0x1000817f4 (pg_atomic_compare_exchange_u32_impl+0x54) 78000620  rldicl r0,r0,0x0,0x19
0x1000817f8 (pg_atomic_compare_exchange_u32_impl+0x58) 98010070  stb r0,0x70(r1)
0x1000817fc (pg_atomic_compare_exchange_u32_impl+0x5c) 4c00012c  isync
0x100081800 (pg_atomic_compare_exchange_u32_impl+0x60) 88610070  lbz r3,0x70(r1)
0x100081804 (pg_atomic_compare_exchange_u32_impl+0x64) 48000004  b 0x100081808 (pg_atomic_compare_exchange_u32_impl+0x68)
0x100081808 (pg_atomic_compare_exchange_u32_impl+0x68) 38210080  addi r1,0x80(r1)
0x10008180c (pg_atomic_compare_exchange_u32_impl+0x6c) 4e800020  blr
Source code of pg_atomic_compare_exchange_u32_impl is the following:
static inline bool
pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
                                    uint32 *expected, uint32 newval)
{
    bool ret;

    /*
     * atomics.h specifies sequential consistency ("full barrier semantics")
     * for this interface. Since "lwsync" provides acquire/release
     * consistency only, do not use it here. GCC atomics observe the same
     * restriction; see its rs6000_pre_atomic_barrier().
     */
    __asm__ __volatile__ (" sync \n" ::: "memory");

    /*
     * XXX: __compare_and_swap is defined to take signed parameters, but that
     * shouldn't matter since we don't perform any arithmetic operations.
     */
    ret = __compare_and_swap((volatile int*)&ptr->value,
                             (int *)expected, (int)newval);

    /*
     * xlc's documentation tells us:
     * "If __compare_and_swap is used as a locking primitive, insert a call to
     * the __isync built-in function at the start of any critical sections."
     *
     * The critical section begins immediately after __compare_and_swap().
     */
    __isync();

    return ret;
}
and if I compile this function standalone, I get the following assembler
code:
.pg_atomic_compare_exchange_u32_impl: # 0x0000000000000000 (H.4.NO_SYMBOL)
stdu SP,-128(SP)
std r3,176(SP)
std r4,184(SP)
std r5,192(SP)
ld r0,192(SP)
stw r0,192(SP)
sync
ld r4,176(SP)
ld r3,184(SP)
lwz r0,192(SP)
extsw r0,r0
lwa r5,0(r3)
__L30:                                 # 0x0000000000000030 (H.4.NO_SYMBOL+0x030)
lwarx r6,r0,r4
cmpl 0,0,r5,r6
bc BO_IF_NOT,CR0_EQ,__L44
stwcx. r0,r0,r4
.machine "any"
bc BO_IF_NOT_3,CR0_EQ,__L30
__L44:                                 # 0x0000000000000044 (H.4.NO_SYMBOL+0x044)
ori r0,r6,0x0000
stw r0,0(r3)
mfcr r0
rlwinm r0,r0,3,31,31
rldicl r0,r0,0,56
stb r0,112(SP)
isync
lbz r3,112(SP)
addi SP,SP,128
bclr BO_ALWAYS,CR0_LT
sync is here!
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 01/31/2017 05:03 PM, Konstantin Knizhnik wrote:
One more assertion failure:
ExceptionalCondition(conditionName = "!(OldPageRqstPtr <=
XLogCtl->InitializedUpTo)", errorType = "FailedAssertion", fileName =
"xlog.c", lineNumber = 1887), line 54 in "assert.c"
(dbx) p OldPageRqstPtr
153551667200
(dbx) p XLogCtl->InitializedUpTo
153551667200
(dbx) p InitializedUpTo
153551659008
I slightly modified the xlog.c code to store the value of
XLogCtl->InitializedUpTo in a local variable:
1870         LWLockAcquire(WALBufMappingLock, LW_EXCLUSIVE);
1871
1872         /*
1873          * Now that we have the lock, check if someone initialized the page
1874          * already.
1875          */
1876         while (upto >= XLogCtl->InitializedUpTo || opportunistic)
1877         {
1878             XLogRecPtr InitializedUpTo = XLogCtl->InitializedUpTo;
1879             nextidx = XLogRecPtrToBufIdx(InitializedUpTo);
1880
1881             /*
1882              * Get ending-offset of the buffer page we need to replace (this may
1883              * be zero if the buffer hasn't been used yet). Fall through if it's
1884              * already written out.
1885              */
1886             OldPageRqstPtr = XLogCtl->xlblocks[nextidx];
1887             Assert(OldPageRqstPtr <= XLogCtl->InitializedUpTo);
And, as you can see, XLogCtl->InitializedUpTo is not equal to the saved
value InitializedUpTo.
But we are under exclusive WALBufMappingLock and InitializedUpTo is
updated only under this lock.
So it means that LW-locks don't work!
Yeah, so it seems. XLogCtl->InitializedUpTo is quite clearly protected by
the WALBufMappingLock. All references to it (after StartupXLog) happen
while holding the lock.
Can you get the assembly output of the AdvanceXLInsertBuffer() function?
I wonder if the compiler is rearranging things so that
XLogCtl->InitializedUpTo is fetched before the LWLockAcquire call. Or
should there be a memory barrier instruction somewhere in LWLockAcquire?
- Heikki
Oh, you were one step ahead of me, I didn't understand it on first read
of your email. Need more coffee..
On 01/31/2017 05:03 PM, Konstantin Knizhnik wrote:
I inspected the code of pg_atomic_compare_exchange_u32_impl and didn't
find a sync in the prologue:
(dbx) listi pg_atomic_compare_exchange_u32_impl
[no sync instruction]
and if I compile this function standalone, I get the following assembler
code:
.pg_atomic_compare_exchange_u32_impl: # 0x0000000000000000 (H.4.NO_SYMBOL)
stdu SP,-128(SP)
std r3,176(SP)
std r4,184(SP)
std r5,192(SP)
ld r0,192(SP)
stw r0,192(SP)
sync
ld r4,176(SP)
ld r3,184(SP)
lwz r0,192(SP)
extsw r0,r0
lwa r5,0(r3)
...sync is here!
Ok, so, the 'sync' instruction gets lost somehow. That "standalone"
assembly version looks slightly different in other ways too; you perhaps
used different optimization levels, or it looks different when it's
inlined into the caller. Not sure which version of the function gdb
would show, when it's a "static inline" function. Would be good to check
the disassembly of LWLockAttemptLock(), to see if the 'sync' is there.
Certainly seems like a compiler bug, though.
- Heikki
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:
* __fetch_and_add() emits a leading "sync" and trailing "isync",
thereby
* providing sequential consistency. This is undocumented.
But it is not true any more (I checked the generated assembler code in the
debugger).
This is why I have added __sync() to this function. Now pgbench works
normally.
Also there is a mysterious disappearance of the asm section with the sync
instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using the __sync() built-in function instead.
Thanks to everybody who helped me to locate and fix this problem.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
xlc.patch (text/x-patch)
diff --git a/src/include/port/atomics/arch-ppc.h b/src/include/port/atomics/arch-ppc.h
index 2b54c42..5828f7e 100644
--- a/src/include/port/atomics/arch-ppc.h
+++ b/src/include/port/atomics/arch-ppc.h
@@ -23,4 +23,10 @@
#define pg_memory_barrier_impl() __asm__ __volatile__ ("sync" : : : "memory")
#define pg_read_barrier_impl() __asm__ __volatile__ ("lwsync" : : : "memory")
#define pg_write_barrier_impl() __asm__ __volatile__ ("lwsync" : : : "memory")
+
+#else
+#define pg_memory_barrier_impl() __sync()
+#define pg_read_barrier_impl() __lwsync()
+#define pg_write_barrier_impl() __lwsync()
+
#endif
diff --git a/src/include/port/atomics/generic-xlc.h b/src/include/port/atomics/generic-xlc.h
index f4fd2f3..531d17c 100644
--- a/src/include/port/atomics/generic-xlc.h
+++ b/src/include/port/atomics/generic-xlc.h
@@ -36,7 +36,8 @@ typedef struct pg_atomic_uint64
#endif /* __64BIT__ */
#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32
static inline bool
pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
uint32 *expected, uint32 newval)
{
@@ -48,14 +49,14 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
* consistency only, do not use it here. GCC atomics observe the same
* restriction; see its rs6000_pre_atomic_barrier().
*/
- __asm__ __volatile__ (" sync \n" ::: "memory");
+ __sync();
/*
* XXX: __compare_and_swap is defined to take signed parameters, but that
* shouldn't matter since we don't perform any arithmetic operations.
*/
ret = __compare_and_swap((volatile int*)&ptr->value,
@@ -77,6 +78,7 @@ pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
* __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
* providing sequential consistency. This is undocumented.
*/
+ __sync();
return __fetch_and_add((volatile int *)&ptr->value, add_);
}
@@ -89,10 +91,10 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
{
bool ret;
- __asm__ __volatile__ (" sync \n" ::: "memory");
+ __sync();
ret = __compare_and_swaplp((volatile long*)&ptr->value,
(long *)expected, (long)newval);
__isync();
@@ -103,7 +105,8 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
static inline uint64
pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_)
{
- return __fetch_and_addlp((volatile long *)&ptr->value, add_);
+ __sync();
+ return __fetch_and_addlp((volatile long *)&ptr->value, add_);
}
#endif /* PG_HAVE_ATOMIC_U64_SUPPORT */
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 7aad2de..c6ef114 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -832,9 +831,8 @@ typedef unsigned int slock_t;
#include <sys/atomic_op.h>
typedef int slock_t;
-
-#define TAS(lock) _check_lock((slock_t *) (lock), 0, 1)
-#define S_UNLOCK(lock) _clear_lock((slock_t *) (lock), 0)
+#define TAS(lock) __check_lock_mp((slock_t *) (lock), 0, 1)
+#define S_UNLOCK(lock) __clear_lock_mp((slock_t *) (lock), 0)
#endif /* _AIX */
On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:
* __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
* providing sequential consistency. This is undocumented.
But it is not true any more (I checked the generated assembler code in the
debugger).
This is why I have added __sync() to this function. Now pgbench working
normally.
Seems like it was not so much undocumented, but an implementation detail
that was not guaranteed after all..
Does __fetch_and_add emit a trailing isync there either? Seems odd if
__compare_and_swap requires it, but __fetch_and_add does not. Unless we
can find conclusive documentation on that, I think we should assume that
an __isync() is required, too.
There was a long thread on these things the last time this was changed:
/messages/by-id/20160425185204.jrvlghn3jxulsb7i@alap3.anarazel.de.
I couldn't find an explanation there of why we thought that
fetch_and_add implicitly performs sync and isync.
Also there is a mysterious disappearance of the asm section with the sync
instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using the __sync() built-in function instead.
__sync() seems more appropriate there, anyway. We're using intrinsics
for all the other things in generic-xlc.h. But it sure is scary that the
"asm" sections just disappeared.
In arch-ppc.h, shouldn't we have #ifdef __IBMC__ guards for the __sync()
and __lwsync() intrinsics? Those are an xlc compiler-specific thing,
right? Or if they are expected to work on any ppc compiler, then we
should probably use them always, instead of the asm sections.
In summary, I came up with the attached. It's essentially your patch,
with tweaks for the above-mentioned things. I don't have a powerpc
system to test on, so there are probably some silly typos there.
- Heikki
Attachments:
xlc-heikki-1.patch (text/plain; charset=UTF-8)
diff --git a/src/include/port/atomics/arch-ppc.h b/src/include/port/atomics/arch-ppc.h
index ed1cd9d1b9..7cf8c8ef97 100644
--- a/src/include/port/atomics/arch-ppc.h
+++ b/src/include/port/atomics/arch-ppc.h
@@ -23,4 +23,11 @@
#define pg_memory_barrier_impl() __asm__ __volatile__ ("sync" : : : "memory")
#define pg_read_barrier_impl() __asm__ __volatile__ ("lwsync" : : : "memory")
#define pg_write_barrier_impl() __asm__ __volatile__ ("lwsync" : : : "memory")
+
+#if defined(__IBMC__) || defined(__IBMCPP__)
+
+#define pg_memory_barrier_impl() __sync()
+#define pg_read_barrier_impl() __lwsync()
+#define pg_write_barrier_impl() __lwsync()
+
#endif
diff --git a/src/include/port/atomics/generic-xlc.h b/src/include/port/atomics/generic-xlc.h
index f854612d39..e1dd3310a5 100644
--- a/src/include/port/atomics/generic-xlc.h
+++ b/src/include/port/atomics/generic-xlc.h
@@ -48,7 +48,7 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
* consistency only, do not use it here. GCC atomics observe the same
* restriction; see its rs6000_pre_atomic_barrier().
*/
- __asm__ __volatile__ (" sync \n" ::: "memory");
+ __sync();
/*
* XXX: __compare_and_swap is defined to take signed parameters, but that
@@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
static inline uint32
pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
{
+ uint32 ret;
+
/*
- * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
- * providing sequential consistency. This is undocumented.
+ * Use __sync() before and __isync() after, like in compare-exchange
+ * above.
*/
- return __fetch_and_add((volatile int *)&ptr->value, add_);
+ __sync();
+
+ ret = __fetch_and_add((volatile int *)&ptr->value, add_);
+
+ __isync();
+
+ return ret;
}
#ifdef PG_HAVE_ATOMIC_U64_SUPPORT
@@ -89,7 +97,7 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
{
bool ret;
- __asm__ __volatile__ (" sync \n" ::: "memory");
+ __sync();
ret = __compare_and_swaplp((volatile long*)&ptr->value,
(long *)expected, (long)newval);
@@ -103,7 +111,15 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
static inline uint64
pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_)
{
- return __fetch_and_addlp((volatile long *)&ptr->value, add_);
+ uint64 ret;
+
+ __sync();
+
+ ret = __fetch_and_addlp((volatile long *)&ptr->value, add_);
+
+ __isync();
+
+ return ret;
}
#endif /* PG_HAVE_ATOMIC_U64_SUPPORT */
Hi,
I'm now working on the port of PostgreSQL on AIX.
(RPMs can be found, as free OpenSource work, at http://bullfreeware.com/ .
http://bullfreeware.com/search.php?package=postgresql )
I was not aware of any issue with XLC v12 on AIX for atomic operations.
(XLC v13 generates at least 2 tests failures)
For now, with version 9.6.1, all tests "check-world", plus numeric_big test, are OK, in both 32 & 64bit versions.
Am I missing something?
I configure the build of PostgreSQL with (in 64bits):
./configure
--prefix=/opt/freeware
--libdir=/opt/freeware/lib64
--mandir=/opt/freeware/man
--with-perl
--with-tcl
--with-tclconfig=/opt/freeware/lib
--with-python
--with-ldap
--with-openssl
--with-libxml
--with-libxslt
--enable-nls
--enable-thread-safety
--sysconfdir=/etc/sysconfig/postgresql
Am I missing some option for more optimization on AIX?
Thanks
Regards,
Tony
On 01/02/2017 at 12:07, Konstantin Knizhnik wrote:
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:
* __fetch_and_add() emits a leading "sync" and trailing "isync",
thereby
* providing sequential consistency. This is undocumented.
But it is not true any more (I checked generated assembler code in
debugger).
This is why I have added __sync() to this function. Now pgbench working
normally.
Also there is mysterious disappearance of assembler section function
with sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using __sync() built-in function instead.
Thanks to everybody who helped me to locate and fix this problem.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
ATOS WARNING !
This message contains attachments that could potentially harm your computer.
Please make sure you open ONLY attachments from senders you know, trust and is in an e-mail that you are expecting.
Hi,
We are using the 13.1.3 version of XLC. All tests pass.
Please notice that it is a synchronization bug which can be reproduced
only under heavy load.
Our server has 64 cores and it is necessary to run pgbench with 100
connections during several minutes to reproduce the problem.
So maybe you just didn't notice it ;)
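For reference, a reproduction recipe along the lines described above might look like this. It is a sketch only: the scale factor, thread count, duration, and the 'bench' database name are guesses, and the commands assume a running server:

```shell
# Guarded so the commands only run where pgbench and a server are available.
if command -v pgbench >/dev/null 2>&1 && pg_isready -q 2>/dev/null; then
    pgbench -i -s 100 bench             # initialize; scale factor is a guess
    pgbench -c 100 -j 16 -T 600 bench   # 100 connections for ~10 minutes
else
    echo "pgbench or server not available; commands shown for reference"
fi
```

The key parameter is -c 100, matching the 100-connection load that reproduced the deadlock.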
On 01.02.2017 16:29, REIX, Tony wrote:
Hi,
I'm now working on the port of PostgreSQL on AIX.
(RPMs can be found, as free OpenSource work, at
http://bullfreeware.com/ .
http://bullfreeware.com/search.php?package=postgresql )
I was not aware of any issue with XLC v12 on AIX for atomic operations.
(XLC v13 generates at least 2 test failures)
For now, with version 9.6.1, all tests "check-world", plus
numeric_big test, are OK, in both 32 & 64bit versions.
Am I missing something?
I configure the build of PostgreSQL with (in 64bits):
./configure
--prefix=/opt/freeware
--libdir=/opt/freeware/lib64
--mandir=/opt/freeware/man
--with-perl
--with-tcl
--with-tclconfig=/opt/freeware/lib
--with-python
--with-ldap
--with-openssl
--with-libxml
--with-libxslt
--enable-nls
--enable-thread-safety
--sysconfdir=/etc/sysconfig/postgresql
Am I missing some option for more optimization on AIX?
Thanks
Regards,
Tony
On 01/02/2017 at 12:07, Konstantin Knizhnik wrote:
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:
* __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
* providing sequential consistency. This is undocumented.
But it is not true any more (I checked the generated assembler code in the
debugger).
This is why I have added __sync() to this function. Now pgbench works
normally.
Also there is a mysterious disappearance of the asm section with the sync
instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using the __sync() built-in function instead.
Thanks to everybody who helped me to locate and fix this problem.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 01.02.2017 15:39, Heikki Linnakangas wrote:
On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:
* __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
* providing sequential consistency. This is undocumented.
But it is not true any more (I checked the generated assembler code in the
debugger).
This is why I have added __sync() to this function. Now pgbench works
normally.
Seems like it was not so much undocumented, but an implementation
detail that was not guaranteed after all..
Does __fetch_and_add emit a trailing isync there either? Seems odd if
__compare_and_swap requires it, but __fetch_and_add does not. Unless
we can find conclusive documentation on that, I think we should assume
that an __isync() is required, too.
There was a long thread on these things the last time this was
changed:
/messages/by-id/20160425185204.jrvlghn3jxulsb7i@alap3.anarazel.de.
I couldn't find an explanation there of why we thought that
fetch_and_add implicitly performs sync and isync.
Also there is a mysterious disappearance of the asm section with the sync
instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using the __sync() built-in function instead.
__sync() seems more appropriate there, anyway. We're using intrinsics
for all the other things in generic-xlc.h. But it sure is scary that
the "asm" sections just disappeared.
In arch-ppc.h, shouldn't we have #ifdef __IBMC__ guards for the
__sync() and __lwsync() intrinsics? Those are an xlc compiler-specific
thing, right? Or if they are expected to work on any ppc compiler,
then we should probably use them always, instead of the asm sections.
In summary, I came up with the attached. It's essentially your patch,
with tweaks for the above-mentioned things. I don't have a powerpc
system to test on, so there are probably some silly typos there.
Why do you prefer to use _check_lock instead of __check_lock_mp?
The first one is not even mentioned in the XLC compiler manual:
http://www-01.ibm.com/support/docview.wss?uid=swg27046906&aid=7
or
http://scv.bu.edu/computation/bluegene/IBMdocs/compiler/xlc-8.0/html/compiler/ref/bif_sync.htm
- Heikki
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 01.02.2017 15:39, Heikki Linnakangas wrote:
In summary, I came up with the attached. It's essentially your patch,
with tweaks for the above-mentioned things. I don't have a powerpc
system to test on, so there are probably some silly typos there.
Attached please find a fixed version of your patch.
I verified that it applies correctly, builds, and postgres works normally
with it.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
xlc-heikki-2.patch (text/x-patch)
diff --git a/src/include/port/atomics/arch-ppc.h b/src/include/port/atomics/arch-ppc.h
index ed1cd9d1b9..7cf8c8ef97 100644
--- a/src/include/port/atomics/arch-ppc.h
+++ b/src/include/port/atomics/arch-ppc.h
@@ -23,4 +23,11 @@
#define pg_memory_barrier_impl() __asm__ __volatile__ ("sync" : : : "memory")
#define pg_read_barrier_impl() __asm__ __volatile__ ("lwsync" : : : "memory")
#define pg_write_barrier_impl() __asm__ __volatile__ ("lwsync" : : : "memory")
+
+#elif defined(__IBMC__) || defined(__IBMCPP__)
+
+#define pg_memory_barrier_impl() __sync()
+#define pg_read_barrier_impl() __lwsync()
+#define pg_write_barrier_impl() __lwsync()
+
#endif
diff --git a/src/include/port/atomics/generic-xlc.h b/src/include/port/atomics/generic-xlc.h
index f854612d39..e1dd3310a5 100644
--- a/src/include/port/atomics/generic-xlc.h
+++ b/src/include/port/atomics/generic-xlc.h
@@ -48,7 +48,7 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
* consistency only, do not use it here. GCC atomics observe the same
* restriction; see its rs6000_pre_atomic_barrier().
*/
- __asm__ __volatile__ (" sync \n" ::: "memory");
+ __sync();
/*
* XXX: __compare_and_swap is defined to take signed parameters, but that
@@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
static inline uint32
pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
{
+ uint32 ret;
+
/*
- * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
- * providing sequential consistency. This is undocumented.
+ * Use __sync() before and __isync() after, like in compare-exchange
+ * above.
*/
- return __fetch_and_add((volatile int *)&ptr->value, add_);
+ __sync();
+
+ ret = __fetch_and_add((volatile int *)&ptr->value, add_);
+
+ __isync();
+
+ return ret;
}
#ifdef PG_HAVE_ATOMIC_U64_SUPPORT
@@ -89,7 +97,7 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
{
bool ret;
- __asm__ __volatile__ (" sync \n" ::: "memory");
+ __sync();
ret = __compare_and_swaplp((volatile long*)&ptr->value,
(long *)expected, (long)newval);
@@ -103,7 +111,15 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
static inline uint64
pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_)
{
- return __fetch_and_addlp((volatile long *)&ptr->value, add_);
+ uint64 ret;
+
+ __sync();
+
+ ret = __fetch_and_addlp((volatile long *)&ptr->value, add_);
+
+ __isync();
+
+ return ret;
}
#endif /* PG_HAVE_ATOMIC_U64_SUPPORT */
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 7aad2de..c6ef114 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -832,9 +831,8 @@ typedef unsigned int slock_t;
#include <sys/atomic_op.h>
typedef int slock_t;
-
-#define TAS(lock) _check_lock((slock_t *) (lock), 0, 1)
-#define S_UNLOCK(lock) _clear_lock((slock_t *) (lock), 0)
+#define TAS(lock) __check_lock_mp((slock_t *) (lock), 0, 1)
+#define S_UNLOCK(lock) __clear_lock_mp((slock_t *) (lock), 0)
#endif /* _AIX */
Hi Konstantin
XLC.
I'm on AIX 7.1 for now.
I'm using this version of XLC v13:
# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003
With this version, I have (at least, since I tested with "check" and not "check-world" at that time) 2 failing tests: create_aggregate, aggregates.
With the following XLC v12 version, I have NO test failure:
# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01.0000.0016
So maybe you are not using XLC v13.1.3.3, rather another sub-version. Unless you are using more options for the configure ?
Configure.
What are the options that you give to the configure ?
Hard load & 64 cores ? OK. That clearly explains why I do not see this issue.
pgbench? I wanted to run it. However, I'm still looking for where to get it, plus a guide for using it for testing. I would add such tests when building my PostgreSQL RPMs on AIX. So any help is welcome!
Performance.
- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any idea how I should proceed ? Any PostgreSQL performance benchmark that I could find and use ? pgbench ?
- I'm interested in any information for improving the performance & quality of my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) has sold IBM Power machines under the Escala brand for ages, 25 years this year.)
How to help ?
How could I help for improving the quality and performance of PostgreSQL on AIX ?
I may have access to very big machines for even more deeply testing of PostgreSQL. I just need to know how to run tests.
Thanks!
Regards,
Tony
On 01/02/2017 at 14:48, Konstantin Knizhnik wrote:
Hi,
We are using 13.1.3 version of XLC. All tests are passed.
Please note that this is a synchronization bug which can be reproduced only under hard load.
Our server has 64 cores and it is necessary to run pgbench with 100 connections during several minutes to reproduce the problem.
So maybe you just didn't notice it ;)
On 01.02.2017 16:29, REIX, Tony wrote:
Hi,
I'm now working on the port of PostgreSQL on AIX.
(RPMs can be found, as free OpenSource work, at http://bullfreeware.com/ .
http://bullfreeware.com/search.php?package=postgresql )
I was not aware of any issue with XLC v12 on AIX for atomic operations.
(XLC v13 generates at least 2 test failures)
For now, with version 9.6.1, all tests "check-world", plus numeric_big test, are OK, in both 32 & 64bit versions.
Am I missing something ?
I configure the build of PostgreSQL with (in 64bits):
./configure
--prefix=/opt/freeware
--libdir=/opt/freeware/lib64
--mandir=/opt/freeware/man
--with-perl
--with-tcl
--with-tclconfig=/opt/freeware/lib
--with-python
--with-ldap
--with-openssl
--with-libxml
--with-libxslt
--enable-nls
--enable-thread-safety
--sysconfdir=/etc/sysconfig/postgresql
Am I missing some option for more optimization on AIX ?
Thanks
Regards,
Tony
On 01/02/2017 at 12:07, Konstantin Knizhnik wrote:
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:
* __fetch_and_add() emits a leading "sync" and trailing "isync",
thereby
* providing sequential consistency. This is undocumented.
But it is not true any more (I checked the generated assembler code in a debugger).
This is why I have added __sync() to this function. Now pgbench works normally.
Also, the assembler section with the sync instruction mysteriously disappeared from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using the __sync() built-in function instead.
Thanks to everybody who helped me to locate and fix this problem.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
ATOS WARNING !
This message contains attachments that could potentially harm your computer.
Please make sure you open ONLY attachments from senders you know and trust, in an e-mail that you are expecting.
Hi Tony,
On 01.02.2017 18:42, REIX, Tony wrote:
Hi Konstantin
XLC.
I'm on AIX 7.1 for now.
I'm using this version of XLC v13:
# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003
With this version, I have (at least, since I tested with "check" and not "check-world" at that time) 2 failing tests: create_aggregate and aggregates.
With the following XLC v12 version, I have NO test failure:
# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01.0000.0016
So maybe you are not using XLC v13.1.3.3 but another sub-version, unless you are passing more options to configure?
Configure.
What are the options that you give to the configure ?
export CC="/opt/IBM/xlc/13.1.3/bin/xlc"
export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 "
export LDFLAGS="-Wl,-bbigtoc,-b64"
export AR="/usr/bin/ar -X64"
export LD="/usr/bin/ld -b64 "
export NM="/usr/bin/nm -X64"
./configure --prefix="/opt/postgresql/xlc-debug/9.6"
Hard load & 64 cores ? OK. That clearly explains why I do not see this issue.
pgbench ? I wanted to run it. However, I'm still looking where to get it plus a guide for using it for testing.
pgbench is part of the Postgres distribution (src/bin/pgbench).
I would add such tests when building my PostgreSQL RPMs on AIX. So any help is welcome !
Performance.
- Also, I'd like to compare PostgreSQL performance on AIX vs
Linux/PPC64. Any idea how I should proceed ? Any PostgreSQL
performance benchmark that I could find and use ? pgbench ?
pgbench is the most widely used tool simulating an OLTP workload. Certainly it is quite primitive and its results are rather artificial; TPC-C seems to be a better choice.
But the best option is to implement your own benchmark simulating the actual workload of your real application.
- I'm interested in any information for improving the performance & quality of my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) has sold IBM Power machines under the Escala brand for ages, 25 years this year.)
How to help ?
How could I help for improving the quality and performance of PostgreSQL on AIX ?
We still have one open issue on AIX: see
https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg303094.html
It will be great if you can somehow help to fix this problem.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 02/01/2017 04:12 PM, Konstantin Knizhnik wrote:
On 01.02.2017 15:39, Heikki Linnakangas wrote:
On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

* __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
* providing sequential consistency. This is undocumented.

But it is not true any more (I checked the generated assembler code in a debugger).
This is why I have added __sync() to this function. Now pgbench works normally.

Seems like it was not so much undocumented, but an implementation detail that was not guaranteed after all..

Does __fetch_and_add emit a trailing isync there either? Seems odd if __compare_and_swap requires it, but __fetch_and_add does not. Unless we can find conclusive documentation on that, I think we should assume that an __isync() is required, too.

There was a long thread on these things the last time this was changed:
/messages/by-id/20160425185204.jrvlghn3jxulsb7i@alap3.anarazel.de.
I couldn't find an explanation there of why we thought that fetch_and_add implicitly performs sync and isync.

Also there is a mysterious disappearance of the assembler section with the sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using the __sync() built-in function instead.

__sync() seems more appropriate there, anyway. We're using intrinsics for all the other things in generic-xlc.h. But it sure is scary that the "asm" sections just disappeared.

In arch-ppc.h, shouldn't we have #ifdef __IBMC__ guards for the __sync() and __lwsync() intrinsics? Those are an xlc compiler-specific thing, right? Or if they are expected to work on any ppc compiler, then we should probably use them always, instead of the asm sections.

In summary, I came up with the attached. It's essentially your patch, with tweaks for the above-mentioned things. I don't have a powerpc system to test on, so there are probably some silly typos there.

Why do you prefer to use _check_lock instead of __check_lock_mp ?
The first one is not even mentioned in the XLC compiler manual:
http://www-01.ibm.com/support/docview.wss?uid=swg27046906&aid=7
or
http://scv.bu.edu/computation/bluegene/IBMdocs/compiler/xlc-8.0/html/compiler/ref/bif_sync.htm
Googling around, it seems that they do more or less the same thing. I
would guess that they actually produce the same assembly code, but I
have no machine to test on. If I understand correctly, the difference is
that __check_lock_mp() is an xlc compiler intrinsic, while _check_lock()
is a libc function. The libc function presumably does __check_lock_mp()
or __check_lock_up() depending on whether the system is a multi- or
uni-processor system.
So I think if we're going to change this, the use of __check_lock_mp()
needs to be in an #ifdef block to check that you're on the XLC compiler,
as it's a *compiler* intrinsic, while the current code that uses
_check_lock() are in an "#ifdef _AIX" block, which is correct for
_check_lock() because it's defined in libc, not by the compiler.
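A sketch of what that guard could look like (hypothetical; the stub slock_t is included only so the fragment stands alone, the real definitions live in src/include/storage/s_lock.h):

```c
/* __check_lock_mp/__clear_lock_mp are xlc *compiler* intrinsics, so
   they need an __IBMC__ test; _check_lock/_clear_lock come from AIX
   libc and only need _AIX. */
typedef int slock_t;

#if defined(_AIX) && defined(__IBMC__)
/* xlc on AIX: use the compiler intrinsics directly */
#define TAS(lock)      __check_lock_mp((slock_t *) (lock), 0, 1)
#define S_UNLOCK(lock) __clear_lock_mp((slock_t *) (lock), 0)
#elif defined(_AIX)
/* other compilers on AIX: fall back to the libc wrappers */
#define TAS(lock)      _check_lock((slock_t *) (lock), 0, 1)
#define S_UNLOCK(lock) _clear_lock((slock_t *) (lock), 0)
#endif
```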
But if there's no pressing reason to change it, let's leave it alone.
It's not related to the problem at hand, right?
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi Konstantin,
Please run: /opt/IBM/xlc/13.1.3/bin/xlc -qversion so that I know your exact XLC v13 version.
I'm building on Power7 and not giving any architecture flag to XLC.
I'm not using -qalign=natural . Thus, by default, XLC use -qalign=power, which is close to natural, as explained at:
https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/compiler_ref/opt_align.html
Why are you using this flag ?
Thanks for the info about pgbench. The PostgreSQL web-site contains a lot of old information...
If you could share scripts or instructions about the tests you are doing with pgbench, I would reproduce them here.
I have no "real" application. My job consists of porting OpenSource packages to AIX. Many packages. Erlang, Go, these days. I just want to make PostgreSQL RPMs as good as possible... within the limited amount of time I can give to this package, before moving to another one.
About the zombie issue, I've discussed it with my colleagues. It looks like the process stays a zombie until the parent collects its status. However, though I did that several times, I do not remember the details well. And that should not be specific to AIX. I'll discuss it with another colleague, tomorrow, who should understand this better than me.
Patch for Large Files: When building PostgreSQL, I found it necessary to use the following patch so that PostgreSQL works with large files. I do not remember the details. Do you agree with such a patch ? The first version (new-...) shows the exact places where #define _LARGE_FILES 1 is required. The second version (new2-...) is simpler.
I'm now experimenting with your patch for the deadlock. However, that should be invisible with the "check-world" tests I guess.
Regards,
Tony
Attachments:
postgresql-9.6.1-new2-LARGE_FILES.patch (text/x-patch)
--- src/include/postgres.h.ORIGIN 2017-02-01 07:32:04 -0600
+++ src/include/postgres.h 2017-02-01 07:32:29 -0600
@@ -44,6 +44,10 @@
#ifndef POSTGRES_H
#define POSTGRES_H
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "c.h"
#include "utils/elog.h"
#include "utils/palloc.h"
postgresql-9.6.1-new-LARGE_FILES.patch (text/x-patch)
--- src/pl/plpython/plpy_cursorobject.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ src/pl/plpython/plpy_cursorobject.c 2017-02-01 03:00:20 -0600
@@ -4,6 +4,10 @@
* src/pl/plpython/plpy_cursorobject.c
*/
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include <limits.h>
--- src/pl/plpython/plpy_elog.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ src/pl/plpython/plpy_elog.c 2017-02-01 03:01:45 -0600
@@ -4,6 +4,10 @@
* src/pl/plpython/plpy_elog.c
*/
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include "lib/stringinfo.h"
--- src/pl/plpython/plpy_exec.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ src/pl/plpython/plpy_exec.c 2017-02-01 03:01:59 -0600
@@ -4,6 +4,10 @@
* src/pl/plpython/plpy_exec.c
*/
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include "access/htup_details.h"
--- src/pl/plpython/plpy_main.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ src/pl/plpython/plpy_main.c 2017-02-01 03:02:11 -0600
@@ -4,6 +4,10 @@
* src/pl/plpython/plpy_main.c
*/
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include "access/htup_details.h"
--- src/pl/plpython/plpy_planobject.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ src/pl/plpython/plpy_planobject.c 2017-02-01 03:02:24 -0600
@@ -4,6 +4,10 @@
* src/pl/plpython/plpy_planobject.c
*/
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include "plpython.h"
--- src/pl/plpython/plpy_plpymodule.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ src/pl/plpython/plpy_plpymodule.c 2017-02-01 03:02:34 -0600
@@ -4,6 +4,10 @@
* src/pl/plpython/plpy_plpymodule.c
*/
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include "mb/pg_wchar.h"
--- src/pl/plpython/plpy_procedure.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ src/pl/plpython/plpy_procedure.c 2017-02-01 03:02:41 -0600
@@ -4,6 +4,10 @@
* src/pl/plpython/plpy_procedure.c
*/
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include "access/htup_details.h"
--- src/pl/plpython/plpy_resultobject.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ src/pl/plpython/plpy_resultobject.c 2017-02-01 03:02:48 -0600
@@ -4,6 +4,10 @@
* src/pl/plpython/plpy_resultobject.c
*/
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include "plpython.h"
--- src/pl/plpython/plpy_spi.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ src/pl/plpython/plpy_spi.c 2017-02-01 03:02:57 -0600
@@ -4,6 +4,10 @@
* src/pl/plpython/plpy_spi.c
*/
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include <limits.h>
--- src/pl/plpython/plpy_subxactobject.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ src/pl/plpython/plpy_subxactobject.c 2017-02-01 03:03:06 -0600
@@ -4,6 +4,10 @@
* src/pl/plpython/plpy_subxactobject.c
*/
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include "access/xact.h"
--- src/pl/plpython/plpy_typeio.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ src/pl/plpython/plpy_typeio.c 2017-02-01 03:03:15 -0600
@@ -4,6 +4,10 @@
* src/pl/plpython/plpy_typeio.c
*/
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include "access/htup_details.h"
--- src/pl/plpython/plpy_util.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ src/pl/plpython/plpy_util.c 2017-02-01 03:03:23 -0600
@@ -4,6 +4,10 @@
* src/pl/plpython/plpy_util.c
*/
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include "mb/pg_wchar.h"
--- contrib/hstore_plpython/hstore_plpython.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ contrib/hstore_plpython/hstore_plpython.c 2017-02-01 03:03:32 -0600
@@ -1,3 +1,7 @@
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include "fmgr.h"
#include "plpython.h"
--- contrib/ltree_plpython/ltree_plpython.c.ORIGIN 2017-02-01 02:59:08 -0600
+++ contrib/ltree_plpython/ltree_plpython.c 2017-02-01 03:03:41 -0600
@@ -1,3 +1,7 @@
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
#include "postgres.h"
#include "fmgr.h"
#include "plpython.h"
On 02/01/2017 08:30 PM, REIX, Tony wrote:
Hi Konstantin,
Please run: /opt/IBM/xlc/13.1.3/bin/xlc -qversion so that I know your exact XLC v13 version.
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
I'm building on Power7 and not giving any architecture flag to XLC.
I'm not using -qalign=natural . Thus, by default, XLC uses -qalign=power, which is close to natural, as explained at:
https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/compiler_ref/opt_align.html
Why are you using this flag ?
Because otherwise the double type is aligned on 4 bytes.
Thanks for the info about pgbench. The PostgreSQL web-site contains a lot of old information...
If you could share scripts or instructions about the tests you are doing with pgbench, I would reproduce them here.
You do not need any script.
Just two simple commands.
One to initialize database:
pgbench -i -s 1000
And another to run benchmark itself:
pgbench -c 100 -j 20 -P 1 -T 1000000000
I have no "real" application. My job consists of porting OpenSource packages to AIX. Many packages. Erlang, Go, these days. I just want to make PostgreSQL RPMs as good as possible... within the limited amount of time I can give to this package, before moving to another one.
About the zombie issue, I've discussed it with my colleagues. It looks like the process stays a zombie until the parent collects its status. However, though I did that several times, I do not remember the details well. And that should not be specific to AIX.
I'll discuss it with another colleague, tomorrow, who should understand this better than me.
1. The process is not in zombie state (according to ps). It is in <exiting> state... It is something AIX specific; I have not seen processes in this state on Linux.
2. I have implemented a simple test - a forkbomb. It creates 1000 children and then waits for them. It is about ten times slower than on Intel/Linux, but still much faster than 100 seconds. So there is some difference between a postgres backend and a dummy process doing nothing - just immediately terminating after return from fork().
Patch for Large Files: When building PostgreSQL, I found it necessary to use the following patch so that PostgreSQL works with large files. I do not remember the details. Do you agree with such a patch ? The first version (new-...) shows the exact places where #define _LARGE_FILES 1 is required. The second version (new2-...) is simpler.
I'm now experimenting with your patch for the deadlock. However, that should be invisible with the "check-world" tests I guess.
Regards,
Tony
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 02/01/2017 08:28 PM, Heikki Linnakangas wrote:
But if there's no pressing reason to change it, let's leave it alone. It's not related to the problem at hand, right?
Yes, I agree with you: we had better leave it as it is.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Last update on the issue with deadlock in XLogInsert.
After almost one day of running, pgbench is once again not working normally :(
There is no deadlock, there are no core files and no error messages in the log.
But TPS is almost zero:
progress: 57446.0 s, 1.1 tps, lat 3840.265 ms stddev NaNQ
progress: 57447.3 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57448.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57449.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57450.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57451.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57452.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57453.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57454.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57455.1 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57456.5 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57457.1 s, 164.6 tps, lat 11504.085 ms stddev 5902.148
progress: 57458.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57459.0 s, 234.0 tps, lat 1597.573 ms stddev 3665.814
progress: 57460.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57461.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57462.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57463.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57464.0 s, 602.8 tps, lat 906.765 ms stddev 1940.256
progress: 57465.0 s, 7.2 tps, lat 38.052 ms stddev 12.302
progress: 57466.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57467.1 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57468.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57469.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57470.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57471.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57472.1 s, 147.8 tps, lat 4379.790 ms stddev 3431.477
progress: 57473.0 s, 1314.1 tps, lat 156.884 ms stddev 535.761
progress: 57474.0 s, 1272.2 tps, lat 31.548 ms stddev 59.538
progress: 57475.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57476.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57477.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57478.0 s, 1688.6 tps, lat 268.379 ms stddev 956.537
progress: 57479.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57480.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57481.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57482.1 s, 29.0 tps, lat 3500.432 ms stddev 54.177
progress: 57483.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57484.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57485.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57486.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57487.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57488.0 s, 66.0 tps, lat 9813.646 ms stddev 19.807
progress: 57489.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57490.0 s, 31.0 tps, lat 8368.125 ms stddev 933.997
progress: 57491.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57492.0 s, 1601.0 tps, lat 226.865 ms stddev 844.952
progress: 57493.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57494.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
ps auwx shows the following picture:
[10:44:12]postgres@postgres:~/postgresql $ ps auwx | fgrep postgres
postgres 61802470 0.4 0.0 4064 4180 pts/6 A 18:54:58 976:56
pgbench -c 100 -j 20 -P 1 -T 1000000000 -p 5436
postgres 15271518 0.0 0.0 138276 15024 - A 10:43:34 0:06
postgres: autovacuum worker process postgres
postgres 13305354 0.0 0.0 22944 21356 - A 20:49:04 27:51
postgres: autovacuum worker process postgres
postgres 5245902 0.0 0.0 14072 14020 - A 18:54:59 10:24
postgres: postgres postgres [local] COMMIT
postgres 44303278 0.0 0.0 15176 14036 - A 18:54:59 10:18
postgres: postgres postgres [local] COMMIT
postgres 38601340 0.0 0.0 11564 14008 - A 18:54:59 10:16
postgres: postgres postgres [local] COMMIT
postgres 53674890 0.0 0.0 12712 14004 - A 18:54:59 8:54
postgres: postgres postgres [local] COMMIT
postgres 27591640 0.0 0.0 15040 14028 - A 18:54:59 8:38
postgres: postgres postgres [local] COMMIT
postgres 40960422 0.0 0.0 12128 13996 - A 18:54:59 8:36
postgres: postgres postgres [local] COMMIT
postgres 41288514 0.0 0.0 10544 14012 - A 18:54:59 8:30
postgres: postgres postgres [local] idle
postgres 55771564 0.0 0.0 12844 14008 - A 18:54:59 8:24
postgres: postgres postgres [local] COMMIT
postgres 21760842 0.0 0.0 13164 14008 - A 18:54:59 8:17
postgres: postgres postgres [local] COMMIT
postgres 18810974 0.0 0.0 10416 14012 - A 18:54:59 8:13
postgres: postgres postgres [local] idle in transaction
postgres 17566474 0.0 0.0 10224 14012 - A 18:54:59 8:02
postgres: postgres postgres [local] COMMIT
postgres 63963402 0.0 0.0 11300 14000 - A 18:54:59 7:48
postgres: postgres postgres [local] COMMIT
postgres 9963962 0.0 0.0 15548 14024 - A 18:54:59 7:37
postgres: postgres postgres [local] idle
postgres 10094942 0.0 0.0 12192 13996 - A 18:54:59 7:33
postgres: postgres postgres [local] COMMIT
postgres 53740662 0.0 0.0 15104 14028 - A 18:54:59 7:33
postgres: postgres postgres [local] idle
postgres 42926266 0.0 0.0 15352 14020 - A 18:54:59 7:32
postgres: postgres postgres [local] COMMIT
postgres 29295244 0.0 0.0 10612 14016 - A 18:54:59 7:31
postgres: postgres postgres [local] idle in transaction
postgres 4392458 0.0 0.0 11504 14012 - A 18:54:59 7:28
postgres: postgres postgres [local] COMMIT
postgres 45482810 0.0 0.0 9896 14004 - A 18:54:59 7:27
postgres: postgres postgres [local] COMMIT
postgres 59703706 0.0 0.0 11384 14020 - A 18:54:59 7:26
postgres: postgres postgres [local] COMMIT
postgres 13697320 0.0 0.0 13556 14016 - A 18:54:59 7:26
postgres: postgres postgres [local] COMMIT
postgres 65275126 0.0 0.0 13748 14016 - A 18:54:59 7:24
postgres: postgres postgres [local] COMMIT
postgres 17435626 0.0 0.0 13492 14016 - A 18:54:59 7:23
postgres: postgres postgres [local] COMMIT
postgres 32834044 0.0 0.0 9648 14012 - A 18:54:59 7:23
postgres: postgres postgres [local] idle in transaction
postgres 3015796 0.0 0.0 15292 14024 - A 18:54:59 7:22
postgres: postgres postgres [local] COMMIT
postgres 54789310 0.0 0.0 15480 14020 - A 18:54:59 7:21
postgres: postgres postgres [local] COMMIT
postgres 13369644 0.0 0.0 13300 14016 - A 18:54:59 7:20
postgres: postgres postgres [local] COMMIT
postgres 49415352 0.0 0.0 12392 14004 - A 18:54:59 7:19
postgres: postgres postgres [local] COMMIT
postgres 11273960 0.0 0.0 12328 14004 - A 18:54:59 7:19
postgres: postgres postgres [local] COMMIT
postgres 37749126 0.0 0.0 13628 14024 - A 18:54:59 7:17
postgres: postgres postgres [local] COMMIT
postgres 42664990 0.0 0.0 14012 14024 - A 18:54:59 7:16
postgres: postgres postgres [local] idle
postgres 48628314 0.0 0.0 12972 14008 - A 18:54:59 7:15
postgres: postgres postgres [local] COMMIT
postgres 27526940 0.0 0.0 9832 14004 - A 18:54:59 7:15
postgres: postgres postgres [local] COMMIT
postgres 4262142 0.0 0.0 11048 14004 - A 18:54:59 7:14
postgres: postgres postgres [local] COMMIT
postgres 59049404 0.0 0.0 14200 14020 - A 18:54:59 7:13
postgres: postgres postgres [local] COMMIT
postgres 25035818 0.0 0.0 9264 14012 - A 18:54:59 7:11
postgres: postgres postgres [local] COMMIT
postgres 62587380 0.0 0.0 15420 14024 - A 18:54:59 7:10
postgres: postgres postgres [local] COMMIT
postgres 66848122 0.0 0.0 12588 14008 - A 18:54:59 7:06
postgres: postgres postgres [local] COMMIT
postgres 45352748 0.0 0.0 14912 14028 - A 18:54:59 7:04
postgres: postgres postgres [local] UPDATE waiting
postgres 46990366 0.0 0.0 11680 13996 - A 18:54:59 7:03
postgres: postgres postgres [local] idle
postgres 42271516 0.0 0.0 14776 14020 - A 18:54:59 6:59
postgres: postgres postgres [local] COMMIT
postgres 43253972 0.0 0.0 9192 14004 - A 18:54:59 6:59
postgres: postgres postgres [local] UPDATE waiting
postgres 37487324 0.0 0.0 11936 13996 - A 18:54:59 6:58
postgres: postgres postgres [local] COMMIT
postgres 33096324 0.0 0.0 14396 14024 - A 18:54:59 6:58
postgres: postgres postgres [local] COMMIT
postgres 37094658 0.0 0.0 11184 14012 - A 18:54:59 6:57
postgres: postgres postgres [local] UPDATE waiting
postgres 41223048 0.0 0.0 11628 14008 - A 18:54:59 6:57
postgres: postgres postgres [local] idle in transaction
postgres 13240806 0.0 0.0 10024 14004 - A 18:54:59 6:57
postgres: postgres postgres [local] COMMIT
postgres 61276560 0.0 0.0 10728 14004 - A 18:54:59 6:56
postgres: postgres postgres [local] COMMIT
postgres 66585476 0.0 0.0 12908 14008 - A 18:54:59 6:52
postgres: postgres postgres [local] UPDATE waiting
postgres 15074434 0.0 0.0 9328 14012 - A 18:54:59 6:50
postgres: postgres postgres [local] COMMIT
postgres 33751620 0.0 0.0 12456 14004 - A 18:54:59 6:47
postgres: postgres postgres [local] COMMIT
postgres 854400 0.0 0.0 14584 14020 - A 18:54:59 6:46
postgres: postgres postgres [local] COMMIT
postgres 36504484 0.0 0.0 14264 14020 - A 18:54:59 6:46
postgres: postgres postgres [local] COMMIT
postgres 61408076 0.0 0.0 11112 14004 - A 18:54:59 6:46
postgres: postgres postgres [local] COMMIT
postgres 24905384 0.0 0.0 14712 14020 - A 18:54:59 6:45
postgres: postgres postgres [local] COMMIT
postgres 61867150 0.0 0.0 13812 14016 - A 18:54:59 6:45
postgres: postgres postgres [local] COMMIT
postgres 38798230 0.0 0.0 13876 14016 - A 18:54:59 6:45
postgres: postgres postgres [local] COMMIT
postgres 53217076 0.0 0.0 13228 14008 - A 18:54:59 6:43
postgres: postgres postgres [local] COMMIT
postgres 19727378 0.0 0.0 9456 14012 - A 18:54:59 6:40
postgres: postgres postgres [local] idle in transaction
postgres 20253128 0.0 0.0 12648 14004 - A 18:54:59 6:38
postgres: postgres postgres [local] COMMIT
postgres 35784016 0.0 0.0 10088 14004 - A 18:54:59 6:38
postgres: postgres postgres [local] COMMIT
postgres 9243026 0.0 0.0 13100 14008 - A 18:54:59 6:37
postgres: postgres postgres [local] COMMIT
postgres 14027754 0.0 0.0 10792 14004 - A 18:54:59 6:35
postgres: postgres postgres [local] COMMIT
postgres 61342300 0.0 0.0 12264 14004 - A 18:54:59 6:34
postgres: postgres postgres [local] UPDATE waiting
postgres 21693262 0.0 0.0 14460 14024 - A 18:54:59 6:30
postgres: postgres postgres [local] COMMIT
postgres 53938020 0.0 0.0 10856 14004 - A 18:54:59 6:30
postgres: postgres postgres [local] COMMIT
postgres 24053688 0.0 0.0 12064 13996 - A 18:54:59 6:29
postgres: postgres postgres [local] COMMIT
postgres 45024698 0.0 0.0 14984 14036 - A 18:54:59 6:25
postgres: postgres postgres [local] COMMIT
postgres 20710448 0.0 0.0 10480 14012 - A 18:54:59 6:24
postgres: postgres postgres [local] COMMIT
postgres 8718716 0.0 0.0 15232 14028 - A 18:54:59 6:23
postgres: postgres postgres [local] COMMIT
postgres 55313538 0.0 0.0 14136 14020 - A 18:54:59 6:22
postgres: postgres postgres [local] COMMIT
postgres 13896472 0.0 0.0 9584 14012 - A 18:54:59 6:18
postgres: postgres postgres [local] COMMIT
postgres 8261178 0.0 0.0 9960 14004 - A 18:54:59 6:18
postgres: postgres postgres [local] COMMIT
postgres 8980574 0.0 0.0 10540 15608 - A 18:54:54 6:17
postgres: checkpointer process
postgres 4787946 0.0 0.0 9392 14012 - A 18:54:59 6:16
postgres: postgres postgres [local] COMMIT
postgres 27919600 0.0 0.0 11812 14000 - A 18:54:59 6:16
postgres: postgres postgres [local] COMMIT
postgres 47579896 0.0 0.0 10920 14004 - A 18:54:59 6:14
postgres: postgres postgres [local] COMMIT
postgres 10290864 0.0 0.0 10160 14012 - A 18:54:59 6:13
postgres: postgres postgres [local] COMMIT
postgres 46072384 0.0 0.0 14520 14020 - A 18:54:59 6:13
postgres: postgres postgres [local] COMMIT
postgres 57737434 0.0 0.0 9520 14012 - A 18:54:59 6:12
postgres: postgres postgres [local] COMMIT
postgres 65012512 0.0 0.0 9768 14004 - A 18:54:59 6:11
postgres: postgres postgres [local] COMMIT
postgres 21495924 0.0 0.0 12000 13996 - A 18:54:59 6:11
postgres: postgres postgres [local] COMMIT
postgres 59704720 0.0 0.0 14656 14028 - A 18:54:59 6:09
postgres: postgres postgres [local] COMMIT
postgres 58458128 0.0 0.0 13036 14008 - A 18:54:59 6:08
postgres: postgres postgres [local] COMMIT
postgres 53412068 0.0 0.0 13684 14016 - A 18:54:59 6:02
postgres: postgres postgres [local] COMMIT
postgres 8652638 0.0 0.0 11244 14008 - A 18:54:59 6:00
postgres: postgres postgres [local] COMMIT
postgres 14289464 0.0 0.0 10672 14012 - A 18:54:59 6:00
postgres: postgres postgres [local] COMMIT
postgres 16582572 0.0 0.0 14328 14020 - A 18:54:59 5:56
postgres: postgres postgres [local] COMMIT
postgres 9308408 0.0 0.0 9704 14004 - A 18:54:59 5:56
postgres: postgres postgres [local] COMMIT
postgres 51970736 0.0 0.0 11440 14012 - A 18:54:59 5:51
postgres: postgres postgres [local] COMMIT
postgres 19792490 0.0 0.0 13364 14016 - A 18:54:59 5:48
postgres: postgres postgres [local] COMMIT
postgres 58065970 0.0 0.0 11744 13996 - A 18:54:59 5:38
postgres: postgres postgres [local] COMMIT
postgres 17891344 0.0 0.0 10984 14004 - A 18:54:59 5:33
postgres: postgres postgres [local] COMMIT
postgres 20121588 0.0 0.0 12520 14004 - A 18:54:59 5:30
postgres: postgres postgres [local] COMMIT
postgres 39977868 0.0 0.0 13944 14020 - A 18:54:59 5:26
postgres: postgres postgres [local] COMMIT
postgres 25167604 0.0 0.0 11872 13996 - A 18:54:59 5:25
postgres: postgres postgres [local] COMMIT
postgres 22348378 0.0 0.0 10352 14012 - A 18:54:59 5:25
postgres: postgres postgres [local] COMMIT
postgres 8587156 0.0 0.0 12780 14008 - A 18:54:59 5:21
postgres: postgres postgres [local] COMMIT
postgres 12453402 0.0 0.0 10288 14012 - A 18:54:59 5:20
postgres: postgres postgres [local] COMMIT
postgres 25822956 0.0 0.0 13428 14016 - A 18:54:59 5:04
postgres: postgres postgres [local] COMMIT
postgres 7145012 0.0 0.0 14844 14024 - A 18:54:59 5:04
postgres: postgres postgres [local] COMMIT
postgres 10224292 0.0 0.0 8100 13040 - A 18:54:54 2:55
postgres: wal writer process
postgres 47711172 0.0 0.0 8080 13084 - A 18:54:54 0:57
postgres: writer process
postgres 22743474 0.0 0.0 8328 13204 - A 18:54:54 0:29
postgres: stats collector process
postgres 1706138 0.0 0.0 137280 12940 pts/6 A 18:54:54 0:17
/opt/postgresql/xlc-debug/9.6/bin/postgres -D xlc-debug
postgres 9898070 0.0 0.0 8528 13404 - A 18:54:54 0:02
postgres: autovacuum launcher process
postgres 55444330 0.0 0.0 7884 13016 - A 18:54:54 0:01
postgres: logger process
postgres 42927172 0.0 0.0 138232 13892 - A 10:44:18 0:00
postgres: autovacuum worker process postgres
postgres 3934886 0.0 0.0 244 256 pts/3 A 10:44:21 0:00 fgrep
postgres
which is periodically (but slowly) updated. So there seem to be no
hung backends (it is hard to inspect the state of all 100 backends).
CPU activity is almost zero as well as disk activity, memory is mostly
free, ...
Attaching to one of the backends, I got the following stack traces in
the debugger:
[10:44:21]postgres@postgres:~/postgresql $ dbx -a 25822956
/opt/postgresql/xlc-debug/9.6/bin/postgres
Waiting to attach to process 25822956 ...
Successfully attached to postgres.
warning: Directory containing postgres could not be determined.
Apply 'use' command to initialize source path.
Type 'help' for help.
reading symbolic information ...warning: no source compiled with -g
stopped in __fd_poll at 0x9000000001545d4
0x9000000001545d4 (__fd_poll+0xb4) e8410028 ld r2,0x28(r1)
(dbx) where
__fd_poll(??, ??, ??) at 0x9000000001545d4
WaitEventSetWait() at 0x10012b730
secure_read() at 0x100141ca4
IPRA.$pq_recvbuf() at 0x10013e370
pq_getbyte() at 0x10013f294
SocketBackend() at 0x10006cf98
postgres.IPRA.$exec_simple_query.ReadCommand() at 0x10006cf1c
PostgresMain() at 0x10006e590
IPRA.$BackendRun() at 0x1001301ac
BackendStartup() at 0x10012f784
postmaster.IPRA.$do_start_bgworker.IPRA.$ServerLoop() at 0x10012fec4
PostmasterMain() at 0x100134760
main() at 0x1000006ac
(dbx) cont
^C
Interrupt in semop at 0x9000000001f5790
0x9000000001f5790 (semop+0xb0) e8410028 ld r2,0x28(r1)
(dbx) where
semop(??, ??, ??) at 0x9000000001f5790
PGSemaphoreLock() at 0x100049040
LWLockAcquireOrWait() at 0x100047140
XLogFlush() at 0x1000e89ec
RecordTransactionCommit() at 0x10005856c
xact.RecordTransactionAbort.IPRA.$CommitTransaction() at 0x10005a598
CommitTransactionCommand() at 0x10005ea30
postgres.IPRA.$exec_describe_portal_message.IPRA.$finish_xact_command()
at 0x10006c91c
IPRA.$exec_simple_query() at 0x10006c298
PostgresMain() at 0x10006eac0
IPRA.$BackendRun() at 0x1001301ac
BackendStartup() at 0x10012f784
postmaster.IPRA.$do_start_bgworker.IPRA.$ServerLoop() at 0x10012fec4
PostmasterMain() at 0x100134760
main() at 0x1000006ac
(dbx) cont
^C
Interrupt in __fd_poll at 0x9000000001545d4
0x9000000001545d4 (__fd_poll+0xb4) e8410028 ld r2,0x28(r1)
(dbx) where
__fd_poll(??, ??, ??) at 0x9000000001545d4
WaitEventSetWait() at 0x10012b730
WaitLatchOrSocket() at 0x10012b48c
ProcSleep() at 0x100144fe4
IPRA.$WaitOnLock() at 0x1001492fc
LockAcquireExtended() at 0x1001519ec
XactLockTableWait() at 0x10016d7e8
heap_update() at 0x1000cc7e4
IPRA.$ExecUpdate() at 0x1004f041c
ExecModifyTable() at 0x1004f2220
ExecProcNode() at 0x1001faf08
IPRA.$ExecutePlan() at 0x1001f64b0
standard_ExecutorRun() at 0x1001f9c34
ExecutorRun() at 0x1001f9d6c
pquery.IPRA.$FillPortalStore.IPRA.$ProcessQuery() at 0x1003309e4
IPRA.$PortalRunMulti() at 0x10032fcf8
PortalRun() at 0x100331030
IPRA.$exec_simple_query() at 0x10006c014
PostgresMain() at 0x10006eac0
IPRA.$BackendRun() at 0x1001301ac
BackendStartup() at 0x10012f784
postmaster.IPRA.$do_start_bgworker.IPRA.$ServerLoop() at 0x10012fec4
PostmasterMain() at 0x100134760
main() at 0x1000006ac
(dbx) cont
User defined signal 1 in __fd_poll at 0x9000000001545d4
0x9000000001545d4 (__fd_poll+0xb4) e8410028 ld r2,0x28(r1)
(dbx) where
__fd_poll(??, ??, ??) at 0x9000000001545d4
WaitEventSetWait() at 0x10012b730
WaitLatchOrSocket() at 0x10012b48c
ProcSleep() at 0x100144fe4
IPRA.$WaitOnLock() at 0x1001492fc
LockAcquireExtended() at 0x1001519ec
XactLockTableWait() at 0x10016d7e8
heap_update() at 0x1000cc7e4
IPRA.$ExecUpdate() at 0x1004f041c
ExecModifyTable() at 0x1004f2220
ExecProcNode() at 0x1001faf08
IPRA.$ExecutePlan() at 0x1001f64b0
standard_ExecutorRun() at 0x1001f9c34
ExecutorRun() at 0x1001f9d6c
pquery.IPRA.$FillPortalStore.IPRA.$ProcessQuery() at 0x1003309e4
IPRA.$PortalRunMulti() at 0x10032fcf8
PortalRun() at 0x100331030
IPRA.$exec_simple_query() at 0x10006c014
PostgresMain() at 0x10006eac0
IPRA.$BackendRun() at 0x1001301ac
BackendStartup() at 0x10012f784
postmaster.IPRA.$do_start_bgworker.IPRA.$ServerLoop() at 0x10012fec4
PostmasterMain() at 0x100134760
main() at 0x1000006ac
(dbx) cont
Broken pipe in send at 0x90000000010cfb4
0x90000000010cfb4 (send+0x2b4) e8410028 ld r2,0x28(r1)
(dbx) where
send(??, ??, ??, ??) at 0x90000000010cfb4
secure_write() at 0x100141a94
IPRA.$internal_flush() at 0x10013e800
socket_flush() at 0x10013ee7c
ReadyForQuery@AF106_4() at 0x10033241c
PostgresMain() at 0x10006eb7c
IPRA.$BackendRun() at 0x1001301ac
BackendStartup() at 0x10012f784
postmaster.IPRA.$do_start_bgworker.IPRA.$ServerLoop() at 0x10012fec4
PostmasterMain() at 0x100134760
main() at 0x1000006ac
(dbx) where
send(??, ??, ??, ??) at 0x90000000010cfb4
secure_write() at 0x100141a94
IPRA.$internal_flush() at 0x10013e800
socket_flush() at 0x10013ee7c
ReadyForQuery@AF106_4() at 0x10033241c
PostgresMain() at 0x10006eb7c
IPRA.$BackendRun() at 0x1001301ac
BackendStartup() at 0x10012f784
postmaster.IPRA.$do_start_bgworker.IPRA.$ServerLoop() at 0x10012fec4
PostmasterMain() at 0x100134760
main() at 0x1000006ac
(dbx) cont
execution completed (exit code 1)
(dbx) quit
^C
(dbx) quit
libdebug assertion "(rc == DB_SUCCESS)" failed at line 162 in file
../../../../../../../../../../../src/bos/usr/ccs/lib/libdbx/libdebug/modules/procdebug/ptrace/procdb_PtraceSession.C
I have no idea what's going on. It is a release version without debug
information, assert checks, or lwlock info, so it is hard to debug.
Heikki, I will be pleased if you have a chance to log in to the system
and look at it yourself.
Maybe you will have some idea what's happening...
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi Konstantin
I've discussed the "zombie/exit" issue with our expert here.
- He does not think that AIX has anything special here
- If the process is marked <exiting> in ps, this is because the flag SEXIT is set, thus the process is blocked somewhere in the kexitx() syscall, waiting for something.
- In order to know what it is waiting for, the best would be to have a look with kdb.
- either it is waiting for an asynchronous I/O to end, or for a thread to end if the process is multi-threaded
- Using the proctree command for analyzing the issue is not a good idea, since the process will block in kexitx() if there is an operation on /proc being done
- If the process is marked <defunct>, that means the parent has not yet called waitpid() to collect the child's status. Maybe the parent is blocked in non-interruptible code where the signal handler cannot be called.
- In short, that may be due to many causes... Using kdb is the best way.
- Instead of proctree (which makes use of /proc), use: "ps -faT <pid>".
I'll try to reproduce here.
Regards
Tony
On 01/02/2017 at 21:26, Konstantin Knizhnik wrote:
On 02/01/2017 08:30 PM, REIX, Tony wrote:
....
About the zombie issue, I've discussed with my colleagues. Looks like the process stays a zombie until the parent checks its status. However, though I looked into that several times, I do not remember the details well. And that should not be specific to AIX. I'll discuss with another colleague tomorrow, who should understand this better than me.
1. The process is not in a zombie state (according to ps). It is in the <exiting> state... It is something AIX specific; I have not seen processes in this state on Linux.
2. I have implemented a simple test - a forkbomb. It creates 1000 children and then waits for them. It is about ten times slower than on Intel/Linux, but still much faster than 100 seconds. So there is some difference between a postgres backend and a dummy process doing nothing - just terminating immediately after return from fork().
....
Regards,
Tony
On 01/02/2017 at 16:59, Konstantin Knizhnik wrote:
Hi Tony,
On 01.02.2017 18:42, REIX, Tony wrote:
Hi Konstantin
XLC.
I'm on AIX 7.1 for now.
I'm using this version of XLC v13:
# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003
With this version, I have (at least, since I tested with "check" and not "check-world" at that time) 2 failing tests: create_aggregate, aggregates.
With the following XLC v12 version, I have NO test failure:
# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01.0000.0016
So maybe you are not using XLC v13.1.3.3 but another sub-version. Or are you using more options for configure?
Configure.
What options do you give to configure?
export CC="/opt/IBM/xlc/13.1.3/bin/xlc"
export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 "
export LDFLAGS="-Wl,-bbigtoc,-b64"
export AR="/usr/bin/ar -X64"
export LD="/usr/bin/ld -b64 "
export NM="/usr/bin/nm -X64"
./configure --prefix="/opt/postgresql/xlc-debug/9.6"
Hard load & 64 cores? OK. That clearly explains why I do not see this issue.
pgbench? I wanted to run it. However, I'm still looking for where to get it, plus a guide for using it for testing.
pgbench is part of the Postgres distribution (src/bin/pgbench).
I would add such tests when building my PostgreSQL RPMs on AIX. So any help is welcome!
Performance.
- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any idea how I should proceed? Any PostgreSQL performance benchmark that I could find and use? pgbench?
pgbench is the most widely used tool for simulating an OLTP workload. Certainly it is quite primitive and its results are rather artificial. TPC-C seems to be a better choice.
But the best option is to implement your own benchmark simulating the actual workload of your real application.
- I'm interested in any information for improving the performance & quality of my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) has sold IBM Power machines under the Escala brand for ages (25 years this year).)
How to help?
How could I help improve the quality and performance of PostgreSQL on AIX?
We still have one open issue on AIX: see https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg303094.html
It will be great if you can somehow help to fix this problem.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi Konstantin
I have an issue with pgbench. Any idea ?
# mkdir /tmp/PGS
# chown pgstbf.staff /tmp/PGS
# su pgstbf
$ /opt/freeware/bin/initdb -D /tmp/PGS
The files belonging to this database system will be owned by user "pgstbf".
This user must also own the server process.
The database cluster will be initialized with locale "C".
The default database encoding has accordingly been set to "SQL_ASCII".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /tmp/PGS ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
$ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile start
server starting
$ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile status
pg_ctl: server is running (PID: 11599920)
/opt/freeware/bin/postgres_64 "-D" "/tmp/PGS"
$ /usr/bin/createdb pgstbf
$
$ pgbench -i -s 1000
creating tables...
100000 of 100000000 tuples (0%) done (elapsed 0.29 s, remaining 288.09 s)
...
100000000 of 100000000 tuples (100%) done (elapsed 42.60 s, remaining 0.00 s)
ERROR: could not extend file "base/16384/24614": wrote only 7680 of 8192 bytes at block 131071
HINT: Check free disk space.
CONTEXT: COPY pgbench_accounts, line 7995584
PQendcopy failed
After cleaning all of /tmp/PGS and symlinking it to /home, where I have 6GB free, I retried and got nearly the same:
100000000 of 100000000 tuples (100%) done (elapsed 204.65 s, remaining 0.00 s)
ERROR: could not extend file "base/16384/16397.6": No space left on device
HINT: Check free disk space.
CONTEXT: COPY pgbench_accounts, line 51235802
PQendcopy failed
Do I need more than 6GB ???
Thanks
Tony
$ df -k .
Filesystem 1024-blocks Free %Used Iused %Iused Mounted on
/dev/hd1 45088768 6719484 86% 946016 39% /home
bash-4.3$ pwd
/tmp/PGS
bash-4.3$ ll /tmp/PGS
lrwxrwxrwx 1 root system 10 Feb 2 08:43 /tmp/PGS -> /home/PGS/
$ df -k
Filesystem 1024-blocks Free %Used Iused %Iused Mounted on
/dev/hd4 524288 277284 48% 10733 14% /
/dev/hd2 6684672 148896 98% 49303 48% /usr
/dev/hd9var 2097152 314696 85% 24934 18% /var
/dev/hd3 3145728 2527532 20% 418 1% /tmp
/dev/hd1 45088768 6719484 86% 946016 39% /home
/dev/hd11admin 131072 130692 1% 7 1% /admin
/proc - - - - - /proc
/dev/hd10opt 65273856 829500 99% 938339 41% /opt
/dev/livedump 262144 261776 1% 4 1% /var/adm/ras/livedump
/aha - - - 18 1% /aha
$ cat logfile
LOG: database system was shut down at 2017-02-02 09:08:31 CST
LOG: MultiXact member wraparound protections are now enabled
LOG: autovacuum launcher started
LOG: database system is ready to accept connections
ERROR: could not extend file "base/16384/16397.6": No space left on device
HINT: Check free disk space.
CONTEXT: COPY pgbench_accounts, line 51235802
STATEMENT: copy pgbench_accounts from stdin
$ ulimit -a
core file size (blocks, -c) 1048575
data seg size (kbytes, -d) 131072
file size (blocks, -f) unlimited
max memory size (kbytes, -m) 32768
open files (-n) 2000
pipe size (512 bytes, -p) 64
stack size (kbytes, -s) 32768
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
bash-4.3$ ll /tmp/PGS
lrwxrwxrwx 1 root system 10 Feb 2 08:43 /tmp/PGS -> /home/PGS/
bash-4.3$ ls -l
total 120
-rw------- 1 pgstbf staff 4 Feb 2 09:08 PG_VERSION
drwx------ 6 pgstbf staff 256 Feb 2 09:09 base
drwx------ 2 pgstbf staff 4096 Feb 2 09:09 global
-rw------- 1 pgstbf staff 410 Feb 2 09:13 logfile
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_clog
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_commit_ts
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_dynshmem
-rw------- 1 pgstbf staff 4462 Feb 2 09:08 pg_hba.conf
-rw------- 1 pgstbf staff 1636 Feb 2 09:08 pg_ident.conf
drwx------ 4 pgstbf staff 256 Feb 2 09:08 pg_logical
drwx------ 4 pgstbf staff 256 Feb 2 09:08 pg_multixact
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_notify
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_replslot
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_serial
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_snapshots
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_stat
drwx------ 2 pgstbf staff 256 Feb 2 09:17 pg_stat_tmp
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_subtrans
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_tblspc
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_twophase
drwx------ 3 pgstbf staff 256 Feb 2 09:08 pg_xlog
-rw------- 1 pgstbf staff 88 Feb 2 09:08 postgresql.auto.conf
-rw------- 1 pgstbf staff 22236 Feb 2 09:08 postgresql.conf
-rw------- 1 pgstbf staff 46 Feb 2 09:08 postmaster.opts
-rw------- 1 pgstbf staff 69 Feb 2 09:08 postmaster.pid
bash-4.3$ ls -l base
total 112
drwx------ 2 pgstbf staff 16384 Feb 2 09:08 1
drwx------ 2 pgstbf staff 12288 Feb 2 09:08 12407
drwx------ 2 pgstbf staff 12288 Feb 2 09:09 12408
drwx------ 2 pgstbf staff 16384 Feb 2 09:14 16384
bash-4.3$ ls -l base/16384/
total 15200
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 112
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 113
-rw------- 1 pgstbf staff 57344 Feb 2 09:09 12243
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 12243_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12243_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 12245
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12247
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12248
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 12248_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12248_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 12250
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12252
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12253
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 12253_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12253_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 12255
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12257
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12258
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 12258_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12258_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 12260
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12262
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12263
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 12263_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12263_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 12265
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12267
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12268
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 12268_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12268_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 12270
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12272
-rw------- 1 pgstbf staff 0 Feb 2 09:09 12273
-rw------- 1 pgstbf staff 0 Feb 2 09:09 12275
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 12277
-rw------- 1 pgstbf staff 73728 Feb 2 09:14 1247
-rw------- 1 pgstbf staff 24576 Feb 2 09:14 1247_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:14 1247_vm
-rw------- 1 pgstbf staff 368640 Feb 2 09:14 1249
-rw------- 1 pgstbf staff 24576 Feb 2 09:14 1249_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:14 1249_vm
-rw------- 1 pgstbf staff 589824 Feb 2 09:09 1255
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 1255_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 1255_vm
-rw------- 1 pgstbf staff 90112 Feb 2 09:14 1259
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 1259_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:14 1259_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 1417
-rw------- 1 pgstbf staff 0 Feb 2 09:09 1417_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 1418
-rw------- 1 pgstbf staff 0 Feb 2 09:09 1418_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 16385
-rw------- 1 pgstbf staff 450560 Feb 2 09:14 16388
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 16388_fsm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 16391
-rw------- 1 pgstbf staff 40960 Feb 2 09:14 16394
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 16394_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 174
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 175
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2187
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2328
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2328_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2336
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2336_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2337
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2600
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2600_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2600_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2601
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2601_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2601_vm
-rw------- 1 pgstbf staff 49152 Feb 2 09:09 2602
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2602_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2602_vm
-rw------- 1 pgstbf staff 40960 Feb 2 09:09 2603
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2603_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2603_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2604
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2604_vm
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2605
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2605_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2605_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2606
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2606_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2606_vm
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2607
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2607_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2607_vm
-rw------- 1 pgstbf staff 450560 Feb 2 09:14 2608
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2608_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:14 2608_vm
-rw------- 1 pgstbf staff 278528 Feb 2 09:09 2609
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2609_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2609_vm
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2610
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2610_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2610_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2611
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2611_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2612
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2612_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2612_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2613
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2613_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2615
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2615_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2615_vm
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2616
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2616_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2616_vm
-rw------- 1 pgstbf staff 122880 Feb 2 09:09 2617
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2617_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2617_vm
-rw------- 1 pgstbf staff 98304 Feb 2 09:09 2618
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2618_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2618_vm
-rw------- 1 pgstbf staff 122880 Feb 2 09:14 2619
-rw------- 1 pgstbf staff 24576 Feb 2 09:14 2619_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:14 2619_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2620
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2620_vm
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2650
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2651
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2652
-rw------- 1 pgstbf staff 40960 Feb 2 09:09 2653
-rw------- 1 pgstbf staff 40960 Feb 2 09:09 2654
-rw------- 1 pgstbf staff 40960 Feb 2 09:09 2655
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2656
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2657
-rw------- 1 pgstbf staff 106496 Feb 2 09:14 2658
-rw------- 1 pgstbf staff 73728 Feb 2 09:14 2659
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2660
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2661
-rw------- 1 pgstbf staff 32768 Feb 2 09:14 2662
-rw------- 1 pgstbf staff 40960 Feb 2 09:14 2663
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2664
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2665
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2666
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2667
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2668
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2669
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2670
-rw------- 1 pgstbf staff 319488 Feb 2 09:14 2673
-rw------- 1 pgstbf staff 352256 Feb 2 09:14 2674
-rw------- 1 pgstbf staff 172032 Feb 2 09:09 2675
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2678
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2679
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2680
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2681
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2682
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2683
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2684
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2685
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2686
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2687
-rw------- 1 pgstbf staff 40960 Feb 2 09:09 2688
-rw------- 1 pgstbf staff 40960 Feb 2 09:09 2689
-rw------- 1 pgstbf staff 81920 Feb 2 09:09 2690
-rw------- 1 pgstbf staff 253952 Feb 2 09:09 2691
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2692
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2693
-rw------- 1 pgstbf staff 16384 Feb 2 09:14 2696
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2699
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2701
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2702
-rw------- 1 pgstbf staff 16384 Feb 2 09:14 2703
-rw------- 1 pgstbf staff 40960 Feb 2 09:14 2704
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2753
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2753_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2753_vm
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2754
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2755
-rw------- 1 pgstbf staff 32768 Feb 2 09:09 2756
-rw------- 1 pgstbf staff 32768 Feb 2 09:09 2757
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2830
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2830_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2831
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2832
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2832_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2833
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2834
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2834_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2835
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2836
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2836_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2837
-rw------- 1 pgstbf staff 385024 Feb 2 09:09 2838
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2838_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2838_vm
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2839
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2840
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 2840_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2840_vm
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 2841
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2995
-rw------- 1 pgstbf staff 0 Feb 2 09:09 2995_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 2996
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3079
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 3079_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3079_vm
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3080
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3081
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3085
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3118
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3118_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3119
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3164
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3256
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3256_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3257
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3258
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3394
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 3394_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3394_vm
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3395
-rw------- 1 pgstbf staff 32768 Feb 2 09:14 3455
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3456
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 3456_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3456_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3466
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3466_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3467
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3468
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3501
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3501_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3502
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3503
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3534
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3541
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 3541_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3541_vm
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3542
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3574
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3575
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3576
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3576_vm
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3596
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3596_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3597
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3598
-rw------- 1 pgstbf staff 0 Feb 2 09:09 3598_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3599
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3600
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 3600_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3600_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3601
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 3601_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3601_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3602
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 3602_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3602_vm
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3603
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 3603_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3603_vm
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3604
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3605
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3606
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3607
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3608
-rw------- 1 pgstbf staff 32768 Feb 2 09:09 3609
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3712
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3764
-rw------- 1 pgstbf staff 24576 Feb 2 09:09 3764_fsm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 3764_vm
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3766
-rw------- 1 pgstbf staff 16384 Feb 2 09:09 3767
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 548
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 549
-rw------- 1 pgstbf staff 0 Feb 2 09:09 826
-rw------- 1 pgstbf staff 0 Feb 2 09:09 826_vm
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 827
-rw------- 1 pgstbf staff 8192 Feb 2 09:09 828
-rw------- 1 pgstbf staff 4 Feb 2 09:09 PG_VERSION
-rw------- 1 pgstbf staff 512 Feb 2 09:09 pg_filenode.map
-rw------- 1 pgstbf staff 112660 Feb 2 09:09 pg_internal.init
Le 01/02/2017 à 21:26, Konstantin Knizhnik a écrit :
On 02/01/2017 08:30 PM, REIX, Tony wrote:
Hi Konstantin,
....
If you could share scripts or instructions about the tests you are doing with pgbench, I would reproduce them here.
You do not need any script.
Just two simple commands.
One to initialize database:
pgbench -i -s 1000
And another to run benchmark itself:
pgbench -c 100 -j 20 -P 1 -T 1000000000
...
Regards,
Tony
Le 01/02/2017 à 16:59, Konstantin Knizhnik a écrit :
Hi Tony,
On 01.02.2017 18:42, REIX, Tony wrote:
Hi Konstantin
XLC.
I'm on AIX 7.1 for now.
I'm using this version of XLC v13:
# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003
With this version, I have (at least, since I tested with "check" and not "check-world" at that time) 2 failing tests: create_aggregate, aggregates.
With the following XLC v12 version, I have NO test failure:
# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01.0000.0016
So maybe you are not using XLC v13.1.3.3, but rather another sub-version. Unless you are using more options for the configure ?
Configure.
What are the options that you give to the configure ?
export CC="/opt/IBM/xlc/13.1.3/bin/xlc"
export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 "
export LDFLAGS="-Wl,-bbigtoc,-b64"
export AR="/usr/bin/ar -X64"
export LD="/usr/bin/ld -b64 "
export NM="/usr/bin/nm -X64"
./configure --prefix="/opt/postgresql/xlc-debug/9.6"
Hard load & 64 cores ? OK. That clearly explains why I do not see this issue.
pgbench ? I wanted to run it. However, I'm still looking where to get it plus a guide for using it for testing.
pgbench is part of the Postgres distribution (src/bin/pgbench)
I would add such tests when building my PostgreSQL RPMs on AIX. So any help is welcome !
Performance.
- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any idea how I should proceed ? Any PostgreSQL performance benchmark that I could find and use ? pgbench ?
pgbench is the most widely used tool for simulating an OLTP workload. Admittedly it is quite primitive and its results are rather artificial. TPC-C seems to be a better choice.
But the best option is to implement your own benchmark simulating the actual workload of your real application.
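The custom-benchmark suggestion can be tried with pgbench itself via its script mode. A minimal sketch follows; the filename and the statement mix are illustrative (not from this thread), and the `\set ... random(...)` expression syntax assumes pgbench 9.6:

```sql
-- custom.sql: a hypothetical minimal workload, run e.g. as:
--   pgbench -f custom.sql -c 100 -j 20 -T 60
-- :scale is filled in by pgbench from the pgbench_branches row count
\set aid random(1, 100000 * :scale)
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
END;
```

For a workload that truly matches a real application, the statements above would be replaced with that application's actual queries.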
- I'm interested in any information for improving the performance & quality of my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are free and can be used by anyone, like the Perzl RPMs are. My company (ATOS/Bull) has sold IBM Power machines under the Escala brand for 25 years this year.)
How to help ?
How could I help for improving the quality and performance of PostgreSQL on AIX ?
We still have one open issue on AIX: see https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg303094.html
It will be great if you can somehow help to fix this problem.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 02.02.2017 18:20, REIX, Tony wrote:
Hi Konstantin
I have an issue with pgbench. Any idea ?
The pgbench -s option specifies the scale factor.
Scale 1000 corresponds to 100 million rows and requires about 16 GB of disk space.
# mkdir /tmp/PGS
# chown pgstbf.staff /tmp/PGS
# su pgstbf
$ /opt/freeware/bin/initdb -D /tmp/PGS
The files belonging to this database system will be owned by user
"pgstbf".
This user must also own the server process.
The database cluster will be initialized with locale "C".
The default database encoding has accordingly been set to "SQL_ASCII".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /tmp/PGS ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
$ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile start
server starting
$ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile status
pg_ctl: server is running (PID: 11599920)
/opt/freeware/bin/postgres_64 "-D" "/tmp/PGS"
$ /usr/bin/createdb pgstbf
$ pgbench -i -s 1000
creating tables...
100000 of 100000000 tuples (0%) done (elapsed 0.29 s, remaining 288.09 s)
...
100000000 of 100000000 tuples (100%) done (elapsed 42.60 s, remaining
0.00 s)
ERROR: could not extend file "base/16384/24614": wrote only 7680 of
8192 bytes at block 131071
HINT: Check free disk space.
CONTEXT: COPY pgbench_accounts, line 7995584
PQendcopy failed
After cleaning all /tmp/PGS and symlinking it to /home, where I have
6GB free, I've retried and I got nearly the same:
100000000 of 100000000 tuples (100%) done (elapsed 204.65 s,
remaining 0.00 s)
ERROR: could not extend file "base/16384/16397.6": No space left on
device
HINT: Check free disk space.
CONTEXT: COPY pgbench_accounts, line 51235802
PQendcopy failed
Do I need more than 6GB ???
Thanks
Tony
$ df -k .
Filesystem 1024-blocks Free %Used Iused %Iused Mounted on
/dev/hd1 45088768 6719484 86% 946016 39% /home
bash-4.3$ pwd
/tmp/PGS
bash-4.3$ ll /tmp/PGS
lrwxrwxrwx 1 root system 10 Feb 2 08:43 /tmp/PGS -> /home/PGS/
$ df -k
Filesystem 1024-blocks Free %Used Iused %Iused Mounted on
/dev/hd4 524288 277284 48% 10733 14% /
/dev/hd2 6684672 148896 98% 49303 48% /usr
/dev/hd9var 2097152 314696 85% 24934 18% /var
/dev/hd3 3145728 2527532 20% 418 1% /tmp
/dev/hd1 45088768 6719484 86% 946016 39% /home
/dev/hd11admin 131072 130692 1% 7 1% /admin
/proc - - - - - /proc
/dev/hd10opt 65273856 829500 99% 938339 41% /opt
/dev/livedump 262144 261776 1% 4 1%
/var/adm/ras/livedump
/aha - - - 18 1% /aha
$ cat logfile
LOG: database system was shut down at 2017-02-02 09:08:31 CST
LOG: MultiXact member wraparound protections are now enabled
LOG: autovacuum launcher started
LOG: database system is ready to accept connections
ERROR: could not extend file "base/16384/16397.6": No space left on
device
HINT: Check free disk space.
CONTEXT: COPY pgbench_accounts, line 51235802
STATEMENT: copy pgbench_accounts from stdin
$ ulimit -a
core file size (blocks, -c) 1048575
data seg size (kbytes, -d) 131072
file size (blocks, -f) unlimited
max memory size (kbytes, -m) 32768
open files (-n) 2000
pipe size (512 bytes, -p) 64
stack size (kbytes, -s) 32768
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
bash-4.3$ ll /tmp/PGS
lrwxrwxrwx 1 root system 10 Feb 2 08:43 /tmp/PGS -> /home/PGS/
bash-4.3$ ls -l
total 120
-rw------- 1 pgstbf staff 4 Feb 2 09:08 PG_VERSION
drwx------ 6 pgstbf staff 256 Feb 2 09:09 base
drwx------ 2 pgstbf staff 4096 Feb 2 09:09 global
-rw------- 1 pgstbf staff 410 Feb 2 09:13 logfile
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_clog
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_commit_ts
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_dynshmem
-rw------- 1 pgstbf staff 4462 Feb 2 09:08 pg_hba.conf
-rw------- 1 pgstbf staff 1636 Feb 2 09:08 pg_ident.conf
drwx------ 4 pgstbf staff 256 Feb 2 09:08 pg_logical
drwx------ 4 pgstbf staff 256 Feb 2 09:08 pg_multixact
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_notify
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_replslot
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_serial
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_snapshots
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_stat
drwx------ 2 pgstbf staff 256 Feb 2 09:17 pg_stat_tmp
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_subtrans
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_tblspc
drwx------ 2 pgstbf staff 256 Feb 2 09:08 pg_twophase
drwx------ 3 pgstbf staff 256 Feb 2 09:08 pg_xlog
-rw------- 1 pgstbf staff 88 Feb 2 09:08
postgresql.auto.conf
-rw------- 1 pgstbf staff 22236 Feb 2 09:08 postgresql.conf
-rw------- 1 pgstbf staff 46 Feb 2 09:08 postmaster.opts
-rw------- 1 pgstbf staff 69 Feb 2 09:08 postmaster.pid
bash-4.3$ ls -l base
total 112
drwx------ 2 pgstbf staff 16384 Feb 2 09:08 1
drwx------ 2 pgstbf staff 12288 Feb 2 09:08 12407
drwx------ 2 pgstbf staff 12288 Feb 2 09:09 12408
drwx------ 2 pgstbf staff 16384 Feb 2 09:14 16384
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi Tony,
On 02.02.2017 17:10, REIX, Tony wrote:
Hi Konstantin
I've discussed the "zombie/exit" issue with our expert here.
- He does not think that AIX has anything special here
- If the process is marked <exiting> in ps, this is because the flag
SEXIT is set, thus the process is blocked somewhere in the kexitx()
syscall, waiting for something.
- In order to know what it is waiting for, the best would be to have a
look with kdb.
kdb shows the following stack:
pvthread+073000 STACK:
[005E1958]slock+000578 (00000000005E1958, 8000000000001032 [??])
[00009558]: .simple_lock+000058 ()
[00651DBC]vm_relalias+00019C (??, ??, ??, ??, ??)
[006544AC]vm_map_entry_delete+00074C (??, ??, ??)
[00659C30]vm_map_delete+000150 (??, ??, ??, ??)
[00659D88]vm_map_deallocate+000048 (??, ??)
[0011C588]kexitx+001408 (??)
[000BB08C]kexit+00008C ()
___ Recovery (FFFFFFFFFFF9290) ___
WARNING: Eyecatcher/version mismatch in RWA
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote:
On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:
 * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
 * providing sequential consistency. This is undocumented.
But it is not true any more (I checked the generated assembler code in the
debugger).
This is why I have added __sync() to this function. Now pgbench is working
normally.
Konstantin, does "make -C src/bin/pgbench check" fail >10% of the time in the
bad build?
Seems like it was not so much undocumented, but an implementation detail
that was not guaranteed after all..
Seems so.
There was a long thread on these things the last time this was changed: /messages/by-id/20160425185204.jrvlghn3jxulsb7i@alap3.anarazel.de.
I couldn't find an explanation there of why we thought that fetch_and_add
implicitly performs sync and isync.
It was in the generated code, for AIX xlc 12.1.0.0.
Also there is a mysterious disappearance of the assembler section
with the sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using the __sync() built-in function instead.
__sync() seems more appropriate there, anyway. We're using intrinsics for
all the other things in generic-xlc.h. But it sure is scary that the "asm"
sections just disappeared.
That is a problem, but it's a stretch to conclude that asm sections are
generally prone to removal, while intrinsics are generally durable.
@@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
 static inline uint32
 pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
 {
+	uint32 ret;
+
 	/*
-	 * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
-	 * providing sequential consistency. This is undocumented.
+	 * Use __sync() before and __isync() after, like in compare-exchange
+	 * above.
 	 */
-	return __fetch_and_add((volatile int *)&ptr->value, add_);
+	__sync();
+
+	ret = __fetch_and_add((volatile int *)&ptr->value, add_);
+
+	__isync();
+
+	return ret;
 }
Since this emits double syncs with older xlc, I recommend instead replacing
the whole thing with inline asm. As I opined in the last message of the
thread you linked above, the intrinsics provide little value as abstractions
if one checks the generated code to deduce how to use them. Now that the
generated code is xlc-version-dependent, the port is better off with
compiler-independent asm like we have for ppc in s_lock.h.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Feb 03, 2017 at 12:26:50AM +0000, Noah Misch wrote:
On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote:
@@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
 static inline uint32
 pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
 {
+	uint32 ret;
+
 	/*
-	 * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
-	 * providing sequential consistency. This is undocumented.
+	 * Use __sync() before and __isync() after, like in compare-exchange
+	 * above.
 	 */
-	return __fetch_and_add((volatile int *)&ptr->value, add_);
+	__sync();
+
+	ret = __fetch_and_add((volatile int *)&ptr->value, add_);
+
+	__isync();
+
+	return ret;
 }

Since this emits double syncs with older xlc, I recommend instead replacing
the whole thing with inline asm. As I opined in the last message of the
thread you linked above, the intrinsics provide little value as abstractions
if one checks the generated code to deduce how to use them. Now that the
generated code is xlc-version-dependent, the port is better off with
compiler-independent asm like we have for ppc in s_lock.h.
Could it be cleaner to just use __xlc_ver__ to avoid double syncs on
past versions? I think that it would make the code more understandable
than just listing directly the instructions. As there have been other
bug reports from Tony Reix who has been working on AIX with XLC 13.1 and
that this thread got lost in the wild, I have added an entry in the next
CF:
https://commitfest.postgresql.org/17/1484/
As Heikki is not around these days, Noah, could you provide a new
version of the patch? This bug has been around for some time now, it
would be nice to move on. I think I could have written patches myself,
but I don't have an AIX machine at hand, and certainly not one with XLC 13.1.
--
Michael
Hi Michael,
My team and my company (ATOS/Bull) are involved in improving the quality of PostgreSQL on AIX.
We have AIX 6.1, 7.1, and 7.2 Power8 systems, with several logical/physical processors.
And I plan to have a more powerful (more processors) machine for running PostgreSQL stress tests.
A DB-expert colleague has started to write some new not-too-complex stress tests that we'd like to submit to PostgreSQL project later.
For now, using the latest versions of XLC 12 (12.1.0.19) and 13 (13.1.3.4 with a patch), we have only one remaining random failure (in the src/bin/pgbench/t/001_pgbench.pl test, on AIX 6.1 and 7.2) for PostgreSQL 9.6.6 and 10.1. And, on AIX 7.1, we have one more remaining failure that may be due to some other dependent software. Investigating.
XLC 13.1.3.4 shows an issue with -O2, and I have a work-around that fixes it in ./src/backend/parser/gram.c. We have opened a PMR (defect) against XLC.
Note that our tests are now executed without the PG_FORCE_DISABLE_INLINE "inline" trick in src/include/port/aix.h that suppresses the inlining of routines on AIX. I think that older versions of XLC have shown issues that have now disappeared (or, at least, many of them).
I've been able to compare PostgreSQL compiled with XLC vs GCC 7.1 and, using the timing output provided by the PostgreSQL tests, XLC seems to provide at least 8% more speed. We also plan to run professional performance tests in order to compare PostgreSQL 10.1 on AIX vs Linux/Power. I saw some 2017 performance slides, made with older versions of PostgreSQL and XLC, that show bad PostgreSQL performance on AIX vs Linux/Power, and I cannot believe it. We plan to investigate this.
Though I have very little PostgreSQL expertise (I'm now also porting GCC Go to AIX), we can help, at least by compiling/testing/investigating/stressing in a different AIX environment (32/64-bit, XLC/GCC) than the ones you have in your BuildFarm.
Let me know how we can help.
Regards,
Tony Reix
ATOS / Bull SAS
ATOS Expert
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net
________________________________________
From: Michael Paquier [michael.paquier@gmail.com]
Sent: Tuesday, January 16, 2018 08:12
To: Noah Misch
Cc: Heikki Linnakangas; Konstantin Knizhnik; PostgreSQL Hackers; Bernd Helmle
Subject: Re: [HACKERS] Deadlock in XLogInsert at AIX
On Fri, Feb 03, 2017 at 12:26:50AM +0000, Noah Misch wrote:
On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote:
@@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
 static inline uint32
 pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
 {
+	uint32 ret;
+
 	/*
-	 * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
-	 * providing sequential consistency. This is undocumented.
+	 * Use __sync() before and __isync() after, like in compare-exchange
+	 * above.
 	 */
-	return __fetch_and_add((volatile int *)&ptr->value, add_);
+	__sync();
+
+	ret = __fetch_and_add((volatile int *)&ptr->value, add_);
+
+	__isync();
+
+	return ret;
 }

Since this emits double syncs with older xlc, I recommend instead replacing
the whole thing with inline asm. As I opined in the last message of the
thread you linked above, the intrinsics provide little value as abstractions
if one checks the generated code to deduce how to use them. Now that the
generated code is xlc-version-dependent, the port is better off with
compiler-independent asm like we have for ppc in s_lock.h.
Could it be cleaner to just use __xlc_ver__ to avoid double syncs on
past versions? I think that it would make the code more understandable
than just listing directly the instructions. As there have been other
bug reports from Tony Reix who has been working on AIX with XLC 13.1 and
that this thread got lost in the wild, I have added an entry in the next
CF:
https://commitfest.postgresql.org/17/1484/
As Heikki is not around these days, Noah, could you provide a new
version of the patch? This bug has been around for some time now, it
would be nice to move on. I think I could have written patches myself,
but I don't have an AIX machine at hand, and certainly not one with XLC 13.1.
--
Michael
On Tue, Jan 16, 2018 at 08:25:51AM +0000, REIX, Tony wrote:
My team and my company (ATOS/Bull) are involved in improving the
quality of PostgreSQL on AIX.
Cool to hear that!
We have AIX 6.1, 7.1, and 7.2 Power8 systems, with several
logical/physical processors. And I plan to have a more powerful (more
processors) machine for running PostgreSQL stress tests.
A DB-expert colleague has started to write some new not-too-complex
stress tests that we'd like to submit to PostgreSQL project later.
For now, using latest versions of XLC 12 (12.1.0.19) and 13 (13.1.3.4
with a patch), we have only (on AIX 6.1 and 7.2) one remaining random
failure (dealing with src/bin/pgbench/t/001_pgbench.pl test), for
PostgreSQL 9.6.6 and 10.1. And, on AIX 7.1, we have one more
remaining failure that may be due to some other dependent
software. Investigating.
XLC 13.1.3.4 shows an issue with -O2 and I have a work-around that
fixes it in ./src/backend/parser/gram.c. We have opened a PMR
(defect) against XLC. Note that our tests are now executed without the
PG_FORCE_DISABLE_INLINE "inline" trick in src/include/port/aix.h that
suppresses the inlining of routines on AIX. I think that older
versions of XLC have shown issues that have now disappeared (or, at
least, many of them).
I've been able to compare PostgreSQL compiled with XLC vs GCC 7.1 and,
using times outputs provided by PostgreSQL tests, XLC seems to provide
at least 8% more speed. We also plan to run professional performance
tests in order to compare PostgreSQL 10.1 on AIX vs Linux/Power. I saw
some 2017 performance slides, made with older versions of PostgreSQL
and XLC, that show bad PostgreSQL performance on AIX vs Linux/Power,
and I cannot believe it. We plan to investigate this.
That's interesting investigation. The community is always interested in
such stories. You could have material for a conference talk.
Though I have very little PostgreSQL expertise (I'm now also porting
GCC Go to AIX), we can help, at least by
compiling/testing/investigating/stressing in a different AIX
environment than the AIX ones (32/64bit, XLC/GCC) you have in your
BuildFarm.
Setting up a buildfarm member with the combination of compiler and
environment where you are seeing the failures would be the best answer
in my opinion:
https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto
This does not require special knowledge of PostgreSQL internals, and the
in-core testing framework has improved the last couple of years to allow
for more advanced tests. I do use it as well for some tests on my own
modules (company stuff). The buildfarm code has also followed the pace,
which really helps a lot, thanks to Andrew Dunstan.
Developers and committers are more pro-active if they can see automated
tests failing in the central community place. And buildfarm animals
usually don't stay red for more than a couple of days.
--
Michael
Hi Michael
You said:
Setting up a buildfarm member with the combination of compiler and
environment where you are seeing the failures would be the best answer
in my opinion:
https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto

This does not require special knowledge of PostgreSQL internals, and the
in-core testing framework has improved the last couple of years to allow
for more advanced tests. I do use it as well for some tests on my own
modules (company stuff). The buildfarm code has also followed the pace,
which really helps a lot, thanks to Andrew Dunstan.

Developers and committers are more pro-active if they can see automated
tests failing in the central community place. And buildfarm animals
usually don't stay red for more than a couple of days.
Hummmm I quickly read this HowTo and I did not find any explanation about the "protocol"
used for exchanging data between my VM and the PostgreSQL BuildFarm.
My machine is behind firewalls and has restricted access to the outside.
Either I'll find out when something does not work, or I can ask for information about which port
(or anything else) needs to be opened, if needed.
Anyway, I'll read it in depth now and I'll try to implement it.
About the random error we see: I guess I may see it, while the PostgreSQL BuildFarm AIX VMs do not,
because I'm using a not-too-small VM with variable Physical Processing units (CPUs), since my VM is uncapped
(it may use all physical CPUs if available): up to 4 physical processors and up to 8 virtual processors.
And on the BuildFarm I do not see any details about the logical/physical configuration of the AIX VMs, like hornet.
Being able to run truly concurrent parallel stress programs, which requires a multi-physical-CPU VM, would help.
Regards,
Tony
On 01/16/2018 08:50 AM, REIX, Tony wrote:
Hi Michael
You said:
Setting up a buildfarm member with the combination of compiler and
environment where you are seeing the failures would be the best answer
in my opinion:
https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto

This does not require special knowledge of PostgreSQL internals, and the
in-core testing framework has improved the last couple of years to allow
for more advanced tests. I do use it as well for some tests on my own
modules (company stuff). The buildfarm code has also followed the pace,
which really helps a lot, thanks to Andrew Dunstan.

Developers and committers are more pro-active if they can see automated
tests failing in the central community place. And buildfarm animals
usually don't stay red for more than a couple of days.

Hummmm I quickly read this HowTo and I did not find any explanation about the "protocol"
used for exchanging data between my VM and the PostgreSQL BuildFarm.
My machine is behind firewalls and has restricted access to the outside.
Either I'll find out when something does not work, or I can ask for information about which port
(or anything else) needs to be opened, if needed.
Anyway, I'll read it in depth now and I'll try to implement it.
Communication is only done via outbound port 443 (https). There are no
passwords required and no inbound connections, ever. Uploads are signed
using a shared secret. Communication can be via a proxy. If you need
the client to use a proxy with git, that's a bit more complex, but possible.
Ping me if you need help setting this up.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jan 16, 2018 at 01:50:29PM +0000, REIX, Tony wrote:
And, on BuildFarm, I do not see any details about the logical/physical configuration of the AIX VMs, like hornet.
Being able to run real concurrent parallel stress programs, thus required multi-physical-CPU VM, would help.
It has 48 virtual CPUs. Here's the prtconf output:
System Model: IBM,8231-E2C
Machine Serial Number: 104C0CT
Processor Type: PowerPC_POWER7
Processor Implementation Mode: POWER 7
Processor Version: PV_7_Compat
Number Of Processors: 12
Processor Clock Speed: 3720 MHz
CPU Type: 64-bit
Kernel Type: 64-bit
LPAR Info: 1 10-4C0CT
Memory Size: 127488 MB
Good Memory Size: 127488 MB
Platform Firmware level: AL740_100
Firmware Version: IBM,AL740_100
Console Login: enable
Auto Restart: true
Full Core: false
NX Crypto Acceleration: Not Capable
Network Information
Host Name: power-aix
IP Address: 140.211.15.154
Sub Netmask: 255.255.255.0
Gateway: 140.211.15.1
Name Server: 140.211.166.130
Domain Name: osuosl.org
Paging Space Information
Total Paging Space: 12288MB
Percent Used: 1%
Volume Groups Information
==============================================================================
Active VGs
==============================================================================
homevg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk1 active 558 183 00..00..00..71..112
hdisk2 active 558 0 00..00..00..00..00
hdisk3 active 558 0 00..00..00..00..00
hdisk4 active 558 0 00..00..00..00..00
==============================================================================
rootvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk0 active 558 510 111..93..83..111..112
hdisk5 active 558 514 111..97..83..111..112
==============================================================================
INSTALLED RESOURCE LIST
The following resources are installed on the machine.
+/- = Added or deleted from Resource List.
* = Diagnostic support not available.
Model Architecture: chrp
Model Implementation: Multiple Processor, PCI bus
+ sys0 System Object
+ sysplanar0 System Planar
* vio0 Virtual I/O Bus
* vsa2 U78AB.001.WZSHZY0-P1-T2 LPAR Virtual Serial Adapter
* vty2 U78AB.001.WZSHZY0-P1-T2-L0 Asynchronous Terminal
* vsa1 U78AB.001.WZSHZY0-P1-T1 LPAR Virtual Serial Adapter
* vty1 U78AB.001.WZSHZY0-P1-T1-L0 Asynchronous Terminal
* vsa0 U8231.E2C.104C0CT-V1-C0 LPAR Virtual Serial Adapter
* vty0 U8231.E2C.104C0CT-V1-C0-L0 Asynchronous Terminal
* pci8 U78AB.001.WZSHZY0-P1 PCI Express Bus
* pci7 U78AB.001.WZSHZY0-P1 PCI Express Bus
* pci6 U78AB.001.WZSHZY0-P1 PCI Express Bus
* pci5 U78AB.001.WZSHZY0-P1 PCI Express Bus
* pci4 U78AB.001.WZSHZY0-P1 PCI Express Bus
* pci10 U78AB.001.WZSHZY0-P1-C2 PCI Bus
+ cor0 U78AB.001.WZSHZY0-P1-C2-T1 GXT145 Graphics Adapter
* pci3 U78AB.001.WZSHZY0-P1 PCI Express Bus
+ ent0 U78AB.001.WZSHZY0-P1-C7-T1 4-Port Gigabit Ethernet PCI-Express Adapter (e414571614102004)
+ ent1 U78AB.001.WZSHZY0-P1-C7-T2 4-Port Gigabit Ethernet PCI-Express Adapter (e414571614102004)
+ ent2 U78AB.001.WZSHZY0-P1-C7-T3 4-Port Gigabit Ethernet PCI-Express Adapter (e414571614102004)
+ ent3 U78AB.001.WZSHZY0-P1-C7-T4 4-Port Gigabit Ethernet PCI-Express Adapter (e414571614102004)
* pci2 U78AB.001.WZSHZY0-P1 PCI Express Bus
* pci1 U78AB.001.WZSHZY0-P1 PCI Express Bus
* pci9 U78AB.001.WZSHZY0-P1 PCI Bus
+ usbhc0 U78AB.001.WZSHZY0-P1 USB Host Controller (33103500)
+ usbhc1 U78AB.001.WZSHZY0-P1 USB Host Controller (33103500)
+ usbhc2 U78AB.001.WZSHZY0-P1 USB Enhanced Host Controller (3310e000)
* pci0 U78AB.001.WZSHZY0-P1 PCI Express Bus
+ sissas0 U78AB.001.WZSHZY0-P1-T9 PCIe x4 Planar 3Gb SAS Adapter
* sas0 U78AB.001.WZSHZY0-P1-T9 Controller SAS Protocol
* sfwcomm0 SAS Storage Framework Comm
+ hdisk0 U78AB.001.WZSHZY0-P3-D1 SAS Disk Drive (600000 MB)
+ hdisk1 U78AB.001.WZSHZY0-P3-D2 SAS Disk Drive (600000 MB)
+ hdisk2 U78AB.001.WZSHZY0-P3-D3 SAS Disk Drive (600000 MB)
+ hdisk3 U78AB.001.WZSHZY0-P3-D4 SAS Disk Drive (600000 MB)
+ hdisk4 U78AB.001.WZSHZY0-P3-D5 SAS Disk Drive (600000 MB)
+ hdisk5 U78AB.001.WZSHZY0-P3-D6 SAS Disk Drive (600000 MB)
+ ses0 U78AB.001.WZSHZY0-P2-Y1 SAS Enclosure Services Device
+ ses1 U78AB.001.WZSHZY0-P2-Y1 SAS Enclosure Services Device
* sata0 U78AB.001.WZSHZY0-P1-T9 Controller SATA Protocol
+ cd0 U78AB.001.WZSHZY0-P3-D7 SATA DVD-RAM Drive
+ L2cache0 L2 Cache
+ mem0 Memory
+ proc0 Processor
+ proc4 Processor
+ proc8 Processor
+ proc12 Processor
+ proc16 Processor
+ proc20 Processor
+ proc24 Processor
+ proc28 Processor
+ proc32 Processor
+ proc36 Processor
+ proc40 Processor
+ proc44 Processor
Thanks Noah!
Hummm You have a big machine, more powerful than mine. However, it seems that you do not see the random failure I see.
Regards,
Tony Reix
ATOS / Bull SAS
ATOS Expert
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net
________________________________________
From: Noah Misch [noah@leadboat.com]
Sent: Tuesday, January 16, 2018 17:19
To: REIX, Tony
Cc: Michael Paquier; Heikki Linnakangas; Konstantin Knizhnik; PostgreSQL Hackers; Bernd Helmle; OLIVA, PASCAL; EMPEREUR-MOT, SYLVIE
Subject: Re: [HACKERS] Deadlock in XLogInsert at AIX
On Tue, Jan 16, 2018 at 01:50:29PM +0000, REIX, Tony wrote:
And, on BuildFarm, I do not see any details about the logical/physical configuration of the AIX VMs, like hornet.
Being able to run real concurrent parallel stress programs, thus required multi-physical-CPU VM, would help.
It has 48 virtual CPUs. Here's the prtconf output:
[...]
On 2018-01-16 16:12:11 +0900, Michael Paquier wrote:
On Fri, Feb 03, 2017 at 12:26:50AM +0000, Noah Misch wrote:
On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote:
@@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
 static inline uint32
 pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
 {
+	uint32 ret;
+
 	/*
-	 * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
-	 * providing sequential consistency. This is undocumented.
+	 * Use __sync() before and __isync() after, like in compare-exchange
+	 * above.
 	 */
-	return __fetch_and_add((volatile int *)&ptr->value, add_);
+	__sync();
+
+	ret = __fetch_and_add((volatile int *)&ptr->value, add_);
+
+	__isync();
+
+	return ret;
 }

Since this emits double syncs with older xlc, I recommend instead replacing
the whole thing with inline asm. As I opined in the last message of the
thread you linked above, the intrinsics provide little value as abstractions
if one checks the generated code to deduce how to use them. Now that the
generated code is xlc-version-dependent, the port is better off with
compiler-independent asm like we have for ppc in s_lock.h.

Could it be cleaner to just use __xlc_ver__ to avoid double syncs on
past versions? I think that it would make the code more understandable
than just listing directly the instructions.
Given the quality of the intrinsics on AIX, see past commits and the
comment in the code quoted above, I think we're much better of doing
this via inline asm.
Greetings,
Andres Freund
On Tue, Jan 16, 2018 at 08:50:24AM -0800, Andres Freund wrote:
On 2018-01-16 16:12:11 +0900, Michael Paquier wrote:
On Fri, Feb 03, 2017 at 12:26:50AM +0000, Noah Misch wrote:
Since this emits double syncs with older xlc, I recommend instead replacing
the whole thing with inline asm. As I opined in the last message of the
thread you linked above, the intrinsics provide little value as abstractions
if one checks the generated code to deduce how to use them. Now that the
generated code is xlc-version-dependent, the port is better off with
compiler-independent asm like we have for ppc in s_lock.h.

Could it be cleaner to just use __xlc_ver__ to avoid double syncs on
past versions? I think that it would make the code more understandable
than just listing directly the instructions.

Given the quality of the intrinsics on AIX, see past commits and the
comment in the code quoted above, I think we're much better of doing
this via inline asm.
For me, verifiability is the crucial benefit of inline asm. Anyone with an
architecture manual can thoroughly review an inline asm implementation. Given
intrinsics and __xlc_ver__ conditionals, the same level of review requires
access to every xlc version.
As there have been other
bug reports from Tony Reix who has been working on AIX with XLC 13.1 and
that this thread got lost in the wild, I have added an entry in the next
CF:
https://commitfest.postgresql.org/17/1484/
The most recent patch version is Returned with Feedback. As a matter of
procedure, I discourage creating commitfest entries as a tool to solicit new
patch versions. If I were the author of a RwF patch, I would dislike finding
a commitfest entry that I did not create with myself listed as author.
If you do choose to proceed, the entry should be Waiting on Author.
As Heikki is not around these days, Noah, could you provide a new
version of the patch? This bug has been around for some time now, it
would be nice to move on..
Not soon.
Note that fixing this bug is just the start of accepting XLC 13.1 as a
compiler of PostgreSQL. If we get a buildfarm member with a few dozen clean
runs (blocked by, at a minimum, fixing this and the inlining bug), we'll have
something. Until then, support for XLC 13.1 is an anti-feature.
nm
Am Dienstag, den 16.01.2018, 08:25 +0000 schrieb REIX, Tony:
I've been able to compare PostgreSQL compiled with XLC vs GCC 7.1
and, using times outputs provided by PostgreSQL tests, XLC seems to
provide at least 8% more speed. We also plan to run professional
performance tests in order to compare PostgreSQL 10.1 on AIX vs
Linux/Power. I saw some 2017 performance slides, made with older
versions of PostgreSQL and XLC, that show bad PostgreSQL performance
on AIX vs Linux/Power, and I cannot believe it. We plan to
investigate this.
I assume you are referring to the attached graph I've shown at
PGConf.US 2017?
The numbers we've got on that E850 machine (pgbench SELECT-only, scale
1000) weren't really good in comparison to Linux on the same machine.
We tried many options to make the performance better; overall, the graph
shows the best performance from Linux *and* AIX with gcc, not XL C. We used
some knobs to get the best out on AIX:
export OBJECT_MODE=64; gcc -m64
ldedit -b forkpolicy:cor -b textpsize:64K -b datapsize:64K -b
stackpsize:64K postgres
export MALLOCOPTIONS=multiheap:16,considersize,pool,no_mallinfo
schedo -p -o vpm_fold_policy=4
There are many other things you can tune on AIX, but they didn't seem
to give the improvement we'd like to see.
Attachments:
AIX_vs_LINUX_fastest.png (image/png)
�l % 0>�
�{i�7Eh��,����4w��q�g���U�� � xd ��17�c
h��'�
G;k/w��n��n�N�nr4��*�^�'�rNvF9 �� ^��\���'��$��2:.)15�������;1��P��p��Ml� (Hp (���scR��c�S/�%^��I�Rcn&)W���j��[k/wG�r��L��+h<��=���g*�;9�q �# � �t������z�F��[���Z|����K�&?�)E[������b9Gg ��A� @ v;%#�FRt\b����q�Wn$�������[���������NL��S^��=���9z�;�;��, �P$8 �5�V\�OV�������\�������{.�j�U��r��n������r������ 3�7g �O�i.�%F�%I�K��o$]�!�Wo%?�s����
sWrw��k�R����4�Y �tHp (>������.)M��D���]S�����;���������\���������(y
�� �|c ��ee�F��3����coK��qI�HT�c���loS���Oyg�*\��FZ�vw�rw4z� 0C$8 <������nG]���5���mi�����x������K�
.���1�T�1��sOg{S� �� ��HN�<5����c�kj.]�/�������������h|=uMy9����*-�9���y �$8 �ee�^�K��5�KIm��1�O7��]}=]�z�*E���T5�� ��C� (�b�S�$5���5 �7��s��OV����������F^Vt�
� �q��% ���r#)2&^y�S�+�Ii������E�
.��\�y��y�������S�Yz�t� $8 �R"W��|=1X���]��x�9T�r��\�+�SR_O� �� P"E�H:}Ky��r+2&�������B~���:oF�_I��8��'