Disable WAL completely - Performance and Persistency research
Hi All,
As part of my master's at TAU, I'm currently conducting some research
regarding new persistent memory technology.
I'm using PG for this research and would like to better understand some of
the performance bottlenecks.
For this reason I'm trying to disable the WAL completely, using some hacks
on the source code and compiling my own version.
So what I'm actually looking for is some guidance about a simple way to:
1. Disable the WAL by not writing anything to the xlog directory. I don't
care about recovery/fault tolerance or PITR/replication etc. at the moment.
I'm aware that the WAL and checkpoints are bound together in many ways and are
crucial for PG core features.
I tried changing the status of all tables to "unlogged" by changing the
RelationNeedsWAL macro, as well as the "needs_wal" parameter in storage.c.
But I got no performance benefit, so I guess this was the wrong place to
change.
2. Cancel the locking around WAL files - I don't care about corrupted
files at the moment, I just want to see the maximum performance
benefit that I can get without lock contention.
Any guidance on how to do so would be appreciated :)
Kind regards,
Netanel
On Thu, Jul 7, 2016 at 5:01 PM, Netanel Katzburg <netanel10k@gmail.com> wrote:
> 1. Disable the WAL by not writing anything to the xlog directory. I don't
> care about recovery/fault tolerance or PITR/replication etc. at the moment.
> I'm aware that the WAL and checkpoints are bound together in many ways and are
> crucial for PG core features.
> Any guidance on how to do so would be appreciated :)
WAL insertion routines are in xloginsert.c. Did you try to play with those?
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
Hi Michael,
Sorry for the delay,
The answer is yes,
I tried 2 things so far:
1. As I understand it,
XLogRecPtr
XLogInsert(RmgrId rmid, uint8 info)
is the primary insert function in xloginsert.c.
I tried commenting out the following line in this function, so that it returns a
phony pointer every time it is called, just as in bootstrap mode:
if (IsBootstrapProcessingMode() && rmid != RM_XLOG_ID)
2. In xlog.c, in CopyXLogRecordToWAL(int write_len, bool isLogSwitch,
XLogRecData *rdata,
XLogRecPtr StartPos, XLogRecPtr EndPos), commenting out the memcpy call:
...
memcpy(currpos, rdata_data, rdata_len);
...
BUT neither option is good, as both stop me from even running
initdb.
Maybe someone has a lead regarding changes to be made in xlog.c:
XLogInsertRecord(XLogRecData *rdata, XLogRecPtr fpw_lsn)
Any other lead regarding xloginsert.c is welcome as well.
Regards,
Netanel
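Editorial sketch: the guard discussed above can be modelled in a few lines of standalone C to show one plausible reason why commenting out the "if" line breaks initdb. With the condition removed, the skip path runs for every record, including the RM_XLOG_ID records (checkpoints) that the stock guard deliberately lets through. Everything below is a toy: ToyLog, ToyGuard, TOY_PHONY_PTR and the resource-manager constants are hypothetical stand-ins, not PostgreSQL definitions, and the real XLogInsert does far more work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins; none of these are the real PostgreSQL definitions. */
typedef uint64_t ToyRecPtr;

enum { TOY_RM_XLOG_ID = 0, TOY_RM_HEAP_ID = 10 };

/* Arbitrary phony position, standing in for SizeOfXLogLongPHD. */
#define TOY_PHONY_PTR ((ToyRecPtr) 40)

typedef enum
{
    GUARD_STOCK,          /* skip only in bootstrap mode, keep RM_XLOG_ID */
    GUARD_UNCONDITIONAL,  /* "if" line commented out: skip every record */
    GUARD_KEEP_XLOG       /* skip in all modes, but keep RM_XLOG_ID */
} ToyGuard;

typedef struct
{
    ToyRecPtr insert_pos;       /* next free byte in the toy log */
    int       records_written;  /* records that actually reached the log */
} ToyLog;

/* Decide whether a record is dropped under the given guard variant. */
static bool
toy_record_is_skipped(ToyGuard guard, bool bootstrap, int rmid)
{
    switch (guard)
    {
        case GUARD_STOCK:
            return bootstrap && rmid != TOY_RM_XLOG_ID;
        case GUARD_UNCONDITIONAL:
            return true;
        case GUARD_KEEP_XLOG:
            return rmid != TOY_RM_XLOG_ID;
    }
    return false;
}

/* Either return the phony pointer or "write" len bytes and advance. */
static ToyRecPtr
toy_xlog_insert(ToyLog *log, ToyGuard guard, bool bootstrap,
                int rmid, uint32_t len)
{
    assert(len > 0);
    if (toy_record_is_skipped(guard, bootstrap, rmid))
        return TOY_PHONY_PTR;
    log->insert_pos += len;
    log->records_written++;
    return log->insert_pos;
}
```

Under GUARD_UNCONDITIONAL a checkpoint record is dropped just like a heap record, so nothing a later startup could read is ever written. GUARD_KEEP_XLOG is the gentler variant: ordinary records are skipped in every mode while RM_XLOG_ID records still reach the log.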
On Thu, Jul 7, 2016 at 4:17 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:
> WAL insertion routines are in xloginsert.c. Did you try to play with those?
On 10 July 2016 at 18:27, Netanel Katzburg <netanel10k@gmail.com> wrote:
> BUT, both options are not good, as they are stopping me from even running
> initdb.
The easiest path for testing will be to use an unpatched PostgreSQL to
`initdb` and create a new database. Then start up a patched one that simply
skips WAL writing against an already-`initdb`'d data directory.
You probably won't be able to safely restart PostgreSQL, but all you're
doing is performance analysis so one-shot operation on a throw-away data
directory is probably fine.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi,
You were right, the method you described worked well. Thank you!
But so far, I could not get any noticeable improvement in the number of
transactions or in latency.
I have tried:
1. In xlog.c, in CopyXLogRecordToWAL(int write_len, bool isLogSwitch,
XLogRecData *rdata,
XLogRecPtr StartPos, XLogRecPtr EndPos), commenting out the memcpy call:
memcpy(currpos, rdata_data, rdata_len);
2. In XLogInsert(RmgrId rmid, uint8 info), the primary insert function in
xloginsert.c,
I tried commenting out the following line, so that it returns a
phony pointer every time it is called, just as in bootstrap mode:
if (IsBootstrapProcessingMode() && rmid != RM_XLOG_ID)
3. In xlog.c, in XLogInsertRecord(XLogRecData *rdata, XLogRecPtr fpw_lsn),
commenting out the WALInsertLock(s), as well as the spinlocks
around "Update shared LogwrtRqst.Write, if we crossed page boundary."
4. The last thing I tried in the XLogInsertRecord function was to
comment out:
/*
 * All the record data, including the header, is now ready to be
 * inserted. Copy the record in the space reserved.
 */
CopyXLogRecordToWAL(rechdr->xl_tot_len, isLogSwitch, rdata,
StartPos, EndPos);
Regards,
Netanel
On Mon, Jul 11, 2016 at 8:27 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> The easiest path for testing will be to use an unpatched PostgreSQL to
> `initdb` and create a new database. Then start up a patched one that simply
> skips WAL writing against an already-`initdb`'d data directory.
> You probably won't be able to safely restart PostgreSQL, but all you're
> doing is performance analysis so one-shot operation on a throw-away data
> directory is probably fine.
On 11 July 2016 at 19:14, Netanel Katzburg <netanel10k@gmail.com> wrote:
> Hi,
> You were right, the method you described worked well. Thank you!
> But so far, could not get any noticeable improvement in number of
> transactions / latency.
What are you comparing to?
To start with, compare with:
- an unpatched PostgreSQL, configured normally, with normal logged tables
- an unpatched PostgreSQL, using UNLOGGED tables
- an unpatched PostgreSQL, using UNLOGGED tables and synchronous_commit =
off (or fsync=off, but remember, that disables data integrity protections
for system catalogs and everything).
Make sure you're introducing a suitably write-concurrent workload that
might actually be waiting on WAL.
Personally I'd be surprised if you saw any significant difference over
using UNLOGGED tables. That's why we have them ;)
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
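Editorial sketch: the comparison suggested above is easier to read when every run (stock with logged tables, stock with UNLOGGED tables, stock with UNLOGGED tables plus synchronous_commit = off, and the patched build) is reduced to the same two numbers and expressed relative to one baseline. The helper below is a minimal illustration in standalone C. ToyRunSummary and both functions are hypothetical names, not part of pgbench or PostgreSQL, and the latencies are assumed to have been collected by whatever benchmark driver is in use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one benchmark run; not a pgbench structure. */
typedef struct
{
    double tps;              /* committed transactions per second */
    double mean_latency_ms;  /* mean per-transaction latency */
} ToyRunSummary;

/* Reduce n per-transaction latencies and the wall-clock time to a summary. */
static ToyRunSummary
toy_summarize_run(const double *latency_ms, size_t n, double wall_seconds)
{
    ToyRunSummary s = {0.0, 0.0};
    double total = 0.0;

    assert(n > 0 && wall_seconds > 0.0);
    for (size_t i = 0; i < n; i++)
        total += latency_ms[i];
    s.tps = (double) n / wall_seconds;
    s.mean_latency_ms = total / (double) n;
    return s;
}

/* A ratio above 1.0 means the candidate committed more transactions per second. */
static double
toy_tps_ratio(ToyRunSummary baseline, ToyRunSummary candidate)
{
    assert(baseline.tps > 0.0);
    return candidate.tps / baseline.tps;
}
```

If the ratio of the patched build against the UNLOGGED baseline stays close to 1.0, that would be consistent with the expectation stated above that skipping WAL entirely adds little beyond what UNLOGGED tables already provide for that workload.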
On Thu, Jul 7, 2016 at 1:01 AM, Netanel Katzburg <netanel10k@gmail.com> wrote:
> Hi All,
> As part of my master's at TAU, I'm currently conducting some research
> regarding new persistent memory technology.
> I'm using PG for this research and would like to better understand some of
> the performance bottlenecks.
> For this reason I'm trying to disable the WAL completely, using some hacks
> on the source code and compiling my own version.
> So what I'm actually looking for is some guidance about a simple way to:
> 1. Disable the WAL by not writing anything to the xlog directory. I don't
> care about recovery/fault tolerance or PITR/replication etc. at the moment.
> I'm aware that the WAL and checkpoints are bound together in many ways and are
> crucial for PG core features.
> I tried changing the status of all tables to "unlogged" by changing the
> RelationNeedsWAL macro, as well as the "needs_wal" parameter in storage.c.
> But I got no performance benefit, so I guess this was the wrong place to
> change.
> 2. Cancel the locking around WAL files - I don't care about corrupted files
> at the moment, I just want to see the maximum performance benefit
> that I can get without lock contention.
> Any guidance on how to do so would be appreciated :)
I have a very old patch which introduces a config variable (JJNOWAL)
that skips all WAL, except for the WAL of certain checkpoints (which
are needed for initdb and to restart the server after a clean
shutdown).
I have rebased it up to HEAD. It seems to work, but I haven't tested
thoroughly that it still does the correct thing in every corner case.
(a lot of changes have been made to xlog code since last time I used
this.)
Obviously if the server goes down uncleanly while this setting is
active, it will not be usable anymore.
Cheers,
Jeff
Attachments:
nowal.patch (application/octet-stream)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
new file mode 100644
index aecede1..44e5eab
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*************** int wal_level = WAL_LEVEL_MINIMAL;
*** 100,105 ****
--- 100,106 ----
int CommitDelay = 0; /* precommit delay in microseconds */
int CommitSiblings = 5; /* # concurrent xacts needed to sleep */
int wal_retrieve_retry_interval = 5000;
+ int JJNOWAL=0;
#ifdef WAL_DEBUG
bool XLOG_DEBUG = false;
*************** XLogInsertRecord(XLogRecData *rdata, XLo
*** 904,909 ****
--- 905,916 ----
XLogRecPtr StartPos;
XLogRecPtr EndPos;
+ if (JJNOWAL && (rechdr->xl_rmid != RM_XLOG_ID) ) { // Don't actually insert any XLOG, except the ones needed during boot strap
+ EndPos = SizeOfXLogLongPHD; /* start of 1st chkpt record */
+ return EndPos;
+ };
+
+
/* we assume that all of the record header is in the first chunk */
Assert(rdata->len >= SizeOfXLogRecord);
*************** CreateCheckPoint(int flags)
*** 8268,8274 ****
CHECKPOINT_FORCE)) == 0)
{
if (prevPtr == ControlFile->checkPointCopy.redo &&
! prevPtr / XLOG_SEG_SIZE == curInsert / XLOG_SEG_SIZE)
{
WALInsertLockRelease();
LWLockRelease(CheckpointLock);
--- 8275,8281 ----
CHECKPOINT_FORCE)) == 0)
{
if (prevPtr == ControlFile->checkPointCopy.redo &&
! prevPtr / XLOG_SEG_SIZE == curInsert / XLOG_SEG_SIZE && !JJNOWAL)
{
WALInsertLockRelease();
LWLockRelease(CheckpointLock);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
new file mode 100644
index 6ac5184..dd9c0a1
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*************** extern char *default_tablespace;
*** 111,116 ****
--- 111,117 ----
extern char *temp_tablespaces;
extern bool ignore_checksum_failure;
extern bool synchronize_seqscans;
+ extern int JJNOWAL;
#ifdef TRACE_SYNCSCAN
extern bool trace_syncscan;
*************** static struct config_int ConfigureNamesI
*** 1744,1749 ****
--- 1745,1759 ----
0, 0, INT_MAX,
NULL, NULL, NULL
},
+
+ {
+ {"JJNOWAL", PGC_USERSET, WAL_SETTINGS,
+ gettext_noop("turn off WAL logging, except for checkpoint/shutdown records"),
+ NULL
+ },
+ &JJNOWAL,
+ 0, 0, 100000, NULL, NULL
+ },
{
{"geqo_generations", PGC_USERSET, QUERY_TUNING_GEQO,
gettext_noop("GEQO: number of iterations of the algorithm."),
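Editorial sketch: the CreateCheckPoint hunk in the patch above is easy to overlook. One plausible reading is that the stock test skips a non-forced checkpoint when nothing has been inserted since the previous one. With JJNOWAL set, ordinary records never advance the insert position, so that test would presumably always conclude the system is idle. The added "&& !JJNOWAL" switches the shortcut off. The standalone C below models only that condition. toy_checkpoint_is_skipped and its parameters are hypothetical names, "forced" stands in for the flags test on the line before the condition, and the 16 MB segment size is the default of a stock build.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ToyRecPtr;

/* Assumed segment size: 16 MB, the default for a stock build. */
#define TOY_XLOG_SEG_SIZE ((ToyRecPtr) 16 * 1024 * 1024)

/*
 * Toy model of the "nothing happened since the last checkpoint" shortcut.
 *   prev_ptr   - start of the most recently inserted record
 *   last_redo  - redo pointer of the previous checkpoint
 *   cur_insert - current insert position
 *   forced     - shutdown, end-of-recovery or explicitly forced checkpoint
 *   nowal      - the patch's JJNOWAL switch
 */
static bool
toy_checkpoint_is_skipped(ToyRecPtr prev_ptr, ToyRecPtr last_redo,
                          ToyRecPtr cur_insert, bool forced, bool nowal)
{
    assert(TOY_XLOG_SEG_SIZE > 0);
    if (forced)
        return false;   /* forced checkpoints never take the shortcut */
    return prev_ptr == last_redo &&
           prev_ptr / TOY_XLOG_SEG_SIZE == cur_insert / TOY_XLOG_SEG_SIZE &&
           !nowal;      /* the patched condition */
}
```

With nowal false, the function behaves like the unpatched test. With nowal true, it never reports a skip, so every requested checkpoint goes ahead even though the log position has not moved.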
Your patch is very helpful, I'm still checking it on different file-systems.
I really liked the idea of using only the edge checkpoints.
Thanks.
On Mon, Jul 11, 2016 at 9:26 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> I have a very old patch which introduces a config variable (JJNOWAL)
> that skips all WAL, except for the WAL of certain checkpoints (which
> are needed for initdb and to restart the server after a clean
> shutdown).
> I have rebased it up to HEAD. It seems to work, but I haven't tested
> thoroughly that it still does the correct thing in every corner case.
> (a lot of changes have been made to xlog code since last time I used
> this.)
> Obviously if the server goes down uncleanly while this setting is
> active, it will not be usable anymore.