wait events for disk I/O
Hi All,
Attached is the patch, which extend the existing wait event infrastructure
to
implement the wait events for the disk I/O. Basically pg_stat_activity's
wait
event information to show data about disk I/O as well as IPC primitives.
Implementation details:
- Added PG_WAIT_IO to pgstat.h and a new enum WaitEventIO
- Added a wait_event_info argument to FileRead, FileWrite, FilePrefetch,
FileWriteback, FileSync, and FileTruncate. Set this wait event just before
performing the file system operation and clear it just after.
- Pass down an appropriate wait event from caller of any of those
functions.
- Also set and clear a wait event around standalone calls to read(),
write(), fsync() in other parts of the system.
- Added documentation for all newly added wait event.
Open issue:
- Might missed few standalone calls to read(), write(), etc which need
to pass the wait_event_info.
Thanks to my colleague Robert Haas for his help in design.
Please let me know your thought, and thanks for reading.
Thanks,
Rushabh Lathia
www.EnterpriseDB.com
Attachments:
wait_event_disk_IO.patchbinary/octet-stream; name=wait_event_disk_IO.patchDownload
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 01fad38..7d953f1 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -716,6 +716,12 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
point.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>IO</>: The server process is waiting for a IO to complete.
+ <literal>wait_event</> will identify the specific wait point.
+ </para>
+ </listitem>
</itemizedlist>
</entry>
</row>
@@ -1260,6 +1266,272 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry><literal>RecoveryApplyDelay</></entry>
<entry>Waiting to apply WAL at recovery because it is delayed.</entry>
</row>
+ <row>
+ <entry morerows="65"><literal>IO</></entry>
+ <entry><literal>ReadDataBlock</></entry>
+ <entry>Waiting during relation data block read.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteDataBlock</></entry>
+ <entry>Waiting during relation data block write.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncDataBlock</></entry>
+ <entry>Waiting during relation data block sync.</entry>
+ </row>
+ <row>
+ <entry><literal>ExtendDataBlock</></entry>
+ <entry>Waiting during add a block to the relation.</entry>
+ </row>
+ <row>
+ <entry><literal>FlushDataBlock</></entry>
+ <entry>Waiting during write pages back to storage.</entry>
+ </row>
+ <row>
+ <entry><literal>PrefetchDataBlock</></entry>
+ <entry>Waiting during asynchronous read of the specified block of a relation.</entry>
+ </row>
+ <row>
+ <entry><literal>TruncateDataBlock</></entry>
+ <entry>Waiting during truncate relation to specified number of blocks.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncRelation</></entry>
+ <entry>Waiting during sync writes to stable storage.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncImmedRelation</></entry>
+ <entry>Waiting during immediate sync a relation to stable storage.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteRewriteDataBlock</></entry>
+ <entry>Waiting to write data block during rewrite heap.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncRewriteDataBlock</></entry>
+ <entry>Waiting to sync data block during rewrite heap.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteBuffile</></entry>
+ <entry>Waiting during buffile read operation.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteBuffile</></entry>
+ <entry>Waiting during buffile write operation.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadXLog</></entry>
+ <entry>Waiting during read the XLOG page.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadCopyXLog</></entry>
+ <entry>Wait to read the XLOG page during create a new XLOG file segment by copying a pre-existing one.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteXLog</></entry>
+ <entry>Waiting during write the XLOG page.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteInitXLogFile</></entry>
+ <entry>Waiting to write the XLOG page during XLOG file initialization.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteCopyXLogFile</></entry>
+ <entry>Waiting to write the XLOG page during create a new XLOG file segment by copying a pre-existing one.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteBootstrapXLog</></entry>
+ <entry>Waiting to write the XLOG page during bootstrap.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncInitXLogFile</></entry>
+ <entry>Waiting to sync the XLOG page during XLOG file initialization.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncCopyXLogFile</></entry>
+ <entry>Waiting to sync the XLOG page during create a new XLOG file segment by copying a pre-existing one.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncBootstrapXLog</></entry>
+ <entry>Waiting to sync the XLOG page during bootstrap.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncAssignXLogMethod</></entry>
+ <entry>Waiting to assign xlog sync method.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteControlFile</></entry>
+ <entry>Waiting to write the control file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteUpdateControlFile</></entry>
+ <entry>Waiting to write the control file during update control file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadControlFile</></entry>
+ <entry>Waiting to read the control file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncWriteControlFile</></entry>
+ <entry>Waiting to sync the control file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncUpdateControlFile</></entry>
+ <entry>Waiting to sync the control file during update control file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadApplyLogicalMapping</></entry>
+ <entry>Waiting to read logical mapping during apply a single mapping file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteLogicalMappingRewrite</></entry>
+ <entry>Waiting to write logical mapping during xlog logical rewrite.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncLogicalMappingRewrite</></entry>
+ <entry>Waiting to sync logical mapping during xlog logical rewrite.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncLogicalMappingRewriteHeap</></entry>
+ <entry>Waiting to sync logical mapping during a checkpoint for logical rewrite mappings.</entry>
+ </row>
+ <row>
+ <entry><literal>TruncateLogicalMappingRewrite</></entry>
+ <entry>Waiting to truncate logical mapping during xlog logical rewrite.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteSnapbuildSerialize</></entry>
+ <entry>Waiting to write snapshot during serialize the snapshot.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadSnapbuildSerialize</></entry>
+ <entry>Waiting to read snapshot during serialize the snapshot.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncSnapbuildSerialize</></entry>
+ <entry>Waiting to sync snapshot during serialize the snapshot.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadSLRUPage</></entry>
+ <entry>Waiting to read page during physical read of a page into a buffer slot.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteSLRUPage</></entry>
+ <entry>Waiting to write page during physical write of a page from a buffer slot.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncSLRUWritePage</></entry>
+ <entry>Waiting to sync page during physical write of a page from a buffer slot.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncSLRUFlush</></entry>
+ <entry>Waiting to sync page during flush dirty pages to disk during checkpoint or database shutdown.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadTimelineHistoryWrite</></entry>
+ <entry>Waiting to read timeline history during write timeline history.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadTimelineHistoryWalsender</></entry>
+ <entry>Waiting to read timeline history during walsander timeline command.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteTimelineHistory</></entry>
+ <entry>Waiting to write timeline history.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteTimelineHistoryFile</></entry>
+ <entry>Waiting to write timeline history during a history file write for given timeline and contents.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncTimelineHistoryWrite</></entry>
+ <entry>Waiting to sync timeline history during write timeline history</entry>
+ </row>
+ <row>
+ <entry><literal>SyncTimelineHistoryFile</></entry>
+ <entry>Waiting to sync timeline history during a history file write for given timeline and contents.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadTwophaseFile</></entry>
+ <entry>Waiting to read two phase file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteRecreateTwophaseFile</></entry>
+ <entry>Waiting to write two phase file during recreate two phase file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncRecreateTwophaseFile</></entry>
+ <entry>Waiting to sync two phase file during recreate two phase file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadSysloggerFile</></entry>
+ <entry>Wait during read syslogger file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteSysloggerFile</></entry>
+ <entry>Wait during write syslogger file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadRestorREPLSlot</></entry>
+ <entry>Wait to read REPL slot during load a single slot from disk into memory.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteREPLSlot</></entry>
+ <entry>Wait to write REPL slot.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncRestoreREPLSlot</></entry>
+ <entry>Wait to sync REPL slot during load a single slot from disk into memory.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadCopyFile</></entry>
+ <entry>Waiting to read during copy file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteCopyFile</></entry>
+ <entry>Waiting to write during copy file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadLoadRELMAPFile</></entry>
+ <entry>Waiting to read RELMAP file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteLoadRELMAPFile</></entry>
+ <entry>Waiting to write RELMAP file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncLoadRELMAPFile</></entry>
+ <entry>Waiting to sync RELMAP file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadCreateLockFile</></entry>
+ <entry>Wait to read lock file during create lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadAddToDataDirLockFile</></entry>
+ <entry>Wait to read lock file during add a line in the data directory lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadRecheckDataDirLockFile</></entry>
+ <entry>Wait to read lock file during recheck that the data directory lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteCreateLockFile</></entry>
+ <entry>Wait to write lock file during create lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteAddToDataDirLockFile</></entry>
+ <entry>Wait to write lock file during add a line in the data directory lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncCreateLockFile</></entry>
+ <entry>Wait to sync lock file during create lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncAddToDataDirLockFile</></entry>
+ <entry>Wait to sync lock file during add a line in the data directory lock file.</entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 90ab6f2..709acbf 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -119,6 +119,8 @@
#include "lib/ilist.h"
+#include "pgstat.h"
+
#include "replication/logical.h"
#include "replication/slot.h"
@@ -916,7 +918,8 @@ logical_heap_rewrite_flush_mappings(RewriteState state)
* Note that we deviate from the usual WAL coding practices here,
* check the above "Logical rewrite support" comment for reasoning.
*/
- written = FileWrite(src->vfd, waldata_start, len);
+ written = FileWrite(src->vfd, waldata_start, len,
+ WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK);
if (written != len)
ereport(ERROR,
(errcode_for_file_access(),
@@ -957,7 +960,7 @@ logical_end_heap_rewrite(RewriteState state)
hash_seq_init(&seq_status, state->rs_logical_mappings);
while ((src = (RewriteMappingFile *) hash_seq_search(&seq_status)) != NULL)
{
- if (FileSync(src->vfd) != 0)
+ if (FileSync(src->vfd, WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", src->path)));
@@ -1141,11 +1144,13 @@ heap_xlog_logical_rewrite(XLogReaderState *r)
* Truncate all data that's not guaranteed to have been safely fsynced (by
* previous record or by the last checkpoint).
*/
+ pgstat_report_wait_start(WAIT_EVENT_TRUNCATE_LOGICAL_MAPPING_REWRITE);
if (ftruncate(fd, xlrec->offset) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not truncate file \"%s\" to %u: %m",
path, (uint32) xlrec->offset)));
+ pgstat_report_wait_end();
/* now seek to the position we want to write our data to */
if (lseek(fd, xlrec->offset, SEEK_SET) != xlrec->offset)
@@ -1159,20 +1164,24 @@ heap_xlog_logical_rewrite(XLogReaderState *r)
len = xlrec->num_mappings * sizeof(LogicalRewriteMappingData);
/* write out tail end of mapping file (again) */
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_LOGICAL_MAPPING_REWRITE);
if (write(fd, data, len) != len)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", path)));
+ pgstat_report_wait_end();
/*
* Now fsync all previously written data. We could improve things and only
* do this for the last write to a file, but the required bookkeeping
* doesn't seem worth the trouble.
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", path)));
+ pgstat_report_wait_end();
CloseTransientFile(fd);
}
@@ -1266,10 +1275,12 @@ CheckPointLogicalRewriteHeap(void)
* changed or have only been created since the checkpoint's start,
* but it's currently not deemed worth the effort.
*/
- else if (pg_fsync(fd) != 0)
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE_HEAP);
+ if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", path)));
+ pgstat_report_wait_end();
CloseTransientFile(fd);
}
}
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index a66ef5c..237e8eb 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -54,6 +54,7 @@
#include "access/slru.h"
#include "access/transam.h"
#include "access/xlog.h"
+#include "pgstat.h"
#include "storage/fd.h"
#include "storage/shmem.h"
#include "miscadmin.h"
@@ -675,13 +676,16 @@ SlruPhysicalReadPage(SlruCtl ctl, int pageno, int slotno)
}
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_READ_SLRU_PAGE);
if (read(fd, shared->page_buffer[slotno], BLCKSZ) != BLCKSZ)
{
+ pgstat_report_wait_end();
slru_errcause = SLRU_READ_FAILED;
slru_errno = errno;
CloseTransientFile(fd);
return false;
}
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
{
@@ -834,8 +838,10 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
}
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_SLRU_PAGE);
if (write(fd, shared->page_buffer[slotno], BLCKSZ) != BLCKSZ)
{
+ pgstat_report_wait_end();
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
@@ -845,6 +851,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
CloseTransientFile(fd);
return false;
}
+ pgstat_report_wait_end();
/*
* If not part of Flush, need to fsync now. We assume this happens
@@ -852,6 +859,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
*/
if (!fdata)
{
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_SLRU_WRITE_PAGE);
if (ctl->do_fsync && pg_fsync(fd))
{
slru_errcause = SLRU_FSYNC_FAILED;
@@ -859,6 +867,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
CloseTransientFile(fd);
return false;
}
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
{
@@ -1126,6 +1135,7 @@ SimpleLruFlush(SlruCtl ctl, bool allow_redirtied)
ok = true;
for (i = 0; i < fdata.num_files; i++)
{
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_SLRU_FLUSH);
if (ctl->do_fsync && pg_fsync(fdata.fd[i]))
{
slru_errcause = SLRU_FSYNC_FAILED;
@@ -1133,6 +1143,7 @@ SimpleLruFlush(SlruCtl ctl, bool allow_redirtied)
pageno = fdata.segno[i] * SLRU_PAGES_PER_SEGMENT;
ok = false;
}
+ pgstat_report_wait_end();
if (CloseTransientFile(fdata.fd[i]))
{
diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index c8240b1..052f77f 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -39,6 +39,7 @@
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "access/xlogdefs.h"
+#include "pgstat.h"
#include "storage/fd.h"
/*
@@ -339,7 +340,9 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
for (;;)
{
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_READ_TIMELINE_HISTORY_WRITE);
nbytes = (int) read(srcfd, buffer, sizeof(buffer));
+ pgstat_report_wait_end();
if (nbytes < 0 || errno != 0)
ereport(ERROR,
(errcode_for_file_access(),
@@ -347,6 +350,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
if (nbytes == 0)
break;
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_TIMELINE_HISTORY);
if ((int) write(fd, buffer, nbytes) != nbytes)
{
int save_errno = errno;
@@ -366,6 +370,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
}
CloseTransientFile(srcfd);
}
@@ -401,10 +406,12 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_TIMELINE_HISTORY_WRITE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
ereport(ERROR,
@@ -461,6 +468,7 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size)
errmsg("could not create file \"%s\": %m", tmppath)));
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_TIMELINE_HISTORY_FILE);
if ((int) write(fd, content, size) != size)
{
int save_errno = errno;
@@ -476,11 +484,14 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_TIMELINE_HISTORY_FILE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
ereport(ERROR,
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 5b72c1d..3ab361e 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1200,8 +1200,10 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
*/
buf = (char *) palloc(stat.st_size);
+ pgstat_report_wait_start(WAIT_EVENT_READ_TWOPHASE_FILE);
if (read(fd, buf, stat.st_size) != stat.st_size)
{
+ pgstat_report_wait_end();
CloseTransientFile(fd);
if (give_warnings)
ereport(WARNING,
@@ -1212,6 +1214,7 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
return NULL;
}
+ pgstat_report_wait_end();
CloseTransientFile(fd);
hdr = (TwoPhaseFileHeader *) buf;
@@ -1542,8 +1545,10 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
path)));
/* Write content and CRC */
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_RECREATE_TWOPHASE_FILE);
if (write(fd, content, len) != len)
{
+ pgstat_report_wait_end();
CloseTransientFile(fd);
ereport(ERROR,
(errcode_for_file_access(),
@@ -1551,16 +1556,19 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
}
if (write(fd, &statefile_crc, sizeof(pg_crc32c)) != sizeof(pg_crc32c))
{
+ pgstat_report_wait_end();
CloseTransientFile(fd);
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not write two-phase state file: %m")));
}
+ pgstat_report_wait_end();
/*
* We must fsync the file because the end-of-replay checkpoint will not do
* so, there being no GXACT in shared memory yet to tell it to.
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_RECREATE_TWOPHASE_FILE);
if (pg_fsync(fd) != 0)
{
CloseTransientFile(fd);
@@ -1568,6 +1576,7 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
(errcode_for_file_access(),
errmsg("could not fsync two-phase state file: %m")));
}
+ pgstat_report_wait_end();
if (CloseTransientFile(fd) != 0)
ereport(ERROR,
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 2f5d603..8725d9b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2340,7 +2340,9 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
do
{
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_XLOG);
written = write(openLogFile, from, nleft);
+ pgstat_report_wait_end();
if (written <= 0)
{
if (errno == EINTR)
@@ -3091,6 +3093,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
for (nbytes = 0; nbytes < XLogSegSize; nbytes += XLOG_BLCKSZ)
{
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_INIT_XLOG_FILE);
if ((int) write(fd, zbuffer, XLOG_BLCKSZ) != (int) XLOG_BLCKSZ)
{
int save_errno = errno;
@@ -3109,8 +3112,10 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
}
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_INIT_XLOG_FILE);
if (pg_fsync(fd) != 0)
{
close(fd);
@@ -3118,6 +3123,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
if (close(fd))
ereport(ERROR,
@@ -3244,6 +3250,7 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
if (nread > sizeof(buffer))
nread = sizeof(buffer);
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_READ_COPY_XLOG);
if (read(srcfd, buffer, nread) != nread)
{
if (errno != 0)
@@ -3256,8 +3263,10 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
(errmsg("not enough data in file \"%s\"",
path)));
}
+ pgstat_report_wait_end();
}
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_COPY_XLOG_FILE);
if ((int) write(fd, buffer, sizeof(buffer)) != (int) sizeof(buffer))
{
int save_errno = errno;
@@ -3273,12 +3282,15 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
}
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_COPY_XLOG_FILE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
ereport(ERROR,
@@ -4303,6 +4315,7 @@ WriteControlFile(void)
XLOG_CONTROL_FILE)));
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_CONTROL_FILE);
if (write(fd, buffer, PG_CONTROL_SIZE) != PG_CONTROL_SIZE)
{
/* if write didn't set errno, assume problem is no disk space */
@@ -4312,11 +4325,14 @@ WriteControlFile(void)
(errcode_for_file_access(),
errmsg("could not write to control file: %m")));
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_WRITE_CONTROL_FILE);
if (pg_fsync(fd) != 0)
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not fsync control file: %m")));
+ pgstat_report_wait_end();
if (close(fd))
ereport(PANIC,
@@ -4342,10 +4358,12 @@ ReadControlFile(void)
errmsg("could not open control file \"%s\": %m",
XLOG_CONTROL_FILE)));
+ pgstat_report_wait_start(WAIT_EVENT_READ_CONTROL_FILE);
if (read(fd, ControlFile, sizeof(ControlFileData)) != sizeof(ControlFileData))
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not read from control file: %m")));
+ pgstat_report_wait_end();
close(fd);
@@ -4539,6 +4557,7 @@ UpdateControlFile(void)
XLOG_CONTROL_FILE)));
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_UPDATE_CONTROL_FILE);
if (write(fd, ControlFile, sizeof(ControlFileData)) != sizeof(ControlFileData))
{
/* if write didn't set errno, assume problem is no disk space */
@@ -4548,11 +4567,14 @@ UpdateControlFile(void)
(errcode_for_file_access(),
errmsg("could not write to control file: %m")));
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_UPDATE_CONTROL_FILE);
if (pg_fsync(fd) != 0)
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not fsync control file: %m")));
+ pgstat_report_wait_end();
if (close(fd))
ereport(PANIC,
@@ -4919,6 +4941,7 @@ BootStrapXLOG(void)
/* Write the first page with the initial record */
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_BOOTSTRAP_XLOG);
if (write(openLogFile, page, XLOG_BLCKSZ) != XLOG_BLCKSZ)
{
/* if write didn't set errno, assume problem is no disk space */
@@ -4928,11 +4951,14 @@ BootStrapXLOG(void)
(errcode_for_file_access(),
errmsg("could not write bootstrap transaction log file: %m")));
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_BOOTSTRAP_XLOG);
if (pg_fsync(openLogFile) != 0)
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not fsync bootstrap transaction log file: %m")));
+ pgstat_report_wait_end();
if (close(openLogFile))
ereport(PANIC,
@@ -9865,11 +9891,13 @@ assign_xlog_sync_method(int new_sync_method, void *extra)
*/
if (openLogFile >= 0)
{
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_ASSIGN_XLOG_SYNC_METHOD);
if (pg_fsync(openLogFile) != 0)
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not fsync log segment %s: %m",
XLogFileNameP(ThisTimeLineID, openLogSegNo))));
+ pgstat_report_wait_end();
if (get_sync_bit(sync_method) != get_sync_bit(new_sync_method))
XLogFileClose();
}
@@ -11322,6 +11350,7 @@ retry:
goto next_record_is_invalid;
}
+ pgstat_report_wait_start(WAIT_EVENT_READ_XLOG);
if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
{
char fname[MAXFNAMELEN];
@@ -11333,6 +11362,7 @@ retry:
fname, readOff)));
goto next_record_is_invalid;
}
+ pgstat_report_wait_end();
Assert(targetSegNo == readSegNo);
Assert(targetPageOff == readOff);
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 0de2419..24538bc 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -24,6 +24,7 @@
#include "access/xlogutils.h"
#include "catalog/catalog.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "storage/smgr.h"
#include "utils/guc.h"
#include "utils/hsearch.h"
@@ -727,7 +728,9 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count)
else
segbytes = nbytes;
+ pgstat_report_wait_start(WAIT_EVENT_READ_XLOG);
readbytes = read(sendFile, p, segbytes);
+ pgstat_report_wait_end();
if (readbytes <= 0)
{
char path[MAXPGPATH];
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 7176cf1..5a8a945 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -280,6 +280,7 @@ static const char *pgstat_get_wait_activity(WaitEventActivity w);
static const char *pgstat_get_wait_client(WaitEventClient w);
static const char *pgstat_get_wait_ipc(WaitEventIPC w);
static const char *pgstat_get_wait_timeout(WaitEventTimeout w);
+static const char *pgstat_get_wait_io(WaitEventIO w);
static void pgstat_setheader(PgStat_MsgHdr *hdr, StatMsgType mtype);
static void pgstat_send(void *msg, int len);
@@ -3176,6 +3177,9 @@ pgstat_get_wait_event_type(uint32 wait_event_info)
case PG_WAIT_TIMEOUT:
event_type = "Timeout";
break;
+ case PG_WAIT_IO:
+ event_type = "IO";
+ break;
default:
event_type = "???";
break;
@@ -3246,6 +3250,13 @@ pgstat_get_wait_event(uint32 wait_event_info)
event_name = pgstat_get_wait_timeout(w);
break;
}
+ case PG_WAIT_IO:
+ {
+ WaitEventIO w = (WaitEventIO) wait_event_info;
+
+ event_name = pgstat_get_wait_io(w);
+ break;
+ }
default:
event_name = "unknown wait event";
break;
@@ -3433,6 +3444,247 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
}
/* ----------
+ * pgstat_get_wait_io() -
+ *
+ * Convert WaitEventIO to string.
+ * ----------
+ */
+static const char *
+pgstat_get_wait_io(WaitEventIO w)
+{
+ const char *event_name = "unknown wait event";
+
+ switch (w)
+ {
+ case WAIT_EVENT_READ_DATA_BLOCK:
+ event_name = "ReadDataBlock";
+ break;
+ case WAIT_EVENT_WRITE_DATA_BLOCK:
+ event_name = "WriteDataBlock";
+ break;
+ case WAIT_EVENT_SYNC_DATA_BLOCK:
+ event_name = "SyncDataBlock";
+ break;
+ case WAIT_EVENT_EXTEND_DATA_BLOCK:
+ event_name = "ExtendDataBlock";
+ break;
+ case WAIT_EVENT_FLUSH_DATA_BLOCK:
+ event_name = "FlushDataBlock";
+ break;
+ case WAIT_EVENT_PREFETCH_DATA_BLOCK:
+ event_name = "PrefetchDataBlock";
+ break;
+ case WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS:
+ event_name = "TruncateDataBlock";
+ break;
+ case WAIT_EVENT_SYNC_RELATION:
+ event_name = "SyncRelation";
+ break;
+ case WAIT_EVENT_SYNC_IMMED_RELATION:
+ event_name = "SyncImmedRelation";
+ break;
+ case WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK:
+ event_name = "WriteRewriteDataBlock";
+ break;
+ case WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK:
+ event_name = "SyncRewriteDataBlock";
+ break;
+ case WAIT_EVENT_READ_BUFFILE:
+ event_name = "ReadBuffile";
+ break;
+ case WAIT_EVENT_WRITE_BUFFILE:
+ event_name = "WriteBuffile";
+ break;
+ /* XLOG wait event */
+ case WAIT_EVENT_READ_XLOG:
+ event_name = "ReadXLog";
+ break;
+ case WAIT_EVENT_READ_COPY_XLOG:
+ event_name = "ReadCopyXLog";
+ break;
+ case WAIT_EVENT_WRITE_XLOG:
+ event_name = "WriteXLog";
+ break;
+ case WAIT_EVENT_WRITE_INIT_XLOG_FILE:
+ event_name = "WriteInitXLogFile";
+ break;
+ case WAIT_EVENT_WRITE_COPY_XLOG_FILE:
+ event_name = "WriteCopyXLogFile";
+ break;
+ case WAIT_EVENT_WRITE_BOOTSTRAP_XLOG:
+ event_name = "WriteBootstrapXLog";
+ break;
+ case WAIT_EVENT_SYNC_INIT_XLOG_FILE:
+ event_name = "SyncInitXLogFile";
+ break;
+ case WAIT_EVENT_SYNC_COPY_XLOG_FILE:
+ event_name = "SyncCopyXLogFile";
+ break;
+ case WAIT_EVENT_SYNC_BOOTSTRAP_XLOG:
+ event_name = "SyncBootStrapXLog";
+ break;
+ case WAIT_EVENT_SYNC_ASSIGN_XLOG_SYNC_METHOD:
+ event_name = "SyncAssignXLogMethod";
+ break;
+ /* Control file wait events */
+ case WAIT_EVENT_WRITE_CONTROL_FILE:
+ event_name = "WriteControlFile";
+ break;
+ case WAIT_EVENT_WRITE_UPDATE_CONTROL_FILE:
+ event_name = "WriteUpdateControlFile";
+ break;
+ case WAIT_EVENT_READ_CONTROL_FILE:
+ event_name = "ReadControlFile";
+ break;
+ case WAIT_EVENT_SYNC_WRITE_CONTROL_FILE:
+ event_name = "SyncWriteControlFile";
+ break;
+ case WAIT_EVENT_SYNC_UPDATE_CONTROL_FILE:
+ event_name = "SyncUpdateControlFile";
+ break;
+ /* reorder buffer wait event */
+ case WAIT_EVENT_READ_REORDER_BUFFER:
+ event_name = "ReadReorderBuffer";
+ break;
+ case WAIT_EVENT_WRITE_REORDER_BUFFER:
+ event_name = "WriteReorderBuffer";
+ break;
+ /* logical mapping wait event */
+ case WAIT_EVENT_READ_APPLY_LOGICAL_MAPPING:
+ event_name = "ReadApplyLogicalMapping";
+ break;
+ case WAIT_EVENT_WRITE_LOGICAL_MAPPING_REWRITE:
+ event_name = "WriteLogicalMappingRewrite";
+ break;
+ case WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE:
+ event_name = "SyncLogicalMappingRewrite";
+ break;
+ case WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE_HEAP:
+ event_name = "SyncLogicalMappingRewriteHeap";
+ break;
+ case WAIT_EVENT_TRUNCATE_LOGICAL_MAPPING_REWRITE:
+ event_name = "TruncateLogicalMappingRewrite";
+ break;
+ /* Snapbuild wait event */
+ case WAIT_EVENT_WRITE_SNAPBUILD_SERIALIZE:
+ event_name = "WriteSnapbuildSerialize";
+ break;
+ case WAIT_EVENT_READ_SNAPBUILD_RESTORE:
+ event_name = "ReadSnapbuildRestore";
+ break;
+ case WAIT_EVENT_SYNC_SNAPBUILD_SERIALIZE:
+ event_name = "SyncSnapbuildSerialize";
+ break;
+ /* SLRU wait event */
+ case WAIT_EVENT_READ_SLRU_PAGE:
+ event_name = "ReadSLRUPage";
+ break;
+ case WAIT_EVENT_WRITE_SLRU_PAGE:
+ event_name = "WriteSLRUPage";
+ break;
+ case WAIT_EVENT_SYNC_SLRU_FLUSH:
+ event_name = "SyncSLRUFlush";
+ break;
+ case WAIT_EVENT_SYNC_SLRU_WRITE_PAGE:
+ event_name = "SyncSLRUWritePage";
+ break;
+ /* TIMELINE HISTORY wait event */
+ case WAIT_EVENT_READ_TIMELINE_HISTORY_WALSENDER:
+ event_name = "ReadTimelineHistoryWalsender";
+ break;
+ case WAIT_EVENT_WRITE_TIMELINE_HISTORY:
+ event_name = "WriteTimelineHistory";
+ break;
+ case WAIT_EVENT_WRITE_TIMELINE_HISTORY_FILE:
+ event_name = "WriteTimelineHistoryFile";
+ break;
+ case WAIT_EVENT_READ_TIMELINE_HISTORY_WRITE:
+ event_name = "ReadTimelineHistoryWrite";
+ break;
+ case WAIT_EVENT_SYNC_TIMELINE_HISTORY_WRITE:
+ event_name = "SyncTimelineHistoryWrite";
+ break;
+ case WAIT_EVENT_SYNC_TIMELINE_HISTORY_FILE:
+ event_name = "SyncTimelineHistoryFile";
+ break;
+ /* TWOPHASE FILE wait event */
+ case WAIT_EVENT_READ_TWOPHASE_FILE:
+ event_name = "ReadTwophaseFile";
+ break;
+ case WAIT_EVENT_WRITE_RECREATE_TWOPHASE_FILE:
+ event_name = "WriteRecreateTwophaseFile";
+ break;
+ case WAIT_EVENT_SYNC_RECREATE_TWOPHASE_FILE:
+ event_name = "SyncRecreateTwophaseFile";
+ break;
+ /* SYSLOGGER wait event */
+ case WAIT_EVENT_READ_SYSLOGGER_FILE:
+ event_name = "ReadSysloggerFile";
+ break;
+ case WAIT_EVENT_WRITE_SYSLOGGER_FILE:
+ event_name = "WriteSysloggerFile";
+ break;
+ /* REPLSLOT wait event */
+ case WAIT_EVENT_READ_RESTORE_REPLSLOT:
+ event_name = "ReadRestorREPLSlot";
+ break;
+ case WAIT_EVENT_WRITE_REPLSLOT:
+ event_name = "WriteREPLSlot";
+ break;
+ case WAIT_EVENT_SYNC_RESTORE_REPLSLOT:
+ event_name = "SyncRestoreREPLSlot";
+ break;
+ case WAIT_EVENT_SYNC_SAVE_REPLSLOT:
+ event_name = "SyncSaveREPLSlot";
+ break;
+ /* COPYDIR IO wait event */
+ case WAIT_EVENT_READ_COPY_FILE:
+ event_name = "ReadCopyFile";
+ break;
+ case WAIT_EVENT_WRITE_COPY_FILE:
+ event_name = "WriteCopyFile";
+ break;
+ /* RELMAP IO wait event */
+ case WAIT_EVENT_READ_LOAD_RELMAP_FILE:
+ event_name = "ReadLoadRELMAPFile";
+ break;
+ case WAIT_EVENT_WRITE_RELMAP_FILE:
+ event_name = "WriteRELMAPFile";
+ break;
+ case WAIT_EVENT_SYNC_WRITE_RELMAP_FILE:
+ event_name = "SyncWriteRELMAFile";
+ break;
+ /* LOCK FILE IO wait event */
+ case WAIT_EVENT_READ_CREATE_LOCK_FILE:
+ event_name = "ReadCreateLockFile";
+ break;
+ case WAIT_EVENT_READ_ADDTODATEDIR_LOCK_FILE:
+ event_name = "ReadAddToDataDirLockFile";
+ break;
+ case WAIT_EVENT_READ_RECHECKDATADIR_LOCK_FILE:
+ event_name = "ReadRecheckDataDirLockFile";
+ break;
+ case WAIT_EVENT_WRITE_CREATE_LOCK_FILE:
+ event_name = "WriteCreateLockFile";
+ break;
+ case WAIT_EVENT_WRITE_ADDTODATEDIR_LOCK_FILE:
+ event_name = "WriteAddToDataDirLockFile";
+ break;
+ case WAIT_EVENT_SYNC_ADDTODATEDIR_LOCK_FILE:
+ event_name = "SyncAddToDataDirLockFile";
+ break;
+ case WAIT_EVENT_SYNC_CREATE_LOCK_FILE:
+ event_name = "SyncCreateLockFile";
+ break;
+
+ /* no default case, so that compiler will warn */
+ }
+
+ return event_name;
+}
+
+
+/* ----------
* pgstat_get_backend_current_activity() -
*
* Return a string representing the current activity of the backend with
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index 13a0301..1ba9a5c 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -432,9 +432,11 @@ SysLoggerMain(int argc, char *argv[])
{
int bytesRead;
+ pgstat_report_wait_start(WAIT_EVENT_READ_SYSLOGGER_FILE);
bytesRead = read(syslogPipe[0],
logbuffer + bytes_in_logbuffer,
sizeof(logbuffer) - bytes_in_logbuffer);
+ pgstat_report_wait_end();
if (bytesRead < 0)
{
if (errno != EINTR)
@@ -992,7 +994,9 @@ write_syslogger_file(const char *buffer, int count, int destination)
open_csvlogfile();
logfile = destination == LOG_DESTINATION_CSVLOG ? csvlogFile : syslogFile;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_SYSLOGGER_FILE);
rc = fwrite(buffer, 1, count, logfile);
+ pgstat_report_wait_end();
/* can't use ereport here because of possible recursion */
if (rc != count)
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index d805ef4..a32bd3e 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -58,6 +58,7 @@
#include "catalog/catalog.h"
#include "lib/binaryheap.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "replication/logical.h"
#include "replication/reorderbuffer.h"
#include "replication/slot.h"
@@ -2317,6 +2318,7 @@ ReorderBufferSerializeChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
ondisk->size = sz;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_REORDER_BUFFER);
if (write(fd, rb->outbuf, ondisk->size) != ondisk->size)
{
int save_errno = errno;
@@ -2328,6 +2330,7 @@ ReorderBufferSerializeChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
errmsg("could not write to data file for XID %u: %m",
txn->xid)));
}
+ pgstat_report_wait_end();
Assert(ondisk->change.action == change->action);
}
@@ -2408,7 +2411,9 @@ ReorderBufferRestoreChanges(ReorderBuffer *rb, ReorderBufferTXN *txn,
* end of this file.
*/
ReorderBufferSerializeReserve(rb, sizeof(ReorderBufferDiskChange));
+ pgstat_report_wait_start(WAIT_EVENT_READ_REORDER_BUFFER);
readBytes = read(*fd, rb->outbuf, sizeof(ReorderBufferDiskChange));
+ pgstat_report_wait_end();
/* eof */
if (readBytes == 0)
@@ -2435,8 +2440,10 @@ ReorderBufferRestoreChanges(ReorderBuffer *rb, ReorderBufferTXN *txn,
sizeof(ReorderBufferDiskChange) + ondisk->size);
ondisk = (ReorderBufferDiskChange *) rb->outbuf;
+ pgstat_report_wait_start(WAIT_EVENT_READ_REORDER_BUFFER);
readBytes = read(*fd, rb->outbuf + sizeof(ReorderBufferDiskChange),
ondisk->size - sizeof(ReorderBufferDiskChange));
+ pgstat_report_wait_end();
if (readBytes < 0)
ereport(ERROR,
@@ -3089,7 +3096,9 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
memset(&key, 0, sizeof(ReorderBufferTupleCidKey));
/* read all mappings till the end of the file */
+ pgstat_report_wait_start(WAIT_EVENT_READ_APPLY_LOGICAL_MAPPING);
readBytes = read(fd, &map, sizeof(LogicalRewriteMappingData));
+ pgstat_report_wait_end();
if (readBytes < 0)
ereport(ERROR,
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 1e02aa9..3fc1343 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -116,6 +116,8 @@
#include "access/transam.h"
#include "access/xact.h"
+#include "pgstat.h"
+
#include "replication/logical.h"
#include "replication/reorderbuffer.h"
#include "replication/snapbuild.h"
@@ -1581,6 +1583,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
ereport(ERROR,
(errmsg("could not open file \"%s\": %m", path)));
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_SNAPBUILD_SERIALIZE);
if ((write(fd, ondisk, needed_length)) != needed_length)
{
CloseTransientFile(fd);
@@ -1588,6 +1591,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
/*
* fsync the file before renaming so that even if we crash after this we
@@ -1597,6 +1601,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
* some noticeable overhead since it's performed synchronously during
* decoding?
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_SNAPBUILD_SERIALIZE);
if (pg_fsync(fd) != 0)
{
CloseTransientFile(fd);
@@ -1604,6 +1609,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
CloseTransientFile(fd);
fsync_fname("pg_logical/snapshots", true);
@@ -1678,7 +1684,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
/* read statically sized portion of snapshot */
+ pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE);
readBytes = read(fd, &ondisk, SnapBuildOnDiskConstantSize);
+ pgstat_report_wait_end();
if (readBytes != SnapBuildOnDiskConstantSize)
{
CloseTransientFile(fd);
@@ -1704,7 +1712,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
SnapBuildOnDiskConstantSize - SnapBuildOnDiskNotChecksummedSize);
/* read SnapBuild */
+ pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE);
readBytes = read(fd, &ondisk.builder, sizeof(SnapBuild));
+ pgstat_report_wait_end();
if (readBytes != sizeof(SnapBuild))
{
CloseTransientFile(fd);
@@ -1718,7 +1728,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
/* restore running xacts information */
sz = sizeof(TransactionId) * ondisk.builder.running.xcnt_space;
ondisk.builder.running.xip = MemoryContextAllocZero(builder->context, sz);
+ pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE);
readBytes = read(fd, ondisk.builder.running.xip, sz);
+ pgstat_report_wait_end();
if (readBytes != sz)
{
CloseTransientFile(fd);
@@ -1732,7 +1744,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
/* restore committed xacts information */
sz = sizeof(TransactionId) * ondisk.builder.committed.xcnt;
ondisk.builder.committed.xip = MemoryContextAllocZero(builder->context, sz);
+ pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE);
readBytes = read(fd, ondisk.builder.committed.xip, sz);
+ pgstat_report_wait_end();
if (readBytes != sz)
{
CloseTransientFile(fd);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 10d69d0..d4ebcd9 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -43,6 +43,7 @@
#include "access/xlog_internal.h"
#include "common/string.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "replication/slot.h"
#include "storage/fd.h"
#include "storage/proc.h"
@@ -1100,10 +1101,12 @@ SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel)
SnapBuildOnDiskChecksummedSize);
FIN_CRC32C(cp.checksum);
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_REPLSLOT);
if ((write(fd, &cp, sizeof(cp))) != sizeof(cp))
{
int save_errno = errno;
+ pgstat_report_wait_end();
CloseTransientFile(fd);
errno = save_errno;
ereport(elevel,
@@ -1112,8 +1115,10 @@ SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel)
tmppath)));
return;
}
+ pgstat_report_wait_end();
/* fsync the temporary file */
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_SAVE_REPLSLOT);
if (pg_fsync(fd) != 0)
{
int save_errno = errno;
@@ -1126,6 +1131,7 @@ SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel)
tmppath)));
return;
}
+ pgstat_report_wait_end();
CloseTransientFile(fd);
@@ -1202,6 +1208,7 @@ RestoreSlotFromDisk(const char *name)
* Sync state file before we're reading from it. We might have crashed
* while it wasn't synced yet and we shouldn't continue on that basis.
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_RESTORE_REPLSLOT);
if (pg_fsync(fd) != 0)
{
CloseTransientFile(fd);
@@ -1210,6 +1217,7 @@ RestoreSlotFromDisk(const char *name)
errmsg("could not fsync file \"%s\": %m",
path)));
}
+ pgstat_report_wait_end();
/* Also sync the parent directory */
START_CRIT_SECTION();
@@ -1217,7 +1225,9 @@ RestoreSlotFromDisk(const char *name)
END_CRIT_SECTION();
/* read part of statefile that's guaranteed to be version independent */
+ pgstat_report_wait_start(WAIT_EVENT_READ_RESTORE_REPLSLOT);
readBytes = read(fd, &cp, ReplicationSlotOnDiskConstantSize);
+ pgstat_report_wait_end();
if (readBytes != ReplicationSlotOnDiskConstantSize)
{
int saved_errno = errno;
@@ -1253,9 +1263,11 @@ RestoreSlotFromDisk(const char *name)
path, cp.length)));
/* Now that we know the size, read the entire file */
+ pgstat_report_wait_start(WAIT_EVENT_READ_RESTORE_REPLSLOT);
readBytes = read(fd,
(char *) &cp + ReplicationSlotOnDiskConstantSize,
cp.length);
+ pgstat_report_wait_end();
if (readBytes != cp.length)
{
int saved_errno = errno;
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index f3082c3..db0d458 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -496,7 +496,9 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
char rbuf[BLCKSZ];
int nread;
+ pgstat_report_wait_start(WAIT_EVENT_READ_TIMELINE_HISTORY_WALSENDER);
nread = read(fd, rbuf, sizeof(rbuf));
+ pgstat_report_wait_end();
if (nread <= 0)
ereport(ERROR,
(errcode_for_file_access(),
@@ -2137,7 +2139,9 @@ retry:
else
segbytes = nbytes;
+ pgstat_report_wait_start(WAIT_EVENT_READ_XLOG);
readbytes = read(sendFile, p, segbytes);
+ pgstat_report_wait_end();
if (readbytes <= 0)
{
ereport(ERROR,
diff --git a/src/backend/storage/file/buffile.c b/src/backend/storage/file/buffile.c
index 7ebd636..971cfd1 100644
--- a/src/backend/storage/file/buffile.c
+++ b/src/backend/storage/file/buffile.c
@@ -37,6 +37,7 @@
#include "postgres.h"
#include "executor/instrument.h"
+#include "pgstat.h"
#include "storage/fd.h"
#include "storage/buffile.h"
#include "storage/buf_internals.h"
@@ -254,7 +255,10 @@ BufFileLoadBuffer(BufFile *file)
/*
* Read whatever we can get, up to a full bufferload.
*/
- file->nbytes = FileRead(thisfile, file->buffer, sizeof(file->buffer));
+ file->nbytes = FileRead(thisfile,
+ file->buffer,
+ sizeof(file->buffer),
+ WAIT_EVENT_READ_BUFFILE);
if (file->nbytes < 0)
file->nbytes = 0;
file->offsets[file->curFile] += file->nbytes;
@@ -317,7 +321,10 @@ BufFileDumpBuffer(BufFile *file)
return; /* seek failed, give up */
file->offsets[file->curFile] = file->curOffset;
}
- bytestowrite = FileWrite(thisfile, file->buffer + wpos, bytestowrite);
+ bytestowrite = FileWrite(thisfile,
+ file->buffer + wpos,
+ bytestowrite,
+ WAIT_EVENT_WRITE_BUFFILE);
if (bytestowrite <= 0)
return; /* failed to write */
file->offsets[file->curFile] += bytestowrite;
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 101da47..2eda42d 100644
--- a/src/backend/storage/file/copydir.c
+++ b/src/backend/storage/file/copydir.c
@@ -25,7 +25,7 @@
#include "storage/copydir.h"
#include "storage/fd.h"
#include "miscadmin.h"
-
+#include "pgstat.h"
/*
* copydir: copy a directory
@@ -169,7 +169,9 @@ copy_file(char *fromfile, char *tofile)
/* If we got a cancel signal during the copy of the file, quit */
CHECK_FOR_INTERRUPTS();
+ pgstat_report_wait_start(WAIT_EVENT_READ_COPY_FILE);
nbytes = read(srcfd, buffer, COPY_BUF_SIZE);
+ pgstat_report_wait_end();
if (nbytes < 0)
ereport(ERROR,
(errcode_for_file_access(),
@@ -177,8 +179,10 @@ copy_file(char *fromfile, char *tofile)
if (nbytes == 0)
break;
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_COPY_FILE);
if ((int) write(dstfd, buffer, nbytes) != nbytes)
{
+ pgstat_report_wait_end();
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
@@ -186,6 +190,7 @@ copy_file(char *fromfile, char *tofile)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tofile)));
}
+ pgstat_report_wait_end();
/*
* We fsync the files later but first flush them to avoid spamming the
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index ce4bd0f..431fc48 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -1512,7 +1512,7 @@ FileClose(File file)
* to read into.
*/
int
-FilePrefetch(File file, off_t offset, int amount)
+FilePrefetch(File file, off_t offset, int amount, uint32 wait_event_info)
{
#if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_WILLNEED)
int returnCode;
@@ -1527,8 +1527,10 @@ FilePrefetch(File file, off_t offset, int amount)
if (returnCode < 0)
return returnCode;
+ pgstat_report_wait_start(wait_event_info);
returnCode = posix_fadvise(VfdCache[file].fd, offset, amount,
POSIX_FADV_WILLNEED);
+ pgstat_report_wait_end();
return returnCode;
#else
@@ -1538,7 +1540,7 @@ FilePrefetch(File file, off_t offset, int amount)
}
void
-FileWriteback(File file, off_t offset, off_t nbytes)
+FileWriteback(File file, off_t offset, off_t nbytes, uint32 wait_event_info)
{
int returnCode;
@@ -1559,11 +1561,13 @@ FileWriteback(File file, off_t offset, off_t nbytes)
if (returnCode < 0)
return;
+ pgstat_report_wait_start(wait_event_info);
pg_flush_data(VfdCache[file].fd, offset, nbytes);
+ pgstat_report_wait_end();
}
int
-FileRead(File file, char *buffer, int amount)
+FileRead(File file, char *buffer, int amount, uint32 wait_event_info)
{
int returnCode;
@@ -1578,6 +1582,7 @@ FileRead(File file, char *buffer, int amount)
if (returnCode < 0)
return returnCode;
+ pgstat_report_wait_start(wait_event_info);
retry:
returnCode = read(VfdCache[file].fd, buffer, amount);
@@ -1613,12 +1618,13 @@ retry:
/* Trouble, so assume we don't know the file position anymore */
VfdCache[file].seekPos = FileUnknownPos;
}
+ pgstat_report_wait_end();
return returnCode;
}
int
-FileWrite(File file, char *buffer, int amount)
+FileWrite(File file, char *buffer, int amount, uint32 wait_event_info)
{
int returnCode;
@@ -1658,6 +1664,7 @@ FileWrite(File file, char *buffer, int amount)
}
}
+ pgstat_report_wait_start(wait_event_info);
retry:
errno = 0;
returnCode = write(VfdCache[file].fd, buffer, amount);
@@ -1708,12 +1715,13 @@ retry:
/* Trouble, so assume we don't know the file position anymore */
VfdCache[file].seekPos = FileUnknownPos;
}
+ pgstat_report_wait_end();
return returnCode;
}
int
-FileSync(File file)
+FileSync(File file, uint32 wait_event_info)
{
int returnCode;
@@ -1726,7 +1734,11 @@ FileSync(File file)
if (returnCode < 0)
return returnCode;
- return pg_fsync(VfdCache[file].fd);
+ pgstat_report_wait_start(wait_event_info);
+ returnCode = pg_fsync(VfdCache[file].fd);
+ pgstat_report_wait_end();
+
+ return returnCode;
}
off_t
@@ -1810,7 +1822,7 @@ FileTell(File file)
#endif
int
-FileTruncate(File file, off_t offset)
+FileTruncate(File file, off_t offset, uint32 wait_event_info)
{
int returnCode;
@@ -1823,7 +1835,9 @@ FileTruncate(File file, off_t offset)
if (returnCode < 0)
return returnCode;
+ pgstat_report_wait_start(wait_event_info);
returnCode = ftruncate(VfdCache[file].fd, offset);
+ pgstat_report_wait_end();
if (returnCode == 0 && VfdCache[file].fileSize > offset)
{
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 1d9384e..9f57d7f 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -28,6 +28,7 @@
#include "miscadmin.h"
#include "access/xlog.h"
#include "catalog/catalog.h"
+#include "pgstat.h"
#include "portability/instr_time.h"
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
@@ -536,7 +537,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
errmsg("could not seek to block %u in file \"%s\": %m",
blocknum, FilePathName(v->mdfd_vfd))));
- if ((nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ)) != BLCKSZ)
+ if ((nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, WAIT_EVENT_EXTEND_DATA_BLOCK)) != BLCKSZ)
{
if (nbytes < 0)
ereport(ERROR,
@@ -667,7 +668,7 @@ mdprefetch(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum)
Assert(seekpos < (off_t) BLCKSZ * RELSEG_SIZE);
- (void) FilePrefetch(v->mdfd_vfd, seekpos, BLCKSZ);
+ (void) FilePrefetch(v->mdfd_vfd, seekpos, BLCKSZ, WAIT_EVENT_PREFETCH_DATA_BLOCK);
#endif /* USE_PREFETCH */
}
@@ -716,7 +717,7 @@ mdwriteback(SMgrRelation reln, ForkNumber forknum,
seekpos = (off_t) BLCKSZ *(blocknum % ((BlockNumber) RELSEG_SIZE));
- FileWriteback(v->mdfd_vfd, seekpos, (off_t) BLCKSZ * nflush);
+ FileWriteback(v->mdfd_vfd, seekpos, (off_t) BLCKSZ * nflush, WAIT_EVENT_FLUSH_DATA_BLOCK);
nblocks -= nflush;
blocknum += nflush;
@@ -753,7 +754,7 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
errmsg("could not seek to block %u in file \"%s\": %m",
blocknum, FilePathName(v->mdfd_vfd))));
- nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ);
+ nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ, WAIT_EVENT_READ_DATA_BLOCK);
TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
@@ -829,7 +830,7 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
errmsg("could not seek to block %u in file \"%s\": %m",
blocknum, FilePathName(v->mdfd_vfd))));
- nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ);
+ nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, WAIT_EVENT_WRITE_DATA_BLOCK);
TRACE_POSTGRESQL_SMGR_MD_WRITE_DONE(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
@@ -967,7 +968,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks)
* This segment is no longer active. We truncate the file, but do
* not delete it, for reasons explained in the header comments.
*/
- if (FileTruncate(v->mdfd_vfd, 0) < 0)
+ if (FileTruncate(v->mdfd_vfd, 0, WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS) < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not truncate file \"%s\": %m",
@@ -993,7 +994,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks)
*/
BlockNumber lastsegblocks = nblocks - priorblocks;
- if (FileTruncate(v->mdfd_vfd, (off_t) lastsegblocks * BLCKSZ) < 0)
+ if (FileTruncate(v->mdfd_vfd, (off_t) lastsegblocks * BLCKSZ, WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS) < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not truncate file \"%s\" to %u blocks: %m",
@@ -1037,7 +1038,7 @@ mdimmedsync(SMgrRelation reln, ForkNumber forknum)
{
MdfdVec *v = &reln->md_seg_fds[forknum][segno - 1];
- if (FileSync(v->mdfd_vfd) < 0)
+ if (FileSync(v->mdfd_vfd, WAIT_EVENT_SYNC_IMMED_RELATION) < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m",
@@ -1232,7 +1233,7 @@ mdsync(void)
INSTR_TIME_SET_CURRENT(sync_start);
if (seg != NULL &&
- FileSync(seg->mdfd_vfd) >= 0)
+ FileSync(seg->mdfd_vfd, WAIT_EVENT_SYNC_RELATION) >= 0)
{
/* Success; update statistics about sync timing */
INSTR_TIME_SET_CURRENT(sync_end);
@@ -1443,7 +1444,7 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
ereport(DEBUG1,
(errmsg("could not forward fsync request because request queue is full")));
- if (FileSync(seg->mdfd_vfd) < 0)
+ if (FileSync(seg->mdfd_vfd, WAIT_EVENT_SYNC_DATA_BLOCK) < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m",
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index c9d6e44..ee5ded9 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -50,6 +50,7 @@
#include "catalog/pg_tablespace.h"
#include "catalog/storage.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "storage/fd.h"
#include "storage/lwlock.h"
#include "utils/inval.h"
@@ -658,11 +659,15 @@ load_relmap_file(bool shared)
* look, the sinval signaling mechanism will make us re-read it before we
* are able to access any relation that's affected by the change.
*/
+ pgstat_report_wait_start(WAIT_EVENT_READ_LOAD_RELMAP_FILE);
if (read(fd, map, sizeof(RelMapFile)) != sizeof(RelMapFile))
+ {
ereport(FATAL,
(errcode_for_file_access(),
errmsg("could not read relation mapping file \"%s\": %m",
mapfilename)));
+ }
+ pgstat_report_wait_end();
CloseTransientFile(fd);
@@ -774,6 +779,7 @@ write_relmap_file(bool shared, RelMapFile *newmap,
}
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_RELMAP_FILE);
if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
{
/* if write didn't set errno, assume problem is no disk space */
@@ -784,6 +790,7 @@ write_relmap_file(bool shared, RelMapFile *newmap,
errmsg("could not write to relation mapping file \"%s\": %m",
mapfilename)));
}
+ pgstat_report_wait_end();
/*
* We choose to fsync the data to disk before considering the task done.
@@ -791,11 +798,13 @@ write_relmap_file(bool shared, RelMapFile *newmap,
* issue, but it would complicate checkpointing --- see notes for
* CheckPointRelationMap.
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_WRITE_RELMAP_FILE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync relation mapping file \"%s\": %m",
mapfilename)));
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
ereport(ERROR,
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index e984e79..e58cfa5 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -36,6 +36,7 @@
#include "libpq/libpq.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "postmaster/autovacuum.h"
#include "postmaster/postmaster.h"
#include "storage/fd.h"
@@ -857,11 +858,13 @@ CreateLockFile(const char *filename, bool amPostmaster,
errmsg("could not open lock file \"%s\": %m",
filename)));
}
+ pgstat_report_wait_start(WAIT_EVENT_READ_CREATE_LOCK_FILE);
if ((len = read(fd, buffer, sizeof(buffer) - 1)) < 0)
ereport(FATAL,
(errcode_for_file_access(),
errmsg("could not read lock file \"%s\": %m",
filename)));
+ pgstat_report_wait_end();
close(fd);
if (len == 0)
@@ -1010,6 +1013,7 @@ CreateLockFile(const char *filename, bool amPostmaster,
strlcat(buffer, "\n", sizeof(buffer));
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_CREATE_LOCK_FILE);
if (write(fd, buffer, strlen(buffer)) != strlen(buffer))
{
int save_errno = errno;
@@ -1022,6 +1026,9 @@ CreateLockFile(const char *filename, bool amPostmaster,
(errcode_for_file_access(),
errmsg("could not write lock file \"%s\": %m", filename)));
}
+ pgstat_report_wait_end();
+
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_CREATE_LOCK_FILE);
if (pg_fsync(fd) != 0)
{
int save_errno = errno;
@@ -1033,6 +1040,7 @@ CreateLockFile(const char *filename, bool amPostmaster,
(errcode_for_file_access(),
errmsg("could not write lock file \"%s\": %m", filename)));
}
+ pgstat_report_wait_end();
if (close(fd) != 0)
{
int save_errno = errno;
@@ -1165,7 +1173,9 @@ AddToDataDirLockFile(int target_line, const char *str)
DIRECTORY_LOCK_FILE)));
return;
}
+ pgstat_report_wait_start(WAIT_EVENT_READ_ADDTODATEDIR_LOCK_FILE);
len = read(fd, srcbuffer, sizeof(srcbuffer) - 1);
+ pgstat_report_wait_end();
if (len < 0)
{
ereport(LOG,
@@ -1218,6 +1228,7 @@ AddToDataDirLockFile(int target_line, const char *str)
*/
len = strlen(destbuffer);
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_ADDTODATEDIR_LOCK_FILE);
if (lseek(fd, (off_t) 0, SEEK_SET) != 0 ||
(int) write(fd, destbuffer, len) != len)
{
@@ -1231,6 +1242,8 @@ AddToDataDirLockFile(int target_line, const char *str)
close(fd);
return;
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_ADDTODATEDIR_LOCK_FILE);
if (pg_fsync(fd) != 0)
{
ereport(LOG,
@@ -1238,6 +1251,7 @@ AddToDataDirLockFile(int target_line, const char *str)
errmsg("could not write to file \"%s\": %m",
DIRECTORY_LOCK_FILE)));
}
+ pgstat_report_wait_end();
if (close(fd) != 0)
{
ereport(LOG,
@@ -1294,7 +1308,9 @@ RecheckDataDirLockFile(void)
return true;
}
}
+ pgstat_report_wait_start(WAIT_EVENT_READ_RECHECKDATADIR_LOCK_FILE);
len = read(fd, buffer, sizeof(buffer) - 1);
+ pgstat_report_wait_end();
if (len < 0)
{
ereport(LOG,
diff --git a/src/common/controldata_utils.c b/src/common/controldata_utils.c
index f1a097a..0d48469 100644
--- a/src/common/controldata_utils.c
+++ b/src/common/controldata_utils.c
@@ -16,6 +16,7 @@
#ifndef FRONTEND
#include "postgres.h"
+#include "pgstat.h"
#else
#include "postgres_fe.h"
#endif
@@ -64,6 +65,9 @@ get_controlfile(const char *DataDir, const char *progname, bool *crc_ok_p)
}
#endif
+#ifndef FRONTEND
+ pgstat_report_wait_start(WAIT_EVENT_READ_CONTROL_FILE);
+#endif
if (read(fd, ControlFile, sizeof(ControlFileData)) != sizeof(ControlFileData))
#ifndef FRONTEND
ereport(ERROR,
@@ -76,7 +80,9 @@ get_controlfile(const char *DataDir, const char *progname, bool *crc_ok_p)
exit(EXIT_FAILURE);
}
#endif
-
+#ifndef FRONTEND
+ pgstat_report_wait_end();
+#endif
close(fd);
/* Check the CRC. */
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index de8225b..5e599ed 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -723,6 +723,7 @@ typedef enum BackendState
#define PG_WAIT_EXTENSION 0x07000000U
#define PG_WAIT_IPC 0x08000000U
#define PG_WAIT_TIMEOUT 0x09000000U
+#define PG_WAIT_IO 0x0A000000U
/* ----------
* Wait Events - Activity
@@ -804,6 +805,98 @@ typedef enum
} WaitEventTimeout;
/* ----------
+ * Wait Events - IO
+ *
+ * Use this category when a process is waiting for a IO.
+ * ----------
+ */
+typedef enum
+{
+ WAIT_EVENT_READ_DATA_BLOCK,
+ WAIT_EVENT_WRITE_DATA_BLOCK,
+ WAIT_EVENT_SYNC_DATA_BLOCK,
+ WAIT_EVENT_EXTEND_DATA_BLOCK,
+ WAIT_EVENT_FLUSH_DATA_BLOCK,
+ WAIT_EVENT_PREFETCH_DATA_BLOCK,
+ WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK,
+ WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK,
+ WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS,
+ WAIT_EVENT_SYNC_RELATION,
+ WAIT_EVENT_SYNC_IMMED_RELATION,
+ WAIT_EVENT_READ_BUFFILE,
+ WAIT_EVENT_WRITE_BUFFILE,
+ /* Wait event for XLOG */
+ WAIT_EVENT_READ_XLOG,
+ WAIT_EVENT_READ_COPY_XLOG,
+ WAIT_EVENT_WRITE_XLOG,
+ WAIT_EVENT_WRITE_INIT_XLOG_FILE,
+ WAIT_EVENT_WRITE_COPY_XLOG_FILE,
+ WAIT_EVENT_WRITE_BOOTSTRAP_XLOG,
+ WAIT_EVENT_SYNC_INIT_XLOG_FILE,
+ WAIT_EVENT_SYNC_COPY_XLOG_FILE,
+ WAIT_EVENT_SYNC_BOOTSTRAP_XLOG,
+ WAIT_EVENT_SYNC_ASSIGN_XLOG_SYNC_METHOD,
+ /* Wait event for CONTROL_FILE */
+ WAIT_EVENT_WRITE_CONTROL_FILE,
+ WAIT_EVENT_WRITE_UPDATE_CONTROL_FILE,
+ WAIT_EVENT_SYNC_WRITE_CONTROL_FILE,
+ WAIT_EVENT_SYNC_UPDATE_CONTROL_FILE,
+ WAIT_EVENT_READ_CONTROL_FILE,
+ /* Wait event for REORDER BUFFER */
+ WAIT_EVENT_READ_REORDER_BUFFER,
+ WAIT_EVENT_WRITE_REORDER_BUFFER,
+ /* Wait event for LOGICAL MAPPING */
+ WAIT_EVENT_READ_APPLY_LOGICAL_MAPPING,
+ WAIT_EVENT_WRITE_LOGICAL_MAPPING_REWRITE,
+ WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE,
+ WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE_HEAP,
+ WAIT_EVENT_TRUNCATE_LOGICAL_MAPPING_REWRITE,
+ /* Wait event for SNAPBUILD */
+ WAIT_EVENT_WRITE_SNAPBUILD_SERIALIZE,
+ WAIT_EVENT_READ_SNAPBUILD_RESTORE,
+ WAIT_EVENT_SYNC_SNAPBUILD_SERIALIZE,
+ /* Wait event for SNRU */
+ WAIT_EVENT_READ_SLRU_PAGE,
+ WAIT_EVENT_WRITE_SLRU_PAGE,
+ WAIT_EVENT_SYNC_SLRU_FLUSH,
+ WAIT_EVENT_SYNC_SLRU_WRITE_PAGE,
+ /* Wait event for TIMELINE HISTORY */
+ WAIT_EVENT_READ_TIMELINE_HISTORY_WALSENDER,
+ WAIT_EVENT_READ_TIMELINE_HISTORY_WRITE,
+ WAIT_EVENT_WRITE_TIMELINE_HISTORY,
+ WAIT_EVENT_WRITE_TIMELINE_HISTORY_FILE,
+ WAIT_EVENT_SYNC_TIMELINE_HISTORY_WRITE,
+ WAIT_EVENT_SYNC_TIMELINE_HISTORY_FILE,
+ /* Wait event for TWOPHASE FILE */
+ WAIT_EVENT_READ_TWOPHASE_FILE,
+ WAIT_EVENT_WRITE_RECREATE_TWOPHASE_FILE,
+ WAIT_EVENT_SYNC_RECREATE_TWOPHASE_FILE,
+ /* Wait event for SYSLOGGER */
+ WAIT_EVENT_READ_SYSLOGGER_FILE,
+ WAIT_EVENT_WRITE_SYSLOGGER_FILE,
+ /* Wait event for REPLSLOT */
+ WAIT_EVENT_READ_RESTORE_REPLSLOT,
+ WAIT_EVENT_WRITE_REPLSLOT,
+ WAIT_EVENT_SYNC_RESTORE_REPLSLOT,
+ WAIT_EVENT_SYNC_SAVE_REPLSLOT,
+ /* Wait event for copydir */
+ WAIT_EVENT_READ_COPY_FILE,
+ WAIT_EVENT_WRITE_COPY_FILE,
+ /* Wait event RELMAP FILE */
+ WAIT_EVENT_READ_LOAD_RELMAP_FILE,
+ WAIT_EVENT_WRITE_RELMAP_FILE,
+ WAIT_EVENT_SYNC_WRITE_RELMAP_FILE,
+ /* Wait event for LOCK FILE */
+ WAIT_EVENT_READ_CREATE_LOCK_FILE,
+ WAIT_EVENT_READ_ADDTODATEDIR_LOCK_FILE,
+ WAIT_EVENT_READ_RECHECKDATADIR_LOCK_FILE,
+ WAIT_EVENT_WRITE_CREATE_LOCK_FILE,
+ WAIT_EVENT_WRITE_ADDTODATEDIR_LOCK_FILE,
+ WAIT_EVENT_SYNC_ADDTODATEDIR_LOCK_FILE,
+ WAIT_EVENT_SYNC_CREATE_LOCK_FILE
+} WaitEventIO;
+
+/* ----------
* Command type for progress reporting purposes
* ----------
*/
diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index 1a43a2c..ac37502 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -68,13 +68,13 @@ extern int max_safe_fds;
extern File PathNameOpenFile(FileName fileName, int fileFlags, int fileMode);
extern File OpenTemporaryFile(bool interXact);
extern void FileClose(File file);
-extern int FilePrefetch(File file, off_t offset, int amount);
-extern int FileRead(File file, char *buffer, int amount);
-extern int FileWrite(File file, char *buffer, int amount);
-extern int FileSync(File file);
+extern int FilePrefetch(File file, off_t offset, int amount, uint32 wait_event_info);
+extern int FileRead(File file, char *buffer, int amount, uint32 wait_event_info);
+extern int FileWrite(File file, char *buffer, int amount, uint32 wait_event_info);
+extern int FileSync(File file, uint32 wait_event_info);
extern off_t FileSeek(File file, off_t offset, int whence);
-extern int FileTruncate(File file, off_t offset);
-extern void FileWriteback(File file, off_t offset, off_t nbytes);
+extern int FileTruncate(File file, off_t offset, uint32 wait_event_info);
+extern void FileWriteback(File file, off_t offset, off_t nbytes, uint32 wait_event_info);
extern char *FilePathName(File file);
extern int FileGetRawDesc(File file);
extern int FileGetRawFlags(File file);
On Mon, Jan 30, 2017 at 10:01 PM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
Attached is the patch, which extend the existing wait event infrastructure
to implement the wait events for the disk I/O. Basically pg_stat_activity's
wait event information to show data about disk I/O as well as IPC primitives.Implementation details:
- Added PG_WAIT_IO to pgstat.h and a new enum WaitEventIO
- Added a wait_event_info argument to FileRead, FileWrite, FilePrefetch,
FileWriteback, FileSync, and FileTruncate. Set this wait event just before
performing the file system operation and clear it just after.
- Pass down an appropriate wait event from caller of any of those
functions.
- Also set and clear a wait event around standalone calls to read(),
write(), fsync() in other parts of the system.
- Added documentation for all newly added wait event.
Looks neat, those are unlikely to overlap with other wait events.
Open issue:
- Might missed few standalone calls to read(), write(), etc which need
to pass the wait_event_info.
It may be an idea to use LD_PRELOAD with custom versions of read(),
write() and fsync(), and look at the paths where no flags are set in
MyProc->wait_event_info, and log information when that happens.
Thanks to my colleague Robert Haas for his help in design.
Please let me know your thought, and thanks for reading.
Did you consider a wrapper of the type pg_read_event() or
pg_write_event(), etc.?
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jan 31, 2017 at 8:54 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Mon, Jan 30, 2017 at 10:01 PM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:Attached is the patch, which extend the existing wait event
infrastructure
to implement the wait events for the disk I/O. Basically
pg_stat_activity's
wait event information to show data about disk I/O as well as IPC
primitives.
Implementation details:
- Added PG_WAIT_IO to pgstat.h and a new enum WaitEventIO
- Added a wait_event_info argument to FileRead, FileWrite, FilePrefetch,
FileWriteback, FileSync, and FileTruncate. Set this wait event justbefore
performing the file system operation and clear it just after.
- Pass down an appropriate wait event from caller of any of those
functions.
- Also set and clear a wait event around standalone calls to read(),
write(), fsync() in other parts of the system.
- Added documentation for all newly added wait event.Looks neat, those are unlikely to overlap with other wait events.
Thanks.
Open issue:
- Might missed few standalone calls to read(), write(), etc which need
to pass the wait_event_info.It may be an idea to use LD_PRELOAD with custom versions of read(),
write() and fsync(), and look at the paths where no flags are set in
MyProc->wait_event_info, and log information when that happens.
Yes, may be I will try this.
Thanks to my colleague Robert Haas for his help in design.
Please let me know your thought, and thanks for reading.Did you consider a wrapper of the type pg_read_event() or
pg_write_event(), etc.?
I thought on that, but eventually stick to this approach as it looks
more neat and uniform with other wait event implementation.
--
Michael
Thanks,
Rushabh Lathia
www.EnterpriseDB.com
My colleague Rahila reported compilation issue with
the patch. Issue was only coming with we do the clean
build on the branch.
Fixed the same into latest version of patch.
Thanks,
On Tue, Jan 31, 2017 at 11:09 AM, Rushabh Lathia <rushabh.lathia@gmail.com>
wrote:
On Tue, Jan 31, 2017 at 8:54 AM, Michael Paquier <
michael.paquier@gmail.com> wrote:On Mon, Jan 30, 2017 at 10:01 PM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:Attached is the patch, which extend the existing wait event
infrastructure
to implement the wait events for the disk I/O. Basically
pg_stat_activity's
wait event information to show data about disk I/O as well as IPC
primitives.
Implementation details:
- Added PG_WAIT_IO to pgstat.h and a new enum WaitEventIO
- Added a wait_event_info argument to FileRead, FileWrite, FilePrefetch,
FileWriteback, FileSync, and FileTruncate. Set this wait event justbefore
performing the file system operation and clear it just after.
- Pass down an appropriate wait event from caller of any of those
functions.
- Also set and clear a wait event around standalone calls to read(),
write(), fsync() in other parts of the system.
- Added documentation for all newly added wait event.Looks neat, those are unlikely to overlap with other wait events.
Thanks.
Open issue:
- Might missed few standalone calls to read(), write(), etc which need
to pass the wait_event_info.It may be an idea to use LD_PRELOAD with custom versions of read(),
write() and fsync(), and look at the paths where no flags are set in
MyProc->wait_event_info, and log information when that happens.Yes, may be I will try this.
Thanks to my colleague Robert Haas for his help in design.
Please let me know your thought, and thanks for reading.Did you consider a wrapper of the type pg_read_event() or
pg_write_event(), etc.?I thought on that, but eventually stick to this approach as it looks
more neat and uniform with other wait event implementation.--
MichaelThanks,
Rushabh Lathia
www.EnterpriseDB.com
--
Rushabh Lathia
Attachments:
wait_event_for_disk_IO_v1.patchapplication/x-download; name=wait_event_for_disk_IO_v1.patchDownload
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index fad5cb0..3b7b0fd 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -716,6 +716,12 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
point.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>IO</>: The server process is waiting for a IO to complete.
+ <literal>wait_event</> will identify the specific wait point.
+ </para>
+ </listitem>
</itemizedlist>
</entry>
</row>
@@ -1264,6 +1270,272 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry><literal>RecoveryApplyDelay</></entry>
<entry>Waiting to apply WAL at recovery because it is delayed.</entry>
</row>
+ <row>
+ <entry morerows="65"><literal>IO</></entry>
+ <entry><literal>ReadDataBlock</></entry>
+ <entry>Waiting during relation data block read.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteDataBlock</></entry>
+ <entry>Waiting during relation data block write.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncDataBlock</></entry>
+ <entry>Waiting during relation data block sync.</entry>
+ </row>
+ <row>
+ <entry><literal>ExtendDataBlock</></entry>
+ <entry>Waiting during add a block to the relation.</entry>
+ </row>
+ <row>
+ <entry><literal>FlushDataBlock</></entry>
+ <entry>Waiting during write pages back to storage.</entry>
+ </row>
+ <row>
+ <entry><literal>PrefetchDataBlock</></entry>
+ <entry>Waiting during asynchronous read of the specified block of a relation.</entry>
+ </row>
+ <row>
+ <entry><literal>TruncateDataBlock</></entry>
+ <entry>Waiting during truncate relation to specified number of blocks.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncRelation</></entry>
+ <entry>Waiting during sync writes to stable storage.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncImmedRelation</></entry>
+ <entry>Waiting during immediate sync a relation to stable storage.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteRewriteDataBlock</></entry>
+ <entry>Waiting to write data block during rewrite heap.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncRewriteDataBlock</></entry>
+ <entry>Waiting to sync data block during rewrite heap.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteBuffile</></entry>
+ <entry>Waiting during buffile read operation.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteBuffile</></entry>
+ <entry>Waiting during buffile write operation.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadXLog</></entry>
+ <entry>Waiting during read the XLOG page.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadCopyXLog</></entry>
+ <entry>Wait to read the XLOG page during create a new XLOG file segment by copying a pre-existing one.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteXLog</></entry>
+ <entry>Waiting during write the XLOG page.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteInitXLogFile</></entry>
+ <entry>Waiting to write the XLOG page during XLOG file initialization.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteCopyXLogFile</></entry>
+ <entry>Waiting to write the XLOG page during create a new XLOG file segment by copying a pre-existing one.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteBootstrapXLog</></entry>
+ <entry>Waiting to write the XLOG page during bootstrap.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncInitXLogFile</></entry>
+ <entry>Waiting to sync the XLOG page during XLOG file initialization.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncCopyXLogFile</></entry>
+ <entry>Waiting to sync the XLOG page during create a new XLOG file segment by copying a pre-existing one.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncBootstrapXLog</></entry>
+ <entry>Waiting to sync the XLOG page during bootstrap.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncAssignXLogMethod</></entry>
+ <entry>Waiting to assign xlog sync method.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteControlFile</></entry>
+ <entry>Waiting to write the control file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteUpdateControlFile</></entry>
+ <entry>Waiting to write the control file during update control file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadControlFile</></entry>
+ <entry>Waiting to read the control file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncWriteControlFile</></entry>
+ <entry>Waiting to sync the control file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncUpdateControlFile</></entry>
+ <entry>Waiting to sync the control file during update control file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadApplyLogicalMapping</></entry>
+ <entry>Waiting to read logical mapping during apply a single mapping file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteLogicalMappingRewrite</></entry>
+ <entry>Waiting to write logical mapping during xlog logical rewrite.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncLogicalMappingRewrite</></entry>
+ <entry>Waiting to sync logical mapping during xlog logical rewrite.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncLogicalMappingRewriteHeap</></entry>
+ <entry>Waiting to sync logical mapping during a checkpoint for logical rewrite mappings.</entry>
+ </row>
+ <row>
+ <entry><literal>TruncateLogicalMappingRewrite</></entry>
+ <entry>Waiting to truncate logical mapping during xlog logical rewrite.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteSnapbuildSerialize</></entry>
+ <entry>Waiting to write snapshot during serialize the snapshot.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadSnapbuildSerialize</></entry>
+ <entry>Waiting to read snapshot during serialize the snapshot.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncSnapbuildSerialize</></entry>
+ <entry>Waiting to sync snapshot during serialize the snapshot.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadSLRUPage</></entry>
+ <entry>Waiting to read page during physical read of a page into a buffer slot.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteSLRUPage</></entry>
+ <entry>Waiting to write page during physical write of a page from a buffer slot.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncSLRUWritePage</></entry>
+ <entry>Waiting to sync page during physical write of a page from a buffer slot.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncSLRUFlush</></entry>
+ <entry>Waiting to sync page during flush dirty pages to disk during checkpoint or database shutdown.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadTimelineHistoryWrite</></entry>
+ <entry>Waiting to read timeline history during write timeline history.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadTimelineHistoryWalsender</></entry>
+ <entry>Waiting to read timeline history during walsander timeline command.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteTimelineHistory</></entry>
+ <entry>Waiting to write timeline history.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteTimelineHistoryFile</></entry>
+ <entry>Waiting to write timeline history during a history file write for given timeline and contents.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncTimelineHistoryWrite</></entry>
+ <entry>Waiting to sync timeline history during write timeline history</entry>
+ </row>
+ <row>
+ <entry><literal>SyncTimelineHistoryFile</></entry>
+ <entry>Waiting to sync timeline history during a history file write for given timeline and contents.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadTwophaseFile</></entry>
+ <entry>Waiting to read two phase file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteRecreateTwophaseFile</></entry>
+ <entry>Waiting to write two phase file during recreate two phase file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncRecreateTwophaseFile</></entry>
+ <entry>Waiting to sync two phase file during recreate two phase file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadSysloggerFile</></entry>
+ <entry>Wait during read syslogger file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteSysloggerFile</></entry>
+ <entry>Wait during write syslogger file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadRestorREPLSlot</></entry>
+ <entry>Wait to read REPL slot during load a single slot from disk into memory.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteREPLSlot</></entry>
+ <entry>Wait to write REPL slot.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncRestoreREPLSlot</></entry>
+ <entry>Wait to sync REPL slot during load a single slot from disk into memory.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadCopyFile</></entry>
+ <entry>Waiting to read during copy file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteCopyFile</></entry>
+ <entry>Waiting to write during copy file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadLoadRELMAPFile</></entry>
+ <entry>Waiting to read RELMAP file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteLoadRELMAPFile</></entry>
+ <entry>Waiting to write RELMAP file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncLoadRELMAPFile</></entry>
+ <entry>Waiting to sync RELMAP file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadCreateLockFile</></entry>
+ <entry>Wait to read lock file during create lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadAddToDataDirLockFile</></entry>
+ <entry>Wait to read lock file during add a line in the data directory lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadRecheckDataDirLockFile</></entry>
+ <entry>Wait to read lock file during recheck that the data directory lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteCreateLockFile</></entry>
+ <entry>Wait to write lock file during create lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteAddToDataDirLockFile</></entry>
+ <entry>Wait to write lock file during add a line in the data directory lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncCreateLockFile</></entry>
+ <entry>Wait to sync lock file during create lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncAddToDataDirLockFile</></entry>
+ <entry>Wait to sync lock file during add a line in the data directory lock file.</entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index c7b283c..2456500 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -119,6 +119,8 @@
#include "lib/ilist.h"
+#include "pgstat.h"
+
#include "replication/logical.h"
#include "replication/slot.h"
@@ -916,7 +918,8 @@ logical_heap_rewrite_flush_mappings(RewriteState state)
* Note that we deviate from the usual WAL coding practices here,
* check the above "Logical rewrite support" comment for reasoning.
*/
- written = FileWrite(src->vfd, waldata_start, len);
+ written = FileWrite(src->vfd, waldata_start, len,
+ WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK);
if (written != len)
ereport(ERROR,
(errcode_for_file_access(),
@@ -957,7 +960,7 @@ logical_end_heap_rewrite(RewriteState state)
hash_seq_init(&seq_status, state->rs_logical_mappings);
while ((src = (RewriteMappingFile *) hash_seq_search(&seq_status)) != NULL)
{
- if (FileSync(src->vfd) != 0)
+ if (FileSync(src->vfd, WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", src->path)));
@@ -1141,11 +1144,13 @@ heap_xlog_logical_rewrite(XLogReaderState *r)
* Truncate all data that's not guaranteed to have been safely fsynced (by
* previous record or by the last checkpoint).
*/
+ pgstat_report_wait_start(WAIT_EVENT_TRUNCATE_LOGICAL_MAPPING_REWRITE);
if (ftruncate(fd, xlrec->offset) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not truncate file \"%s\" to %u: %m",
path, (uint32) xlrec->offset)));
+ pgstat_report_wait_end();
/* now seek to the position we want to write our data to */
if (lseek(fd, xlrec->offset, SEEK_SET) != xlrec->offset)
@@ -1159,20 +1164,24 @@ heap_xlog_logical_rewrite(XLogReaderState *r)
len = xlrec->num_mappings * sizeof(LogicalRewriteMappingData);
/* write out tail end of mapping file (again) */
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_LOGICAL_MAPPING_REWRITE);
if (write(fd, data, len) != len)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", path)));
+ pgstat_report_wait_end();
/*
* Now fsync all previously written data. We could improve things and only
* do this for the last write to a file, but the required bookkeeping
* doesn't seem worth the trouble.
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", path)));
+ pgstat_report_wait_end();
CloseTransientFile(fd);
}
@@ -1266,10 +1275,12 @@ CheckPointLogicalRewriteHeap(void)
* changed or have only been created since the checkpoint's start,
* but it's currently not deemed worth the effort.
*/
- else if (pg_fsync(fd) != 0)
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE_HEAP);
+ if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", path)));
+ pgstat_report_wait_end();
CloseTransientFile(fd);
}
}
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index a66ef5c..237e8eb 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -54,6 +54,7 @@
#include "access/slru.h"
#include "access/transam.h"
#include "access/xlog.h"
+#include "pgstat.h"
#include "storage/fd.h"
#include "storage/shmem.h"
#include "miscadmin.h"
@@ -675,13 +676,16 @@ SlruPhysicalReadPage(SlruCtl ctl, int pageno, int slotno)
}
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_READ_SLRU_PAGE);
if (read(fd, shared->page_buffer[slotno], BLCKSZ) != BLCKSZ)
{
+ pgstat_report_wait_end();
slru_errcause = SLRU_READ_FAILED;
slru_errno = errno;
CloseTransientFile(fd);
return false;
}
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
{
@@ -834,8 +838,10 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
}
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_SLRU_PAGE);
if (write(fd, shared->page_buffer[slotno], BLCKSZ) != BLCKSZ)
{
+ pgstat_report_wait_end();
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
@@ -845,6 +851,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
CloseTransientFile(fd);
return false;
}
+ pgstat_report_wait_end();
/*
* If not part of Flush, need to fsync now. We assume this happens
@@ -852,6 +859,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
*/
if (!fdata)
{
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_SLRU_WRITE_PAGE);
if (ctl->do_fsync && pg_fsync(fd))
{
slru_errcause = SLRU_FSYNC_FAILED;
@@ -859,6 +867,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
CloseTransientFile(fd);
return false;
}
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
{
@@ -1126,6 +1135,7 @@ SimpleLruFlush(SlruCtl ctl, bool allow_redirtied)
ok = true;
for (i = 0; i < fdata.num_files; i++)
{
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_SLRU_FLUSH);
if (ctl->do_fsync && pg_fsync(fdata.fd[i]))
{
slru_errcause = SLRU_FSYNC_FAILED;
@@ -1133,6 +1143,7 @@ SimpleLruFlush(SlruCtl ctl, bool allow_redirtied)
pageno = fdata.segno[i] * SLRU_PAGES_PER_SEGMENT;
ok = false;
}
+ pgstat_report_wait_end();
if (CloseTransientFile(fdata.fd[i]))
{
diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index c8240b1..052f77f 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -39,6 +39,7 @@
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "access/xlogdefs.h"
+#include "pgstat.h"
#include "storage/fd.h"
/*
@@ -339,7 +340,9 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
for (;;)
{
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_READ_TIMELINE_HISTORY_WRITE);
nbytes = (int) read(srcfd, buffer, sizeof(buffer));
+ pgstat_report_wait_end();
if (nbytes < 0 || errno != 0)
ereport(ERROR,
(errcode_for_file_access(),
@@ -347,6 +350,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
if (nbytes == 0)
break;
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_TIMELINE_HISTORY);
if ((int) write(fd, buffer, nbytes) != nbytes)
{
int save_errno = errno;
@@ -366,6 +370,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
}
CloseTransientFile(srcfd);
}
@@ -401,10 +406,12 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_TIMELINE_HISTORY_WRITE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
ereport(ERROR,
@@ -461,6 +468,7 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size)
errmsg("could not create file \"%s\": %m", tmppath)));
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_TIMELINE_HISTORY_FILE);
if ((int) write(fd, content, size) != size)
{
int save_errno = errno;
@@ -476,11 +484,14 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_TIMELINE_HISTORY_FILE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
ereport(ERROR,
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 50c70b2..4aabfba 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1201,8 +1201,10 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
*/
buf = (char *) palloc(stat.st_size);
+ pgstat_report_wait_start(WAIT_EVENT_READ_TWOPHASE_FILE);
if (read(fd, buf, stat.st_size) != stat.st_size)
{
+ pgstat_report_wait_end();
CloseTransientFile(fd);
if (give_warnings)
ereport(WARNING,
@@ -1213,6 +1215,7 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
return NULL;
}
+ pgstat_report_wait_end();
CloseTransientFile(fd);
hdr = (TwoPhaseFileHeader *) buf;
@@ -1543,8 +1546,10 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
path)));
/* Write content and CRC */
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_RECREATE_TWOPHASE_FILE);
if (write(fd, content, len) != len)
{
+ pgstat_report_wait_end();
CloseTransientFile(fd);
ereport(ERROR,
(errcode_for_file_access(),
@@ -1552,16 +1557,19 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
}
if (write(fd, &statefile_crc, sizeof(pg_crc32c)) != sizeof(pg_crc32c))
{
+ pgstat_report_wait_end();
CloseTransientFile(fd);
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not write two-phase state file: %m")));
}
+ pgstat_report_wait_end();
/*
* We must fsync the file because the end-of-replay checkpoint will not do
* so, there being no GXACT in shared memory yet to tell it to.
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_RECREATE_TWOPHASE_FILE);
if (pg_fsync(fd) != 0)
{
CloseTransientFile(fd);
@@ -1569,6 +1577,7 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
(errcode_for_file_access(),
errmsg("could not fsync two-phase state file: %m")));
}
+ pgstat_report_wait_end();
if (CloseTransientFile(fd) != 0)
ereport(ERROR,
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f23e108..1132b95 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2454,7 +2454,9 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
do
{
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_XLOG);
written = write(openLogFile, from, nleft);
+ pgstat_report_wait_end();
if (written <= 0)
{
if (errno == EINTR)
@@ -3205,6 +3207,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
for (nbytes = 0; nbytes < XLogSegSize; nbytes += XLOG_BLCKSZ)
{
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_INIT_XLOG_FILE);
if ((int) write(fd, zbuffer, XLOG_BLCKSZ) != (int) XLOG_BLCKSZ)
{
int save_errno = errno;
@@ -3223,8 +3226,10 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
}
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_INIT_XLOG_FILE);
if (pg_fsync(fd) != 0)
{
close(fd);
@@ -3232,6 +3237,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
if (close(fd))
ereport(ERROR,
@@ -3358,6 +3364,7 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
if (nread > sizeof(buffer))
nread = sizeof(buffer);
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_READ_COPY_XLOG);
if (read(srcfd, buffer, nread) != nread)
{
if (errno != 0)
@@ -3370,8 +3377,10 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
(errmsg("not enough data in file \"%s\"",
path)));
}
+ pgstat_report_wait_end();
}
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_COPY_XLOG_FILE);
if ((int) write(fd, buffer, sizeof(buffer)) != (int) sizeof(buffer))
{
int save_errno = errno;
@@ -3387,12 +3396,15 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
}
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_COPY_XLOG_FILE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
ereport(ERROR,
@@ -4417,6 +4429,7 @@ WriteControlFile(void)
XLOG_CONTROL_FILE)));
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_CONTROL_FILE);
if (write(fd, buffer, PG_CONTROL_SIZE) != PG_CONTROL_SIZE)
{
/* if write didn't set errno, assume problem is no disk space */
@@ -4426,11 +4439,14 @@ WriteControlFile(void)
(errcode_for_file_access(),
errmsg("could not write to control file: %m")));
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_WRITE_CONTROL_FILE);
if (pg_fsync(fd) != 0)
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not fsync control file: %m")));
+ pgstat_report_wait_end();
if (close(fd))
ereport(PANIC,
@@ -4456,10 +4472,12 @@ ReadControlFile(void)
errmsg("could not open control file \"%s\": %m",
XLOG_CONTROL_FILE)));
+ pgstat_report_wait_start(WAIT_EVENT_READ_CONTROL_FILE);
if (read(fd, ControlFile, sizeof(ControlFileData)) != sizeof(ControlFileData))
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not read from control file: %m")));
+ pgstat_report_wait_end();
close(fd);
@@ -4653,6 +4671,7 @@ UpdateControlFile(void)
XLOG_CONTROL_FILE)));
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_UPDATE_CONTROL_FILE);
if (write(fd, ControlFile, sizeof(ControlFileData)) != sizeof(ControlFileData))
{
/* if write didn't set errno, assume problem is no disk space */
@@ -4662,11 +4681,14 @@ UpdateControlFile(void)
(errcode_for_file_access(),
errmsg("could not write to control file: %m")));
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_UPDATE_CONTROL_FILE);
if (pg_fsync(fd) != 0)
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not fsync control file: %m")));
+ pgstat_report_wait_end();
if (close(fd))
ereport(PANIC,
@@ -5033,6 +5055,7 @@ BootStrapXLOG(void)
/* Write the first page with the initial record */
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_BOOTSTRAP_XLOG);
if (write(openLogFile, page, XLOG_BLCKSZ) != XLOG_BLCKSZ)
{
/* if write didn't set errno, assume problem is no disk space */
@@ -5042,11 +5065,14 @@ BootStrapXLOG(void)
(errcode_for_file_access(),
errmsg("could not write bootstrap transaction log file: %m")));
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_BOOTSTRAP_XLOG);
if (pg_fsync(openLogFile) != 0)
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not fsync bootstrap transaction log file: %m")));
+ pgstat_report_wait_end();
if (close(openLogFile))
ereport(PANIC,
@@ -9995,11 +10021,13 @@ assign_xlog_sync_method(int new_sync_method, void *extra)
*/
if (openLogFile >= 0)
{
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_ASSIGN_XLOG_SYNC_METHOD);
if (pg_fsync(openLogFile) != 0)
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not fsync log segment %s: %m",
XLogFileNameP(ThisTimeLineID, openLogSegNo))));
+ pgstat_report_wait_end();
if (get_sync_bit(sync_method) != get_sync_bit(new_sync_method))
XLogFileClose();
}
@@ -11452,6 +11480,7 @@ retry:
goto next_record_is_invalid;
}
+ pgstat_report_wait_start(WAIT_EVENT_READ_XLOG);
if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
{
char fname[MAXFNAMELEN];
@@ -11463,6 +11492,7 @@ retry:
fname, readOff)));
goto next_record_is_invalid;
}
+ pgstat_report_wait_end();
Assert(targetSegNo == readSegNo);
Assert(targetPageOff == readOff);
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 8b99b78..d17d541 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -24,6 +24,7 @@
#include "access/xlogutils.h"
#include "catalog/catalog.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "storage/smgr.h"
#include "utils/guc.h"
#include "utils/hsearch.h"
@@ -728,7 +729,9 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count)
else
segbytes = nbytes;
+ pgstat_report_wait_start(WAIT_EVENT_READ_XLOG);
readbytes = read(sendFile, p, segbytes);
+ pgstat_report_wait_end();
if (readbytes <= 0)
{
char path[MAXPGPATH];
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index ada374c..5bd4964 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -280,6 +280,7 @@ static const char *pgstat_get_wait_activity(WaitEventActivity w);
static const char *pgstat_get_wait_client(WaitEventClient w);
static const char *pgstat_get_wait_ipc(WaitEventIPC w);
static const char *pgstat_get_wait_timeout(WaitEventTimeout w);
+static const char *pgstat_get_wait_io(WaitEventIO w);
static void pgstat_setheader(PgStat_MsgHdr *hdr, StatMsgType mtype);
static void pgstat_send(void *msg, int len);
@@ -3176,6 +3177,9 @@ pgstat_get_wait_event_type(uint32 wait_event_info)
case PG_WAIT_TIMEOUT:
event_type = "Timeout";
break;
+ case PG_WAIT_IO:
+ event_type = "IO";
+ break;
default:
event_type = "???";
break;
@@ -3246,6 +3250,13 @@ pgstat_get_wait_event(uint32 wait_event_info)
event_name = pgstat_get_wait_timeout(w);
break;
}
+ case PG_WAIT_IO:
+ {
+ WaitEventIO w = (WaitEventIO) wait_event_info;
+
+ event_name = pgstat_get_wait_io(w);
+ break;
+ }
default:
event_name = "unknown wait event";
break;
@@ -3436,6 +3447,247 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
}
/* ----------
+ * pgstat_get_wait_io() -
+ *
+ * Convert WaitEventIO to string.
+ * ----------
+ */
+static const char *
+pgstat_get_wait_io(WaitEventIO w)
+{
+ const char *event_name = "unknown wait event";
+
+ switch (w)
+ {
+ case WAIT_EVENT_READ_DATA_BLOCK:
+ event_name = "ReadDataBlock";
+ break;
+ case WAIT_EVENT_WRITE_DATA_BLOCK:
+ event_name = "WriteDataBlock";
+ break;
+ case WAIT_EVENT_SYNC_DATA_BLOCK:
+ event_name = "SyncDataBlock";
+ break;
+ case WAIT_EVENT_EXTEND_DATA_BLOCK:
+ event_name = "ExtendDataBlock";
+ break;
+ case WAIT_EVENT_FLUSH_DATA_BLOCK:
+ event_name = "FlushDataBlock";
+ break;
+ case WAIT_EVENT_PREFETCH_DATA_BLOCK:
+ event_name = "PrefetchDataBlock";
+ break;
+ case WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS:
+ event_name = "TruncateDataBlock";
+ break;
+ case WAIT_EVENT_SYNC_RELATION:
+ event_name = "SyncRelation";
+ break;
+ case WAIT_EVENT_SYNC_IMMED_RELATION:
+ event_name = "SyncImmedRelation";
+ break;
+ case WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK:
+ event_name = "WriteRewriteDataBlock";
+ break;
+ case WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK:
+ event_name = "SyncRewriteDataBlock";
+ break;
+ case WAIT_EVENT_READ_BUFFILE:
+ event_name = "ReadBuffile";
+ break;
+ case WAIT_EVENT_WRITE_BUFFILE:
+ event_name = "WriteBuffile";
+ break;
+ /* XLOG wait event */
+ case WAIT_EVENT_READ_XLOG:
+ event_name = "ReadXLog";
+ break;
+ case WAIT_EVENT_READ_COPY_XLOG:
+ event_name = "ReadCopyXLog";
+ break;
+ case WAIT_EVENT_WRITE_XLOG:
+ event_name = "WriteXLog";
+ break;
+ case WAIT_EVENT_WRITE_INIT_XLOG_FILE:
+ event_name = "WriteInitXLogFile";
+ break;
+ case WAIT_EVENT_WRITE_COPY_XLOG_FILE:
+ event_name = "WriteCopyXLogFile";
+ break;
+ case WAIT_EVENT_WRITE_BOOTSTRAP_XLOG:
+ event_name = "WriteBootstrapXLog";
+ break;
+ case WAIT_EVENT_SYNC_INIT_XLOG_FILE:
+ event_name = "SyncInitXLogFile";
+ break;
+ case WAIT_EVENT_SYNC_COPY_XLOG_FILE:
+ event_name = "SyncCopyXLogFile";
+ break;
+ case WAIT_EVENT_SYNC_BOOTSTRAP_XLOG:
+ event_name = "SyncBootStrapXLog";
+ break;
+ case WAIT_EVENT_SYNC_ASSIGN_XLOG_SYNC_METHOD:
+ event_name = "SyncAssignXLogMethod";
+ break;
+ /* Control file wait events */
+ case WAIT_EVENT_WRITE_CONTROL_FILE:
+ event_name = "WriteControlFile";
+ break;
+ case WAIT_EVENT_WRITE_UPDATE_CONTROL_FILE:
+ event_name = "WriteUpdateControlFile";
+ break;
+ case WAIT_EVENT_READ_CONTROL_FILE:
+ event_name = "ReadControlFile";
+ break;
+ case WAIT_EVENT_SYNC_WRITE_CONTROL_FILE:
+ event_name = "SyncWriteControlFile";
+ break;
+ case WAIT_EVENT_SYNC_UPDATE_CONTROL_FILE:
+ event_name = "SyncUpdateControlFile";
+ break;
+ /* reorder buffer wait event */
+ case WAIT_EVENT_READ_REORDER_BUFFER:
+ event_name = "ReadReorderBuffer";
+ break;
+ case WAIT_EVENT_WRITE_REORDER_BUFFER:
+ event_name = "WriteReorderBuffer";
+ break;
+ /* logical mapping wait event */
+ case WAIT_EVENT_READ_APPLY_LOGICAL_MAPPING:
+ event_name = "ReadApplyLogicalMapping";
+ break;
+ case WAIT_EVENT_WRITE_LOGICAL_MAPPING_REWRITE:
+ event_name = "WriteLogicalMappingRewrite";
+ break;
+ case WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE:
+ event_name = "SyncLogicalMappingRewrite";
+ break;
+ case WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE_HEAP:
+ event_name = "SyncLogicalMappingRewriteHeap";
+ break;
+ case WAIT_EVENT_TRUNCATE_LOGICAL_MAPPING_REWRITE:
+ event_name = "TruncateLogicalMappingRewrite";
+ break;
+ /* Snapbuild wait event */
+ case WAIT_EVENT_WRITE_SNAPBUILD_SERIALIZE:
+ event_name = "WriteSnapbuildSerialize";
+ break;
+ case WAIT_EVENT_READ_SNAPBUILD_RESTORE:
+ event_name = "ReadSnapbuildRestore";
+ break;
+ case WAIT_EVENT_SYNC_SNAPBUILD_SERIALIZE:
+ event_name = "SyncSnapbuildSerialize";
+ break;
+ /* SLRU wait event */
+ case WAIT_EVENT_READ_SLRU_PAGE:
+ event_name = "ReadSLRUPage";
+ break;
+ case WAIT_EVENT_WRITE_SLRU_PAGE:
+ event_name = "WriteSLRUPage";
+ break;
+ case WAIT_EVENT_SYNC_SLRU_FLUSH:
+ event_name = "SyncSLRUFlush";
+ break;
+ case WAIT_EVENT_SYNC_SLRU_WRITE_PAGE:
+ event_name = "SyncSLRUWritePage";
+ break;
+ /* TIMELINE HISTORY wait event */
+ case WAIT_EVENT_READ_TIMELINE_HISTORY_WALSENDER:
+ event_name = "ReadTimelineHistoryWalsender";
+ break;
+ case WAIT_EVENT_WRITE_TIMELINE_HISTORY:
+ event_name = "WriteTimelineHistory";
+ break;
+ case WAIT_EVENT_WRITE_TIMELINE_HISTORY_FILE:
+ event_name = "WriteTimelineHistoryFile";
+ break;
+ case WAIT_EVENT_READ_TIMELINE_HISTORY_WRITE:
+ event_name = "ReadTimelineHistoryWrite";
+ break;
+ case WAIT_EVENT_SYNC_TIMELINE_HISTORY_WRITE:
+ event_name = "SyncTimelineHistoryWrite";
+ break;
+ case WAIT_EVENT_SYNC_TIMELINE_HISTORY_FILE:
+ event_name = "SyncTimelineHistoryFile";
+ break;
+ /* TWOPHASE FILE wait event */
+ case WAIT_EVENT_READ_TWOPHASE_FILE:
+ event_name = "ReadTwophaseFile";
+ break;
+ case WAIT_EVENT_WRITE_RECREATE_TWOPHASE_FILE:
+ event_name = "WriteRecreateTwophaseFile";
+ break;
+ case WAIT_EVENT_SYNC_RECREATE_TWOPHASE_FILE:
+ event_name = "SyncRecreateTwophaseFile";
+ break;
+ /* SYSLOGGER wait event */
+ case WAIT_EVENT_READ_SYSLOGGER_FILE:
+ event_name = "ReadSysloggerFile";
+ break;
+ case WAIT_EVENT_WRITE_SYSLOGGER_FILE:
+ event_name = "WriteSysloggerFile";
+ break;
+ /* REPLSLOT wait event */
+ case WAIT_EVENT_READ_RESTORE_REPLSLOT:
+ event_name = "ReadRestorREPLSlot";
+ break;
+ case WAIT_EVENT_WRITE_REPLSLOT:
+ event_name = "WriteREPLSlot";
+ break;
+ case WAIT_EVENT_SYNC_RESTORE_REPLSLOT:
+ event_name = "SyncRestoreREPLSlot";
+ break;
+ case WAIT_EVENT_SYNC_SAVE_REPLSLOT:
+ event_name = "SyncSaveREPLSlot";
+ break;
+ /* COPYDIR IO wait event */
+ case WAIT_EVENT_READ_COPY_FILE:
+ event_name = "ReadCopyFile";
+ break;
+ case WAIT_EVENT_WRITE_COPY_FILE:
+ event_name = "WriteCopyFile";
+ break;
+ /* RELMAP IO wait event */
+ case WAIT_EVENT_READ_LOAD_RELMAP_FILE:
+ event_name = "ReadLoadRELMAPFile";
+ break;
+ case WAIT_EVENT_WRITE_RELMAP_FILE:
+ event_name = "WriteRELMAPFile";
+ break;
+ case WAIT_EVENT_SYNC_WRITE_RELMAP_FILE:
+ event_name = "SyncWriteRELMAFile";
+ break;
+ /* LOCK FILE IO wait event */
+ case WAIT_EVENT_READ_CREATE_LOCK_FILE:
+ event_name = "ReadCreateLockFile";
+ break;
+ case WAIT_EVENT_READ_ADDTODATEDIR_LOCK_FILE:
+ event_name = "ReadAddToDataDirLockFile";
+ break;
+ case WAIT_EVENT_READ_RECHECKDATADIR_LOCK_FILE:
+ event_name = "ReadRecheckDataDirLockFile";
+ break;
+ case WAIT_EVENT_WRITE_CREATE_LOCK_FILE:
+ event_name = "WriteCreateLockFile";
+ break;
+ case WAIT_EVENT_WRITE_ADDTODATEDIR_LOCK_FILE:
+ event_name = "WriteAddToDataDirLockFile";
+ break;
+ case WAIT_EVENT_SYNC_ADDTODATEDIR_LOCK_FILE:
+ event_name = "SyncAddToDataDirLockFile";
+ break;
+ case WAIT_EVENT_SYNC_CREATE_LOCK_FILE:
+ event_name = "SyncCreateLockFile";
+ break;
+
+ /* no default case, so that compiler will warn */
+ }
+
+ return event_name;
+}
+
+
+/* ----------
* pgstat_get_backend_current_activity() -
*
* Return a string representing the current activity of the backend with
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index 13a0301..1ba9a5c 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -432,9 +432,11 @@ SysLoggerMain(int argc, char *argv[])
{
int bytesRead;
+ pgstat_report_wait_start(WAIT_EVENT_READ_SYSLOGGER_FILE);
bytesRead = read(syslogPipe[0],
logbuffer + bytes_in_logbuffer,
sizeof(logbuffer) - bytes_in_logbuffer);
+ pgstat_report_wait_end();
if (bytesRead < 0)
{
if (errno != EINTR)
@@ -992,7 +994,9 @@ write_syslogger_file(const char *buffer, int count, int destination)
open_csvlogfile();
logfile = destination == LOG_DESTINATION_CSVLOG ? csvlogFile : syslogFile;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_SYSLOGGER_FILE);
rc = fwrite(buffer, 1, count, logfile);
+ pgstat_report_wait_end();
/* can't use ereport here because of possible recursion */
if (rc != count)
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 7dc97fa..8bd4dda 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -58,6 +58,7 @@
#include "catalog/catalog.h"
#include "lib/binaryheap.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "replication/logical.h"
#include "replication/reorderbuffer.h"
#include "replication/slot.h"
@@ -2317,6 +2318,7 @@ ReorderBufferSerializeChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
ondisk->size = sz;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_REORDER_BUFFER);
if (write(fd, rb->outbuf, ondisk->size) != ondisk->size)
{
int save_errno = errno;
@@ -2328,6 +2330,7 @@ ReorderBufferSerializeChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
errmsg("could not write to data file for XID %u: %m",
txn->xid)));
}
+ pgstat_report_wait_end();
Assert(ondisk->change.action == change->action);
}
@@ -2408,7 +2411,9 @@ ReorderBufferRestoreChanges(ReorderBuffer *rb, ReorderBufferTXN *txn,
* end of this file.
*/
ReorderBufferSerializeReserve(rb, sizeof(ReorderBufferDiskChange));
+ pgstat_report_wait_start(WAIT_EVENT_READ_REORDER_BUFFER);
readBytes = read(*fd, rb->outbuf, sizeof(ReorderBufferDiskChange));
+ pgstat_report_wait_end();
/* eof */
if (readBytes == 0)
@@ -2435,8 +2440,10 @@ ReorderBufferRestoreChanges(ReorderBuffer *rb, ReorderBufferTXN *txn,
sizeof(ReorderBufferDiskChange) + ondisk->size);
ondisk = (ReorderBufferDiskChange *) rb->outbuf;
+ pgstat_report_wait_start(WAIT_EVENT_READ_REORDER_BUFFER);
readBytes = read(*fd, rb->outbuf + sizeof(ReorderBufferDiskChange),
ondisk->size - sizeof(ReorderBufferDiskChange));
+ pgstat_report_wait_end();
if (readBytes < 0)
ereport(ERROR,
@@ -3089,7 +3096,9 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
memset(&key, 0, sizeof(ReorderBufferTupleCidKey));
/* read all mappings till the end of the file */
+ pgstat_report_wait_start(WAIT_EVENT_READ_APPLY_LOGICAL_MAPPING);
readBytes = read(fd, &map, sizeof(LogicalRewriteMappingData));
+ pgstat_report_wait_end();
if (readBytes < 0)
ereport(ERROR,
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index c0f28dd..f47eefd 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -116,6 +116,8 @@
#include "access/transam.h"
#include "access/xact.h"
+#include "pgstat.h"
+
#include "replication/logical.h"
#include "replication/reorderbuffer.h"
#include "replication/snapbuild.h"
@@ -1581,6 +1583,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
ereport(ERROR,
(errmsg("could not open file \"%s\": %m", path)));
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_SNAPBUILD_SERIALIZE);
if ((write(fd, ondisk, needed_length)) != needed_length)
{
CloseTransientFile(fd);
@@ -1588,6 +1591,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
/*
* fsync the file before renaming so that even if we crash after this we
@@ -1597,6 +1601,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
* some noticeable overhead since it's performed synchronously during
* decoding?
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_SNAPBUILD_SERIALIZE);
if (pg_fsync(fd) != 0)
{
CloseTransientFile(fd);
@@ -1604,6 +1609,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
CloseTransientFile(fd);
fsync_fname("pg_logical/snapshots", true);
@@ -1678,7 +1684,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
/* read statically sized portion of snapshot */
+ pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE);
readBytes = read(fd, &ondisk, SnapBuildOnDiskConstantSize);
+ pgstat_report_wait_end();
if (readBytes != SnapBuildOnDiskConstantSize)
{
CloseTransientFile(fd);
@@ -1704,7 +1712,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
SnapBuildOnDiskConstantSize - SnapBuildOnDiskNotChecksummedSize);
/* read SnapBuild */
+ pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE);
readBytes = read(fd, &ondisk.builder, sizeof(SnapBuild));
+ pgstat_report_wait_end();
if (readBytes != sizeof(SnapBuild))
{
CloseTransientFile(fd);
@@ -1718,7 +1728,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
/* restore running xacts information */
sz = sizeof(TransactionId) * ondisk.builder.running.xcnt_space;
ondisk.builder.running.xip = MemoryContextAllocZero(builder->context, sz);
+ pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE);
readBytes = read(fd, ondisk.builder.running.xip, sz);
+ pgstat_report_wait_end();
if (readBytes != sz)
{
CloseTransientFile(fd);
@@ -1732,7 +1744,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
/* restore committed xacts information */
sz = sizeof(TransactionId) * ondisk.builder.committed.xcnt;
ondisk.builder.committed.xip = MemoryContextAllocZero(builder->context, sz);
+ pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE);
readBytes = read(fd, ondisk.builder.committed.xip, sz);
+ pgstat_report_wait_end();
if (readBytes != sz)
{
CloseTransientFile(fd);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 10d69d0..d4ebcd9 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -43,6 +43,7 @@
#include "access/xlog_internal.h"
#include "common/string.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "replication/slot.h"
#include "storage/fd.h"
#include "storage/proc.h"
@@ -1100,10 +1101,12 @@ SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel)
SnapBuildOnDiskChecksummedSize);
FIN_CRC32C(cp.checksum);
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_REPLSLOT);
if ((write(fd, &cp, sizeof(cp))) != sizeof(cp))
{
int save_errno = errno;
+ pgstat_report_wait_end();
CloseTransientFile(fd);
errno = save_errno;
ereport(elevel,
@@ -1112,8 +1115,10 @@ SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel)
tmppath)));
return;
}
+ pgstat_report_wait_end();
/* fsync the temporary file */
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_SAVE_REPLSLOT);
if (pg_fsync(fd) != 0)
{
int save_errno = errno;
@@ -1126,6 +1131,7 @@ SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel)
tmppath)));
return;
}
+ pgstat_report_wait_end();
CloseTransientFile(fd);
@@ -1202,6 +1208,7 @@ RestoreSlotFromDisk(const char *name)
* Sync state file before we're reading from it. We might have crashed
* while it wasn't synced yet and we shouldn't continue on that basis.
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_RESTORE_REPLSLOT);
if (pg_fsync(fd) != 0)
{
CloseTransientFile(fd);
@@ -1210,6 +1217,7 @@ RestoreSlotFromDisk(const char *name)
errmsg("could not fsync file \"%s\": %m",
path)));
}
+ pgstat_report_wait_end();
/* Also sync the parent directory */
START_CRIT_SECTION();
@@ -1217,7 +1225,9 @@ RestoreSlotFromDisk(const char *name)
END_CRIT_SECTION();
/* read part of statefile that's guaranteed to be version independent */
+ pgstat_report_wait_start(WAIT_EVENT_READ_RESTORE_REPLSLOT);
readBytes = read(fd, &cp, ReplicationSlotOnDiskConstantSize);
+ pgstat_report_wait_end();
if (readBytes != ReplicationSlotOnDiskConstantSize)
{
int saved_errno = errno;
@@ -1253,9 +1263,11 @@ RestoreSlotFromDisk(const char *name)
path, cp.length)));
/* Now that we know the size, read the entire file */
+ pgstat_report_wait_start(WAIT_EVENT_READ_RESTORE_REPLSLOT);
readBytes = read(fd,
(char *) &cp + ReplicationSlotOnDiskConstantSize,
cp.length);
+ pgstat_report_wait_end();
if (readBytes != cp.length)
{
int saved_errno = errno;
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index ba506e2..f723941 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -462,7 +462,9 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
char rbuf[BLCKSZ];
int nread;
+ pgstat_report_wait_start(WAIT_EVENT_READ_TIMELINE_HISTORY_WALSENDER);
nread = read(fd, rbuf, sizeof(rbuf));
+ pgstat_report_wait_end();
if (nread <= 0)
ereport(ERROR,
(errcode_for_file_access(),
@@ -2072,7 +2074,9 @@ retry:
else
segbytes = nbytes;
+ pgstat_report_wait_start(WAIT_EVENT_READ_XLOG);
readbytes = read(sendFile, p, segbytes);
+ pgstat_report_wait_end();
if (readbytes <= 0)
{
ereport(ERROR,
diff --git a/src/backend/storage/file/buffile.c b/src/backend/storage/file/buffile.c
index 7ebd636..971cfd1 100644
--- a/src/backend/storage/file/buffile.c
+++ b/src/backend/storage/file/buffile.c
@@ -37,6 +37,7 @@
#include "postgres.h"
#include "executor/instrument.h"
+#include "pgstat.h"
#include "storage/fd.h"
#include "storage/buffile.h"
#include "storage/buf_internals.h"
@@ -254,7 +255,10 @@ BufFileLoadBuffer(BufFile *file)
/*
* Read whatever we can get, up to a full bufferload.
*/
- file->nbytes = FileRead(thisfile, file->buffer, sizeof(file->buffer));
+ file->nbytes = FileRead(thisfile,
+ file->buffer,
+ sizeof(file->buffer),
+ WAIT_EVENT_READ_BUFFILE);
if (file->nbytes < 0)
file->nbytes = 0;
file->offsets[file->curFile] += file->nbytes;
@@ -317,7 +321,10 @@ BufFileDumpBuffer(BufFile *file)
return; /* seek failed, give up */
file->offsets[file->curFile] = file->curOffset;
}
- bytestowrite = FileWrite(thisfile, file->buffer + wpos, bytestowrite);
+ bytestowrite = FileWrite(thisfile,
+ file->buffer + wpos,
+ bytestowrite,
+ WAIT_EVENT_WRITE_BUFFILE);
if (bytestowrite <= 0)
return; /* failed to write */
file->offsets[file->curFile] += bytestowrite;
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 101da47..2eda42d 100644
--- a/src/backend/storage/file/copydir.c
+++ b/src/backend/storage/file/copydir.c
@@ -25,7 +25,7 @@
#include "storage/copydir.h"
#include "storage/fd.h"
#include "miscadmin.h"
-
+#include "pgstat.h"
/*
* copydir: copy a directory
@@ -169,7 +169,9 @@ copy_file(char *fromfile, char *tofile)
/* If we got a cancel signal during the copy of the file, quit */
CHECK_FOR_INTERRUPTS();
+ pgstat_report_wait_start(WAIT_EVENT_READ_COPY_FILE);
nbytes = read(srcfd, buffer, COPY_BUF_SIZE);
+ pgstat_report_wait_end();
if (nbytes < 0)
ereport(ERROR,
(errcode_for_file_access(),
@@ -177,8 +179,10 @@ copy_file(char *fromfile, char *tofile)
if (nbytes == 0)
break;
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_COPY_FILE);
if ((int) write(dstfd, buffer, nbytes) != nbytes)
{
+ pgstat_report_wait_end();
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
@@ -186,6 +190,7 @@ copy_file(char *fromfile, char *tofile)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tofile)));
}
+ pgstat_report_wait_end();
/*
* We fsync the files later but first flush them to avoid spamming the
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index ce4bd0f..431fc48 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -1512,7 +1512,7 @@ FileClose(File file)
* to read into.
*/
int
-FilePrefetch(File file, off_t offset, int amount)
+FilePrefetch(File file, off_t offset, int amount, uint32 wait_event_info)
{
#if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_WILLNEED)
int returnCode;
@@ -1527,8 +1527,10 @@ FilePrefetch(File file, off_t offset, int amount)
if (returnCode < 0)
return returnCode;
+ pgstat_report_wait_start(wait_event_info);
returnCode = posix_fadvise(VfdCache[file].fd, offset, amount,
POSIX_FADV_WILLNEED);
+ pgstat_report_wait_end();
return returnCode;
#else
@@ -1538,7 +1540,7 @@ FilePrefetch(File file, off_t offset, int amount)
}
void
-FileWriteback(File file, off_t offset, off_t nbytes)
+FileWriteback(File file, off_t offset, off_t nbytes, uint32 wait_event_info)
{
int returnCode;
@@ -1559,11 +1561,13 @@ FileWriteback(File file, off_t offset, off_t nbytes)
if (returnCode < 0)
return;
+ pgstat_report_wait_start(wait_event_info);
pg_flush_data(VfdCache[file].fd, offset, nbytes);
+ pgstat_report_wait_end();
}
int
-FileRead(File file, char *buffer, int amount)
+FileRead(File file, char *buffer, int amount, uint32 wait_event_info)
{
int returnCode;
@@ -1578,6 +1582,7 @@ FileRead(File file, char *buffer, int amount)
if (returnCode < 0)
return returnCode;
+ pgstat_report_wait_start(wait_event_info);
retry:
returnCode = read(VfdCache[file].fd, buffer, amount);
@@ -1613,12 +1618,13 @@ retry:
/* Trouble, so assume we don't know the file position anymore */
VfdCache[file].seekPos = FileUnknownPos;
}
+ pgstat_report_wait_end();
return returnCode;
}
int
-FileWrite(File file, char *buffer, int amount)
+FileWrite(File file, char *buffer, int amount, uint32 wait_event_info)
{
int returnCode;
@@ -1658,6 +1664,7 @@ FileWrite(File file, char *buffer, int amount)
}
}
+ pgstat_report_wait_start(wait_event_info);
retry:
errno = 0;
returnCode = write(VfdCache[file].fd, buffer, amount);
@@ -1708,12 +1715,13 @@ retry:
/* Trouble, so assume we don't know the file position anymore */
VfdCache[file].seekPos = FileUnknownPos;
}
+ pgstat_report_wait_end();
return returnCode;
}
int
-FileSync(File file)
+FileSync(File file, uint32 wait_event_info)
{
int returnCode;
@@ -1726,7 +1734,11 @@ FileSync(File file)
if (returnCode < 0)
return returnCode;
- return pg_fsync(VfdCache[file].fd);
+ pgstat_report_wait_start(wait_event_info);
+ returnCode = pg_fsync(VfdCache[file].fd);
+ pgstat_report_wait_end();
+
+ return returnCode;
}
off_t
@@ -1810,7 +1822,7 @@ FileTell(File file)
#endif
int
-FileTruncate(File file, off_t offset)
+FileTruncate(File file, off_t offset, uint32 wait_event_info)
{
int returnCode;
@@ -1823,7 +1835,9 @@ FileTruncate(File file, off_t offset)
if (returnCode < 0)
return returnCode;
+ pgstat_report_wait_start(wait_event_info);
returnCode = ftruncate(VfdCache[file].fd, offset);
+ pgstat_report_wait_end();
if (returnCode == 0 && VfdCache[file].fileSize > offset)
{
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 6c17b54..d0f0f37 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -28,6 +28,7 @@
#include "miscadmin.h"
#include "access/xlog.h"
#include "catalog/catalog.h"
+#include "pgstat.h"
#include "portability/instr_time.h"
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
@@ -536,7 +537,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
errmsg("could not seek to block %u in file \"%s\": %m",
blocknum, FilePathName(v->mdfd_vfd))));
- if ((nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ)) != BLCKSZ)
+ if ((nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, WAIT_EVENT_EXTEND_DATA_BLOCK)) != BLCKSZ)
{
if (nbytes < 0)
ereport(ERROR,
@@ -667,7 +668,7 @@ mdprefetch(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum)
Assert(seekpos < (off_t) BLCKSZ * RELSEG_SIZE);
- (void) FilePrefetch(v->mdfd_vfd, seekpos, BLCKSZ);
+ (void) FilePrefetch(v->mdfd_vfd, seekpos, BLCKSZ, WAIT_EVENT_PREFETCH_DATA_BLOCK);
#endif /* USE_PREFETCH */
}
@@ -716,7 +717,7 @@ mdwriteback(SMgrRelation reln, ForkNumber forknum,
seekpos = (off_t) BLCKSZ *(blocknum % ((BlockNumber) RELSEG_SIZE));
- FileWriteback(v->mdfd_vfd, seekpos, (off_t) BLCKSZ * nflush);
+ FileWriteback(v->mdfd_vfd, seekpos, (off_t) BLCKSZ * nflush, WAIT_EVENT_FLUSH_DATA_BLOCK);
nblocks -= nflush;
blocknum += nflush;
@@ -753,7 +754,7 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
errmsg("could not seek to block %u in file \"%s\": %m",
blocknum, FilePathName(v->mdfd_vfd))));
- nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ);
+ nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ, WAIT_EVENT_READ_DATA_BLOCK);
TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
@@ -829,7 +830,7 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
errmsg("could not seek to block %u in file \"%s\": %m",
blocknum, FilePathName(v->mdfd_vfd))));
- nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ);
+ nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, WAIT_EVENT_WRITE_DATA_BLOCK);
TRACE_POSTGRESQL_SMGR_MD_WRITE_DONE(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
@@ -967,7 +968,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks)
* This segment is no longer active. We truncate the file, but do
* not delete it, for reasons explained in the header comments.
*/
- if (FileTruncate(v->mdfd_vfd, 0) < 0)
+ if (FileTruncate(v->mdfd_vfd, 0, WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS) < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not truncate file \"%s\": %m",
@@ -993,7 +994,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks)
*/
BlockNumber lastsegblocks = nblocks - priorblocks;
- if (FileTruncate(v->mdfd_vfd, (off_t) lastsegblocks * BLCKSZ) < 0)
+ if (FileTruncate(v->mdfd_vfd, (off_t) lastsegblocks * BLCKSZ, WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS) < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not truncate file \"%s\" to %u blocks: %m",
@@ -1037,7 +1038,7 @@ mdimmedsync(SMgrRelation reln, ForkNumber forknum)
{
MdfdVec *v = &reln->md_seg_fds[forknum][segno - 1];
- if (FileSync(v->mdfd_vfd) < 0)
+ if (FileSync(v->mdfd_vfd, WAIT_EVENT_SYNC_IMMED_RELATION) < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m",
@@ -1232,7 +1233,7 @@ mdsync(void)
INSTR_TIME_SET_CURRENT(sync_start);
if (seg != NULL &&
- FileSync(seg->mdfd_vfd) >= 0)
+ FileSync(seg->mdfd_vfd, WAIT_EVENT_SYNC_RELATION) >= 0)
{
/* Success; update statistics about sync timing */
INSTR_TIME_SET_CURRENT(sync_end);
@@ -1443,7 +1444,7 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
ereport(DEBUG1,
(errmsg("could not forward fsync request because request queue is full")));
- if (FileSync(seg->mdfd_vfd) < 0)
+ if (FileSync(seg->mdfd_vfd, WAIT_EVENT_SYNC_DATA_BLOCK) < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m",
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index c9d6e44..ee5ded9 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -50,6 +50,7 @@
#include "catalog/pg_tablespace.h"
#include "catalog/storage.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "storage/fd.h"
#include "storage/lwlock.h"
#include "utils/inval.h"
@@ -658,11 +659,15 @@ load_relmap_file(bool shared)
* look, the sinval signaling mechanism will make us re-read it before we
* are able to access any relation that's affected by the change.
*/
+ pgstat_report_wait_start(WAIT_EVENT_READ_LOAD_RELMAP_FILE);
if (read(fd, map, sizeof(RelMapFile)) != sizeof(RelMapFile))
+ {
ereport(FATAL,
(errcode_for_file_access(),
errmsg("could not read relation mapping file \"%s\": %m",
mapfilename)));
+ }
+ pgstat_report_wait_end();
CloseTransientFile(fd);
@@ -774,6 +779,7 @@ write_relmap_file(bool shared, RelMapFile *newmap,
}
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_RELMAP_FILE);
if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
{
/* if write didn't set errno, assume problem is no disk space */
@@ -784,6 +790,7 @@ write_relmap_file(bool shared, RelMapFile *newmap,
errmsg("could not write to relation mapping file \"%s\": %m",
mapfilename)));
}
+ pgstat_report_wait_end();
/*
* We choose to fsync the data to disk before considering the task done.
@@ -791,11 +798,13 @@ write_relmap_file(bool shared, RelMapFile *newmap,
* issue, but it would complicate checkpointing --- see notes for
* CheckPointRelationMap.
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_WRITE_RELMAP_FILE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync relation mapping file \"%s\": %m",
mapfilename)));
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
ereport(ERROR,
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index e984e79..e58cfa5 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -36,6 +36,7 @@
#include "libpq/libpq.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "postmaster/autovacuum.h"
#include "postmaster/postmaster.h"
#include "storage/fd.h"
@@ -857,11 +858,13 @@ CreateLockFile(const char *filename, bool amPostmaster,
errmsg("could not open lock file \"%s\": %m",
filename)));
}
+ pgstat_report_wait_start(WAIT_EVENT_READ_CREATE_LOCK_FILE);
if ((len = read(fd, buffer, sizeof(buffer) - 1)) < 0)
ereport(FATAL,
(errcode_for_file_access(),
errmsg("could not read lock file \"%s\": %m",
filename)));
+ pgstat_report_wait_end();
close(fd);
if (len == 0)
@@ -1010,6 +1013,7 @@ CreateLockFile(const char *filename, bool amPostmaster,
strlcat(buffer, "\n", sizeof(buffer));
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_CREATE_LOCK_FILE);
if (write(fd, buffer, strlen(buffer)) != strlen(buffer))
{
int save_errno = errno;
@@ -1022,6 +1026,9 @@ CreateLockFile(const char *filename, bool amPostmaster,
(errcode_for_file_access(),
errmsg("could not write lock file \"%s\": %m", filename)));
}
+ pgstat_report_wait_end();
+
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_CREATE_LOCK_FILE);
if (pg_fsync(fd) != 0)
{
int save_errno = errno;
@@ -1033,6 +1040,7 @@ CreateLockFile(const char *filename, bool amPostmaster,
(errcode_for_file_access(),
errmsg("could not write lock file \"%s\": %m", filename)));
}
+ pgstat_report_wait_end();
if (close(fd) != 0)
{
int save_errno = errno;
@@ -1165,7 +1173,9 @@ AddToDataDirLockFile(int target_line, const char *str)
DIRECTORY_LOCK_FILE)));
return;
}
+ pgstat_report_wait_start(WAIT_EVENT_READ_ADDTODATEDIR_LOCK_FILE);
len = read(fd, srcbuffer, sizeof(srcbuffer) - 1);
+ pgstat_report_wait_end();
if (len < 0)
{
ereport(LOG,
@@ -1218,6 +1228,7 @@ AddToDataDirLockFile(int target_line, const char *str)
*/
len = strlen(destbuffer);
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_ADDTODATEDIR_LOCK_FILE);
if (lseek(fd, (off_t) 0, SEEK_SET) != 0 ||
(int) write(fd, destbuffer, len) != len)
{
@@ -1231,6 +1242,8 @@ AddToDataDirLockFile(int target_line, const char *str)
close(fd);
return;
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_ADDTODATEDIR_LOCK_FILE);
if (pg_fsync(fd) != 0)
{
ereport(LOG,
@@ -1238,6 +1251,7 @@ AddToDataDirLockFile(int target_line, const char *str)
errmsg("could not write to file \"%s\": %m",
DIRECTORY_LOCK_FILE)));
}
+ pgstat_report_wait_end();
if (close(fd) != 0)
{
ereport(LOG,
@@ -1294,7 +1308,9 @@ RecheckDataDirLockFile(void)
return true;
}
}
+ pgstat_report_wait_start(WAIT_EVENT_READ_RECHECKDATADIR_LOCK_FILE);
len = read(fd, buffer, sizeof(buffer) - 1);
+ pgstat_report_wait_end();
if (len < 0)
{
ereport(LOG,
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 8b710ec..6d1cebc 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -723,6 +723,7 @@ typedef enum BackendState
#define PG_WAIT_EXTENSION 0x07000000U
#define PG_WAIT_IPC 0x08000000U
#define PG_WAIT_TIMEOUT 0x09000000U
+#define PG_WAIT_IO 0x0A000000U
/* ----------
* Wait Events - Activity
@@ -805,6 +806,98 @@ typedef enum
} WaitEventTimeout;
/* ----------
+ * Wait Events - IO
+ *
+ * Use this category when a process is waiting for a IO.
+ * ----------
+ */
+typedef enum
+{
+ WAIT_EVENT_READ_DATA_BLOCK,
+ WAIT_EVENT_WRITE_DATA_BLOCK,
+ WAIT_EVENT_SYNC_DATA_BLOCK,
+ WAIT_EVENT_EXTEND_DATA_BLOCK,
+ WAIT_EVENT_FLUSH_DATA_BLOCK,
+ WAIT_EVENT_PREFETCH_DATA_BLOCK,
+ WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK,
+ WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK,
+ WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS,
+ WAIT_EVENT_SYNC_RELATION,
+ WAIT_EVENT_SYNC_IMMED_RELATION,
+ WAIT_EVENT_READ_BUFFILE,
+ WAIT_EVENT_WRITE_BUFFILE,
+ /* Wait event for XLOG */
+ WAIT_EVENT_READ_XLOG,
+ WAIT_EVENT_READ_COPY_XLOG,
+ WAIT_EVENT_WRITE_XLOG,
+ WAIT_EVENT_WRITE_INIT_XLOG_FILE,
+ WAIT_EVENT_WRITE_COPY_XLOG_FILE,
+ WAIT_EVENT_WRITE_BOOTSTRAP_XLOG,
+ WAIT_EVENT_SYNC_INIT_XLOG_FILE,
+ WAIT_EVENT_SYNC_COPY_XLOG_FILE,
+ WAIT_EVENT_SYNC_BOOTSTRAP_XLOG,
+ WAIT_EVENT_SYNC_ASSIGN_XLOG_SYNC_METHOD,
+ /* Wait event for CONTROL_FILE */
+ WAIT_EVENT_WRITE_CONTROL_FILE,
+ WAIT_EVENT_WRITE_UPDATE_CONTROL_FILE,
+ WAIT_EVENT_SYNC_WRITE_CONTROL_FILE,
+ WAIT_EVENT_SYNC_UPDATE_CONTROL_FILE,
+ WAIT_EVENT_READ_CONTROL_FILE,
+ /* Wait event for REORDER BUFFER */
+ WAIT_EVENT_READ_REORDER_BUFFER,
+ WAIT_EVENT_WRITE_REORDER_BUFFER,
+ /* Wait event for LOGICAL MAPPING */
+ WAIT_EVENT_READ_APPLY_LOGICAL_MAPPING,
+ WAIT_EVENT_WRITE_LOGICAL_MAPPING_REWRITE,
+ WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE,
+ WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE_HEAP,
+ WAIT_EVENT_TRUNCATE_LOGICAL_MAPPING_REWRITE,
+ /* Wait event for SNAPBUILD */
+ WAIT_EVENT_WRITE_SNAPBUILD_SERIALIZE,
+ WAIT_EVENT_READ_SNAPBUILD_RESTORE,
+ WAIT_EVENT_SYNC_SNAPBUILD_SERIALIZE,
+ /* Wait event for SNRU */
+ WAIT_EVENT_READ_SLRU_PAGE,
+ WAIT_EVENT_WRITE_SLRU_PAGE,
+ WAIT_EVENT_SYNC_SLRU_FLUSH,
+ WAIT_EVENT_SYNC_SLRU_WRITE_PAGE,
+ /* Wait event for TIMELINE HISTORY */
+ WAIT_EVENT_READ_TIMELINE_HISTORY_WALSENDER,
+ WAIT_EVENT_READ_TIMELINE_HISTORY_WRITE,
+ WAIT_EVENT_WRITE_TIMELINE_HISTORY,
+ WAIT_EVENT_WRITE_TIMELINE_HISTORY_FILE,
+ WAIT_EVENT_SYNC_TIMELINE_HISTORY_WRITE,
+ WAIT_EVENT_SYNC_TIMELINE_HISTORY_FILE,
+ /* Wait event for TWOPHASE FILE */
+ WAIT_EVENT_READ_TWOPHASE_FILE,
+ WAIT_EVENT_WRITE_RECREATE_TWOPHASE_FILE,
+ WAIT_EVENT_SYNC_RECREATE_TWOPHASE_FILE,
+ /* Wait event for SYSLOGGER */
+ WAIT_EVENT_READ_SYSLOGGER_FILE,
+ WAIT_EVENT_WRITE_SYSLOGGER_FILE,
+ /* Wait event for REPLSLOT */
+ WAIT_EVENT_READ_RESTORE_REPLSLOT,
+ WAIT_EVENT_WRITE_REPLSLOT,
+ WAIT_EVENT_SYNC_RESTORE_REPLSLOT,
+ WAIT_EVENT_SYNC_SAVE_REPLSLOT,
+ /* Wait event for copydir */
+ WAIT_EVENT_READ_COPY_FILE,
+ WAIT_EVENT_WRITE_COPY_FILE,
+ /* Wait event RELMAP FILE */
+ WAIT_EVENT_READ_LOAD_RELMAP_FILE,
+ WAIT_EVENT_WRITE_RELMAP_FILE,
+ WAIT_EVENT_SYNC_WRITE_RELMAP_FILE,
+ /* Wait event for LOCK FILE */
+ WAIT_EVENT_READ_CREATE_LOCK_FILE,
+ WAIT_EVENT_READ_ADDTODATEDIR_LOCK_FILE,
+ WAIT_EVENT_READ_RECHECKDATADIR_LOCK_FILE,
+ WAIT_EVENT_WRITE_CREATE_LOCK_FILE,
+ WAIT_EVENT_WRITE_ADDTODATEDIR_LOCK_FILE,
+ WAIT_EVENT_SYNC_ADDTODATEDIR_LOCK_FILE,
+ WAIT_EVENT_SYNC_CREATE_LOCK_FILE
+} WaitEventIO;
+
+/* ----------
* Command type for progress reporting purposes
* ----------
*/
diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index 1a43a2c..ac37502 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -68,13 +68,13 @@ extern int max_safe_fds;
extern File PathNameOpenFile(FileName fileName, int fileFlags, int fileMode);
extern File OpenTemporaryFile(bool interXact);
extern void FileClose(File file);
-extern int FilePrefetch(File file, off_t offset, int amount);
-extern int FileRead(File file, char *buffer, int amount);
-extern int FileWrite(File file, char *buffer, int amount);
-extern int FileSync(File file);
+extern int FilePrefetch(File file, off_t offset, int amount, uint32 wait_event_info);
+extern int FileRead(File file, char *buffer, int amount, uint32 wait_event_info);
+extern int FileWrite(File file, char *buffer, int amount, uint32 wait_event_info);
+extern int FileSync(File file, uint32 wait_event_info);
extern off_t FileSeek(File file, off_t offset, int whence);
-extern int FileTruncate(File file, off_t offset);
-extern void FileWriteback(File file, off_t offset, off_t nbytes);
+extern int FileTruncate(File file, off_t offset, uint32 wait_event_info);
+extern void FileWriteback(File file, off_t offset, off_t nbytes, uint32 wait_event_info);
extern char *FilePathName(File file);
extern int FileGetRawDesc(File file);
extern int FileGetRawFlags(File file);
On Mon, Feb 20, 2017 at 4:04 PM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
My colleague Rahila reported compilation issue with
the patch. Issue was only coming with we do the clean
build on the branch.Fixed the same into latest version of patch.
Few assorted comments:
1.
+ <row>
+ <entry><literal>WriteBuffile</></entry>
+ <entry>Waiting during
buffile read operation.</entry>
+ </row>
Operation name and definition are not matching.
2.
+FilePrefetch(File file, off_t offset, int amount, uint32 wait_event_info)
{
#if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_WILLNEED)
int returnCode;
@@ -1527,8 +1527,10 @@ FilePrefetch(File file, off_t offset, int amount)
if (returnCode < 0)
return returnCode;
+ pgstat_report_wait_start(wait_event_info);
returnCode = posix_fadvise(VfdCache[file].fd, offset, amount,
POSIX_FADV_WILLNEED);
+ pgstat_report_wait_end();
AFAIK, this call is non-blocking and will just initiate a read, so do
you think we should record wait event for such a call.
3.
- written = FileWrite(src->vfd, waldata_start, len);
+ written = FileWrite(src->vfd, waldata_start, len,
+ WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK);
if (written != len)
ereport(ERROR,
(errcode_for_file_access(),
@@ -957,7 +960,7 @@ logical_end_heap_rewrite(RewriteState state)
hash_seq_init(&seq_status, state->rs_logical_mappings);
while ((src = (RewriteMappingFile *) hash_seq_search(&seq_status)) != NULL)
{
- if (FileSync(src->vfd) != 0)
+ if (FileSync(src->vfd, WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK) != 0)
Do we want to consider recording wait event for both write and sync?
It seems to me OS level writes are relatively cheap and sync calls are
expensive, so shouldn't we just log for sync calls? I could see the
similar usage at multiple places in the patch.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Mar 4, 2017 at 7:53 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 20, 2017 at 4:04 PM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:My colleague Rahila reported compilation issue with
the patch. Issue was only coming with we do the clean
build on the branch.Fixed the same into latest version of patch.
Few assorted comments:
1. + <row> + <entry><literal>WriteBuffile</></entry> + <entry>Waiting during buffile read operation.</entry> + </row>Operation name and definition are not matching.
Will fix this.
2. +FilePrefetch(File file, off_t offset, int amount, uint32 wait_event_info) { #if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_WILLNEED) int returnCode; @@ -1527,8 +1527,10 @@ FilePrefetch(File file, off_t offset, int amount) if (returnCode < 0) return returnCode;+ pgstat_report_wait_start(wait_event_info); returnCode = posix_fadvise(VfdCache[file].fd, offset, amount, POSIX_FADV_WILLNEED); + pgstat_report_wait_end();AFAIK, this call is non-blocking and will just initiate a read, so do
you think we should record wait event for such a call.
Yes, prefecth call is a non-blocking and will just initiate a read. But
this info
about the prefetch into wait events will give more info about the system.
3.
- written = FileWrite(src->vfd, waldata_start, len); + written = FileWrite(src->vfd, waldata_start, len, + WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK); if (written != len) ereport(ERROR, (errcode_for_file_access(), @@ -957,7 +960,7 @@ logical_end_heap_rewrite(RewriteState state) hash_seq_init(&seq_status, state->rs_logical_mappings); while ((src = (RewriteMappingFile *) hash_seq_search(&seq_status)) != NULL) { - if (FileSync(src->vfd) != 0) + if (FileSync(src->vfd, WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK) != 0)Do we want to consider recording wait event for both write and sync?
It seems to me OS level writes are relatively cheap and sync calls are
expensive, so shouldn't we just log for sync calls? I could see the
similar usage at multiple places in the patch.
Yes, I thought of adding wait event only for the sync but then recording the
wait event for both write and sync. I understand that OS level writes are
cheap but it still have some cost attached to that. Also I thought for the
monitoring tool being develop using this wait events, will have more useful
capture data if we try to collect as much info as we can. Or may be not.
I am open for other opinion/suggestions.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
--
Rushabh Lathia
On Mon, Mar 6, 2017 at 3:27 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
Yes, I thought of adding wait event only for the sync but then recording the
wait event for both write and sync. I understand that OS level writes are
cheap but it still have some cost attached to that. Also I thought for the
monitoring tool being develop using this wait events, will have more useful
capture data if we try to collect as much info as we can. Or may be not.I am open for other opinion/suggestions.
Writes are NOT always fast. I've seen cases of write() blocking for
LONG periods of time on systems that are in the process of failing, or
just busy. So I think we certainly want to advertise a wait event for
those.
Likewise, I agree that the prefetch call probably SHOULDN'T block, but
just because it shouldn't doesn't mean it never will.
I think somebody should try a pgbench run with this patch applied,
using a scale factor greater than shared_buffers, and generate a wait
event profile, just to see if these are showing up and how often.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Mar 7, 2017 at 8:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Mar 6, 2017 at 3:27 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
Yes, I thought of adding wait event only for the sync but then recording the
wait event for both write and sync. I understand that OS level writes are
cheap but it still have some cost attached to that. Also I thought for the
monitoring tool being develop using this wait events, will have more useful
capture data if we try to collect as much info as we can. Or may be not.I am open for other opinion/suggestions.
Writes are NOT always fast. I've seen cases of write() blocking for
LONG periods of time on systems that are in the process of failing, or
just busy. So I think we certainly want to advertise a wait event for
those.
+1. I see that quite often really happens, and a lot particularly on
overloaded VMs. Having to debug such systems until you notice that
what is slow is just the underlying system is no fun. So being able to
show up an event state in memory where we can say that things are
stuck on I/O just with PG interface can really help in diagnostics
without having to look at storage-level logs for a first impression.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Mar 7, 2017 at 5:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Mar 6, 2017 at 3:27 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
Yes, I thought of adding wait event only for the sync but then recording the
wait event for both write and sync. I understand that OS level writes are
cheap but it still have some cost attached to that. Also I thought for the
monitoring tool being develop using this wait events, will have more useful
capture data if we try to collect as much info as we can. Or may be not.I am open for other opinion/suggestions.
Writes are NOT always fast. I've seen cases of write() blocking for
LONG periods of time on systems that are in the process of failing, or
just busy. So I think we certainly want to advertise a wait event for
those.
Sure, if you think both Writes and Reads at OS level can have some
chance of blocking in obscure cases, then we should add a wait event
for them.
Likewise, I agree that the prefetch call probably SHOULDN'T block, but
just because it shouldn't doesn't mean it never will.I think somebody should try a pgbench run with this patch applied,
using a scale factor greater than shared_buffers, and generate a wait
event profile, just to see if these are showing up and how often.
Yeah, that makes sense to me and we should try for both read-write and
read-only tests.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Mar 6, 2017 at 9:09 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
Sure, if you think both Writes and Reads at OS level can have some
chance of blocking in obscure cases, then we should add a wait event
for them.
I think writes have a chance of blocking in cases even in cases that
are not very obscure at all.
I think somebody should try a pgbench run with this patch applied,
using a scale factor greater than shared_buffers, and generate a wait
event profile, just to see if these are showing up and how often.Yeah, that makes sense to me and we should try for both read-write and
read-only tests.
wfm.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Mar 7, 2017 at 9:16 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Mar 6, 2017 at 9:09 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
Sure, if you think both Writes and Reads at OS level can have some
chance of blocking in obscure cases, then we should add a wait event
for them.I think writes have a chance of blocking in cases even in cases that
are not very obscure at all.
Point taken for writes, but I think in general we should have some
criteria based on which we can decide whether to have a wait event for
a particular call. It should not happen that we have tons of wait
events and out of which, only a few are helpful in most of the cases
in real-world scenarios.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Mar 7, 2017 at 9:32 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Mar 7, 2017 at 9:16 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Mar 6, 2017 at 9:09 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
Sure, if you think both Writes and Reads at OS level can have some
chance of blocking in obscure cases, then we should add a wait event
for them.I think writes have a chance of blocking in cases even in cases that
are not very obscure at all.Point taken for writes, but I think in general we should have some
criteria based on which we can decide whether to have a wait event for
a particular call. It should not happen that we have tons of wait
events and out of which, only a few are helpful in most of the cases
in real-world scenarios.
Well, the problem is that if you pick and choose which wait events to
add based on what you think will be common, you're actually kind of
hosing yourself. Because now when something uncommon happens, suddenly
you don't get any wait event data and you can't tell what's happening.
I think the number of new wait events added by Rushabh's patch is
wholly reasonable. Yeah, some of those are going to be a lot more
common than others, but so what? We add wait events so that we can
find out what's going on. I don't want to sometimes know when a
backend is blocked on an I/O. I want to ALWAYS know.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Mar 8, 2017 at 8:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Mar 7, 2017 at 9:32 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:On Tue, Mar 7, 2017 at 9:16 PM, Robert Haas <robertmhaas@gmail.com>
wrote:
On Mon, Mar 6, 2017 at 9:09 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
Sure, if you think both Writes and Reads at OS level can have some
chance of blocking in obscure cases, then we should add a wait event
for them.I think writes have a chance of blocking in cases even in cases that
are not very obscure at all.Point taken for writes, but I think in general we should have some
criteria based on which we can decide whether to have a wait event for
a particular call. It should not happen that we have tons of wait
events and out of which, only a few are helpful in most of the cases
in real-world scenarios.Well, the problem is that if you pick and choose which wait events to
add based on what you think will be common, you're actually kind of
hosing yourself. Because now when something uncommon happens, suddenly
you don't get any wait event data and you can't tell what's happening.
I think the number of new wait events added by Rushabh's patch is
wholly reasonable. Yeah, some of those are going to be a lot more
common than others, but so what? We add wait events so that we can
find out what's going on. I don't want to sometimes know when a
backend is blocked on an I/O. I want to ALWAYS know.
Yes, I agree with Robert. Knowing what we want and what we don't
want is difficult to judge. Something which we might think its not useful
information, and later of end up into situation where we re-think about
adding those missing stuff is not good. Having more information about
the system, specially for monitoring purpose is always good.
I am attaching another version of the patch, as I found stupid mistake
in the earlier version of patch, where I missed to initialize initial value
to
WaitEventIO enum. Also earlier version was not getting cleanly apply on
the current version of sources.
--
Rushabh Lathia
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
wait_event_for_disk_IO_v2.patchapplication/x-download; name=wait_event_for_disk_IO_v2.patchDownload
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index fad5cb0..3704f5d 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -716,6 +716,12 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
point.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>IO</>: The server process is waiting for a IO to complete.
+ <literal>wait_event</> will identify the specific wait point.
+ </para>
+ </listitem>
</itemizedlist>
</entry>
</row>
@@ -1264,6 +1270,272 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry><literal>RecoveryApplyDelay</></entry>
<entry>Waiting to apply WAL at recovery because it is delayed.</entry>
</row>
+ <row>
+ <entry morerows="65"><literal>IO</></entry>
+ <entry><literal>ReadDataBlock</></entry>
+ <entry>Waiting during relation data block read.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteDataBlock</></entry>
+ <entry>Waiting during relation data block write.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncDataBlock</></entry>
+ <entry>Waiting during relation data block sync.</entry>
+ </row>
+ <row>
+ <entry><literal>ExtendDataBlock</></entry>
+ <entry>Waiting during add a block to the relation.</entry>
+ </row>
+ <row>
+ <entry><literal>FlushDataBlock</></entry>
+ <entry>Waiting during write pages back to storage.</entry>
+ </row>
+ <row>
+ <entry><literal>PrefetchDataBlock</></entry>
+ <entry>Waiting during asynchronous read of the specified block of a relation.</entry>
+ </row>
+ <row>
+ <entry><literal>TruncateDataBlock</></entry>
+ <entry>Waiting during truncate relation to specified number of blocks.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncRelation</></entry>
+ <entry>Waiting during sync writes to stable storage.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncImmedRelation</></entry>
+ <entry>Waiting during immediate sync a relation to stable storage.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteRewriteDataBlock</></entry>
+ <entry>Waiting to write data block during rewrite heap.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncRewriteDataBlock</></entry>
+ <entry>Waiting to sync data block during rewrite heap.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadBuffile</></entry>
+ <entry>Waiting during buffile read operation.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteBuffile</></entry>
+ <entry>Waiting during buffile write operation.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadXLog</></entry>
+ <entry>Waiting during read the XLOG page.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadCopyXLog</></entry>
+ <entry>Wait to read the XLOG page during create a new XLOG file segment by copying a pre-existing one.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteXLog</></entry>
+ <entry>Waiting during write the XLOG page.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteInitXLogFile</></entry>
+ <entry>Waiting to write the XLOG page during XLOG file initialization.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteCopyXLogFile</></entry>
+ <entry>Waiting to write the XLOG page during create a new XLOG file segment by copying a pre-existing one.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteBootstrapXLog</></entry>
+ <entry>Waiting to write the XLOG page during bootstrap.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncInitXLogFile</></entry>
+ <entry>Waiting to sync the XLOG page during XLOG file initialization.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncCopyXLogFile</></entry>
+ <entry>Waiting to sync the XLOG page during create a new XLOG file segment by copying a pre-existing one.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncBootstrapXLog</></entry>
+ <entry>Waiting to sync the XLOG page during bootstrap.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncAssignXLogMethod</></entry>
+ <entry>Waiting to assign xlog sync method.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteControlFile</></entry>
+ <entry>Waiting to write the control file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteUpdateControlFile</></entry>
+ <entry>Waiting to write the control file during update control file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadControlFile</></entry>
+ <entry>Waiting to read the control file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncWriteControlFile</></entry>
+ <entry>Waiting to sync the control file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncUpdateControlFile</></entry>
+ <entry>Waiting to sync the control file during update control file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadApplyLogicalMapping</></entry>
+ <entry>Waiting to read logical mapping during apply a single mapping file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteLogicalMappingRewrite</></entry>
+ <entry>Waiting to write logical mapping during xlog logical rewrite.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncLogicalMappingRewrite</></entry>
+ <entry>Waiting to sync logical mapping during xlog logical rewrite.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncLogicalMappingRewriteHeap</></entry>
+ <entry>Waiting to sync logical mapping during a checkpoint for logical rewrite mappings.</entry>
+ </row>
+ <row>
+ <entry><literal>TruncateLogicalMappingRewrite</></entry>
+ <entry>Waiting to truncate logical mapping during xlog logical rewrite.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteSnapbuildSerialize</></entry>
+ <entry>Waiting to write snapshot during serialize the snapshot.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadSnapbuildSerialize</></entry>
+ <entry>Waiting to read snapshot during serialize the snapshot.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncSnapbuildSerialize</></entry>
+ <entry>Waiting to sync snapshot during serialize the snapshot.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadSLRUPage</></entry>
+ <entry>Waiting to read page during physical read of a page into a buffer slot.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteSLRUPage</></entry>
+ <entry>Waiting to write page during physical write of a page from a buffer slot.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncSLRUWritePage</></entry>
+ <entry>Waiting to sync page during physical write of a page from a buffer slot.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncSLRUFlush</></entry>
+ <entry>Waiting to sync page during flush dirty pages to disk during checkpoint or database shutdown.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadTimelineHistoryWrite</></entry>
+ <entry>Waiting to read timeline history during write timeline history.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadTimelineHistoryWalsender</></entry>
+ <entry>Waiting to read timeline history during walsander timeline command.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteTimelineHistory</></entry>
+ <entry>Waiting to write timeline history.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteTimelineHistoryFile</></entry>
+ <entry>Waiting to write timeline history during a history file write for given timeline and contents.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncTimelineHistoryWrite</></entry>
+ <entry>Waiting to sync timeline history during write timeline history</entry>
+ </row>
+ <row>
+ <entry><literal>SyncTimelineHistoryFile</></entry>
+ <entry>Waiting to sync timeline history during a history file write for given timeline and contents.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadTwophaseFile</></entry>
+ <entry>Waiting to read two phase file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteRecreateTwophaseFile</></entry>
+ <entry>Waiting to write two phase file during recreate two phase file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncRecreateTwophaseFile</></entry>
+ <entry>Waiting to sync two phase file during recreate two phase file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadSysloggerFile</></entry>
+ <entry>Wait during read syslogger file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteSysloggerFile</></entry>
+ <entry>Wait during write syslogger file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadRestorREPLSlot</></entry>
+ <entry>Wait to read REPL slot during load a single slot from disk into memory.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteREPLSlot</></entry>
+ <entry>Wait to write REPL slot.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncRestoreREPLSlot</></entry>
+ <entry>Wait to sync REPL slot during load a single slot from disk into memory.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadCopyFile</></entry>
+ <entry>Waiting to read during copy file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteCopyFile</></entry>
+ <entry>Waiting to write during copy file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadLoadRELMAPFile</></entry>
+ <entry>Waiting to read RELMAP file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteLoadRELMAPFile</></entry>
+ <entry>Waiting to write RELMAP file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncLoadRELMAPFile</></entry>
+ <entry>Waiting to sync RELMAP file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadCreateLockFile</></entry>
+ <entry>Wait to read lock file during create lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadAddToDataDirLockFile</></entry>
+ <entry>Wait to read lock file during add a line in the data directory lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>ReadRecheckDataDirLockFile</></entry>
+ <entry>Wait to read lock file during recheck that the data directory lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteCreateLockFile</></entry>
+ <entry>Wait to write lock file during create lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>WriteAddToDataDirLockFile</></entry>
+ <entry>Wait to write lock file during add a line in the data directory lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncCreateLockFile</></entry>
+ <entry>Wait to sync lock file during create lock file.</entry>
+ </row>
+ <row>
+ <entry><literal>SyncAddToDataDirLockFile</></entry>
+ <entry>Wait to sync lock file during add a line in the data directory lock file.</entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index c7b283c..2456500 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -119,6 +119,8 @@
#include "lib/ilist.h"
+#include "pgstat.h"
+
#include "replication/logical.h"
#include "replication/slot.h"
@@ -916,7 +918,8 @@ logical_heap_rewrite_flush_mappings(RewriteState state)
* Note that we deviate from the usual WAL coding practices here,
* check the above "Logical rewrite support" comment for reasoning.
*/
- written = FileWrite(src->vfd, waldata_start, len);
+ written = FileWrite(src->vfd, waldata_start, len,
+ WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK);
if (written != len)
ereport(ERROR,
(errcode_for_file_access(),
@@ -957,7 +960,7 @@ logical_end_heap_rewrite(RewriteState state)
hash_seq_init(&seq_status, state->rs_logical_mappings);
while ((src = (RewriteMappingFile *) hash_seq_search(&seq_status)) != NULL)
{
- if (FileSync(src->vfd) != 0)
+ if (FileSync(src->vfd, WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", src->path)));
@@ -1141,11 +1144,13 @@ heap_xlog_logical_rewrite(XLogReaderState *r)
* Truncate all data that's not guaranteed to have been safely fsynced (by
* previous record or by the last checkpoint).
*/
+ pgstat_report_wait_start(WAIT_EVENT_TRUNCATE_LOGICAL_MAPPING_REWRITE);
if (ftruncate(fd, xlrec->offset) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not truncate file \"%s\" to %u: %m",
path, (uint32) xlrec->offset)));
+ pgstat_report_wait_end();
/* now seek to the position we want to write our data to */
if (lseek(fd, xlrec->offset, SEEK_SET) != xlrec->offset)
@@ -1159,20 +1164,24 @@ heap_xlog_logical_rewrite(XLogReaderState *r)
len = xlrec->num_mappings * sizeof(LogicalRewriteMappingData);
/* write out tail end of mapping file (again) */
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_LOGICAL_MAPPING_REWRITE);
if (write(fd, data, len) != len)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", path)));
+ pgstat_report_wait_end();
/*
* Now fsync all previously written data. We could improve things and only
* do this for the last write to a file, but the required bookkeeping
* doesn't seem worth the trouble.
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", path)));
+ pgstat_report_wait_end();
CloseTransientFile(fd);
}
@@ -1266,10 +1275,12 @@ CheckPointLogicalRewriteHeap(void)
* changed or have only been created since the checkpoint's start,
* but it's currently not deemed worth the effort.
*/
- else if (pg_fsync(fd) != 0)
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE_HEAP);
+ if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", path)));
+ pgstat_report_wait_end();
CloseTransientFile(fd);
}
}
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index a66ef5c..237e8eb 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -54,6 +54,7 @@
#include "access/slru.h"
#include "access/transam.h"
#include "access/xlog.h"
+#include "pgstat.h"
#include "storage/fd.h"
#include "storage/shmem.h"
#include "miscadmin.h"
@@ -675,13 +676,16 @@ SlruPhysicalReadPage(SlruCtl ctl, int pageno, int slotno)
}
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_READ_SLRU_PAGE);
if (read(fd, shared->page_buffer[slotno], BLCKSZ) != BLCKSZ)
{
+ pgstat_report_wait_end();
slru_errcause = SLRU_READ_FAILED;
slru_errno = errno;
CloseTransientFile(fd);
return false;
}
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
{
@@ -834,8 +838,10 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
}
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_SLRU_PAGE);
if (write(fd, shared->page_buffer[slotno], BLCKSZ) != BLCKSZ)
{
+ pgstat_report_wait_end();
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
@@ -845,6 +851,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
CloseTransientFile(fd);
return false;
}
+ pgstat_report_wait_end();
/*
* If not part of Flush, need to fsync now. We assume this happens
@@ -852,6 +859,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
*/
if (!fdata)
{
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_SLRU_WRITE_PAGE);
if (ctl->do_fsync && pg_fsync(fd))
{
slru_errcause = SLRU_FSYNC_FAILED;
@@ -859,6 +867,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
CloseTransientFile(fd);
return false;
}
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
{
@@ -1126,6 +1135,7 @@ SimpleLruFlush(SlruCtl ctl, bool allow_redirtied)
ok = true;
for (i = 0; i < fdata.num_files; i++)
{
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_SLRU_FLUSH);
if (ctl->do_fsync && pg_fsync(fdata.fd[i]))
{
slru_errcause = SLRU_FSYNC_FAILED;
@@ -1133,6 +1143,7 @@ SimpleLruFlush(SlruCtl ctl, bool allow_redirtied)
pageno = fdata.segno[i] * SLRU_PAGES_PER_SEGMENT;
ok = false;
}
+ pgstat_report_wait_end();
if (CloseTransientFile(fdata.fd[i]))
{
diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index 1fdc591..5ca811f 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -38,6 +38,7 @@
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "access/xlogdefs.h"
+#include "pgstat.h"
#include "storage/fd.h"
/*
@@ -338,7 +339,9 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
for (;;)
{
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_READ_TIMELINE_HISTORY_WRITE);
nbytes = (int) read(srcfd, buffer, sizeof(buffer));
+ pgstat_report_wait_end();
if (nbytes < 0 || errno != 0)
ereport(ERROR,
(errcode_for_file_access(),
@@ -346,6 +349,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
if (nbytes == 0)
break;
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_TIMELINE_HISTORY);
if ((int) write(fd, buffer, nbytes) != nbytes)
{
int save_errno = errno;
@@ -365,6 +369,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
}
CloseTransientFile(srcfd);
}
@@ -400,10 +405,12 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_TIMELINE_HISTORY_WRITE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
ereport(ERROR,
@@ -460,6 +467,7 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size)
errmsg("could not create file \"%s\": %m", tmppath)));
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_TIMELINE_HISTORY_FILE);
if ((int) write(fd, content, size) != size)
{
int save_errno = errno;
@@ -475,11 +483,14 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_TIMELINE_HISTORY_FILE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
ereport(ERROR,
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 0a8edb9..1f8bfd6 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1200,8 +1200,10 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
*/
buf = (char *) palloc(stat.st_size);
+ pgstat_report_wait_start(WAIT_EVENT_READ_TWOPHASE_FILE);
if (read(fd, buf, stat.st_size) != stat.st_size)
{
+ pgstat_report_wait_end();
CloseTransientFile(fd);
if (give_warnings)
ereport(WARNING,
@@ -1212,6 +1214,7 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
return NULL;
}
+ pgstat_report_wait_end();
CloseTransientFile(fd);
hdr = (TwoPhaseFileHeader *) buf;
@@ -1542,8 +1545,10 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
path)));
/* Write content and CRC */
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_RECREATE_TWOPHASE_FILE);
if (write(fd, content, len) != len)
{
+ pgstat_report_wait_end();
CloseTransientFile(fd);
ereport(ERROR,
(errcode_for_file_access(),
@@ -1551,16 +1556,19 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
}
if (write(fd, &statefile_crc, sizeof(pg_crc32c)) != sizeof(pg_crc32c))
{
+ pgstat_report_wait_end();
CloseTransientFile(fd);
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not write two-phase state file: %m")));
}
+ pgstat_report_wait_end();
/*
* We must fsync the file because the end-of-replay checkpoint will not do
* so, there being no GXACT in shared memory yet to tell it to.
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_RECREATE_TWOPHASE_FILE);
if (pg_fsync(fd) != 0)
{
CloseTransientFile(fd);
@@ -1568,6 +1576,7 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
(errcode_for_file_access(),
errmsg("could not fsync two-phase state file: %m")));
}
+ pgstat_report_wait_end();
if (CloseTransientFile(fd) != 0)
ereport(ERROR,
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 744360c..fa18b7e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2456,7 +2456,9 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
do
{
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_XLOG);
written = write(openLogFile, from, nleft);
+ pgstat_report_wait_end();
if (written <= 0)
{
if (errno == EINTR)
@@ -3207,6 +3209,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
for (nbytes = 0; nbytes < XLogSegSize; nbytes += XLOG_BLCKSZ)
{
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_INIT_XLOG_FILE);
if ((int) write(fd, zbuffer, XLOG_BLCKSZ) != (int) XLOG_BLCKSZ)
{
int save_errno = errno;
@@ -3225,8 +3228,10 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
}
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_INIT_XLOG_FILE);
if (pg_fsync(fd) != 0)
{
close(fd);
@@ -3234,6 +3239,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
if (close(fd))
ereport(ERROR,
@@ -3360,6 +3366,7 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
if (nread > sizeof(buffer))
nread = sizeof(buffer);
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_READ_COPY_XLOG);
if (read(srcfd, buffer, nread) != nread)
{
if (errno != 0)
@@ -3372,8 +3379,10 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
(errmsg("not enough data in file \"%s\"",
path)));
}
+ pgstat_report_wait_end();
}
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_COPY_XLOG_FILE);
if ((int) write(fd, buffer, sizeof(buffer)) != (int) sizeof(buffer))
{
int save_errno = errno;
@@ -3389,12 +3398,15 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
}
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_COPY_XLOG_FILE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
ereport(ERROR,
@@ -4414,6 +4426,7 @@ WriteControlFile(void)
XLOG_CONTROL_FILE)));
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_CONTROL_FILE);
if (write(fd, buffer, PG_CONTROL_SIZE) != PG_CONTROL_SIZE)
{
/* if write didn't set errno, assume problem is no disk space */
@@ -4423,11 +4436,14 @@ WriteControlFile(void)
(errcode_for_file_access(),
errmsg("could not write to control file: %m")));
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_WRITE_CONTROL_FILE);
if (pg_fsync(fd) != 0)
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not fsync control file: %m")));
+ pgstat_report_wait_end();
if (close(fd))
ereport(PANIC,
@@ -4453,10 +4469,12 @@ ReadControlFile(void)
errmsg("could not open control file \"%s\": %m",
XLOG_CONTROL_FILE)));
+ pgstat_report_wait_start(WAIT_EVENT_READ_CONTROL_FILE);
if (read(fd, ControlFile, sizeof(ControlFileData)) != sizeof(ControlFileData))
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not read from control file: %m")));
+ pgstat_report_wait_end();
close(fd);
@@ -4634,6 +4652,7 @@ UpdateControlFile(void)
XLOG_CONTROL_FILE)));
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_UPDATE_CONTROL_FILE);
if (write(fd, ControlFile, sizeof(ControlFileData)) != sizeof(ControlFileData))
{
/* if write didn't set errno, assume problem is no disk space */
@@ -4643,11 +4662,14 @@ UpdateControlFile(void)
(errcode_for_file_access(),
errmsg("could not write to control file: %m")));
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_UPDATE_CONTROL_FILE);
if (pg_fsync(fd) != 0)
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not fsync control file: %m")));
+ pgstat_report_wait_end();
if (close(fd))
ereport(PANIC,
@@ -5036,6 +5058,7 @@ BootStrapXLOG(void)
/* Write the first page with the initial record */
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_BOOTSTRAP_XLOG);
if (write(openLogFile, page, XLOG_BLCKSZ) != XLOG_BLCKSZ)
{
/* if write didn't set errno, assume problem is no disk space */
@@ -5045,11 +5068,14 @@ BootStrapXLOG(void)
(errcode_for_file_access(),
errmsg("could not write bootstrap transaction log file: %m")));
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_BOOTSTRAP_XLOG);
if (pg_fsync(openLogFile) != 0)
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not fsync bootstrap transaction log file: %m")));
+ pgstat_report_wait_end();
if (close(openLogFile))
ereport(PANIC,
@@ -9999,11 +10025,13 @@ assign_xlog_sync_method(int new_sync_method, void *extra)
*/
if (openLogFile >= 0)
{
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_ASSIGN_XLOG_SYNC_METHOD);
if (pg_fsync(openLogFile) != 0)
ereport(PANIC,
(errcode_for_file_access(),
errmsg("could not fsync log segment %s: %m",
XLogFileNameP(ThisTimeLineID, openLogSegNo))));
+ pgstat_report_wait_end();
if (get_sync_bit(sync_method) != get_sync_bit(new_sync_method))
XLogFileClose();
}
@@ -11456,6 +11484,7 @@ retry:
goto next_record_is_invalid;
}
+ pgstat_report_wait_start(WAIT_EVENT_READ_XLOG);
if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
{
char fname[MAXFNAMELEN];
@@ -11467,6 +11496,7 @@ retry:
fname, readOff)));
goto next_record_is_invalid;
}
+ pgstat_report_wait_end();
Assert(targetSegNo == readSegNo);
Assert(targetPageOff == readOff);
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 8b99b78..d17d541 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -24,6 +24,7 @@
#include "access/xlogutils.h"
#include "catalog/catalog.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "storage/smgr.h"
#include "utils/guc.h"
#include "utils/hsearch.h"
@@ -728,7 +729,9 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count)
else
segbytes = nbytes;
+ pgstat_report_wait_start(WAIT_EVENT_READ_XLOG);
readbytes = read(sendFile, p, segbytes);
+ pgstat_report_wait_end();
if (readbytes <= 0)
{
char path[MAXPGPATH];
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 2fb9a8b..4ec704e 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -280,6 +280,7 @@ static const char *pgstat_get_wait_activity(WaitEventActivity w);
static const char *pgstat_get_wait_client(WaitEventClient w);
static const char *pgstat_get_wait_ipc(WaitEventIPC w);
static const char *pgstat_get_wait_timeout(WaitEventTimeout w);
+static const char *pgstat_get_wait_io(WaitEventIO w);
static void pgstat_setheader(PgStat_MsgHdr *hdr, StatMsgType mtype);
static void pgstat_send(void *msg, int len);
@@ -3176,6 +3177,9 @@ pgstat_get_wait_event_type(uint32 wait_event_info)
case PG_WAIT_TIMEOUT:
event_type = "Timeout";
break;
+ case PG_WAIT_IO:
+ event_type = "IO";
+ break;
default:
event_type = "???";
break;
@@ -3246,6 +3250,13 @@ pgstat_get_wait_event(uint32 wait_event_info)
event_name = pgstat_get_wait_timeout(w);
break;
}
+ case PG_WAIT_IO:
+ {
+ WaitEventIO w = (WaitEventIO) wait_event_info;
+
+ event_name = pgstat_get_wait_io(w);
+ break;
+ }
default:
event_name = "unknown wait event";
break;
@@ -3436,6 +3447,247 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
}
/* ----------
+ * pgstat_get_wait_io() -
+ *
+ * Convert WaitEventIO to string.
+ * ----------
+ */
+static const char *
+pgstat_get_wait_io(WaitEventIO w)
+{
+ const char *event_name = "unknown wait event";
+
+ switch (w)
+ {
+ case WAIT_EVENT_READ_DATA_BLOCK:
+ event_name = "ReadDataBlock";
+ break;
+ case WAIT_EVENT_WRITE_DATA_BLOCK:
+ event_name = "WriteDataBlock";
+ break;
+ case WAIT_EVENT_SYNC_DATA_BLOCK:
+ event_name = "SyncDataBlock";
+ break;
+ case WAIT_EVENT_EXTEND_DATA_BLOCK:
+ event_name = "ExtendDataBlock";
+ break;
+ case WAIT_EVENT_FLUSH_DATA_BLOCK:
+ event_name = "FlushDataBlock";
+ break;
+ case WAIT_EVENT_PREFETCH_DATA_BLOCK:
+ event_name = "PrefetchDataBlock";
+ break;
+ case WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS:
+ event_name = "TruncateDataBlock";
+ break;
+ case WAIT_EVENT_SYNC_RELATION:
+ event_name = "SyncRelation";
+ break;
+ case WAIT_EVENT_SYNC_IMMED_RELATION:
+ event_name = "SyncImmedRelation";
+ break;
+ case WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK:
+ event_name = "WriteRewriteDataBlock";
+ break;
+ case WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK:
+ event_name = "SyncRewriteDataBlock";
+ break;
+ case WAIT_EVENT_READ_BUFFILE:
+ event_name = "ReadBuffile";
+ break;
+ case WAIT_EVENT_WRITE_BUFFILE:
+ event_name = "WriteBuffile";
+ break;
+ /* XLOG wait event */
+ case WAIT_EVENT_READ_XLOG:
+ event_name = "ReadXLog";
+ break;
+ case WAIT_EVENT_READ_COPY_XLOG:
+ event_name = "ReadCopyXLog";
+ break;
+ case WAIT_EVENT_WRITE_XLOG:
+ event_name = "WriteXLog";
+ break;
+ case WAIT_EVENT_WRITE_INIT_XLOG_FILE:
+ event_name = "WriteInitXLogFile";
+ break;
+ case WAIT_EVENT_WRITE_COPY_XLOG_FILE:
+ event_name = "WriteCopyXLogFile";
+ break;
+ case WAIT_EVENT_WRITE_BOOTSTRAP_XLOG:
+ event_name = "WriteBootstrapXLog";
+ break;
+ case WAIT_EVENT_SYNC_INIT_XLOG_FILE:
+ event_name = "SyncInitXLogFile";
+ break;
+ case WAIT_EVENT_SYNC_COPY_XLOG_FILE:
+ event_name = "SyncCopyXLogFile";
+ break;
+ case WAIT_EVENT_SYNC_BOOTSTRAP_XLOG:
+ event_name = "SyncBootStrapXLog";
+ break;
+ case WAIT_EVENT_SYNC_ASSIGN_XLOG_SYNC_METHOD:
+ event_name = "SyncAssignXLogMethod";
+ break;
+ /* Control file wait events */
+ case WAIT_EVENT_WRITE_CONTROL_FILE:
+ event_name = "WriteControlFile";
+ break;
+ case WAIT_EVENT_WRITE_UPDATE_CONTROL_FILE:
+ event_name = "WriteUpdateControlFile";
+ break;
+ case WAIT_EVENT_READ_CONTROL_FILE:
+ event_name = "ReadControlFile";
+ break;
+ case WAIT_EVENT_SYNC_WRITE_CONTROL_FILE:
+ event_name = "SyncWriteControlFile";
+ break;
+ case WAIT_EVENT_SYNC_UPDATE_CONTROL_FILE:
+ event_name = "SyncUpdateControlFile";
+ break;
+ /* reorder buffer wait event */
+ case WAIT_EVENT_READ_REORDER_BUFFER:
+ event_name = "ReadReorderBuffer";
+ break;
+ case WAIT_EVENT_WRITE_REORDER_BUFFER:
+ event_name = "WriteReorderBuffer";
+ break;
+ /* logical mapping wait event */
+ case WAIT_EVENT_READ_APPLY_LOGICAL_MAPPING:
+ event_name = "ReadApplyLogicalMapping";
+ break;
+ case WAIT_EVENT_WRITE_LOGICAL_MAPPING_REWRITE:
+ event_name = "WriteLogicalMappingRewrite";
+ break;
+ case WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE:
+ event_name = "SyncLogicalMappingRewrite";
+ break;
+ case WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE_HEAP:
+ event_name = "SyncLogicalMappingRewriteHeap";
+ break;
+ case WAIT_EVENT_TRUNCATE_LOGICAL_MAPPING_REWRITE:
+ event_name = "TruncateLogicalMappingRewrite";
+ break;
+ /* Snapbuild wait event */
+ case WAIT_EVENT_WRITE_SNAPBUILD_SERIALIZE:
+ event_name = "WriteSnapbuildSerialize";
+ break;
+ case WAIT_EVENT_READ_SNAPBUILD_RESTORE:
+ event_name = "ReadSnapbuildRestore";
+ break;
+ case WAIT_EVENT_SYNC_SNAPBUILD_SERIALIZE:
+ event_name = "SyncSnapbuildSerialize";
+ break;
+ /* SLRU wait event */
+ case WAIT_EVENT_READ_SLRU_PAGE:
+ event_name = "ReadSLRUPage";
+ break;
+ case WAIT_EVENT_WRITE_SLRU_PAGE:
+ event_name = "WriteSLRUPage";
+ break;
+ case WAIT_EVENT_SYNC_SLRU_FLUSH:
+ event_name = "SyncSLRUFlush";
+ break;
+ case WAIT_EVENT_SYNC_SLRU_WRITE_PAGE:
+ event_name = "SyncSLRUWritePage";
+ break;
+ /* TIMELINE HISTORY wait event */
+ case WAIT_EVENT_READ_TIMELINE_HISTORY_WALSENDER:
+ event_name = "ReadTimelineHistoryWalsender";
+ break;
+ case WAIT_EVENT_WRITE_TIMELINE_HISTORY:
+ event_name = "WriteTimelineHistory";
+ break;
+ case WAIT_EVENT_WRITE_TIMELINE_HISTORY_FILE:
+ event_name = "WriteTimelineHistoryFile";
+ break;
+ case WAIT_EVENT_READ_TIMELINE_HISTORY_WRITE:
+ event_name = "ReadTimelineHistoryWrite";
+ break;
+ case WAIT_EVENT_SYNC_TIMELINE_HISTORY_WRITE:
+ event_name = "SyncTimelineHistoryWrite";
+ break;
+ case WAIT_EVENT_SYNC_TIMELINE_HISTORY_FILE:
+ event_name = "SyncTimelineHistoryFile";
+ break;
+ /* TWOPHASE FILE wait event */
+ case WAIT_EVENT_READ_TWOPHASE_FILE:
+ event_name = "ReadTwophaseFile";
+ break;
+ case WAIT_EVENT_WRITE_RECREATE_TWOPHASE_FILE:
+ event_name = "WriteRecreateTwophaseFile";
+ break;
+ case WAIT_EVENT_SYNC_RECREATE_TWOPHASE_FILE:
+ event_name = "SyncRecreateTwophaseFile";
+ break;
+ /* SYSLOGGER wait event */
+ case WAIT_EVENT_READ_SYSLOGGER_FILE:
+ event_name = "ReadSysloggerFile";
+ break;
+ case WAIT_EVENT_WRITE_SYSLOGGER_FILE:
+ event_name = "WriteSysloggerFile";
+ break;
+ /* REPLSLOT wait event */
+ case WAIT_EVENT_READ_RESTORE_REPLSLOT:
+ event_name = "ReadRestorREPLSlot";
+ break;
+ case WAIT_EVENT_WRITE_REPLSLOT:
+ event_name = "WriteREPLSlot";
+ break;
+ case WAIT_EVENT_SYNC_RESTORE_REPLSLOT:
+ event_name = "SyncRestoreREPLSlot";
+ break;
+ case WAIT_EVENT_SYNC_SAVE_REPLSLOT:
+ event_name = "SyncSaveREPLSlot";
+ break;
+ /* COPYDIR IO wait event */
+ case WAIT_EVENT_READ_COPY_FILE:
+ event_name = "ReadCopyFile";
+ break;
+ case WAIT_EVENT_WRITE_COPY_FILE:
+ event_name = "WriteCopyFile";
+ break;
+ /* RELMAP IO wait event */
+ case WAIT_EVENT_READ_LOAD_RELMAP_FILE:
+ event_name = "ReadLoadRELMAPFile";
+ break;
+ case WAIT_EVENT_WRITE_RELMAP_FILE:
+ event_name = "WriteRELMAPFile";
+ break;
+ case WAIT_EVENT_SYNC_WRITE_RELMAP_FILE:
+ event_name = "SyncWriteRELMAFile";
+ break;
+ /* LOCK FILE IO wait event */
+ case WAIT_EVENT_READ_CREATE_LOCK_FILE:
+ event_name = "ReadCreateLockFile";
+ break;
+ case WAIT_EVENT_READ_ADDTODATEDIR_LOCK_FILE:
+ event_name = "ReadAddToDataDirLockFile";
+ break;
+ case WAIT_EVENT_READ_RECHECKDATADIR_LOCK_FILE:
+ event_name = "ReadRecheckDataDirLockFile";
+ break;
+ case WAIT_EVENT_WRITE_CREATE_LOCK_FILE:
+ event_name = "WriteCreateLockFile";
+ break;
+ case WAIT_EVENT_WRITE_ADDTODATEDIR_LOCK_FILE:
+ event_name = "WriteAddToDataDirLockFile";
+ break;
+ case WAIT_EVENT_SYNC_ADDTODATEDIR_LOCK_FILE:
+ event_name = "SyncAddToDataDirLockFile";
+ break;
+ case WAIT_EVENT_SYNC_CREATE_LOCK_FILE:
+ event_name = "SyncCreateLockFile";
+ break;
+
+ /* no default case, so that compiler will warn */
+ }
+
+ return event_name;
+}
+
+
+/* ----------
* pgstat_get_backend_current_activity() -
*
* Return a string representing the current activity of the backend with
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index aaefdae..9328814 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -441,9 +441,11 @@ SysLoggerMain(int argc, char *argv[])
{
int bytesRead;
+ pgstat_report_wait_start(WAIT_EVENT_READ_SYSLOGGER_FILE);
bytesRead = read(syslogPipe[0],
logbuffer + bytes_in_logbuffer,
sizeof(logbuffer) - bytes_in_logbuffer);
+ pgstat_report_wait_end();
if (bytesRead < 0)
{
if (errno != EINTR)
@@ -1001,7 +1003,9 @@ write_syslogger_file(const char *buffer, int count, int destination)
open_csvlogfile();
logfile = destination == LOG_DESTINATION_CSVLOG ? csvlogFile : syslogFile;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_SYSLOGGER_FILE);
rc = fwrite(buffer, 1, count, logfile);
+ pgstat_report_wait_end();
/* can't use ereport here because of possible recursion */
if (rc != count)
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 8aac670..767ea5e 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -58,6 +58,7 @@
#include "catalog/catalog.h"
#include "lib/binaryheap.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "replication/logical.h"
#include "replication/reorderbuffer.h"
#include "replication/slot.h"
@@ -2275,6 +2276,7 @@ ReorderBufferSerializeChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
ondisk->size = sz;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_REORDER_BUFFER);
if (write(fd, rb->outbuf, ondisk->size) != ondisk->size)
{
int save_errno = errno;
@@ -2286,6 +2288,7 @@ ReorderBufferSerializeChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
errmsg("could not write to data file for XID %u: %m",
txn->xid)));
}
+ pgstat_report_wait_end();
Assert(ondisk->change.action == change->action);
}
@@ -2366,7 +2369,9 @@ ReorderBufferRestoreChanges(ReorderBuffer *rb, ReorderBufferTXN *txn,
* end of this file.
*/
ReorderBufferSerializeReserve(rb, sizeof(ReorderBufferDiskChange));
+ pgstat_report_wait_start(WAIT_EVENT_READ_REORDER_BUFFER);
readBytes = read(*fd, rb->outbuf, sizeof(ReorderBufferDiskChange));
+ pgstat_report_wait_end();
/* eof */
if (readBytes == 0)
@@ -2393,8 +2398,10 @@ ReorderBufferRestoreChanges(ReorderBuffer *rb, ReorderBufferTXN *txn,
sizeof(ReorderBufferDiskChange) + ondisk->size);
ondisk = (ReorderBufferDiskChange *) rb->outbuf;
+ pgstat_report_wait_start(WAIT_EVENT_READ_REORDER_BUFFER);
readBytes = read(*fd, rb->outbuf + sizeof(ReorderBufferDiskChange),
ondisk->size - sizeof(ReorderBufferDiskChange));
+ pgstat_report_wait_end();
if (readBytes < 0)
ereport(ERROR,
@@ -3047,7 +3054,9 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
memset(&key, 0, sizeof(ReorderBufferTupleCidKey));
/* read all mappings till the end of the file */
+ pgstat_report_wait_start(WAIT_EVENT_READ_APPLY_LOGICAL_MAPPING);
readBytes = read(fd, &map, sizeof(LogicalRewriteMappingData));
+ pgstat_report_wait_end();
if (readBytes < 0)
ereport(ERROR,
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 6f19cdc..c33bf48 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -115,6 +115,8 @@
#include "access/transam.h"
#include "access/xact.h"
+#include "pgstat.h"
+
#include "replication/logical.h"
#include "replication/reorderbuffer.h"
#include "replication/snapbuild.h"
@@ -1580,6 +1582,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
ereport(ERROR,
(errmsg("could not open file \"%s\": %m", path)));
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_SNAPBUILD_SERIALIZE);
if ((write(fd, ondisk, needed_length)) != needed_length)
{
CloseTransientFile(fd);
@@ -1587,6 +1590,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
/*
* fsync the file before renaming so that even if we crash after this we
@@ -1596,6 +1600,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
* some noticeable overhead since it's performed synchronously during
* decoding?
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_SNAPBUILD_SERIALIZE);
if (pg_fsync(fd) != 0)
{
CloseTransientFile(fd);
@@ -1603,6 +1608,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
}
+ pgstat_report_wait_end();
CloseTransientFile(fd);
fsync_fname("pg_logical/snapshots", true);
@@ -1677,7 +1683,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
/* read statically sized portion of snapshot */
+ pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE);
readBytes = read(fd, &ondisk, SnapBuildOnDiskConstantSize);
+ pgstat_report_wait_end();
if (readBytes != SnapBuildOnDiskConstantSize)
{
CloseTransientFile(fd);
@@ -1703,7 +1711,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
SnapBuildOnDiskConstantSize - SnapBuildOnDiskNotChecksummedSize);
/* read SnapBuild */
+ pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE);
readBytes = read(fd, &ondisk.builder, sizeof(SnapBuild));
+ pgstat_report_wait_end();
if (readBytes != sizeof(SnapBuild))
{
CloseTransientFile(fd);
@@ -1717,7 +1727,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
/* restore running xacts information */
sz = sizeof(TransactionId) * ondisk.builder.running.xcnt_space;
ondisk.builder.running.xip = MemoryContextAllocZero(builder->context, sz);
+ pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE);
readBytes = read(fd, ondisk.builder.running.xip, sz);
+ pgstat_report_wait_end();
if (readBytes != sz)
{
CloseTransientFile(fd);
@@ -1731,7 +1743,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
/* restore committed xacts information */
sz = sizeof(TransactionId) * ondisk.builder.committed.xcnt;
ondisk.builder.committed.xip = MemoryContextAllocZero(builder->context, sz);
+ pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE);
readBytes = read(fd, ondisk.builder.committed.xip, sz);
+ pgstat_report_wait_end();
if (readBytes != sz)
{
CloseTransientFile(fd);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 10d69d0..d4ebcd9 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -43,6 +43,7 @@
#include "access/xlog_internal.h"
#include "common/string.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "replication/slot.h"
#include "storage/fd.h"
#include "storage/proc.h"
@@ -1100,10 +1101,12 @@ SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel)
SnapBuildOnDiskChecksummedSize);
FIN_CRC32C(cp.checksum);
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_REPLSLOT);
if ((write(fd, &cp, sizeof(cp))) != sizeof(cp))
{
int save_errno = errno;
+ pgstat_report_wait_end();
CloseTransientFile(fd);
errno = save_errno;
ereport(elevel,
@@ -1112,8 +1115,10 @@ SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel)
tmppath)));
return;
}
+ pgstat_report_wait_end();
/* fsync the temporary file */
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_SAVE_REPLSLOT);
if (pg_fsync(fd) != 0)
{
int save_errno = errno;
@@ -1126,6 +1131,7 @@ SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel)
tmppath)));
return;
}
+ pgstat_report_wait_end();
CloseTransientFile(fd);
@@ -1202,6 +1208,7 @@ RestoreSlotFromDisk(const char *name)
* Sync state file before we're reading from it. We might have crashed
* while it wasn't synced yet and we shouldn't continue on that basis.
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_RESTORE_REPLSLOT);
if (pg_fsync(fd) != 0)
{
CloseTransientFile(fd);
@@ -1210,6 +1217,7 @@ RestoreSlotFromDisk(const char *name)
errmsg("could not fsync file \"%s\": %m",
path)));
}
+ pgstat_report_wait_end();
/* Also sync the parent directory */
START_CRIT_SECTION();
@@ -1217,7 +1225,9 @@ RestoreSlotFromDisk(const char *name)
END_CRIT_SECTION();
/* read part of statefile that's guaranteed to be version independent */
+ pgstat_report_wait_start(WAIT_EVENT_READ_RESTORE_REPLSLOT);
readBytes = read(fd, &cp, ReplicationSlotOnDiskConstantSize);
+ pgstat_report_wait_end();
if (readBytes != ReplicationSlotOnDiskConstantSize)
{
int saved_errno = errno;
@@ -1253,9 +1263,11 @@ RestoreSlotFromDisk(const char *name)
path, cp.length)));
/* Now that we know the size, read the entire file */
+ pgstat_report_wait_start(WAIT_EVENT_READ_RESTORE_REPLSLOT);
readBytes = read(fd,
(char *) &cp + ReplicationSlotOnDiskConstantSize,
cp.length);
+ pgstat_report_wait_end();
if (readBytes != cp.length)
{
int saved_errno = errno;
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index dd3a936..9be0185 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -462,7 +462,9 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
char rbuf[BLCKSZ];
int nread;
+ pgstat_report_wait_start(WAIT_EVENT_READ_TIMELINE_HISTORY_WALSENDER);
nread = read(fd, rbuf, sizeof(rbuf));
+ pgstat_report_wait_end();
if (nread <= 0)
ereport(ERROR,
(errcode_for_file_access(),
@@ -2076,7 +2078,9 @@ retry:
else
segbytes = nbytes;
+ pgstat_report_wait_start(WAIT_EVENT_READ_XLOG);
readbytes = read(sendFile, p, segbytes);
+ pgstat_report_wait_end();
if (readbytes <= 0)
{
ereport(ERROR,
diff --git a/src/backend/storage/file/buffile.c b/src/backend/storage/file/buffile.c
index 7ebd636..971cfd1 100644
--- a/src/backend/storage/file/buffile.c
+++ b/src/backend/storage/file/buffile.c
@@ -37,6 +37,7 @@
#include "postgres.h"
#include "executor/instrument.h"
+#include "pgstat.h"
#include "storage/fd.h"
#include "storage/buffile.h"
#include "storage/buf_internals.h"
@@ -254,7 +255,10 @@ BufFileLoadBuffer(BufFile *file)
/*
* Read whatever we can get, up to a full bufferload.
*/
- file->nbytes = FileRead(thisfile, file->buffer, sizeof(file->buffer));
+ file->nbytes = FileRead(thisfile,
+ file->buffer,
+ sizeof(file->buffer),
+ WAIT_EVENT_READ_BUFFILE);
if (file->nbytes < 0)
file->nbytes = 0;
file->offsets[file->curFile] += file->nbytes;
@@ -317,7 +321,10 @@ BufFileDumpBuffer(BufFile *file)
return; /* seek failed, give up */
file->offsets[file->curFile] = file->curOffset;
}
- bytestowrite = FileWrite(thisfile, file->buffer + wpos, bytestowrite);
+ bytestowrite = FileWrite(thisfile,
+ file->buffer + wpos,
+ bytestowrite,
+ WAIT_EVENT_WRITE_BUFFILE);
if (bytestowrite <= 0)
return; /* failed to write */
file->offsets[file->curFile] += bytestowrite;
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 101da47..2eda42d 100644
--- a/src/backend/storage/file/copydir.c
+++ b/src/backend/storage/file/copydir.c
@@ -25,7 +25,7 @@
#include "storage/copydir.h"
#include "storage/fd.h"
#include "miscadmin.h"
-
+#include "pgstat.h"
/*
* copydir: copy a directory
@@ -169,7 +169,9 @@ copy_file(char *fromfile, char *tofile)
/* If we got a cancel signal during the copy of the file, quit */
CHECK_FOR_INTERRUPTS();
+ pgstat_report_wait_start(WAIT_EVENT_READ_COPY_FILE);
nbytes = read(srcfd, buffer, COPY_BUF_SIZE);
+ pgstat_report_wait_end();
if (nbytes < 0)
ereport(ERROR,
(errcode_for_file_access(),
@@ -177,8 +179,10 @@ copy_file(char *fromfile, char *tofile)
if (nbytes == 0)
break;
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_COPY_FILE);
if ((int) write(dstfd, buffer, nbytes) != nbytes)
{
+ pgstat_report_wait_end();
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
@@ -186,6 +190,7 @@ copy_file(char *fromfile, char *tofile)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\": %m", tofile)));
}
+ pgstat_report_wait_end();
/*
* We fsync the files later but first flush them to avoid spamming the
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index fd02fc0..16f0c92 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -1550,7 +1550,7 @@ FileClose(File file)
* to read into.
*/
int
-FilePrefetch(File file, off_t offset, int amount)
+FilePrefetch(File file, off_t offset, int amount, uint32 wait_event_info)
{
#if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_WILLNEED)
int returnCode;
@@ -1565,8 +1565,10 @@ FilePrefetch(File file, off_t offset, int amount)
if (returnCode < 0)
return returnCode;
+ pgstat_report_wait_start(wait_event_info);
returnCode = posix_fadvise(VfdCache[file].fd, offset, amount,
POSIX_FADV_WILLNEED);
+ pgstat_report_wait_end();
return returnCode;
#else
@@ -1576,7 +1578,7 @@ FilePrefetch(File file, off_t offset, int amount)
}
void
-FileWriteback(File file, off_t offset, off_t nbytes)
+FileWriteback(File file, off_t offset, off_t nbytes, uint32 wait_event_info)
{
int returnCode;
@@ -1597,11 +1599,13 @@ FileWriteback(File file, off_t offset, off_t nbytes)
if (returnCode < 0)
return;
+ pgstat_report_wait_start(wait_event_info);
pg_flush_data(VfdCache[file].fd, offset, nbytes);
+ pgstat_report_wait_end();
}
int
-FileRead(File file, char *buffer, int amount)
+FileRead(File file, char *buffer, int amount, uint32 wait_event_info)
{
int returnCode;
Vfd *vfdP;
@@ -1619,6 +1623,7 @@ FileRead(File file, char *buffer, int amount)
vfdP = &VfdCache[file];
+ pgstat_report_wait_start(wait_event_info);
retry:
returnCode = read(vfdP->fd, buffer, amount);
@@ -1658,12 +1663,13 @@ retry:
/* Trouble, so assume we don't know the file position anymore */
vfdP->seekPos = FileUnknownPos;
}
+ pgstat_report_wait_end();
return returnCode;
}
int
-FileWrite(File file, char *buffer, int amount)
+FileWrite(File file, char *buffer, int amount, uint32 wait_event_info)
{
int returnCode;
Vfd *vfdP;
@@ -1719,6 +1725,7 @@ FileWrite(File file, char *buffer, int amount)
}
}
+ pgstat_report_wait_start(wait_event_info);
retry:
errno = 0;
returnCode = write(vfdP->fd, buffer, amount);
@@ -1777,12 +1784,13 @@ retry:
/* Trouble, so assume we don't know the file position anymore */
vfdP->seekPos = FileUnknownPos;
}
+ pgstat_report_wait_end();
return returnCode;
}
int
-FileSync(File file)
+FileSync(File file, uint32 wait_event_info)
{
int returnCode;
@@ -1795,7 +1803,11 @@ FileSync(File file)
if (returnCode < 0)
return returnCode;
- return pg_fsync(VfdCache[file].fd);
+ pgstat_report_wait_start(wait_event_info);
+ returnCode = pg_fsync(VfdCache[file].fd);
+ pgstat_report_wait_end();
+
+ return returnCode;
}
off_t
@@ -1887,7 +1899,7 @@ FileTell(File file)
#endif
int
-FileTruncate(File file, off_t offset)
+FileTruncate(File file, off_t offset, uint32 wait_event_info)
{
int returnCode;
@@ -1900,7 +1912,9 @@ FileTruncate(File file, off_t offset)
if (returnCode < 0)
return returnCode;
+ pgstat_report_wait_start(wait_event_info);
returnCode = ftruncate(VfdCache[file].fd, offset);
+ pgstat_report_wait_end();
if (returnCode == 0 && VfdCache[file].fileSize > offset)
{
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 6c17b54..d0f0f37 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -28,6 +28,7 @@
#include "miscadmin.h"
#include "access/xlog.h"
#include "catalog/catalog.h"
+#include "pgstat.h"
#include "portability/instr_time.h"
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
@@ -536,7 +537,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
errmsg("could not seek to block %u in file \"%s\": %m",
blocknum, FilePathName(v->mdfd_vfd))));
- if ((nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ)) != BLCKSZ)
+ if ((nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, WAIT_EVENT_EXTEND_DATA_BLOCK)) != BLCKSZ)
{
if (nbytes < 0)
ereport(ERROR,
@@ -667,7 +668,7 @@ mdprefetch(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum)
Assert(seekpos < (off_t) BLCKSZ * RELSEG_SIZE);
- (void) FilePrefetch(v->mdfd_vfd, seekpos, BLCKSZ);
+ (void) FilePrefetch(v->mdfd_vfd, seekpos, BLCKSZ, WAIT_EVENT_PREFETCH_DATA_BLOCK);
#endif /* USE_PREFETCH */
}
@@ -716,7 +717,7 @@ mdwriteback(SMgrRelation reln, ForkNumber forknum,
seekpos = (off_t) BLCKSZ *(blocknum % ((BlockNumber) RELSEG_SIZE));
- FileWriteback(v->mdfd_vfd, seekpos, (off_t) BLCKSZ * nflush);
+ FileWriteback(v->mdfd_vfd, seekpos, (off_t) BLCKSZ * nflush, WAIT_EVENT_FLUSH_DATA_BLOCK);
nblocks -= nflush;
blocknum += nflush;
@@ -753,7 +754,7 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
errmsg("could not seek to block %u in file \"%s\": %m",
blocknum, FilePathName(v->mdfd_vfd))));
- nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ);
+ nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ, WAIT_EVENT_READ_DATA_BLOCK);
TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
@@ -829,7 +830,7 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
errmsg("could not seek to block %u in file \"%s\": %m",
blocknum, FilePathName(v->mdfd_vfd))));
- nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ);
+ nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, WAIT_EVENT_WRITE_DATA_BLOCK);
TRACE_POSTGRESQL_SMGR_MD_WRITE_DONE(forknum, blocknum,
reln->smgr_rnode.node.spcNode,
@@ -967,7 +968,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks)
* This segment is no longer active. We truncate the file, but do
* not delete it, for reasons explained in the header comments.
*/
- if (FileTruncate(v->mdfd_vfd, 0) < 0)
+ if (FileTruncate(v->mdfd_vfd, 0, WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS) < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not truncate file \"%s\": %m",
@@ -993,7 +994,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks)
*/
BlockNumber lastsegblocks = nblocks - priorblocks;
- if (FileTruncate(v->mdfd_vfd, (off_t) lastsegblocks * BLCKSZ) < 0)
+ if (FileTruncate(v->mdfd_vfd, (off_t) lastsegblocks * BLCKSZ, WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS) < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not truncate file \"%s\" to %u blocks: %m",
@@ -1037,7 +1038,7 @@ mdimmedsync(SMgrRelation reln, ForkNumber forknum)
{
MdfdVec *v = &reln->md_seg_fds[forknum][segno - 1];
- if (FileSync(v->mdfd_vfd) < 0)
+ if (FileSync(v->mdfd_vfd, WAIT_EVENT_SYNC_IMMED_RELATION) < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m",
@@ -1232,7 +1233,7 @@ mdsync(void)
INSTR_TIME_SET_CURRENT(sync_start);
if (seg != NULL &&
- FileSync(seg->mdfd_vfd) >= 0)
+ FileSync(seg->mdfd_vfd, WAIT_EVENT_SYNC_RELATION) >= 0)
{
/* Success; update statistics about sync timing */
INSTR_TIME_SET_CURRENT(sync_end);
@@ -1443,7 +1444,7 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
ereport(DEBUG1,
(errmsg("could not forward fsync request because request queue is full")));
- if (FileSync(seg->mdfd_vfd) < 0)
+ if (FileSync(seg->mdfd_vfd, WAIT_EVENT_SYNC_DATA_BLOCK) < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m",
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index c9d6e44..ee5ded9 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -50,6 +50,7 @@
#include "catalog/pg_tablespace.h"
#include "catalog/storage.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "storage/fd.h"
#include "storage/lwlock.h"
#include "utils/inval.h"
@@ -658,11 +659,15 @@ load_relmap_file(bool shared)
* look, the sinval signaling mechanism will make us re-read it before we
* are able to access any relation that's affected by the change.
*/
+ pgstat_report_wait_start(WAIT_EVENT_READ_LOAD_RELMAP_FILE);
if (read(fd, map, sizeof(RelMapFile)) != sizeof(RelMapFile))
+ {
ereport(FATAL,
(errcode_for_file_access(),
errmsg("could not read relation mapping file \"%s\": %m",
mapfilename)));
+ }
+ pgstat_report_wait_end();
CloseTransientFile(fd);
@@ -774,6 +779,7 @@ write_relmap_file(bool shared, RelMapFile *newmap,
}
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_RELMAP_FILE);
if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile))
{
/* if write didn't set errno, assume problem is no disk space */
@@ -784,6 +790,7 @@ write_relmap_file(bool shared, RelMapFile *newmap,
errmsg("could not write to relation mapping file \"%s\": %m",
mapfilename)));
}
+ pgstat_report_wait_end();
/*
* We choose to fsync the data to disk before considering the task done.
@@ -791,11 +798,13 @@ write_relmap_file(bool shared, RelMapFile *newmap,
* issue, but it would complicate checkpointing --- see notes for
* CheckPointRelationMap.
*/
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_WRITE_RELMAP_FILE);
if (pg_fsync(fd) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not fsync relation mapping file \"%s\": %m",
mapfilename)));
+ pgstat_report_wait_end();
if (CloseTransientFile(fd))
ereport(ERROR,
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index e0298ee..90cc9bf 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -35,6 +35,7 @@
#include "libpq/libpq.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
+#include "pgstat.h"
#include "postmaster/autovacuum.h"
#include "postmaster/postmaster.h"
#include "storage/fd.h"
@@ -856,11 +857,13 @@ CreateLockFile(const char *filename, bool amPostmaster,
errmsg("could not open lock file \"%s\": %m",
filename)));
}
+ pgstat_report_wait_start(WAIT_EVENT_READ_CREATE_LOCK_FILE);
if ((len = read(fd, buffer, sizeof(buffer) - 1)) < 0)
ereport(FATAL,
(errcode_for_file_access(),
errmsg("could not read lock file \"%s\": %m",
filename)));
+ pgstat_report_wait_end();
close(fd);
if (len == 0)
@@ -1009,6 +1012,7 @@ CreateLockFile(const char *filename, bool amPostmaster,
strlcat(buffer, "\n", sizeof(buffer));
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_CREATE_LOCK_FILE);
if (write(fd, buffer, strlen(buffer)) != strlen(buffer))
{
int save_errno = errno;
@@ -1021,6 +1025,9 @@ CreateLockFile(const char *filename, bool amPostmaster,
(errcode_for_file_access(),
errmsg("could not write lock file \"%s\": %m", filename)));
}
+ pgstat_report_wait_end();
+
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_CREATE_LOCK_FILE);
if (pg_fsync(fd) != 0)
{
int save_errno = errno;
@@ -1032,6 +1039,7 @@ CreateLockFile(const char *filename, bool amPostmaster,
(errcode_for_file_access(),
errmsg("could not write lock file \"%s\": %m", filename)));
}
+ pgstat_report_wait_end();
if (close(fd) != 0)
{
int save_errno = errno;
@@ -1164,7 +1172,9 @@ AddToDataDirLockFile(int target_line, const char *str)
DIRECTORY_LOCK_FILE)));
return;
}
+ pgstat_report_wait_start(WAIT_EVENT_READ_ADDTODATEDIR_LOCK_FILE);
len = read(fd, srcbuffer, sizeof(srcbuffer) - 1);
+ pgstat_report_wait_end();
if (len < 0)
{
ereport(LOG,
@@ -1217,6 +1227,7 @@ AddToDataDirLockFile(int target_line, const char *str)
*/
len = strlen(destbuffer);
errno = 0;
+ pgstat_report_wait_start(WAIT_EVENT_WRITE_ADDTODATEDIR_LOCK_FILE);
if (lseek(fd, (off_t) 0, SEEK_SET) != 0 ||
(int) write(fd, destbuffer, len) != len)
{
@@ -1230,6 +1241,8 @@ AddToDataDirLockFile(int target_line, const char *str)
close(fd);
return;
}
+ pgstat_report_wait_end();
+ pgstat_report_wait_start(WAIT_EVENT_SYNC_ADDTODATEDIR_LOCK_FILE);
if (pg_fsync(fd) != 0)
{
ereport(LOG,
@@ -1237,6 +1250,7 @@ AddToDataDirLockFile(int target_line, const char *str)
errmsg("could not write to file \"%s\": %m",
DIRECTORY_LOCK_FILE)));
}
+ pgstat_report_wait_end();
if (close(fd) != 0)
{
ereport(LOG,
@@ -1293,7 +1307,9 @@ RecheckDataDirLockFile(void)
return true;
}
}
+ pgstat_report_wait_start(WAIT_EVENT_READ_RECHECKDATADIR_LOCK_FILE);
len = read(fd, buffer, sizeof(buffer) - 1);
+ pgstat_report_wait_end();
if (len < 0)
{
ereport(LOG,
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 0062fb8..dfb0aba 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -723,6 +723,7 @@ typedef enum BackendState
#define PG_WAIT_EXTENSION 0x07000000U
#define PG_WAIT_IPC 0x08000000U
#define PG_WAIT_TIMEOUT 0x09000000U
+#define PG_WAIT_IO 0x0A000000U
/* ----------
* Wait Events - Activity
@@ -805,6 +806,98 @@ typedef enum
} WaitEventTimeout;
/* ----------
+ * Wait Events - IO
+ *
+ * Use this category when a process is waiting for a IO.
+ * ----------
+ */
+typedef enum
+{
+ WAIT_EVENT_READ_DATA_BLOCK = PG_WAIT_IO,
+ WAIT_EVENT_WRITE_DATA_BLOCK,
+ WAIT_EVENT_SYNC_DATA_BLOCK,
+ WAIT_EVENT_EXTEND_DATA_BLOCK,
+ WAIT_EVENT_FLUSH_DATA_BLOCK,
+ WAIT_EVENT_PREFETCH_DATA_BLOCK,
+ WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK,
+ WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK,
+ WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS,
+ WAIT_EVENT_SYNC_RELATION,
+ WAIT_EVENT_SYNC_IMMED_RELATION,
+ WAIT_EVENT_READ_BUFFILE,
+ WAIT_EVENT_WRITE_BUFFILE,
+ /* Wait event for XLOG */
+ WAIT_EVENT_READ_XLOG,
+ WAIT_EVENT_READ_COPY_XLOG,
+ WAIT_EVENT_WRITE_XLOG,
+ WAIT_EVENT_WRITE_INIT_XLOG_FILE,
+ WAIT_EVENT_WRITE_COPY_XLOG_FILE,
+ WAIT_EVENT_WRITE_BOOTSTRAP_XLOG,
+ WAIT_EVENT_SYNC_INIT_XLOG_FILE,
+ WAIT_EVENT_SYNC_COPY_XLOG_FILE,
+ WAIT_EVENT_SYNC_BOOTSTRAP_XLOG,
+ WAIT_EVENT_SYNC_ASSIGN_XLOG_SYNC_METHOD,
+ /* Wait event for CONTROL_FILE */
+ WAIT_EVENT_WRITE_CONTROL_FILE,
+ WAIT_EVENT_WRITE_UPDATE_CONTROL_FILE,
+ WAIT_EVENT_SYNC_WRITE_CONTROL_FILE,
+ WAIT_EVENT_SYNC_UPDATE_CONTROL_FILE,
+ WAIT_EVENT_READ_CONTROL_FILE,
+ /* Wait event for REORDER BUFFER */
+ WAIT_EVENT_READ_REORDER_BUFFER,
+ WAIT_EVENT_WRITE_REORDER_BUFFER,
+ /* Wait event for LOGICAL MAPPING */
+ WAIT_EVENT_READ_APPLY_LOGICAL_MAPPING,
+ WAIT_EVENT_WRITE_LOGICAL_MAPPING_REWRITE,
+ WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE,
+ WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE_HEAP,
+ WAIT_EVENT_TRUNCATE_LOGICAL_MAPPING_REWRITE,
+ /* Wait event for SNAPBUILD */
+ WAIT_EVENT_WRITE_SNAPBUILD_SERIALIZE,
+ WAIT_EVENT_READ_SNAPBUILD_RESTORE,
+ WAIT_EVENT_SYNC_SNAPBUILD_SERIALIZE,
+ /* Wait event for SNRU */
+ WAIT_EVENT_READ_SLRU_PAGE,
+ WAIT_EVENT_WRITE_SLRU_PAGE,
+ WAIT_EVENT_SYNC_SLRU_FLUSH,
+ WAIT_EVENT_SYNC_SLRU_WRITE_PAGE,
+ /* Wait event for TIMELINE HISTORY */
+ WAIT_EVENT_READ_TIMELINE_HISTORY_WALSENDER,
+ WAIT_EVENT_READ_TIMELINE_HISTORY_WRITE,
+ WAIT_EVENT_WRITE_TIMELINE_HISTORY,
+ WAIT_EVENT_WRITE_TIMELINE_HISTORY_FILE,
+ WAIT_EVENT_SYNC_TIMELINE_HISTORY_WRITE,
+ WAIT_EVENT_SYNC_TIMELINE_HISTORY_FILE,
+ /* Wait event for TWOPHASE FILE */
+ WAIT_EVENT_READ_TWOPHASE_FILE,
+ WAIT_EVENT_WRITE_RECREATE_TWOPHASE_FILE,
+ WAIT_EVENT_SYNC_RECREATE_TWOPHASE_FILE,
+ /* Wait event for SYSLOGGER */
+ WAIT_EVENT_READ_SYSLOGGER_FILE,
+ WAIT_EVENT_WRITE_SYSLOGGER_FILE,
+ /* Wait event for REPLSLOT */
+ WAIT_EVENT_READ_RESTORE_REPLSLOT,
+ WAIT_EVENT_WRITE_REPLSLOT,
+ WAIT_EVENT_SYNC_RESTORE_REPLSLOT,
+ WAIT_EVENT_SYNC_SAVE_REPLSLOT,
+ /* Wait event for copydir */
+ WAIT_EVENT_READ_COPY_FILE,
+ WAIT_EVENT_WRITE_COPY_FILE,
+ /* Wait event RELMAP FILE */
+ WAIT_EVENT_READ_LOAD_RELMAP_FILE,
+ WAIT_EVENT_WRITE_RELMAP_FILE,
+ WAIT_EVENT_SYNC_WRITE_RELMAP_FILE,
+ /* Wait event for LOCK FILE */
+ WAIT_EVENT_READ_CREATE_LOCK_FILE,
+ WAIT_EVENT_READ_ADDTODATEDIR_LOCK_FILE,
+ WAIT_EVENT_READ_RECHECKDATADIR_LOCK_FILE,
+ WAIT_EVENT_WRITE_CREATE_LOCK_FILE,
+ WAIT_EVENT_WRITE_ADDTODATEDIR_LOCK_FILE,
+ WAIT_EVENT_SYNC_ADDTODATEDIR_LOCK_FILE,
+ WAIT_EVENT_SYNC_CREATE_LOCK_FILE
+} WaitEventIO;
+
+/* ----------
* Command type for progress reporting purposes
* ----------
*/
diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index 1a43a2c..ac37502 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -68,13 +68,13 @@ extern int max_safe_fds;
extern File PathNameOpenFile(FileName fileName, int fileFlags, int fileMode);
extern File OpenTemporaryFile(bool interXact);
extern void FileClose(File file);
-extern int FilePrefetch(File file, off_t offset, int amount);
-extern int FileRead(File file, char *buffer, int amount);
-extern int FileWrite(File file, char *buffer, int amount);
-extern int FileSync(File file);
+extern int FilePrefetch(File file, off_t offset, int amount, uint32 wait_event_info);
+extern int FileRead(File file, char *buffer, int amount, uint32 wait_event_info);
+extern int FileWrite(File file, char *buffer, int amount, uint32 wait_event_info);
+extern int FileSync(File file, uint32 wait_event_info);
extern off_t FileSeek(File file, off_t offset, int whence);
-extern int FileTruncate(File file, off_t offset);
-extern void FileWriteback(File file, off_t offset, off_t nbytes);
+extern int FileTruncate(File file, off_t offset, uint32 wait_event_info);
+extern void FileWriteback(File file, off_t offset, off_t nbytes, uint32 wait_event_info);
extern char *FilePathName(File file);
extern int FileGetRawDesc(File file);
extern int FileGetRawFlags(File file);
On Wed, Mar 8, 2017 at 4:50 PM, Rushabh Lathia <rushabh.lathia@gmail.com>
wrote:
I am attaching another version of the patch, as I found stupid mistake
in the earlier version of patch, where I missed to initialize initial
value to
WaitEventIO enum. Also earlier version was not getting cleanly apply on
the current version of sources.
I have applied attached patch, set shared_buffers to 128kB and ran pgbench,
I am able to see below distinct IO events.
--with ./pgbench -i -s 500 postgres
application_name wait_event_type wait_event query
pgbench IO ExtendDataBlock
copy pgbench_account
pgbench IO WriteXLog
copy pgbench_account
pgbench IO WriteDataBlock
copy pgbench_account
pgbench IO ReadDataBlock
vacuum analyze pgben
pgbench IO ReadBuffile
alter table pgbench_
--with ./pgbench -T 600 postgres (readwrite)
application_name wait_event_type wait_event query
pgbench IO ReadDataBlock
UPDATE pgbench_accou
pgbench IO WriteDataBlock
UPDATE pgbench_telle
IO SyncDataBlock
pgbench IO SyncDataBlock
UPDATE pgbench_telle
IO SyncDataBlock
autovacuum: VACUUM A
pgbench IO WriteXLog
END;
pgbench IO ExtendDataBlock
copy pgbench_account
--with ./pgbench -T 600 -S postgres (select only)
application_name wait_event_type wait_event query
pgbench IO ReadDataBlock
SELECT abalance FROM
Attached excel with all IO event values.
Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation
Attachments:
wait_event_for_disk_io.odsapplication/vnd.oasis.opendocument.spreadsheet; name=wait_event_for_disk_io.odsDownload
PK ��gJ�l9�. . mimetypeapplication/vnd.oasis.opendocument.spreadsheetPK ��gJG�=�>� >� Thumbnails/thumbnail.png�PNG
IHDR � � (�N( �IDATx���w�U��]����N����qt��` A���Q�HVr�9� D�,Y@��3(Q1��8���>�����zw�9������O���v���o����x<�y^ZZZ�9���')�/����X����������~�;��Yk�~����%�l��Ka�:n m��EI
u\=u[������o�R-�_!�%�'�"##��h#�v�
2������X#*T������"x���Z��5����^y��G����[����K�CM�.��"n���������/�?���� �o2�K��r��r{������h�)��H�zA9?�R�K������Q"j����T�Q��jA��%z�n�Y����j���������q��
~��'_��m������R^`��,&�RS������0�pI
�LXx������v�����\�p�����/c�[���_.Q�����^�hQd�_B�`:g���Z���)�l��i�!/��K�F�(W�������q�i-�Ma9�H�"��|��~������{�������e�t%�h��'O?~���o�7o�c�=���~�>�D[������?���^D]2�.�����s������z��I�&��O�Y={�|�����_���?��ok��������_��h��������;���b�p�
�s��[�n�-�������~��_�~|�A�F������:Mk}���?z�G}4~�x|�6l�[o������������#�4<��#�V����|��Y�f<x��(_���m2�B�
~��+��2|��������k��=���g����k��^�re�|���?��� & ������u4�[N�8��L�6���o��'O���K��S�NO<��W_}�+���z:��_��U�v�����eK�������p��G�X�"�;v��`�7�@�Os^`�}���Q� �a-�X���+��h����m��!}���B\�Z�*\��
���=zq�m�v��}��c���?B9<�^���\z�)�A(���Y���I��_���V�Z�)Sf���s���R�
�+����7�Z�j�������o�B��N�J�`.|G�P���g�0���Ey��k��Y�xq��������cC�,Y�[&N�Hp���k��Rt��x��M�4A����B����F�(}�<y�h���}��7�b��eQz��f��y�M7��u�]=�#Z.��M�[��;�:|�����7��`=���3v���|�r8@Gx�w�A}lIq��K.�
,X��`u<�y�f�w���� 6�|���^����)�B�| �E_K�,��K/��?����2��v����c������PPz��t���
�k�FR��o��� ?!��_�������
4�D{`2V�(=�B�Ez+T��X�����^�z��y�:ux��T�R9s�D��g��,2��r��iT�R�J���3(5�����w7k��~"K�)"�]w����_J�� P_�r�m��ap8$�G��#!���p�GA���KS4N���w(%����Fqq$�
���5h!�E�h
�G���*�c�
�2��u+�0�)S�� f�r��7l� +�z��?��\(A��
���@i����6�l���� m�}Y�O�+8p ����o�C�[^}�U:��%���7�� W�yp�����$~�a .�v`/d�%D@S��5�#x5��P�S_���>��=}FuPL�0P�v���C�������1b$�-�} : L�.�_�� ��g�}��O?��h:l=t���s�=8��/��p���X@9�CAQ�A9$��8$�!p Z4�{��0�K�B-�>�������W��D@$m�� ��dh6��������yB����w��-����B'�"1e�o��
��a��M-�&j���fP;�;g�B0�-G�^{�5��������Ca�s@�J��)��[�H8I��fa�E'�>�K�RB' �6��K��/\2&���7t
$�I�OwB���l��,���Ql��c�qp� ���k��ww��w���s�=���L���X�/������.�F���y�K�A�4�k���t�AX6�r�
��s���q���Pa
d�/�
��^A��)d m�W�����pL��� P!��N�)6��p�@C����W�4E#X0�S�O?�4�j�)RY��rPt�m�6��0�S��C�a��?B���\�I(�0'���g ��*��t�R�N<"�!t�*����uP)~B$^E��&� �+]W���\C�\����~�����
�0 a�I�I�G�8g��"��������
�O��~b���������T@���25�D�����M�n�//�� ���o^�
�C$l�cl�
����G<j��a*W�:F��+�G�($���8�A��_��<'7��2UC� ��w��_�K\�F��t
��W����������Kf`s���Hs�.�N��4�`� 6�p��A��s�Q
�W�)7�8��dX,���*��y
e�%�\5�b���W6��������{#��'��r�:e'vQ����+l��)�u'd$����� ���Q�}5k�w�%�N7A�f�5%���R^,6��v�x+n r#A�#�^�q���r��l�0��}^0�g����_�����J����Hn��v���K����g�o6^"GTn$Y�I�pGn�n���QI#]kq��e�VS�nkn��C��`(��S���2�P���J�m���9~6@e����[��@��1�i������AP'~�:�e�P��D�l���M�z������3���d*�C�'VNh'w0��=���������q��Pdq�j�e�n����Z~��w �� w���� �d��N������A]�9�3����LM`%��pFX��pZz���,9(����d�J��rE^@h��g����:8����m�U��$Gt���o*}���������O"���}�g��V��$J�
! ���� �����!#s�z�f�����'l'!��<h��g���HE�+f ����<�O�8�k�IIMx����)����$]��� {�% ��iv������-Z�'�5���,��� #�������}�����j�I��x��l��_�5�j��o$�6m 4! �*.:A�U�V�Xt�@��/�0���DR�z���-[��4�2dH<�~�K�@?1/��A� 8�=I\�?��\^��a�����d�
��U�V��)��;i�$�2 <���b��)������2q�D� ��aJ@����
0�D�������
u�8�[�Y��Cj���T�Y���G�!�.2h����4j�S G��� ����������J�"H�yw�����?o�W��~��o���
�!�;���e%����3�//U�Yh3v�,��n��!,Un��`������#'RJpH���k�b�����#���Nb��$��o_4�v���"����&����d��+eaH
!�:�an�c.��b��A������E��#D���<x�uS�P73�0�����)
�4������ ����|��bzB�LB����Rg�/�t 0��DR�N��>�f��1/<(��&�F��!,$��$��*P2�s�LF�U@T8���qc�%0z� �!R c'��-(�>���`���4����*���� �%K���h��W^A"h ��E�/F��(�M9���n@�%D�#������}�u3=��������CD$�}t�Jd��| ����p�:ur��
��
� �{��AL%l�����!m�>D!Q���������D��>r�<�J@?��8��<���3gN�����<N k`f��u�6������?~|���D�"�`��=�N��W_}V ��89�%�.��P�[�n������/��3k�,�B�E�"qx�J�&&��/�����{0�3�<���h�@ ��KA����{��9�����?Q#��A�SO=�)CH< ��LTV���X0b�uxMl ������h>��&SzR<�0!������@�Bg���1�.]���e��Wy�J��&���oD�S������J����`t�fRx���j�p�O>��5+��Z��D;��0�d����4���xF�IR�}w�:�v)��:���z���6�Ob��3�'-2{�}�;��0������q��V�3�-\8
{�|�lCS9���^dJ�h�i��e%Y��a�X02��� 0�:s��dq����q�:���%��jOhL<I���-O�l<2�k�m�,�m�h6����(� i6D�(��)�q�,��x�)f�l�������x0�w&O�K�O.!��v�"Y�����]��� ��7��,��q� �����]7��O�3Z�"���-#
�f��d~+�������i� =�q���������^�I��m�N�� r�\r�%�w8�3\���:�7o���T�" ��m�,���N��h�#r��}�'@HOb���T�!��.��W��>������K/�D���_6b�L��E:l�02R�>����U+
���%�t��/�� Hv�%��g��2e
��D����#�����?�@GP�����K)�4i(��b�p L��������5j��2%z^�|��w����o��u��,X$�����%Dr���g�.]���W����}����:tF&�#W�����,�Z�����$~�~np3������U$� ���bC@w�A =T2D3(�l�@r����#�z���:Y�v������4
9��7k��l�{XG����O�?~���7���3'�5f�:r��������#����^�vda�1M��;M=r���L�01�0$rd�8�S�Q� Q�@y <��V�T D� t�&QI�c +��0�����%K�JT
���#�<�r����z���������]�v-^�8���'��G��q2@(@��(p�t���:��@��G
�0M� ��P�@�zR<�0!�P u���8L�7j���DqdF�1
4����+� 2F��W/�
*"����e��%KG�pNx�������Z�8��k�.\x��-T�A vy�C�6m��={k���%Dh#�����+���S�2 ;{>�5�7Jg�{>��oDGnx2�gq���@D4�H`�/[/�D������E�bo��������K�{ %(�
!X����bC�"E`��',�1|#��$|���'��_��G��� Z�a�Z�%�iP��'�./H���gc�4���zK�S00�l�W ��'��7���z���P�p�BZ�P���>�J�*��?(2�>tNx�O*���?��p��Ac� �Y.}��*Q���B�@f��@.�`�<
:��� �x���q�#F� d#����:�8p�-��s�=�����zS'P���d�?[�l�}�v�
B��}�Y���|�����`��#j�bRd�!���o��W^y%! "�G�s��B��P������u��i�V�����I�����E�d?��)F<�[�DnL���D���� )��S�,��V�p�=��P�Bh �K�����*�
R�/��r4 ��� q��C Bp3�� $� �>�(zC5q��yy�y.��%(�&/=u���9r/�Hk�7N+B�����4�>�\�r���C9}@����4f�%Qi�� Jl�W����r�h��hQ�;2f4%�`}6:KM�#� ��`�� ��$��jO�*��}0!������B�<
�>���<��q�����T�����L1l����`��Z�A��n��+^'���4���:u�qP3gU����`�� ��B|�O7���@���^ 4���s/�F}���h��V=�T�13p�����zg��:�����
4����)�-&
GG�s�LS1=��������6;8G��^��J[����$��lh4�r��������t�hS$��1�IQ�������g�&��F��n`����`�]h>�JB�����j� �2}�L���O��f{�rw�m����Zn��U���+X���:{S�We�+^~�eP'�s��� F",�\��"��=#�2}<5��6t����D�XcSZ�>D���E�n{���#�<Bf���1� y�\@<�T�Ur���U�Vq��u@���p����O�>d�w1J�����kZ��2e
!��J���o��&�i }�j�6n����_~����#G���.\`�����[������O�3Z�"���3f�/_�����5 �B}����_CR����YR��
6$�p�K/�T�F�e��]w�uO=�T�n�����O � `h��Q#L�����"�>}��4�j�����i+��g�}F������[���5o��c���^{�����?��@1R�]�v�����^��j��9�$�H��@��mp��������s�+VEy������>��
��&����O6��j�u�����[t����<x0�!�}���F!�
� \�w��WC'\�-�W��#�+��g y���3�R�
�/^6"���~��i��%(
�C� ��o�Yc�s��k�!������9r$���{��z_J?1-Z@9)������n��;w�Deqc���4�
E��,aNa�A$��
���;H#�@��{�n^S�bE��B������;qE�|� zM��7o�T��`H����[�/������k�O��R*/I��Od��F)LH�)B"s��qt���h�$1(���R�J�Oh)�7�|��g���;##�U��Y3�j�3I�-��B���z>�����I�! �FMb�4K����GA���z�jb��+���fx��>|-A�qe)�n���i]���\"D�&b6�DPC��p��
�{�/��#+�7�)xGm��J���E�� 1�J`@Xp�v�D���� <�@��BHB���(���p��������w/��*2{�l7*���o�7(��Y��I!BA/�9+���m��@� �N�����>H��� �&�������h_�U�ZG���U=�n����{�R�
Mc���8��������v��y�N�im�����@�����6m� 6l� ��>��R �y�8�&^��-�'�b�xtl�o�� f �`����Gp���S�N�.H���[������2m�n������|�W�� �`�DLt�6x%X����7o�L�G�P����-���f��w���iM��={x"��.���C�n������uS�R%��q����IA5�3����"�+E�gDR����YF� ��k�u�
z���P�#�m���_�4���|����������d.��)
�����# �h`
��Ib�da���D=,���f���f��a[L�B�����@�}#����&�L�x0��9��^0�P%�Q7wT��Q������_��;�h�����OM�;�� ����,��"�����-`������l$���
����K�� ����\����u6�fckbz��,��������`06�q[��dV����[����a�ns��6��4!�8�$�2��IP�Dt�
<����tO��=���FhTTQd��{E�
U3vs52���c*�=zW
%�����-tLH�J��N�Vc��X
� ���h<c����6�"wQS<�������^GM75K�)J�����?�ErN ����R�X�����n��L^���t��k����������"(P��M�r8�j���8n�8 ��!C������{o�f�����#���'