Make pg_stat_io view count IOs as bytes instead of blocks
Hi,
Currently, in the pg_stat_io view, IOs are counted as blocks. However,
there are two issues with this approach:
1- The actual number of IO requests to the kernel is lower because IO
requests can be merged before sending the final request. Additionally, it
appears that all IOs are counted in block size.
2- Some IOs may not align with block size. For example, WAL read IOs are
done in variable bytes and it is not possible to correctly show these IOs
in the pg_stat_io view [1].
To address this, I propose showing the total number of IO requests to the
kernel (as smgr function calls) and the total number of bytes in the IO. To
implement this change, the op_bytes column will be removed from the
pg_stat_io view. Instead, the [reads | writes | extends] columns will track
the total number of IO requests, and newly added [read | write |
extend]_bytes columns will track the total number of bytes in the IO.
Example benefit of this change:
Running query [2], the result is:
╔═══════════════════╦══════════╦══════════╦═══════════════╗
║ backend_type ║ object ║ context ║ avg_io_blocks ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ client backend ║ relation ║ bulkread ║ 15.99 ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ background worker ║ relation ║ bulkread ║ 15.99 ║
╚═══════════════════╩══════════╩══════════╩═══════════════╝
You can rerun the same query [2] after setting io_combine_limit to 32 [3].
The result is:
╔═══════════════════╦══════════╦══════════╦═══════════════╗
║ backend_type ║ object ║ context ║ avg_io_blocks ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ client backend ║ relation ║ bulkread ║ 31.70 ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ background worker ║ relation ║ bulkread ║ 31.60 ║
╚═══════════════════╩══════════╩══════════╩═══════════════╝
I believe that having visibility into avg_io_[bytes | blocks] is valuable
information that could help optimize Postgres.
Any feedback would be appreciated.
[1]: /messages/by-id/CAN55FZ1ny+3kpdm5X3nGZ2Jp3wxZO-744eFgxktS6YQ3=OKR-A@mail.gmail.com
[2]:
CREATE TABLE t as select i, repeat('a', 600) as filler from
generate_series(1, 10000000) as i;
SELECT pg_stat_reset_shared('io');
SELECT * FROM t WHERE i = 0;
SELECT backend_type, object, context, TRUNC((read_bytes / reads / (SELECT
current_setting('block_size')::numeric)), 2) as avg_io_blocks FROM
pg_stat_io WHERE reads > 0;
[3]: SET io_combine_limit TO 32;
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachments:
v1-0001-Make-pg_stat_io-count-IOs-as-bytes-instead-of-blo.patch (text/x-patch; charset=US-ASCII)
From f02b0d261880aa3f933a9350b6b1557f6b14f292 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz <byavuz81@gmail.com>
Date: Wed, 11 Sep 2024 11:04:18 +0300
Subject: [PATCH v1] Make pg_stat_io count IOs as bytes instead of blocks
Currently in pg_stat_io view, IOs are counted as blocks. There are two
problems with this approach:
1- The actual number of I/O requests sent to the kernel is lower because
I/O requests may be merged before being sent. Additionally, it gives the
impression that all I/Os are done in block size, which shadows the
benefits of merging I/O requests.
2- There may be some IOs which are not done in block size in the future.
For example, WAL read IOs are done in variable bytes and it is not
possible to correctly show these IOs in pg_stat_io view.
Because of these problems, now show the total number of IO requests to
the kernel (as smgr function calls) and total number of bytes in the IO.
Also, remove op_bytes column from the pg_stat_io view.
---
src/include/catalog/pg_proc.dat | 6 +-
src/include/pgstat.h | 9 ++-
src/backend/catalog/system_views.sql | 4 +-
src/backend/storage/buffer/bufmgr.c | 14 ++---
src/backend/storage/buffer/localbuf.c | 6 +-
src/backend/storage/smgr/md.c | 4 +-
src/backend/utils/activity/pgstat_io.c | 63 ++++++++++++++++---
src/backend/utils/adt/pgstatfuncs.c | 87 +++++++++++++++++++-------
src/test/regress/expected/rules.out | 6 +-
doc/src/sgml/monitoring.sgml | 61 +++++++++++-------
10 files changed, 184 insertions(+), 76 deletions(-)
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index ff5436acacf..b0dab15bfd4 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5826,9 +5826,9 @@
proname => 'pg_stat_get_io', prorows => '30', proretset => 't',
provolatile => 'v', proparallel => 'r', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{text,text,text,int8,float8,int8,float8,int8,float8,int8,float8,int8,int8,int8,int8,int8,float8,timestamptz}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{backend_type,object,context,reads,read_time,writes,write_time,writebacks,writeback_time,extends,extend_time,op_bytes,hits,evictions,reuses,fsyncs,fsync_time,stats_reset}',
+ proallargtypes => '{text,text,text,int8,numeric,float8,int8,numeric,float8,int8,float8,int8,numeric,float8,int8,int8,int8,int8,float8,timestamptz}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{backend_type,object,context,reads,read_bytes,read_time,writes,write_bytes,write_time,writebacks,writeback_time,extends,extend_bytes,extend_time,hits,evictions,reuses,fsyncs,fsync_time,stats_reset}',
prosrc => 'pg_stat_get_io' },
{ oid => '1136', descr => 'statistics: information about WAL activity',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index be2c91168a1..56ab9893999 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -341,6 +341,7 @@ typedef enum IOOp
typedef struct PgStat_BktypeIO
{
+ uint64 bytes[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
PgStat_Counter counts[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
PgStat_Counter times[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
} PgStat_BktypeIO;
@@ -553,11 +554,13 @@ extern PgStat_CheckpointerStats *pgstat_fetch_stat_checkpointer(void);
extern bool pgstat_bktype_io_stats_valid(PgStat_BktypeIO *backend_io,
BackendType bktype);
-extern void pgstat_count_io_op(IOObject io_object, IOContext io_context, IOOp io_op);
-extern void pgstat_count_io_op_n(IOObject io_object, IOContext io_context, IOOp io_op, uint32 cnt);
+extern void pgstat_count_io_op(IOObject io_object, IOContext io_context, IOOp io_op, uint64 bytes);
+extern void pgstat_count_io_op_n(IOObject io_object, IOContext io_context,
+ IOOp io_op, uint32 cnt, uint64 bytes);
extern instr_time pgstat_prepare_io_time(bool track_io_guc);
extern void pgstat_count_io_op_time(IOObject io_object, IOContext io_context,
- IOOp io_op, instr_time start_time, uint32 cnt);
+ IOOp io_op, instr_time start_time,
+ uint32 cnt, uint64 bytes);
extern PgStat_IO *pgstat_fetch_stat_io(void);
extern const char *pgstat_get_io_context_name(IOContext io_context);
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 7fd5d256a18..854e24da596 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1152,14 +1152,16 @@ SELECT
b.object,
b.context,
b.reads,
+ b.read_bytes,
b.read_time,
b.writes,
+ b.write_bytes,
b.write_time,
b.writebacks,
b.writeback_time,
b.extends,
+ b.extend_bytes,
b.extend_time,
- b.op_bytes,
b.hits,
b.evictions,
b.reuses,
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 48520443001..b23a5e0ffba 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1165,7 +1165,7 @@ PinBufferForBlock(Relation rel,
}
if (*foundPtr)
{
- pgstat_count_io_op(io_object, io_context, IOOP_HIT);
+ pgstat_count_io_op(io_object, io_context, IOOP_HIT, 0);
if (VacuumCostActive)
VacuumCostBalance += VacuumCostPageHit;
@@ -1497,7 +1497,7 @@ WaitReadBuffers(ReadBuffersOperation *operation)
io_start = pgstat_prepare_io_time(track_io_timing);
smgrreadv(operation->smgr, forknum, io_first_block, io_pages, io_buffers_len);
pgstat_count_io_op_time(io_object, io_context, IOOP_READ, io_start,
- io_buffers_len);
+ 1, io_buffers_len * BLCKSZ);
/* Verify each block we read, and terminate the I/O. */
for (int j = 0; j < io_buffers_len; ++j)
@@ -2055,7 +2055,7 @@ again:
* pinners or erroring out.
*/
pgstat_count_io_op(IOOBJECT_RELATION, io_context,
- from_ring ? IOOP_REUSE : IOOP_EVICT);
+ from_ring ? IOOP_REUSE : IOOP_EVICT, 0);
}
/*
@@ -2411,7 +2411,7 @@ ExtendBufferedRelShared(BufferManagerRelation bmr,
UnlockRelationForExtension(bmr.rel, ExclusiveLock);
pgstat_count_io_op_time(IOOBJECT_RELATION, io_context, IOOP_EXTEND,
- io_start, extend_by);
+ io_start, 1, extend_by * BLCKSZ);
/* Set BM_VALID, terminate IO, and wake up any waiters */
for (uint32 i = 0; i < extend_by; i++)
@@ -3873,7 +3873,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln, IOObject io_object,
* of a dirty shared buffer (IOCONTEXT_NORMAL IOOP_WRITE).
*/
pgstat_count_io_op_time(IOOBJECT_RELATION, io_context,
- IOOP_WRITE, io_start, 1);
+ IOOP_WRITE, io_start, 1, BLCKSZ);
pgBufferUsage.shared_blks_written++;
@@ -4512,7 +4512,7 @@ FlushRelationBuffers(Relation rel)
pgstat_count_io_op_time(IOOBJECT_TEMP_RELATION,
IOCONTEXT_NORMAL, IOOP_WRITE,
- io_start, 1);
+ io_start, 1, BLCKSZ);
buf_state &= ~(BM_DIRTY | BM_JUST_DIRTIED);
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
@@ -6014,7 +6014,7 @@ IssuePendingWritebacks(WritebackContext *wb_context, IOContext io_context)
* blocks of permanent relations.
*/
pgstat_count_io_op_time(IOOBJECT_RELATION, io_context,
- IOOP_WRITEBACK, io_start, wb_context->nr_pending);
+ IOOP_WRITEBACK, io_start, wb_context->nr_pending, 0);
wb_context->nr_pending = 0;
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 8da7dd6c98a..37044ecd6c5 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -255,7 +255,7 @@ GetLocalVictimBuffer(void)
/* Temporary table I/O does not use Buffer Access Strategies */
pgstat_count_io_op_time(IOOBJECT_TEMP_RELATION, IOCONTEXT_NORMAL,
- IOOP_WRITE, io_start, 1);
+ IOOP_WRITE, io_start, 1, BLCKSZ);
/* Mark not-dirty now in case we error out below */
buf_state &= ~BM_DIRTY;
@@ -279,7 +279,7 @@ GetLocalVictimBuffer(void)
ClearBufferTag(&bufHdr->tag);
buf_state &= ~(BUF_FLAG_MASK | BUF_USAGECOUNT_MASK);
pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
- pgstat_count_io_op(IOOBJECT_TEMP_RELATION, IOCONTEXT_NORMAL, IOOP_EVICT);
+ pgstat_count_io_op(IOOBJECT_TEMP_RELATION, IOCONTEXT_NORMAL, IOOP_EVICT, 0);
}
return BufferDescriptorGetBuffer(bufHdr);
@@ -419,7 +419,7 @@ ExtendBufferedRelLocal(BufferManagerRelation bmr,
smgrzeroextend(bmr.smgr, fork, first_block, extend_by, false);
pgstat_count_io_op_time(IOOBJECT_TEMP_RELATION, IOCONTEXT_NORMAL, IOOP_EXTEND,
- io_start, extend_by);
+ io_start, 1, extend_by * BLCKSZ);
for (uint32 i = 0; i < extend_by; i++)
{
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 6796756358f..9e293d5688f 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -1386,7 +1386,7 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
* backend fsyncs.
*/
pgstat_count_io_op_time(IOOBJECT_RELATION, IOCONTEXT_NORMAL,
- IOOP_FSYNC, io_start, 1);
+ IOOP_FSYNC, io_start, 1, 0);
}
}
@@ -1773,7 +1773,7 @@ mdsyncfiletag(const FileTag *ftag, char *path)
FileClose(file);
pgstat_count_io_op_time(IOOBJECT_RELATION, IOCONTEXT_NORMAL,
- IOOP_FSYNC, io_start, 1);
+ IOOP_FSYNC, io_start, 1, 0);
errno = save_errno;
return result;
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index cc2ffc78aa9..0213f617f8a 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -23,6 +23,7 @@
typedef struct PgStat_PendingIO
{
+ uint64 bytes[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
PgStat_Counter counts[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
instr_time pending_times[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
} PgStat_PendingIO;
@@ -31,6 +32,12 @@ typedef struct PgStat_PendingIO
static PgStat_PendingIO PendingIOStats;
static bool have_iostats = false;
+static inline bool pgstat_io_count_checks(IOObject io_object,
+ IOContext io_context, IOOp io_op,
+ uint64 bytes);
+static inline void pgstat_count_io_op_n_inline(IOObject io_object,
+ IOContext io_context, IOOp io_op,
+ uint32 cnt, uint64 bytes);
/*
* Check that stats have not been counted for any combination of IOObject,
@@ -73,21 +80,50 @@ pgstat_bktype_io_stats_valid(PgStat_BktypeIO *backend_io,
return true;
}
-void
-pgstat_count_io_op(IOObject io_object, IOContext io_context, IOOp io_op)
-{
- pgstat_count_io_op_n(io_object, io_context, io_op, 1);
-}
-
-void
-pgstat_count_io_op_n(IOObject io_object, IOContext io_context, IOOp io_op, uint32 cnt)
+static inline bool
+pgstat_io_count_checks(IOObject io_object, IOContext io_context, IOOp io_op, uint64 bytes)
{
Assert((unsigned int) io_object < IOOBJECT_NUM_TYPES);
Assert((unsigned int) io_context < IOCONTEXT_NUM_TYPES);
Assert((unsigned int) io_op < IOOP_NUM_TYPES);
Assert(pgstat_tracks_io_op(MyBackendType, io_object, io_context, io_op));
+ /* Only IOOP_READ, IOOP_WRITE and IOOP_EXTEND can do IO in bytes. */
+ Assert((io_op == IOOP_READ || io_op == IOOP_WRITE || io_op == IOOP_EXTEND) ||
+ bytes == 0);
+
+ /*
+ * If IO done in bytes and byte is <= 0, this means there is an error
+ * while doing an IO. Don't count these IOs.
+ */
+ if ((io_op == IOOP_READ || io_op == IOOP_WRITE || io_op == IOOP_EXTEND) &&
+ bytes <= 0)
+ return false;
+
+ return true;
+}
+
+void
+pgstat_count_io_op(IOObject io_object, IOContext io_context, IOOp io_op, uint64 bytes)
+{
+ if (!pgstat_io_count_checks(io_object, io_context, io_op, bytes))
+ return;
+ pgstat_count_io_op_n_inline(io_object, io_context, io_op, 1, bytes);
+}
+
+void
+pgstat_count_io_op_n(IOObject io_object, IOContext io_context, IOOp io_op, uint32 cnt, uint64 bytes)
+{
+ if (!pgstat_io_count_checks(io_object, io_context, io_op, bytes))
+ return;
+ pgstat_count_io_op_n_inline(io_object, io_context, io_op, cnt, bytes);
+}
+
+static inline void
+pgstat_count_io_op_n_inline(IOObject io_object, IOContext io_context, IOOp io_op, uint32 cnt, uint64 bytes)
+{
PendingIOStats.counts[io_object][io_context][io_op] += cnt;
+ PendingIOStats.bytes[io_object][io_context][io_op] += bytes;
have_iostats = true;
}
@@ -120,8 +156,12 @@ pgstat_prepare_io_time(bool track_io_guc)
*/
void
pgstat_count_io_op_time(IOObject io_object, IOContext io_context, IOOp io_op,
- instr_time start_time, uint32 cnt)
+ instr_time start_time, uint32 cnt, uint64 bytes)
{
+
+ if (!pgstat_io_count_checks(io_object, io_context, io_op, bytes))
+ return;
+
if (track_io_timing)
{
instr_time io_time;
@@ -150,7 +190,7 @@ pgstat_count_io_op_time(IOObject io_object, IOContext io_context, IOOp io_op,
io_time);
}
- pgstat_count_io_op_n(io_object, io_context, io_op, cnt);
+ pgstat_count_io_op_n_inline(io_object, io_context, io_op, cnt, bytes);
}
PgStat_IO *
@@ -216,6 +256,9 @@ pgstat_io_flush_cb(bool nowait)
bktype_shstats->counts[io_object][io_context][io_op] +=
PendingIOStats.counts[io_object][io_context][io_op];
+ bktype_shstats->bytes[io_object][io_context][io_op] +=
+ PendingIOStats.bytes[io_object][io_context][io_op];
+
time = PendingIOStats.pending_times[io_object][io_context][io_op];
bktype_shstats->times[io_object][io_context][io_op] +=
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 33c7b25560b..3ea18263f6f 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -1272,14 +1272,16 @@ typedef enum io_stat_col
IO_COL_OBJECT,
IO_COL_CONTEXT,
IO_COL_READS,
+ IO_COL_READ_BYTES,
IO_COL_READ_TIME,
IO_COL_WRITES,
+ IO_COL_WRITE_BYTES,
IO_COL_WRITE_TIME,
IO_COL_WRITEBACKS,
IO_COL_WRITEBACK_TIME,
IO_COL_EXTENDS,
+ IO_COL_EXTEND_BYTES,
IO_COL_EXTEND_TIME,
- IO_COL_CONVERSION,
IO_COL_HITS,
IO_COL_EVICTIONS,
IO_COL_REUSES,
@@ -1320,11 +1322,36 @@ pgstat_get_io_op_index(IOOp io_op)
pg_unreachable();
}
+/*
+ * Get the number of the column containing IO bytes for the specified IOOp.
+ * If an op does not do IO in bytes, IO_COL_INVALID is returned.
+ */
+static io_stat_col
+pgstat_get_io_byte_index(IOOp io_op)
+{
+ switch (io_op)
+ {
+ case IOOP_EXTEND:
+ return IO_COL_EXTEND_BYTES;
+ case IOOP_READ:
+ return IO_COL_READ_BYTES;
+ case IOOP_WRITE:
+ return IO_COL_WRITE_BYTES;
+ case IOOP_EVICT:
+ case IOOP_FSYNC:
+ case IOOP_HIT:
+ case IOOP_REUSE:
+ case IOOP_WRITEBACK:
+ return IO_COL_INVALID;
+ }
+
+ elog(ERROR, "unrecognized IOOp value: %d", io_op);
+ pg_unreachable();
+}
+
/*
* Get the number of the column containing IO times for the specified IOOp.
- * This function encodes our assumption that IO time for an IOOp is displayed
- * in the view in the column directly after the IOOp counts. If an op has no
- * associated time, IO_COL_INVALID is returned.
+ * If an op has no associated time, IO_COL_INVALID is returned.
*/
static io_stat_col
pgstat_get_io_time_index(IOOp io_op)
@@ -1332,11 +1359,15 @@ pgstat_get_io_time_index(IOOp io_op)
switch (io_op)
{
case IOOP_READ:
+ return IO_COL_READ_TIME;
case IOOP_WRITE:
+ return IO_COL_WRITE_TIME;
case IOOP_WRITEBACK:
+ return IO_COL_WRITEBACK_TIME;
case IOOP_EXTEND:
+ return IO_COL_EXTEND_TIME;
case IOOP_FSYNC:
- return pgstat_get_io_op_index(io_op) + 1;
+ return IO_COL_FSYNC_TIME;
case IOOP_EVICT:
case IOOP_HIT:
case IOOP_REUSE:
@@ -1410,17 +1441,10 @@ pg_stat_get_io(PG_FUNCTION_ARGS)
values[IO_COL_OBJECT] = CStringGetTextDatum(obj_name);
values[IO_COL_RESET_TIME] = reset_time;
- /*
- * Hard-code this to the value of BLCKSZ for now. Future
- * values could include XLOG_BLCKSZ, once WAL IO is tracked,
- * and constant multipliers, once non-block-oriented IO (e.g.
- * temporary file IO) is tracked.
- */
- values[IO_COL_CONVERSION] = Int64GetDatum(BLCKSZ);
-
for (int io_op = 0; io_op < IOOP_NUM_TYPES; io_op++)
{
int op_idx = pgstat_get_io_op_index(io_op);
+ int byte_idx = pgstat_get_io_byte_index(io_op);
int time_idx = pgstat_get_io_time_index(io_op);
/*
@@ -1438,19 +1462,40 @@ pg_stat_get_io(PG_FUNCTION_ARGS)
else
nulls[op_idx] = true;
- /* not every operation is timed */
- if (time_idx == IO_COL_INVALID)
- continue;
-
if (!nulls[op_idx])
{
- PgStat_Counter time =
- bktype_stats->times[io_obj][io_context][io_op];
+ /* not every operation is timed */
+ if (time_idx != IO_COL_INVALID)
+ {
+ PgStat_Counter time =
+ bktype_stats->times[io_obj][io_context][io_op];
- values[time_idx] = Float8GetDatum(pg_stat_us_to_ms(time));
+ values[time_idx] = Float8GetDatum(pg_stat_us_to_ms(time));
+ }
+
+ /* not every IO done in bytes */
+ if (byte_idx != IO_COL_INVALID)
+ {
+ char buf[256];
+ PgStat_Counter byte =
+ bktype_stats->bytes[io_obj][io_context][io_op];
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, byte);
+ values[byte_idx] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
+ }
}
else
- nulls[time_idx] = true;
+ {
+ if (time_idx != IO_COL_INVALID)
+ nulls[time_idx] = true;
+ if (byte_idx != IO_COL_INVALID)
+ nulls[byte_idx] = true;
+ }
+
}
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index a1626f3fae9..4836198f785 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1888,21 +1888,23 @@ pg_stat_io| SELECT backend_type,
object,
context,
reads,
+ read_bytes,
read_time,
writes,
+ write_bytes,
write_time,
writebacks,
writeback_time,
extends,
+ extend_bytes,
extend_time,
- op_bytes,
hits,
evictions,
reuses,
fsyncs,
fsync_time,
stats_reset
- FROM pg_stat_get_io() b(backend_type, object, context, reads, read_time, writes, write_time, writebacks, writeback_time, extends, extend_time, op_bytes, hits, evictions, reuses, fsyncs, fsync_time, stats_reset);
+ FROM pg_stat_get_io() b(backend_type, object, context, reads, read_bytes, read_time, writes, write_bytes, write_time, writebacks, writeback_time, extends, extend_bytes, extend_time, hits, evictions, reuses, fsyncs, fsync_time, stats_reset);
pg_stat_progress_analyze| SELECT s.pid,
s.datid,
d.datname,
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 933de6fe07f..dc85b342b39 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2692,8 +2692,18 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<structfield>reads</structfield> <type>bigint</type>
</para>
<para>
- Number of read operations, each of the size specified in
- <varname>op_bytes</varname>.
+ Number of read operations.
+ </para>
+ </entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry">
+ <para role="column_definition">
+ <structfield>read_bytes</structfield> <type>bigint</type>
+ </para>
+ <para>
+ The total size of read operations in bytes.
</para>
</entry>
</row>
@@ -2716,8 +2726,18 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<structfield>writes</structfield> <type>bigint</type>
</para>
<para>
- Number of write operations, each of the size specified in
- <varname>op_bytes</varname>.
+ Number of write operations.
+ </para>
+ </entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry">
+ <para role="column_definition">
+ <structfield>write_bytes</structfield> <type>bigint</type>
+ </para>
+ <para>
+ The total size of write operations in bytes.
</para>
</entry>
</row>
@@ -2740,7 +2760,7 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<structfield>writebacks</structfield> <type>bigint</type>
</para>
<para>
- Number of units of size <varname>op_bytes</varname> which the process
+ Number of units of size <symbol>BLCKSZ</symbol> which the process
requested the kernel write out to permanent storage.
</para>
</entry>
@@ -2766,8 +2786,18 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<structfield>extends</structfield> <type>bigint</type>
</para>
<para>
- Number of relation extend operations, each of the size specified in
- <varname>op_bytes</varname>.
+ Number of relation extend operations.
+ </para>
+ </entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry">
+ <para role="column_definition">
+ <structfield>extend_bytes</structfield> <type>bigint</type>
+ </para>
+ <para>
+ The total size of relation extend operations in bytes.
</para>
</entry>
</row>
@@ -2784,23 +2814,6 @@ description | Waiting for a newly initialized WAL file to reach durable storage
</entry>
</row>
- <row>
- <entry role="catalog_table_entry">
- <para role="column_definition">
- <structfield>op_bytes</structfield> <type>bigint</type>
- </para>
- <para>
- The number of bytes per unit of I/O read, written, or extended.
- </para>
- <para>
- Relation data reads, writes, and extends are done in
- <varname>block_size</varname> units, derived from the build-time
- parameter <symbol>BLCKSZ</symbol>, which is <literal>8192</literal> by
- default.
- </para>
- </entry>
- </row>
-
<row>
<entry role="catalog_table_entry">
<para role="column_definition">
--
2.45.2
On Wed, Sep 11, 2024 at 7:19 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
Currently, in the pg_stat_io view, IOs are counted as blocks. However, there are two issues with this approach:
1- The actual number of IO requests to the kernel is lower because IO requests can be merged before sending the final request. Additionally, it appears that all IOs are counted in block size.
I think this is a great idea. It will allow people to tune
io_combine_limit as you mention below.
2- Some IOs may not align with block size. For example, WAL read IOs are done in variable bytes and it is not possible to correctly show these IOs in the pg_stat_io view [1].
Yep, this makes a lot of sense as a solution.
To address this, I propose showing the total number of IO requests to the kernel (as smgr function calls) and the total number of bytes in the IO. To implement this change, the op_bytes column will be removed from the pg_stat_io view. Instead, the [reads | writes | extends] columns will track the total number of IO requests, and newly added [read | write | extend]_bytes columns will track the total number of bytes in the IO.
smgr API seems like the right place for this.
Example benefit of this change:
Running query [2], the result is:
╔═══════════════════╦══════════╦══════════╦═══════════════╗
║ backend_type ║ object ║ context ║ avg_io_blocks ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ client backend ║ relation ║ bulkread ║ 15.99 ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ background worker ║ relation ║ bulkread ║ 15.99 ║
╚═══════════════════╩══════════╩══════════╩═══════════════╝
I don't understand why background worker is listed here.
You can rerun the same query [2] after setting io_combine_limit to 32 [3]. The result is:
╔═══════════════════╦══════════╦══════════╦═══════════════╗
║ backend_type ║ object ║ context ║ avg_io_blocks ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ client backend ║ relation ║ bulkread ║ 31.70 ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ background worker ║ relation ║ bulkread ║ 31.60 ║
╚═══════════════════╩══════════╩══════════╩═══════════════╝
I believe that having visibility into avg_io_[bytes | blocks] is valuable information that could help optimize Postgres.
In general, for this example, I think it would be more clear if you
compared what visibility we have in pg_stat_io on master with what
visibility we have with your patch.
I like that you show how io_combine_limit can be tuned using this, but
I don't think the problem statement is clear nor is the full
narrative.
CREATE TABLE t as select i, repeat('a', 600) as filler from generate_series(1, 10000000) as i;
SELECT pg_stat_reset_shared('io');
SELECT * FROM t WHERE i = 0;
SELECT backend_type, object, context, TRUNC((read_bytes / reads / (SELECT current_setting('block_size')::numeric)), 2) as avg_io_blocks FROM pg_stat_io WHERE reads > 0;
I like that you calculate the avg_io_blocks, but I think it is good to
show the raw columns as well.
- Melanie
Hi,
On Wed, Nov 27, 2024 at 11:08:01AM -0500, Melanie Plageman wrote:
On Wed, Sep 11, 2024 at 7:19 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
Currently, in the pg_stat_io view, IOs are counted as blocks. However, there are two issues with this approach:
1- The actual number of IO requests to the kernel is lower because IO requests can be merged before sending the final request. Additionally, it appears that all IOs are counted in block size.
I think this is a great idea. It will allow people to tune
io_combine_limit as you mention below.
2- Some IOs may not align with block size. For example, WAL read IOs are done in variable bytes and it is not possible to correctly show these IOs in the pg_stat_io view [1].
Yep, this makes a lot of sense as a solution.
Thanks for the patch! I also think it makes sense.
A few random comments:
=== 1
+ /*
+ * If IO done in bytes and byte is <= 0, this means there is an error
+ * while doing an IO. Don't count these IOs.
+ */
s/byte/bytes/?
also:
The pgstat_io_count_checks() parameter is uint64. Does it mean it has to be
changed to int64?
Also from what I can see the calls are done with those values:
- 0
- io_buffers_len * BLCKSZ
- extend_by * BLCKSZ
- BLCKSZ
could io_buffers_len and extend_by be < 0? If not, is the comment correct?
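To illustrate the point about the unsigned parameter (a standalone sketch, not part of the patch; demo_should_count and demo_as_unsigned are hypothetical names): with a uint64 argument, `bytes <= 0` can only be true when bytes is exactly 0, and a negative value passed by a caller would wrap around to a large positive number instead of being rejected:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the check in the patch, using an unsigned type. */
static bool
demo_should_count(uint64_t bytes)
{
	/* For an unsigned type this is equivalent to (bytes != 0). */
	return !(bytes <= 0);
}

/* Convert a signed result (e.g. -1 on error) the way an implicit cast would. */
static uint64_t
demo_as_unsigned(int64_t value)
{
	return (uint64_t) value;
}
```

So the check only filters out zero-byte calls; whether that matches the intent of the comment is the open question.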
=== 2
+ Assert((io_op == IOOP_READ || io_op == IOOP_WRITE || io_op == IOOP_EXTEND
and
+ if ((io_op == IOOP_READ || io_op == IOOP_WRITE || io_op == IOOP_EXTEND) &&
What about ordering the enum in IOOp (no bytes/bytes) so that we could check
that io_op >= "our first bytes enum" instead?
Also we could create a macro on top of that to make it clear. And a comment
would be needed around the IOOp definition.
I think that would be simpler to maintain should we add no bytes or bytes op in
the future.
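A minimal sketch of what such an ordering and macro could look like (all names here, DemoIOOp, DEMO_IOOP_* and demo_ioop_tracked_in_bytes, are hypothetical and chosen for illustration only; this is not the patch's code):

```c
#include <assert.h>

/*
 * Hypothetical ordering: operations not tracked in bytes come first,
 * operations tracked in bytes come last.
 */
typedef enum DemoIOOp
{
	/* not tracked in bytes */
	DEMO_IOOP_EVICT,
	DEMO_IOOP_FSYNC,
	DEMO_IOOP_HIT,
	DEMO_IOOP_REUSE,
	DEMO_IOOP_WRITEBACK,

	/* tracked in bytes */
	DEMO_IOOP_EXTEND,
	DEMO_IOOP_READ,
	DEMO_IOOP_WRITE
} DemoIOOp;

#define DEMO_IOOP_NUM_TYPES (DEMO_IOOP_WRITE + 1)

/* True for every op at or after the first byte-tracked enum value. */
#define demo_ioop_tracked_in_bytes(io_op) \
	((unsigned int) (io_op) >= DEMO_IOOP_EXTEND && \
	 (unsigned int) (io_op) < DEMO_IOOP_NUM_TYPES)
```

With this layout, adding a new op only requires placing it in the right group; the macro itself would not need to change unless the first byte-tracked value changes.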
=== 3
+pgstat_io_count_checks(IOObject io_object, IOContext io_context, IOOp io_op, uint64 bytes)
+{
+ Assert((unsigned int) io_object < IOOBJECT_NUM_TYPES);
+ Assert((unsigned int) io_context < IOCONTEXT_NUM_TYPES);
+ Assert((unsigned int) io_op < IOOP_NUM_TYPES);
+ Assert(pgstat_tracks_io_op(MyBackendType, io_object, io_context, io_op));
IOObject and IOContext are passed only for the assertions. What about removing
them from there and putting the asserts in other places?
=== 4
+ /* Only IOOP_READ, IOOP_WRITE and IOOP_EXTEND can do IO in bytes. */
Not sure about "can do IO in bytes" (same wording is used in multiple places).
=== 5
/* Convert to numeric. */
"convert to numeric"? to be consistent with others single line comments around.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
Thanks for looking into this!
On Wed, 27 Nov 2024 at 19:08, Melanie Plageman
<melanieplageman@gmail.com> wrote:
On Wed, Sep 11, 2024 at 7:19 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
Currently, in the pg_stat_io view, IOs are counted as blocks. However, there are two issues with this approach:
1- The actual number of IO requests to the kernel is lower because IO requests can be merged before sending the final request. Additionally, it appears that all IOs are counted in block size.
I think this is a great idea. It will allow people to tune
io_combine_limit as you mention below.
2- Some IOs may not align with block size. For example, WAL read IOs are done in variable bytes and it is not possible to correctly show these IOs in the pg_stat_io view [1].
Yep, this makes a lot of sense as a solution.
To address this, I propose showing the total number of IO requests to the kernel (as smgr function calls) and the total number of bytes in the IO. To implement this change, the op_bytes column will be removed from the pg_stat_io view. Instead, the [reads | writes | extends] columns will track the total number of IO requests, and newly added [read | write | extend]_bytes columns will track the total number of bytes in the IO.
smgr API seems like the right place for this.
Example benefit of this change:
Running query [2], the result is:
╔═══════════════════╦══════════╦══════════╦═══════════════╗
║ backend_type ║ object ║ context ║ avg_io_blocks ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ client backend ║ relation ║ bulkread ║ 15.99 ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ background worker ║ relation ║ bulkread ║ 15.99 ║
╚═══════════════════╩══════════╩══════════╩═══════════════╝
I don't understand why background worker is listed here.
A parallel sequential scan happens in this example, and parallel workers
are listed as background workers. After setting
'max_parallel_workers_per_gather' to 0, that row is gone.
You can rerun the same query [2] after setting io_combine_limit to 32 [3]. The result is:
╔═══════════════════╦══════════╦══════════╦═══════════════╗
║ backend_type      ║ object   ║ context  ║ avg_io_blocks ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ client backend    ║ relation ║ bulkread ║ 31.70         ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ background worker ║ relation ║ bulkread ║ 31.60         ║
╚═══════════════════╩══════════╩══════════╩═══════════════╝
I believe that having visibility into avg_io_[bytes | blocks] is valuable information that could help optimize Postgres.
In general, for this example, I think it would be more clear if you
compared what visibility we have in pg_stat_io on master with what
visibility we have with your patch.
I am listing the changes as text; images are also attached.
* The [reads | writes | extends] columns now count the number of smgr
function calls. Before, they counted the number of block IOs.
* The op_bytes column is removed from the view because each IO can have
a different size; IO sizes are not always equal to op_bytes.
* The [read_bytes | write_bytes | extend_bytes] columns are added. These
columns count IO sizes in bytes.
There are two different IO cases:
1- The size of the IOs is constant:
* See the 'client backend / bulkread' row. If you divide the read_bytes
column's value (6826754048) in the patched image by BLCKSZ (8192), you
get the reads column's value (833344) in the upstream image. So, we
do not lose any information when the size of the IOs is constant.
2- The sizes of the IOs differ:
* The upstream version gives wrong information in this case. For
example, see the WALRead() function: pg_pread() is called with different
segbytes values there. It is not possible to show this stat correctly
in the upstream pg_stat_io view.
The problem with the upstream version of the pg_stat_io view is that
multiplying the number of blocks by op_bytes does not always give the
total IO size. Also, the view makes it look as if Postgres issues one IO
request per block. This patch tries to address these problems.
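As a rough illustration of the two cases above, here is a minimal Python sketch. It is not code from the patch: patched_counters, blocks_if_exact and the variable-size list are hypothetical names and values made up for this example. Only 6826754048, 8192 and 833344 come from the attached images.

```python
BLCKSZ = 8192  # block size used in the example above


def patched_counters(io_sizes):
    """Model of the proposed accounting: one count per request, plus total bytes."""
    return {"requests": len(io_sizes), "bytes": sum(io_sizes)}


def blocks_if_exact(total_bytes, block_size=BLCKSZ):
    """Return the block count if total_bytes is a whole number of blocks, else None."""
    blocks, remainder = divmod(total_bytes, block_size)
    return blocks if remainder == 0 else None


# Case 1: constant-size IOs. The byte total converts back to a block count.
print(blocks_if_exact(6826754048))

# Case 2: variable-size IOs (hypothetical sizes, in the spirit of WALRead()).
# No block count multiplied by a fixed op_bytes reproduces this byte total.
variable_sizes = [8192, 3000, 512]
print(patched_counters(variable_sizes))
print(blocks_if_exact(sum(variable_sizes)))
```

In case 1 the byte counter carries the same information as the old block counter; in case 2 only the byte counter can be right.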
I like that you show how io_combine_limit can be tuned using this, but
I don't think the problem statement is clear nor is the full
narrative.
I just wanted to show one piece of information that can be gathered
with the patched version and that was not possible to gather before.
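For illustration, the avg_io_blocks expression from query [2] can be mirrored outside SQL as in the sketch below. The function name avg_io_blocks is only borrowed from the column alias, and the reads value of 52100 is a made-up number; only read_bytes (6826754048) and the block size (8192) are taken from this mail.

```python
from decimal import Decimal, ROUND_DOWN


def avg_io_blocks(read_bytes, reads, block_size=8192):
    """Average read request size in blocks, truncated to 2 decimals like TRUNC(x, 2)."""
    if reads == 0:
        return None  # the query skips such rows with WHERE reads > 0
    avg = Decimal(read_bytes) / Decimal(reads) / Decimal(block_size)
    return avg.quantize(Decimal("0.01"), rounding=ROUND_DOWN)


# read_bytes is from the patched image; reads is a hypothetical request count.
print(avg_io_blocks(6826754048, 52100))
```

The closer the result is to io_combine_limit, the more fully the read requests are being combined.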
CREATE TABLE t as select i, repeat('a', 600) as filler from generate_series(1, 10000000) as i;
SELECT pg_stat_reset_shared('io');
SELECT * FROM t WHERE i = 0;
SELECT backend_type, object, context, TRUNC((read_bytes / reads / (SELECT current_setting('block_size')::numeric)), 2) as avg_io_blocks FROM pg_stat_io WHERE reads > 0;
I like that you calculate the avg_io_blocks, but I think it is good to
show the raw columns as well.
Images of the view after running the query [1] are attached.
P.S. I attached the images of the view because I do not know how they
will look if I copy paste them as text. If there is a way to add them
as text without distortion, please let me know.
[1]:
SET track_io_timing to ON;
SET max_parallel_workers_per_gather TO 0;
SELECT pg_stat_reset_shared('io');
CREATE TABLE t as select i, repeat('a', 600) as filler from
generate_series(1, 10000000) as i;
SELECT * FROM t WHERE i = 0;
SELECT * FROM pg_stat_io;
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachments:
upstream_pg_stat_io.png (image/png)