NDirectFileRead and Write
Hello,
I'd like to use NDirectFileRead and NDirectFileWrite statistics counters
for counting reads and writes in BufFile. They are defined, but not used
now. BufFile is used for tuple sorting or materializing, so we could use
NDirectFileRead/Write to retrieve how many I/Os are done in temp tablespace.
We can get the values of NDirectFileRead/Write when we enable
log_statement_stats, log_parser_stats, log_planner_stats or
log_executor_stats. (They show always 0 for NDirectFile R/W now.)
I have a plan to store the values shown by log_xxx_stats into
pg_stat_statements, that is per-query statistics information collector
I'm developing. The combination of NDirectFile R/W and pg_stat_statements
have an advantage to detect which sql uses large tuple sorting.
And after we find such queries, we can turn trace_sort on and examine
how amount of work_mem is needed by such queries. We could save time
for tuning queries and reading server logs.
-- The patch would be very trivial ;-)
Index: src/backend/storage/file/buffile.c
===================================================================
--- src/backend/storage/file/buffile.c (HEAD)
+++ src/backend/storage/file/buffile.c (working copy)
@@ -238,6 +238,7 @@
file->nbytes = 0;
file->offsets[file->curFile] += file->nbytes;
/* we choose not to advance curOffset here */
+ NDirectFileRead++;
}
/*
@@ -300,6 +301,7 @@
file->offsets[file->curFile] += bytestowrite;
file->curOffset += bytestowrite;
wpos += bytestowrite;
+ NDirectFileWrite++;
}
file->dirty = false;
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Here is a patch to user NDirectFileRead/Write counters to get I/O counts
in BufFile module. We can see the counters when log_statement_stats is on.
The information is different from trace_sort; trace_sort shows used blocks
in external sort, and log_statement_stats shows how many I/Os are submitted
during sorts.
I wrote:
I'd like to use NDirectFileRead and NDirectFileWrite statistics counters
for counting reads and writes in BufFile. They are defined, but not used
now. BufFile is used for tuple sorting or materializing, so we could use
NDirectFileRead/Write to retrieve how many I/Os are done in temp tablespace.
=# SET client_min_messages = log;
=# SET trace_sort = on;
=# SET log_statement_stats = on;
=# EXPLAIN ANALYZE SELECT * FROM generate_series(1, 1000000) AS i ORDER BY i;
LOG: begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = f
LOG: switching to external sort with 7 tapes: CPU 0.09s/0.26u sec elapsed 0.35 sec
LOG: performsort starting: CPU 0.48s/1.68u sec elapsed 2.20 sec
LOG: finished writing final run 1 to tape 0: CPU 0.48s/1.70u sec elapsed 2.21 sec
LOG: performsort done: CPU 0.48s/1.70u sec elapsed 2.21 sec
LOG: external sort ended, 2444 disk blocks used: CPU 0.79s/2.23u sec elapsed 3.06 sec
LOG: QUERY STATISTICS
DETAIL: ! system usage stats:
! 3.078000 elapsed 2.234375 user 0.812500 system sec
! [3.328125 user 1.281250 sys total]
! buffer usage stats:
! Shared blocks: 0 read, 0 written, buffer hit rate = 0.00%
! Local blocks: 0 read, 0 written, buffer hit rate = 0.00%
! Direct blocks: 5375 read, 5374 written
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=62.33..64.83 rows=1000 width=4) (actual time=2221.485..2743.831 rows=1000000 loops=1)
Sort Key: i
Sort Method: external sort Disk: 19552kB
-> Function Scan on generate_series i (cost=0.00..12.50 rows=1000 width=4) (actual time=349.065..892.907 rows=1000000 loops=1)
Total runtime: 3087.305 ms
(5 rows)
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Attachments:
NDirectFileReadWrite.patchapplication/octet-stream; name=NDirectFileReadWrite.patchDownload
diff -cpr HEAD/src/backend/storage/buffer/bufmgr.c NDirectFileReadWrite/src/backend/storage/buffer/bufmgr.c
*** HEAD/src/backend/storage/buffer/bufmgr.c Fri Aug 1 22:16:08 2008
--- NDirectFileReadWrite/src/backend/storage/buffer/bufmgr.c Tue Aug 5 15:00:26 2008
*************** int bgwriter_lru_maxpages = 100;
*** 65,73 ****
double bgwriter_lru_multiplier = 2.0;
! long NDirectFileRead; /* some I/O's are direct file access. bypass
! * bufmgr */
! long NDirectFileWrite; /* e.g., I/O in psort and hashjoin. */
/* local state for StartBufferIO and related functions */
--- 65,76 ----
double bgwriter_lru_multiplier = 2.0;
! /*
! * Some I/O's are direct file access and bypass bufmgr. They are typically
! * disk sorting and materializing using BufFile.
! */
! long NDirectFileRead;
! long NDirectFileWrite;
/* local state for StartBufferIO and related functions */
diff -cpr HEAD/src/backend/storage/file/buffile.c NDirectFileReadWrite/src/backend/storage/file/buffile.c
*** HEAD/src/backend/storage/file/buffile.c Fri May 2 10:08:27 2008
--- NDirectFileReadWrite/src/backend/storage/file/buffile.c Tue Aug 5 15:00:26 2008
*************** BufFileLoadBuffer(BufFile *file)
*** 238,243 ****
--- 238,244 ----
file->nbytes = 0;
file->offsets[file->curFile] += file->nbytes;
/* we choose not to advance curOffset here */
+ NDirectFileRead++;
}
/*
*************** BufFileDumpBuffer(BufFile *file)
*** 300,305 ****
--- 301,307 ----
file->offsets[file->curFile] += bytestowrite;
file->curOffset += bytestowrite;
wpos += bytestowrite;
+ NDirectFileWrite++;
}
file->dirty = false;
diff -cpr HEAD/src/include/executor/execdebug.h NDirectFileReadWrite/src/include/executor/execdebug.h
*** HEAD/src/include/executor/execdebug.h Wed Jan 2 04:45:57 2008
--- NDirectFileReadWrite/src/include/executor/execdebug.h Tue Aug 5 15:00:26 2008
*************** extern int NIndexTupleInserted;
*** 190,208 ****
#define MJ_DEBUG_PROC_NODE(slot)
#endif /* EXEC_MERGEJOINDEBUG */
- /* ----------------------------------------------------------------
- * DO NOT DEFINE THESE EVER OR YOU WILL BURN!
- * ----------------------------------------------------------------
- */
- /* ----------------
- * NOTYET is placed around any code not yet implemented
- * in the executor. Only remove these when actually implementing
- * said code.
- * ----------------
- */
- #undef NOTYET
-
- extern long NDirectFileRead;
- extern long NDirectFileWrite;
-
#endif /* ExecDebugIncluded */
--- 190,193 ----
diff -cpr HEAD/src/include/storage/buffile.h NDirectFileReadWrite/src/include/storage/buffile.h
*** HEAD/src/include/storage/buffile.h Tue Mar 11 05:06:27 2008
--- NDirectFileReadWrite/src/include/storage/buffile.h Tue Aug 5 15:00:26 2008
***************
*** 30,35 ****
--- 30,39 ----
typedef struct BufFile BufFile;
+ /* event counters in buffile.c */
+ extern long NDirectFileRead;
+ extern long NDirectFileWrite;
+
/*
* prototypes for functions in buffile.c
*/
2008/8/5 ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>:
Here is a patch to user NDirectFileRead/Write counters to get I/O counts
in BufFile module. We can see the counters when log_statement_stats is on.The information is different from trace_sort; trace_sort shows used blocks
in external sort, and log_statement_stats shows how many I/Os are submitted
during sorts.I wrote:
I'd like to use NDirectFileRead and NDirectFileWrite statistics counters
for counting reads and writes in BufFile. They are defined, but not used
now. BufFile is used for tuple sorting or materializing, so we could use
NDirectFileRead/Write to retrieve how many I/Os are done in temp tablespace.=# SET client_min_messages = log;
=# SET trace_sort = on;
=# SET log_statement_stats = on;
=# EXPLAIN ANALYZE SELECT * FROM generate_series(1, 1000000) AS i ORDER BY i;
LOG: begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = f
LOG: switching to external sort with 7 tapes: CPU 0.09s/0.26u sec elapsed 0.35 sec
LOG: performsort starting: CPU 0.48s/1.68u sec elapsed 2.20 sec
LOG: finished writing final run 1 to tape 0: CPU 0.48s/1.70u sec elapsed 2.21 sec
LOG: performsort done: CPU 0.48s/1.70u sec elapsed 2.21 sec
LOG: external sort ended, 2444 disk blocks used: CPU 0.79s/2.23u sec elapsed 3.06 sec
LOG: QUERY STATISTICS
DETAIL: ! system usage stats:
! 3.078000 elapsed 2.234375 user 0.812500 system sec
! [3.328125 user 1.281250 sys total]
! buffer usage stats:
! Shared blocks: 0 read, 0 written, buffer hit rate = 0.00%
! Local blocks: 0 read, 0 written, buffer hit rate = 0.00%
! Direct blocks: 5375 read, 5374 written
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=62.33..64.83 rows=1000 width=4) (actual time=2221.485..2743.831 rows=1000000 loops=1)
Sort Key: i
Sort Method: external sort Disk: 19552kB
-> Function Scan on generate_series i (cost=0.00..12.50 rows=1000 width=4) (actual time=349.065..892.907 rows=1000000 loops=1)
Total runtime: 3087.305 ms
(5 rows)Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Though I'm not quite good around buffers, I reviewed this patch and
found that it seems strange about the declaring position of the two
variables. In ResetBufferUsage() of bufmgr.c it says:
void
ResetBufferUsage(void)
{
BufferHitCount = 0;
ReadBufferCount = 0;
BufferFlushCount = 0;
LocalBufferHitCount = 0;
ReadLocalBufferCount = 0;
LocalBufferFlushCount = 0;
NDirectFileRead = 0;
NDirectFileWrite = 0;
}
so I guess all of these variables should be defined together but
actually you put the two in buffile.h while the others in
buf_iinnternals.h. Is there clear reason for that?
Regards,
--
Hitoshi Harada
"Hitoshi Harada" <umi.tanuki@gmail.com> wrote:
so I guess all of these variables should be defined together but
actually you put the two in buffile.h while the others in
buf_internals.h. Is there clear reason for that?
That's because buffile.c includes buffile.h, but not buf_internals.h .
NDirectFileRead/Writes are counters for BufFile in the patch
and don't depend on bufmgr.
It might be cleaner if we had something like storage/bufstats.h
and declared all counter variables in it, but it seems to be an overkill.
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
Here is a patch to user NDirectFileRead/Write counters to get I/O counts
in BufFile module. We can see the counters when log_statement_stats is on.
Couple thoughts here:
* Let's rename them BufFileReadCount and BufFileWriteCount to reflect
their actual purpose.
* If the extern's are moved to buffile.h, I think the definitions of the
variables themselves should move to buffile.c. However, that would
imply including buffile.h in bufmgr.c which is a bit ugly from a
modularity standpoint. In any case I agree that the current arrangement
with execdebug.h declaring variables defined in bufmgr.c is just weird.
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
* Let's rename them BufFileReadCount and BufFileWriteCount to reflect
their actual purpose.
It looks good. No one uses NDirectFileRead/Write variables now,
so we can rename it freely.
BTW, we have the word 'Direct' in ShowBufferUsage().
Can I keep it as-is?
ShowBufferUsage()
! Shared blocks: ...
! Local blocks: ...
! Direct blocks: <R> read, <W> written
* In any case I agree that the current arrangement
with execdebug.h declaring variables defined in bufmgr.c is just weird.
Ok.
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
BTW, we have the word 'Direct' in ShowBufferUsage().
Can I keep it as-is?
ShowBufferUsage()
! Shared blocks: ...
! Local blocks: ...
! Direct blocks: <R> read, <W> written
Good point. Seems like it should be changed, but I am not sure to what.
Ideas anyone?
regards, tom lane
Here is a revised patch to renewal NDirectFileRead/Write.
Tom Lane <tgl@sss.pgh.pa.us> wrote:
* Let's rename them BufFileReadCount and BufFileWriteCount to reflect
their actual purpose.
* In any case I agree that the current arrangement
with execdebug.h declaring variables defined in bufmgr.c is just weird.
- NDirectFile{Read|Write} are renamed to BufFile{Read|Write}Count.
- They are still declared in execdebug.h and buffile.c includes it.
I did not touch messages in ShowBufferUsage() in the patch.
I think it still has meaning because BufFile counters are
kinds of direct block access.
ShowBufferUsage()
! Shared blocks: ...
! Local blocks: ...
! Direct blocks: <R> read, <W> written
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Attachments:
BufFileCount.patchapplication/octet-stream; name=BufFileCount.patchDownload
diff -cpr HEAD/src/backend/storage/buffer/bufmgr.c BufFileCount/src/backend/storage/buffer/bufmgr.c
*** HEAD/src/backend/storage/buffer/bufmgr.c Mon Aug 11 20:05:11 2008
--- BufFileCount/src/backend/storage/buffer/bufmgr.c Tue Sep 16 10:27:22 2008
*************** bool zero_damaged_pages = false;
*** 64,73 ****
int bgwriter_lru_maxpages = 100;
double bgwriter_lru_multiplier = 2.0;
!
! long NDirectFileRead; /* some I/O's are direct file access. bypass
! * bufmgr */
! long NDirectFileWrite; /* e.g., I/O in psort and hashjoin. */
/* local state for StartBufferIO and related functions */
--- 64,75 ----
int bgwriter_lru_maxpages = 100;
double bgwriter_lru_multiplier = 2.0;
! /*
! * Some I/O's are direct file access and bypass bufmgr. They occur
! * typically in disk sorting, hashjoin and materializing using BufFile.
! */
! long BufFileReadCount;
! long BufFileWriteCount;
/* local state for StartBufferIO and related functions */
*************** ShowBufferUsage(void)
*** 1572,1578 ****
ReadLocalBufferCount - LocalBufferHitCount, LocalBufferFlushCount, localhitrate);
appendStringInfo(&str,
"!\tDirect blocks: %10ld read, %10ld written\n",
! NDirectFileRead, NDirectFileWrite);
return str.data;
}
--- 1574,1580 ----
ReadLocalBufferCount - LocalBufferHitCount, LocalBufferFlushCount, localhitrate);
appendStringInfo(&str,
"!\tDirect blocks: %10ld read, %10ld written\n",
! BufFileReadCount, BufFileWriteCount);
return str.data;
}
*************** ResetBufferUsage(void)
*** 1586,1593 ****
LocalBufferHitCount = 0;
ReadLocalBufferCount = 0;
LocalBufferFlushCount = 0;
! NDirectFileRead = 0;
! NDirectFileWrite = 0;
}
/*
--- 1588,1595 ----
LocalBufferHitCount = 0;
ReadLocalBufferCount = 0;
LocalBufferFlushCount = 0;
! BufFileReadCount = 0;
! BufFileWriteCount = 0;
}
/*
diff -cpr HEAD/src/backend/storage/file/buffile.c BufFileCount/src/backend/storage/file/buffile.c
*** HEAD/src/backend/storage/file/buffile.c Fri May 2 10:08:27 2008
--- BufFileCount/src/backend/storage/file/buffile.c Tue Sep 16 10:27:22 2008
***************
*** 34,39 ****
--- 34,40 ----
#include "postgres.h"
+ #include "executor/execdebug.h"
#include "storage/fd.h"
#include "storage/buffile.h"
*************** BufFileLoadBuffer(BufFile *file)
*** 238,243 ****
--- 239,245 ----
file->nbytes = 0;
file->offsets[file->curFile] += file->nbytes;
/* we choose not to advance curOffset here */
+ BufFileReadCount++;
}
/*
*************** BufFileDumpBuffer(BufFile *file)
*** 300,305 ****
--- 302,308 ----
file->offsets[file->curFile] += bytestowrite;
file->curOffset += bytestowrite;
wpos += bytestowrite;
+ BufFileWriteCount++;
}
file->dirty = false;
diff -cpr HEAD/src/include/executor/execdebug.h BufFileCount/src/include/executor/execdebug.h
*** HEAD/src/include/executor/execdebug.h Wed Jan 2 04:45:57 2008
--- BufFileCount/src/include/executor/execdebug.h Tue Sep 16 10:27:22 2008
*************** extern int NIndexTupleInserted;
*** 190,208 ****
#define MJ_DEBUG_PROC_NODE(slot)
#endif /* EXEC_MERGEJOINDEBUG */
! /* ----------------------------------------------------------------
! * DO NOT DEFINE THESE EVER OR YOU WILL BURN!
! * ----------------------------------------------------------------
! */
! /* ----------------
! * NOTYET is placed around any code not yet implemented
! * in the executor. Only remove these when actually implementing
! * said code.
! * ----------------
! */
! #undef NOTYET
!
! extern long NDirectFileRead;
! extern long NDirectFileWrite;
#endif /* ExecDebugIncluded */
--- 190,197 ----
#define MJ_DEBUG_PROC_NODE(slot)
#endif /* EXEC_MERGEJOINDEBUG */
! /* event counters in buffile.c */
! extern long BufFileReadCount;
! extern long BufFileWriteCount;
#endif /* ExecDebugIncluded */
Tom Lane wrote:
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
BTW, we have the word 'Direct' in ShowBufferUsage().
Can I keep it as-is?ShowBufferUsage()
! Shared blocks: ...
! Local blocks: ...
! Direct blocks: <R> read, <W> writtenGood point. Seems like it should be changed, but I am not sure to what.
Ideas anyone?
How about "Temp blocks"? It might be slightly confusing that access to
temporary tables shows up in "Local blocks", rather than "Temp blocks",
though. But then again, temp_tablespaces already controls where the
files are created, among other things, so there's some precedence to
calling those things "temp".
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
- NDirectFile{Read|Write} are renamed to BufFile{Read|Write}Count.
- They are still declared in execdebug.h and buffile.c includes it.
After some thought I concluded that the least ugly way to do this
was to declare and define the variables in the same places as the
other counters that are printed by ShowBufferUsage. It's not any
worse from a modularity standpoint than the other alternatives we
considered, and it seems to satisfy the principle of least
surprise a little better. Committed it that way.
I did not touch messages in ShowBufferUsage() in the patch.
I didn't change the message either. Heikki's suggestion of "temp
blocks" doesn't seem very good to me because of the likelihood
of confusion with temp tables. I don't have a better alternative
offhand.
regards, tom lane