Buffer Requests Trace

Started by Lucas Lerschabout 11 years ago19 messages
#1Lucas Lersch
lucaslersch@gmail.com

Hello,

I changed the buffer manager code in order to generate a trace of page
requests from the buffer manager perspective. In summary, whenever
ReleaseBuffer() or ReleaseAndReadBuffer() are called, I print the page
currently being released which is identified by the tuple (tableSpace,
dbNode, relationNode, blockNumber).

I am now running a tpcc benchmark from http://oltpbenchmark.com/

Initially I create and load the database with a scale factor of 64. This
sums up to a database of around 6.7GB (~ 800k pages). Then I execute the
tpcc benchmark for 1 minute with only 1 terminal. Finally I analyse the
trace of the buffer requests made by the execution of the benchmark only
(creation and loading not considered).

Unfortunately, in the generated trace with over 2 million buffer requests,
only ~14k different pages are being accessed, out of the 800k of the whole
database. Am I missing something here?

Best regards.
--
Lucas Lersch

#2Stephen Frost
sfrost@snowman.net
In reply to: Lucas Lersch (#1)
Re: Buffer Requests Trace

* Lucas Lersch (lucaslersch@gmail.com) wrote:

Unfortunately, in the generated trace with over 2 million buffer requests,
only ~14k different pages are being accessed, out of the 800k of the whole
database. Am I missing something here?

What do you have shared_buffers set to..?

Thanks,

Stephen

#3Lucas Lersch
lucaslersch@gmail.com
In reply to: Stephen Frost (#2)
Re: Buffer Requests Trace

Sorry, I do not understand the question.

But I forgot to give an additional information: I am printing the page id
for the trace file in ReleaseBuffer() only if it is a shared buffer, I am
not considering local buffers. I assumed that local buffers were used only
for temporary tables.

On Tue, Oct 14, 2014 at 6:25 PM, Stephen Frost <sfrost@snowman.net> wrote:

* Lucas Lersch (lucaslersch@gmail.com) wrote:

Unfortunately, in the generated trace with over 2 million buffer

requests,

only ~14k different pages are being accessed, out of the 800k of the

whole

database. Am I missing something here?

What do you have shared_buffers set to..?

Thanks,

Stephen

--
Lucas Lersch

#4Lucas Lersch
lucaslersch@gmail.com
In reply to: Lucas Lersch (#3)
Re: Buffer Requests Trace

shared_buffers is 128MB and the version of pgsql is 9.3.5

On Tue, Oct 14, 2014 at 6:31 PM, Lucas Lersch <lucaslersch@gmail.com> wrote:

Sorry, I do not understand the question.

But I forgot to give an additional information: I am printing the page id
for the trace file in ReleaseBuffer() only if it is a shared buffer, I am
not considering local buffers. I assumed that local buffers were used only
for temporary tables.

On Tue, Oct 14, 2014 at 6:25 PM, Stephen Frost <sfrost@snowman.net> wrote:

* Lucas Lersch (lucaslersch@gmail.com) wrote:

Unfortunately, in the generated trace with over 2 million buffer

requests,

only ~14k different pages are being accessed, out of the 800k of the

whole

database. Am I missing something here?

What do you have shared_buffers set to..?

Thanks,

Stephen

--
Lucas Lersch

--
Lucas Lersch

#5Stephen Frost
sfrost@snowman.net
In reply to: Lucas Lersch (#4)
Re: Buffer Requests Trace

* Lucas Lersch (lucaslersch@gmail.com) wrote:

shared_buffers is 128MB and the version of pgsql is 9.3.5

I suspect you're not tracking what you think you're tracking, which is
why I brought up shared_buffers.

~14k * 8192 (page size) = ~110MB

What it sounds like you're actually tracking are shared buffer requests
and not heap or index requests.

Now, perhaps the test you're running only touched 110MB of the 6G
database, but that seems pretty unlikely.

Thanks,

Stephen

#6Lucas Lersch
lucaslersch@gmail.com
In reply to: Stephen Frost (#5)
Re: Buffer Requests Trace

Aren't heap and index requests supposed to go through the shared buffers
anyway?

On Tue, Oct 14, 2014 at 7:02 PM, Stephen Frost <sfrost@snowman.net> wrote:

* Lucas Lersch (lucaslersch@gmail.com) wrote:

shared_buffers is 128MB and the version of pgsql is 9.3.5

I suspect you're not tracking what you think you're tracking, which is
why I brought up shared_buffers.

~14k * 8192 (page size) = ~110MB

What it sounds like you're actually tracking are shared buffer requests
and not heap or index requests.

Now, perhaps the test you're running only touched 110MB of the 6G
database, but that seems pretty unlikely.

Thanks,

Stephen

--
Lucas Lersch

#7Stephen Frost
sfrost@snowman.net
In reply to: Lucas Lersch (#6)
Re: Buffer Requests Trace

* Lucas Lersch (lucaslersch@gmail.com) wrote:

Aren't heap and index requests supposed to go through the shared buffers
anyway?

Sure they do, but a given page in shared_buffers can be used over and
over again for different heap and index pages..

Thanks,

Stephen

#8Lucas Lersch
lucaslersch@gmail.com
In reply to: Stephen Frost (#7)
Re: Buffer Requests Trace

I see this... but ReleaseBuffer() simply decrements the reference count of
page the buffer currently holds. Assuming that a ReadBuffer() -
ReleaseBuffer() pattern is used for interacting with the shared_buffers,
there will be a ReleaseBuffer() call for any page (heap or index) "loaded"
into the shared_buffers.

On Tue, Oct 14, 2014 at 7:21 PM, Stephen Frost <sfrost@snowman.net> wrote:

* Lucas Lersch (lucaslersch@gmail.com) wrote:

Aren't heap and index requests supposed to go through the shared buffers
anyway?

Sure they do, but a given page in shared_buffers can be used over and
over again for different heap and index pages..

Thanks,

Stephen

--
Lucas Lersch

#9Stephen Frost
sfrost@snowman.net
In reply to: Lucas Lersch (#8)
Re: Buffer Requests Trace

* Lucas Lersch (lucaslersch@gmail.com) wrote:

I see this... but ReleaseBuffer() simply decrements the reference count of
page the buffer currently holds. Assuming that a ReadBuffer() -
ReleaseBuffer() pattern is used for interacting with the shared_buffers,
there will be a ReleaseBuffer() call for any page (heap or index) "loaded"
into the shared_buffers.

Not sure what you're getting at here. This was the original comment
that I was addressing:

-----------
Unfortunately, in the generated trace with over 2 million buffer requests,
only ~14k different pages are being accessed, out of the 800k of the whole
database. Am I missing something here?
-----------

So, there's 2MM buffer requests with only ~14k different pages even
though the database consists of ~800k different pages.

Either your short benchmark is only hitting ~14k different pages out of
the ~800k, or what you're actually looking at are the ~14k pages (eh,
more like 16k, but whatever) of the shared_buffer cache. Somewhere in
your analysis of the 2MM buffer requests you reduced the set of buffer
requests down to the set of "~14k different pages" that you're asking
about here.

What would be helpful here would be actual code changes you made (eg: a
patch), the resulting buffer request data (or at least a snippet of it),
and exactly how you did your analysis to come up with the ~14k number.

Thanks

Stephen

#10Simon Riggs
simon@2ndQuadrant.com
In reply to: Lucas Lersch (#1)
Re: Buffer Requests Trace

On 14 October 2014 17:08, Lucas Lersch <lucaslersch@gmail.com> wrote:

Unfortunately, in the generated trace with over 2 million buffer requests,
only ~14k different pages are being accessed, out of the 800k of the whole
database. Am I missing something here?

We can't tell what you're doing just by knowing the number of unique
items in your list.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Lucas Lersch
lucaslersch@gmail.com
In reply to: Simon Riggs (#10)
1 attachment(s)
Re: Buffer Requests Trace

Sorry for taking so long to answer. I am sending attached the patch with
the changes I did to pgsql code. I followed the steps for compiling and
installing pgsql from:
http://www.postgresql.org/docs/current/static/install-short.html

In summary, the page_id of the page being released in ReleaseBuffer() and
ReleaseAndReadBuffer() is written to the file: /usr/loca/pgsql/data/trace.
This file is created manually.

I have also created a PrivateDirtyFlag for each backend, in analogy to the
PrivateRefCount. I use this to keep track if the current backend performed
an update operation in a page in the buffer pool or simply a read operation
(it is not relevant now). The trace file consists of one line for each
ReleaseBuffer() or ReleaseAndReadBuffer() call. The line has the format:

operation,tblSpace,dbNode,relNode,blockNumber

Once the trace file is complete after the execution of the tpcc benchmark,
I use the following bash script to get only unique pages:

cut -d',' -f2-5 trace | sort -n -t',' -k1 -k2 -k3 -k4 | uniq

Today I realized that I was making a mistake in executing the
oltpbenchmark application. From the 64 warehouses created for tpcc, only 1
was being accessed (the 14k distinct pages that I mentioned). I increased
the "terminal" option of the tpcc benchmark from 1 to 64, resulting in one
terminal for each warehouse.

This provided me with a higher number of distinct pages being
accessed. Unfortunately, from the 800k pages in the database (64
warehouses), executing tpcc for 10min resulted in 400k distinct pages being
accessed. This number is much better than the previous results, but I think
it is still not realistic.

I would like to thank you guys for all the attention given to my problem :)

On Wed, Oct 15, 2014 at 9:49 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 14 October 2014 17:08, Lucas Lersch <lucaslersch@gmail.com> wrote:

Unfortunately, in the generated trace with over 2 million buffer

requests,

only ~14k different pages are being accessed, out of the 800k of the

whole

database. Am I missing something here?

We can't tell what you're doing just by knowing the number of unique
items in your list.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Lucas Lersch

Attachments:

trace.patchtext/x-patch; charset=US-ASCII; name=trace.patchDownload
diff -rupN ./postgresql-9.3.5_original/config.log ./postgresql-9.3.5_trace/config.log
--- ./postgresql-9.3.5_original/config.log	2014-10-15 10:30:11.552923099 +0200
+++ ./postgresql-9.3.5_trace/config.log	2014-10-15 10:35:45.786580479 +0200
@@ -349,7 +349,7 @@ configure:7734: $? = 0
 configure:7755: result: yes
 configure:7766: checking for library containing setproctitle
 configure:7807: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -lm  >&5
-/tmp/ccf4os1l.o: In function `main':
+/tmp/ccMWb4Lv.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `setproctitle'
 collect2: ld returned 1 exit status
 configure:7814: $? = 1
@@ -389,7 +389,7 @@ configure: failed program was:
 |   return 0;
 | }
 configure:7807: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -lutil  -lm  >&5
-/tmp/ccSriijn.o: In function `main':
+/tmp/ccxsb80w.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `setproctitle'
 collect2: ld returned 1 exit status
 configure:7814: $? = 1
@@ -431,7 +431,7 @@ configure: failed program was:
 configure:7845: result: no
 configure:7853: checking for library containing dlopen
 configure:7894: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -lm  >&5
-/tmp/ccZgGSzt.o: In function `main':
+/tmp/ccBvyipD.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `dlopen'
 collect2: ld returned 1 exit status
 configure:7901: $? = 1
@@ -479,7 +479,7 @@ configure:7988: $? = 0
 configure:8019: result: none required
 configure:8027: checking for library containing shl_load
 configure:8068: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -ldl -lm  >&5
-/tmp/cc2JE0qC.o: In function `main':
+/tmp/ccSVMb9F.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `shl_load'
 collect2: ld returned 1 exit status
 configure:8075: $? = 1
@@ -564,7 +564,7 @@ configure:8254: $? = 0
 configure:8285: result: none required
 configure:8293: checking for library containing crypt
 configure:8334: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -ldl -lm  >&5
-/tmp/ccX9NprQ.o: In function `main':
+/tmp/cc5xTu8N.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `crypt'
 collect2: ld returned 1 exit status
 configure:8341: $? = 1
@@ -4077,7 +4077,7 @@ configure:19825: $? = 0
 configure:19847: result: yes
 configure:19762: checking for getpeerucred
 configure:19818: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -lz  -lcrypt -ldl -lm  >&5
-/tmp/ccyZfPFD.o: In function `main':
+/tmp/ccULVa3H.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `getpeerucred'
 collect2: ld returned 1 exit status
 configure:19825: $? = 1
@@ -4215,7 +4215,7 @@ configure:19825: $? = 0
 configure:19847: result: yes
 configure:19762: checking for mbstowcs_l
 configure:19818: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -lz  -lcrypt -ldl -lm  >&5
-/tmp/ccOjxb0M.o: In function `main':
+/tmp/ccizj96T.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `mbstowcs_l'
 collect2: ld returned 1 exit status
 configure:19825: $? = 1
@@ -4359,7 +4359,7 @@ configure:19825: $? = 0
 configure:19847: result: yes
 configure:19762: checking for pstat
 configure:19818: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -lz  -lcrypt -ldl -lm  >&5
-/tmp/ccIKM0o0.o: In function `main':
+/tmp/ccXCQWr8.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `pstat'
 collect2: ld returned 1 exit status
 configure:19825: $? = 1
@@ -4500,7 +4500,7 @@ configure:19825: $? = 0
 configure:19847: result: yes
 configure:19762: checking for setproctitle
 configure:19818: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -lz  -lcrypt -ldl -lm  >&5
-/tmp/ccqGf9Bb.o: In function `main':
+/tmp/cchW5Kzf.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `setproctitle'
 collect2: ld returned 1 exit status
 configure:19825: $? = 1
@@ -4671,7 +4671,7 @@ configure:19825: $? = 0
 configure:19847: result: yes
 configure:19762: checking for wcstombs_l
 configure:19818: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -lz  -lcrypt -ldl -lm  >&5
-/tmp/ccPbEXIG.o: In function `main':
+/tmp/ccT128IO.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `wcstombs_l'
 collect2: ld returned 1 exit status
 configure:19825: $? = 1
@@ -5590,7 +5590,7 @@ configure:21389: $? = 0
 configure:21411: result: yes
 configure:21326: checking for fls
 configure:21382: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -lz  -lcrypt -ldl -lm  >&5
-/tmp/cc3q3U3I.o: In function `main':
+/tmp/ccmFSRqS.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `fls'
 collect2: ld returned 1 exit status
 configure:21389: $? = 1
@@ -5784,7 +5784,7 @@ configure:21389: $? = 0
 configure:21411: result: yes
 configure:21326: checking for strlcat
 configure:21382: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -lz  -lcrypt -ldl -lm  >&5
-/tmp/cc8EiwIp.o: In function `main':
+/tmp/cc0djF2m.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `strlcat'
 collect2: ld returned 1 exit status
 configure:21389: $? = 1
@@ -5953,7 +5953,7 @@ configure: failed program was:
 configure:21411: result: no
 configure:21326: checking for strlcpy
 configure:21382: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -lz  -lcrypt -ldl -lm  >&5
-/tmp/ccmSyguw.o: In function `main':
+/tmp/ccrAbcKt.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `strlcpy'
 collect2: ld returned 1 exit status
 configure:21389: $? = 1
@@ -6126,7 +6126,7 @@ configure:21519: $? = 0
 configure:21541: result: yes
 configure:21456: checking for getpeereid
 configure:21512: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -lz  -lcrypt -ldl -lm  >&5
-/tmp/ccb838LF.o: In function `main':
+/tmp/ccchSDXz.o: In function `main':
 conftest.c:(.text.startup+0x7): undefined reference to `getpeereid'
 collect2: ld returned 1 exit status
 configure:21519: $? = 1
@@ -6330,7 +6330,7 @@ configure:22509: $? = 0
 configure:22529: result: yes
 configure:22539: checking for optreset
 configure:22565: gcc -o conftest -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  -D_GNU_SOURCE    conftest.c -lz  -lcrypt -ldl -lm  >&5
-/tmp/cc2IKXv3.o: In function `main':
+/tmp/ccg21Fuc.o: In function `main':
 conftest.c:(.text.startup+0x2): undefined reference to `optreset'
 collect2: ld returned 1 exit status
 configure:22572: $? = 1
@@ -6673,7 +6673,7 @@ conftest.c:144:22: warning: null argumen
 conftest.c:145:22: warning: null argument where non-null required (argument 1) [-Wnonnull]
 conftest.c:145:22: warning: null argument where non-null required (argument 3) [-Wnonnull]
 conftest.c:143:27: warning: 'th' is used uninitialized in this function [-Wuninitialized]
-/tmp/cc8sk1gT.o: In function `main':
+/tmp/ccykBF5P.o: In function `main':
 conftest.c:(.text.startup+0x9): undefined reference to `pthread_join'
 conftest.c:(.text.startup+0x26): undefined reference to `__pthread_register_cancel'
 conftest.c:(.text.startup+0x33): undefined reference to `pthread_create'
diff -rupN ./postgresql-9.3.5_original/config.status ./postgresql-9.3.5_trace/config.status
--- ./postgresql-9.3.5_original/config.status	2014-10-15 10:30:11.304921877 +0200
+++ ./postgresql-9.3.5_trace/config.status	2014-10-15 10:35:45.538579249 +0200
@@ -367,7 +367,7 @@ Copyright (C) 2008 Free Software Foundat
 This config.status script is free software; the Free Software Foundation
 gives unlimited permission to copy, distribute and modify it."
 
-ac_pwd='/home/postgres/postgresql-9.3.5_original'
+ac_pwd='/home/postgres/postgresql-9.3.5_trace'
 srcdir='.'
 INSTALL='/usr/bin/install -c'
 MKDIR_P='/bin/mkdir -p'
diff -rupN ./postgresql-9.3.5_original/src/backend/storage/buffer/buf_init.c ./postgresql-9.3.5_trace/src/backend/storage/buffer/buf_init.c
--- ./postgresql-9.3.5_original/src/backend/storage/buffer/buf_init.c	2014-07-21 21:10:42.000000000 +0200
+++ ./postgresql-9.3.5_trace/src/backend/storage/buffer/buf_init.c	2014-10-15 10:37:03.894967795 +0200
@@ -21,6 +21,7 @@
 BufferDesc *BufferDescriptors;
 char	   *BufferBlocks;
 int32	   *PrivateRefCount;
+bool	  *PrivateDirtyFlag;
 
 
 /*
@@ -144,6 +145,8 @@ InitBufferPool(void)
 void
 InitBufferPoolAccess(void)
 {
+	PrivateDirtyFlag = (bool *) calloc(NBuffers, sizeof(bool));
+
 	/*
 	 * Allocate and zero local arrays of per-buffer info.
 	 */
diff -rupN ./postgresql-9.3.5_original/src/backend/storage/buffer/bufmgr.c ./postgresql-9.3.5_trace/src/backend/storage/buffer/bufmgr.c
--- ./postgresql-9.3.5_original/src/backend/storage/buffer/bufmgr.c	2014-07-21 21:10:42.000000000 +0200
+++ ./postgresql-9.3.5_trace/src/backend/storage/buffer/bufmgr.c	2014-10-15 10:42:09.188481660 +0200
@@ -33,6 +33,8 @@
 #include <sys/file.h>
 #include <unistd.h>
 
+#include <stdio.h>
+
 #include "catalog/catalog.h"
 #include "catalog/storage.h"
 #include "common/relpath.h"
@@ -839,6 +841,7 @@ BufferAlloc(SMgrRelation smgr, char relp
 	 */
 	buf->tag = newTag;
 	buf->flags &= ~(BM_VALID | BM_DIRTY | BM_JUST_DIRTIED | BM_CHECKPOINT_NEEDED | BM_IO_ERROR | BM_PERMANENT);
+	PrivateDirtyFlag[buf->buf_id] = false;
 	if (relpersistence == RELPERSISTENCE_PERMANENT)
 		buf->flags |= BM_TAG_VALID | BM_PERMANENT;
 	else
@@ -1019,6 +1022,7 @@ MarkBufferDirty(Buffer buffer)
 	}
 
 	bufHdr->flags |= (BM_DIRTY | BM_JUST_DIRTIED);
+	PrivateDirtyFlag[buffer - 1] = true;
 
 	UnlockBufHdr(bufHdr);
 }
@@ -1066,6 +1070,29 @@ ReleaseAndReadBuffer(Buffer buffer,
 				RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
 				bufHdr->tag.forkNum == forkNum)
 				return buffer;
+
+			FILE* f;
+			f = fopen("/usr/local/pgsql/data/trace", "a");
+			if(f != NULL) {
+				char str[128];
+				if(PrivateDirtyFlag[buffer - 1] == true) {
+					PrivateDirtyFlag[buffer - 1] = false;
+					sprintf(str, "U,%d,%d,%d,%d\n", bufHdr->tag.rnode.spcNode,
+									bufHdr->tag.rnode.dbNode,
+									bufHdr->tag.rnode.relNode,
+									bufHdr->tag.blockNum);
+				}
+				else {
+					sprintf(str, "R,%d,%d,%d,%d\n", bufHdr->tag.rnode.spcNode,
+									bufHdr->tag.rnode.dbNode,
+									bufHdr->tag.rnode.relNode,
+									bufHdr->tag.blockNum);
+				}
+				//fprintf(stdout, str);
+				fputs(str, f);
+				fclose(f);
+			}
+			/* ************************************************************* */
 			UnpinBuffer(bufHdr, true);
 		}
 	}
@@ -2537,6 +2564,29 @@ ReleaseBuffer(Buffer buffer)
 
 	Assert(PrivateRefCount[buffer - 1] > 0);
 
+	FILE* f;
+	f = fopen("/usr/local/pgsql/data/trace", "a");
+	if(f != NULL) {
+		char str[128];
+		if(PrivateDirtyFlag[buffer - 1] == true) {
+			PrivateDirtyFlag[buffer - 1] = false;
+			sprintf(str, "U,%d,%d,%d,%d\n", bufHdr->tag.rnode.spcNode,
+							bufHdr->tag.rnode.dbNode,
+							bufHdr->tag.rnode.relNode,
+							bufHdr->tag.blockNum);
+		}
+		else {
+			sprintf(str, "R,%d,%d,%d,%d\n", bufHdr->tag.rnode.spcNode,
+							bufHdr->tag.rnode.dbNode,
+							bufHdr->tag.rnode.relNode,
+							bufHdr->tag.blockNum);
+		}
+		//fprintf(stdout, str);
+		fputs(str, f);
+		fclose(f);
+	}
+	/* ************************************************************* */
+
 	if (PrivateRefCount[buffer - 1] > 1)
 		PrivateRefCount[buffer - 1]--;
 	else
@@ -2700,6 +2750,7 @@ MarkBufferDirtyHint(Buffer buffer, bool
 				PageSetLSN(page, lsn);
 		}
 		bufHdr->flags |= (BM_DIRTY | BM_JUST_DIRTIED);
+		PrivateDirtyFlag[bufHdr->buf_id] = true;
 		UnlockBufHdr(bufHdr);
 
 		if (delayChkpt)
diff -rupN ./postgresql-9.3.5_original/src/include/storage/bufmgr.h ./postgresql-9.3.5_trace/src/include/storage/bufmgr.h
--- ./postgresql-9.3.5_original/src/include/storage/bufmgr.h	2014-07-21 21:10:42.000000000 +0200
+++ ./postgresql-9.3.5_trace/src/include/storage/bufmgr.h	2014-10-06 14:33:23.783472937 +0200
@@ -56,6 +56,7 @@ extern int	target_prefetch_pages;
 /* in buf_init.c */
 extern PGDLLIMPORT char *BufferBlocks;
 extern PGDLLIMPORT int32 *PrivateRefCount;
+extern PGDLLIMPORT bool *PrivateDirtyFlag;
 
 /* in localbuf.c */
 extern PGDLLIMPORT int NLocBuffer;
diff -rupN ./postgresql-9.3.5_original/src/Makefile.global ./postgresql-9.3.5_trace/src/Makefile.global
--- ./postgresql-9.3.5_original/src/Makefile.global	2014-10-15 10:30:11.416922427 +0200
+++ ./postgresql-9.3.5_trace/src/Makefile.global	2014-10-15 10:35:45.630579704 +0200
@@ -40,7 +40,7 @@ MAJORVERSION = 9.3
 
 # Support for VPATH builds
 vpath_build = no
-abs_top_srcdir = /home/postgres/postgresql-9.3.5_original
+abs_top_srcdir = /home/postgres/postgresql-9.3.5_trace
 
 ifneq ($(vpath_build),yes)
 top_srcdir = $(top_builddir)
#12Simon Riggs
simon@2ndQuadrant.com
In reply to: Lucas Lersch (#11)
Re: Buffer Requests Trace

On 15 October 2014 12:49, Lucas Lersch <lucaslersch@gmail.com> wrote:

Sorry for taking so long to answer. I am sending attached the patch with the
changes I did to pgsql code. I followed the steps for compiling and
installing pgsql from:
http://www.postgresql.org/docs/current/static/install-short.html

Are you recording the bufferid or the blockid?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Lucas Lersch
lucaslersch@gmail.com
In reply to: Simon Riggs (#12)
Re: Buffer Requests Trace

I am recording the BufferDesc.tag.blockNum for the buffer along with the
spcNode, dbNode, relNode, also present in the tag.

On Wed, Oct 15, 2014 at 2:27 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 15 October 2014 12:49, Lucas Lersch <lucaslersch@gmail.com> wrote:

Sorry for taking so long to answer. I am sending attached the patch with

the

changes I did to pgsql code. I followed the steps for compiling and
installing pgsql from:
http://www.postgresql.org/docs/current/static/install-short.html

Are you recording the bufferid or the blockid?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Lucas Lersch

#14Simon Riggs
simon@2ndQuadrant.com
In reply to: Lucas Lersch (#13)
Re: Buffer Requests Trace

On 15 October 2014 13:44, Lucas Lersch <lucaslersch@gmail.com> wrote:

I am recording the BufferDesc.tag.blockNum for the buffer along with the
spcNode, dbNode, relNode, also present in the tag.

The TPC-C I/O is random, so if you run it for longer you should see a wider set.

Cacheing isn't possible as a way to improve txn rates.

Check that you're touching all tables.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Lucas Lersch
lucaslersch@gmail.com
In reply to: Simon Riggs (#14)
Re: Buffer Requests Trace

So is it a possible normal behavior that running tpcc for 10min only access
50% of the database? Furthermore, is there a guideline of parameters for
tpcc (# of warehouses, execution time, operations weight)?

On Wed, Oct 15, 2014 at 3:09 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 15 October 2014 13:44, Lucas Lersch <lucaslersch@gmail.com> wrote:

I am recording the BufferDesc.tag.blockNum for the buffer along with the
spcNode, dbNode, relNode, also present in the tag.

The TPC-C I/O is random, so if you run it for longer you should see a
wider set.

Cacheing isn't possible as a way to improve txn rates.

Check that you're touching all tables.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Lucas Lersch

#16Stephen Frost
sfrost@snowman.net
In reply to: Lucas Lersch (#15)
Re: Buffer Requests Trace

* Lucas Lersch (lucaslersch@gmail.com) wrote:

So is it a possible normal behavior that running tpcc for 10min only access
50% of the database? Furthermore, is there a guideline of parameters for
tpcc (# of warehouses, execution time, operations weight)?

Depends- you may be aware that we support index-only scans in certain
situations. This means that only the index page for a given relation
(and the visibility map) are accessed, and the heap is not.

Thanks,

Stephen

#17Lucas Lersch
lucaslersch@gmail.com
In reply to: Stephen Frost (#16)
Re: Buffer Requests Trace

I got the following numbers from my tpcc database:

Data size: ~6059MB
Index size: ~1390MB
Total size: ~7400MB

Even considering index-only scans, the ratio of around 50% of the database
pages being accessed seems unrealistic to me.

On Wed, Oct 15, 2014 at 3:50 PM, Stephen Frost <sfrost@snowman.net> wrote:

* Lucas Lersch (lucaslersch@gmail.com) wrote:

So is it a possible normal behavior that running tpcc for 10min only

access

50% of the database? Furthermore, is there a guideline of parameters for
tpcc (# of warehouses, execution time, operations weight)?

Depends- you may be aware that we support index-only scans in certain
situations. This means that only the index page for a given relation
(and the visibility map) are accessed, and the heap is not.

Thanks,

Stephen

--
Lucas Lersch

#18Jeff Janes
jeff.janes@gmail.com
In reply to: Lucas Lersch (#15)
Re: Buffer Requests Trace

On Wed, Oct 15, 2014 at 6:22 AM, Lucas Lersch <lucaslersch@gmail.com> wrote:

So is it a possible normal behavior that running tpcc for 10min only
access 50% of the database? Furthermore, is there a guideline of parameters
for tpcc (# of warehouses, execution time, operations weight)?

I'm not familiar with your benchmarking tool. With the one I am most
familiar with, pgbench, if you run it against a database which is too big
to fit in memory, it can take a very long time to touch each page once,
because the constant random disk reads makes it run very slowly. Maybe
that is something to consider here--how many transactions were actually
executed during your 10 min run?

Also, the tool might build tables that are only used under certain run
options. Perhaps you just aren't choosing the options which invoke usage
of those tables. Since you have the trace data, it should be pretty easy
to count how many distinct blocks are accessed from each relation, and
compare that to the size of the relations to see which relations are unused
or lightly used.

Cheers,

Jeff

#19Lucas Lersch
lucaslersch@gmail.com
In reply to: Jeff Janes (#18)
Re: Buffer Requests Trace

Answering your first question: running tpcc for 1 minute, in a database
with 64 warehouses (6~7GB), with a buffer pool of 128MB (around 1.8% of
database size) and a hit ratio of ~91%, I get a throughput of 45~50
transactions per second.

I did some experiments and I got the following information about my tpcc
database and benchmark. The database is created with 64 warehouses.

Table | Index | Data Size | Index Size
| Total
------------+--------------------------------------+-----------+------------+---------
stock | stock_pkey | 2209 MB | 263 MB
| 2472 MB
order_line | order_line_pkey | 2041 MB | 678 MB
| 2719 MB
customer | idx_customer_name | 1216 MB | 146 MB
| 1420 MB
customer | customer_pkey | 1216 MB | 58 MB
| 1420 MB
history | | 164 MB |
| 164 MB
oorder | oorder_pkey | 134 MB | 68 MB
| 362 MB
oorder | idx_order | 134 MB | 80 MB
| 362 MB
oorder | oorder_o_w_id_o_d_id_o_c_id_o_id_key | 134 MB | 80 MB
| 362 MB
new_order | new_order_pkey | 27 MB | 17 MB
| 45 MB
item | item_pkey | 10168 kB | 2208 kB
| 12 MB
district | district_pkey | 776 kB | 72 kB
| 880 kB
warehouse | warehouse_pkey | 384 kB | 16 kB
| 432 kB

By executing the tpcc benchmark for 1 minute I get about 2.9 million buffer
requests. The distribution of these requests in the relations and indexes
are (in descending order):

customer 1383399
stock_pkey 442600
stock 321370
order_line 255314
order_line_pkey 156132
oorder 58665
oorder_pkey 57895
customer_pkey 44471
new_order_pkey 39552
idx_customer_name 28286
new_order 25861
item_pkey 11702
item 11606
district 11389
district_pkey 7575
warehouse 5276
idx_order 4072
oorder_o_w_id_o_d_id_o_c_id_o_id_key 2410
warehouse_pkey 1998
history 1958

All this information seems normal to me. However, from the 2.9 million
buffer requests over ~800k pages, only ~150k distinct pages are being
requested. This behavior could be explained by the benchmark accessing only
a small set of the 64 warehouses instead of having a normal distributed
access over the 64 warehouses. In other words, I think that the execution
time of the benchmark is irrelevant, assuming that the transactions follow
a normal distribution regarding accesses to warehouses.

On Wed, Oct 15, 2014 at 7:41 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Wed, Oct 15, 2014 at 6:22 AM, Lucas Lersch <lucaslersch@gmail.com>
wrote:

So is it a possible normal behavior that running tpcc for 10min only
access 50% of the database? Furthermore, is there a guideline of parameters
for tpcc (# of warehouses, execution time, operations weight)?

I'm not familiar with your benchmarking tool. With the one I am most
familiar with, pgbench, if you run it against a database which is too big
to fit in memory, it can take a very long time to touch each page once,
because the constant random disk reads makes it run very slowly. Maybe
that is something to consider here--how many transactions were actually
executed during your 10 min run?

Also, the tool might build tables that are only used under certain run
options. Perhaps you just aren't choosing the options which invoke usage
of those tables. Since you have the trace data, it should be pretty easy
to count how many distinct blocks are accessed from each relation, and
compare that to the size of the relations to see which relations are unused
or lightly used.

Cheers,

Jeff

--
Lucas Lersch