Compression of full-page-writes
Hi,

The attached patch adds a new GUC parameter, 'compress_backup_block'.
When this parameter is enabled, the server simply compresses FPWs
(full-page writes) in WAL using pglz_compress() before inserting them
into the WAL buffers. The compressed FPWs are then decompressed
during recovery. This is a very simple patch.

The purpose of this patch is to reduce WAL size.
Under a heavy write load, the server needs to write a large amount of
WAL, and this is likely to be a bottleneck. What's worse,
in replication, a large amount of WAL has a harmful effect on
not only WAL writing in the master, but also WAL streaming and
WAL writing in the standby. We would also need to spend more
money on storage to hold such a large amount of data.
I'd like to alleviate these problems by reducing WAL size.

My idea is very simple: just compress FPWs, because they make up
a big part of WAL. I used pglz_compress() as the compression method,
but you might think another method is better. We could add
something like an FPW-compression hook for that later. The patch
adds a new GUC parameter, but I'm thinking of merging it into the
full_page_writes parameter to avoid increasing the number of GUCs.
That is, I'm thinking of changing full_page_writes so that it accepts
the new value 'compress'.
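To make the mechanism concrete, here is a minimal stand-alone sketch of the compress-before-insert / decompress-on-recovery round trip. It uses Python's zlib purely as a stand-in for pglz_compress() (the patch itself works on backup blocks in C inside the server), and the page contents are synthetic:

```python
import zlib

BLCKSZ = 8192  # PostgreSQL's default block size

# A synthetic full-page image: a small header followed by mostly
# empty space, which is typical of lightly filled heap pages.
page = b"\x01\x02" * 128 + b"\x00" * (BLCKSZ - 256)

# On WAL insert: compress the FPW before it goes into the WAL buffers.
compressed = zlib.compress(page)

# On recovery: decompress the FPW before restoring the page.
restored = zlib.decompress(compressed)

assert restored == page            # the round trip is lossless
assert len(compressed) < BLCKSZ    # the WAL record shrinks
```

How much the record shrinks depends entirely on page contents, which is why benchmark choice matters in the measurements below.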
I measured how much WAL this patch can reduce, by using pgbench.
* Server spec
CPU: 8core, Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz
Mem: 16GB
Disk: 500GB SSD Samsung 840
* Benchmark
pgbench -c 32 -j 4 -T 900 -M prepared
scaling factor: 100
checkpoint_segments = 1024
checkpoint_timeout = 5min
(every checkpoint during the benchmark was triggered by checkpoint_timeout)
* Result
[tps]
1386.8 (compress_backup_block = off)
1627.7 (compress_backup_block = on)
[the amount of WAL generated during running pgbench]
4302 MB (compress_backup_block = off)
1521 MB (compress_backup_block = on)
At least in my test, the patch reduced the WAL size to about one-third!
The patch is still WIP, but I'd like to hear opinions about this idea
before completing it, and then add the patch to the next CF if it seems okay.
Regards,
--
Fujii Masao
Attachments:
compress_fpw_v1.patch (+72 -0)
(2013/08/30 11:55), Fujii Masao wrote:
> * Benchmark
> pgbench -c 32 -j 4 -T 900 -M prepared
> scaling factor: 100
> checkpoint_segments = 1024
> checkpoint_timeout = 5min
> (every checkpoint during the benchmark was triggered by checkpoint_timeout)
I believe that the amount of backup blocks in the WAL files is affected
by how often checkpoints occur, particularly under such an
update-intensive workload.
Under your configuration, checkpoints would occur quite often.
So you need to increase checkpoint_timeout in order to
determine whether the patch is realistic.
Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
(2013/08/30 12:07), Satoshi Nagayasu wrote:
> Under your configuration, checkpoints would occur quite often.
> So you need to increase checkpoint_timeout in order to
> determine whether the patch is realistic.
In fact, the following chart shows that checkpoint_timeout=30min
also reduces the WAL size to one-third, compared with the 5min timeout,
in a pgbench experiment.
https://www.oss.ecl.ntt.co.jp/ossc/oss/img/pglesslog_img02.jpg
Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp
On Thu, Aug 29, 2013 at 7:55 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> [the amount of WAL generated during running pgbench]
> 4302 MB (compress_backup_block = off)
> 1521 MB (compress_backup_block = on)
Interesting.
I wonder, what is the impact on recovery time under the same
conditions? I suppose that the cost of the random I/O involved would
probably dominate just as with compress_backup_block = off. That said,
you've used an SSD here, so perhaps not.
--
Peter Geoghegan
On Fri, Aug 30, 2013 at 8:25 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> [the amount of WAL generated during running pgbench]
> 4302 MB (compress_backup_block = off)
> 1521 MB (compress_backup_block = on)
This is really nice data.
I think if you want, you could try one of the tests Heikki posted
for another of my patches, which is here:
/messages/by-id/51366323.8070606@vmware.com
Also, if possible, try with fewer clients (1, 2, 4) and maybe with more
frequent checkpoints, just to show the benefits of this idea with other
kinds of workloads.
I think we can do these tests later as well. I mentioned it because
some time back (probably 6 months ago), one of my colleagues tried
exactly the same idea of using compression methods (LZ and a few others)
for FPW, but it turned out that even though the WAL size was reduced,
performance went down, which is not the case in the data you have shown,
even though you used an SSD. He may have made some mistake, as he was
not very experienced, but I still think it's good to check various
workloads.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
(2013/08/30 11:55), Fujii Masao wrote:
> * Benchmark
> pgbench -c 32 -j 4 -T 900 -M prepared
> scaling factor: 100
> checkpoint_segments = 1024
> checkpoint_timeout = 5min
> (every checkpoint during the benchmark was triggered by checkpoint_timeout)
Did you execute a manual checkpoint before starting the benchmark?
From your message alone, we can only tell that three checkpoints
occurred during the benchmark; if you did not execute a manual
checkpoint first, the results would be different.
You had better clarify this point for a more transparent evaluation.
Regards,
--
Mitsumasa KONDO
NTT Open Software Center
Hi Fujii-san,
I must be missing something really trivial, but why not try to compress all
types of WAL blocks and not just FPW?
Regards,
Nikhils
On Fri, Aug 30, 2013 at 11:55 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> I used pglz_compress() as a compression method, but you might think
> another method is better. We could add something like an FPW-compression
> hook for that later. [...] I'm thinking of changing full_page_writes so
> that it accepts the new value 'compress'.
Instead of a generic 'compress', what about using the name of the
compression method as parameter value? Just to keep the door open to
new types of compression methods.
> * Result
> [tps]
> 1386.8 (compress_backup_block = off)
> 1627.7 (compress_backup_block = on)
> [the amount of WAL generated during running pgbench]
> 4302 MB (compress_backup_block = off)
> 1521 MB (compress_backup_block = on)
> At least in my test, the patch reduced the WAL size to about one-third!
Nice numbers! Testing this patch with other benchmarks than pgbench
would be interesting as well.
--
Michael
On Fri, Aug 30, 2013 at 12:43 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Thu, Aug 29, 2013 at 7:55 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> > [the amount of WAL generated during running pgbench]
> > 4302 MB (compress_backup_block = off)
> > 1521 MB (compress_backup_block = on)
>
> Interesting.
>
> I wonder, what is the impact on recovery time under the same
> conditions?
Will test! I can imagine that recovery would take a bit longer
with compress_backup_block=on because the compressed FPWs
need to be decompressed.
> I suppose that the cost of the random I/O involved would
> probably dominate just as with compress_backup_block = off. That said,
> you've used an SSD here, so perhaps not.

Oh, maybe my description was confusing. full_page_writes was enabled
while running the benchmark even with compress_backup_block = off;
I've not merged those two parameters yet. So even with
compress_backup_block = off, random I/O would not increase during recovery.
Regards,
--
Fujii Masao
On Fri, Aug 30, 2013 at 1:43 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Aug 30, 2013 at 8:25 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> I think if you want, you could try one of the tests Heikki posted
> for another of my patches, which is here:
> /messages/by-id/51366323.8070606@vmware.com
> Also, if possible, try with fewer clients (1, 2, 4) and maybe with more
> frequent checkpoints, just to show the benefits of this idea with other
> kinds of workloads.
Yep, I will do more tests.
> I think we can do these tests later as well. I mentioned it because
> some time back (probably 6 months ago), one of my colleagues tried
> exactly the same idea of using compression methods (LZ and a few others)
> for FPW, but it turned out that even though the WAL size was reduced,
> performance went down, which is not the case in the data you have shown,
> even though you used an SSD. He may have made some mistake, as he was
> not very experienced, but I still think it's good to check various
> workloads.
I'd appreciate it if you could test the patch with an HDD. I have no machine with an HDD now.
Regards,
--
Fujii Masao
On Fri, Aug 30, 2013 at 2:32 PM, KONDO Mitsumasa
<kondo.mitsumasa@lab.ntt.co.jp> wrote:
(2013/08/30 11:55), Fujii Masao wrote:
> * Benchmark
> pgbench -c 32 -j 4 -T 900 -M prepared
> scaling factor: 100
> checkpoint_segments = 1024
> checkpoint_timeout = 5min
> (every checkpoint during the benchmark was triggered by checkpoint_timeout)
>
> Did you execute a manual checkpoint before starting the benchmark?
Yes.
> From your message alone, we can only tell that three checkpoints
> occurred during the benchmark; if you did not execute a manual
> checkpoint first, the results would be different. You had better
> clarify this point for a more transparent evaluation.
What I executed was:
-------------------------------------
CHECKPOINT
SELECT pg_current_xlog_location()
pgbench -c 32 -j 4 -T 900 -M prepared -r -P 10
SELECT pg_current_xlog_location()
SELECT pg_xlog_location_diff() -- calculate the diff of the above locations
-------------------------------------
I repeated this several times to eliminate the noise.
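As a side note, the byte difference computed in that last step can be reproduced by hand. The sketch below mirrors what pg_xlog_location_diff() does, assuming the textual LSN format 'high/low', where both halves are 32-bit hex words of the 64-bit WAL position (the sample locations are made up):

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a textual WAL location such as '1/10000000' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)

def wal_generated(start: str, end: str) -> int:
    """Bytes of WAL written between two pg_current_xlog_location() readings."""
    return lsn_to_bytes(end) - lsn_to_bytes(start)

# Example: WAL volume in MB between two sample locations.
mb = wal_generated("0/3000000", "1/10000000") / (1024 * 1024)
```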
Regards,
--
Fujii Masao
On Thu, Aug 29, 2013 at 10:55 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> > I suppose that the cost of the random I/O involved would probably
> > dominate just as with compress_backup_block = off. That said, you've
> > used an SSD here, so perhaps not.
>
> Oh, maybe my description was confusing. full_page_writes was enabled
> while running the benchmark even with compress_backup_block = off;
> I've not merged those two parameters yet. So even with
> compress_backup_block = off, random I/O would not increase during recovery.
I understood it that way. I just meant that it could be that the
random I/O was so expensive that the additional cost of decompressing
the FPIs looked insignificant in comparison. If that was the case, the
increase in recovery time would be modest.
--
Peter Geoghegan
On Fri, Aug 30, 2013 at 2:37 PM, Nikhil Sontakke <nikkhils@gmail.com> wrote:
> Hi Fujii-san,
>
> I must be missing something really trivial, but why not try to compress all
> types of WAL blocks and not just FPW?
The size of non-FPW WAL records is small compared to that of FPWs,
so I thought that compressing such small WAL records would not have
a big effect on reducing WAL size. Rather, compressing
every WAL record might cause a large performance overhead.
Also, focusing on FPWs keeps the patch very simple. We can
add compression of other WAL records later if we want.
Regards,
--
Fujii Masao
On 30.08.2013 05:55, Fujii Masao wrote:
> * Result
> [tps]
> 1386.8 (compress_backup_block = off)
> 1627.7 (compress_backup_block = on)
It would be good to check how much of this effect comes from reducing
the amount of data that needs to be CRC'd, because there has been some
talk of replacing the current CRC-32 algorithm with something faster.
See
/messages/by-id/20130829223004.GD4283@awork2.anarazel.de.
It might even be beneficial to use one routine for full-page-writes,
which are generally much larger than other WAL records, and another
routine for smaller records. As long as they both produce the same CRC,
of course.
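The "same CRC" requirement is essentially the chaining property of CRCs: feeding the small header and the large page image through separate passes must yield the same value as one pass over the whole record. A quick illustration, with Python's zlib.crc32 standing in for PostgreSQL's own CRC-32 routine:

```python
import zlib

header = b"xlog-record-header"
fpw = bytes(8192)  # a full-page image, much larger than the header

# One routine run over the whole record ...
whole = zlib.crc32(header + fpw)

# ... must match a "small" pass chained into a "large" pass.
chained = zlib.crc32(fpw, zlib.crc32(header))

assert whole == chained
```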
Speeding up the CRC calculation obviously won't help with the WAL volume
per se, ie. you still generate the same amount of WAL that needs to be
shipped in replication. But then again, if all you want to do is to
reduce the volume, you could just compress the whole WAL stream.
- Heikki
On Thu, Aug 29, 2013 at 10:55 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> Attached patch adds new GUC parameter 'compress_backup_block'.
I think this is a great idea.
(This is not to disagree with any of the suggestions made on this
thread for further investigation, all of which I think I basically
agree with. But I just wanted to voice general support for the
general idea, regardless of what specifically we end up with.)
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Aug 30, 2013 at 11:55 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> The patch adds a new GUC parameter, but I'm thinking of merging it into
> the full_page_writes parameter to avoid increasing the number of GUCs.
> That is, I'm thinking of changing full_page_writes so that it accepts
> the new value 'compress'.
Done. Attached is an updated version of the patch.
In this patch, full_page_writes accepts three values: on, compress, and off.
When it's set to compress, the full-page image is compressed before being
inserted into the WAL buffers.

I measured again how much this patch affects performance and WAL
volume, and I also measured how much it affects recovery time.
* Server spec
CPU: 8core, Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz
Mem: 16GB
Disk: 500GB SSD Samsung 840
* Benchmark
pgbench -c 32 -j 4 -T 900 -M prepared
scaling factor: 100
checkpoint_segments = 1024
checkpoint_timeout = 5min
(every checkpoint during the benchmark was triggered by checkpoint_timeout)
* Result
[tps]
1344.2 (full_page_writes = on)
1605.9 (compress)
1810.1 (off)
[the amount of WAL generated during running pgbench]
4422 MB (on)
1517 MB (compress)
885 MB (off)
[time required to replay WAL generated during running pgbench]
61s (on) .... 1209911 transactions were replayed,
recovery speed: 19834.6 transactions/sec
39s (compress) .... 1445446 transactions were replayed,
recovery speed: 37062.7 transactions/sec
37s (off) .... 1629235 transactions were replayed,
recovery speed: 44033.3 transactions/sec
When full_page_writes is disabled, recovery speed is usually very low
because of random I/O. But ISTM that, since I was using an SSD in my box,
the recovery with full_page_writes=off was nonetheless the fastest.
Regards,
--
Fujii Masao
Attachments:
compress_fpw_v2.patch (+212 -97)
On 2013-09-11 19:39:14 +0900, Fujii Masao wrote:
> * Benchmark
> pgbench -c 32 -j 4 -T 900 -M prepared
> scaling factor: 100
> checkpoint_segments = 1024
> checkpoint_timeout = 5min
> (every checkpoint during the benchmark was triggered by checkpoint_timeout)
>
> * Result
> [tps]
> 1344.2 (full_page_writes = on)
> 1605.9 (compress)
> 1810.1 (off)
>
> [the amount of WAL generated during running pgbench]
> 4422 MB (on)
> 1517 MB (compress)
> 885 MB (off)
>
> [time required to replay WAL generated during running pgbench]
> 61s (on) .... 1209911 transactions were replayed,
> recovery speed: 19834.6 transactions/sec
> 39s (compress) .... 1445446 transactions were replayed,
> recovery speed: 37062.7 transactions/sec
> 37s (off) .... 1629235 transactions were replayed,
> recovery speed: 44033.3 transactions/sec
ISTM for those benchmarks you should use an absolute number of
transactions, not one based on elapsed time. Otherwise the comparison
isn't really meaningful.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Sep 11, 2013 at 7:39 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Fri, Aug 30, 2013 at 11:55 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> [the amount of WAL generated during running pgbench]
> 4422 MB (on)
> 1517 MB (compress)
> 885 MB (off)
On second thought, the patch could compress the WAL so well because I
used pgbench.
Most of the data in pgbench is in the pgbench_accounts table's "filler"
column, i.e., blank-padded empty strings, so the compression ratio of the
WAL was very high.
I will do the same measurement using another benchmark.
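The filler effect is easy to demonstrate: pgbench_accounts' filler column is char(84), blank-padded, so page images are dominated by runs of spaces. A hedged sketch (zlib again standing in for pglz, and the row layout here is made up):

```python
import os
import zlib

# Rows whose payload is an 84-byte blank-padded filler, as in pgbench_accounts.
filler_page = (b"rowheader!" + b" " * 84) * 87   # ~8 KB, mostly spaces
random_page = os.urandom(len(filler_page))       # incompressible, by contrast

ratio = len(zlib.compress(filler_page)) / len(filler_page)
assert ratio < 0.1   # blank padding compresses extremely well

# Random data barely compresses at all, so a benchmark with
# incompressible payloads would show a much smaller WAL reduction.
assert len(zlib.compress(random_page)) > 0.9 * len(random_page)
```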
Regards,
--
Fujii Masao
Hi Fujii-san,
(2013/09/30 12:49), Fujii Masao wrote:
> On second thought, the patch could compress the WAL so well because I used pgbench.
> I will do the same measurement using another benchmark.
If you like, I can test this patch with the DBT-2 benchmark at the end of
this week. I will use the following test server:
* Test server
Server: HP Proliant DL360 G7
CPU: Xeon E5640 2.66GHz (1P/4C)
Memory: 18GB(PC3-10600R-9)
Disk: 146GB(15k)*4 RAID1+0
RAID controller: P410i/256MB
This is a PG-REX test server, as you know.
Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On Mon, Sep 30, 2013 at 1:27 PM, KONDO Mitsumasa
<kondo.mitsumasa@lab.ntt.co.jp> wrote:
> If you like, I can test this patch with the DBT-2 benchmark at the end of
> this week. I will use the following test server:
>
> Server: HP Proliant DL360 G7
> CPU: Xeon E5640 2.66GHz (1P/4C)
> Memory: 18GB (PC3-10600R-9)
> Disk: 146GB (15k) x4 RAID1+0
> RAID controller: P410i/256MB
Yep, please! That would be really helpful!
Regards,
--
Fujii Masao