Using COPY FREEZE in pgbench
Currently pgbench uses plain COPY to populate pgbench_accounts
table. With adding FREEZE option to COPY, the time to perform "pgbench
-i" will be significantly reduced.
Curent master:
pgbench -i -s 100
:
:
done in 70.78 s (drop tables 0.21 s, create tables 0.02 s, client-side generate 12.42 s, vacuum 51.11 s, primary keys 7.02 s).
Using FREEZE:
done in 16.86 s (drop tables 0.20 s, create tables 0.01 s, client-side generate 11.86 s, vacuum 0.25 s, primary keys 4.53 s).
As you can see total time drops from 70.78 seconds to 16.86 seconds,
that is 4.1 times faster. This is mainly because vacuum takes only
0.25 seconds after COPY FREEZE while unpatched pgbench takes 51.11
seconds, which is 204 times slower.
Thanks for the COPY FREEZE patch recently committed:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7db0cd2145f2bce84cac92402e205e4d2b045bf2
Attached is one line patch for this.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Attachments:
pgbench.difftext/x-patch; charset=us-asciiDownload+1-1
On Mon, 2021-03-08 at 14:39 +0900, Tatsuo Ishii wrote:
Currently pgbench uses plain COPY to populate pgbench_accounts
table. With adding FREEZE option to COPY, the time to perform "pgbench
-i" will be significantly reduced.Curent master:
pgbench -i -s 100
:
:
done in 70.78 s (drop tables 0.21 s, create tables 0.02 s, client-side generate 12.42 s, vacuum 51.11 s, primary keys 7.02 s).Using FREEZE:
done in 16.86 s (drop tables 0.20 s, create tables 0.01 s, client-side generate 11.86 s, vacuum 0.25 s, primary keys 4.53 s).As you can see total time drops from 70.78 seconds to 16.86 seconds,
that is 4.1 times faster. This is mainly because vacuum takes only
0.25 seconds after COPY FREEZE while unpatched pgbench takes 51.11
seconds, which is 204 times slower.Thanks for the COPY FREEZE patch recently committed:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7db0cd2145f2bce84cac92402e205e4d2b045bf2Attached is one line patch for this.
That is indeed low hanging fruit and an improvement.
- res = PQexec(con, "copy pgbench_accounts from stdin"); + res = PQexec(con, "copy pgbench_accounts from stdin freeze");
I think it would be better to use the official syntax and put the "freeze"
in parentheses. Perhaps the old syntax will be desupported some day.
Yours,
Laurenz Albe
Hello Tatsuo-san,
Currently pgbench uses plain COPY to populate pgbench_accounts
table. With adding FREEZE option to COPY, the time to perform "pgbench
-i" will be significantly reduced.Curent master:
pgbench -i -s 100
done in 70.78 s (drop tables 0.21 s, create tables 0.02 s, client-side generate 12.42 s, vacuum 51.11 s, primary keys 7.02 s).Using FREEZE:
done in 16.86 s (drop tables 0.20 s, create tables 0.01 s, client-side generate 11.86 s, vacuum 0.25 s, primary keys 4.53 s).
That looks good!
As COPY FREEZE was introduced in 9.3, this means that loading data would
break with previous versions. Pgbench attempts at being compatible with
older versions. I'm wondering whether we should not care or if we should
attempt some compatibility layer. It seems enough to test
"PQserverVersion() >= 90300"?
--
Fabien.
Hi Fabien,
That looks good!
As COPY FREEZE was introduced in 9.3, this means that loading data
would break with previous versions. Pgbench attempts at being
compatible with older versions. I'm wondering whether we should not
care or if we should attempt some compatibility layer. It seems enough
to test "PQserverVersion() >= 90300"?
Good point.
Unfortunately with pre-14 COPY FREEZE we cannot get the speed up
effect because it requires the commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7db0cd2145f2bce84cac92402e205e4d2b045bf2
which was there only in the master branch as of Jan 17, 2021.
So I think adding "freeze" to the copy statement should only happen in
PostgreSQL 14 or later. Probably the test should be
"PQserverVersion() >= 140000" I think. Attached is the patch doing
what you suggest.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Attachments:
pgbench_freeze_v2.difftext/x-patch; charset=us-asciiDownload+18-1
- res = PQexec(con, "copy pgbench_accounts from stdin"); + res = PQexec(con, "copy pgbench_accounts from stdin freeze");I think it would be better to use the official syntax and put the "freeze"
in parentheses. Perhaps the old syntax will be desupported some day.
Agreed.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Hi Fabien,
That looks good!
As COPY FREEZE was introduced in 9.3, this means that loading data
would break with previous versions. Pgbench attempts at being
compatible with older versions. I'm wondering whether we should not
care or if we should attempt some compatibility layer. It seems enough
to test "PQserverVersion() >= 90300"?Good point.
Unfortunately with pre-14 COPY FREEZE we cannot get the speed up
effect because it requires the commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7db0cd2145f2bce84cac92402e205e4d2b045bf2
which was there only in the master branch as of Jan 17, 2021.So I think adding "freeze" to the copy statement should only happen in
PostgreSQL 14 or later. Probably the test should be
"PQserverVersion() >= 140000" I think. Attached is the patch doing
what you suggest.
I have created a CommitFest entry for this.
https://commitfest.postgresql.org/33/3034/
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Hello Tatsuo-san,
So I think adding "freeze" to the copy statement should only happen in
PostgreSQL 14 or later. Probably the test should be
"PQserverVersion() >= 140000" I think. Attached is the patch doing
what you suggest.I have created a CommitFest entry for this.
https://commitfest.postgresql.org/33/3034/
My 0.02 ᅵ
After giving it some thought, ISTM that there could also be a performance
improvement with copy freeze from older version, so I'd suggest to add it
after 9.3 where the option was added, i.e. 90300.
Also, I do not think it is worth to fail on a 0 pqserverversion, just keep
the number and consider it a very old version?
Question: should there be a word about copy using the freeze option? I'd
say yes, in the section describing "g".
--
Fabien.
I have created a CommitFest entry for this.
https://commitfest.postgresql.org/33/3034/My 0.02 €
After giving it some thought, ISTM that there could also be a
performance improvement with copy freeze from older version, so I'd
suggest to add it after 9.3 where the option was added, i.e. 90300.
You misunderstand about COPY FREEZE. Pre-13 COPY FREEZE does not
contribute a performance improvement. See discussions for more details.
/messages/by-id/CAMkU=1w3osJJ2FneELhhNRLxfZitDgp9FPHee08NT2FQFmz_pQ@mail.gmail.com
/messages/by-id/CABOikdN-ptGv0mZntrK2Q8OtfUuAjqaYMGmkdU1dCKFtUxVLrg@mail.gmail.com
Also, I do not think it is worth to fail on a 0 pqserverversion, just
keep the number and consider it a very old version?
If pqserverversion fails, then following PQ calls are likely fail
too. What's a benefit to continue after pqserverversion returns 0?
Question: should there be a word about copy using the freeze option?
I'd say yes, in the section describing "g".
Can you elaborate about "section describing "g"? I am not sure what
you mean.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
After giving it some thought, ISTM that there could also be a
performance improvement with copy freeze from older version, so I'd
suggest to add it after 9.3 where the option was added, i.e. 90300.You misunderstand about COPY FREEZE. Pre-13 COPY FREEZE does not
Oops. I meant Pre-14, not pre-13.
Show quoted text
contribute a performance improvement. See discussions for more details.
/messages/by-id/CAMkU=1w3osJJ2FneELhhNRLxfZitDgp9FPHee08NT2FQFmz_pQ@mail.gmail.com
/messages/by-id/CABOikdN-ptGv0mZntrK2Q8OtfUuAjqaYMGmkdU1dCKFtUxVLrg@mail.gmail.comAlso, I do not think it is worth to fail on a 0 pqserverversion, just
keep the number and consider it a very old version?If pqserverversion fails, then following PQ calls are likely fail
too. What's a benefit to continue after pqserverversion returns 0?Question: should there be a word about copy using the freeze option?
I'd say yes, in the section describing "g".Can you elaborate about "section describing "g"? I am not sure what
you mean.Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Hello,
After giving it some thought, ISTM that there could also be a
performance improvement with copy freeze from older version, so I'd
suggest to add it after 9.3 where the option was added, i.e. 90300.You misunderstand about COPY FREEZE. Pre-13 COPY FREEZE does not
contribute a performance improvement. See discussions for more details.
/messages/by-id/CAMkU=1w3osJJ2FneELhhNRLxfZitDgp9FPHee08NT2FQFmz_pQ@mail.gmail.com
/messages/by-id/CABOikdN-ptGv0mZntrK2Q8OtfUuAjqaYMGmkdU1dCKFtUxVLrg@mail.gmail.com
Ok. ISTM that the option should be triggered as soon as it is available,
but you do as you wish.
Also, I do not think it is worth to fail on a 0 pqserverversion, just
keep the number and consider it a very old version?If pqserverversion fails, then following PQ calls are likely fail
too. What's a benefit to continue after pqserverversion returns 0?
I'm unsure how this could happen ever. The benefit of not caring is less
lines of codes in pgbench and a later call will catch the issue anyway, if
libpq is corrupted.
Question: should there be a word about copy using the freeze option?
I'd say yes, in the section describing "g".Can you elaborate about "section describing "g"? I am not sure what
you mean.
The "g" item in the section describing initialization steps (i.e. option
-I). I'd suggest just to replace "COPY" with "COPY FREEZE" in the
sentence.
--
Fabien.
Ok. ISTM that the option should be triggered as soon as it is
available, but you do as you wish.
Can you elaborate why you think that using COPY FREEZE before 14 is
beneficial? Or do you want to standardize to use COPY FREEZE?
I'm unsure how this could happen ever. The benefit of not caring is
less lines of codes in pgbench and a later call will catch the issue
anyway, if libpq is corrupted.
I have looked in the code of PQprotocolVersion. The only case in which
it returns 0 is there's no connection. Yes, you are right. Once the
connection has been successfuly established, there's no chance it
fails. So I agree with you.
Question: should there be a word about copy using the freeze option?
I'd say yes, in the section describing "g".Can you elaborate about "section describing "g"? I am not sure what
you mean.The "g" item in the section describing initialization steps
(i.e. option -I). I'd suggest just to replace "COPY" with "COPY
FREEZE" in the sentence.
Ok. The section is needed to be modified.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
I have looked in the code of PQprotocolVersion. The only case in which
it returns 0 is there's no connection. Yes, you are right. Once the
connection has been successfuly established, there's no chance it
fails. So I agree with you.
Attached v3 patch addresses this.
The "g" item in the section describing initialization steps
(i.e. option -I). I'd suggest just to replace "COPY" with "COPY
FREEZE" in the sentence.Ok. The section is needed to be modified.
This is also addressed in the patch.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Attachments:
pgbench_freeze_v3.difftext/x-patch; charset=us-asciiDownload+16-1
Hello Tatsuo-san,
I have looked in the code of PQprotocolVersion. The only case in which
it returns 0 is there's no connection. Yes, you are right. Once the
connection has been successfuly established, there's no chance it
fails. So I agree with you.Attached v3 patch addresses this.
The "g" item in the section describing initialization steps
(i.e. option -I). I'd suggest just to replace "COPY" with "COPY
FREEZE" in the sentence.Ok. The section is needed to be modified.
This is also addressed in the patch.
V3 works for me and looks ok. I changed it to ready in the CF app.
--
Fabien.
Hi Fabien,
Hello Tatsuo-san,
I have looked in the code of PQprotocolVersion. The only case in which
it returns 0 is there's no connection. Yes, you are right. Once the
connection has been successfuly established, there's no chance it
fails. So I agree with you.Attached v3 patch addresses this.
The "g" item in the section describing initialization steps
(i.e. option -I). I'd suggest just to replace "COPY" with "COPY
FREEZE" in the sentence.Ok. The section is needed to be modified.
This is also addressed in the patch.
V3 works for me and looks ok. I changed it to ready in the CF app.
Thank you for your review!
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
V3 works for me and looks ok. I changed it to ready in the CF app.
Thank you for your review!
Unfortunately it seems cfbot is not happy with the patch.
# Failed test 'pgbench scale 1 initialization status (got 1 vs expected 0)'
# at t/001_pgbench_with_server.pl line 116.
# Failed test 'pgbench scale 1 initialization stderr /(?^:creating foreign keys)/'
# at t/001_pgbench_with_server.pl line 116.
# 'dropping old tables...
# creating tables...
# creating 2 partitions...
# creating primary keys...
# vacuuming...
# generating data (client-side)...
# 100000 of 100000 tuples (100%) done (elapsed 0.02 s, remaining 0.00 s)
# ERROR: cannot perform COPY FREEZE on a partitioned table
# pgbench: fatal: PQendcopy failed
I think pgbench needs to check whether partitioning option is
specified or not. If specified, pgbench should not use COPY
FREEZE. Attached v4 patch does this.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Attachments:
pgbench_freeze_v4.difftext/x-patch; charset=us-asciiDownload+17-1
V3 works for me and looks ok. I changed it to ready in the CF app.
Thank you for your review!
Unfortunately it seems cfbot is not happy with the patch.
Argh. Indeed, I did not thought of testing on a partitioned table:-( ISTM
I did "make check" in pgbench to trigger tap tests, but possibly it was in
a dream:-(
The feature is a little disappointing because if someone has partition
tables then probably they have a lot of data and probably they would like
freeze to work there. Maybe freeze would work on table partitions
themselves, but I do not think it is worth the effort to do that.
About v4 there is a typo in the doc "portioning":
<command>pgbench</command> uses FREEZE option with 14 or later
version of <productname>PostgreSQL</productname> to speed up
subsequent <command>VACUUM</command> if portioning option is not
specified.
I'd suggest:
<command>pgbench</command> uses the FREEZE option with 14 or later
version of <productname>PostgreSQL</productname> to speed up
subsequent <command>VACUUM</command>, unless partitions are enabled.
--
Fabien.
The feature is a little disappointing because if someone has partition
tables then probably they have a lot of data and probably they would
like freeze to work there. Maybe freeze would work on table partitions
themselves, but I do not think it is worth the effort to do that.
Agreed.
About v4 there is a typo in the doc "portioning":
<command>pgbench</command> uses FREEZE option with 14 or later
version of <productname>PostgreSQL</productname> to speed up
subsequent <command>VACUUM</command> if portioning option is not
specified.I'd suggest:
<command>pgbench</command> uses the FREEZE option with 14 or later
version of <productname>PostgreSQL</productname> to speed up
subsequent <command>VACUUM</command>, unless partitions are enabled.
Thanks for pointing it out. Also I think that in "with 14 or later
version", "version" should be "versions".
Attached is the v5 patch.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Attachments:
pgbench_freeze_v5.difftext/x-patch; charset=us-asciiDownload+16-1
Attached is the v5 patch.
About v5: doc gen ok, global and local make check ok.
I did a few tests on my laptop. Is seems that copying takes a little more
time, say about 10%, but vacuum is indeed very significantly reduced, so
that the total time for copying and vacuuming is reduced by 10% on
overall.
So it is okay for me.
--
Fabien.
I did a few tests on my laptop. Is seems that copying takes a little
more time, say about 10%, but vacuum is indeed very significantly
reduced, so that the total time for copying and vacuuming is reduced
by 10% on overall.So it is okay for me.
Thanks for the test.
I wrote:
Curent master:
pgbench -i -s 100
:
:
done in 70.78 s (drop tables 0.21 s, create tables 0.02 s, client-side generate 12.42 s, vacuum 51.11 s, primary keys 7.02 s).Using FREEZE:
done in 16.86 s (drop tables 0.20 s, create tables 0.01 s, client-side generate 11.86 s, vacuum 0.25 s, primary keys 4.53 s).As you can see total time drops from 70.78 seconds to 16.86 seconds,
that is 4.1 times faster. This is mainly because vacuum takes only
0.25 seconds after COPY FREEZE while unpatched pgbench takes 51.11
seconds, which is 204 times slower.
I did same test again.
13.2 pgbench + master branch server:
done in 15.47 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.07 s, vacuum 2.07 s, primary keys 4.13 s).
With patch on master branch:
done in 13.38 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.68 s, vacuum 0.23 s, primary keys 3.27 s).
This time current pgbench performs much faster than I wrote (15.47 s
vs. 70.78 s). I don't why.
Anyway, this time total pgbench time is reduced by 14% over all
here. I hope people agree that the patch is worth the gain.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Hello Tatsuo-san,
13.2 pgbench + master branch server:
done in 15.47 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.07 s, vacuum 2.07 s, primary keys 4.13 s).With patch on master branch:
done in 13.38 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.68 s, vacuum 0.23 s, primary keys 3.27 s).
Yes, this is the kind of figures I got on my laptop.
This time current pgbench performs much faster than I wrote (15.47 s vs.
70.78 s). I don't why.
Duno.
Anyway, this time total pgbench time is reduced by 14% over all
here. I hope people agree that the patch is worth the gain.
Yes, because (1) why not take +10% and (2) it exercises an option.
--
Fabien.