Using COPY FREEZE in pgbench
Currently pgbench uses plain COPY to populate pgbench_accounts
table. With adding FREEZE option to COPY, the time to perform "pgbench
-i" will be significantly reduced.
Curent master:
pgbench -i -s 100
:
:
done in 70.78 s (drop tables 0.21 s, create tables 0.02 s, client-side generate 12.42 s, vacuum 51.11 s, primary keys 7.02 s).
Using FREEZE:
done in 16.86 s (drop tables 0.20 s, create tables 0.01 s, client-side generate 11.86 s, vacuum 0.25 s, primary keys 4.53 s).
As you can see total time drops from 70.78 seconds to 16.86 seconds,
that is 4.1 times faster. This is mainly because vacuum takes only
0.25 seconds after COPY FREEZE while unpatched pgbench takes 51.11
seconds, which is 204 times slower.
Thanks for the COPY FREEZE patch recently committed:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7db0cd2145f2bce84cac92402e205e4d2b045bf2
Attached is one line patch for this.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Attachments:
pgbench.difftext/x-patch; charset=us-asciiDownload
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index f1d98be2d2..eea96bc53b 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3869,7 +3869,7 @@ initGenerateDataClientSide(PGconn *con)
/*
* accounts is big enough to be worth using COPY and tracking runtime
*/
- res = PQexec(con, "copy pgbench_accounts from stdin");
+ res = PQexec(con, "copy pgbench_accounts from stdin freeze");
if (PQresultStatus(res) != PGRES_COPY_IN)
{
pg_log_fatal("unexpected copy in result: %s", PQerrorMessage(con));
On Mon, 2021-03-08 at 14:39 +0900, Tatsuo Ishii wrote:
Currently pgbench uses plain COPY to populate pgbench_accounts
table. With adding FREEZE option to COPY, the time to perform "pgbench
-i" will be significantly reduced.Curent master:
pgbench -i -s 100
:
:
done in 70.78 s (drop tables 0.21 s, create tables 0.02 s, client-side generate 12.42 s, vacuum 51.11 s, primary keys 7.02 s).Using FREEZE:
done in 16.86 s (drop tables 0.20 s, create tables 0.01 s, client-side generate 11.86 s, vacuum 0.25 s, primary keys 4.53 s).As you can see total time drops from 70.78 seconds to 16.86 seconds,
that is 4.1 times faster. This is mainly because vacuum takes only
0.25 seconds after COPY FREEZE while unpatched pgbench takes 51.11
seconds, which is 204 times slower.Thanks for the COPY FREEZE patch recently committed:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7db0cd2145f2bce84cac92402e205e4d2b045bf2Attached is one line patch for this.
That is indeed low hanging fruit and an improvement.
- res = PQexec(con, "copy pgbench_accounts from stdin"); + res = PQexec(con, "copy pgbench_accounts from stdin freeze");
I think it would be better to use the official syntax and put the "freeze"
in parentheses. Perhaps the old syntax will be desupported some day.
Yours,
Laurenz Albe
Hello Tatsuo-san,
Currently pgbench uses plain COPY to populate pgbench_accounts
table. With adding FREEZE option to COPY, the time to perform "pgbench
-i" will be significantly reduced.Curent master:
pgbench -i -s 100
done in 70.78 s (drop tables 0.21 s, create tables 0.02 s, client-side generate 12.42 s, vacuum 51.11 s, primary keys 7.02 s).Using FREEZE:
done in 16.86 s (drop tables 0.20 s, create tables 0.01 s, client-side generate 11.86 s, vacuum 0.25 s, primary keys 4.53 s).
That looks good!
As COPY FREEZE was introduced in 9.3, this means that loading data would
break with previous versions. Pgbench attempts at being compatible with
older versions. I'm wondering whether we should not care or if we should
attempt some compatibility layer. It seems enough to test
"PQserverVersion() >= 90300"?
--
Fabien.
Hi Fabien,
That looks good!
As COPY FREEZE was introduced in 9.3, this means that loading data
would break with previous versions. Pgbench attempts at being
compatible with older versions. I'm wondering whether we should not
care or if we should attempt some compatibility layer. It seems enough
to test "PQserverVersion() >= 90300"?
Good point.
Unfortunately with pre-14 COPY FREEZE we cannot get the speed up
effect because it requires the commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7db0cd2145f2bce84cac92402e205e4d2b045bf2
which was there only in the master branch as of Jan 17, 2021.
So I think adding "freeze" to the copy statement should only happen in
PostgreSQL 14 or later. Probably the test should be
"PQserverVersion() >= 140000" I think. Attached is the patch doing
what you suggest.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Attachments:
pgbench_freeze_v2.difftext/x-patch; charset=us-asciiDownload
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index f1d98be2d2..6fbf8d3311 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3820,6 +3820,7 @@ initGenerateDataClientSide(PGconn *con)
PGresult *res;
int i;
int64 k;
+ int server_version;
/* used to track elapsed time and estimate of the remaining time */
instr_time start,
@@ -3869,7 +3870,23 @@ initGenerateDataClientSide(PGconn *con)
/*
* accounts is big enough to be worth using COPY and tracking runtime
*/
- res = PQexec(con, "copy pgbench_accounts from stdin");
+
+ server_version = PQserverVersion(con);
+ if (server_version == 0)
+ {
+ pg_log_fatal("could not get server version");
+ exit(1);
+ }
+
+ /*
+ * If server version is 14.0 or later, we can take account of freeze
+ * option of copy.
+ */
+ if (server_version >= 140000)
+ res = PQexec(con, "copy pgbench_accounts from stdin with (freeze on)");
+ else
+ res = PQexec(con, "copy pgbench_accounts from stdin");
+
if (PQresultStatus(res) != PGRES_COPY_IN)
{
pg_log_fatal("unexpected copy in result: %s", PQerrorMessage(con));
- res = PQexec(con, "copy pgbench_accounts from stdin"); + res = PQexec(con, "copy pgbench_accounts from stdin freeze");I think it would be better to use the official syntax and put the "freeze"
in parentheses. Perhaps the old syntax will be desupported some day.
Agreed.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Hi Fabien,
That looks good!
As COPY FREEZE was introduced in 9.3, this means that loading data
would break with previous versions. Pgbench attempts at being
compatible with older versions. I'm wondering whether we should not
care or if we should attempt some compatibility layer. It seems enough
to test "PQserverVersion() >= 90300"?Good point.
Unfortunately with pre-14 COPY FREEZE we cannot get the speed up
effect because it requires the commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7db0cd2145f2bce84cac92402e205e4d2b045bf2
which was there only in the master branch as of Jan 17, 2021.So I think adding "freeze" to the copy statement should only happen in
PostgreSQL 14 or later. Probably the test should be
"PQserverVersion() >= 140000" I think. Attached is the patch doing
what you suggest.
I have created a CommitFest entry for this.
https://commitfest.postgresql.org/33/3034/
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Hello Tatsuo-san,
So I think adding "freeze" to the copy statement should only happen in
PostgreSQL 14 or later. Probably the test should be
"PQserverVersion() >= 140000" I think. Attached is the patch doing
what you suggest.I have created a CommitFest entry for this.
https://commitfest.postgresql.org/33/3034/
My 0.02 ᅵ
After giving it some thought, ISTM that there could also be a performance
improvement with copy freeze from older version, so I'd suggest to add it
after 9.3 where the option was added, i.e. 90300.
Also, I do not think it is worth to fail on a 0 pqserverversion, just keep
the number and consider it a very old version?
Question: should there be a word about copy using the freeze option? I'd
say yes, in the section describing "g".
--
Fabien.
I have created a CommitFest entry for this.
https://commitfest.postgresql.org/33/3034/My 0.02 €
After giving it some thought, ISTM that there could also be a
performance improvement with copy freeze from older version, so I'd
suggest to add it after 9.3 where the option was added, i.e. 90300.
You misunderstand about COPY FREEZE. Pre-13 COPY FREEZE does not
contribute a performance improvement. See discussions for more details.
/messages/by-id/CAMkU=1w3osJJ2FneELhhNRLxfZitDgp9FPHee08NT2FQFmz_pQ@mail.gmail.com
/messages/by-id/CABOikdN-ptGv0mZntrK2Q8OtfUuAjqaYMGmkdU1dCKFtUxVLrg@mail.gmail.com
Also, I do not think it is worth to fail on a 0 pqserverversion, just
keep the number and consider it a very old version?
If pqserverversion fails, then following PQ calls are likely fail
too. What's a benefit to continue after pqserverversion returns 0?
Question: should there be a word about copy using the freeze option?
I'd say yes, in the section describing "g".
Can you elaborate about "section describing "g"? I am not sure what
you mean.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
After giving it some thought, ISTM that there could also be a
performance improvement with copy freeze from older version, so I'd
suggest to add it after 9.3 where the option was added, i.e. 90300.You misunderstand about COPY FREEZE. Pre-13 COPY FREEZE does not
Oops. I meant Pre-14, not pre-13.
Show quoted text
contribute a performance improvement. See discussions for more details.
/messages/by-id/CAMkU=1w3osJJ2FneELhhNRLxfZitDgp9FPHee08NT2FQFmz_pQ@mail.gmail.com
/messages/by-id/CABOikdN-ptGv0mZntrK2Q8OtfUuAjqaYMGmkdU1dCKFtUxVLrg@mail.gmail.comAlso, I do not think it is worth to fail on a 0 pqserverversion, just
keep the number and consider it a very old version?If pqserverversion fails, then following PQ calls are likely fail
too. What's a benefit to continue after pqserverversion returns 0?Question: should there be a word about copy using the freeze option?
I'd say yes, in the section describing "g".Can you elaborate about "section describing "g"? I am not sure what
you mean.Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Hello,
After giving it some thought, ISTM that there could also be a
performance improvement with copy freeze from older version, so I'd
suggest to add it after 9.3 where the option was added, i.e. 90300.You misunderstand about COPY FREEZE. Pre-13 COPY FREEZE does not
contribute a performance improvement. See discussions for more details.
/messages/by-id/CAMkU=1w3osJJ2FneELhhNRLxfZitDgp9FPHee08NT2FQFmz_pQ@mail.gmail.com
/messages/by-id/CABOikdN-ptGv0mZntrK2Q8OtfUuAjqaYMGmkdU1dCKFtUxVLrg@mail.gmail.com
Ok. ISTM that the option should be triggered as soon as it is available,
but you do as you wish.
Also, I do not think it is worth to fail on a 0 pqserverversion, just
keep the number and consider it a very old version?If pqserverversion fails, then following PQ calls are likely fail
too. What's a benefit to continue after pqserverversion returns 0?
I'm unsure how this could happen ever. The benefit of not caring is less
lines of codes in pgbench and a later call will catch the issue anyway, if
libpq is corrupted.
Question: should there be a word about copy using the freeze option?
I'd say yes, in the section describing "g".Can you elaborate about "section describing "g"? I am not sure what
you mean.
The "g" item in the section describing initialization steps (i.e. option
-I). I'd suggest just to replace "COPY" with "COPY FREEZE" in the
sentence.
--
Fabien.
Ok. ISTM that the option should be triggered as soon as it is
available, but you do as you wish.
Can you elaborate why you think that using COPY FREEZE before 14 is
beneficial? Or do you want to standardize to use COPY FREEZE?
I'm unsure how this could happen ever. The benefit of not caring is
less lines of codes in pgbench and a later call will catch the issue
anyway, if libpq is corrupted.
I have looked in the code of PQprotocolVersion. The only case in which
it returns 0 is there's no connection. Yes, you are right. Once the
connection has been successfuly established, there's no chance it
fails. So I agree with you.
Question: should there be a word about copy using the freeze option?
I'd say yes, in the section describing "g".Can you elaborate about "section describing "g"? I am not sure what
you mean.The "g" item in the section describing initialization steps
(i.e. option -I). I'd suggest just to replace "COPY" with "COPY
FREEZE" in the sentence.
Ok. The section is needed to be modified.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
I have looked in the code of PQprotocolVersion. The only case in which
it returns 0 is there's no connection. Yes, you are right. Once the
connection has been successfuly established, there's no chance it
fails. So I agree with you.
Attached v3 patch addresses this.
The "g" item in the section describing initialization steps
(i.e. option -I). I'd suggest just to replace "COPY" with "COPY
FREEZE" in the sentence.Ok. The section is needed to be modified.
This is also addressed in the patch.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Attachments:
pgbench_freeze_v3.difftext/x-patch; charset=us-asciiDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 50cf22ba6b..9badafbc1f 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -220,6 +220,9 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
data is generated in <command>pgbench</command> client and then
sent to the server. This uses the client/server bandwidth
extensively through a <command>COPY</command>.
+ <command>pgbench</command> uses FREEZE option with 14 or later
+ version of <productname>PostgreSQL</productname> to speed up
+ subsequent <command>VACUUM</command>.
Using <literal>g</literal> causes logging to print one message
every 100,000 rows while generating data for the
<structname>pgbench_accounts</structname> table.
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index e69d43b26b..a842b59188 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3976,6 +3976,7 @@ initGenerateDataClientSide(PGconn *con)
PGresult *res;
int i;
int64 k;
+ int server_version;
/* used to track elapsed time and estimate of the remaining time */
pg_time_usec_t start;
@@ -4022,7 +4023,18 @@ initGenerateDataClientSide(PGconn *con)
/*
* accounts is big enough to be worth using COPY and tracking runtime
*/
- res = PQexec(con, "copy pgbench_accounts from stdin");
+
+ /*
+ * If server version is 14.0 or later, we can take account of freeze
+ * option of copy.
+ */
+ server_version = PQserverVersion(con);
+
+ if (server_version >= 140000)
+ res = PQexec(con, "copy pgbench_accounts from stdin with (freeze on)");
+ else
+ res = PQexec(con, "copy pgbench_accounts from stdin");
+
if (PQresultStatus(res) != PGRES_COPY_IN)
{
pg_log_fatal("unexpected copy in result: %s", PQerrorMessage(con));
Hello Tatsuo-san,
I have looked in the code of PQprotocolVersion. The only case in which
it returns 0 is there's no connection. Yes, you are right. Once the
connection has been successfuly established, there's no chance it
fails. So I agree with you.Attached v3 patch addresses this.
The "g" item in the section describing initialization steps
(i.e. option -I). I'd suggest just to replace "COPY" with "COPY
FREEZE" in the sentence.Ok. The section is needed to be modified.
This is also addressed in the patch.
V3 works for me and looks ok. I changed it to ready in the CF app.
--
Fabien.
Hi Fabien,
Hello Tatsuo-san,
I have looked in the code of PQprotocolVersion. The only case in which
it returns 0 is there's no connection. Yes, you are right. Once the
connection has been successfuly established, there's no chance it
fails. So I agree with you.Attached v3 patch addresses this.
The "g" item in the section describing initialization steps
(i.e. option -I). I'd suggest just to replace "COPY" with "COPY
FREEZE" in the sentence.Ok. The section is needed to be modified.
This is also addressed in the patch.
V3 works for me and looks ok. I changed it to ready in the CF app.
Thank you for your review!
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
V3 works for me and looks ok. I changed it to ready in the CF app.
Thank you for your review!
Unfortunately it seems cfbot is not happy with the patch.
# Failed test 'pgbench scale 1 initialization status (got 1 vs expected 0)'
# at t/001_pgbench_with_server.pl line 116.
# Failed test 'pgbench scale 1 initialization stderr /(?^:creating foreign keys)/'
# at t/001_pgbench_with_server.pl line 116.
# 'dropping old tables...
# creating tables...
# creating 2 partitions...
# creating primary keys...
# vacuuming...
# generating data (client-side)...
# 100000 of 100000 tuples (100%) done (elapsed 0.02 s, remaining 0.00 s)
# ERROR: cannot perform COPY FREEZE on a partitioned table
# pgbench: fatal: PQendcopy failed
I think pgbench needs to check whether partitioning option is
specified or not. If specified, pgbench should not use COPY
FREEZE. Attached v4 patch does this.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Attachments:
pgbench_freeze_v4.difftext/x-patch; charset=us-asciiDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 50cf22ba6b..0f6a6babc2 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -220,6 +220,10 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
data is generated in <command>pgbench</command> client and then
sent to the server. This uses the client/server bandwidth
extensively through a <command>COPY</command>.
+ <command>pgbench</command> uses FREEZE option with 14 or later
+ version of <productname>PostgreSQL</productname> to speed up
+ subsequent <command>VACUUM</command> if portioning option is not
+ specified.
Using <literal>g</literal> causes logging to print one message
every 100,000 rows while generating data for the
<structname>pgbench_accounts</structname> table.
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index e69d43b26b..44d39ad55d 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3976,6 +3976,7 @@ initGenerateDataClientSide(PGconn *con)
PGresult *res;
int i;
int64 k;
+ char *copy_statement;
/* used to track elapsed time and estimate of the remaining time */
pg_time_usec_t start;
@@ -4022,7 +4023,18 @@ initGenerateDataClientSide(PGconn *con)
/*
* accounts is big enough to be worth using COPY and tracking runtime
*/
- res = PQexec(con, "copy pgbench_accounts from stdin");
+
+ /*
+ * If partitioning is not enabled and server version is 14.0 or later, we
+ * can take account of freeze option of copy.
+ */
+ if (partitions == 0 && PQserverVersion(con) >= 140000)
+ copy_statement = "copy pgbench_accounts from stdin with (freeze on)";
+ else
+ copy_statement = "copy pgbench_accounts from stdin";
+
+ res = PQexec(con, copy_statement);
+
if (PQresultStatus(res) != PGRES_COPY_IN)
{
pg_log_fatal("unexpected copy in result: %s", PQerrorMessage(con));
V3 works for me and looks ok. I changed it to ready in the CF app.
Thank you for your review!
Unfortunately it seems cfbot is not happy with the patch.
Argh. Indeed, I did not thought of testing on a partitioned table:-( ISTM
I did "make check" in pgbench to trigger tap tests, but possibly it was in
a dream:-(
The feature is a little disappointing because if someone has partition
tables then probably they have a lot of data and probably they would like
freeze to work there. Maybe freeze would work on table partitions
themselves, but I do not think it is worth the effort to do that.
About v4 there is a typo in the doc "portioning":
<command>pgbench</command> uses FREEZE option with 14 or later
version of <productname>PostgreSQL</productname> to speed up
subsequent <command>VACUUM</command> if portioning option is not
specified.
I'd suggest:
<command>pgbench</command> uses the FREEZE option with 14 or later
version of <productname>PostgreSQL</productname> to speed up
subsequent <command>VACUUM</command>, unless partitions are enabled.
--
Fabien.
The feature is a little disappointing because if someone has partition
tables then probably they have a lot of data and probably they would
like freeze to work there. Maybe freeze would work on table partitions
themselves, but I do not think it is worth the effort to do that.
Agreed.
About v4 there is a typo in the doc "portioning":
<command>pgbench</command> uses FREEZE option with 14 or later
version of <productname>PostgreSQL</productname> to speed up
subsequent <command>VACUUM</command> if portioning option is not
specified.I'd suggest:
<command>pgbench</command> uses the FREEZE option with 14 or later
version of <productname>PostgreSQL</productname> to speed up
subsequent <command>VACUUM</command>, unless partitions are enabled.
Thanks for pointing it out. Also I think that in "with 14 or later
version", "version" should be "versions".
Attached is the v5 patch.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Attachments:
pgbench_freeze_v5.difftext/x-patch; charset=us-asciiDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 50cf22ba6b..f0b3a2a624 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -220,6 +220,9 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
data is generated in <command>pgbench</command> client and then
sent to the server. This uses the client/server bandwidth
extensively through a <command>COPY</command>.
+ <command>pgbench</command> uses the FREEZE option with 14 or later
+ versions of <productname>PostgreSQL</productname> to speed up
+ subsequent <command>VACUUM</command>, unless partitions are enabled.
Using <literal>g</literal> causes logging to print one message
every 100,000 rows while generating data for the
<structname>pgbench_accounts</structname> table.
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index e69d43b26b..44d39ad55d 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3976,6 +3976,7 @@ initGenerateDataClientSide(PGconn *con)
PGresult *res;
int i;
int64 k;
+ char *copy_statement;
/* used to track elapsed time and estimate of the remaining time */
pg_time_usec_t start;
@@ -4022,7 +4023,18 @@ initGenerateDataClientSide(PGconn *con)
/*
* accounts is big enough to be worth using COPY and tracking runtime
*/
- res = PQexec(con, "copy pgbench_accounts from stdin");
+
+ /*
+ * If partitioning is not enabled and server version is 14.0 or later, we
+ * can take account of freeze option of copy.
+ */
+ if (partitions == 0 && PQserverVersion(con) >= 140000)
+ copy_statement = "copy pgbench_accounts from stdin with (freeze on)";
+ else
+ copy_statement = "copy pgbench_accounts from stdin";
+
+ res = PQexec(con, copy_statement);
+
if (PQresultStatus(res) != PGRES_COPY_IN)
{
pg_log_fatal("unexpected copy in result: %s", PQerrorMessage(con));
Attached is the v5 patch.
About v5: doc gen ok, global and local make check ok.
I did a few tests on my laptop. Is seems that copying takes a little more
time, say about 10%, but vacuum is indeed very significantly reduced, so
that the total time for copying and vacuuming is reduced by 10% on
overall.
So it is okay for me.
--
Fabien.
I did a few tests on my laptop. Is seems that copying takes a little
more time, say about 10%, but vacuum is indeed very significantly
reduced, so that the total time for copying and vacuuming is reduced
by 10% on overall.So it is okay for me.
Thanks for the test.
I wrote:
Curent master:
pgbench -i -s 100
:
:
done in 70.78 s (drop tables 0.21 s, create tables 0.02 s, client-side generate 12.42 s, vacuum 51.11 s, primary keys 7.02 s).Using FREEZE:
done in 16.86 s (drop tables 0.20 s, create tables 0.01 s, client-side generate 11.86 s, vacuum 0.25 s, primary keys 4.53 s).As you can see total time drops from 70.78 seconds to 16.86 seconds,
that is 4.1 times faster. This is mainly because vacuum takes only
0.25 seconds after COPY FREEZE while unpatched pgbench takes 51.11
seconds, which is 204 times slower.
I did same test again.
13.2 pgbench + master branch server:
done in 15.47 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.07 s, vacuum 2.07 s, primary keys 4.13 s).
With patch on master branch:
done in 13.38 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.68 s, vacuum 0.23 s, primary keys 3.27 s).
This time current pgbench performs much faster than I wrote (15.47 s
vs. 70.78 s). I don't why.
Anyway, this time total pgbench time is reduced by 14% over all
here. I hope people agree that the patch is worth the gain.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Hello Tatsuo-san,
13.2 pgbench + master branch server:
done in 15.47 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.07 s, vacuum 2.07 s, primary keys 4.13 s).With patch on master branch:
done in 13.38 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.68 s, vacuum 0.23 s, primary keys 3.27 s).
Yes, this is the kind of figures I got on my laptop.
This time current pgbench performs much faster than I wrote (15.47 s vs.
70.78 s). I don't why.
Duno.
Anyway, this time total pgbench time is reduced by 14% over all
here. I hope people agree that the patch is worth the gain.
Yes, because (1) why not take +10% and (2) it exercises an option.
--
Fabien.
On Sun, Mar 21, 2021 at 5:23 PM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
Anyway, this time total pgbench time is reduced by 14% over all
here. I hope people agree that the patch is worth the gain.
Most of the time when I run pgbench I use my own shell script, which does this:
PGOPTIONS='-c vacuum_freeze_min_age=0 -c wal_compression=off' pgbench
-i -s $SCALE
Have you considered this case? In other words, have you considered the
benefits of this patch for users that currently deliberately force
freezing by VACUUM, just because it matters to their benchmark?
(BTW you might be surprised how much wal_compression=off matters here.)
--
Peter Geoghegan
Most of the time when I run pgbench I use my own shell script, which does this:
PGOPTIONS='-c vacuum_freeze_min_age=0 -c wal_compression=off' pgbench
-i -s $SCALEHave you considered this case? In other words, have you considered the
benefits of this patch for users that currently deliberately force
freezing by VACUUM, just because it matters to their benchmark?
I am not sure how many people use this kind of options while running
pgbench -i but we could add yet another switch to fall back to none
FREEZE COPY if you want.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
On Fri, Apr 2, 2021 at 4:58 PM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
I am not sure how many people use this kind of options while running
pgbench -i but we could add yet another switch to fall back to none
FREEZE COPY if you want.
I was unclear. What I meant was that your patch isn't just useful
because it speeds up "pgbench -i" for everybody. It's also useful
because having all of the tuples already frozen after bulk loading
seems like a good benchmarking practice, at least most of the time.
The patch changes the initial state of the database with "pgbench -i",
I think. But that's good.
--
Peter Geoghegan
I was unclear. What I meant was that your patch isn't just useful
because it speeds up "pgbench -i" for everybody. It's also useful
because having all of the tuples already frozen after bulk loading
seems like a good benchmarking practice, at least most of the time.The patch changes the initial state of the database with "pgbench -i",
I think. But that's good.
Oh, ok. Thanks for the explanation!
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
After this commit:
https://git.postgresql.org/pg/commitdiff/8e03eb92e9ad54e2f1ed8b5a73617601f6262f81
I was worried about that the benefit of COPY FREEZE patch is somewhat
reduced or gone. So I ran a pgbench test again.
Current master:
$ pgbench -i -s 100 test
:
:
done in 20.23 s (drop tables 0.00 s, create tables 0.02 s, client-side generate 13.54 s, vacuum 2.34 s, primary keys 4.33 s).
With v5 patch:
done in 16.92 s (drop tables 0.21 s, create tables 0.01 s, client-side generate 12.68 s, vacuum 0.24 s, primary keys 3.77 s).
So overall gain by the patch is around 15%, whereas the last test
before the commit was 14%. It seems the patch is still beneficial
after the commit.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Hello Tatsuo-san,
So overall gain by the patch is around 15%, whereas the last test before
the commit was 14%. It seems the patch is still beneficial after the
commit.
Yes, that's good!
I had a quick look again, and about the comment:
/*
* If partitioning is not enabled and server version is 14.0 or later, we
* can take account of freeze option of copy.
*/
I'd suggest instead the shorter:
/* use COPY with FREEZE on v14 and later without partioning */
Or maybe even to fully drop the comment, because the code is clear enough
and the doc already says it.
--
Fabien.
Hi fabien,
So overall gain by the patch is around 15%, whereas the last test
before the commit was 14%. It seems the patch is still beneficial
after the commit.Yes, that's good!
Yeah!
I had a quick look again, and about the comment:
/*
* If partitioning is not enabled and server version is 14.0 or later, we
* can take account of freeze option of copy.
*/I'd suggest instead the shorter:
/* use COPY with FREEZE on v14 and later without partioning */
Or maybe even to fully drop the comment, because the code is clear
enough and the doc already says it.
I'd prefer to leave a comment. People (including me) tend to forget
things in the future, that are obvious now:-)
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Attachments:
pgbench_freeze_v6.difftext/x-patch; charset=us-asciiDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 0c60077e1f..0f432767c2 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -220,6 +220,9 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
data is generated in <command>pgbench</command> client and then
sent to the server. This uses the client/server bandwidth
extensively through a <command>COPY</command>.
+ <command>pgbench</command> uses the FREEZE option with 14 or later
+ versions of <productname>PostgreSQL</productname> to speed up
+ subsequent <command>VACUUM</command>, unless partitions are enabled.
Using <literal>g</literal> causes logging to print one message
every 100,000 rows while generating data for the
<structname>pgbench_accounts</structname> table.
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 4aeccd93af..54d993075f 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -4119,6 +4119,7 @@ initGenerateDataClientSide(PGconn *con)
PGresult *res;
int i;
int64 k;
+ char *copy_statement;
/* used to track elapsed time and estimate of the remaining time */
pg_time_usec_t start;
@@ -4165,7 +4166,15 @@ initGenerateDataClientSide(PGconn *con)
/*
* accounts is big enough to be worth using COPY and tracking runtime
*/
- res = PQexec(con, "copy pgbench_accounts from stdin");
+
+ /* use COPY with FREEZE on v14 and later without partioning */
+ if (partitions == 0 && PQserverVersion(con) >= 140000)
+ copy_statement = "copy pgbench_accounts from stdin with (freeze on)";
+ else
+ copy_statement = "copy pgbench_accounts from stdin";
+
+ res = PQexec(con, copy_statement);
+
if (PQresultStatus(res) != PGRES_COPY_IN)
{
pg_log_fatal("unexpected copy in result: %s", PQerrorMessage(con));
On Sun, 4 Jul 2021 at 09:32, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
So overall gain by the patch is around 15%, whereas the last test
before the commit was 14%. It seems the patch is still beneficial
after the commit.Yes, that's good!
Yeah!
I tested this with -s100 and got similar results, but with -s1000 it
was more like 1.5x faster:
master:
done in 111.33 s (drop tables 0.00 s, create tables 0.01 s,
client-side generate 52.45 s, vacuum 32.30 s, primary keys 26.58 s)
patch:
done in 74.04 s (drop tables 0.46 s, create tables 0.04 s, client-side
generate 51.81 s, vacuum 2.11 s, primary keys 19.63 s)
Nice!
Regards,
Dean
I tested this with -s100 and got similar results, but with -s1000 it
was more like 1.5x faster:master:
done in 111.33 s (drop tables 0.00 s, create tables 0.01 s,
client-side generate 52.45 s, vacuum 32.30 s, primary keys 26.58 s)patch:
done in 74.04 s (drop tables 0.46 s, create tables 0.04 s, client-side
generate 51.81 s, vacuum 2.11 s, primary keys 19.63 s)Nice!
Regards,
Dean
If there's no objection, I am going to commit/push to master branch in
early September.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
I tested this with -s100 and got similar results, but with -s1000 it
was more like 1.5x faster:master:
done in 111.33 s (drop tables 0.00 s, create tables 0.01 s,
client-side generate 52.45 s, vacuum 32.30 s, primary keys 26.58 s)patch:
done in 74.04 s (drop tables 0.46 s, create tables 0.04 s, client-side
generate 51.81 s, vacuum 2.11 s, primary keys 19.63 s)Nice!
Regards,
DeanIf there's no objection, I am going to commit/push to master branch in
early September.
I have pushed the patch to master branch.
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=06ba4a63b85e5aa47b325c3235c16c05a0b58b96
Thank you for those who gave me the valuable reviews!
Reviewed-by: Fabien COELHO, Laurenz Albe, Peter Geoghegan, Dean Rasheed
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp