Using COPY FREEZE in pgbench

Started by Tatsuo Ishiiabout 5 years ago30 messageshackers
Jump to latest
#1Tatsuo Ishii
t-ishii@sra.co.jp

Currently pgbench uses plain COPY to populate pgbench_accounts
table. With adding FREEZE option to COPY, the time to perform "pgbench
-i" will be significantly reduced.

Curent master:
pgbench -i -s 100
:
:
done in 70.78 s (drop tables 0.21 s, create tables 0.02 s, client-side generate 12.42 s, vacuum 51.11 s, primary keys 7.02 s).

Using FREEZE:
done in 16.86 s (drop tables 0.20 s, create tables 0.01 s, client-side generate 11.86 s, vacuum 0.25 s, primary keys 4.53 s).

As you can see total time drops from 70.78 seconds to 16.86 seconds,
that is 4.1 times faster. This is mainly because vacuum takes only
0.25 seconds after COPY FREEZE while unpatched pgbench takes 51.11
seconds, which is 204 times slower.

Thanks for the COPY FREEZE patch recently committed:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7db0cd2145f2bce84cac92402e205e4d2b045bf2

Attached is one line patch for this.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

Attachments:

pgbench.difftext/x-patch; charset=us-asciiDownload+1-1
#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Tatsuo Ishii (#1)
Re: Using COPY FREEZE in pgbench

On Mon, 2021-03-08 at 14:39 +0900, Tatsuo Ishii wrote:

Currently pgbench uses plain COPY to populate pgbench_accounts
table. With adding FREEZE option to COPY, the time to perform "pgbench
-i" will be significantly reduced.

Curent master:
pgbench -i -s 100
:
:
done in 70.78 s (drop tables 0.21 s, create tables 0.02 s, client-side generate 12.42 s, vacuum 51.11 s, primary keys 7.02 s).

Using FREEZE:
done in 16.86 s (drop tables 0.20 s, create tables 0.01 s, client-side generate 11.86 s, vacuum 0.25 s, primary keys 4.53 s).

As you can see total time drops from 70.78 seconds to 16.86 seconds,
that is 4.1 times faster. This is mainly because vacuum takes only
0.25 seconds after COPY FREEZE while unpatched pgbench takes 51.11
seconds, which is 204 times slower.

Thanks for the COPY FREEZE patch recently committed:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7db0cd2145f2bce84cac92402e205e4d2b045bf2

Attached is one line patch for this.

That is indeed low hanging fruit and an improvement.

-	res = PQexec(con, "copy pgbench_accounts from stdin");
+	res = PQexec(con, "copy pgbench_accounts from stdin freeze");

I think it would be better to use the official syntax and put the "freeze"
in parentheses. Perhaps the old syntax will be desupported some day.

Yours,
Laurenz Albe

#3Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tatsuo Ishii (#1)
Re: Using COPY FREEZE in pgbench

Hello Tatsuo-san,

Currently pgbench uses plain COPY to populate pgbench_accounts
table. With adding FREEZE option to COPY, the time to perform "pgbench
-i" will be significantly reduced.

Curent master:
pgbench -i -s 100
done in 70.78 s (drop tables 0.21 s, create tables 0.02 s, client-side generate 12.42 s, vacuum 51.11 s, primary keys 7.02 s).

Using FREEZE:
done in 16.86 s (drop tables 0.20 s, create tables 0.01 s, client-side generate 11.86 s, vacuum 0.25 s, primary keys 4.53 s).

That looks good!

As COPY FREEZE was introduced in 9.3, this means that loading data would
break with previous versions. Pgbench attempts at being compatible with
older versions. I'm wondering whether we should not care or if we should
attempt some compatibility layer. It seems enough to test
"PQserverVersion() >= 90300"?

--
Fabien.

#4Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Fabien COELHO (#3)
Re: Using COPY FREEZE in pgbench

Hi Fabien,

That looks good!

As COPY FREEZE was introduced in 9.3, this means that loading data
would break with previous versions. Pgbench attempts at being
compatible with older versions. I'm wondering whether we should not
care or if we should attempt some compatibility layer. It seems enough
to test "PQserverVersion() >= 90300"?

Good point.

Unfortunately with pre-14 COPY FREEZE we cannot get the speed up
effect because it requires the commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7db0cd2145f2bce84cac92402e205e4d2b045bf2
which was there only in the master branch as of Jan 17, 2021.

So I think adding "freeze" to the copy statement should only happen in
PostgreSQL 14 or later. Probably the test should be
"PQserverVersion() >= 140000" I think. Attached is the patch doing
what you suggest.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

Attachments:

pgbench_freeze_v2.difftext/x-patch; charset=us-asciiDownload+18-1
#5Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Laurenz Albe (#2)
Re: Using COPY FREEZE in pgbench
-	res = PQexec(con, "copy pgbench_accounts from stdin");
+	res = PQexec(con, "copy pgbench_accounts from stdin freeze");

I think it would be better to use the official syntax and put the "freeze"
in parentheses. Perhaps the old syntax will be desupported some day.

Agreed.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

#6Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tatsuo Ishii (#4)
Re: Using COPY FREEZE in pgbench

Hi Fabien,

That looks good!

As COPY FREEZE was introduced in 9.3, this means that loading data
would break with previous versions. Pgbench attempts at being
compatible with older versions. I'm wondering whether we should not
care or if we should attempt some compatibility layer. It seems enough
to test "PQserverVersion() >= 90300"?

Good point.

Unfortunately with pre-14 COPY FREEZE we cannot get the speed up
effect because it requires the commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7db0cd2145f2bce84cac92402e205e4d2b045bf2
which was there only in the master branch as of Jan 17, 2021.

So I think adding "freeze" to the copy statement should only happen in
PostgreSQL 14 or later. Probably the test should be
"PQserverVersion() >= 140000" I think. Attached is the patch doing
what you suggest.

I have created a CommitFest entry for this.
https://commitfest.postgresql.org/33/3034/

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

#7Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tatsuo Ishii (#6)
Re: Using COPY FREEZE in pgbench

Hello Tatsuo-san,

So I think adding "freeze" to the copy statement should only happen in
PostgreSQL 14 or later. Probably the test should be
"PQserverVersion() >= 140000" I think. Attached is the patch doing
what you suggest.

I have created a CommitFest entry for this.
https://commitfest.postgresql.org/33/3034/

My 0.02 ᅵ

After giving it some thought, ISTM that there could also be a performance
improvement with copy freeze from older version, so I'd suggest to add it
after 9.3 where the option was added, i.e. 90300.

Also, I do not think it is worth to fail on a 0 pqserverversion, just keep
the number and consider it a very old version?

Question: should there be a word about copy using the freeze option? I'd
say yes, in the section describing "g".

--
Fabien.

#8Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Fabien COELHO (#7)
Re: Using COPY FREEZE in pgbench

I have created a CommitFest entry for this.
https://commitfest.postgresql.org/33/3034/

My 0.02 €

After giving it some thought, ISTM that there could also be a
performance improvement with copy freeze from older version, so I'd
suggest to add it after 9.3 where the option was added, i.e. 90300.

You misunderstand about COPY FREEZE. Pre-13 COPY FREEZE does not
contribute a performance improvement. See discussions for more details.
/messages/by-id/CAMkU=1w3osJJ2FneELhhNRLxfZitDgp9FPHee08NT2FQFmz_pQ@mail.gmail.com
/messages/by-id/CABOikdN-ptGv0mZntrK2Q8OtfUuAjqaYMGmkdU1dCKFtUxVLrg@mail.gmail.com

Also, I do not think it is worth to fail on a 0 pqserverversion, just
keep the number and consider it a very old version?

If pqserverversion fails, then following PQ calls are likely fail
too. What's a benefit to continue after pqserverversion returns 0?

Question: should there be a word about copy using the freeze option?
I'd say yes, in the section describing "g".

Can you elaborate about "section describing "g"? I am not sure what
you mean.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

#9Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tatsuo Ishii (#8)
Re: Using COPY FREEZE in pgbench

After giving it some thought, ISTM that there could also be a
performance improvement with copy freeze from older version, so I'd
suggest to add it after 9.3 where the option was added, i.e. 90300.

You misunderstand about COPY FREEZE. Pre-13 COPY FREEZE does not

Oops. I meant Pre-14, not pre-13.

Show quoted text

contribute a performance improvement. See discussions for more details.
/messages/by-id/CAMkU=1w3osJJ2FneELhhNRLxfZitDgp9FPHee08NT2FQFmz_pQ@mail.gmail.com
/messages/by-id/CABOikdN-ptGv0mZntrK2Q8OtfUuAjqaYMGmkdU1dCKFtUxVLrg@mail.gmail.com

Also, I do not think it is worth to fail on a 0 pqserverversion, just
keep the number and consider it a very old version?

If pqserverversion fails, then following PQ calls are likely fail
too. What's a benefit to continue after pqserverversion returns 0?

Question: should there be a word about copy using the freeze option?
I'd say yes, in the section describing "g".

Can you elaborate about "section describing "g"? I am not sure what
you mean.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

#10Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tatsuo Ishii (#8)
Re: Using COPY FREEZE in pgbench

Hello,

After giving it some thought, ISTM that there could also be a
performance improvement with copy freeze from older version, so I'd
suggest to add it after 9.3 where the option was added, i.e. 90300.

You misunderstand about COPY FREEZE. Pre-13 COPY FREEZE does not
contribute a performance improvement. See discussions for more details.
/messages/by-id/CAMkU=1w3osJJ2FneELhhNRLxfZitDgp9FPHee08NT2FQFmz_pQ@mail.gmail.com
/messages/by-id/CABOikdN-ptGv0mZntrK2Q8OtfUuAjqaYMGmkdU1dCKFtUxVLrg@mail.gmail.com

Ok. ISTM that the option should be triggered as soon as it is available,
but you do as you wish.

Also, I do not think it is worth to fail on a 0 pqserverversion, just
keep the number and consider it a very old version?

If pqserverversion fails, then following PQ calls are likely fail
too. What's a benefit to continue after pqserverversion returns 0?

I'm unsure how this could happen ever. The benefit of not caring is less
lines of codes in pgbench and a later call will catch the issue anyway, if
libpq is corrupted.

Question: should there be a word about copy using the freeze option?
I'd say yes, in the section describing "g".

Can you elaborate about "section describing "g"? I am not sure what
you mean.

The "g" item in the section describing initialization steps (i.e. option
-I). I'd suggest just to replace "COPY" with "COPY FREEZE" in the
sentence.

--
Fabien.

#11Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Fabien COELHO (#10)
Re: Using COPY FREEZE in pgbench

Ok. ISTM that the option should be triggered as soon as it is
available, but you do as you wish.

Can you elaborate why you think that using COPY FREEZE before 14 is
beneficial? Or do you want to standardize to use COPY FREEZE?

I'm unsure how this could happen ever. The benefit of not caring is
less lines of codes in pgbench and a later call will catch the issue
anyway, if libpq is corrupted.

I have looked in the code of PQprotocolVersion. The only case in which
it returns 0 is there's no connection. Yes, you are right. Once the
connection has been successfuly established, there's no chance it
fails. So I agree with you.

Question: should there be a word about copy using the freeze option?
I'd say yes, in the section describing "g".

Can you elaborate about "section describing "g"? I am not sure what
you mean.

The "g" item in the section describing initialization steps
(i.e. option -I). I'd suggest just to replace "COPY" with "COPY
FREEZE" in the sentence.

Ok. The section is needed to be modified.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

#12Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tatsuo Ishii (#11)
Re: Using COPY FREEZE in pgbench

I have looked in the code of PQprotocolVersion. The only case in which
it returns 0 is there's no connection. Yes, you are right. Once the
connection has been successfuly established, there's no chance it
fails. So I agree with you.

Attached v3 patch addresses this.

The "g" item in the section describing initialization steps
(i.e. option -I). I'd suggest just to replace "COPY" with "COPY
FREEZE" in the sentence.

Ok. The section is needed to be modified.

This is also addressed in the patch.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

Attachments:

pgbench_freeze_v3.difftext/x-patch; charset=us-asciiDownload+16-1
#13Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tatsuo Ishii (#12)
Re: Using COPY FREEZE in pgbench

Hello Tatsuo-san,

I have looked in the code of PQprotocolVersion. The only case in which
it returns 0 is there's no connection. Yes, you are right. Once the
connection has been successfuly established, there's no chance it
fails. So I agree with you.

Attached v3 patch addresses this.

The "g" item in the section describing initialization steps
(i.e. option -I). I'd suggest just to replace "COPY" with "COPY
FREEZE" in the sentence.

Ok. The section is needed to be modified.

This is also addressed in the patch.

V3 works for me and looks ok. I changed it to ready in the CF app.

--
Fabien.

#14Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Fabien COELHO (#13)
Re: Using COPY FREEZE in pgbench

Hi Fabien,

Hello Tatsuo-san,

I have looked in the code of PQprotocolVersion. The only case in which
it returns 0 is there's no connection. Yes, you are right. Once the
connection has been successfuly established, there's no chance it
fails. So I agree with you.

Attached v3 patch addresses this.

The "g" item in the section describing initialization steps
(i.e. option -I). I'd suggest just to replace "COPY" with "COPY
FREEZE" in the sentence.

Ok. The section is needed to be modified.

This is also addressed in the patch.

V3 works for me and looks ok. I changed it to ready in the CF app.

Thank you for your review!
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

#15Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tatsuo Ishii (#14)
Re: Using COPY FREEZE in pgbench

V3 works for me and looks ok. I changed it to ready in the CF app.

Thank you for your review!

Unfortunately it seems cfbot is not happy with the patch.

# Failed test 'pgbench scale 1 initialization status (got 1 vs expected 0)'
# at t/001_pgbench_with_server.pl line 116.
# Failed test 'pgbench scale 1 initialization stderr /(?^:creating foreign keys)/'
# at t/001_pgbench_with_server.pl line 116.
# 'dropping old tables...
# creating tables...
# creating 2 partitions...
# creating primary keys...
# vacuuming...
# generating data (client-side)...
# 100000 of 100000 tuples (100%) done (elapsed 0.02 s, remaining 0.00 s)
# ERROR: cannot perform COPY FREEZE on a partitioned table
# pgbench: fatal: PQendcopy failed

I think pgbench needs to check whether partitioning option is
specified or not. If specified, pgbench should not use COPY
FREEZE. Attached v4 patch does this.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

Attachments:

pgbench_freeze_v4.difftext/x-patch; charset=us-asciiDownload+17-1
#16Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tatsuo Ishii (#15)
Re: Using COPY FREEZE in pgbench

V3 works for me and looks ok. I changed it to ready in the CF app.

Thank you for your review!

Unfortunately it seems cfbot is not happy with the patch.

Argh. Indeed, I did not thought of testing on a partitioned table:-( ISTM
I did "make check" in pgbench to trigger tap tests, but possibly it was in
a dream:-(

The feature is a little disappointing because if someone has partition
tables then probably they have a lot of data and probably they would like
freeze to work there. Maybe freeze would work on table partitions
themselves, but I do not think it is worth the effort to do that.

About v4 there is a typo in the doc "portioning":

<command>pgbench</command> uses FREEZE option with 14 or later
version of <productname>PostgreSQL</productname> to speed up
subsequent <command>VACUUM</command> if portioning option is not
specified.

I'd suggest:

<command>pgbench</command> uses the FREEZE option with 14 or later
version of <productname>PostgreSQL</productname> to speed up
subsequent <command>VACUUM</command>, unless partitions are enabled.

--
Fabien.

#17Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Fabien COELHO (#16)
Re: Using COPY FREEZE in pgbench

The feature is a little disappointing because if someone has partition
tables then probably they have a lot of data and probably they would
like freeze to work there. Maybe freeze would work on table partitions
themselves, but I do not think it is worth the effort to do that.

Agreed.

About v4 there is a typo in the doc "portioning":

<command>pgbench</command> uses FREEZE option with 14 or later
version of <productname>PostgreSQL</productname> to speed up
subsequent <command>VACUUM</command> if portioning option is not
specified.

I'd suggest:

<command>pgbench</command> uses the FREEZE option with 14 or later
version of <productname>PostgreSQL</productname> to speed up
subsequent <command>VACUUM</command>, unless partitions are enabled.

Thanks for pointing it out. Also I think that in "with 14 or later
version", "version" should be "versions".

Attached is the v5 patch.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

Attachments:

pgbench_freeze_v5.difftext/x-patch; charset=us-asciiDownload+16-1
#18Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tatsuo Ishii (#17)
Re: Using COPY FREEZE in pgbench

Attached is the v5 patch.

About v5: doc gen ok, global and local make check ok.

I did a few tests on my laptop. Is seems that copying takes a little more
time, say about 10%, but vacuum is indeed very significantly reduced, so
that the total time for copying and vacuuming is reduced by 10% on
overall.

So it is okay for me.

--
Fabien.

#19Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Fabien COELHO (#18)
Re: Using COPY FREEZE in pgbench

I did a few tests on my laptop. Is seems that copying takes a little
more time, say about 10%, but vacuum is indeed very significantly
reduced, so that the total time for copying and vacuuming is reduced
by 10% on overall.

So it is okay for me.

Thanks for the test.

I wrote:

Curent master:
pgbench -i -s 100
:
:
done in 70.78 s (drop tables 0.21 s, create tables 0.02 s, client-side generate 12.42 s, vacuum 51.11 s, primary keys 7.02 s).

Using FREEZE:
done in 16.86 s (drop tables 0.20 s, create tables 0.01 s, client-side generate 11.86 s, vacuum 0.25 s, primary keys 4.53 s).

As you can see total time drops from 70.78 seconds to 16.86 seconds,
that is 4.1 times faster. This is mainly because vacuum takes only
0.25 seconds after COPY FREEZE while unpatched pgbench takes 51.11
seconds, which is 204 times slower.

I did same test again.

13.2 pgbench + master branch server:
done in 15.47 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.07 s, vacuum 2.07 s, primary keys 4.13 s).

With patch on master branch:
done in 13.38 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.68 s, vacuum 0.23 s, primary keys 3.27 s).

This time current pgbench performs much faster than I wrote (15.47 s
vs. 70.78 s). I don't why.

Anyway, this time total pgbench time is reduced by 14% over all
here. I hope people agree that the patch is worth the gain.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

#20Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tatsuo Ishii (#19)
Re: Using COPY FREEZE in pgbench

Hello Tatsuo-san,

13.2 pgbench + master branch server:
done in 15.47 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.07 s, vacuum 2.07 s, primary keys 4.13 s).

With patch on master branch:
done in 13.38 s (drop tables 0.19 s, create tables 0.01 s, client-side generate 9.68 s, vacuum 0.23 s, primary keys 3.27 s).

Yes, this is the kind of figures I got on my laptop.

This time current pgbench performs much faster than I wrote (15.47 s vs.
70.78 s). I don't why.

Duno.

Anyway, this time total pgbench time is reduced by 14% over all
here. I hope people agree that the patch is worth the gain.

Yes, because (1) why not take +10% and (2) it exercises an option.

--
Fabien.

In reply to: Tatsuo Ishii (#19)
#22Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Peter Geoghegan (#21)
In reply to: Tatsuo Ishii (#22)
#24Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Peter Geoghegan (#23)
#25Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tatsuo Ishii (#24)
#26Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tatsuo Ishii (#25)
#27Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Fabien COELHO (#26)
#28Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Tatsuo Ishii (#27)
#29Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Dean Rasheed (#28)
#30Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tatsuo Ishii (#29)