pgbench - allow to create partitioned tables
Hello devs,
While doing some performance tests and reviewing patches, I needed to
create partitioned tables. Given the current syntax this is time
consumming.
The attached patch adds two options to create a partitioned "account"
table in pgbench.
It allows to answer quickly simple questions, eg "what is the overhead of
hash partitioning on a simple select on my laptop"? Answer:
# N=0..?
sh> pgench -i -s 1 --partition-number=$N --partition-type=hash
# then run
sh> pgench -S -M prepared -P 1 -T 10
# and look at latency:
# no parts = 0.071 ms
# 1 hash = 0.071 ms (did someone optimize this case?!)
# 2 hash ~ 0.126 ms (+ 0.055 ms)
# 50 hash ~ 0.155 ms
# 100 hash ~ 0.178 ms
# 150 hash ~ 0.232 ms
# 200 hash ~ 0.279 ms
# overhead ~ (0.050 + [0.0005-0.0008] * nparts) ms
--
Fabien.
Attachments:
pgbench-init-partitioned-1.patchtext/x-diff; name=pgbench-init-partitioned-1.patchDownload+151-9
On Tue, 23 Jul 2019 at 19:26, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Hello devs,
While doing some performance tests and reviewing patches, I needed to
create partitioned tables. Given the current syntax this is time
consumming.
Good idea. I wonder why we didn't have it already.
The attached patch adds two options to create a partitioned "account"
table in pgbench.It allows to answer quickly simple questions, eg "what is the overhead of
hash partitioning on a simple select on my laptop"? Answer:# N=0..?
sh> pgench -i -s 1 --partition-number=$N --partition-type=hash
Given current naming of options, I would call this
--partitions=number-of-partitions and --partition-method=hash
# then run
sh> pgench -S -M prepared -P 1 -T 10# and look at latency:
# no parts = 0.071 ms
# 1 hash = 0.071 ms (did someone optimize this case?!)
# 2 hash ~ 0.126 ms (+ 0.055 ms)
# 50 hash ~ 0.155 ms
# 100 hash ~ 0.178 ms
# 150 hash ~ 0.232 ms
# 200 hash ~ 0.279 ms
# overhead ~ (0.050 + [0.0005-0.0008] * nparts) ms
It is linear?
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Solutions for the Enterprise
Hello Simon,
While doing some performance tests and reviewing patches, I needed to
create partitioned tables. Given the current syntax this is time
consumming.Good idea. I wonder why we didn't have it already.
Probably because I did not have to create partitioned table for some
testing:-)
sh> pgench -i -s 1 --partition-number=$N --partition-type=hash
Given current naming of options, I would call this
--partitions=number-of-partitions and --partition-method=hash
Ok.
# then run
sh> pgench -S -M prepared -P 1 -T 10# and look at latency:
# no parts = 0.071 ms
# 1 hash = 0.071 ms (did someone optimize this case?!)
# 2 hash ~ 0.126 ms (+ 0.055 ms)
# 50 hash ~ 0.155 ms
# 100 hash ~ 0.178 ms
# 150 hash ~ 0.232 ms
# 200 hash ~ 0.279 ms
# overhead ~ (0.050 + [0.0005-0.0008] * nparts) msIt is linear?
Good question. I would have hoped affine, but this is not very clear on
these data, which are the median of about five runs, hence the bracket on
the slope factor. At least it is increasing with the number of partitions.
Maybe it would be clearer on the minimum of five runs.
--
Fabien.
Attachments:
pgbench-init-partitioned-2.patchtext/x-diff; name=pgbench-init-partitioned-2.patchDownload+152-9
# and look at latency:
# no parts = 0.071 ms
# 1 hash = 0.071 ms (did someone optimize this case?!)
# 2 hash ~ 0.126 ms (+ 0.055 ms)
# 50 hash ~ 0.155 ms
# 100 hash ~ 0.178 ms
# 150 hash ~ 0.232 ms
# 200 hash ~ 0.279 ms
# overhead ~ (0.050 + [0.0005-0.0008] * nparts) msIt is linear?
Good question. I would have hoped affine, but this is not very clear on these
data, which are the median of about five runs, hence the bracket on the slope
factor. At least it is increasing with the number of partitions. Maybe it
would be clearer on the minimum of five runs.
Here is a fellow up.
On the minimum of all available runs the query time on hash partitions is
about:
0.64375 nparts + 118.30979 (in ᅵs).
So the overhead is about 47.30979 + 0.64375 nparts, and it is indeed
pretty convincingly linear as suggested by the attached figure.
--
Fabien.
Attachments:
regression.pngimage/png; name=regression.pngDownload
Attached v3 fixes strcasecmp non portability on windows, per postgresql
patch tester.
--
Fabien.
Attachments:
pgbench-init-partitioned-3.patchtext/x-diff; name=pgbench-init-partitioned-3.patchDownload+152-9
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested
Hi,
The patch looks good to me, Just one suggestion --partition-method option should be made dependent on --partitions, because it has no use unless used with --partitions. What do you think?
Regards,
Asif
The new status of this patch is: Waiting on Author
Just one suggestion --partition-method option should be made dependent
on --partitions, because it has no use unless used with --partitions.
What do you think?
Why not. V4 attached.
--
Fabien.
Attachments:
pgbench-init-partitioned-4.patchtext/x-diff; name=pgbench-init-partitioned-4.patchDownload+165-8
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested
Thanks. All looks good, making it ready for committer.
Regards,
Asif
The new status of this patch is: Ready for Committer
On Mon, Aug 26, 2019 at 11:04 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Just one suggestion --partition-method option should be made dependent
on --partitions, because it has no use unless used with --partitions.
What do you think?
Some comments:
*
+ case 11: /* partitions */
+ initialization_option_set = true;
+ partitions = atoi(optarg);
+ if (partitions < 0)
+ {
+ fprintf(stderr, "invalid number of partitions: \"%s\"\n",
+ optarg);
+ exit(1);
+ }
+ break;
Is there a reason why we treat "partitions = 0" as a valid value?
Also, shouldn't we keep some max limit for this parameter as well?
Forex. how realistic it will be if the user gives the value of
partitions the same or greater than the number of rows in
pgbench_accounts table? I understand it is not sensible to give such
a value, but I guess the API should behave sanely in such cases as
well. I am not sure what will be the good max value for it, but I
think there should be one. Anyone else have any better suggestions
for this?
*
@@ -3625,6 +3644,7 @@ initCreateTables(PGconn *con)
const char *bigcols; /* column decls if accountIDs are 64 bits */
int declare_fillfactor;
};
+
static const struct ddlinfo DDLs[] = {
Spurious line change.
*
+ " --partitions=NUM partition account table in NUM parts
(defaults: 0)\n"
+ " --partition-method=(range|hash)\n"
+ " partition account table with this
method (default: range)\n"
Refer complete table name like pgbench_accounts instead of just
account. It will be clear and in sync with what we display in some
other options like --skip-some-updates.
*
+ " --partitions=NUM partition account table in NUM parts
(defaults: 0)\n"
/defaults/default.
*
I think we should print the information about partitions in
printResults. It can help users while analyzing results.
*
+enum { PART_NONE, PART_RANGE, PART_HASH }
+ partition_method = PART_NONE;
+
I think it is better to follow the style of QueryMode enum by using
typedef here, that will make look code in sync with nearby code.
*
- int i;
fprintf(stderr, "creating tables...\n");
- for (i = 0; i < lengthof(DDLs); i++)
+ for (int i = 0; i < lengthof(DDLs); i++)
This is unnecessary change as far as this patch is concerned. I
understand there is no problem in writing either way, but let's not
change the coding pattern here as part of this patch.
*
+ if (partitions >= 1)
+ {
+ int64 part_size = (naccounts * (int64) scale + partitions - 1) / partitions;
+ char ff[64];
+ ff[0] = '\0';
+ append_fillfactor(ff, sizeof(ff));
+
+ fprintf(stderr, "creating %d partitions...\n", partitions);
+
+ for (int p = 1; p <= partitions; p++)
+ {
+ char query[256];
+
+ if (partition_method == PART_RANGE)
+ {
part_size can be defined inside "if (partition_method == PART_RANGE)"
as it is used here. In general, this part of the code can use some
comments.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Hello Amit,
Thanks for the feedback.
+ case 11: /* partitions */ + initialization_option_set = true; + partitions = atoi(optarg); + if (partitions < 0) + { + fprintf(stderr, "invalid number of partitions: \"%s\"\n", + optarg); + exit(1); + } + break;Is there a reason why we treat "partitions = 0" as a valid value?
Yes. It is an explicit "do not create partitioned tables", which differ
from 1 which says "create a partitionned table with just one partition".
Also, shouldn't we keep some max limit for this parameter as well?
I do not think so. If someone wants to test how terrible it is to use
100000 partitions, we should not prevent it.
Forex. how realistic it will be if the user gives the value of
partitions the same or greater than the number of rows in
pgbench_accounts table?
Although I agree that it does not make much sense, for testing purposes
why not, to test overheads in critical cases for instance.
I understand it is not sensible to give such a value, but I guess the
API should behave sanely in such cases as well.
Yep, it should work.
I am not sure what will be the good max value for it, but I
think there should be one.
I disagree. Pgbench is a tool for testing performance for given
parameters. If postgres accepts a parameter there is no reason why pgbench
should reject it.
@@ -3625,6 +3644,7 @@ initCreateTables(PGconn *con)
const char *bigcols; /* column decls if accountIDs are 64 bits */
int declare_fillfactor;
};
+
static const struct ddlinfo DDLs[] = {Spurious line change.
Indeed.
* + " --partitions=NUM partition account table in NUM parts (defaults: 0)\n" + " --partition-method=(range|hash)\n" + " partition account table with this method (default: range)\n"Refer complete table name like pgbench_accounts instead of just account.
It will be clear and in sync with what we display in some other options
like --skip-some-updates.
Ok.
*
+ " --partitions=NUM partition account table in NUM parts
(defaults: 0)\n"/defaults/default.
Ok.
I think we should print the information about partitions in
printResults. It can help users while analyzing results.
Hmmm. Why not, with some hocus-pocus to get the information out of
pg_catalog, and trying to fail gracefully so that if pgbench is run
against a no partitioning-support version.
* +enum { PART_NONE, PART_RANGE, PART_HASH } + partition_method = PART_NONE; +I think it is better to follow the style of QueryMode enum by using
typedef here, that will make look code in sync with nearby code.
Hmmm. Why not. This means inventing a used-once type name for
partition_method. My great creativity lead to partition_method_t.
*
- int i;fprintf(stderr, "creating tables...\n");
- for (i = 0; i < lengthof(DDLs); i++) + for (int i = 0; i < lengthof(DDLs); i++)This is unnecessary change as far as this patch is concerned. I
understand there is no problem in writing either way, but let's not
change the coding pattern here as part of this patch.
The reason I did that is that I had a stupid bug in a development version
which was due to an accidental reuse of this index, which would have been
prevented by this declaration style. Removed anyway.
+ if (partitions >= 1) + { + int64 part_size = (naccounts * (int64) scale + partitions - 1) / partitions; + char ff[64]; + ff[0] = '\0'; + append_fillfactor(ff, sizeof(ff)); + + fprintf(stderr, "creating %d partitions...\n", partitions); + + for (int p = 1; p <= partitions; p++) + { + char query[256]; + + if (partition_method == PART_RANGE) + {part_size can be defined inside "if (partition_method == PART_RANGE)"
as it is used here.
I just wanted to avoid recomputing the value in the loop, but indeed it
may be computed needlessly. Moved.
In general, this part of the code can use some comments.
Ok.
Attached an updated version.
--
Fabien.
Attachments:
pgbench-init-partitioned-5.patchtext/x-diff; name=pgbench-init-partitioned-5.patchDownload+218-6
On Wed, Sep 11, 2019 at 6:08 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Attached an updated version.
I have reviewed the patch and done some basic testing. It works as
per the expectation
I have a few cosmetic comments
1.
+ if (partitions >= 1)
+ {
+ char ff[64];
+ ff[0] = '\0';
+ append_fillfactor(ff, sizeof(ff));
Generally, we give one blank line between the variable declaration and
the first statement of the block.
2.
+ if (p == 1)
+ sprintf(minvalue, "minvalue");
+ else
+ sprintf(minvalue, INT64_FORMAT, (p-1) * part_size + 1);
(p-1) -> (p - 1)
I am just wondering will it be a good idea to expand it to support
multi-level partitioning?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Sep 11, 2019 at 6:08 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
I would like to take inputs from others as well for the display part
of this patch. After this patch, for a simple-update pgbench test,
the changed output will be as follows (note: partition method and
partitions):
pgbench.exe -c 4 -j 4 -T 10 -N postgres
starting vacuum...end.
transaction type: <builtin: simple update>
scaling factor: 1
partition method: hash
partitions: 3
query mode: simple
number of clients: 4
number of threads: 4
duration: 10 s
number of transactions actually processed: 14563
latency average = 2.749 ms
tps = 1454.899150 (including connections establishing)
tps = 1466.689412 (excluding connections establishing)
What do others think about this? This will be the case when the user
has used --partitions option in pgbench, otherwise, it won't change.
+ case 11: /* partitions */ + initialization_option_set = true; + partitions = atoi(optarg); + if (partitions < 0) + { + fprintf(stderr, "invalid number of partitions: \"%s\"\n", + optarg); + exit(1); + } + break;Is there a reason why we treat "partitions = 0" as a valid value?
Yes. It is an explicit "do not create partitioned tables", which differ
from 1 which says "create a partitionned table with just one partition".
Why would anyone want to use --partitions option in the first case
("do not create partitioned tables")?
I think we should print the information about partitions in
printResults. It can help users while analyzing results.Hmmm. Why not, with some hocus-pocus to get the information out of
pg_catalog, and trying to fail gracefully so that if pgbench is run
against a no partitioning-support version.
+ res = PQexec(con,
+ "select p.partstrat, count(*) "
+ "from pg_catalog.pg_class as c "
+ "left join pg_catalog.pg_partitioned_table as p on (p.partrelid = c.oid) "
+ "left join pg_catalog.pg_inherits as i on (c.oid = i.inhparent) "
+ "where c.relname = 'pgbench_accounts' "
+ "group by 1, c.oid");
Can't we write this query with inner join instead of left join? What
additional purpose you are trying to serve by using left join?
* +enum { PART_NONE, PART_RANGE, PART_HASH } + partition_method = PART_NONE; +I think it is better to follow the style of QueryMode enum by using
typedef here, that will make look code in sync with nearby code.Hmmm. Why not. This means inventing a used-once type name for
partition_method. My great creativity lead to partition_method_t.
+partition_method_t partition_method = PART_NONE;
It is better to make this static.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Hello Dilip,
Generally, we give one blank line between the variable declaration and
the first statement of the block.
Ok.
(p-1) -> (p - 1)
Ok.
I am just wondering will it be a good idea to expand it to support
multi-level partitioning?
ISTM that how the user could specify multi-level parameters is pretty
unclear, so I would let that as a possible extension if someone wants it
enough.
Attached v6 implements the two cosmetic changes outlined above.
--
Fabien.
Attachments:
pgbench-init-partitioned-6.patchtext/x-diff; name=pgbench-init-partitioned-6.patchDownload+219-6
On Fri, Sep 13, 2019 at 1:35 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Thanks for the updated version of the patch.
Generally, we give one blank line between the variable declaration and
the first statement of the block.Ok.
(p-1) -> (p - 1)
Ok.
I am just wondering will it be a good idea to expand it to support
multi-level partitioning?ISTM that how the user could specify multi-level parameters is pretty
unclear, so I would let that as a possible extension if someone wants it
enough.
Ok
Attached v6 implements the two cosmetic changes outlined above.
+ /* For RANGE, we use open-ended partitions at the beginning and end */
+ if (p == 1)
+ sprintf(minvalue, "minvalue");
+ else
+ sprintf(minvalue, INT64_FORMAT, (p-1) * part_size + 1);
+
+ if (p < partitions)
+ sprintf(maxvalue, INT64_FORMAT, p * part_size + 1);
+ else
+ sprintf(maxvalue, "maxvalue");
I do not understand the reason why first partition need to be
open-ended? Because we are clear that the minimum value of the aid is
1 in pgbench_accout. So if you directly use
sprintf(minvalue, INT64_FORMAT, (p-1) * part_size + 1); then also it
will give 1 as minvalue for the first partition and that will be the
right thing to do. Am I missing something here?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Hello Amit,
Is there a reason why we treat "partitions = 0" as a valid value?
Yes. It is an explicit "do not create partitioned tables", which differ
from 1 which says "create a partitionned table with just one partition".Why would anyone want to use --partitions option in the first case
("do not create partitioned tables")?
Having an explicit value for the default is generally a good idea, eg for
a script to tests various partitioning settings:
for nparts in 0 1 2 3 4 5 6 7 8 9 ; do
pgbench -i --partitions=$nparts ... ;
...
done
Otherwise you would need significant kludging to add/remove the option.
Allowing 0 does not harm anyone.
Now if the consensus is to remove an explicit 0, it is simple enough to
change it, but my opinion is that it is better to have it.
I think we should print the information about partitions in
printResults. It can help users while analyzing results.+ res = PQexec(con, + "select p.partstrat, count(*) " + "from pg_catalog.pg_class as c " + "left join pg_catalog.pg_partitioned_table as p on (p.partrelid = c.oid) " + "left join pg_catalog.pg_inherits as i on (c.oid = i.inhparent) " + "where c.relname = 'pgbench_accounts' " + "group by 1, c.oid");Can't we write this query with inner join instead of left join? What
additional purpose you are trying to serve by using left join?
I'm ensuring that there is always a one line answer, whether it is
partitioned or not. Maybe the count(*) should be count(something in p) to
get 0 instead of 1 on non partitioned tables, though, but this is hidden
in the display anyway.
+partition_method_t partition_method = PART_NONE;
It is better to make this static.
I do agree, but this would depart from all other global variables around
which are currently not static. Maybe a separate patch could turn them all
as static, but ISTM that this patch should not change the current style.
--
Fabien.
Hello Dilip,
+ /* For RANGE, we use open-ended partitions at the beginning and end */ + if (p == 1) + sprintf(minvalue, "minvalue"); + else + sprintf(minvalue, INT64_FORMAT, (p-1) * part_size + 1); + + if (p < partitions) + sprintf(maxvalue, INT64_FORMAT, p * part_size + 1); + else + sprintf(maxvalue, "maxvalue");I do not understand the reason why first partition need to be
open-ended? Because we are clear that the minimum value of the aid is 1
in pgbench_accout. So if you directly use sprintf(minvalue,
INT64_FORMAT, (p-1) * part_size + 1); then also it will give 1 as
minvalue for the first partition and that will be the right thing to do.
Am I missing something here?
This is simply for the principle that any value allowed for the primary
key type has a corresponding partition, and also that it exercices these
special values.
It also probably reduces the cost of checking whether a value belongs to
the first partition because one test is removed, so there is a small
additional performance benefit beyond principle and coverage.
--
Fabien.
On Fri, Sep 13, 2019 at 2:05 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Hello Dilip,
+ /* For RANGE, we use open-ended partitions at the beginning and end */ + if (p == 1) + sprintf(minvalue, "minvalue"); + else + sprintf(minvalue, INT64_FORMAT, (p-1) * part_size + 1); + + if (p < partitions) + sprintf(maxvalue, INT64_FORMAT, p * part_size + 1); + else + sprintf(maxvalue, "maxvalue");I do not understand the reason why first partition need to be
open-ended? Because we are clear that the minimum value of the aid is 1
in pgbench_accout. So if you directly use sprintf(minvalue,
INT64_FORMAT, (p-1) * part_size + 1); then also it will give 1 as
minvalue for the first partition and that will be the right thing to do.
Am I missing something here?
This is simply for the principle that any value allowed for the primary
key type has a corresponding partition, and also that it exercices these
special values.
IMHO, the primary key values for the pgbench_accout tables are always
within the defined range minvalue=1 and maxvalue=scale*100000, right?
It also probably reduces the cost of checking whether a value belongs to
the first partition because one test is removed, so there is a small
additional performance benefit beyond principle and coverage.
Ok, I agree that it will slightly reduce the cost for the tuple
falling in the first and the last partition.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Fri, Sep 13, 2019 at 1:50 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Is there a reason why we treat "partitions = 0" as a valid value?
Yes. It is an explicit "do not create partitioned tables", which differ
from 1 which says "create a partitionned table with just one partition".Why would anyone want to use --partitions option in the first case
("do not create partitioned tables")?Having an explicit value for the default is generally a good idea, eg for
a script to tests various partitioning settings:for nparts in 0 1 2 3 4 5 6 7 8 9 ; do
pgbench -i --partitions=$nparts ... ;
...
doneOtherwise you would need significant kludging to add/remove the option.
Allowing 0 does not harm anyone.Now if the consensus is to remove an explicit 0, it is simple enough to
change it, but my opinion is that it is better to have it.
Fair enough, let us see if anyone else wants to weigh in.
I think we should print the information about partitions in
printResults. It can help users while analyzing results.+ res = PQexec(con, + "select p.partstrat, count(*) " + "from pg_catalog.pg_class as c " + "left join pg_catalog.pg_partitioned_table as p on (p.partrelid = c.oid) " + "left join pg_catalog.pg_inherits as i on (c.oid = i.inhparent) " + "where c.relname = 'pgbench_accounts' " + "group by 1, c.oid");Can't we write this query with inner join instead of left join? What
additional purpose you are trying to serve by using left join?I'm ensuring that there is always a one line answer, whether it is
partitioned or not. Maybe the count(*) should be count(something in p) to
get 0 instead of 1 on non partitioned tables, though, but this is hidden
in the display anyway.
Sure, but I feel the code will be simplified. I see no reason for
using left join here.
+partition_method_t partition_method = PART_NONE;
It is better to make this static.
I do agree, but this would depart from all other global variables around
which are currently not static.
Check QueryMode.
Maybe a separate patch could turn them all
as static, but ISTM that this patch should not change the current style.
No need to change others, but we can do it for this one.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 2019-Sep-13, Amit Kapila wrote:
On Fri, Sep 13, 2019 at 1:50 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Is there a reason why we treat "partitions = 0" as a valid value?
Yes. It is an explicit "do not create partitioned tables", which differ
from 1 which says "create a partitionned table with just one partition".Why would anyone want to use --partitions option in the first case
("do not create partitioned tables")?Having an explicit value for the default is generally a good idea, eg for
a script to tests various partitioning settings:for nparts in 0 1 2 3 4 5 6 7 8 9 ; do
pgbench -i --partitions=$nparts ... ;
...
doneOtherwise you would need significant kludging to add/remove the option.
Allowing 0 does not harm anyone.Now if the consensus is to remove an explicit 0, it is simple enough to
change it, but my opinion is that it is better to have it.Fair enough, let us see if anyone else wants to weigh in.
It seems convenient UI -- I vote to keep it.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Sep-13, Amit Kapila wrote:
I would like to take inputs from others as well for the display part
of this patch. After this patch, for a simple-update pgbench test,
the changed output will be as follows (note: partition method and
partitions):
pgbench.exe -c 4 -j 4 -T 10 -N postgres
starting vacuum...end.
transaction type: <builtin: simple update>
scaling factor: 1
partition method: hash
partitions: 3
query mode: simple
number of clients: 4
number of threads: 4
duration: 10 s
number of transactions actually processed: 14563
latency average = 2.749 ms
tps = 1454.899150 (including connections establishing)
tps = 1466.689412 (excluding connections establishing)What do others think about this? This will be the case when the user
has used --partitions option in pgbench, otherwise, it won't change.
I wonder what's the intended usage of this output ... it seems to be
getting a bit too long. Is this intended for machine processing? I
would rather have more things per line in a more compact header.
But then I'm not the kind of person who automates multiple pgbench runs.
Maybe we can get some input from Tomas, who does -- how do you automate
extracting data from collected pgbench output, or do you instead just
redirect the output to a file whose path/name indicates the parameters
that were used? (I do the latter.)
I mean, if we changed it like this (and I'm not proposing to do it in
this patch, this is only an example), would it bother anyone?
$ pgbench -x -y -z ...
starting vacuum...end.
scaling factor: 1 partition method: hash partitions: 1
transaction type: <builtin: simple update> query mode: simple
number of clients: 4 number of threads: 4 duration: 10s
number of transactions actually processed: 14563
latency average = 2.749 ms
tps = 1454.899150 (including connections establishing)
tps = 1466.689412 (excluding connections establishing)
If this output doesn't bother people, then I suggest that this patch
should put the partition information in the line together with scaling
factor.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services