pgbench -i order of vacuum

Started by Jeff Janesover 13 years ago5 messages
#1Jeff Janes
jeff.janes@gmail.com
1 attachment(s)

Is there a reason to vacuum the pgbench_* tables after the indexes on
them are built, rather than before?

Since the indexes are on fresh tables, they can't have anything that
needs to be cleaned.

I don't think the current order accomplishes anything, except to slow
down large initializations by ~25%.

The attached patch moves the vacuums up.

I also made -n skip the vacuums altogether. Since -n is allowed under
-i, it would be nice if it did something, and there is only one
intuitive thing for it to do. I don't know what the use case for is,
but I think I've heard grumbling about it before.

Cheers,

Jeff

Attachments:

pgbench_vacuum_order_v1.patchapplication/octet-stream; name=pgbench_vacuum_order_v1.patchDownload
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
new file mode 100644
index 7aa4750..67ce7d0
*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
*************** usage(void)
*** 372,378 ****
  		   "  -l           write transaction times to log file\n"
  		   "  -M simple|extended|prepared\n"
  		   "               protocol for submitting queries to server (default: simple)\n"
! 		   "  -n           do not run VACUUM before tests\n"
  		   "  -N           do not update tables \"pgbench_tellers\" and \"pgbench_branches\"\n"
  		   "  -r           report average latency per command\n"
  		   "  -s NUM       report this scale factor in output\n"
--- 372,378 ----
  		   "  -l           write transaction times to log file\n"
  		   "  -M simple|extended|prepared\n"
  		   "               protocol for submitting queries to server (default: simple)\n"
! 		   "  -n           do not run VACUUM before tests, (or after initialization if given with -i)\n"
  		   "  -N           do not update tables \"pgbench_tellers\" and \"pgbench_branches\"\n"
  		   "  -r           report average latency per command\n"
  		   "  -s NUM       report this scale factor in output\n"
*************** disconnect_all(CState *state, int length
*** 1290,1296 ****
  
  /* create tables and setup data */
  static void
! init(void)
  {
  	/*
  	 * Note: TPC-B requires at least 100 bytes per row, and the "filler"
--- 1290,1296 ----
  
  /* create tables and setup data */
  static void
! init(bool is_no_vacuum)
  {
  	/*
  	 * Note: TPC-B requires at least 100 bytes per row, and the "filler"
*************** init(void)
*** 1441,1446 ****
--- 1441,1455 ----
  	}
  	executeStatement(con, "commit");
  
+ 	/* vacuum */
+ 	if (!is_no_vacuum) {
+ 		fprintf(stderr, "vacuum...\n");
+ 		executeStatement(con, "vacuum analyze pgbench_branches");
+ 		executeStatement(con, "vacuum analyze pgbench_tellers");
+ 		executeStatement(con, "vacuum analyze pgbench_accounts");
+ 		executeStatement(con, "vacuum analyze pgbench_history");
+ 	}
+ 
  	/*
  	 * create indexes
  	 */
*************** init(void)
*** 1477,1488 ****
  		}
  	}
  
- 	/* vacuum */
- 	fprintf(stderr, "vacuum...");
- 	executeStatement(con, "vacuum analyze pgbench_branches");
- 	executeStatement(con, "vacuum analyze pgbench_tellers");
- 	executeStatement(con, "vacuum analyze pgbench_accounts");
- 	executeStatement(con, "vacuum analyze pgbench_history");
  
  	fprintf(stderr, "done.\n");
  	PQfinish(con);
--- 1486,1491 ----
*************** main(int argc, char **argv)
*** 2162,2168 ****
  
  	if (is_init_mode)
  	{
! 		init();
  		exit(0);
  	}
  
--- 2165,2171 ----
  
  	if (is_init_mode)
  	{
! 		init(is_no_vacuum);
  		exit(0);
  	}
  
#2Amit Kapila
amit.kapila@huawei.com
In reply to: Jeff Janes (#1)
Re: pgbench -i order of vacuum

From: pgsql-hackers-owner@postgresql.org

[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Jeff Janes

Sent: Friday, July 20, 2012 5:36 AM

Is there a reason to vacuum the pgbench_* tables after the indexes on them

are built, rather than before?

Since the indexes are on fresh tables, they can't have anything that needs

to be cleaned.

The command it executes is "vacuum analyze ..", so it will do analyze also
on table which means
it will collect stats corresponding to table and index. So if you do it
before creation of index pgbench might behave
different.
In specific, from function do_analyze_rel(), it will not call
compute_index_stats() if you execute the command before
Creation of index.

With Regards,
Amit Kapila.

#3Jeff Janes
jeff.janes@gmail.com
In reply to: Jeff Janes (#1)
Re: pgbench -i order of vacuum

On Fri, Jul 20, 2012 at 7:57 AM, Amit Kapila <amit.kapila@huawei.com> wrote:

From: pgsql-hackers-owner@postgresql.org

[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Jeff Janes

Sent: Friday, July 20, 2012 5:36 AM

Is there a reason to vacuum the pgbench_* tables after the indexes on them

are built, rather than before?

Since the indexes are on fresh tables, they can't have anything that needs

to be cleaned.

The command it executes is "vacuum analyze ..", so it will do analyze also
on table which means
it will collect stats corresponding to table and index.

Are there stats collected on indexes? I thought all stats were on
tables only, and the only reason to visit the index was to remove
all-visible-dead entries.

Cheers,

Jeff

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Janes (#3)
Re: pgbench -i order of vacuum

Jeff Janes <jeff.janes@gmail.com> writes:

On Fri, Jul 20, 2012 at 7:57 AM, Amit Kapila <amit.kapila@huawei.com> wrote:

The command it executes is "vacuum analyze ..", so it will do analyze also
on table which means
it will collect stats corresponding to table and index.

Are there stats collected on indexes?

Only for expression indexes, which there aren't any of in the standard
pgbench scenario. I don't see a reason not to change the ordering
as you suggest.

regards, tom lane

#5Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#4)
Re: pgbench -i order of vacuum

On Fri, Jul 20, 2012 at 12:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jeff Janes <jeff.janes@gmail.com> writes:

On Fri, Jul 20, 2012 at 7:57 AM, Amit Kapila <amit.kapila@huawei.com> wrote:

The command it executes is "vacuum analyze ..", so it will do analyze also
on table which means
it will collect stats corresponding to table and index.

Are there stats collected on indexes?

Only for expression indexes, which there aren't any of in the standard
pgbench scenario. I don't see a reason not to change the ordering
as you suggest.

OK, done.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company