patch - per-tablespace random_page_cost/seq_page_cost

Started by Robert Haas, about 16 years ago, 30 messages

robertmhaas@gmail.com

about 16 years ago

1 attachment(s)

Well, I was regretting missing the deadline for this CommitFest and
then realized today was only the 14th, so I finished this up while the
kids were napping.

I ended up not reusing the reloptions.c code. It looks like a lot of
extra complexity for no obvious benefit, considering that there is no
equivalent of AMs for tablespaces and therefore no need to support
AM-specific options. I did reuse the reloptions syntax, and I think
the internal representation could always be redone later, if we find
that there's a use case for something more complicated.
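
A minimal sketch of the internal representation the attached patch uses, offered as an illustration rather than as code from the patch: each tablespace row stores the two costs as plain floats, a negative value (the patch writes -1.0) means "not set", and the planner falls back to the global settings in that case. The struct `SpcCosts`, the function `sketch_get_page_costs`, and the two `global_*` variables are hypothetical stand-ins for the pg_tablespace tuple, `get_tablespace_page_costs`, and the GUC variables; the syscache lookup is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the global GUC values (PostgreSQL defaults). */
static double global_seq_page_cost = 1.0;
static double global_random_page_cost = 4.0;

/* Hypothetical, simplified view of the two new pg_tablespace columns. */
typedef struct SpcCosts
{
	double		seq_page_cost;		/* negative means "not set" */
	double		random_page_cost;	/* negative means "not set" */
} SpcCosts;

/*
 * Fill in the effective page costs for one tablespace.  Either output
 * pointer may be NULL if the caller does not need that value.  A NULL
 * spc models a failed catalog lookup: the globals are returned as-is.
 */
static void
sketch_get_page_costs(const SpcCosts *spc,
					  double *random_cost, double *seq_cost)
{
	/* The global settings are assumed to be valid, non-negative costs. */
	assert(global_random_page_cost >= 0.0 && global_seq_page_cost >= 0.0);

	/* Start from the global settings so outputs are always initialized. */
	if (random_cost)
		*random_cost = global_random_page_cost;
	if (seq_cost)
		*seq_cost = global_seq_page_cost;

	if (spc == NULL)
		return;

	/* Apply only the overrides that are actually set (non-negative). */
	if (random_cost && spc->random_page_cost >= 0.0)
		*random_cost = spc->random_page_cost;
	if (seq_cost && spc->seq_page_cost >= 0.0)
		*seq_cost = spc->seq_page_cost;
}
```

With both overrides unset, the helper hands back the global 4.0 and 1.0; setting only the random cost to 1.5 changes that one value and leaves the sequential cost at the global setting. Using a sentinel keeps RESET trivial (write -1.0 back), at the price of not being able to tell "unset" apart from any other negative value.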

...Robert

Attachments:

spcoptions.patch (text/x-diff; charset=US-ASCII)

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 1935,1940 **** archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows
--- 1935,1943 ----
         <para>
          Sets the planner's estimate of the cost of a disk page fetch
          that is part of a series of sequential fetches.  The default is 1.0.
+         This value can be overridden for a particular tablespace by setting
+         the tablespace parameter of the same name
+         (see <xref linkend="sql-altertablespace">).
         </para>
        </listitem>
       </varlistentry>
***************
*** 1948,1953 **** archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows
--- 1951,1962 ----
         <para>
          Sets the planner's estimate of the cost of a
          non-sequentially-fetched disk page.  The default is 4.0.
+         This value can be overridden for a particular tablespace by setting
+         the tablespace parameter of the same name
+         (see <xref linkend="sql-altertablespace">).
+        </para>
+ 
+ 	   <para>
          Reducing this value relative to <varname>seq_page_cost</>
          will cause the system to prefer index scans; raising it will
          make index scans look relatively more expensive.  You can raise
*** a/doc/src/sgml/ref/alter_tablespace.sgml
--- b/doc/src/sgml/ref/alter_tablespace.sgml
***************
*** 23,28 **** PostgreSQL documentation
--- 23,30 ----
  <synopsis>
  ALTER TABLESPACE <replaceable>name</replaceable> RENAME TO <replaceable>new_name</replaceable>
  ALTER TABLESPACE <replaceable>name</replaceable> OWNER TO <replaceable>new_owner</replaceable>
+ ALTER TABLESPACE <replaceable>name</replaceable> SET ( <replaceable class="PARAMETER">tablespace_option</replaceable> = <replaceable class="PARAMETER">value</replaceable> [, ... ] )
+ ALTER TABLESPACE <replaceable>name</replaceable> RESET ( <replaceable class="PARAMETER">tablespace_option</replaceable> [, ... ] )
  </synopsis>
   </refsynopsisdiv>
    
***************
*** 74,79 **** ALTER TABLESPACE <replaceable>name</replaceable> OWNER TO <replaceable>new_owner
--- 76,99 ----
       </para>
      </listitem>
     </varlistentry>
+ 
+    <varlistentry>
+     <term><replaceable class="parameter">tablespace_option</replaceable></term>
+     <listitem>
+      <para>
+       A tablespace parameter to be set or reset.  Currently, the only
+       available parameters are <varname>seq_page_cost</> and
+       <varname>random_page_cost</>.  Setting either value for a particular
+       tablespace will override the planner's usual estimate of the cost of
+       reading pages from tables in that tablespace, as established by
+       the configuration parameters of the same name (see
+       <xref linkend="guc-seq-page-cost">,
+       <xref linkend="guc-random-page-cost">).  This may be useful if one
+       tablespace is located on a disk which is faster or slower than the
+       remainder of the I/O subsystem.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
   </refsect1>
  
*** a/src/backend/catalog/aclchk.c
--- b/src/backend/catalog/aclchk.c
***************
*** 2621,2638 **** ExecGrant_Tablespace(InternalGrant *istmt)
  		int			nnewmembers;
  		Oid		   *oldmembers;
  		Oid		   *newmembers;
- 		ScanKeyData entry[1];
- 		SysScanDesc scan;
  		HeapTuple	tuple;
  
! 		/* There's no syscache for pg_tablespace, so must look the hard way */
! 		ScanKeyInit(&entry[0],
! 					ObjectIdAttributeNumber,
! 					BTEqualStrategyNumber, F_OIDEQ,
! 					ObjectIdGetDatum(tblId));
! 		scan = systable_beginscan(relation, TablespaceOidIndexId, true,
! 								  SnapshotNow, 1, entry);
! 		tuple = systable_getnext(scan);
  		if (!HeapTupleIsValid(tuple))
  			elog(ERROR, "cache lookup failed for tablespace %u", tblId);
  
--- 2621,2631 ----
  		int			nnewmembers;
  		Oid		   *oldmembers;
  		Oid		   *newmembers;
  		HeapTuple	tuple;
  
! 		/* Search syscache for pg_tablespace */
! 		tuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(tblId),
! 							   0, 0, 0);
  		if (!HeapTupleIsValid(tuple))
  			elog(ERROR, "cache lookup failed for tablespace %u", tblId);
  
***************
*** 2703,2709 **** ExecGrant_Tablespace(InternalGrant *istmt)
  							  noldmembers, oldmembers,
  							  nnewmembers, newmembers);
  
! 		systable_endscan(scan);
  
  		pfree(new_acl);
  
--- 2696,2702 ----
  							  noldmembers, oldmembers,
  							  nnewmembers, newmembers);
  
! 		ReleaseSysCache(tuple);
  
  		pfree(new_acl);
  
***************
*** 3443,3451 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  					  AclMode mask, AclMaskHow how)
  {
  	AclMode		result;
- 	Relation	pg_tablespace;
- 	ScanKeyData entry[1];
- 	SysScanDesc scan;
  	HeapTuple	tuple;
  	Datum		aclDatum;
  	bool		isNull;
--- 3436,3441 ----
***************
*** 3458,3474 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  
  	/*
  	 * Get the tablespace's ACL from pg_tablespace
! 	 *
! 	 * There's no syscache for pg_tablespace, so must look the hard way
! 	 */
! 	pg_tablespace = heap_open(TableSpaceRelationId, AccessShareLock);
! 	ScanKeyInit(&entry[0],
! 				ObjectIdAttributeNumber,
! 				BTEqualStrategyNumber, F_OIDEQ,
! 				ObjectIdGetDatum(spc_oid));
! 	scan = systable_beginscan(pg_tablespace, TablespaceOidIndexId, true,
! 							  SnapshotNow, 1, entry);
! 	tuple = systable_getnext(scan);
  	if (!HeapTupleIsValid(tuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
--- 3448,3456 ----
  
  	/*
  	 * Get the tablespace's ACL from pg_tablespace
! 	 */	
! 	tuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(spc_oid),
! 						   0, 0, 0);
  	if (!HeapTupleIsValid(tuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
***************
*** 3476,3483 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  
  	ownerId = ((Form_pg_tablespace) GETSTRUCT(tuple))->spcowner;
  
! 	aclDatum = heap_getattr(tuple, Anum_pg_tablespace_spcacl,
! 							RelationGetDescr(pg_tablespace), &isNull);
  
  	if (isNull)
  	{
--- 3458,3466 ----
  
  	ownerId = ((Form_pg_tablespace) GETSTRUCT(tuple))->spcowner;
  
! 	aclDatum = SysCacheGetAttr(TABLESPACEOID, tuple,
! 								   Anum_pg_tablespace_spcacl,
! 								   &isNull);
  
  	if (isNull)
  	{
***************
*** 3497,3504 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  	if (acl && (Pointer) acl != DatumGetPointer(aclDatum))
  		pfree(acl);
  
! 	systable_endscan(scan);
! 	heap_close(pg_tablespace, AccessShareLock);
  
  	return result;
  }
--- 3480,3486 ----
  	if (acl && (Pointer) acl != DatumGetPointer(aclDatum))
  		pfree(acl);
  
! 	ReleaseSysCache(tuple);
  
  	return result;
  }
***************
*** 4025,4033 **** pg_namespace_ownercheck(Oid nsp_oid, Oid roleid)
  bool
  pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
  {
- 	Relation	pg_tablespace;
- 	ScanKeyData entry[1];
- 	SysScanDesc scan;
  	HeapTuple	spctuple;
  	Oid			spcowner;
  
--- 4007,4012 ----
***************
*** 4035,4051 **** pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
  	if (superuser_arg(roleid))
  		return true;
  
! 	/* There's no syscache for pg_tablespace, so must look the hard way */
! 	pg_tablespace = heap_open(TableSpaceRelationId, AccessShareLock);
! 	ScanKeyInit(&entry[0],
! 				ObjectIdAttributeNumber,
! 				BTEqualStrategyNumber, F_OIDEQ,
! 				ObjectIdGetDatum(spc_oid));
! 	scan = systable_beginscan(pg_tablespace, TablespaceOidIndexId, true,
! 							  SnapshotNow, 1, entry);
! 
! 	spctuple = systable_getnext(scan);
! 
  	if (!HeapTupleIsValid(spctuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
--- 4014,4022 ----
  	if (superuser_arg(roleid))
  		return true;
  
! 	/* Search syscache for pg_tablespace */
! 	spctuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(spc_oid),
! 							  0, 0, 0);
  	if (!HeapTupleIsValid(spctuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
***************
*** 4053,4060 **** pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
  
  	spcowner = ((Form_pg_tablespace) GETSTRUCT(spctuple))->spcowner;
  
! 	systable_endscan(scan);
! 	heap_close(pg_tablespace, AccessShareLock);
  
  	return has_privs_of_role(roleid, spcowner);
  }
--- 4024,4030 ----
  
  	spcowner = ((Form_pg_tablespace) GETSTRUCT(spctuple))->spcowner;
  
! 	ReleaseSysCache(spctuple);
  
  	return has_privs_of_role(roleid, spcowner);
  }
*** a/src/backend/commands/tablespace.c
--- b/src/backend/commands/tablespace.c
***************
*** 56,61 ****
--- 56,62 ----
  #include "catalog/indexing.h"
  #include "catalog/pg_tablespace.h"
  #include "commands/comment.h"
+ #include "commands/defrem.h"
  #include "commands/tablespace.h"
  #include "miscadmin.h"
  #include "postmaster/bgwriter.h"
***************
*** 77,82 **** char	   *temp_tablespaces = NULL;
--- 78,84 ----
  
  static bool remove_tablespace_directories(Oid tablespaceoid, bool redo);
  static void set_short_version(const char *path);
+ static double interpret_page_cost(DefElem *opt);
  
  
  /*
***************
*** 284,289 **** CreateTableSpace(CreateTableSpaceStmt *stmt)
--- 286,295 ----
  		DirectFunctionCall1(namein, CStringGetDatum(stmt->tablespacename));
  	values[Anum_pg_tablespace_spcowner - 1] =
  		ObjectIdGetDatum(ownerId);
+ 	values[Anum_pg_tablespace_spcseq_page_cost - 1] =
+ 		Float8GetDatum(-1.0);
+ 	values[Anum_pg_tablespace_spcrandom_page_cost - 1] =
+ 		Float8GetDatum(-1.0);
  	values[Anum_pg_tablespace_spclocation - 1] =
  		CStringGetTextDatum(location);
  	nulls[Anum_pg_tablespace_spcacl - 1] = true;
***************
*** 910,915 **** AlterTableSpaceOwner(const char *name, Oid newOwnerId)
--- 916,1029 ----
  
  
  /*
+  * Alter table space options
+  */
+ void
+ AlterTableSpace(AlterTableSpaceStmt *stmt)
+ {
+ 	Relation	rel;
+ 	ScanKeyData entry[1];
+ 	HeapScanDesc scandesc;
+ 	HeapTuple	tup;
+ 	Datum		repl_val[Natts_pg_tablespace];
+ 	bool		repl_null[Natts_pg_tablespace];
+ 	bool		repl_repl[Natts_pg_tablespace];
+ 	HeapTuple	newtuple;
+ 	ListCell   *lc;
+ 
+ 	/* Search pg_tablespace */
+ 	rel = heap_open(TableSpaceRelationId, RowExclusiveLock);
+ 
+ 	ScanKeyInit(&entry[0],
+ 				Anum_pg_tablespace_spcname,
+ 				BTEqualStrategyNumber, F_NAMEEQ,
+ 				CStringGetDatum(stmt->tablespacename));
+ 	scandesc = heap_beginscan(rel, SnapshotNow, 1, entry);
+ 	tup = heap_getnext(scandesc, ForwardScanDirection);
+ 	if (!HeapTupleIsValid(tup))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_UNDEFINED_OBJECT),
+ 				 errmsg("tablespace \"%s\" does not exist",
+ 					stmt->tablespacename)));
+ 
+ 	/* Must be owner of the existing object */
+ 	if (!pg_tablespace_ownercheck(HeapTupleGetOid(tup), GetUserId()))
+ 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_TABLESPACE,
+ 					   stmt->tablespacename);
+ 
+ 	/* Prepare to build new tuple. */
+ 	memset(repl_null, false, sizeof(repl_null));
+ 	memset(repl_repl, false, sizeof(repl_repl));
+ 
+ 	/* Parse options list. */
+ 	foreach(lc, stmt->options)
+ 	{
+ 		DefElem *opt = (DefElem *) lfirst(lc);
+ 
+ 		if (strcmp(opt->defname, "seq_page_cost") == 0)
+ 		{
+ 			double newval = interpret_page_cost(opt);
+ 			repl_repl[Anum_pg_tablespace_spcseq_page_cost - 1] = true;
+ 			repl_val[Anum_pg_tablespace_spcseq_page_cost - 1] =
+ 				Float8GetDatum(newval);
+ 		}
+ 		else if (strcmp(opt->defname, "random_page_cost") == 0)
+ 		{
+ 			double newval = interpret_page_cost(opt);
+ 			repl_repl[Anum_pg_tablespace_spcrandom_page_cost - 1] = true;
+ 			repl_val[Anum_pg_tablespace_spcrandom_page_cost - 1] =
+ 					Float8GetDatum(newval);
+ 		}
+ 		else
+ 		{
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 					 errmsg("unrecognized parameter \"%s\"", opt->defname)));
+ 		}
+ 	}
+ 
+ 	/* Update system catalog. */
+ 	newtuple = heap_modify_tuple(tup, RelationGetDescr(rel), repl_val,
+ 								 repl_null, repl_repl);
+ 	simple_heap_update(rel, &newtuple->t_self, newtuple);
+ 	CatalogUpdateIndexes(rel, newtuple);
+ 	heap_freetuple(newtuple);
+ 
+ 	/* Conclude heap scan. */
+ 	heap_endscan(scandesc);
+ 	heap_close(rel, NoLock);
+ }
+ 
+ /*
+  * Friendly helper function for making sense of page cost parameters.
+  */
+ static double
+ interpret_page_cost(DefElem *opt)
+ {
+ 	double newval;
+ 
+ 	if (opt->defaction == DEFELEM_DROP)
+ 	{
+ 		if (opt->arg != NULL)
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_SYNTAX_ERROR),
+ 					errmsg("RESET must not include values for parameters")));
+ 		newval = -1.0;
+ 	}
+ 	else
+ 	{
+ 		newval = defGetNumeric(opt);
+ 		if (newval < 0)
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 					 errmsg("%g is outside the valid range for parameter \"%s\" (%g .. %g)",
+ 									newval, opt->defname, 0.0, DBL_MAX)));
+ 	}
+ 
+ 	return newval;
+ }
+ 
+ /*
   * Routines for handling the GUC variable 'default_tablespace'.
   */
  
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 3058,3063 **** _copyDropTableSpaceStmt(DropTableSpaceStmt *from)
--- 3058,3074 ----
  	return newnode;
  }
  
+ static AlterTableSpaceStmt *
+ _copyAlterTableSpaceStmt(AlterTableSpaceStmt *from)
+ {
+ 	AlterTableSpaceStmt *newnode = makeNode(AlterTableSpaceStmt);
+ 
+ 	COPY_STRING_FIELD(tablespacename);
+ 	COPY_NODE_FIELD(options);
+ 
+ 	return newnode;
+ }
+ 
  static CreateFdwStmt *
  _copyCreateFdwStmt(CreateFdwStmt *from)
  {
***************
*** 4021,4026 **** copyObject(void *from)
--- 4032,4040 ----
  		case T_DropTableSpaceStmt:
  			retval = _copyDropTableSpaceStmt(from);
  			break;
+ 		case T_AlterTableSpaceStmt:
+ 			retval = _copyAlterTableSpaceStmt(from);
+ 			break;
  		case T_CreateFdwStmt:
  			retval = _copyCreateFdwStmt(from);
  			break;
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 1569,1574 **** _equalDropTableSpaceStmt(DropTableSpaceStmt *a, DropTableSpaceStmt *b)
--- 1569,1583 ----
  }
  
  static bool
+ _equalAlterTableSpaceStmt(AlterTableSpaceStmt *a, AlterTableSpaceStmt *b)
+ {
+ 	COMPARE_STRING_FIELD(tablespacename);
+ 	COMPARE_NODE_FIELD(options);
+ 
+ 	return true;
+ }
+ 
+ static bool
  _equalCreateFdwStmt(CreateFdwStmt *a, CreateFdwStmt *b)
  {
  	COMPARE_STRING_FIELD(fdwname);
***************
*** 2714,2719 **** equal(void *a, void *b)
--- 2723,2731 ----
  		case T_DropTableSpaceStmt:
  			retval = _equalDropTableSpaceStmt(a, b);
  			break;
+ 		case T_AlterTableSpaceStmt:
+ 			retval = _equalAlterTableSpaceStmt(a, b);
+ 			break;
  		case T_CreateFdwStmt:
  			retval = _equalCreateFdwStmt(a, b);
  			break;
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 1585,1590 **** _outRelOptInfo(StringInfo str, RelOptInfo *node)
--- 1585,1591 ----
  	WRITE_NODE_FIELD(cheapest_total_path);
  	WRITE_NODE_FIELD(cheapest_unique_path);
  	WRITE_UINT_FIELD(relid);
+ 	WRITE_UINT_FIELD(reltablespace);
  	WRITE_ENUM_FIELD(rtekind, RTEKind);
  	WRITE_INT_FIELD(min_attr);
  	WRITE_INT_FIELD(max_attr);
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
***************
*** 27,32 ****
--- 27,37 ----
   * detail.	Note that all of these parameters are user-settable, in case
   * the default values are drastically off for a particular platform.
   *
+  * seq_page_cost and random_page_cost can also be overridden for an individual
+  * tablespace, in case some data is on a fast disk and other data is on a slow
+  * disk.  Per-tablespace overrides never apply to temporary work files such as
+  * an external sort or a materialize node that overflows work_mem.
+  *
   * We compute two separate costs for each path:
   *		total_cost: total estimated cost to fetch all tuples
   *		startup_cost: cost that is expended before first tuple is fetched
***************
*** 164,169 **** void
--- 169,175 ----
  cost_seqscan(Path *path, PlannerInfo *root,
  			 RelOptInfo *baserel)
  {
+ 	double		spc_seq_page_cost;
  	Cost		startup_cost = 0;
  	Cost		run_cost = 0;
  	Cost		cpu_per_tuple;
***************
*** 175,184 **** cost_seqscan(Path *path, PlannerInfo *root,
  	if (!enable_seqscan)
  		startup_cost += disable_cost;
  
  	/*
  	 * disk costs
  	 */
! 	run_cost += seq_page_cost * baserel->pages;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup;
--- 181,195 ----
  	if (!enable_seqscan)
  		startup_cost += disable_cost;
  
+ 	/* fetch estimated page cost for tablespace containing table */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  NULL,
+ 							  &spc_seq_page_cost);
+ 
  	/*
  	 * disk costs
  	 */
! 	run_cost += spc_seq_page_cost * baserel->pages;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup;
***************
*** 226,231 **** cost_index(IndexPath *path, PlannerInfo *root,
--- 237,244 ----
  	Selectivity indexSelectivity;
  	double		indexCorrelation,
  				csquared;
+ 	double		spc_seq_page_cost,
+ 				spc_random_page_cost;
  	Cost		min_IO_cost,
  				max_IO_cost;
  	Cost		cpu_per_tuple;
***************
*** 272,284 **** cost_index(IndexPath *path, PlannerInfo *root,
  	/* estimate number of main-table tuples fetched */
  	tuples_fetched = clamp_row_est(indexSelectivity * baserel->tuples);
  
  	/*----------
  	 * Estimate number of main-table pages fetched, and compute I/O cost.
  	 *
  	 * When the index ordering is uncorrelated with the table ordering,
  	 * we use an approximation proposed by Mackert and Lohman (see
  	 * index_pages_fetched() for details) to compute the number of pages
! 	 * fetched, and then charge random_page_cost per page fetched.
  	 *
  	 * When the index ordering is exactly correlated with the table ordering
  	 * (just after a CLUSTER, for example), the number of pages fetched should
--- 285,302 ----
  	/* estimate number of main-table tuples fetched */
  	tuples_fetched = clamp_row_est(indexSelectivity * baserel->tuples);
  
+ 	/* fetch estimated page costs for tablespace containing table */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  &spc_seq_page_cost);
+ 
  	/*----------
  	 * Estimate number of main-table pages fetched, and compute I/O cost.
  	 *
  	 * When the index ordering is uncorrelated with the table ordering,
  	 * we use an approximation proposed by Mackert and Lohman (see
  	 * index_pages_fetched() for details) to compute the number of pages
! 	 * fetched, and then charge spc_random_page_cost per page fetched.
  	 *
  	 * When the index ordering is exactly correlated with the table ordering
  	 * (just after a CLUSTER, for example), the number of pages fetched should
***************
*** 286,292 **** cost_index(IndexPath *path, PlannerInfo *root,
  	 * will be sequential fetches, not the random fetches that occur in the
  	 * uncorrelated case.  So if the number of pages is more than 1, we
  	 * ought to charge
! 	 *		random_page_cost + (pages_fetched - 1) * seq_page_cost
  	 * For partially-correlated indexes, we ought to charge somewhere between
  	 * these two estimates.  We currently interpolate linearly between the
  	 * estimates based on the correlation squared (XXX is that appropriate?).
--- 304,310 ----
  	 * will be sequential fetches, not the random fetches that occur in the
  	 * uncorrelated case.  So if the number of pages is more than 1, we
  	 * ought to charge
! 	 *		spc_random_page_cost + (pages_fetched - 1) * spc_seq_page_cost
  	 * For partially-correlated indexes, we ought to charge somewhere between
  	 * these two estimates.  We currently interpolate linearly between the
  	 * estimates based on the correlation squared (XXX is that appropriate?).
***************
*** 309,315 **** cost_index(IndexPath *path, PlannerInfo *root,
  											(double) index->pages,
  											root);
  
! 		max_IO_cost = (pages_fetched * random_page_cost) / num_scans;
  
  		/*
  		 * In the perfectly correlated case, the number of pages touched by
--- 327,333 ----
  											(double) index->pages,
  											root);
  
! 		max_IO_cost = (pages_fetched * spc_random_page_cost) / num_scans;
  
  		/*
  		 * In the perfectly correlated case, the number of pages touched by
***************
*** 328,334 **** cost_index(IndexPath *path, PlannerInfo *root,
  											(double) index->pages,
  											root);
  
! 		min_IO_cost = (pages_fetched * random_page_cost) / num_scans;
  	}
  	else
  	{
--- 346,352 ----
  											(double) index->pages,
  											root);
  
! 		min_IO_cost = (pages_fetched * spc_random_page_cost) / num_scans;
  	}
  	else
  	{
***************
*** 342,354 **** cost_index(IndexPath *path, PlannerInfo *root,
  											root);
  
  		/* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
! 		max_IO_cost = pages_fetched * random_page_cost;
  
  		/* min_IO_cost is for the perfectly correlated case (csquared=1) */
  		pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
! 		min_IO_cost = random_page_cost;
  		if (pages_fetched > 1)
! 			min_IO_cost += (pages_fetched - 1) * seq_page_cost;
  	}
  
  	/*
--- 360,372 ----
  											root);
  
  		/* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
! 		max_IO_cost = pages_fetched * spc_random_page_cost;
  
  		/* min_IO_cost is for the perfectly correlated case (csquared=1) */
  		pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
! 		min_IO_cost = spc_random_page_cost;
  		if (pages_fetched > 1)
! 			min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;
  	}
  
  	/*
***************
*** 553,558 **** cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
--- 571,578 ----
  	Cost		cost_per_page;
  	double		tuples_fetched;
  	double		pages_fetched;
+ 	double		spc_seq_page_cost,
+ 				spc_random_page_cost;
  	double		T;
  
  	/* Should only be applied to base relations */
***************
*** 571,576 **** cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
--- 591,601 ----
  
  	startup_cost += indexTotalCost;
  
+ 	/* Fetch estimated page costs for tablespace containing table. */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  &spc_seq_page_cost);
+ 
  	/*
  	 * Estimate number of main-table pages fetched.
  	 */
***************
*** 609,625 **** cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
  		pages_fetched = ceil(pages_fetched);
  
  	/*
! 	 * For small numbers of pages we should charge random_page_cost apiece,
  	 * while if nearly all the table's pages are being read, it's more
! 	 * appropriate to charge seq_page_cost apiece.	The effect is nonlinear,
  	 * too. For lack of a better idea, interpolate like this to determine the
  	 * cost per page.
  	 */
  	if (pages_fetched >= 2.0)
! 		cost_per_page = random_page_cost -
! 			(random_page_cost - seq_page_cost) * sqrt(pages_fetched / T);
  	else
! 		cost_per_page = random_page_cost;
  
  	run_cost += pages_fetched * cost_per_page;
  
--- 634,651 ----
  		pages_fetched = ceil(pages_fetched);
  
  	/*
! 	 * For small numbers of pages we should charge spc_random_page_cost apiece,
  	 * while if nearly all the table's pages are being read, it's more
! 	 * appropriate to charge spc_seq_page_cost apiece.	The effect is nonlinear,
  	 * too. For lack of a better idea, interpolate like this to determine the
  	 * cost per page.
  	 */
  	if (pages_fetched >= 2.0)
! 		cost_per_page = spc_random_page_cost -
! 			(spc_random_page_cost - spc_seq_page_cost)
! 			* sqrt(pages_fetched / T);
  	else
! 		cost_per_page = spc_random_page_cost;
  
  	run_cost += pages_fetched * cost_per_page;
  
***************
*** 783,788 **** cost_tidscan(Path *path, PlannerInfo *root,
--- 809,815 ----
  	QualCost	tid_qual_cost;
  	int			ntuples;
  	ListCell   *l;
+ 	double		spc_random_page_cost;
  
  	/* Should only be applied to base relations */
  	Assert(baserel->relid > 0);
***************
*** 835,842 **** cost_tidscan(Path *path, PlannerInfo *root,
  	 */
  	cost_qual_eval(&tid_qual_cost, tidquals, root);
  
  	/* disk costs --- assume each tuple on a different page */
! 	run_cost += random_page_cost * ntuples;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup +
--- 862,874 ----
  	 */
  	cost_qual_eval(&tid_qual_cost, tidquals, root);
  
+ 	/* fetch estimated page cost for tablespace containing table */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  NULL);
+ 
  	/* disk costs --- assume each tuple on a different page */
! 	run_cost += spc_random_page_cost * ntuples;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup +
*** a/src/backend/optimizer/util/plancat.c
--- b/src/backend/optimizer/util/plancat.c
***************
*** 91,96 **** get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
--- 91,97 ----
  
  	rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1;
  	rel->max_attr = RelationGetNumberOfAttributes(relation);
+ 	rel->reltablespace = RelationGetForm(relation)->reltablespace;
  
  	Assert(rel->max_attr >= rel->min_attr);
  	rel->attr_needed = (Relids *)
***************
*** 183,188 **** get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
--- 184,191 ----
  			info = makeNode(IndexOptInfo);
  
  			info->indexoid = index->indexrelid;
+ 			info->reltablespace =
+ 				RelationGetForm(indexRelation)->reltablespace;
  			info->rel = rel;
  			info->ncolumns = ncolumns = index->indnatts;
  
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 5622,5627 **** RenameStmt: ALTER AGGREGATE func_name aggr_args RENAME TO name
--- 5622,5647 ----
  					n->newname = $6;
  					$$ = (Node *)n;
  				}
+ 			| ALTER TABLESPACE name SET reloptions
+ 				{
+ 					AlterTableSpaceStmt *n = makeNode(AlterTableSpaceStmt);
+ 					n->tablespacename = $3;
+ 					n->options = $5;
+ 					$$ = (Node *)n;
+ 				}
+ 			| ALTER TABLESPACE name RESET reloptions
+ 				{
+ 					AlterTableSpaceStmt *n = makeNode(AlterTableSpaceStmt);
+ 					ListCell *lc;
+ 					n->tablespacename = $3;
+ 					n->options = $5;
+ 					foreach (lc, n->options)
+ 					{
+ 						DefElem *def = lfirst(lc);
+ 						def->defaction = DEFELEM_DROP;
+ 					}
+ 					$$ = (Node *)n;
+ 				}
  			| ALTER TEXT_P SEARCH PARSER any_name RENAME TO name
  				{
  					RenameStmt *n = makeNode(RenameStmt);
*** a/src/backend/tcop/utility.c
--- b/src/backend/tcop/utility.c
***************
*** 214,219 **** check_xact_readonly(Node *parsetree)
--- 214,220 ----
  		case T_CreateUserMappingStmt:
  		case T_AlterUserMappingStmt:
  		case T_DropUserMappingStmt:
+ 		case T_AlterTableSpaceStmt:
  			ereport(ERROR,
  					(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
  					 errmsg("transaction is read-only")));
***************
*** 480,485 **** ProcessUtility(Node *parsetree,
--- 481,490 ----
  			DropTableSpace((DropTableSpaceStmt *) parsetree);
  			break;
  
+ 		case T_AlterTableSpaceStmt:
+ 			AlterTableSpace((AlterTableSpaceStmt *) parsetree);
+ 			break;
+ 
  		case T_CreateFdwStmt:
  			CreateForeignDataWrapper((CreateFdwStmt *) parsetree);
  			break;
***************
*** 1386,1391 **** CreateCommandTag(Node *parsetree)
--- 1391,1400 ----
  			tag = "DROP TABLESPACE";
  			break;
  
+ 		case T_AlterTableSpaceStmt:
+ 			tag = "ALTER TABLESPACE";
+ 			break;
+ 
  		case T_CreateFdwStmt:
  			tag = "CREATE FOREIGN DATA WRAPPER";
  			break;
***************
*** 2165,2170 **** GetCommandLogLevel(Node *parsetree)
--- 2174,2183 ----
  			lev = LOGSTMT_DDL;
  			break;
  
+ 		case T_AlterTableSpaceStmt:
+ 			lev = LOGSTMT_DDL;
+ 			break;
+ 
  		case T_CreateFdwStmt:
  		case T_AlterFdwStmt:
  		case T_DropFdwStmt:
*** a/src/backend/utils/adt/selfuncs.c
--- b/src/backend/utils/adt/selfuncs.c
***************
*** 5372,5377 **** genericcostestimate(PlannerInfo *root,
--- 5372,5378 ----
  	QualCost	index_qual_cost;
  	double		qual_op_cost;
  	double		qual_arg_cost;
+ 	double		spc_random_page_cost;
  	List	   *selectivityQuals;
  	ListCell   *l;
  
***************
*** 5480,5485 **** genericcostestimate(PlannerInfo *root,
--- 5481,5491 ----
  	else
  		numIndexPages = 1.0;
  
+ 	/* fetch estimated page cost for tablespace containing index */
+ 	get_tablespace_page_costs(index->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  NULL);
+ 
  	/*
  	 * Now compute the disk access costs.
  	 *
***************
*** 5526,5540 **** genericcostestimate(PlannerInfo *root,
  		 * share for each outer scan.  (Don't pro-rate for ScalarArrayOpExpr,
  		 * since that's internal to the indexscan.)
  		 */
! 		*indexTotalCost = (pages_fetched * random_page_cost) / num_outer_scans;
  	}
  	else
  	{
  		/*
! 		 * For a single index scan, we just charge random_page_cost per page
! 		 * touched.
  		 */
! 		*indexTotalCost = numIndexPages * random_page_cost;
  	}
  
  	/*
--- 5532,5547 ----
  		 * share for each outer scan.  (Don't pro-rate for ScalarArrayOpExpr,
  		 * since that's internal to the indexscan.)
  		 */
! 		*indexTotalCost = (pages_fetched * spc_random_page_cost)
! 							/ num_outer_scans;
  	}
  	else
  	{
  		/*
! 		 * For a single index scan, we just charge spc_random_page_cost per
! 		 * page touched.
  		 */
! 		*indexTotalCost = numIndexPages * spc_random_page_cost;
  	}
  
  	/*
***************
*** 5549,5559 **** genericcostestimate(PlannerInfo *root,
  	 *
  	 * We can deal with this by adding a very small "fudge factor" that
  	 * depends on the index size.  The fudge factor used here is one
! 	 * random_page_cost per 100000 index pages, which should be small enough
! 	 * to not alter index-vs-seqscan decisions, but will prevent indexes of
! 	 * different sizes from looking exactly equally attractive.
  	 */
! 	*indexTotalCost += index->pages * random_page_cost / 100000.0;
  
  	/*
  	 * CPU cost: any complex expressions in the indexquals will need to be
--- 5556,5566 ----
  	 *
  	 * We can deal with this by adding a very small "fudge factor" that
  	 * depends on the index size.  The fudge factor used here is one
! 	 * spc_random_page_cost per 100000 index pages, which should be small
! 	 * enough to not alter index-vs-seqscan decisions, but will prevent
! 	 * indexes of different sizes from looking exactly equally attractive.
  	 */
! 	*indexTotalCost += index->pages * spc_random_page_cost / 100000.0;
  
  	/*
  	 * CPU cost: any complex expressions in the indexquals will need to be
*** a/src/backend/utils/cache/lsyscache.c
--- b/src/backend/utils/cache/lsyscache.c
***************
*** 26,34 ****
--- 26,36 ----
  #include "catalog/pg_operator.h"
  #include "catalog/pg_proc.h"
  #include "catalog/pg_statistic.h"
+ #include "catalog/pg_tablespace.h"
  #include "catalog/pg_type.h"
  #include "miscadmin.h"
  #include "nodes/makefuncs.h"
+ #include "optimizer/cost.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
  #include "utils/datum.h"
***************
*** 2776,2778 **** get_roleid_checked(const char *rolname)
--- 2778,2819 ----
  				 errmsg("role \"%s\" does not exist", rolname)));
  	return roleid;
  }
+ 
+ /*				---------- PG_TABLESPACE CACHE ----------				 */
+ 
+ /*
+  * get_tablespace_page_costs
+  *		Returns random and sequential page costs for a given tablespace
+  */
+ void
+ get_tablespace_page_costs(Oid spcid, double *spc_random_page_cost,
+ 					     double *spc_seq_page_cost)
+ {
+ 	HeapTuple	tp;
+ 
+ 	/* Ensure output args are initialized on failure */
+ 	if (spc_random_page_cost)
+ 		*spc_random_page_cost = random_page_cost;
+ 	if (spc_seq_page_cost)
+ 		*spc_seq_page_cost = seq_page_cost;
+ 
+ 	/* spcid is always from a pg_class tuple, so InvalidOid implies the
+ 	 * default */
+ 	if (spcid == InvalidOid)
+ 		spcid = MyDatabaseTableSpace;
+ 
+ 	tp = SearchSysCache(TABLESPACEOID,
+ 						ObjectIdGetDatum(spcid),
+ 						0, 0, 0);
+ 	if (HeapTupleIsValid(tp))
+ 	{
+ 		Form_pg_tablespace spctup = (Form_pg_tablespace) GETSTRUCT(tp);
+ 
+ 		if (spc_random_page_cost && spctup->spcrandom_page_cost >= 0)
+ 			*spc_random_page_cost = (double) spctup->spcrandom_page_cost;
+ 		if (spc_seq_page_cost && spctup->spcseq_page_cost >= 0)
+ 			*spc_seq_page_cost = (double) spctup->spcseq_page_cost;
+ 
+ 		ReleaseSysCache(tp);
+ 	}
+ }
*** a/src/backend/utils/cache/syscache.c
--- b/src/backend/utils/cache/syscache.c
***************
*** 43,48 ****
--- 43,49 ----
  #include "catalog/pg_proc.h"
  #include "catalog/pg_rewrite.h"
  #include "catalog/pg_statistic.h"
+ #include "catalog/pg_tablespace.h"
  #include "catalog/pg_ts_config.h"
  #include "catalog/pg_ts_config_map.h"
  #include "catalog/pg_ts_dict.h"
***************
*** 609,614 **** static const struct cachedesc cacheinfo[] = {
--- 610,627 ----
  		},
  		1024
  	},
+ 	{TableSpaceRelationId,		/* TABLESPACEOID */
+ 		TablespaceOidIndexId,
+ 		0,
+ 		1,
+ 		{
+ 			ObjectIdAttributeNumber,
+ 			0,
+ 			0,
+ 			0,
+ 		},
+ 		16
+ 	},
  	{TSConfigMapRelationId,		/* TSCONFIGMAP */
  		TSConfigMapIndexId,
  		0,
*** a/src/bin/pg_dump/pg_dumpall.c
--- b/src/bin/pg_dump/pg_dumpall.c
***************
*** 956,974 **** dumpTablespaces(PGconn *conn)
  	 * Get all tablespaces except built-in ones (which we assume are named
  	 * pg_xxx)
  	 */
! 	if (server_version >= 80200)
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
  						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
  	else
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
! 						   "null "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
--- 956,988 ----
  	 * Get all tablespaces except built-in ones (which we assume are named
  	 * pg_xxx)
  	 */
! 	if (server_version >= 80500)
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
+         				   "array_to_string(ARRAY["
+ 				   "CASE WHEN spcseq_page_cost < 0 THEN NULL "
+ 				   "ELSE 'seq_page_cost = ' || spcseq_page_cost END,"
+ 				   "CASE WHEN spcrandom_page_cost < 0 THEN NULL "
+ 				   "ELSE 'random_page_cost = ' || spcrandom_page_cost END"
+ 							"], ', '),"
  						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
+ 	else if (server_version >= 80200)
+ 		res = executeQuery(conn, "SELECT spcname, "
+ 						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
+ 						   "spclocation, spcacl, null, "
+ 						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
+ 						   "FROM pg_catalog.pg_tablespace "
+ 						   "WHERE spcname !~ '^pg_' "
+ 						   "ORDER BY 1");
  	else
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
! 						   "null, null "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
***************
*** 983,989 **** dumpTablespaces(PGconn *conn)
  		char	   *spcowner = PQgetvalue(res, i, 1);
  		char	   *spclocation = PQgetvalue(res, i, 2);
  		char	   *spcacl = PQgetvalue(res, i, 3);
! 		char	   *spccomment = PQgetvalue(res, i, 4);
  		char	   *fspcname;
  
  		/* needed for buildACLCommands() */
--- 997,1004 ----
  		char	   *spcowner = PQgetvalue(res, i, 1);
  		char	   *spclocation = PQgetvalue(res, i, 2);
  		char	   *spcacl = PQgetvalue(res, i, 3);
! 		char	   *spcoptions = PQgetvalue(res, i, 4);
! 		char	   *spccomment = PQgetvalue(res, i, 5);
  		char	   *fspcname;
  
  		/* needed for buildACLCommands() */
***************
*** 996,1001 **** dumpTablespaces(PGconn *conn)
--- 1011,1020 ----
  		appendStringLiteralConn(buf, spclocation, conn);
  		appendPQExpBuffer(buf, ";\n");
  
+ 		if (spcoptions && spcoptions[0] != '\0')
+ 			appendPQExpBuffer(buf, "ALTER TABLESPACE %s SET (%s);\n",
+ 							  fspcname, spcoptions);
+ 
  		if (!skip_acls &&
  			!buildACLCommands(fspcname, NULL, "TABLESPACE", spcacl, spcowner,
  							  "", server_version, buf))
*** a/src/include/catalog/pg_tablespace.h
--- b/src/include/catalog/pg_tablespace.h
***************
*** 32,37 **** CATALOG(pg_tablespace,1213) BKI_SHARED_RELATION
--- 32,39 ----
  {
  	NameData	spcname;		/* tablespace name */
  	Oid			spcowner;		/* owner of tablespace */
+ 	float8		spcrandom_page_cost; /* per-tablespace random_page_cost */
+ 	float8		spcseq_page_cost; /* per-tablespace seq_page_cost */
  	text		spclocation;	/* physical location (VAR LENGTH) */
  	aclitem		spcacl[1];		/* access permissions (VAR LENGTH) */
  } FormData_pg_tablespace;
***************
*** 48,61 **** typedef FormData_pg_tablespace *Form_pg_tablespace;
   * ----------------
   */
  
! #define Natts_pg_tablespace				4
! #define Anum_pg_tablespace_spcname		1
! #define Anum_pg_tablespace_spcowner		2
! #define Anum_pg_tablespace_spclocation	3
! #define Anum_pg_tablespace_spcacl		4
  
! DATA(insert OID = 1663 ( pg_default PGUID "" _null_ ));
! DATA(insert OID = 1664 ( pg_global	PGUID "" _null_ ));
  
  #define DEFAULTTABLESPACE_OID 1663
  #define GLOBALTABLESPACE_OID 1664
--- 50,65 ----
   * ----------------
   */
  
! #define Natts_pg_tablespace						6
! #define Anum_pg_tablespace_spcname				1
! #define Anum_pg_tablespace_spcowner				2
! #define Anum_pg_tablespace_spcrandom_page_cost	3
! #define Anum_pg_tablespace_spcseq_page_cost		4
! #define Anum_pg_tablespace_spclocation			5
! #define Anum_pg_tablespace_spcacl				6
  
! DATA(insert OID = 1663 ( pg_default PGUID -1 -1 "" _null_ ));
! DATA(insert OID = 1664 ( pg_global	PGUID -1 -1 "" _null_ ));
  
  #define DEFAULTTABLESPACE_OID 1663
  #define GLOBALTABLESPACE_OID 1664
*** a/src/include/commands/tablespace.h
--- b/src/include/commands/tablespace.h
***************
*** 35,40 **** typedef struct xl_tblspc_drop_rec
--- 35,41 ----
  
  extern void CreateTableSpace(CreateTableSpaceStmt *stmt);
  extern void DropTableSpace(DropTableSpaceStmt *stmt);
+ extern void AlterTableSpace(AlterTableSpaceStmt *stmt);
  extern void RenameTableSpace(const char *oldname, const char *newname);
  extern void AlterTableSpaceOwner(const char *name, Oid newOwnerId);
  
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
***************
*** 346,351 **** typedef enum NodeTag
--- 346,352 ----
  	T_CreateUserMappingStmt,
  	T_AlterUserMappingStmt,
  	T_DropUserMappingStmt,
+ 	T_AlterTableSpaceStmt,
  
  	/*
  	 * TAGS FOR PARSE TREE NODES (parsenodes.h)
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 1464,1469 **** typedef struct DropTableSpaceStmt
--- 1464,1476 ----
  	bool		missing_ok;		/* skip error if missing? */
  } DropTableSpaceStmt;
  
+ typedef struct AlterTableSpaceStmt
+ {
+ 	NodeTag		type;
+ 	char	   *tablespacename;
+ 	List	   *options;
+ } AlterTableSpaceStmt;
+ 
  /* ----------------------
   *		Create/Drop FOREIGN DATA WRAPPER Statements
   * ----------------------
*** a/src/include/nodes/relation.h
--- b/src/include/nodes/relation.h
***************
*** 361,366 **** typedef struct RelOptInfo
--- 361,367 ----
  
  	/* information about a base rel (not set for join rels!) */
  	Index		relid;
+ 	Oid			reltablespace;	/* containing tablespace */
  	RTEKind		rtekind;		/* RELATION, SUBQUERY, or FUNCTION */
  	AttrNumber	min_attr;		/* smallest attrno of rel (often <0) */
  	AttrNumber	max_attr;		/* largest attrno of rel */
***************
*** 425,430 **** typedef struct IndexOptInfo
--- 426,432 ----
  	NodeTag		type;
  
  	Oid			indexoid;		/* OID of the index relation */
+ 	Oid			reltablespace;	/* tablespace of index (not table) */
  	RelOptInfo *rel;			/* back-link to index's table */
  
  	/* statistics from pg_class */
*** a/src/include/utils/lsyscache.h
--- b/src/include/utils/lsyscache.h
***************
*** 137,142 **** extern void free_attstatsslot(Oid atttype,
--- 137,144 ----
  extern char *get_namespace_name(Oid nspid);
  extern Oid	get_roleid(const char *rolname);
  extern Oid	get_roleid_checked(const char *rolname);
+ extern void get_tablespace_page_costs(Oid spcid, double *spc_random_page_cost,
+ 					     double *spc_seq_page_cost);
  
  #define type_is_array(typid)  (get_element_type(typid) != InvalidOid)
  
*** a/src/include/utils/syscache.h
--- b/src/include/utils/syscache.h
***************
*** 71,76 **** enum SysCacheIdentifier
--- 71,77 ----
  	RELOID,
  	RULERELNAME,
  	STATRELATT,
+ 	TABLESPACEOID,
  	TSCONFIGMAP,
  	TSCONFIGNAMENSP,
  	TSCONFIGOID,

Greg Stark

gsstark@mit.edu

about 16 years ago

In reply to: Robert Haas (#1)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Sat, Nov 14, 2009 at 7:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I ended up not reusing the reloptions.c code. It looks like a lot of
extra complexity for no obvious benefit, considering that there is no
equivalent of AMs for tablespaces and therefore no need to support
AM-specific options. I did reuse the reloptions syntax, and I think
the internal representation could always be redone later, if we find
that there's a use case for something more complicated.

a) effective_io_concurrency really deserves to be in the list as well.

b) I thought Tom came down pretty stridently against any data model
which hard codes a specific list of supported options. I can't
remember exactly what level of flexibility he wanted but I think
"doesn't require catalog changes to add a new option" might have been
it.

I agree that having everything smashed to text is a bit kludgy, but
I'm not sure we have the tools to do much better.

--
greg

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Greg Stark (#2)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Sat, Nov 14, 2009 at 3:06 PM, Greg Stark <gsstark@mit.edu> wrote:

On Sat, Nov 14, 2009 at 7:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I ended up not reusing the reloptions.c code. It looks like a lot of
extra complexity for no obvious benefit, considering that there is no
equivalent of AMs for tablespaces and therefore no need to support
AM-specific options. I did reuse the reloptions syntax, and I think
the internal representation could always be redone later, if we find
that there's a use case for something more complicated.

a) effective_io_concurrency really deserves to be in the list as well.

b) I thought Tom came down pretty stridently against any data model
which hard codes a specific list of supported options. I can't
remember exactly what level of flexibility he wanted but I think
"doesn't require catalog changes to add a new option" might have been
it.

I agree that having everything smashed to text is a bit kludgy, but
I'm not sure we have the tools to do much better.

I'm hoping Tom will reconsider, or at least flesh out his thinking.
What the reloptions code does is create a general framework for
handling options. Everything gets smashed down to text[], and then
when we actually need to use the reloptions we parse them into a C
struct appropriate to the underlying object type. This is really the
only feasible design, because pg_class contains multiple different
types of objects - in particular, tables and indices - and indices in
turn come in multiple types, depending on the AM. So the exact
options that are legal depend on the type of object, and for
indices the AM, and we populate a *different* C struct depending on
the situation. pg_tablespace, on the other hand, only contains one
type of object: a tablespace. So, if we stored the options as text[],
we'd parse them out into a C struct just as we do for pg_class, but
unlike the pg_class case, it would always be the *same* C struct.
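
To make the "smash to text[], then parse into a C struct" idea concrete, here is a minimal standalone sketch. It is not the reloptions.c code: DemoTableSpaceOpts and demo_parse_spcoptions are hypothetical names, the input is a plain array of "name=value" strings standing in for text[], and a negative field is assumed to mean "not set", as in the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the struct the options would be parsed into. */
typedef struct DemoTableSpaceOpts
{
	double		random_page_cost;	/* negative means "not set" */
	double		seq_page_cost;		/* negative means "not set" */
} DemoTableSpaceOpts;

/*
 * Parse an array of "name=value" strings into the struct.
 * Returns 0 on success, -1 on a malformed entry, a negative value,
 * or an option name this sketch does not know about.
 */
int
demo_parse_spcoptions(const char *const *opts, int nopts,
					  DemoTableSpaceOpts *out)
{
	int			i;

	/* Start with everything unset */
	out->random_page_cost = -1.0;
	out->seq_page_cost = -1.0;

	for (i = 0; i < nopts; i++)
	{
		const char *eq = strchr(opts[i], '=');
		size_t		namelen;
		char	   *end;
		double		val;

		if (eq == NULL)
			return -1;
		namelen = (size_t) (eq - opts[i]);

		val = strtod(eq + 1, &end);
		if (end == eq + 1 || *end != '\0' || val < 0.0)
			return -1;

		if (namelen == strlen("random_page_cost") &&
			strncmp(opts[i], "random_page_cost", namelen) == 0)
			out->random_page_cost = val;
		else if (namelen == strlen("seq_page_cost") &&
				 strncmp(opts[i], "seq_page_cost", namelen) == 0)
			out->seq_page_cost = val;
		else
			return -1;
	}
	return 0;
}
```

Since there is only one kind of tablespace, the same struct is filled in on every call; adding an option would mean adding a field and a branch here.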

In other words, we CAN'T use dedicated columns for pg_class because we
can't know in advance precisely what columns will be needed - it
depends on what AMs someone chooses to load up. For pg_tablespace, we
know exactly what columns will be needed, and the answer is exactly
those options that we choose to support, because tablespaces are not
extensible.

That's my thinking, anyway... YMMV.

...Robert

Tom Lane

tgl@sss.pgh.pa.us

about 16 years ago

In reply to: Robert Haas (#3)

Re: patch - per-tablespace random_page_cost/seq_page_cost

Robert Haas <robertmhaas@gmail.com> writes:

.... pg_tablespace, on the other hand, only contains one
type of object: a tablespace. So, if we stored the options as text[],
we'd parse them out into a C struct just as we do for pg_class, but
unlike the pg_class case, it would always be the *same* C struct.

The same, until it's different. There is no reason at all to suppose
that the set of options people will want to apply to a tablespace will
remain constant over time --- in fact, I don't think there's even a
solid consensus right now on which GUCs people would want to set at the
tablespace level. I don't believe it is wise to hardwire this into the
catalog schema. Yes, it would look marginally nicer from a theoretical
standpoint, but we'd be forever having to revise the schema, plus a lot
of downstream code (pg_dump for example); which is not only significant
work but absolutely prevents making any adjustments except at major
version boundaries. And I don't see any concrete benefit that we get
out of a hardwired schema for these things. It's not like we care about
optimizing searches for tablespaces having a particular option setting,
for example.

regards, tom lane

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Tom Lane (#4)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Sat, Nov 14, 2009 at 4:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

.... pg_tablespace, on the other hand, only contains one
type of object: a tablespace. So, if we stored the options as text[],
we'd parse them out into a C struct just as we do for pg_class, but
unlike the pg_class case, it would always be the *same* C struct.

The same, until it's different. There is no reason at all to suppose
that the set of options people will want to apply to a tablespace will
remain constant over time --- in fact, I don't think there's even a
solid consensus right now on which GUCs people would want to set at the
tablespace level. I don't believe it is wise to hardwire this into the
catalog schema. Yes, it would look marginally nicer from a theoretical
standpoint, but we'd be forever having to revise the schema, plus a lot
of downstream code (pg_dump for example); which is not only significant
work but absolutely prevents making any adjustments except at major
version boundaries. And I don't see any concrete benefit that we get
out of a hardwired schema for these things. It's not like we care about
optimizing searches for tablespaces having a particular option setting,
for example.

I can tell I've lost this argument, but I still don't get it. Why do
we care if we have to change the schema? It's not a lot of work, and
the number of times we would likely bump catversion for new
pg_tablespace options seems unlikely to be significant in the grand
scheme of things. I don't think there are very many parameters that
make sense to set per-tablespace. As for major version boundaries, it
seems almost unimaginable that we would backpatch code to add a new
tablespace option whether the schema permits it or not. Can you
clarify the nature of your concern here?

What I'm concerned about with text[] is that I *think* it's going to
force us to invent an analog of the relcache for tablespaces. With
hardwired columns, a regular catcache is all we need. But the
reloptions stuff is designed to populate a struct, and once we
populate that struct we have to have someplace to hang it - or I guess
maybe we could reparse it on every call to cost_seqscan(),
cost_index(), genericcostestimate(), etc, but that doesn't seem like a
great idea. So it seems like we'll need another caching layer sitting
over the catcache. If we already had such a beast it would be
reasonable to add this in, but I would assume that we wouldn't want to
add such a thing without a fairly clear use case that I'm not sure we
have. Maybe you see it differently? Or do you have some idea for a
simpler way to set this up?

...Robert

Tom Lane

tgl@sss.pgh.pa.us

about 16 years ago

In reply to: Robert Haas (#5)

Re: patch - per-tablespace random_page_cost/seq_page_cost

Robert Haas <robertmhaas@gmail.com> writes:

I can tell I've lost this argument, but I still don't get it. Why do
we care if we have to change the schema? It's not a lot of work,

Try doing it a few times. Don't forget to keep psql and pg_dump
apprised of which PG versions contain which columns. Not to mention
other tools such as pgAdmin that might like to show these settings.
It gets old pretty fast.
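
As a rough illustration of that bookkeeping, the sketch below picks a catalog query by server version, in the style of the pg_dumpall hunk in the first patch. demo_tablespace_query is a hypothetical helper, and the queries are shortened; the version cutoffs (80500, 80200) and column names are taken from that patch.

```c
/*
 * Return a pg_tablespace query suited to the server version.  Every
 * branch yields the same number of result columns, so the caller can
 * read them by fixed position whatever the server version is.
 */
const char *
demo_tablespace_query(int server_version)
{
	if (server_version >= 80500)
		/* hardwired cost columns exist */
		return "SELECT spcname, spcseq_page_cost, spcrandom_page_cost, "
			"pg_catalog.shobj_description(oid, 'pg_tablespace') "
			"FROM pg_catalog.pg_tablespace";
	else if (server_version >= 80200)
		/* no cost columns, but shared-object comments exist */
		return "SELECT spcname, null, null, "
			"pg_catalog.shobj_description(oid, 'pg_tablespace') "
			"FROM pg_catalog.pg_tablespace";
	else
		/* neither cost columns nor comments */
		return "SELECT spcname, null, null, null "
			"FROM pg_catalog.pg_tablespace";
}
```

Each new hardwired column adds another branch like this to every client tool that reads the catalog.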

What I'm concerned about with text[] is that I *think* it's going to
force us to invent an analog of the relcache for tablespaces.

I'm not really convinced of that, but even if we do, so what? It's not
that much code to have an extra cache watching the syscache traffic.
There's an example in parse_oper.c of a specialized cache that's about
as complicated as this would be. It's about 150 lines including copious
comments. We didn't even bother to split it out into its own source
file.

With
hardwired columns, a regular catcache is all we need. But the
reloptions stuff is designed to populate a struct, and once we
populate that struct we have to have someplace to hang it - or I guess
maybe we could reparse it on every call to cost_seqscan(),
cost_index(), genericcostestimate(), etc, but that doesn't seem like a
great idea.

Well, no, we would not do it that way. I would imagine instead that
plancat.c would be responsible for attaching appropriate cost values to
each RelOptInfo struct, so it'd be more like one lookup per referenced
table per query. It's possible that a cache would be useful even at
that load level, but I'm not convinced.
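
A minimal sketch of that arrangement, outside the planner: the page costs are resolved once when the per-relation struct is built, and the costing functions only read the stored fields. DemoRelInfo, demo_lookup_spc_costs and the tablespace OID 16385 are hypothetical; the "negative means fall back to the GUC" convention follows the patch.

```c
typedef struct DemoRelInfo
{
	unsigned int reltablespace;	/* containing tablespace */
	double		pages;			/* size of the relation in pages */
	double		random_page_cost;	/* resolved once, at setup */
	double		seq_page_cost;		/* resolved once, at setup */
} DemoRelInfo;

int			demo_lookup_count = 0;

/*
 * Stand-in for the catalog lookup.  Pretends that tablespace 16385 has
 * random_page_cost overridden and that nothing else is set.
 */
static void
demo_lookup_spc_costs(unsigned int spcid, double *random_cost,
					  double *seq_cost)
{
	demo_lookup_count++;
	*random_cost = (spcid == 16385) ? 1.5 : -1.0;
	*seq_cost = -1.0;
}

/* Done once per referenced relation, when its planner info is built. */
void
demo_attach_costs(DemoRelInfo *rel, double guc_random, double guc_seq)
{
	double		r;
	double		s;

	demo_lookup_spc_costs(rel->reltablespace, &r, &s);
	rel->random_page_cost = (r >= 0.0) ? r : guc_random;
	rel->seq_page_cost = (s >= 0.0) ? s : guc_seq;
}

/* Costing code reads the stored values; no further lookups happen. */
double
demo_seq_io_cost(const DemoRelInfo *rel)
{
	return rel->pages * rel->seq_page_cost;
}

double
demo_random_io_cost(const DemoRelInfo *rel, double pages_fetched)
{
	return pages_fetched * rel->random_page_cost;
}
```

However many candidate paths get costed for the relation, the lookup runs once.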

regards, tom lane

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Tom Lane (#6)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Sat, Nov 14, 2009 at 6:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm not really convinced of that, but even if we do, so what? It's not
that much code to have an extra cache watching the syscache traffic.
There's an example in parse_oper.c of a specialized cache that's about
as complicated as this would be. It's about 150 lines including copious
comments. We didn't even bother to split it out into its own source
file.

Well, if it's that simple maybe it's not too bad. I'll take a look at that one.

With
hardwired columns, a regular catcache is all we need. But the
reloptions stuff is designed to populate a struct, and once we
populate that struct we have to have someplace to hang it - or I guess
maybe we could reparse it on every call to cost_seqscan(),
cost_index(), genericcostestimate(), etc, but that doesn't seem like a
great idea.

Well, no, we would not do it that way. I would imagine instead that
plancat.c would be responsible for attaching appropriate cost values to
each RelOptInfo struct, so it'd be more like one lookup per referenced
table per query. It's possible that a cache would be useful even at
that load level, but I'm not convinced.

I'm not sure exactly what you mean by the last sentence, but my
current design attaches the tablespace OID to RelOptInfo (for baserels
only, of course) and IndexOptInfo, and the costing functions trigger
the actual lookup of the page costs. I guess that might be slightly
inferior to attaching the resolved values directly to the
RelOptInfo, since each possible index-path needs the values for both
the index and the underlying table.

I will take another crack at it.

...Robert

Bernd Helmle

mailings@oopsware.de

about 16 years ago

In reply to: Robert Haas (#7)

Re: patch - per-tablespace random_page_cost/seq_page_cost

--On 14. November 2009 20:22:42 -0500 Robert Haas <robertmhaas@gmail.com>
wrote:

I will take another crack at it.

...Robert

I take it that you are going to provide a new patch version?

--
Thanks

Bernd

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Bernd Helmle (#8)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Mon, Nov 16, 2009 at 4:37 AM, Bernd Helmle <mailings@oopsware.de> wrote:

--On 14. November 2009 20:22:42 -0500 Robert Haas <robertmhaas@gmail.com>
wrote:

I will take another crack at it.

...Robert

I take it that you are going to provide a new patch version?

Yes. I'm not sure whether or not it will be in time for this CF, however.

...Robert

#10

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Tom Lane (#4)

1 attachment(s)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Sat, Nov 14, 2009 at 4:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't think there's even a
solid consensus right now on which GUCs people would want to set at the
tablespace level.

This seems like an important point that we need to nail down. The
original motivation for this patch was based on seq_page_cost and
random_page_cost, to cover the case where, for example, one tablespace
is on an SSD and another tablespace is on a RAID array.
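
To see why this matters, here is a deliberately oversimplified I/O-only comparison. It is not the planner's real cost model (no CPU costs, caching or correlation), and the function names and page counts are made up for illustration.

```c
/* I/O cost of reading the whole table sequentially. */
double
demo_seqscan_cost(double table_pages, double seq_page_cost)
{
	return table_pages * seq_page_cost;
}

/* I/O cost of fetching some pages in random order via an index. */
double
demo_indexscan_cost(double pages_fetched, double random_page_cost)
{
	return pages_fetched * random_page_cost;
}

/* Nonzero if the index scan looks cheaper under this toy model. */
int
demo_prefers_index(double table_pages, double pages_fetched,
				   double seq_page_cost, double random_page_cost)
{
	return demo_indexscan_cost(pages_fetched, random_page_cost) <
		demo_seqscan_cost(table_pages, seq_page_cost);
}
```

For a 10000-page table where the index scan would touch 3000 pages, the default costs (1.0 and 4.0) give 10000 vs. 12000, so the sequential scan wins. With random_page_cost lowered to 1.5 for an SSD tablespace the index scan costs 4500 and wins instead. A single global setting can't be right for both kinds of storage.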

Greg Stark proposed adding effective_io_concurrency, and that makes
plenty of sense to me, but I'm sort of disinclined to attempt to
implement that as part of this patch because I have no familiarity
with that part of the code and no hardware that I can use to test
either the current behavior or the modified behavior. Since I'm
recoding this to use the reloptions mechanism, a patch to add support
for that should be pretty easy to write as a follow-on patch once this
goes in.

Any other suggestions?

Current version of patch is attached. I've revised it to use the
reloptions stuff, but I don't think it's committable as-is because it
currently thinks that extracting options from a pg_tablespace tuple is
a cheap operation, which was true in the non-reloptions-based
implementation but is less true now. At least, some benchmarking
needs to be done to figure out whether and to what extent this is an
issue.

...Robert

Attachments:

spcoptions-v2.patch (text/x-diff; charset=US-ASCII)

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 1935,1940 **** archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows
--- 1935,1943 ----
         <para>
          Sets the planner's estimate of the cost of a disk page fetch
          that is part of a series of sequential fetches.  The default is 1.0.
+         This value can be overridden for a particular tablespace by setting
+         the tablespace parameter of the same name
+         (see <xref linkend="sql-altertablespace">).
         </para>
        </listitem>
       </varlistentry>
***************
*** 1948,1953 **** archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows
--- 1951,1962 ----
         <para>
          Sets the planner's estimate of the cost of a
          non-sequentially-fetched disk page.  The default is 4.0.
+         This value can be overridden for a particular tablespace by setting
+         the tablespace parameter of the same name
+         (see <xref linkend="sql-altertablespace">).
+        </para>
+ 
+ 	   <para>
          Reducing this value relative to <varname>seq_page_cost</>
          will cause the system to prefer index scans; raising it will
          make index scans look relatively more expensive.  You can raise
*** a/doc/src/sgml/ref/alter_tablespace.sgml
--- b/doc/src/sgml/ref/alter_tablespace.sgml
***************
*** 23,28 **** PostgreSQL documentation
--- 23,30 ----
  <synopsis>
  ALTER TABLESPACE <replaceable>name</replaceable> RENAME TO <replaceable>new_name</replaceable>
  ALTER TABLESPACE <replaceable>name</replaceable> OWNER TO <replaceable>new_owner</replaceable>
+ ALTER TABLESPACE <replaceable>name</replaceable> SET ( <replaceable class="PARAMETER">tablespace_option</replaceable> = <replaceable class="PARAMETER">value</replaceable> [, ... ] )
+ ALTER TABLESPACE <replaceable>name</replaceable> RESET ( <replaceable class="PARAMETER">tablespace_option</replaceable> [, ... ] )
  </synopsis>
   </refsynopsisdiv>
    
***************
*** 74,79 **** ALTER TABLESPACE <replaceable>name</replaceable> OWNER TO <replaceable>new_owner
--- 76,99 ----
       </para>
      </listitem>
     </varlistentry>
+ 
+    <varlistentry>
+     <term><replaceable class="parameter">tablespace_option</replaceable></term>
+     <listitem>
+      <para>
+       A tablespace parameter to be set or reset.  Currently, the only
+       available parameters are <varname>seq_page_cost</> and
+       <varname>random_page_cost</>.  Setting either value for a particular
+       tablespace will override the planner's usual estimate of the cost of
+       reading pages from tables in that tablespace, as established by
+       the configuration parameters of the same name (see
+       <xref linkend="guc-seq-page-cost">,
+       <xref linkend="guc-random-page-cost">).  This may be useful if one
+       tablespace is located on a disk which is faster or slower than the
+       remainder of the I/O subsystem.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
   </refsect1>
  
*** a/src/backend/access/common/reloptions.c
--- b/src/backend/access/common/reloptions.c
***************
*** 21,26 ****
--- 21,27 ----
  #include "access/reloptions.h"
  #include "catalog/pg_type.h"
  #include "commands/defrem.h"
+ #include "commands/tablespace.h"
  #include "nodes/makefuncs.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
***************
*** 179,184 **** static relopt_real realRelOpts[] =
--- 180,201 ----
  		},
  		-1, 0.0, 100.0
  	},
+ 	{
+ 		{
+ 			"seq_page_cost",
+ 			"Sets the planner's estimate of the cost of a sequentially fetched disk page.",
+ 			RELOPT_KIND_TABLESPACE
+ 		},
+ 		-1, 0.0, DBL_MAX
+ 	},
+ 	{
+ 		{
+ 			"random_page_cost",
+ 			"Sets the planner's estimate of the cost of a nonsequentially fetched disk page.",
+ 			RELOPT_KIND_TABLESPACE
+ 		},
+ 		-1, 0.0, DBL_MAX
+ 	},
  	/* list terminator */
  	{{NULL}}
  };
***************
*** 1168,1170 **** index_reloptions(RegProcedure amoptions, Datum reloptions, bool validate)
--- 1185,1218 ----
  
  	return DatumGetByteaP(result);
  }
+ 
+ /*
+  * Option parser for tablespace reloptions
+  */
+ bytea *
+ tablespace_reloptions(Datum reloptions, bool validate)
+ {
+ 	relopt_value *options;
+ 	TableSpaceOpts	*tsopts;
+ 	int			numoptions;
+ 	static const relopt_parse_elt tab[] = {
+ 		{"random_page_cost", RELOPT_TYPE_REAL, offsetof(TableSpaceOpts, random_page_cost)},
+ 		{"seq_page_cost", RELOPT_TYPE_REAL, offsetof(TableSpaceOpts, seq_page_cost)}
+ 	};
+ 
+ 	options = parseRelOptions(reloptions, validate, RELOPT_KIND_TABLESPACE,
+ 							  &numoptions);
+ 
+ 	/* if none set, we're done */
+ 	if (numoptions == 0)
+ 		return NULL;
+ 
+ 	tsopts = allocateReloptStruct(sizeof(TableSpaceOpts), options, numoptions);
+ 
+ 	fillRelOptions((void *) tsopts, sizeof(TableSpaceOpts), options, numoptions,
+ 				   validate, tab, lengthof(tab));
+ 
+ 	pfree(options);
+ 
+ 	return (bytea *) tsopts;
+ }
*** a/src/backend/catalog/aclchk.c
--- b/src/backend/catalog/aclchk.c
***************
*** 2621,2638 **** ExecGrant_Tablespace(InternalGrant *istmt)
  		int			nnewmembers;
  		Oid		   *oldmembers;
  		Oid		   *newmembers;
- 		ScanKeyData entry[1];
- 		SysScanDesc scan;
  		HeapTuple	tuple;
  
! 		/* There's no syscache for pg_tablespace, so must look the hard way */
! 		ScanKeyInit(&entry[0],
! 					ObjectIdAttributeNumber,
! 					BTEqualStrategyNumber, F_OIDEQ,
! 					ObjectIdGetDatum(tblId));
! 		scan = systable_beginscan(relation, TablespaceOidIndexId, true,
! 								  SnapshotNow, 1, entry);
! 		tuple = systable_getnext(scan);
  		if (!HeapTupleIsValid(tuple))
  			elog(ERROR, "cache lookup failed for tablespace %u", tblId);
  
--- 2621,2631 ----
  		int			nnewmembers;
  		Oid		   *oldmembers;
  		Oid		   *newmembers;
  		HeapTuple	tuple;
  
! 		/* Search syscache for pg_tablespace */
! 		tuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(tblId),
! 							   0, 0, 0);
  		if (!HeapTupleIsValid(tuple))
  			elog(ERROR, "cache lookup failed for tablespace %u", tblId);
  
***************
*** 2703,2709 **** ExecGrant_Tablespace(InternalGrant *istmt)
  							  noldmembers, oldmembers,
  							  nnewmembers, newmembers);
  
! 		systable_endscan(scan);
  
  		pfree(new_acl);
  
--- 2696,2702 ----
  							  noldmembers, oldmembers,
  							  nnewmembers, newmembers);
  
! 		ReleaseSysCache(tuple);
  
  		pfree(new_acl);
  
***************
*** 3443,3451 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  					  AclMode mask, AclMaskHow how)
  {
  	AclMode		result;
- 	Relation	pg_tablespace;
- 	ScanKeyData entry[1];
- 	SysScanDesc scan;
  	HeapTuple	tuple;
  	Datum		aclDatum;
  	bool		isNull;
--- 3436,3441 ----
***************
*** 3458,3474 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  
  	/*
  	 * Get the tablespace's ACL from pg_tablespace
- 	 *
- 	 * There's no syscache for pg_tablespace, so must look the hard way
  	 */
! 	pg_tablespace = heap_open(TableSpaceRelationId, AccessShareLock);
! 	ScanKeyInit(&entry[0],
! 				ObjectIdAttributeNumber,
! 				BTEqualStrategyNumber, F_OIDEQ,
! 				ObjectIdGetDatum(spc_oid));
! 	scan = systable_beginscan(pg_tablespace, TablespaceOidIndexId, true,
! 							  SnapshotNow, 1, entry);
! 	tuple = systable_getnext(scan);
  	if (!HeapTupleIsValid(tuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
--- 3448,3456 ----
  
  	/*
  	 * Get the tablespace's ACL from pg_tablespace
  	 */
! 	tuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(spc_oid),
! 						   0, 0, 0);
  	if (!HeapTupleIsValid(tuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
***************
*** 3476,3483 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  
  	ownerId = ((Form_pg_tablespace) GETSTRUCT(tuple))->spcowner;
  
! 	aclDatum = heap_getattr(tuple, Anum_pg_tablespace_spcacl,
! 							RelationGetDescr(pg_tablespace), &isNull);
  
  	if (isNull)
  	{
--- 3458,3466 ----
  
  	ownerId = ((Form_pg_tablespace) GETSTRUCT(tuple))->spcowner;
  
! 	aclDatum = SysCacheGetAttr(TABLESPACEOID, tuple,
! 								   Anum_pg_tablespace_spcacl,
! 								   &isNull);
  
  	if (isNull)
  	{
***************
*** 3497,3504 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  	if (acl && (Pointer) acl != DatumGetPointer(aclDatum))
  		pfree(acl);
  
! 	systable_endscan(scan);
! 	heap_close(pg_tablespace, AccessShareLock);
  
  	return result;
  }
--- 3480,3486 ----
  	if (acl && (Pointer) acl != DatumGetPointer(aclDatum))
  		pfree(acl);
  
! 	ReleaseSysCache(tuple);
  
  	return result;
  }
***************
*** 4025,4033 **** pg_namespace_ownercheck(Oid nsp_oid, Oid roleid)
  bool
  pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
  {
- 	Relation	pg_tablespace;
- 	ScanKeyData entry[1];
- 	SysScanDesc scan;
  	HeapTuple	spctuple;
  	Oid			spcowner;
  
--- 4007,4012 ----
***************
*** 4035,4051 **** pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
  	if (superuser_arg(roleid))
  		return true;
  
! 	/* There's no syscache for pg_tablespace, so must look the hard way */
! 	pg_tablespace = heap_open(TableSpaceRelationId, AccessShareLock);
! 	ScanKeyInit(&entry[0],
! 				ObjectIdAttributeNumber,
! 				BTEqualStrategyNumber, F_OIDEQ,
! 				ObjectIdGetDatum(spc_oid));
! 	scan = systable_beginscan(pg_tablespace, TablespaceOidIndexId, true,
! 							  SnapshotNow, 1, entry);
! 
! 	spctuple = systable_getnext(scan);
! 
  	if (!HeapTupleIsValid(spctuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
--- 4014,4022 ----
  	if (superuser_arg(roleid))
  		return true;
  
! 	/* Search syscache for pg_tablespace */
! 	spctuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(spc_oid),
! 							  0, 0, 0);
  	if (!HeapTupleIsValid(spctuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
***************
*** 4053,4060 **** pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
  
  	spcowner = ((Form_pg_tablespace) GETSTRUCT(spctuple))->spcowner;
  
! 	systable_endscan(scan);
! 	heap_close(pg_tablespace, AccessShareLock);
  
  	return has_privs_of_role(roleid, spcowner);
  }
--- 4024,4030 ----
  
  	spcowner = ((Form_pg_tablespace) GETSTRUCT(spctuple))->spcowner;
  
! 	ReleaseSysCache(spctuple);
  
  	return has_privs_of_role(roleid, spcowner);
  }
*** a/src/backend/commands/tablespace.c
--- b/src/backend/commands/tablespace.c
***************
*** 49,54 ****
--- 49,55 ----
  #include <sys/stat.h>
  
  #include "access/heapam.h"
+ #include "access/reloptions.h"
  #include "access/sysattr.h"
  #include "access/xact.h"
  #include "catalog/catalog.h"
***************
*** 56,61 ****
--- 57,63 ----
  #include "catalog/indexing.h"
  #include "catalog/pg_tablespace.h"
  #include "commands/comment.h"
+ #include "commands/defrem.h"
  #include "commands/tablespace.h"
  #include "miscadmin.h"
  #include "postmaster/bgwriter.h"
***************
*** 67,72 ****
--- 69,75 ----
  #include "utils/lsyscache.h"
  #include "utils/memutils.h"
  #include "utils/rel.h"
+ #include "utils/syscache.h"
  #include "utils/tqual.h"
  
  
***************
*** 287,292 **** CreateTableSpace(CreateTableSpaceStmt *stmt)
--- 290,296 ----
  	values[Anum_pg_tablespace_spclocation - 1] =
  		CStringGetTextDatum(location);
  	nulls[Anum_pg_tablespace_spcacl - 1] = true;
+ 	nulls[Anum_pg_tablespace_spcoptions - 1] = true;
  
  	tuple = heap_form_tuple(rel->rd_att, values, nulls);
  
***************
*** 910,915 **** AlterTableSpaceOwner(const char *name, Oid newOwnerId)
--- 914,986 ----
  
  
  /*
+  * Alter table space options
+  */
+ void
+ AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt)
+ {
+ 	Relation	rel;
+ 	ScanKeyData entry[1];
+ 	HeapScanDesc scandesc;
+ 	HeapTuple	tup;
+ 	Datum		datum;
+ 	Datum		newOptions;
+ 	Datum		repl_val[Natts_pg_tablespace];
+ 	bool		isnull;
+ 	bool		repl_null[Natts_pg_tablespace];
+ 	bool		repl_repl[Natts_pg_tablespace];
+ 	HeapTuple	newtuple;
+ 
+ 	/* Search pg_tablespace */
+ 	rel = heap_open(TableSpaceRelationId, RowExclusiveLock);
+ 
+ 	ScanKeyInit(&entry[0],
+ 				Anum_pg_tablespace_spcname,
+ 				BTEqualStrategyNumber, F_NAMEEQ,
+ 				CStringGetDatum(stmt->tablespacename));
+ 	scandesc = heap_beginscan(rel, SnapshotNow, 1, entry);
+ 	tup = heap_getnext(scandesc, ForwardScanDirection);
+ 	if (!HeapTupleIsValid(tup))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_UNDEFINED_OBJECT),
+ 				 errmsg("tablespace \"%s\" does not exist",
+ 					stmt->tablespacename)));
+ 
+ 	/* Must be owner of the existing object */
+ 	if (!pg_tablespace_ownercheck(HeapTupleGetOid(tup), GetUserId()))
+ 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_TABLESPACE,
+ 					   stmt->tablespacename);
+ 
+ 	/* Generate new proposed spcoptions (text array) */
+ 	datum = heap_getattr(tup, Anum_pg_tablespace_spcoptions,
+ 						 RelationGetDescr(rel), &isnull);
+ 	newOptions = transformRelOptions(isnull ? (Datum) 0 : datum,
+ 									 stmt->options, NULL, NULL, false,
+ 									 stmt->isReset);
+ 	(void) tablespace_reloptions(newOptions, true);
+ 
+ 	/* Build new tuple. */
+ 	memset(repl_null, false, sizeof(repl_null));
+ 	memset(repl_repl, false, sizeof(repl_repl));
+ 	if (newOptions != (Datum) 0)
+ 		repl_val[Anum_pg_tablespace_spcoptions - 1] = newOptions;
+ 	else
+ 		repl_null[Anum_pg_tablespace_spcoptions - 1] = true;
+ 	repl_repl[Anum_pg_tablespace_spcoptions - 1] = true;
+ 	newtuple = heap_modify_tuple(tup, RelationGetDescr(rel), repl_val,
+ 								 repl_null, repl_repl);
+ 
+ 	/* Update system catalog. */
+ 	simple_heap_update(rel, &newtuple->t_self, newtuple);
+ 	CatalogUpdateIndexes(rel, newtuple);
+ 	heap_freetuple(newtuple);
+ 
+ 	/* Conclude heap scan. */
+ 	heap_endscan(scandesc);
+ 	heap_close(rel, NoLock);
+ }
+ 
+ /*
   * Routines for handling the GUC variable 'default_tablespace'.
   */
  
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 3055,3060 **** _copyDropTableSpaceStmt(DropTableSpaceStmt *from)
--- 3055,3072 ----
  	return newnode;
  }
  
+ static AlterTableSpaceOptionsStmt *
+ _copyAlterTableSpaceOptionsStmt(AlterTableSpaceOptionsStmt *from)
+ {
+ 	AlterTableSpaceOptionsStmt *newnode = makeNode(AlterTableSpaceOptionsStmt);
+ 
+ 	COPY_STRING_FIELD(tablespacename);
+ 	COPY_NODE_FIELD(options);
+ 	COPY_SCALAR_FIELD(isReset);
+ 
+ 	return newnode;
+ }
+ 
  static CreateFdwStmt *
  _copyCreateFdwStmt(CreateFdwStmt *from)
  {
***************
*** 4019,4024 **** copyObject(void *from)
--- 4031,4039 ----
  		case T_DropTableSpaceStmt:
  			retval = _copyDropTableSpaceStmt(from);
  			break;
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			retval = _copyAlterTableSpaceOptionsStmt(from);
+ 			break;
  		case T_CreateFdwStmt:
  			retval = _copyCreateFdwStmt(from);
  			break;
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 1566,1571 **** _equalDropTableSpaceStmt(DropTableSpaceStmt *a, DropTableSpaceStmt *b)
--- 1566,1582 ----
  }
  
  static bool
+ _equalAlterTableSpaceOptionsStmt(AlterTableSpaceOptionsStmt *a,
+ 											 AlterTableSpaceOptionsStmt *b)
+ {
+ 	COMPARE_STRING_FIELD(tablespacename);
+ 	COMPARE_NODE_FIELD(options);
+ 	COMPARE_SCALAR_FIELD(isReset);
+ 
+ 	return true;
+ }
+ 
+ static bool
  _equalCreateFdwStmt(CreateFdwStmt *a, CreateFdwStmt *b)
  {
  	COMPARE_STRING_FIELD(fdwname);
***************
*** 2712,2717 **** equal(void *a, void *b)
--- 2723,2731 ----
  		case T_DropTableSpaceStmt:
  			retval = _equalDropTableSpaceStmt(a, b);
  			break;
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			retval = _equalAlterTableSpaceOptionsStmt(a, b);
+ 			break;
  		case T_CreateFdwStmt:
  			retval = _equalCreateFdwStmt(a, b);
  			break;
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 1586,1591 **** _outRelOptInfo(StringInfo str, RelOptInfo *node)
--- 1586,1592 ----
  	WRITE_NODE_FIELD(cheapest_total_path);
  	WRITE_NODE_FIELD(cheapest_unique_path);
  	WRITE_UINT_FIELD(relid);
+ 	WRITE_UINT_FIELD(reltablespace);
  	WRITE_ENUM_FIELD(rtekind, RTEKind);
  	WRITE_INT_FIELD(min_attr);
  	WRITE_INT_FIELD(max_attr);
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
***************
*** 27,32 ****
--- 27,37 ----
   * detail.	Note that all of these parameters are user-settable, in case
   * the default values are drastically off for a particular platform.
   *
+  * seq_page_cost and random_page_cost can also be overridden for an individual
+  * tablespace, in case some data is on a fast disk and other data is on a slow
+  * disk.  Per-tablespace overrides never apply to temporary work files such as
+  * an external sort or a materialize node that overflows work_mem.
+  *
   * We compute two separate costs for each path:
   *		total_cost: total estimated cost to fetch all tuples
   *		startup_cost: cost that is expended before first tuple is fetched
***************
*** 164,169 **** void
--- 169,175 ----
  cost_seqscan(Path *path, PlannerInfo *root,
  			 RelOptInfo *baserel)
  {
+ 	double		spc_seq_page_cost;
  	Cost		startup_cost = 0;
  	Cost		run_cost = 0;
  	Cost		cpu_per_tuple;
***************
*** 175,184 **** cost_seqscan(Path *path, PlannerInfo *root,
  	if (!enable_seqscan)
  		startup_cost += disable_cost;
  
  	/*
  	 * disk costs
  	 */
! 	run_cost += seq_page_cost * baserel->pages;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup;
--- 181,195 ----
  	if (!enable_seqscan)
  		startup_cost += disable_cost;
  
+ 	/* fetch estimated page cost for tablespace containing table */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  NULL,
+ 							  &spc_seq_page_cost);
+ 
  	/*
  	 * disk costs
  	 */
! 	run_cost += spc_seq_page_cost * baserel->pages;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup;
***************
*** 226,231 **** cost_index(IndexPath *path, PlannerInfo *root,
--- 237,244 ----
  	Selectivity indexSelectivity;
  	double		indexCorrelation,
  				csquared;
+ 	double		spc_seq_page_cost,
+ 				spc_random_page_cost;
  	Cost		min_IO_cost,
  				max_IO_cost;
  	Cost		cpu_per_tuple;
***************
*** 272,284 **** cost_index(IndexPath *path, PlannerInfo *root,
  	/* estimate number of main-table tuples fetched */
  	tuples_fetched = clamp_row_est(indexSelectivity * baserel->tuples);
  
  	/*----------
  	 * Estimate number of main-table pages fetched, and compute I/O cost.
  	 *
  	 * When the index ordering is uncorrelated with the table ordering,
  	 * we use an approximation proposed by Mackert and Lohman (see
  	 * index_pages_fetched() for details) to compute the number of pages
! 	 * fetched, and then charge random_page_cost per page fetched.
  	 *
  	 * When the index ordering is exactly correlated with the table ordering
  	 * (just after a CLUSTER, for example), the number of pages fetched should
--- 285,302 ----
  	/* estimate number of main-table tuples fetched */
  	tuples_fetched = clamp_row_est(indexSelectivity * baserel->tuples);
  
+ 	/* fetch estimated page costs for tablespace containing table */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  &spc_seq_page_cost);
+ 
  	/*----------
  	 * Estimate number of main-table pages fetched, and compute I/O cost.
  	 *
  	 * When the index ordering is uncorrelated with the table ordering,
  	 * we use an approximation proposed by Mackert and Lohman (see
  	 * index_pages_fetched() for details) to compute the number of pages
! 	 * fetched, and then charge spc_random_page_cost per page fetched.
  	 *
  	 * When the index ordering is exactly correlated with the table ordering
  	 * (just after a CLUSTER, for example), the number of pages fetched should
***************
*** 286,292 **** cost_index(IndexPath *path, PlannerInfo *root,
  	 * will be sequential fetches, not the random fetches that occur in the
  	 * uncorrelated case.  So if the number of pages is more than 1, we
  	 * ought to charge
! 	 *		random_page_cost + (pages_fetched - 1) * seq_page_cost
  	 * For partially-correlated indexes, we ought to charge somewhere between
  	 * these two estimates.  We currently interpolate linearly between the
  	 * estimates based on the correlation squared (XXX is that appropriate?).
--- 304,310 ----
  	 * will be sequential fetches, not the random fetches that occur in the
  	 * uncorrelated case.  So if the number of pages is more than 1, we
  	 * ought to charge
! 	 *		spc_random_page_cost + (pages_fetched - 1) * spc_seq_page_cost
  	 * For partially-correlated indexes, we ought to charge somewhere between
  	 * these two estimates.  We currently interpolate linearly between the
  	 * estimates based on the correlation squared (XXX is that appropriate?).
***************
*** 309,315 **** cost_index(IndexPath *path, PlannerInfo *root,
  											(double) index->pages,
  											root);
  
! 		max_IO_cost = (pages_fetched * random_page_cost) / num_scans;
  
  		/*
  		 * In the perfectly correlated case, the number of pages touched by
--- 327,333 ----
  											(double) index->pages,
  											root);
  
! 		max_IO_cost = (pages_fetched * spc_random_page_cost) / num_scans;
  
  		/*
  		 * In the perfectly correlated case, the number of pages touched by
***************
*** 328,334 **** cost_index(IndexPath *path, PlannerInfo *root,
  											(double) index->pages,
  											root);
  
! 		min_IO_cost = (pages_fetched * random_page_cost) / num_scans;
  	}
  	else
  	{
--- 346,352 ----
  											(double) index->pages,
  											root);
  
! 		min_IO_cost = (pages_fetched * spc_random_page_cost) / num_scans;
  	}
  	else
  	{
***************
*** 342,354 **** cost_index(IndexPath *path, PlannerInfo *root,
  											root);
  
  		/* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
! 		max_IO_cost = pages_fetched * random_page_cost;
  
  		/* min_IO_cost is for the perfectly correlated case (csquared=1) */
  		pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
! 		min_IO_cost = random_page_cost;
  		if (pages_fetched > 1)
! 			min_IO_cost += (pages_fetched - 1) * seq_page_cost;
  	}
  
  	/*
--- 360,372 ----
  											root);
  
  		/* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
! 		max_IO_cost = pages_fetched * spc_random_page_cost;
  
  		/* min_IO_cost is for the perfectly correlated case (csquared=1) */
  		pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
! 		min_IO_cost = spc_random_page_cost;
  		if (pages_fetched > 1)
! 			min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;
  	}
  
  	/*
***************
*** 553,558 **** cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
--- 571,578 ----
  	Cost		cost_per_page;
  	double		tuples_fetched;
  	double		pages_fetched;
+ 	double		spc_seq_page_cost,
+ 				spc_random_page_cost;
  	double		T;
  
  	/* Should only be applied to base relations */
***************
*** 571,576 **** cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
--- 591,601 ----
  
  	startup_cost += indexTotalCost;
  
+ 	/* Fetch estimated page costs for tablespace containing table. */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  &spc_seq_page_cost);
+ 
  	/*
  	 * Estimate number of main-table pages fetched.
  	 */
***************
*** 609,625 **** cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
  		pages_fetched = ceil(pages_fetched);
  
  	/*
! 	 * For small numbers of pages we should charge random_page_cost apiece,
  	 * while if nearly all the table's pages are being read, it's more
! 	 * appropriate to charge seq_page_cost apiece.	The effect is nonlinear,
  	 * too. For lack of a better idea, interpolate like this to determine the
  	 * cost per page.
  	 */
  	if (pages_fetched >= 2.0)
! 		cost_per_page = random_page_cost -
! 			(random_page_cost - seq_page_cost) * sqrt(pages_fetched / T);
  	else
! 		cost_per_page = random_page_cost;
  
  	run_cost += pages_fetched * cost_per_page;
  
--- 634,651 ----
  		pages_fetched = ceil(pages_fetched);
  
  	/*
! 	 * For small numbers of pages we should charge spc_random_page_cost apiece,
  	 * while if nearly all the table's pages are being read, it's more
! 	 * appropriate to charge spc_seq_page_cost apiece.	The effect is nonlinear,
  	 * too. For lack of a better idea, interpolate like this to determine the
  	 * cost per page.
  	 */
  	if (pages_fetched >= 2.0)
! 		cost_per_page = spc_random_page_cost -
! 			(spc_random_page_cost - spc_seq_page_cost)
! 			* sqrt(pages_fetched / T);
  	else
! 		cost_per_page = spc_random_page_cost;
  
  	run_cost += pages_fetched * cost_per_page;
  
***************
*** 783,788 **** cost_tidscan(Path *path, PlannerInfo *root,
--- 809,815 ----
  	QualCost	tid_qual_cost;
  	int			ntuples;
  	ListCell   *l;
+ 	double		spc_random_page_cost;
  
  	/* Should only be applied to base relations */
  	Assert(baserel->relid > 0);
***************
*** 835,842 **** cost_tidscan(Path *path, PlannerInfo *root,
  	 */
  	cost_qual_eval(&tid_qual_cost, tidquals, root);
  
  	/* disk costs --- assume each tuple on a different page */
! 	run_cost += random_page_cost * ntuples;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup +
--- 862,874 ----
  	 */
  	cost_qual_eval(&tid_qual_cost, tidquals, root);
  
+ 	/* fetch estimated page cost for tablespace containing table */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  NULL);
+ 
  	/* disk costs --- assume each tuple on a different page */
! 	run_cost += spc_random_page_cost * ntuples;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup +
*** a/src/backend/optimizer/util/plancat.c
--- b/src/backend/optimizer/util/plancat.c
***************
*** 91,96 **** get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
--- 91,97 ----
  
  	rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1;
  	rel->max_attr = RelationGetNumberOfAttributes(relation);
+ 	rel->reltablespace = RelationGetForm(relation)->reltablespace;
  
  	Assert(rel->max_attr >= rel->min_attr);
  	rel->attr_needed = (Relids *)
***************
*** 183,188 **** get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
--- 184,191 ----
  			info = makeNode(IndexOptInfo);
  
  			info->indexoid = index->indexrelid;
+ 			info->reltablespace =
+ 				RelationGetForm(indexRelation)->reltablespace;
  			info->rel = rel;
  			info->ncolumns = ncolumns = index->indnatts;
  
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 5630,5635 **** RenameStmt: ALTER AGGREGATE func_name aggr_args RENAME TO name
--- 5630,5654 ----
  					n->newname = $6;
  					$$ = (Node *)n;
  				}
+ 			| ALTER TABLESPACE name SET reloptions
+ 				{
+ 					AlterTableSpaceOptionsStmt *n =
+ 						makeNode(AlterTableSpaceOptionsStmt);
+ 					n->tablespacename = $3;
+ 					n->options = $5;
+ 					n->isReset = FALSE;
+ 					$$ = (Node *)n;
+ 				}
+ 			| ALTER TABLESPACE name RESET reloptions
+ 				{
+ 					AlterTableSpaceOptionsStmt *n =
+ 						makeNode(AlterTableSpaceOptionsStmt);
+ 					ListCell *lc;
+ 					n->tablespacename = $3;
+ 					n->options = $5;
+ 					n->isReset = TRUE;
+ 					$$ = (Node *)n;
+ 				}
  			| ALTER TEXT_P SEARCH PARSER any_name RENAME TO name
  				{
  					RenameStmt *n = makeNode(RenameStmt);
*** a/src/backend/tcop/utility.c
--- b/src/backend/tcop/utility.c
***************
*** 214,219 **** check_xact_readonly(Node *parsetree)
--- 214,220 ----
  		case T_CreateUserMappingStmt:
  		case T_AlterUserMappingStmt:
  		case T_DropUserMappingStmt:
+ 		case T_AlterTableSpaceOptionsStmt:
  			ereport(ERROR,
  					(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
  					 errmsg("transaction is read-only")));
***************
*** 480,485 **** ProcessUtility(Node *parsetree,
--- 481,490 ----
  			DropTableSpace((DropTableSpaceStmt *) parsetree);
  			break;
  
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			AlterTableSpaceOptions((AlterTableSpaceOptionsStmt *) parsetree);
+ 			break;
+ 
  		case T_CreateFdwStmt:
  			CreateForeignDataWrapper((CreateFdwStmt *) parsetree);
  			break;
***************
*** 1386,1391 **** CreateCommandTag(Node *parsetree)
--- 1391,1400 ----
  			tag = "DROP TABLESPACE";
  			break;
  
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			tag = "ALTER TABLESPACE";
+ 			break;
+ 
  		case T_CreateFdwStmt:
  			tag = "CREATE FOREIGN DATA WRAPPER";
  			break;
***************
*** 2165,2170 **** GetCommandLogLevel(Node *parsetree)
--- 2174,2183 ----
  			lev = LOGSTMT_DDL;
  			break;
  
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			lev = LOGSTMT_DDL;
+ 			break;
+ 
  		case T_CreateFdwStmt:
  		case T_AlterFdwStmt:
  		case T_DropFdwStmt:
*** a/src/backend/utils/adt/selfuncs.c
--- b/src/backend/utils/adt/selfuncs.c
***************
*** 5372,5377 **** genericcostestimate(PlannerInfo *root,
--- 5372,5378 ----
  	QualCost	index_qual_cost;
  	double		qual_op_cost;
  	double		qual_arg_cost;
+ 	double		spc_random_page_cost;
  	List	   *selectivityQuals;
  	ListCell   *l;
  
***************
*** 5480,5485 **** genericcostestimate(PlannerInfo *root,
--- 5481,5491 ----
  	else
  		numIndexPages = 1.0;
  
+ 	/* fetch estimated page cost for tablespace containing index */
+ 	get_tablespace_page_costs(index->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  NULL);
+ 
  	/*
  	 * Now compute the disk access costs.
  	 *
***************
*** 5526,5540 **** genericcostestimate(PlannerInfo *root,
  		 * share for each outer scan.  (Don't pro-rate for ScalarArrayOpExpr,
  		 * since that's internal to the indexscan.)
  		 */
! 		*indexTotalCost = (pages_fetched * random_page_cost) / num_outer_scans;
  	}
  	else
  	{
  		/*
! 		 * For a single index scan, we just charge random_page_cost per page
! 		 * touched.
  		 */
! 		*indexTotalCost = numIndexPages * random_page_cost;
  	}
  
  	/*
--- 5532,5547 ----
  		 * share for each outer scan.  (Don't pro-rate for ScalarArrayOpExpr,
  		 * since that's internal to the indexscan.)
  		 */
! 		*indexTotalCost = (pages_fetched * spc_random_page_cost)
! 							/ num_outer_scans;
  	}
  	else
  	{
  		/*
! 		 * For a single index scan, we just charge spc_random_page_cost per
! 		 * page touched.
  		 */
! 		*indexTotalCost = numIndexPages * spc_random_page_cost;
  	}
  
  	/*
***************
*** 5549,5559 **** genericcostestimate(PlannerInfo *root,
  	 *
  	 * We can deal with this by adding a very small "fudge factor" that
  	 * depends on the index size.  The fudge factor used here is one
! 	 * random_page_cost per 100000 index pages, which should be small enough
! 	 * to not alter index-vs-seqscan decisions, but will prevent indexes of
! 	 * different sizes from looking exactly equally attractive.
  	 */
! 	*indexTotalCost += index->pages * random_page_cost / 100000.0;
  
  	/*
  	 * CPU cost: any complex expressions in the indexquals will need to be
--- 5556,5566 ----
  	 *
  	 * We can deal with this by adding a very small "fudge factor" that
  	 * depends on the index size.  The fudge factor used here is one
! 	 * spc_random_page_cost per 100000 index pages, which should be small
! 	 * enough to not alter index-vs-seqscan decisions, but will prevent
! 	 * indexes of different sizes from looking exactly equally attractive.
  	 */
! 	*indexTotalCost += index->pages * spc_random_page_cost / 100000.0;
  
  	/*
  	 * CPU cost: any complex expressions in the indexquals will need to be
*** a/src/backend/utils/cache/lsyscache.c
--- b/src/backend/utils/cache/lsyscache.c
***************
*** 17,22 ****
--- 17,23 ----
  
  #include "access/hash.h"
  #include "access/nbtree.h"
+ #include "access/reloptions.h"
  #include "bootstrap/bootstrap.h"
  #include "catalog/pg_amop.h"
  #include "catalog/pg_amproc.h"
***************
*** 26,34 ****
--- 27,38 ----
  #include "catalog/pg_operator.h"
  #include "catalog/pg_proc.h"
  #include "catalog/pg_statistic.h"
+ #include "catalog/pg_tablespace.h"
  #include "catalog/pg_type.h"
+ #include "commands/tablespace.h"
  #include "miscadmin.h"
  #include "nodes/makefuncs.h"
+ #include "optimizer/cost.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
  #include "utils/datum.h"
***************
*** 2776,2778 **** get_roleid_checked(const char *rolname)
--- 2780,2830 ----
  				 errmsg("role \"%s\" does not exist", rolname)));
  	return roleid;
  }
+ 
+ /*				---------- PG_TABLESPACE CACHE ----------				 */
+ 
+ /*
+  * get_tablespace_page_costs
+  *		Returns random and sequential page costs for a given tablespace
+  */
+ void
+ get_tablespace_page_costs(Oid spcid, double *spc_random_page_cost,
+ 					     double *spc_seq_page_cost)
+ {
+ 	HeapTuple	tp;
+ 
+ 	/* Ensure output args are initialized on failure */
+ 	if (spc_random_page_cost)
+ 		*spc_random_page_cost = random_page_cost;
+ 	if (spc_seq_page_cost)
+ 		*spc_seq_page_cost = seq_page_cost;
+ 
+ 	/* spcid is always from a pg_class tuple, so InvalidOid implies the
+ 	 * default */
+ 	if (spcid == InvalidOid)
+ 		spcid = MyDatabaseTableSpace;
+ 
+ 	tp = SearchSysCache(TABLESPACEOID,
+ 						ObjectIdGetDatum(spcid),
+ 						0, 0, 0);
+ 	if (HeapTupleIsValid(tp))
+ 	{
+ 		bool	isNull;
+ 		Datum	datum;
+ 
+ 		datum = SysCacheGetAttr(TABLESPACEOID, tp,
+ 			Anum_pg_tablespace_spcoptions, &isNull);
+ 
+ 		if (!isNull)
+ 		{
+ 			TableSpaceOpts	*opts = (TableSpaceOpts *)
+ 				tablespace_reloptions(datum, false);
+ 			if (spc_random_page_cost && opts->random_page_cost >= 0)
+ 				*spc_random_page_cost = opts->random_page_cost;
+ 			if (spc_seq_page_cost && opts->seq_page_cost >= 0)
+ 				*spc_seq_page_cost = opts->seq_page_cost;
+ 		}
+ 
+ 		ReleaseSysCache(tp);
+ 	}
+ }
*** a/src/backend/utils/cache/syscache.c
--- b/src/backend/utils/cache/syscache.c
***************
*** 43,48 ****
--- 43,49 ----
  #include "catalog/pg_proc.h"
  #include "catalog/pg_rewrite.h"
  #include "catalog/pg_statistic.h"
+ #include "catalog/pg_tablespace.h"
  #include "catalog/pg_ts_config.h"
  #include "catalog/pg_ts_config_map.h"
  #include "catalog/pg_ts_dict.h"
***************
*** 609,614 **** static const struct cachedesc cacheinfo[] = {
--- 610,627 ----
  		},
  		1024
  	},
+ 	{TableSpaceRelationId,		/* TABLESPACEOID */
+ 		TablespaceOidIndexId,
+ 		0,
+ 		1,
+ 		{
+ 			ObjectIdAttributeNumber,
+ 			0,
+ 			0,
+ 			0,
+ 		},
+ 		16
+ 	},
  	{TSConfigMapRelationId,		/* TSCONFIGMAP */
  		TSConfigMapIndexId,
  		0,
*** a/src/bin/pg_dump/pg_dumpall.c
--- b/src/bin/pg_dump/pg_dumpall.c
***************
*** 956,974 **** dumpTablespaces(PGconn *conn)
  	 * Get all tablespaces except built-in ones (which we assume are named
  	 * pg_xxx)
  	 */
! 	if (server_version >= 80200)
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
  						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
  	else
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
! 						   "null "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
--- 956,983 ----
  	 * Get all tablespaces except built-in ones (which we assume are named
  	 * pg_xxx)
  	 */
! 	if (server_version >= 80500)
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
+ 						   "array_to_string(spcoptions, ', '),"
  						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
+ 	else if (server_version >= 80200)
+ 		res = executeQuery(conn, "SELECT spcname, "
+ 						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
+ 						   "spclocation, spcacl, null, "
+ 						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
+ 						   "FROM pg_catalog.pg_tablespace "
+ 						   "WHERE spcname !~ '^pg_' "
+ 						   "ORDER BY 1");
  	else
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
! 						   "null, null "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
***************
*** 983,989 **** dumpTablespaces(PGconn *conn)
  		char	   *spcowner = PQgetvalue(res, i, 1);
  		char	   *spclocation = PQgetvalue(res, i, 2);
  		char	   *spcacl = PQgetvalue(res, i, 3);
! 		char	   *spccomment = PQgetvalue(res, i, 4);
  		char	   *fspcname;
  
  		/* needed for buildACLCommands() */
--- 992,999 ----
  		char	   *spcowner = PQgetvalue(res, i, 1);
  		char	   *spclocation = PQgetvalue(res, i, 2);
  		char	   *spcacl = PQgetvalue(res, i, 3);
! 		char	   *spcoptions = PQgetvalue(res, i, 4);
! 		char	   *spccomment = PQgetvalue(res, i, 5);
  		char	   *fspcname;
  
  		/* needed for buildACLCommands() */
***************
*** 996,1001 **** dumpTablespaces(PGconn *conn)
--- 1006,1015 ----
  		appendStringLiteralConn(buf, spclocation, conn);
  		appendPQExpBuffer(buf, ";\n");
  
+ 		if (spcoptions && spcoptions[0] != '\0')
+ 			appendPQExpBuffer(buf, "ALTER TABLESPACE %s SET (%s);\n",
+ 							  fspcname, spcoptions);
+ 
  		if (!skip_acls &&
  			!buildACLCommands(fspcname, NULL, "TABLESPACE", spcacl, spcowner,
  							  "", server_version, buf))
*** a/src/include/access/reloptions.h
--- b/src/include/access/reloptions.h
***************
*** 1,7 ****
  /*-------------------------------------------------------------------------
   *
   * reloptions.h
!  *	  Core support for relation options (pg_class.reloptions)
   *
   * Note: the functions dealing with text-array reloptions values declare
   * them as Datum, not ArrayType *, to avoid needing to include array.h
--- 1,8 ----
  /*-------------------------------------------------------------------------
   *
   * reloptions.h
!  *	  Core support for relation and tablespace options (pg_class.reloptions
!  *	  and pg_tablespace.spcoptions)
   *
   * Note: the functions dealing with text-array reloptions values declare
   * them as Datum, not ArrayType *, to avoid needing to include array.h
***************
*** 39,46 **** typedef enum relopt_kind
  	RELOPT_KIND_HASH = (1 << 3),
  	RELOPT_KIND_GIN = (1 << 4),
  	RELOPT_KIND_GIST = (1 << 5),
  	/* if you add a new kind, make sure you update "last_default" too */
! 	RELOPT_KIND_LAST_DEFAULT = RELOPT_KIND_GIST,
  	/* some compilers treat enums as signed ints, so we can't use 1 << 31 */
  	RELOPT_KIND_MAX = (1 << 30)
  } relopt_kind;
--- 40,48 ----
  	RELOPT_KIND_HASH = (1 << 3),
  	RELOPT_KIND_GIN = (1 << 4),
  	RELOPT_KIND_GIST = (1 << 5),
+ 	RELOPT_KIND_TABLESPACE = (1 << 6),
  	/* if you add a new kind, make sure you update "last_default" too */
! 	RELOPT_KIND_LAST_DEFAULT = RELOPT_KIND_TABLESPACE,
  	/* some compilers treat enums as signed ints, so we can't use 1 << 31 */
  	RELOPT_KIND_MAX = (1 << 30)
  } relopt_kind;
***************
*** 264,268 **** extern bytea *default_reloptions(Datum reloptions, bool validate,
--- 266,271 ----
  extern bytea *heap_reloptions(char relkind, Datum reloptions, bool validate);
  extern bytea *index_reloptions(RegProcedure amoptions, Datum reloptions,
  				 bool validate);
+ extern bytea *tablespace_reloptions(Datum reloptions, bool validate);
  
  #endif   /* RELOPTIONS_H */
*** a/src/include/catalog/pg_tablespace.h
--- b/src/include/catalog/pg_tablespace.h
***************
*** 34,39 **** CATALOG(pg_tablespace,1213) BKI_SHARED_RELATION
--- 34,40 ----
  	Oid			spcowner;		/* owner of tablespace */
  	text		spclocation;	/* physical location (VAR LENGTH) */
  	aclitem		spcacl[1];		/* access permissions (VAR LENGTH) */
+ 	text		spcoptions[1];	/* per-tablespace options */
  } FormData_pg_tablespace;
  
  /* ----------------
***************
*** 48,61 **** typedef FormData_pg_tablespace *Form_pg_tablespace;
   * ----------------
   */
  
! #define Natts_pg_tablespace				4
  #define Anum_pg_tablespace_spcname		1
  #define Anum_pg_tablespace_spcowner		2
  #define Anum_pg_tablespace_spclocation	3
  #define Anum_pg_tablespace_spcacl		4
  
! DATA(insert OID = 1663 ( pg_default PGUID "" _null_ ));
! DATA(insert OID = 1664 ( pg_global	PGUID "" _null_ ));
  
  #define DEFAULTTABLESPACE_OID 1663
  #define GLOBALTABLESPACE_OID 1664
--- 49,63 ----
   * ----------------
   */
  
! #define Natts_pg_tablespace				5
  #define Anum_pg_tablespace_spcname		1
  #define Anum_pg_tablespace_spcowner		2
  #define Anum_pg_tablespace_spclocation	3
  #define Anum_pg_tablespace_spcacl		4
+ #define Anum_pg_tablespace_spcoptions	5
  
! DATA(insert OID = 1663 ( pg_default PGUID "" _null_ _null_ ));
! DATA(insert OID = 1664 ( pg_global	PGUID "" _null_ _null_ ));
  
  #define DEFAULTTABLESPACE_OID 1663
  #define GLOBALTABLESPACE_OID 1664
*** a/src/include/commands/tablespace.h
--- b/src/include/commands/tablespace.h
***************
*** 32,42 **** typedef struct xl_tblspc_drop_rec
--- 32,48 ----
  	Oid			ts_id;
  } xl_tblspc_drop_rec;
  
+ typedef struct TableSpaceOpts
+ {
+ 	float8		random_page_cost;
+ 	float8		seq_page_cost;
+ } TableSpaceOpts;
  
  extern void CreateTableSpace(CreateTableSpaceStmt *stmt);
  extern void DropTableSpace(DropTableSpaceStmt *stmt);
  extern void RenameTableSpace(const char *oldname, const char *newname);
  extern void AlterTableSpaceOwner(const char *name, Oid newOwnerId);
+ extern void AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt);
  
  extern void TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo);
  
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
***************
*** 346,351 **** typedef enum NodeTag
--- 346,352 ----
  	T_CreateUserMappingStmt,
  	T_AlterUserMappingStmt,
  	T_DropUserMappingStmt,
+ 	T_AlterTableSpaceOptionsStmt,
  
  	/*
  	 * TAGS FOR PARSE TREE NODES (parsenodes.h)
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 1464,1469 **** typedef struct DropTableSpaceStmt
--- 1464,1477 ----
  	bool		missing_ok;		/* skip error if missing? */
  } DropTableSpaceStmt;
  
+ typedef struct AlterTableSpaceOptionsStmt
+ {
+ 	NodeTag		type;
+ 	char	   *tablespacename;
+ 	List	   *options;
+ 	bool		isReset;
+ } AlterTableSpaceOptionsStmt;
+ 
  /* ----------------------
   *		Create/Drop FOREIGN DATA WRAPPER Statements
   * ----------------------
*** a/src/include/nodes/relation.h
--- b/src/include/nodes/relation.h
***************
*** 361,366 **** typedef struct RelOptInfo
--- 361,367 ----
  
  	/* information about a base rel (not set for join rels!) */
  	Index		relid;
+ 	Oid			reltablespace;	/* containing tablespace */
  	RTEKind		rtekind;		/* RELATION, SUBQUERY, or FUNCTION */
  	AttrNumber	min_attr;		/* smallest attrno of rel (often <0) */
  	AttrNumber	max_attr;		/* largest attrno of rel */
***************
*** 425,430 **** typedef struct IndexOptInfo
--- 426,432 ----
  	NodeTag		type;
  
  	Oid			indexoid;		/* OID of the index relation */
+ 	Oid			reltablespace;	/* tablespace of index (not table) */
  	RelOptInfo *rel;			/* back-link to index's table */
  
  	/* statistics from pg_class */
*** a/src/include/utils/lsyscache.h
--- b/src/include/utils/lsyscache.h
***************
*** 137,142 **** extern void free_attstatsslot(Oid atttype,
--- 137,144 ----
  extern char *get_namespace_name(Oid nspid);
  extern Oid	get_roleid(const char *rolname);
  extern Oid	get_roleid_checked(const char *rolname);
+ extern void get_tablespace_page_costs(Oid spcid, float8 *spc_random_page_cost,
+ 					     float8 *spc_seq_page_cost);
  
  #define type_is_array(typid)  (get_element_type(typid) != InvalidOid)
  
*** a/src/include/utils/syscache.h
--- b/src/include/utils/syscache.h
***************
*** 71,76 **** enum SysCacheIdentifier
--- 71,77 ----
  	RELOID,
  	RULERELNAME,
  	STATRELATT,
+ 	TABLESPACEOID,
  	TSCONFIGMAP,
  	TSCONFIGNAMENSP,
  	TSCONFIGOID,

#11

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Robert Haas (#10)

1 attachment(s)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Thu, Nov 26, 2009 at 4:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Current version of patch is attached. I've revised it to use the
reloptions stuff, but I don't think it's committable as-is because it
currently thinks that extracting options from a pg_tablespace tuple is
a cheap operation, which was true in the non-reloptions-based
implementation but is less true now. At least, some benchmarking
needs to be done to figure out whether and to what extent this is an
issue.

Hmm. I'm not able to reliably detect a performance difference between
unpatched CVS HEAD (er... git master branch) and same with
spcoptions-v2.patch applied. I figured that if there were going to be
an impact, it would be most likely to manifest itself in a query that
touches lots and lots of tables but does very little actual work. So
I used the attached script to create 200 empty tables, 100 in the
default tablespace and 100 in tablespace "dork" (also known as, why I
am working on this at 11 PM on Thanksgiving). Then I did:

SELECT * FROM a1, a2, a3, ..., a100;

...and likewise for b1 through b100. I tried this on an unpatched install and
also with the patch applied, with and without options set on
tablespace dork. I tried it a couple of times and the times were
pretty consistent on any given run, but bounced around enough between
runs that I can't say with any confidence that this patch makes any
difference one way or the other.

So it seems as if there is little reason to worry about caching, as
Tom suspected, unless someone sees a flaw in my testing methodology.
It might matter more in the future, if we have a larger number of
tablespace options, but we could always add a cache then if need be.

...Robert

#12

David Rowley

dgrowley@gmail.com

about 16 years ago

In reply to: Robert Haas (#11)

Re: patch - per-tablespace random_page_cost/seq_page_cost

Robert Haas Wrote:

Hmm. I'm not able to reliably detect a performance difference between
unpatched CVS HEAD (er... git master branch) and same with spcoptions-
v2.patch applied. I figured that if there were going to be an impact,
it would be most likely to manifest itself in a query that touches lots
and lots of tables but does very little actual work. So I used the
attached script to create 200 empty tables, 100 in the default
tablespace and 100 in tablespace "dork" (also known as, why I am
working on this at 11 PM on Thanksgiving). Then I did:

SELECT * FROM a1, a2, a3, ..., a100;

(I've not read the patch, but I've just read the thread)
If you're just benchmarking the planner times to see if the extra lookups
are affecting the planning times, would it not be better to benchmark
EXPLAIN SELECT * FROM a1, a2, a3, ..., a100; ?
Otherwise any small changes might be drowned out in the execution time.
Scanning 100 relations even if they are empty could account for quite a bit
of that time, right?

David

#13

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: David Rowley (#12)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Sat, Nov 28, 2009 at 9:54 PM, David Rowley <dgrowley@gmail.com> wrote:

Robert Haas Wrote:

Hmm. I'm not able to reliably detect a performance difference between
unpatched CVS HEAD (er... git master branch) and same with spcoptions-
v2.patch applied. I figured that if there were going to be an impact,
it would be most likely to manifest itself in a query that touches lots
and lots of tables but does very little actual work. So I used the
attached script to create 200 empty tables, 100 in the default
tablespace and 100 in tablespace "dork" (also known as, why I am
working on this at 11 PM on Thanksgiving). Then I did:

SELECT * FROM a1, a2, a3, ..., a100;

(I've not read the patch, but I've just read the thread)
If you're just benchmarking the planner times to see if the extra lookups
are affecting the planning times, would it not be better to benchmark
EXPLAIN SELECT * FROM a1, a2, a3, ..., a100; ?
Otherwise any small changes might be drowned out in the execution time.
Scanning 100 relations even if they are empty could account for quite a bit
of that time, right?

Possibly, but even if I can measure a difference doing it that way,
it's not clear that it matters. It's fairly certain that there will
be a performance degradation if we measure carefully enough, but if
that difference is imperceptible in real-world scenarios, then it's
not worth worrying about. Still, I probably will test it just to see.

...Robert

#14

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Robert Haas (#13)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Thu, Dec 3, 2009 at 11:00 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Nov 28, 2009 at 9:54 PM, David Rowley <dgrowley@gmail.com> wrote:

Robert Haas Wrote:

Hmm. I'm not able to reliably detect a performance difference between
unpatched CVS HEAD (er... git master branch) and same with spcoptions-
v2.patch applied. I figured that if there were going to be an impact,
it would be most likely to manifest itself in a query that touches lots
and lots of tables but does very little actual work. So I used the
attached script to create 200 empty tables, 100 in the default
tablespace and 100 in tablespace "dork" (also known as, why I am
working on this at 11 PM on Thanksgiving). Then I did:

SELECT * FROM a1, a2, a3, ..., a100;

(I've not read the patch, but I've just read the thread)
If you're just benchmarking the planner times to see if the extra lookups
are affecting the planning times, would it not be better to benchmark
EXPLAIN SELECT * FROM a1, a2, a3, ..., a100; ?
Otherwise any small changes might be drowned out in the execution time.
Scanning 100 relations even if they are empty could account for quite a bit
of that time, right?

Possibly, but even if I can measure a difference doing it that way,
it's not clear that it matters. It's fairly certain that there will
be a performance degradation if we measure carefully enough, but if
that difference is imperceptible in real-world scenarios, then it's
not worth worrying about. Still, I probably will test it just to see.

I did some fairly careful benchmarking of EXPLAIN SELECT * FROM a1,
a2, a3, ..., a100. I explained this query 100 times via DBD::Pg and
used time to measure how long the script took to run. I ran the
script three times. And the result is... the unpatched version came
out 1.7% SLOWER than the patched version. This seems difficult to
take seriously, since it can't possibly be faster to do a syscache
lookup and parse an array than it is to fetch a constant from a known
memory address, but that's what I got. At any rate, it seems pretty
clear that it's not hurting much.
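
For anyone who wants to repeat the measurement, the 100-way join does not have to be typed by hand. Below is a minimal sketch in C of building the query text, assuming the tables are named a1 through aN as in the thread. The helper name build_cross_join and its parameters are made up for illustration; they are not part of the patch or of the attached script.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch or the attached script): writes
 * "[EXPLAIN ]SELECT * FROM <prefix>1, ..., <prefix>N;" into buf.
 * Returns the length of the query, or -1 if buf is too small.
 */
static int
build_cross_join(const char *prefix, int ntables, bool explain,
                 char *buf, size_t buflen)
{
    size_t used;
    int    n;
    int    i;

    assert(prefix != NULL && buf != NULL && ntables > 0);

    n = snprintf(buf, buflen, "%sSELECT * FROM ", explain ? "EXPLAIN " : "");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    for (i = 1; i <= ntables; i++)
    {
        /* every table after the first is preceded by ", " */
        n = snprintf(buf + used, buflen - used, "%s%s%d",
                     i > 1 ? ", " : "", prefix, i);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    /* need room for ';' plus the terminating NUL */
    if (buflen - used < 2)
        return -1;
    buf[used++] = ';';
    buf[used] = '\0';

    return (int) strlen(buf);
}
```

Calling build_cross_join("a", 100, true, buf, sizeof(buf)) with a buffer of a kilobyte or so yields the EXPLAIN form, which exercises the planner without executing the join; passing false gives the plain SELECT used in the earlier test.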

...Robert

#15

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Robert Haas (#10)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Thu, Nov 26, 2009 at 4:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Nov 14, 2009 at 4:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't think there's even a
solid consensus right now on which GUCs people would want to set at the
tablespace level.

This seems like an important point that we need to nail down. The
original motivation for this patch was based on seq_page_cost and
random_page_cost, to cover the case where, for example, one tablespace
is on an SSD and another tablespace is on a RAID array.

Greg Stark proposed adding effective_io_concurrency, and that makes
plenty of sense to me, but I'm sort of disinclined to attempt to
implement that as part of this patch because I have no familiarity
with that part of the code and no hardware that I can use to test
either the current behavior or the modified behavior. Since I'm
recoding this to use the reloptions mechanism, a patch to add support
for that should be pretty easy to write as a follow-on patch once this
goes in.

Any other suggestions?

Going once... going twice... since no one has suggested anything or
spoken against the proposal above, I'm just going to implement
seq_page_cost and random_page_cost for now.

Current version of patch is attached. I've revised it to use the
reloptions stuff, but I don't think it's committable as-is because it
currently thinks that extracting options from a pg_tablespace tuple is
a cheap operation, which was true in the non-reloptions-based
implementation but is less true now. At least, some benchmarking
needs to be done to figure out whether and to what extent this is an
issue.

Per the email that I just sent a few minutes ago, there doesn't appear
to be a performance impact in doing this even in a relatively stupid
way - every call that requires seq_page_cost and/or random_page_cost
results in a syscache lookup and then uses the relcache machinery to
parse the returned array.
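
To make the lookup semantics concrete, here is a rough sketch in C of how the resolution might behave. It assumes that an unset option is stored as -1 (the default given to both reloptions in the patch) and that an unset option falls back to the global setting. resolve_page_costs is a hypothetical stand-in rather than the patch's get_tablespace_page_costs, and the syscache lookup and array parsing are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the global GUCs, using the documented defaults. */
static double seq_page_cost = 1.0;
static double random_page_cost = 4.0;

/* Same shape as TableSpaceOpts in the patch; -1 is assumed to mean "not set". */
typedef struct TableSpaceOpts
{
    double random_page_cost;
    double seq_page_cost;
} TableSpaceOpts;

/*
 * Hypothetical helper: opts is NULL when the tablespace has no options at
 * all. Either output pointer may be NULL if the caller does not need that
 * value, as in the cost_seqscan() call in the patch.
 */
static void
resolve_page_costs(const TableSpaceOpts *opts,
                   double *spc_random_page_cost,
                   double *spc_seq_page_cost)
{
    assert(spc_random_page_cost != NULL || spc_seq_page_cost != NULL);

    if (spc_random_page_cost != NULL)
        *spc_random_page_cost =
            (opts != NULL && opts->random_page_cost >= 0)
            ? opts->random_page_cost : random_page_cost;

    if (spc_seq_page_cost != NULL)
        *spc_seq_page_cost =
            (opts != NULL && opts->seq_page_cost >= 0)
            ? opts->seq_page_cost : seq_page_cost;
}
```

With only random_page_cost set on a tablespace, a caller would see the override for random fetches and the global value for sequential ones.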

That leaves the question of what the most elegant design is here. Tom
suggested upthread that we should tag every RelOptInfo - and,
presumably, IndexOptInfo, though it wasn't discussed - with this
information. I don't however much like the idea of adding identically
named members in both places. Should the number of options expand in
the future, this will become silly very quickly. One option is to
define a struct with seq_page_cost and random_page_cost that is then
included in RelOptInfo and IndexOptInfo. It would seem to make sense
to make the struct, rather than a pointer to the struct, the member,
because it makes the copyfuncs/equalfuncs stuff easier to handle, and
there's not really any benefit in incurring more palloc overhead.
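
A small sketch in C of the by-value option, to show why it keeps copying and comparison simple. The struct and function names here (PageCosts, RelOptInfoSketch, IndexOptInfoSketch, copy_rel, equal_costs) are invented for illustration and are heavily trimmed; they are not the real planner structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Hypothetical shared struct holding the per-tablespace costs. */
typedef struct PageCosts
{
    double seq_page_cost;
    double random_page_cost;
} PageCosts;

/* Trimmed stand-ins for RelOptInfo and IndexOptInfo. */
typedef struct RelOptInfoSketch
{
    Oid       reltablespace;
    PageCosts costs;            /* embedded by value, no separate palloc */
} RelOptInfoSketch;

typedef struct IndexOptInfoSketch
{
    Oid       indexoid;
    Oid       reltablespace;
    PageCosts costs;            /* same member, same name, one definition */
} IndexOptInfoSketch;

/* A flat struct assignment copies the embedded costs as well. */
static RelOptInfoSketch
copy_rel(const RelOptInfoSketch *from)
{
    assert(from != NULL);
    return *from;
}

/* Field-by-field comparison; no pointer chasing or NULL checks needed. */
static bool
equal_costs(const PageCosts *a, const PageCosts *b)
{
    return a->seq_page_cost == b->seq_page_cost &&
           a->random_page_cost == b->random_page_cost;
}
```

Had the member been a pointer, the copy would either share storage with the original or need its own allocation, and the comparison would have to deal with NULL.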

However, I'm sort of inclined to go ahead and invent a mini-cache for
tablespaces. It avoids the (apparently insignificant) overhead of
reparsing the array multiple times, but it also avoids bloating
RelOptInfo and IndexOptInfo with more members than really necessary.
It seems like a good idea to add one member to those structures
anyway, for reltablespace, but copying all the values into every one
we create just seems silly. Admittedly there are only two values
right now, but again we may want to add more someday, and caching at
the tablespace level just seems like the right way to do it.
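
Roughly, the idea can be sketched in C as below. This is only an illustration under simplifying assumptions: a small direct-mapped array instead of a real hash table, a fake loader in place of the catalog lookup and option parsing, and a blanket invalidation routine. All of the names (SpcCacheEntry, get_tablespace_entry, load_tablespace_opts, invalidate_tablespace_cache) are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

typedef struct SpcCacheEntry
{
    Oid    spcid;
    bool   valid;
    double seq_page_cost;       /* -1 assumed to mean "not set" */
    double random_page_cost;
} SpcCacheEntry;

#define SPC_CACHE_SIZE 16

static SpcCacheEntry spc_cache[SPC_CACHE_SIZE];   /* zeroed, so all invalid */
static int spc_cache_loads = 0;                   /* counts slow-path loads */

/* Hypothetical stand-in for the catalog lookup plus option parsing. */
static void
load_tablespace_opts(Oid spcid, SpcCacheEntry *entry)
{
    assert(entry != NULL);
    spc_cache_loads++;
    entry->spcid = spcid;
    entry->valid = true;
    entry->seq_page_cost = -1;      /* pretend no options are set */
    entry->random_page_cost = -1;
}

/* Return the cached entry, loading it only on a miss. */
static const SpcCacheEntry *
get_tablespace_entry(Oid spcid)
{
    SpcCacheEntry *slot = &spc_cache[spcid % SPC_CACHE_SIZE];

    if (!slot->valid || slot->spcid != spcid)
        load_tablespace_opts(spcid, slot);
    return slot;
}

/* Would need to run whenever a tablespace's options change. */
static void
invalidate_tablespace_cache(void)
{
    size_t i;

    for (i = 0; i < SPC_CACHE_SIZE; i++)
        spc_cache[i].valid = false;
}
```

The planner structures then only need to carry reltablespace, and repeated cost calls for the same tablespace hit the cached entry. The price is that the cache must be invalidated when options change, which the sketch handles in the crudest possible way.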

Thoughts?

...Robert

#16

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Robert Haas (#15)

1 attachment(s)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Thu, Dec 17, 2009 at 9:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Going once... going twice... since no one has suggested anything or
spoken against the proposal above, I'm just going to implement
seq_page_cost and random_page_cost for now.

[...]

Per the email that I just sent a few minutes ago, there doesn't appear
to be a performance impact in doing this even in a relatively stupid
way - every call that requires seq_page_cost and/or random_page_cost
results in a syscache lookup and then uses the relcache machinery to
parse the returned array.

That leaves the question of what the most elegant design is here. Tom
suggested upthread that we should tag every RelOptInfo - and,
presumably, IndexOptInfo, though it wasn't discussed - with this
information. I don't however much like the idea of adding identically
named members in both places. Should the number of options expand in
the future, this will become silly very quickly. One option is to
define a struct with seq_page_cost and random_page_cost that is then
included in RelOptInfo and IndexOptInfo. It would seem to make sense
to make the struct, rather than a pointer to the struct, the member,
because it makes the copyfuncs/equalfuncs stuff easier to handle, and
there's not really any benefit in incurring more palloc overhead.

However, I'm sort of inclined to go ahead and invent a mini-cache for
tablespaces. It avoids the (apparently insignificant) overhead of
reparsing the array multiple times, but it also avoids bloating
RelOptInfo and IndexOptInfo with more members than really necessary.
It seems like a good idea to add one member to those structures
anyway, for reltablespace, but copying all the values into every one
we create just seems silly. Admittedly there are only two values
right now, but again we may want to add more someday, and caching at
the tablespace level just seems like the right way to do it.

Thoughts?

Hearing no thoughts, I have implemented as per the above. PFA the
latest version. Any reviews, comments, feedback, etc. much
appreciated.

Thanks,

...Robert

Attachments:

spcoptions-v3.patch (text/x-patch; charset=US-ASCII)

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 2000,2005 **** archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows
--- 2000,2008 ----
         <para>
          Sets the planner's estimate of the cost of a disk page fetch
          that is part of a series of sequential fetches.  The default is 1.0.
+         This value can be overridden for a particular tablespace by setting
+         the tablespace parameter of the same name
+         (see <xref linkend="sql-altertablespace">).
         </para>
        </listitem>
       </varlistentry>
***************
*** 2013,2018 **** archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows
--- 2016,2027 ----
         <para>
          Sets the planner's estimate of the cost of a
          non-sequentially-fetched disk page.  The default is 4.0.
+         This value can be overridden for a particular tablespace by setting
+         the tablespace parameter of the same name
+         (see <xref linkend="sql-altertablespace">).
+        </para>
+ 
+ 	   <para>
          Reducing this value relative to <varname>seq_page_cost</>
          will cause the system to prefer index scans; raising it will
          make index scans look relatively more expensive.  You can raise
*** a/doc/src/sgml/ref/alter_tablespace.sgml
--- b/doc/src/sgml/ref/alter_tablespace.sgml
***************
*** 23,28 **** PostgreSQL documentation
--- 23,30 ----
  <synopsis>
  ALTER TABLESPACE <replaceable>name</replaceable> RENAME TO <replaceable>new_name</replaceable>
  ALTER TABLESPACE <replaceable>name</replaceable> OWNER TO <replaceable>new_owner</replaceable>
+ ALTER TABLESPACE <replaceable>name</replaceable> SET ( <replaceable class="PARAMETER">tablespace_option</replaceable> = <replaceable class="PARAMETER">value</replaceable> [, ... ] )
+ ALTER TABLESPACE <replaceable>name</replaceable> RESET ( <replaceable class="PARAMETER">tablespace_option</replaceable> [, ... ] )
  </synopsis>
   </refsynopsisdiv>
    
***************
*** 74,79 **** ALTER TABLESPACE <replaceable>name</replaceable> OWNER TO <replaceable>new_owner
--- 76,99 ----
       </para>
      </listitem>
     </varlistentry>
+ 
+    <varlistentry>
+     <term><replaceable class="parameter">tablespace_option</replaceable></term>
+     <listitem>
+      <para>
+       A tablespace parameter to be set or reset.  Currently, the only
+       available parameters are <varname>seq_page_cost</> and
+       <varname>random_page_cost</>.  Setting either value for a particular
+       tablespace will override the planner's usual estimate of the cost of
+       reading pages from tables in that tablespace, as established by
+       the configuration parameters of the same name (see
+       <xref linkend="guc-seq-page-cost">,
+       <xref linkend="guc-random-page-cost">).  This may be useful if one
+       tablespace is located on a disk which is faster or slower than the
+       remainder of the I/O subsystem.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
   </refsect1>
  
*** a/src/backend/access/common/reloptions.c
--- b/src/backend/access/common/reloptions.c
***************
*** 21,26 ****
--- 21,27 ----
  #include "access/reloptions.h"
  #include "catalog/pg_type.h"
  #include "commands/defrem.h"
+ #include "commands/tablespace.h"
  #include "nodes/makefuncs.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
***************
*** 179,184 **** static relopt_real realRelOpts[] =
--- 180,201 ----
  		},
  		-1, 0.0, 100.0
  	},
+ 	{
+ 		{
+ 			"seq_page_cost",
+ 			"Sets the planner's estimate of the cost of a sequentially fetched disk page.",
+ 			RELOPT_KIND_TABLESPACE
+ 		},
+ 		-1, 0.0, DBL_MAX
+ 	},
+ 	{
+ 		{
+ 			"random_page_cost",
+ 			"Sets the planner's estimate of the cost of a nonsequentially fetched disk page.",
+ 			RELOPT_KIND_TABLESPACE
+ 		},
+ 		-1, 0.0, DBL_MAX
+ 	},
  	/* list terminator */
  	{{NULL}}
  };
***************
*** 1168,1170 **** index_reloptions(RegProcedure amoptions, Datum reloptions, bool validate)
--- 1185,1218 ----
  
  	return DatumGetByteaP(result);
  }
+ 
+ /*
+  * Option parser for tablespace reloptions
+  */
+ bytea *
+ tablespace_reloptions(Datum reloptions, bool validate)
+ {
+ 	relopt_value *options;
+ 	TableSpaceOpts	*tsopts;
+ 	int			numoptions;
+ 	static const relopt_parse_elt tab[] = {
+ 		{"random_page_cost", RELOPT_TYPE_REAL, offsetof(TableSpaceOpts, random_page_cost)},
+ 		{"seq_page_cost", RELOPT_TYPE_REAL, offsetof(TableSpaceOpts, seq_page_cost)}
+ 	};
+ 
+ 	options = parseRelOptions(reloptions, validate, RELOPT_KIND_TABLESPACE,
+ 							  &numoptions);
+ 
+ 	/* if none set, we're done */
+ 	if (numoptions == 0)
+ 		return NULL;
+ 
+ 	tsopts = allocateReloptStruct(sizeof(TableSpaceOpts), options, numoptions);
+ 
+ 	fillRelOptions((void *) tsopts, sizeof(TableSpaceOpts), options, numoptions,
+ 				   validate, tab, lengthof(tab));
+ 
+ 	pfree(options);
+ 
+ 	return (bytea *) tsopts;
+ }
*** a/src/backend/catalog/aclchk.c
--- b/src/backend/catalog/aclchk.c
***************
*** 2783,2800 **** ExecGrant_Tablespace(InternalGrant *istmt)
  		int			nnewmembers;
  		Oid		   *oldmembers;
  		Oid		   *newmembers;
- 		ScanKeyData entry[1];
- 		SysScanDesc scan;
  		HeapTuple	tuple;
  
! 		/* There's no syscache for pg_tablespace, so must look the hard way */
! 		ScanKeyInit(&entry[0],
! 					ObjectIdAttributeNumber,
! 					BTEqualStrategyNumber, F_OIDEQ,
! 					ObjectIdGetDatum(tblId));
! 		scan = systable_beginscan(relation, TablespaceOidIndexId, true,
! 								  SnapshotNow, 1, entry);
! 		tuple = systable_getnext(scan);
  		if (!HeapTupleIsValid(tuple))
  			elog(ERROR, "cache lookup failed for tablespace %u", tblId);
  
--- 2783,2793 ----
  		int			nnewmembers;
  		Oid		   *oldmembers;
  		Oid		   *newmembers;
  		HeapTuple	tuple;
  
! 		/* Search syscache for pg_tablespace */
! 		tuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(tblId),
! 							   0, 0, 0);
  		if (!HeapTupleIsValid(tuple))
  			elog(ERROR, "cache lookup failed for tablespace %u", tblId);
  
***************
*** 2865,2872 **** ExecGrant_Tablespace(InternalGrant *istmt)
  							  noldmembers, oldmembers,
  							  nnewmembers, newmembers);
  
! 		systable_endscan(scan);
! 
  		pfree(new_acl);
  
  		/* prevent error when processing duplicate objects */
--- 2858,2864 ----
  							  noldmembers, oldmembers,
  							  nnewmembers, newmembers);
  
! 		ReleaseSysCache(tuple);
  		pfree(new_acl);
  
  		/* prevent error when processing duplicate objects */
***************
*** 3696,3704 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  					  AclMode mask, AclMaskHow how)
  {
  	AclMode		result;
- 	Relation	pg_tablespace;
- 	ScanKeyData entry[1];
- 	SysScanDesc scan;
  	HeapTuple	tuple;
  	Datum		aclDatum;
  	bool		isNull;
--- 3688,3693 ----
***************
*** 3711,3727 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  
  	/*
  	 * Get the tablespace's ACL from pg_tablespace
- 	 *
- 	 * There's no syscache for pg_tablespace, so must look the hard way
  	 */
! 	pg_tablespace = heap_open(TableSpaceRelationId, AccessShareLock);
! 	ScanKeyInit(&entry[0],
! 				ObjectIdAttributeNumber,
! 				BTEqualStrategyNumber, F_OIDEQ,
! 				ObjectIdGetDatum(spc_oid));
! 	scan = systable_beginscan(pg_tablespace, TablespaceOidIndexId, true,
! 							  SnapshotNow, 1, entry);
! 	tuple = systable_getnext(scan);
  	if (!HeapTupleIsValid(tuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
--- 3700,3708 ----
  
  	/*
  	 * Get the tablespace's ACL from pg_tablespace
  	 */
! 	tuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(spc_oid),
! 						   0, 0, 0);
  	if (!HeapTupleIsValid(tuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
***************
*** 3729,3736 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  
  	ownerId = ((Form_pg_tablespace) GETSTRUCT(tuple))->spcowner;
  
! 	aclDatum = heap_getattr(tuple, Anum_pg_tablespace_spcacl,
! 							RelationGetDescr(pg_tablespace), &isNull);
  
  	if (isNull)
  	{
--- 3710,3718 ----
  
  	ownerId = ((Form_pg_tablespace) GETSTRUCT(tuple))->spcowner;
  
! 	aclDatum = SysCacheGetAttr(TABLESPACEOID, tuple,
! 								   Anum_pg_tablespace_spcacl,
! 								   &isNull);
  
  	if (isNull)
  	{
***************
*** 3750,3757 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  	if (acl && (Pointer) acl != DatumGetPointer(aclDatum))
  		pfree(acl);
  
! 	systable_endscan(scan);
! 	heap_close(pg_tablespace, AccessShareLock);
  
  	return result;
  }
--- 3732,3738 ----
  	if (acl && (Pointer) acl != DatumGetPointer(aclDatum))
  		pfree(acl);
  
! 	ReleaseSysCache(tuple);
  
  	return result;
  }
***************
*** 4338,4346 **** pg_namespace_ownercheck(Oid nsp_oid, Oid roleid)
  bool
  pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
  {
- 	Relation	pg_tablespace;
- 	ScanKeyData entry[1];
- 	SysScanDesc scan;
  	HeapTuple	spctuple;
  	Oid			spcowner;
  
--- 4319,4324 ----
***************
*** 4348,4364 **** pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
  	if (superuser_arg(roleid))
  		return true;
  
! 	/* There's no syscache for pg_tablespace, so must look the hard way */
! 	pg_tablespace = heap_open(TableSpaceRelationId, AccessShareLock);
! 	ScanKeyInit(&entry[0],
! 				ObjectIdAttributeNumber,
! 				BTEqualStrategyNumber, F_OIDEQ,
! 				ObjectIdGetDatum(spc_oid));
! 	scan = systable_beginscan(pg_tablespace, TablespaceOidIndexId, true,
! 							  SnapshotNow, 1, entry);
! 
! 	spctuple = systable_getnext(scan);
! 
  	if (!HeapTupleIsValid(spctuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
--- 4326,4334 ----
  	if (superuser_arg(roleid))
  		return true;
  
! 	/* Search syscache for pg_tablespace */
! 	spctuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(spc_oid),
! 							  0, 0, 0);
  	if (!HeapTupleIsValid(spctuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
***************
*** 4366,4373 **** pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
  
  	spcowner = ((Form_pg_tablespace) GETSTRUCT(spctuple))->spcowner;
  
! 	systable_endscan(scan);
! 	heap_close(pg_tablespace, AccessShareLock);
  
  	return has_privs_of_role(roleid, spcowner);
  }
--- 4336,4342 ----
  
  	spcowner = ((Form_pg_tablespace) GETSTRUCT(spctuple))->spcowner;
  
! 	ReleaseSysCache(spctuple);
  
  	return has_privs_of_role(roleid, spcowner);
  }
*** a/src/backend/commands/tablespace.c
--- b/src/backend/commands/tablespace.c
***************
*** 49,54 ****
--- 49,55 ----
  #include <sys/stat.h>
  
  #include "access/heapam.h"
+ #include "access/reloptions.h"
  #include "access/sysattr.h"
  #include "access/transam.h"
  #include "access/xact.h"
***************
*** 57,62 ****
--- 58,64 ----
  #include "catalog/indexing.h"
  #include "catalog/pg_tablespace.h"
  #include "commands/comment.h"
+ #include "commands/defrem.h"
  #include "commands/tablespace.h"
  #include "miscadmin.h"
  #include "postmaster/bgwriter.h"
***************
*** 70,75 ****
--- 72,78 ----
  #include "utils/lsyscache.h"
  #include "utils/memutils.h"
  #include "utils/rel.h"
+ #include "utils/syscache.h"
  #include "utils/tqual.h"
  
  
***************
*** 290,295 **** CreateTableSpace(CreateTableSpaceStmt *stmt)
--- 293,299 ----
  	values[Anum_pg_tablespace_spclocation - 1] =
  		CStringGetTextDatum(location);
  	nulls[Anum_pg_tablespace_spcacl - 1] = true;
+ 	nulls[Anum_pg_tablespace_spcoptions - 1] = true;
  
  	tuple = heap_form_tuple(rel->rd_att, values, nulls);
  
***************
*** 913,918 **** AlterTableSpaceOwner(const char *name, Oid newOwnerId)
--- 917,989 ----
  
  
  /*
+  * Alter table space options
+  */
+ void
+ AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt)
+ {
+ 	Relation	rel;
+ 	ScanKeyData entry[1];
+ 	HeapScanDesc scandesc;
+ 	HeapTuple	tup;
+ 	Datum		datum;
+ 	Datum		newOptions;
+ 	Datum		repl_val[Natts_pg_tablespace];
+ 	bool		isnull;
+ 	bool		repl_null[Natts_pg_tablespace];
+ 	bool		repl_repl[Natts_pg_tablespace];
+ 	HeapTuple	newtuple;
+ 
+ 	/* Search pg_tablespace */
+ 	rel = heap_open(TableSpaceRelationId, RowExclusiveLock);
+ 
+ 	ScanKeyInit(&entry[0],
+ 				Anum_pg_tablespace_spcname,
+ 				BTEqualStrategyNumber, F_NAMEEQ,
+ 				CStringGetDatum(stmt->tablespacename));
+ 	scandesc = heap_beginscan(rel, SnapshotNow, 1, entry);
+ 	tup = heap_getnext(scandesc, ForwardScanDirection);
+ 	if (!HeapTupleIsValid(tup))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_UNDEFINED_OBJECT),
+ 				 errmsg("tablespace \"%s\" does not exist",
+ 					stmt->tablespacename)));
+ 
+ 	/* Must be owner of the existing object */
+ 	if (!pg_tablespace_ownercheck(HeapTupleGetOid(tup), GetUserId()))
+ 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_TABLESPACE,
+ 					   stmt->tablespacename);
+ 
+ 	/* Generate new proposed spcoptions (text array) */
+ 	datum = heap_getattr(tup, Anum_pg_tablespace_spcoptions,
+ 						 RelationGetDescr(rel), &isnull);
+ 	newOptions = transformRelOptions(isnull ? (Datum) 0 : datum,
+ 									 stmt->options, NULL, NULL, false,
+ 									 stmt->isReset);
+ 	(void) tablespace_reloptions(newOptions, true);
+ 
+ 	/* Build new tuple. */
+ 	memset(repl_null, false, sizeof(repl_null));
+ 	memset(repl_repl, false, sizeof(repl_repl));
+ 	if (newOptions != (Datum) 0)
+ 		repl_val[Anum_pg_tablespace_spcoptions - 1] = newOptions;
+ 	else
+ 		repl_null[Anum_pg_tablespace_spcoptions - 1] = true;
+ 	repl_repl[Anum_pg_tablespace_spcoptions - 1] = true;
+ 	newtuple = heap_modify_tuple(tup, RelationGetDescr(rel), repl_val,
+ 								 repl_null, repl_repl);
+ 
+ 	/* Update system catalog. */
+ 	simple_heap_update(rel, &newtuple->t_self, newtuple);
+ 	CatalogUpdateIndexes(rel, newtuple);
+ 	heap_freetuple(newtuple);
+ 
+ 	/* Conclude heap scan. */
+ 	heap_endscan(scandesc);
+ 	heap_close(rel, NoLock);
+ }
+ 
+ /*
   * Routines for handling the GUC variable 'default_tablespace'.
   */
  
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 3062,3067 **** _copyDropTableSpaceStmt(DropTableSpaceStmt *from)
--- 3062,3079 ----
  	return newnode;
  }
  
+ static AlterTableSpaceOptionsStmt *
+ _copyAlterTableSpaceOptionsStmt(AlterTableSpaceOptionsStmt *from)
+ {
+ 	AlterTableSpaceOptionsStmt *newnode = makeNode(AlterTableSpaceOptionsStmt);
+ 
+ 	COPY_STRING_FIELD(tablespacename);
+ 	COPY_NODE_FIELD(options);
+ 	COPY_SCALAR_FIELD(isReset);
+ 
+ 	return newnode;
+ }
+ 
  static CreateFdwStmt *
  _copyCreateFdwStmt(CreateFdwStmt *from)
  {
***************
*** 4026,4031 **** copyObject(void *from)
--- 4038,4046 ----
  		case T_DropTableSpaceStmt:
  			retval = _copyDropTableSpaceStmt(from);
  			break;
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			retval = _copyAlterTableSpaceOptionsStmt(from);
+ 			break;
  		case T_CreateFdwStmt:
  			retval = _copyCreateFdwStmt(from);
  			break;
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 1568,1573 **** _equalDropTableSpaceStmt(DropTableSpaceStmt *a, DropTableSpaceStmt *b)
--- 1568,1584 ----
  }
  
  static bool
+ _equalAlterTableSpaceOptionsStmt(AlterTableSpaceOptionsStmt *a,
+ 											 AlterTableSpaceOptionsStmt *b)
+ {
+ 	COMPARE_STRING_FIELD(tablespacename);
+ 	COMPARE_NODE_FIELD(options);
+ 	COMPARE_SCALAR_FIELD(isReset);
+ 
+ 	return true;
+ }
+ 
+ static bool
  _equalCreateFdwStmt(CreateFdwStmt *a, CreateFdwStmt *b)
  {
  	COMPARE_STRING_FIELD(fdwname);
***************
*** 2719,2724 **** equal(void *a, void *b)
--- 2730,2738 ----
  		case T_DropTableSpaceStmt:
  			retval = _equalDropTableSpaceStmt(a, b);
  			break;
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			retval = _equalAlterTableSpaceOptionsStmt(a, b);
+ 			break;
  		case T_CreateFdwStmt:
  			retval = _equalCreateFdwStmt(a, b);
  			break;
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 1588,1593 **** _outRelOptInfo(StringInfo str, RelOptInfo *node)
--- 1588,1594 ----
  	WRITE_NODE_FIELD(cheapest_total_path);
  	WRITE_NODE_FIELD(cheapest_unique_path);
  	WRITE_UINT_FIELD(relid);
+ 	WRITE_OID_FIELD(reltablespace);
  	WRITE_ENUM_FIELD(rtekind, RTEKind);
  	WRITE_INT_FIELD(min_attr);
  	WRITE_INT_FIELD(max_attr);
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
***************
*** 27,32 ****
--- 27,37 ----
   * detail.	Note that all of these parameters are user-settable, in case
   * the default values are drastically off for a particular platform.
   *
+  * seq_page_cost and random_page_cost can also be overridden for an individual
+  * tablespace, in case some data is on a fast disk and other data is on a slow
+  * disk.  Per-tablespace overrides never apply to temporary work files such as
+  * an external sort or a materialize node that overflows work_mem.
+  *
   * We compute two separate costs for each path:
   *		total_cost: total estimated cost to fetch all tuples
   *		startup_cost: cost that is expended before first tuple is fetched
***************
*** 76,81 ****
--- 81,87 ----
  #include "parser/parsetree.h"
  #include "utils/lsyscache.h"
  #include "utils/selfuncs.h"
+ #include "utils/spccache.h"
  #include "utils/tuplesort.h"
  
  
***************
*** 164,169 **** void
--- 170,176 ----
  cost_seqscan(Path *path, PlannerInfo *root,
  			 RelOptInfo *baserel)
  {
+ 	double		spc_seq_page_cost;
  	Cost		startup_cost = 0;
  	Cost		run_cost = 0;
  	Cost		cpu_per_tuple;
***************
*** 175,184 **** cost_seqscan(Path *path, PlannerInfo *root,
  	if (!enable_seqscan)
  		startup_cost += disable_cost;
  
  	/*
  	 * disk costs
  	 */
! 	run_cost += seq_page_cost * baserel->pages;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup;
--- 182,196 ----
  	if (!enable_seqscan)
  		startup_cost += disable_cost;
  
+ 	/* fetch estimated page cost for tablespace containing table */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  NULL,
+ 							  &spc_seq_page_cost);
+ 
  	/*
  	 * disk costs
  	 */
! 	run_cost += spc_seq_page_cost * baserel->pages;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup;
***************
*** 226,231 **** cost_index(IndexPath *path, PlannerInfo *root,
--- 238,245 ----
  	Selectivity indexSelectivity;
  	double		indexCorrelation,
  				csquared;
+ 	double		spc_seq_page_cost,
+ 				spc_random_page_cost;
  	Cost		min_IO_cost,
  				max_IO_cost;
  	Cost		cpu_per_tuple;
***************
*** 272,284 **** cost_index(IndexPath *path, PlannerInfo *root,
  	/* estimate number of main-table tuples fetched */
  	tuples_fetched = clamp_row_est(indexSelectivity * baserel->tuples);
  
  	/*----------
  	 * Estimate number of main-table pages fetched, and compute I/O cost.
  	 *
  	 * When the index ordering is uncorrelated with the table ordering,
  	 * we use an approximation proposed by Mackert and Lohman (see
  	 * index_pages_fetched() for details) to compute the number of pages
! 	 * fetched, and then charge random_page_cost per page fetched.
  	 *
  	 * When the index ordering is exactly correlated with the table ordering
  	 * (just after a CLUSTER, for example), the number of pages fetched should
--- 286,303 ----
  	/* estimate number of main-table tuples fetched */
  	tuples_fetched = clamp_row_est(indexSelectivity * baserel->tuples);
  
+ 	/* fetch estimated page costs for tablespace containing table */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  &spc_seq_page_cost);
+ 
  	/*----------
  	 * Estimate number of main-table pages fetched, and compute I/O cost.
  	 *
  	 * When the index ordering is uncorrelated with the table ordering,
  	 * we use an approximation proposed by Mackert and Lohman (see
  	 * index_pages_fetched() for details) to compute the number of pages
! 	 * fetched, and then charge spc_random_page_cost per page fetched.
  	 *
  	 * When the index ordering is exactly correlated with the table ordering
  	 * (just after a CLUSTER, for example), the number of pages fetched should
***************
*** 286,292 **** cost_index(IndexPath *path, PlannerInfo *root,
  	 * will be sequential fetches, not the random fetches that occur in the
  	 * uncorrelated case.  So if the number of pages is more than 1, we
  	 * ought to charge
! 	 *		random_page_cost + (pages_fetched - 1) * seq_page_cost
  	 * For partially-correlated indexes, we ought to charge somewhere between
  	 * these two estimates.  We currently interpolate linearly between the
  	 * estimates based on the correlation squared (XXX is that appropriate?).
--- 305,311 ----
  	 * will be sequential fetches, not the random fetches that occur in the
  	 * uncorrelated case.  So if the number of pages is more than 1, we
  	 * ought to charge
! 	 *		spc_random_page_cost + (pages_fetched - 1) * spc_seq_page_cost
  	 * For partially-correlated indexes, we ought to charge somewhere between
  	 * these two estimates.  We currently interpolate linearly between the
  	 * estimates based on the correlation squared (XXX is that appropriate?).
***************
*** 309,315 **** cost_index(IndexPath *path, PlannerInfo *root,
  											(double) index->pages,
  											root);
  
! 		max_IO_cost = (pages_fetched * random_page_cost) / num_scans;
  
  		/*
  		 * In the perfectly correlated case, the number of pages touched by
--- 328,334 ----
  											(double) index->pages,
  											root);
  
! 		max_IO_cost = (pages_fetched * spc_random_page_cost) / num_scans;
  
  		/*
  		 * In the perfectly correlated case, the number of pages touched by
***************
*** 328,334 **** cost_index(IndexPath *path, PlannerInfo *root,
  											(double) index->pages,
  											root);
  
! 		min_IO_cost = (pages_fetched * random_page_cost) / num_scans;
  	}
  	else
  	{
--- 347,353 ----
  											(double) index->pages,
  											root);
  
! 		min_IO_cost = (pages_fetched * spc_random_page_cost) / num_scans;
  	}
  	else
  	{
***************
*** 342,354 **** cost_index(IndexPath *path, PlannerInfo *root,
  											root);
  
  		/* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
! 		max_IO_cost = pages_fetched * random_page_cost;
  
  		/* min_IO_cost is for the perfectly correlated case (csquared=1) */
  		pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
! 		min_IO_cost = random_page_cost;
  		if (pages_fetched > 1)
! 			min_IO_cost += (pages_fetched - 1) * seq_page_cost;
  	}
  
  	/*
--- 361,373 ----
  											root);
  
  		/* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
! 		max_IO_cost = pages_fetched * spc_random_page_cost;
  
  		/* min_IO_cost is for the perfectly correlated case (csquared=1) */
  		pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
! 		min_IO_cost = spc_random_page_cost;
  		if (pages_fetched > 1)
! 			min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;
  	}
  
  	/*
***************
*** 553,558 **** cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
--- 572,579 ----
  	Cost		cost_per_page;
  	double		tuples_fetched;
  	double		pages_fetched;
+ 	double		spc_seq_page_cost,
+ 				spc_random_page_cost;
  	double		T;
  
  	/* Should only be applied to base relations */
***************
*** 571,576 **** cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
--- 592,602 ----
  
  	startup_cost += indexTotalCost;
  
+ 	/* Fetch estimated page costs for tablespace containing table. */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  &spc_seq_page_cost);
+ 
  	/*
  	 * Estimate number of main-table pages fetched.
  	 */
***************
*** 609,625 **** cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
  		pages_fetched = ceil(pages_fetched);
  
  	/*
! 	 * For small numbers of pages we should charge random_page_cost apiece,
  	 * while if nearly all the table's pages are being read, it's more
! 	 * appropriate to charge seq_page_cost apiece.	The effect is nonlinear,
  	 * too. For lack of a better idea, interpolate like this to determine the
  	 * cost per page.
  	 */
  	if (pages_fetched >= 2.0)
! 		cost_per_page = random_page_cost -
! 			(random_page_cost - seq_page_cost) * sqrt(pages_fetched / T);
  	else
! 		cost_per_page = random_page_cost;
  
  	run_cost += pages_fetched * cost_per_page;
  
--- 635,652 ----
  		pages_fetched = ceil(pages_fetched);
  
  	/*
! 	 * For small numbers of pages we should charge spc_random_page_cost apiece,
  	 * while if nearly all the table's pages are being read, it's more
! 	 * appropriate to charge spc_seq_page_cost apiece.	The effect is nonlinear,
  	 * too. For lack of a better idea, interpolate like this to determine the
  	 * cost per page.
  	 */
  	if (pages_fetched >= 2.0)
! 		cost_per_page = spc_random_page_cost -
! 			(spc_random_page_cost - spc_seq_page_cost)
! 			* sqrt(pages_fetched / T);
  	else
! 		cost_per_page = spc_random_page_cost;
  
  	run_cost += pages_fetched * cost_per_page;
  
***************
*** 783,788 **** cost_tidscan(Path *path, PlannerInfo *root,
--- 810,816 ----
  	QualCost	tid_qual_cost;
  	int			ntuples;
  	ListCell   *l;
+ 	double		spc_random_page_cost;
  
  	/* Should only be applied to base relations */
  	Assert(baserel->relid > 0);
***************
*** 835,842 **** cost_tidscan(Path *path, PlannerInfo *root,
  	 */
  	cost_qual_eval(&tid_qual_cost, tidquals, root);
  
  	/* disk costs --- assume each tuple on a different page */
! 	run_cost += random_page_cost * ntuples;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup +
--- 863,875 ----
  	 */
  	cost_qual_eval(&tid_qual_cost, tidquals, root);
  
+ 	/* fetch estimated page cost for tablespace containing table */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  NULL);
+ 
  	/* disk costs --- assume each tuple on a different page */
! 	run_cost += spc_random_page_cost * ntuples;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup +
*** a/src/backend/optimizer/util/plancat.c
--- b/src/backend/optimizer/util/plancat.c
***************
*** 91,96 **** get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
--- 91,97 ----
  
  	rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1;
  	rel->max_attr = RelationGetNumberOfAttributes(relation);
+ 	rel->reltablespace = RelationGetForm(relation)->reltablespace;
  
  	Assert(rel->max_attr >= rel->min_attr);
  	rel->attr_needed = (Relids *)
***************
*** 183,188 **** get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
--- 184,191 ----
  			info = makeNode(IndexOptInfo);
  
  			info->indexoid = index->indexrelid;
+ 			info->reltablespace =
+ 				RelationGetForm(indexRelation)->reltablespace;
  			info->rel = rel;
  			info->ncolumns = ncolumns = index->indnatts;
  
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 5687,5692 **** RenameStmt: ALTER AGGREGATE func_name aggr_args RENAME TO name
--- 5687,5710 ----
  					n->newname = $6;
  					$$ = (Node *)n;
  				}
+ 			| ALTER TABLESPACE name SET reloptions
+ 				{
+ 					AlterTableSpaceOptionsStmt *n =
+ 						makeNode(AlterTableSpaceOptionsStmt);
+ 					n->tablespacename = $3;
+ 					n->options = $5;
+ 					n->isReset = FALSE;
+ 					$$ = (Node *)n;
+ 				}
+ 			| ALTER TABLESPACE name RESET reloptions
+ 				{
+ 					AlterTableSpaceOptionsStmt *n =
+ 						makeNode(AlterTableSpaceOptionsStmt);
+ 					n->tablespacename = $3;
+ 					n->options = $5;
+ 					n->isReset = TRUE;
+ 					$$ = (Node *)n;
+ 				}
  			| ALTER TEXT_P SEARCH PARSER any_name RENAME TO name
  				{
  					RenameStmt *n = makeNode(RenameStmt);
*** a/src/backend/tcop/utility.c
--- b/src/backend/tcop/utility.c
***************
*** 218,223 **** check_xact_readonly(Node *parsetree)
--- 218,224 ----
  		case T_CreateUserMappingStmt:
  		case T_AlterUserMappingStmt:
  		case T_DropUserMappingStmt:
+ 		case T_AlterTableSpaceOptionsStmt:
  			ereport(ERROR,
  					(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
  					 errmsg("transaction is read-only")));
***************
*** 528,533 **** standard_ProcessUtility(Node *parsetree,
--- 529,538 ----
  			DropTableSpace((DropTableSpaceStmt *) parsetree);
  			break;
  
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			AlterTableSpaceOptions((AlterTableSpaceOptionsStmt *) parsetree);
+ 			break;
+ 
  		case T_CreateFdwStmt:
  			CreateForeignDataWrapper((CreateFdwStmt *) parsetree);
  			break;
***************
*** 1456,1461 **** CreateCommandTag(Node *parsetree)
--- 1461,1470 ----
  			tag = "DROP TABLESPACE";
  			break;
  
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			tag = "ALTER TABLESPACE";
+ 			break;
+ 
  		case T_CreateFdwStmt:
  			tag = "CREATE FOREIGN DATA WRAPPER";
  			break;
***************
*** 2238,2243 **** GetCommandLogLevel(Node *parsetree)
--- 2247,2256 ----
  			lev = LOGSTMT_DDL;
  			break;
  
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			lev = LOGSTMT_DDL;
+ 			break;
+ 
  		case T_CreateFdwStmt:
  		case T_AlterFdwStmt:
  		case T_DropFdwStmt:
*** a/src/backend/utils/adt/selfuncs.c
--- b/src/backend/utils/adt/selfuncs.c
***************
*** 117,122 ****
--- 117,123 ----
  #include "utils/nabstime.h"
  #include "utils/pg_locale.h"
  #include "utils/selfuncs.h"
+ #include "utils/spccache.h"
  #include "utils/syscache.h"
  
  
***************
*** 5372,5377 **** genericcostestimate(PlannerInfo *root,
--- 5373,5379 ----
  	QualCost	index_qual_cost;
  	double		qual_op_cost;
  	double		qual_arg_cost;
+ 	double		spc_random_page_cost;
  	List	   *selectivityQuals;
  	ListCell   *l;
  
***************
*** 5480,5485 **** genericcostestimate(PlannerInfo *root,
--- 5482,5492 ----
  	else
  		numIndexPages = 1.0;
  
+ 	/* fetch estimated page cost for tablespace containing index */
+ 	get_tablespace_page_costs(index->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  NULL);
+ 
  	/*
  	 * Now compute the disk access costs.
  	 *
***************
*** 5526,5540 **** genericcostestimate(PlannerInfo *root,
  		 * share for each outer scan.  (Don't pro-rate for ScalarArrayOpExpr,
  		 * since that's internal to the indexscan.)
  		 */
! 		*indexTotalCost = (pages_fetched * random_page_cost) / num_outer_scans;
  	}
  	else
  	{
  		/*
! 		 * For a single index scan, we just charge random_page_cost per page
! 		 * touched.
  		 */
! 		*indexTotalCost = numIndexPages * random_page_cost;
  	}
  
  	/*
--- 5533,5548 ----
  		 * share for each outer scan.  (Don't pro-rate for ScalarArrayOpExpr,
  		 * since that's internal to the indexscan.)
  		 */
! 		*indexTotalCost = (pages_fetched * spc_random_page_cost)
! 							/ num_outer_scans;
  	}
  	else
  	{
  		/*
! 		 * For a single index scan, we just charge spc_random_page_cost per
! 		 * page touched.
  		 */
! 		*indexTotalCost = numIndexPages * spc_random_page_cost;
  	}
  
  	/*
***************
*** 5549,5559 **** genericcostestimate(PlannerInfo *root,
  	 *
  	 * We can deal with this by adding a very small "fudge factor" that
  	 * depends on the index size.  The fudge factor used here is one
! 	 * random_page_cost per 100000 index pages, which should be small enough
! 	 * to not alter index-vs-seqscan decisions, but will prevent indexes of
! 	 * different sizes from looking exactly equally attractive.
  	 */
! 	*indexTotalCost += index->pages * random_page_cost / 100000.0;
  
  	/*
  	 * CPU cost: any complex expressions in the indexquals will need to be
--- 5557,5567 ----
  	 *
  	 * We can deal with this by adding a very small "fudge factor" that
  	 * depends on the index size.  The fudge factor used here is one
! 	 * spc_random_page_cost per 100000 index pages, which should be small
! 	 * enough to not alter index-vs-seqscan decisions, but will prevent
! 	 * indexes of different sizes from looking exactly equally attractive.
  	 */
! 	*indexTotalCost += index->pages * spc_random_page_cost / 100000.0;
  
  	/*
  	 * CPU cost: any complex expressions in the indexquals will need to be
*** a/src/backend/utils/cache/Makefile
--- b/src/backend/utils/cache/Makefile
***************
*** 13,18 **** top_builddir = ../../../..
  include $(top_builddir)/src/Makefile.global
  
  OBJS = catcache.o inval.o plancache.o relcache.o \
! 	syscache.o lsyscache.o typcache.o ts_cache.o
  
  include $(top_srcdir)/src/backend/common.mk
--- 13,18 ----
  include $(top_builddir)/src/Makefile.global
  
  OBJS = catcache.o inval.o plancache.o relcache.o \
! 	spccache.o syscache.o lsyscache.o typcache.o ts_cache.o
  
  include $(top_srcdir)/src/backend/common.mk
*** a/src/backend/utils/cache/lsyscache.c
--- b/src/backend/utils/cache/lsyscache.c
***************
*** 17,22 ****
--- 17,23 ----
  
  #include "access/hash.h"
  #include "access/nbtree.h"
+ #include "access/reloptions.h"
  #include "bootstrap/bootstrap.h"
  #include "catalog/pg_amop.h"
  #include "catalog/pg_amproc.h"
***************
*** 26,34 ****
--- 27,38 ----
  #include "catalog/pg_operator.h"
  #include "catalog/pg_proc.h"
  #include "catalog/pg_statistic.h"
+ #include "catalog/pg_tablespace.h"
  #include "catalog/pg_type.h"
+ #include "commands/tablespace.h"
  #include "miscadmin.h"
  #include "nodes/makefuncs.h"
+ #include "optimizer/cost.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
  #include "utils/datum.h"
*** /dev/null
--- b/src/backend/utils/cache/spccache.c
***************
*** 0 ****
--- 1,183 ----
+ /*-------------------------------------------------------------------------
+  *
+  * spccache.c
+  *	  Tablespace cache management.
+  *
+  * We cache the parsed version of spcoptions for each tablespace to avoid
+  * needing to reparse on every lookup.  Right now, there doesn't appear to
+  * be a measurable performance gain from doing this, but that might change
+  * in the future as we add more options.
+  *
+  * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * IDENTIFICATION
+  *	  $PostgreSQL$
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "postgres.h"
+ #include "access/reloptions.h"
+ #include "catalog/pg_tablespace.h"
+ #include "commands/tablespace.h"
+ #include "miscadmin.h"
+ #include "optimizer/cost.h"
+ #include "utils/catcache.h"
+ #include "utils/hsearch.h"
+ #include "utils/inval.h"
+ #include "utils/spccache.h"
+ #include "utils/syscache.h"
+ 
+ static HTAB *TableSpaceCacheHash = NULL;
+ 
+ typedef struct {
+ 	Oid			oid;
+ 	TableSpaceOpts *opts;
+ } TableSpace;
+ 
+ /*
+  * InvalidateTableSpaceCacheCallback
+  *		Flush all cache entries when pg_tablespace is updated.
+  *
+  * When pg_tablespace is updated, we must flush the cache entry at least
+  * for that tablespace.  Currently, we just flush them all.  This is quick
+  * and easy and doesn't cost much, since there shouldn't be terribly many
+  * tablespaces, nor do we expect them to be frequently modified.
+  */
+ static void
+ InvalidateTableSpaceCacheCallback(Datum arg, int cacheid, ItemPointer tuplePtr)
+ {
+ 	HASH_SEQ_STATUS status;
+ 	TableSpace *spc;
+ 
+ 	hash_seq_init(&status, TableSpaceCacheHash);
+ 	while ((spc = (TableSpace *) hash_seq_search(&status)) != NULL)
+ 	{
+ 		if (spc->opts)
+ 			pfree(spc->opts);
+ 		if (hash_search(TableSpaceCacheHash, (void *) &spc->oid, HASH_REMOVE,
+ 						NULL) == NULL)
+ 			elog(ERROR, "hash table corrupted");
+ 	}
+ }
+ 
+ /*
+  * InitializeTableSpaceCache
+  *		Initialize the tablespace cache.
+  */
+ static void
+ InitializeTableSpaceCache(void)
+ {
+ 	HASHCTL ctl;
+ 
+ 	/* Initialize the hash table. */
+ 	MemSet(&ctl, 0, sizeof(ctl));
+ 	ctl.keysize = sizeof(Oid);
+ 	ctl.entrysize = sizeof(TableSpace);
+ 	ctl.hash = tag_hash;
+ 	TableSpaceCacheHash =
+ 		hash_create("TableSpace cache", 16, &ctl,
+ 				    HASH_ELEM | HASH_FUNCTION);
+ 
+ 	/* Make sure we've initialized CacheMemoryContext. */
+ 	if (!CacheMemoryContext)
+ 		CreateCacheMemoryContext();
+ 
+ 	/* Watch for invalidation events. */
+ 	CacheRegisterSyscacheCallback(TABLESPACEOID,
+ 								  InvalidateTableSpaceCacheCallback,
+ 								  (Datum) 0);
+ }
+ 
+ /*
+  * get_tablespace
+  *		Fetch TableSpace structure for a specified tablespace OID.
+  *
+  * Pointers returned by this function should not be stored, since a cache
+  * flush will invalidate them.
+  */
+ static TableSpace *
+ get_tablespace(Oid spcid)
+ {
+ 	HeapTuple	tp;
+ 	TableSpace *spc;
+ 	bool		found;
+ 
+ 	/*
+ 	 * Since spcid is always from a pg_class tuple, InvalidOid implies the
+ 	 * default.
+ 	 */
+ 	if (spcid == InvalidOid)
+ 		spcid = MyDatabaseTableSpace;
+ 
+ 	/* Find existing cache entry, or create a new one. */
+ 	if (!TableSpaceCacheHash)
+ 		InitializeTableSpaceCache();
+ 	spc = (TableSpace *) hash_search(TableSpaceCacheHash, (void *) &spcid,
+ 									 HASH_ENTER, &found);
+ 	if (found)
+ 		return spc;
+ 
+ 	/*
+ 	 * Not found in TableSpace cache.  Check catcache.  If we don't find a
+ 	 * valid HeapTuple, it must mean someone has managed to request tablespace
+ 	 * details for a non-existent tablespace.  We'll just treat that case as if
+ 	 * no options were specified.
+ 	 */
+ 	tp = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(spcid), 0, 0, 0);
+ 	if (!HeapTupleIsValid(tp))
+ 		spc->opts = NULL;
+ 	else
+ 	{
+ 		Datum	datum;
+ 		bool	isNull;
+ 		MemoryContext octx;
+ 
+ 		datum = SysCacheGetAttr(TABLESPACEOID,
+ 								tp,
+ 								Anum_pg_tablespace_spcoptions,
+ 								&isNull);
+ 		if (isNull)
+ 			spc->opts = NULL;
+ 		else
+ 		{
+ 			octx = MemoryContextSwitchTo(CacheMemoryContext);
+ 			spc->opts = (TableSpaceOpts *) tablespace_reloptions(datum, false);
+ 			MemoryContextSwitchTo(octx);
+ 		}
+ 		ReleaseSysCache(tp);
+ 	}
+ 
+ 	/* Update new TableSpace cache entry with results of option parsing. */
+ 	return spc;
+ }
+ 
+ /*
+  * get_tablespace_page_costs
+  *		Return random and sequential page costs for a given tablespace.
+  */
+ void
+ get_tablespace_page_costs(Oid spcid, double *spc_random_page_cost,
+ 							   double *spc_seq_page_cost)
+ {
+ 	TableSpace *spc = get_tablespace(spcid);
+ 
+ 	Assert(spc != NULL);
+ 
+ 	if (spc_random_page_cost)
+ 	{
+ 		if (!spc->opts || spc->opts->random_page_cost < 0)
+ 			*spc_random_page_cost = random_page_cost;
+ 		else
+ 			*spc_random_page_cost = spc->opts->random_page_cost;
+ 	}
+ 
+ 	if (spc_seq_page_cost)
+ 	{
+ 		if (!spc->opts || spc->opts->seq_page_cost < 0)
+ 			*spc_seq_page_cost = seq_page_cost;
+ 		else
+ 			*spc_seq_page_cost = spc->opts->seq_page_cost;
+ 	}
+ }
*** a/src/backend/utils/cache/syscache.c
--- b/src/backend/utils/cache/syscache.c
***************
*** 43,48 ****
--- 43,49 ----
  #include "catalog/pg_proc.h"
  #include "catalog/pg_rewrite.h"
  #include "catalog/pg_statistic.h"
+ #include "catalog/pg_tablespace.h"
  #include "catalog/pg_ts_config.h"
  #include "catalog/pg_ts_config_map.h"
  #include "catalog/pg_ts_dict.h"
***************
*** 609,614 **** static const struct cachedesc cacheinfo[] = {
--- 610,627 ----
  		},
  		1024
  	},
+ 	{TableSpaceRelationId,		/* TABLESPACEOID */
+ 		TablespaceOidIndexId,
+ 		0,
+ 		1,
+ 		{
+ 			ObjectIdAttributeNumber,
+ 			0,
+ 			0,
+ 			0,
+ 		},
+ 		16
+ 	},
  	{TSConfigMapRelationId,		/* TSCONFIGMAP */
  		TSConfigMapIndexId,
  		0,
*** a/src/bin/pg_dump/pg_dumpall.c
--- b/src/bin/pg_dump/pg_dumpall.c
***************
*** 956,974 **** dumpTablespaces(PGconn *conn)
  	 * Get all tablespaces except built-in ones (which we assume are named
  	 * pg_xxx)
  	 */
! 	if (server_version >= 80200)
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
  						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
  	else
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
! 						   "null "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
--- 956,983 ----
  	 * Get all tablespaces except built-in ones (which we assume are named
  	 * pg_xxx)
  	 */
! 	if (server_version >= 80500)
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
+ 						   "array_to_string(spcoptions, ', '),"
  						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
+ 	else if (server_version >= 80200)
+ 		res = executeQuery(conn, "SELECT spcname, "
+ 						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
+ 						   "spclocation, spcacl, null, "
+ 						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
+ 						   "FROM pg_catalog.pg_tablespace "
+ 						   "WHERE spcname !~ '^pg_' "
+ 						   "ORDER BY 1");
  	else
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
! 						   "null, null "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
***************
*** 983,989 **** dumpTablespaces(PGconn *conn)
  		char	   *spcowner = PQgetvalue(res, i, 1);
  		char	   *spclocation = PQgetvalue(res, i, 2);
  		char	   *spcacl = PQgetvalue(res, i, 3);
! 		char	   *spccomment = PQgetvalue(res, i, 4);
  		char	   *fspcname;
  
  		/* needed for buildACLCommands() */
--- 992,999 ----
  		char	   *spcowner = PQgetvalue(res, i, 1);
  		char	   *spclocation = PQgetvalue(res, i, 2);
  		char	   *spcacl = PQgetvalue(res, i, 3);
! 		char	   *spcoptions = PQgetvalue(res, i, 4);
! 		char	   *spccomment = PQgetvalue(res, i, 5);
  		char	   *fspcname;
  
  		/* needed for buildACLCommands() */
***************
*** 996,1001 **** dumpTablespaces(PGconn *conn)
--- 1006,1015 ----
  		appendStringLiteralConn(buf, spclocation, conn);
  		appendPQExpBuffer(buf, ";\n");
  
+ 		if (spcoptions && spcoptions[0] != '\0')
+ 			appendPQExpBuffer(buf, "ALTER TABLESPACE %s SET (%s);\n",
+ 							  fspcname, spcoptions);
+ 
  		if (!skip_acls &&
  			!buildACLCommands(fspcname, NULL, "TABLESPACE", spcacl, spcowner,
  							  "", server_version, buf))
*** a/src/include/access/reloptions.h
--- b/src/include/access/reloptions.h
***************
*** 1,7 ****
  /*-------------------------------------------------------------------------
   *
   * reloptions.h
!  *	  Core support for relation options (pg_class.reloptions)
   *
   * Note: the functions dealing with text-array reloptions values declare
   * them as Datum, not ArrayType *, to avoid needing to include array.h
--- 1,8 ----
  /*-------------------------------------------------------------------------
   *
   * reloptions.h
!  *	  Core support for relation and tablespace options (pg_class.reloptions
!  *	  and pg_tablespace.spcoptions)
   *
   * Note: the functions dealing with text-array reloptions values declare
   * them as Datum, not ArrayType *, to avoid needing to include array.h
***************
*** 39,46 **** typedef enum relopt_kind
  	RELOPT_KIND_HASH = (1 << 3),
  	RELOPT_KIND_GIN = (1 << 4),
  	RELOPT_KIND_GIST = (1 << 5),
  	/* if you add a new kind, make sure you update "last_default" too */
! 	RELOPT_KIND_LAST_DEFAULT = RELOPT_KIND_GIST,
  	/* some compilers treat enums as signed ints, so we can't use 1 << 31 */
  	RELOPT_KIND_MAX = (1 << 30)
  } relopt_kind;
--- 40,48 ----
  	RELOPT_KIND_HASH = (1 << 3),
  	RELOPT_KIND_GIN = (1 << 4),
  	RELOPT_KIND_GIST = (1 << 5),
+ 	RELOPT_KIND_TABLESPACE = (1 << 6),
  	/* if you add a new kind, make sure you update "last_default" too */
! 	RELOPT_KIND_LAST_DEFAULT = RELOPT_KIND_TABLESPACE,
  	/* some compilers treat enums as signed ints, so we can't use 1 << 31 */
  	RELOPT_KIND_MAX = (1 << 30)
  } relopt_kind;
***************
*** 264,268 **** extern bytea *default_reloptions(Datum reloptions, bool validate,
--- 266,271 ----
  extern bytea *heap_reloptions(char relkind, Datum reloptions, bool validate);
  extern bytea *index_reloptions(RegProcedure amoptions, Datum reloptions,
  				 bool validate);
+ extern bytea *tablespace_reloptions(Datum reloptions, bool validate);
  
  #endif   /* RELOPTIONS_H */
*** a/src/include/catalog/pg_tablespace.h
--- b/src/include/catalog/pg_tablespace.h
***************
*** 34,39 **** CATALOG(pg_tablespace,1213) BKI_SHARED_RELATION
--- 34,40 ----
  	Oid			spcowner;		/* owner of tablespace */
  	text		spclocation;	/* physical location (VAR LENGTH) */
  	aclitem		spcacl[1];		/* access permissions (VAR LENGTH) */
+ 	text		spcoptions[1];	/* per-tablespace options */
  } FormData_pg_tablespace;
  
  /* ----------------
***************
*** 48,61 **** typedef FormData_pg_tablespace *Form_pg_tablespace;
   * ----------------
   */
  
! #define Natts_pg_tablespace				4
  #define Anum_pg_tablespace_spcname		1
  #define Anum_pg_tablespace_spcowner		2
  #define Anum_pg_tablespace_spclocation	3
  #define Anum_pg_tablespace_spcacl		4
  
! DATA(insert OID = 1663 ( pg_default PGUID "" _null_ ));
! DATA(insert OID = 1664 ( pg_global	PGUID "" _null_ ));
  
  #define DEFAULTTABLESPACE_OID 1663
  #define GLOBALTABLESPACE_OID 1664
--- 49,63 ----
   * ----------------
   */
  
! #define Natts_pg_tablespace				5
  #define Anum_pg_tablespace_spcname		1
  #define Anum_pg_tablespace_spcowner		2
  #define Anum_pg_tablespace_spclocation	3
  #define Anum_pg_tablespace_spcacl		4
+ #define Anum_pg_tablespace_spcoptions	5
  
! DATA(insert OID = 1663 ( pg_default PGUID "" _null_ _null_ ));
! DATA(insert OID = 1664 ( pg_global	PGUID "" _null_ _null_ ));
  
  #define DEFAULTTABLESPACE_OID 1663
  #define GLOBALTABLESPACE_OID 1664
*** a/src/include/commands/tablespace.h
--- b/src/include/commands/tablespace.h
***************
*** 32,42 **** typedef struct xl_tblspc_drop_rec
--- 32,48 ----
  	Oid			ts_id;
  } xl_tblspc_drop_rec;
  
+ typedef struct TableSpaceOpts
+ {
+ 	float8		random_page_cost;
+ 	float8		seq_page_cost;
+ } TableSpaceOpts;
  
  extern void CreateTableSpace(CreateTableSpaceStmt *stmt);
  extern void DropTableSpace(DropTableSpaceStmt *stmt);
  extern void RenameTableSpace(const char *oldname, const char *newname);
  extern void AlterTableSpaceOwner(const char *name, Oid newOwnerId);
+ extern void AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt);
  
  extern void TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo);
  
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
***************
*** 346,351 **** typedef enum NodeTag
--- 346,352 ----
  	T_CreateUserMappingStmt,
  	T_AlterUserMappingStmt,
  	T_DropUserMappingStmt,
+ 	T_AlterTableSpaceOptionsStmt,
  
  	/*
  	 * TAGS FOR PARSE TREE NODES (parsenodes.h)
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 1477,1482 **** typedef struct DropTableSpaceStmt
--- 1477,1490 ----
  	bool		missing_ok;		/* skip error if missing? */
  } DropTableSpaceStmt;
  
+ typedef struct AlterTableSpaceOptionsStmt
+ {
+ 	NodeTag		type;
+ 	char	   *tablespacename;
+ 	List	   *options;
+ 	bool		isReset;
+ } AlterTableSpaceOptionsStmt;
+ 
  /* ----------------------
   *		Create/Drop FOREIGN DATA WRAPPER Statements
   * ----------------------
*** a/src/include/nodes/relation.h
--- b/src/include/nodes/relation.h
***************
*** 371,376 **** typedef struct RelOptInfo
--- 371,377 ----
  
  	/* information about a base rel (not set for join rels!) */
  	Index		relid;
+ 	Oid			reltablespace;	/* containing tablespace */
  	RTEKind		rtekind;		/* RELATION, SUBQUERY, or FUNCTION */
  	AttrNumber	min_attr;		/* smallest attrno of rel (often <0) */
  	AttrNumber	max_attr;		/* largest attrno of rel */
***************
*** 435,440 **** typedef struct IndexOptInfo
--- 436,442 ----
  	NodeTag		type;
  
  	Oid			indexoid;		/* OID of the index relation */
+ 	Oid			reltablespace;	/* tablespace of index (not table) */
  	RelOptInfo *rel;			/* back-link to index's table */
  
  	/* statistics from pg_class */
*** /dev/null
--- b/src/include/utils/spccache.h
***************
*** 0 ****
--- 1,19 ----
+ /*-------------------------------------------------------------------------
+  *
+  * spccache.h
+  *	  Tablespace cache.
+  *
+  * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * $PostgreSQL$
+  *
+  *-------------------------------------------------------------------------
+  */
+ #ifndef SPCCACHE_H
+ #define SPCCACHE_H
+ 
+ void get_tablespace_page_costs(Oid spcid, float8 *spc_random_page_cost,
+ 					     float8 *spc_seq_page_cost);
+ 
+ #endif   /* SPCCACHE_H */
*** a/src/include/utils/syscache.h
--- b/src/include/utils/syscache.h
***************
*** 71,76 **** enum SysCacheIdentifier
--- 71,77 ----
  	RELOID,
  	RULERELNAME,
  	STATRELATT,
+ 	TABLESPACEOID,
  	TSCONFIGMAP,
  	TSCONFIGNAMENSP,
  	TSCONFIGOID,
*** a/src/test/regress/input/tablespace.source
--- b/src/test/regress/input/tablespace.source
***************
*** 1,6 ****
--- 1,12 ----
  -- create a tablespace we can use
  CREATE TABLESPACE testspace LOCATION '@testtablespace@';
  
+ -- try setting and resetting some properties for the new tablespace
+ ALTER TABLESPACE testspace SET (random_page_cost = 1.0);
+ ALTER TABLESPACE testspace SET (some_nonexistent_parameter = true);  -- fail
+ ALTER TABLESPACE testspace RESET (random_page_cost = 2.0); -- fail
+ ALTER TABLESPACE testspace RESET (random_page_cost, seq_page_cost); -- ok
+ 
  -- create a schema we can use
  CREATE SCHEMA testschema;
  
*** a/src/test/regress/output/tablespace.source
--- b/src/test/regress/output/tablespace.source
***************
*** 1,5 ****
--- 1,12 ----
  -- create a tablespace we can use
  CREATE TABLESPACE testspace LOCATION '@testtablespace@';
+ -- try setting and resetting some properties for the new tablespace
+ ALTER TABLESPACE testspace SET (random_page_cost = 1.0);
+ ALTER TABLESPACE testspace SET (some_nonexistent_parameter = true);  -- fail
+ ERROR:  unrecognized parameter "some_nonexistent_parameter"
+ ALTER TABLESPACE testspace RESET (random_page_cost = 2.0); -- fail
+ ERROR:  RESET must not include values for parameters
+ ALTER TABLESPACE testspace RESET (random_page_cost, seq_page_cost); -- ok
  -- create a schema we can use
  CREATE SCHEMA testschema;
  -- try a table
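
For readers following the spccache.h declaration above, here is a minimal standalone sketch of how a lookup like get_tablespace_page_costs() might resolve its two outputs. It is not the patch's implementation. It assumes that a negative stored value marks an option as unset and that unset options fall back to the global settings; the function name sketch_page_costs and the two global variables are hypothetical stand-ins. The one behaviour taken directly from the callers in the patch is that either output pointer may be NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the global settings (documented defaults). */
double		seq_page_cost = 1.0;
double		random_page_cost = 4.0;

/* Same shape as the patch's TableSpaceOpts; negative is assumed to mean "unset". */
typedef struct TableSpaceOpts
{
	double		random_page_cost;
	double		seq_page_cost;
} TableSpaceOpts;

/*
 * Resolve the page costs for one tablespace.  opts is NULL when the
 * tablespace has no options stored at all.  Either output pointer may be
 * NULL when the caller needs only one of the two costs.
 */
void
sketch_page_costs(const TableSpaceOpts *opts,
				  double *spc_random_page_cost,
				  double *spc_seq_page_cost)
{
	/* Asking for neither cost would be a caller bug. */
	assert(spc_random_page_cost != NULL || spc_seq_page_cost != NULL);

	if (spc_random_page_cost != NULL)
	{
		if (opts != NULL && opts->random_page_cost >= 0)
			*spc_random_page_cost = opts->random_page_cost;
		else
			*spc_random_page_cost = random_page_cost;	/* global fallback */
	}

	if (spc_seq_page_cost != NULL)
	{
		if (opts != NULL && opts->seq_page_cost >= 0)
			*spc_seq_page_cost = opts->seq_page_cost;
		else
			*spc_seq_page_cost = seq_page_cost;	/* global fallback */
	}
}
```

With this shape, a RESET of one option only needs to clear the stored value; the next lookup then picks up the global setting again.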

#17

Jaime Casanova

jcasanov@systemguards.com.ec

about 16 years ago

In reply to: Robert Haas (#16)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Mon, Dec 28, 2009 at 2:52 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Hearing no thoughts, I have implemented as per the above. PFA the
latest version. Any reviews, comments, feedback, etc. much
appreciated.

btw, you need to change

STATRELATT,

for

STATRELATTINH,

in syscache.c

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

#18

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Jaime Casanova (#17)

1 attachment(s)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Sun, Jan 3, 2010 at 6:56 PM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:

On Mon, Dec 28, 2009 at 2:52 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Hearing no thoughts, I have implemented as per the above. PFA the
latest version. Any reviews, comments, feedback, etc. much
appreciated.

btw, you need to change

STATRELATT,

for

STATRELATTINH,

in syscache.c

Hmm, I see this needs to be rebased over Tom's latest changes, but the
conflict I got was in syscache.h, rather than syscache.c. Not sure if
that's what you were going for or if there's another issue. Updated
patch attached.

...Robert
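
As a reading aid for the costsize.c part of the attachment, the non-nested-loop branch of cost_index() can be sketched as standalone C. This is an illustration, not code from the patch: the function name and the plain double parameters are hypothetical, both page counts are taken as inputs rather than estimated, and the final line assumes the linear interpolation on the squared correlation that the code comment describes. In the patch itself the two costs come from get_tablespace_page_costs().

```c
/*
 * Sketch of the heap I/O cost for a single index scan.
 *
 * pages_uncorrelated: pages fetched if index order is unrelated to heap order
 * pages_correlated:   pages fetched if index order matches heap order exactly
 * correlation:        index/heap order correlation, in [-1, 1]
 */
double
sketch_index_io_cost(double pages_uncorrelated,
					 double pages_correlated,
					 double correlation,
					 double spc_random_page_cost,
					 double spc_seq_page_cost)
{
	double		csquared = correlation * correlation;
	double		max_IO_cost;
	double		min_IO_cost;

	/* perfectly uncorrelated case: every page is a random fetch */
	max_IO_cost = pages_uncorrelated * spc_random_page_cost;

	/* perfectly correlated case: one random fetch, the rest sequential */
	min_IO_cost = spc_random_page_cost;
	if (pages_correlated > 1)
		min_IO_cost += (pages_correlated - 1) * spc_seq_page_cost;

	/* interpolate between the two extremes on correlation squared */
	return max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
}
```

Lowering random_page_cost for a tablespace on fast storage shrinks both extremes, so index scans on tables stored there look cheaper relative to sequential scans.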

Attachments:

spcoptions-v4.patch (text/x-patch; charset=US-ASCII)

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 2000,2005 **** archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows
--- 2000,2008 ----
         <para>
          Sets the planner's estimate of the cost of a disk page fetch
          that is part of a series of sequential fetches.  The default is 1.0.
+         This value can be overridden for a particular tablespace by setting
+         the tablespace parameter of the same name
+         (see <xref linkend="sql-altertablespace">).
         </para>
        </listitem>
       </varlistentry>
***************
*** 2013,2018 **** archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows
--- 2016,2027 ----
         <para>
          Sets the planner's estimate of the cost of a
          non-sequentially-fetched disk page.  The default is 4.0.
+         This value can be overridden for a particular tablespace by setting
+         the tablespace parameter of the same name
+         (see <xref linkend="sql-altertablespace">).
+        </para>
+ 
+        <para>
          Reducing this value relative to <varname>seq_page_cost</>
          will cause the system to prefer index scans; raising it will
          make index scans look relatively more expensive.  You can raise
*** a/doc/src/sgml/ref/alter_tablespace.sgml
--- b/doc/src/sgml/ref/alter_tablespace.sgml
***************
*** 23,28 **** PostgreSQL documentation
--- 23,30 ----
  <synopsis>
  ALTER TABLESPACE <replaceable>name</replaceable> RENAME TO <replaceable>new_name</replaceable>
  ALTER TABLESPACE <replaceable>name</replaceable> OWNER TO <replaceable>new_owner</replaceable>
+ ALTER TABLESPACE <replaceable>name</replaceable> SET ( <replaceable class="PARAMETER">tablespace_option</replaceable> = <replaceable class="PARAMETER">value</replaceable> [, ... ] )
+ ALTER TABLESPACE <replaceable>name</replaceable> RESET ( <replaceable class="PARAMETER">tablespace_option</replaceable> [, ... ] )
  </synopsis>
   </refsynopsisdiv>
    
***************
*** 74,79 **** ALTER TABLESPACE <replaceable>name</replaceable> OWNER TO <replaceable>new_owner
--- 76,99 ----
       </para>
      </listitem>
     </varlistentry>
+ 
+    <varlistentry>
+     <term><replaceable class="parameter">tablespace_option</replaceable></term>
+     <listitem>
+      <para>
+       A tablespace parameter to be set or reset.  Currently, the only
+       available parameters are <varname>seq_page_cost</> and
+       <varname>random_page_cost</>.  Setting either value for a particular
+       tablespace will override the planner's usual estimate of the cost of
+       reading pages from tables in that tablespace, as established by
+       the configuration parameters of the same name (see
+       <xref linkend="guc-seq-page-cost">,
+       <xref linkend="guc-random-page-cost">).  This may be useful if one
+       tablespace is located on a disk which is faster or slower than the
+       remainder of the I/O subsystem.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
   </refsect1>
  
*** a/src/backend/access/common/reloptions.c
--- b/src/backend/access/common/reloptions.c
***************
*** 21,26 ****
--- 21,27 ----
  #include "access/reloptions.h"
  #include "catalog/pg_type.h"
  #include "commands/defrem.h"
+ #include "commands/tablespace.h"
  #include "nodes/makefuncs.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
***************
*** 179,184 **** static relopt_real realRelOpts[] =
--- 180,201 ----
  		},
  		-1, 0.0, 100.0
  	},
+ 	{
+ 		{
+ 			"seq_page_cost",
+ 			"Sets the planner's estimate of the cost of a sequentially fetched disk page.",
+ 			RELOPT_KIND_TABLESPACE
+ 		},
+ 		-1, 0.0, DBL_MAX
+ 	},
+ 	{
+ 		{
+ 			"random_page_cost",
+ 			"Sets the planner's estimate of the cost of a nonsequentially fetched disk page.",
+ 			RELOPT_KIND_TABLESPACE
+ 		},
+ 		-1, 0.0, DBL_MAX
+ 	},
  	/* list terminator */
  	{{NULL}}
  };
***************
*** 1168,1170 **** index_reloptions(RegProcedure amoptions, Datum reloptions, bool validate)
--- 1185,1218 ----
  
  	return DatumGetByteaP(result);
  }
+ 
+ /*
+  * Option parser for tablespace reloptions
+  */
+ bytea *
+ tablespace_reloptions(Datum reloptions, bool validate)
+ {
+ 	relopt_value *options;
+ 	TableSpaceOpts	*tsopts;
+ 	int			numoptions;
+ 	static const relopt_parse_elt tab[] = {
+ 		{"random_page_cost", RELOPT_TYPE_REAL, offsetof(TableSpaceOpts, random_page_cost)},
+ 		{"seq_page_cost", RELOPT_TYPE_REAL, offsetof(TableSpaceOpts, seq_page_cost)}
+ 	};
+ 
+ 	options = parseRelOptions(reloptions, validate, RELOPT_KIND_TABLESPACE,
+ 							  &numoptions);
+ 
+ 	/* if none set, we're done */
+ 	if (numoptions == 0)
+ 		return NULL;
+ 
+ 	tsopts = allocateReloptStruct(sizeof(TableSpaceOpts), options, numoptions);
+ 
+ 	fillRelOptions((void *) tsopts, sizeof(TableSpaceOpts), options, numoptions,
+ 				   validate, tab, lengthof(tab));
+ 
+ 	pfree(options);
+ 
+ 	return (bytea *) tsopts;
+ }
*** a/src/backend/catalog/aclchk.c
--- b/src/backend/catalog/aclchk.c
***************
*** 2783,2800 **** ExecGrant_Tablespace(InternalGrant *istmt)
  		int			nnewmembers;
  		Oid		   *oldmembers;
  		Oid		   *newmembers;
- 		ScanKeyData entry[1];
- 		SysScanDesc scan;
  		HeapTuple	tuple;
  
! 		/* There's no syscache for pg_tablespace, so must look the hard way */
! 		ScanKeyInit(&entry[0],
! 					ObjectIdAttributeNumber,
! 					BTEqualStrategyNumber, F_OIDEQ,
! 					ObjectIdGetDatum(tblId));
! 		scan = systable_beginscan(relation, TablespaceOidIndexId, true,
! 								  SnapshotNow, 1, entry);
! 		tuple = systable_getnext(scan);
  		if (!HeapTupleIsValid(tuple))
  			elog(ERROR, "cache lookup failed for tablespace %u", tblId);
  
--- 2783,2793 ----
  		int			nnewmembers;
  		Oid		   *oldmembers;
  		Oid		   *newmembers;
  		HeapTuple	tuple;
  
! 		/* Search syscache for pg_tablespace */
! 		tuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(tblId),
! 							   0, 0, 0);
  		if (!HeapTupleIsValid(tuple))
  			elog(ERROR, "cache lookup failed for tablespace %u", tblId);
  
***************
*** 2865,2872 **** ExecGrant_Tablespace(InternalGrant *istmt)
  							  noldmembers, oldmembers,
  							  nnewmembers, newmembers);
  
! 		systable_endscan(scan);
! 
  		pfree(new_acl);
  
  		/* prevent error when processing duplicate objects */
--- 2858,2864 ----
  							  noldmembers, oldmembers,
  							  nnewmembers, newmembers);
  
! 		ReleaseSysCache(tuple);
  		pfree(new_acl);
  
  		/* prevent error when processing duplicate objects */
***************
*** 3696,3704 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  					  AclMode mask, AclMaskHow how)
  {
  	AclMode		result;
- 	Relation	pg_tablespace;
- 	ScanKeyData entry[1];
- 	SysScanDesc scan;
  	HeapTuple	tuple;
  	Datum		aclDatum;
  	bool		isNull;
--- 3688,3693 ----
***************
*** 3711,3727 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  
  	/*
  	 * Get the tablespace's ACL from pg_tablespace
- 	 *
- 	 * There's no syscache for pg_tablespace, so must look the hard way
  	 */
! 	pg_tablespace = heap_open(TableSpaceRelationId, AccessShareLock);
! 	ScanKeyInit(&entry[0],
! 				ObjectIdAttributeNumber,
! 				BTEqualStrategyNumber, F_OIDEQ,
! 				ObjectIdGetDatum(spc_oid));
! 	scan = systable_beginscan(pg_tablespace, TablespaceOidIndexId, true,
! 							  SnapshotNow, 1, entry);
! 	tuple = systable_getnext(scan);
  	if (!HeapTupleIsValid(tuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
--- 3700,3708 ----
  
  	/*
  	 * Get the tablespace's ACL from pg_tablespace
  	 */
! 	tuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(spc_oid),
! 						   0, 0, 0);
  	if (!HeapTupleIsValid(tuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
***************
*** 3729,3736 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  
  	ownerId = ((Form_pg_tablespace) GETSTRUCT(tuple))->spcowner;
  
! 	aclDatum = heap_getattr(tuple, Anum_pg_tablespace_spcacl,
! 							RelationGetDescr(pg_tablespace), &isNull);
  
  	if (isNull)
  	{
--- 3710,3718 ----
  
  	ownerId = ((Form_pg_tablespace) GETSTRUCT(tuple))->spcowner;
  
! 	aclDatum = SysCacheGetAttr(TABLESPACEOID, tuple,
! 								   Anum_pg_tablespace_spcacl,
! 								   &isNull);
  
  	if (isNull)
  	{
***************
*** 3750,3757 **** pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
  	if (acl && (Pointer) acl != DatumGetPointer(aclDatum))
  		pfree(acl);
  
! 	systable_endscan(scan);
! 	heap_close(pg_tablespace, AccessShareLock);
  
  	return result;
  }
--- 3732,3738 ----
  	if (acl && (Pointer) acl != DatumGetPointer(aclDatum))
  		pfree(acl);
  
! 	ReleaseSysCache(tuple);
  
  	return result;
  }
***************
*** 4338,4346 **** pg_namespace_ownercheck(Oid nsp_oid, Oid roleid)
  bool
  pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
  {
- 	Relation	pg_tablespace;
- 	ScanKeyData entry[1];
- 	SysScanDesc scan;
  	HeapTuple	spctuple;
  	Oid			spcowner;
  
--- 4319,4324 ----
***************
*** 4348,4364 **** pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
  	if (superuser_arg(roleid))
  		return true;
  
! 	/* There's no syscache for pg_tablespace, so must look the hard way */
! 	pg_tablespace = heap_open(TableSpaceRelationId, AccessShareLock);
! 	ScanKeyInit(&entry[0],
! 				ObjectIdAttributeNumber,
! 				BTEqualStrategyNumber, F_OIDEQ,
! 				ObjectIdGetDatum(spc_oid));
! 	scan = systable_beginscan(pg_tablespace, TablespaceOidIndexId, true,
! 							  SnapshotNow, 1, entry);
! 
! 	spctuple = systable_getnext(scan);
! 
  	if (!HeapTupleIsValid(spctuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
--- 4326,4334 ----
  	if (superuser_arg(roleid))
  		return true;
  
! 	/* Search syscache for pg_tablespace */
! 	spctuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(spc_oid),
! 							  0, 0, 0);
  	if (!HeapTupleIsValid(spctuple))
  		ereport(ERROR,
  				(errcode(ERRCODE_UNDEFINED_OBJECT),
***************
*** 4366,4373 **** pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
  
  	spcowner = ((Form_pg_tablespace) GETSTRUCT(spctuple))->spcowner;
  
! 	systable_endscan(scan);
! 	heap_close(pg_tablespace, AccessShareLock);
  
  	return has_privs_of_role(roleid, spcowner);
  }
--- 4336,4342 ----
  
  	spcowner = ((Form_pg_tablespace) GETSTRUCT(spctuple))->spcowner;
  
! 	ReleaseSysCache(spctuple);
  
  	return has_privs_of_role(roleid, spcowner);
  }
*** a/src/backend/commands/tablespace.c
--- b/src/backend/commands/tablespace.c
***************
*** 49,54 ****
--- 49,55 ----
  #include <sys/stat.h>
  
  #include "access/heapam.h"
+ #include "access/reloptions.h"
  #include "access/sysattr.h"
  #include "access/transam.h"
  #include "access/xact.h"
***************
*** 57,62 ****
--- 58,64 ----
  #include "catalog/indexing.h"
  #include "catalog/pg_tablespace.h"
  #include "commands/comment.h"
+ #include "commands/defrem.h"
  #include "commands/tablespace.h"
  #include "miscadmin.h"
  #include "postmaster/bgwriter.h"
***************
*** 70,75 ****
--- 72,78 ----
  #include "utils/lsyscache.h"
  #include "utils/memutils.h"
  #include "utils/rel.h"
+ #include "utils/syscache.h"
  #include "utils/tqual.h"
  
  
***************
*** 290,295 **** CreateTableSpace(CreateTableSpaceStmt *stmt)
--- 293,299 ----
  	values[Anum_pg_tablespace_spclocation - 1] =
  		CStringGetTextDatum(location);
  	nulls[Anum_pg_tablespace_spcacl - 1] = true;
+ 	nulls[Anum_pg_tablespace_spcoptions - 1] = true;
  
  	tuple = heap_form_tuple(rel->rd_att, values, nulls);
  
***************
*** 913,918 **** AlterTableSpaceOwner(const char *name, Oid newOwnerId)
--- 917,989 ----
  
  
  /*
+  * Alter table space options
+  */
+ void
+ AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt)
+ {
+ 	Relation	rel;
+ 	ScanKeyData entry[1];
+ 	HeapScanDesc scandesc;
+ 	HeapTuple	tup;
+ 	Datum		datum;
+ 	Datum		newOptions;
+ 	Datum		repl_val[Natts_pg_tablespace];
+ 	bool		isnull;
+ 	bool		repl_null[Natts_pg_tablespace];
+ 	bool		repl_repl[Natts_pg_tablespace];
+ 	HeapTuple	newtuple;
+ 
+ 	/* Search pg_tablespace */
+ 	rel = heap_open(TableSpaceRelationId, RowExclusiveLock);
+ 
+ 	ScanKeyInit(&entry[0],
+ 				Anum_pg_tablespace_spcname,
+ 				BTEqualStrategyNumber, F_NAMEEQ,
+ 				CStringGetDatum(stmt->tablespacename));
+ 	scandesc = heap_beginscan(rel, SnapshotNow, 1, entry);
+ 	tup = heap_getnext(scandesc, ForwardScanDirection);
+ 	if (!HeapTupleIsValid(tup))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_UNDEFINED_OBJECT),
+ 				 errmsg("tablespace \"%s\" does not exist",
+ 					stmt->tablespacename)));
+ 
+ 	/* Must be owner of the existing object */
+ 	if (!pg_tablespace_ownercheck(HeapTupleGetOid(tup), GetUserId()))
+ 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_TABLESPACE,
+ 					   stmt->tablespacename);
+ 
+ 	/* Generate new proposed spcoptions (text array) */
+ 	datum = heap_getattr(tup, Anum_pg_tablespace_spcoptions,
+ 						 RelationGetDescr(rel), &isnull);
+ 	newOptions = transformRelOptions(isnull ? (Datum) 0 : datum,
+ 									 stmt->options, NULL, NULL, false,
+ 									 stmt->isReset);
+ 	(void) tablespace_reloptions(newOptions, true);
+ 
+ 	/* Build new tuple. */
+ 	memset(repl_null, false, sizeof(repl_null));
+ 	memset(repl_repl, false, sizeof(repl_repl));
+ 	if (newOptions != (Datum) 0)
+ 		repl_val[Anum_pg_tablespace_spcoptions - 1] = newOptions;
+ 	else
+ 		repl_null[Anum_pg_tablespace_spcoptions - 1] = true;
+ 	repl_repl[Anum_pg_tablespace_spcoptions - 1] = true;
+ 	newtuple = heap_modify_tuple(tup, RelationGetDescr(rel), repl_val,
+ 								 repl_null, repl_repl);
+ 
+ 	/* Update system catalog. */
+ 	simple_heap_update(rel, &newtuple->t_self, newtuple);
+ 	CatalogUpdateIndexes(rel, newtuple);
+ 	heap_freetuple(newtuple);
+ 
+ 	/* Conclude heap scan. */
+ 	heap_endscan(scandesc);
+ 	heap_close(rel, NoLock);
+ }
+ 
+ /*
   * Routines for handling the GUC variable 'default_tablespace'.
   */
  
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 3064,3069 **** _copyDropTableSpaceStmt(DropTableSpaceStmt *from)
--- 3064,3081 ----
  	return newnode;
  }
  
+ static AlterTableSpaceOptionsStmt *
+ _copyAlterTableSpaceOptionsStmt(AlterTableSpaceOptionsStmt *from)
+ {
+ 	AlterTableSpaceOptionsStmt *newnode = makeNode(AlterTableSpaceOptionsStmt);
+ 
+ 	COPY_STRING_FIELD(tablespacename);
+ 	COPY_NODE_FIELD(options);
+ 	COPY_SCALAR_FIELD(isReset);
+ 
+ 	return newnode;
+ }
+ 
  static CreateFdwStmt *
  _copyCreateFdwStmt(CreateFdwStmt *from)
  {
***************
*** 4028,4033 **** copyObject(void *from)
--- 4040,4048 ----
  		case T_DropTableSpaceStmt:
  			retval = _copyDropTableSpaceStmt(from);
  			break;
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			retval = _copyAlterTableSpaceOptionsStmt(from);
+ 			break;
  		case T_CreateFdwStmt:
  			retval = _copyCreateFdwStmt(from);
  			break;
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 1569,1574 **** _equalDropTableSpaceStmt(DropTableSpaceStmt *a, DropTableSpaceStmt *b)
--- 1569,1585 ----
  }
  
  static bool
+ _equalAlterTableSpaceOptionsStmt(AlterTableSpaceOptionsStmt *a,
+ 											 AlterTableSpaceOptionsStmt *b)
+ {
+ 	COMPARE_STRING_FIELD(tablespacename);
+ 	COMPARE_NODE_FIELD(options);
+ 	COMPARE_SCALAR_FIELD(isReset);
+ 
+ 	return true;
+ }
+ 
+ static bool
  _equalCreateFdwStmt(CreateFdwStmt *a, CreateFdwStmt *b)
  {
  	COMPARE_STRING_FIELD(fdwname);
***************
*** 2720,2725 **** equal(void *a, void *b)
--- 2731,2739 ----
  		case T_DropTableSpaceStmt:
  			retval = _equalDropTableSpaceStmt(a, b);
  			break;
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			retval = _equalAlterTableSpaceOptionsStmt(a, b);
+ 			break;
  		case T_CreateFdwStmt:
  			retval = _equalCreateFdwStmt(a, b);
  			break;
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 1590,1595 **** _outRelOptInfo(StringInfo str, RelOptInfo *node)
--- 1590,1596 ----
  	WRITE_NODE_FIELD(cheapest_total_path);
  	WRITE_NODE_FIELD(cheapest_unique_path);
  	WRITE_UINT_FIELD(relid);
+ 	WRITE_UINT_FIELD(reltablespace);
  	WRITE_ENUM_FIELD(rtekind, RTEKind);
  	WRITE_INT_FIELD(min_attr);
  	WRITE_INT_FIELD(max_attr);
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
***************
*** 27,32 ****
--- 27,37 ----
   * detail.	Note that all of these parameters are user-settable, in case
   * the default values are drastically off for a particular platform.
   *
+  * seq_page_cost and random_page_cost can also be overridden for an individual
+  * tablespace, in case some data is on a fast disk and other data is on a slow
+  * disk.  Per-tablespace overrides never apply to temporary work files such as
+  * an external sort or a materialize node that overflows work_mem.
+  *
   * We compute two separate costs for each path:
   *		total_cost: total estimated cost to fetch all tuples
   *		startup_cost: cost that is expended before first tuple is fetched
***************
*** 76,81 ****
--- 81,87 ----
  #include "parser/parsetree.h"
  #include "utils/lsyscache.h"
  #include "utils/selfuncs.h"
+ #include "utils/spccache.h"
  #include "utils/tuplesort.h"
  
  
***************
*** 164,169 **** void
--- 170,176 ----
  cost_seqscan(Path *path, PlannerInfo *root,
  			 RelOptInfo *baserel)
  {
+ 	double		spc_seq_page_cost;
  	Cost		startup_cost = 0;
  	Cost		run_cost = 0;
  	Cost		cpu_per_tuple;
***************
*** 175,184 **** cost_seqscan(Path *path, PlannerInfo *root,
  	if (!enable_seqscan)
  		startup_cost += disable_cost;
  
  	/*
  	 * disk costs
  	 */
! 	run_cost += seq_page_cost * baserel->pages;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup;
--- 182,196 ----
  	if (!enable_seqscan)
  		startup_cost += disable_cost;
  
+ 	/* fetch estimated page cost for tablespace containing table */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  NULL,
+ 							  &spc_seq_page_cost);
+ 
  	/*
  	 * disk costs
  	 */
! 	run_cost += spc_seq_page_cost * baserel->pages;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup;
***************
*** 226,231 **** cost_index(IndexPath *path, PlannerInfo *root,
--- 238,245 ----
  	Selectivity indexSelectivity;
  	double		indexCorrelation,
  				csquared;
+ 	double		spc_seq_page_cost,
+ 				spc_random_page_cost;
  	Cost		min_IO_cost,
  				max_IO_cost;
  	Cost		cpu_per_tuple;
***************
*** 272,284 **** cost_index(IndexPath *path, PlannerInfo *root,
  	/* estimate number of main-table tuples fetched */
  	tuples_fetched = clamp_row_est(indexSelectivity * baserel->tuples);
  
  	/*----------
  	 * Estimate number of main-table pages fetched, and compute I/O cost.
  	 *
  	 * When the index ordering is uncorrelated with the table ordering,
  	 * we use an approximation proposed by Mackert and Lohman (see
  	 * index_pages_fetched() for details) to compute the number of pages
! 	 * fetched, and then charge random_page_cost per page fetched.
  	 *
  	 * When the index ordering is exactly correlated with the table ordering
  	 * (just after a CLUSTER, for example), the number of pages fetched should
--- 286,303 ----
  	/* estimate number of main-table tuples fetched */
  	tuples_fetched = clamp_row_est(indexSelectivity * baserel->tuples);
  
+ 	/* fetch estimated page costs for tablespace containing table */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  &spc_seq_page_cost);
+ 
  	/*----------
  	 * Estimate number of main-table pages fetched, and compute I/O cost.
  	 *
  	 * When the index ordering is uncorrelated with the table ordering,
  	 * we use an approximation proposed by Mackert and Lohman (see
  	 * index_pages_fetched() for details) to compute the number of pages
! 	 * fetched, and then charge spc_random_page_cost per page fetched.
  	 *
  	 * When the index ordering is exactly correlated with the table ordering
  	 * (just after a CLUSTER, for example), the number of pages fetched should
***************
*** 286,292 **** cost_index(IndexPath *path, PlannerInfo *root,
  	 * will be sequential fetches, not the random fetches that occur in the
  	 * uncorrelated case.  So if the number of pages is more than 1, we
  	 * ought to charge
! 	 *		random_page_cost + (pages_fetched - 1) * seq_page_cost
  	 * For partially-correlated indexes, we ought to charge somewhere between
  	 * these two estimates.  We currently interpolate linearly between the
  	 * estimates based on the correlation squared (XXX is that appropriate?).
--- 305,311 ----
  	 * will be sequential fetches, not the random fetches that occur in the
  	 * uncorrelated case.  So if the number of pages is more than 1, we
  	 * ought to charge
! 	 *		spc_random_page_cost + (pages_fetched - 1) * spc_seq_page_cost
  	 * For partially-correlated indexes, we ought to charge somewhere between
  	 * these two estimates.  We currently interpolate linearly between the
  	 * estimates based on the correlation squared (XXX is that appropriate?).
***************
*** 309,315 **** cost_index(IndexPath *path, PlannerInfo *root,
  											(double) index->pages,
  											root);
  
! 		max_IO_cost = (pages_fetched * random_page_cost) / num_scans;
  
  		/*
  		 * In the perfectly correlated case, the number of pages touched by
--- 328,334 ----
  											(double) index->pages,
  											root);
  
! 		max_IO_cost = (pages_fetched * spc_random_page_cost) / num_scans;
  
  		/*
  		 * In the perfectly correlated case, the number of pages touched by
***************
*** 328,334 **** cost_index(IndexPath *path, PlannerInfo *root,
  											(double) index->pages,
  											root);
  
! 		min_IO_cost = (pages_fetched * random_page_cost) / num_scans;
  	}
  	else
  	{
--- 347,353 ----
  											(double) index->pages,
  											root);
  
! 		min_IO_cost = (pages_fetched * spc_random_page_cost) / num_scans;
  	}
  	else
  	{
***************
*** 342,354 **** cost_index(IndexPath *path, PlannerInfo *root,
  											root);
  
  		/* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
! 		max_IO_cost = pages_fetched * random_page_cost;
  
  		/* min_IO_cost is for the perfectly correlated case (csquared=1) */
  		pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
! 		min_IO_cost = random_page_cost;
  		if (pages_fetched > 1)
! 			min_IO_cost += (pages_fetched - 1) * seq_page_cost;
  	}
  
  	/*
--- 361,373 ----
  											root);
  
  		/* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
! 		max_IO_cost = pages_fetched * spc_random_page_cost;
  
  		/* min_IO_cost is for the perfectly correlated case (csquared=1) */
  		pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
! 		min_IO_cost = spc_random_page_cost;
  		if (pages_fetched > 1)
! 			min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;
  	}
  
  	/*
***************
*** 553,558 **** cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
--- 572,579 ----
  	Cost		cost_per_page;
  	double		tuples_fetched;
  	double		pages_fetched;
+ 	double		spc_seq_page_cost,
+ 				spc_random_page_cost;
  	double		T;
  
  	/* Should only be applied to base relations */
***************
*** 571,576 **** cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
--- 592,602 ----
  
  	startup_cost += indexTotalCost;
  
+ 	/* Fetch estimated page costs for tablespace containing table. */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  &spc_seq_page_cost);
+ 
  	/*
  	 * Estimate number of main-table pages fetched.
  	 */
***************
*** 609,625 **** cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
  		pages_fetched = ceil(pages_fetched);
  
  	/*
! 	 * For small numbers of pages we should charge random_page_cost apiece,
  	 * while if nearly all the table's pages are being read, it's more
! 	 * appropriate to charge seq_page_cost apiece.	The effect is nonlinear,
  	 * too. For lack of a better idea, interpolate like this to determine the
  	 * cost per page.
  	 */
  	if (pages_fetched >= 2.0)
! 		cost_per_page = random_page_cost -
! 			(random_page_cost - seq_page_cost) * sqrt(pages_fetched / T);
  	else
! 		cost_per_page = random_page_cost;
  
  	run_cost += pages_fetched * cost_per_page;
  
--- 635,652 ----
  		pages_fetched = ceil(pages_fetched);
  
  	/*
! 	 * For small numbers of pages we should charge spc_random_page_cost apiece,
  	 * while if nearly all the table's pages are being read, it's more
! 	 * appropriate to charge spc_seq_page_cost apiece.	The effect is nonlinear,
  	 * too. For lack of a better idea, interpolate like this to determine the
  	 * cost per page.
  	 */
  	if (pages_fetched >= 2.0)
! 		cost_per_page = spc_random_page_cost -
! 			(spc_random_page_cost - spc_seq_page_cost)
! 			* sqrt(pages_fetched / T);
  	else
! 		cost_per_page = spc_random_page_cost;
  
  	run_cost += pages_fetched * cost_per_page;
  
***************
*** 783,788 **** cost_tidscan(Path *path, PlannerInfo *root,
--- 810,816 ----
  	QualCost	tid_qual_cost;
  	int			ntuples;
  	ListCell   *l;
+ 	double		spc_random_page_cost;
  
  	/* Should only be applied to base relations */
  	Assert(baserel->relid > 0);
***************
*** 835,842 **** cost_tidscan(Path *path, PlannerInfo *root,
  	 */
  	cost_qual_eval(&tid_qual_cost, tidquals, root);
  
  	/* disk costs --- assume each tuple on a different page */
! 	run_cost += random_page_cost * ntuples;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup +
--- 863,875 ----
  	 */
  	cost_qual_eval(&tid_qual_cost, tidquals, root);
  
+ 	/* fetch estimated page cost for tablespace containing table */
+ 	get_tablespace_page_costs(baserel->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  NULL);
+ 
  	/* disk costs --- assume each tuple on a different page */
! 	run_cost += spc_random_page_cost * ntuples;
  
  	/* CPU costs */
  	startup_cost += baserel->baserestrictcost.startup +
*** a/src/backend/optimizer/util/plancat.c
--- b/src/backend/optimizer/util/plancat.c
***************
*** 91,96 **** get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
--- 91,97 ----
  
  	rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1;
  	rel->max_attr = RelationGetNumberOfAttributes(relation);
+ 	rel->reltablespace = RelationGetForm(relation)->reltablespace;
  
  	Assert(rel->max_attr >= rel->min_attr);
  	rel->attr_needed = (Relids *)
***************
*** 183,188 **** get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
--- 184,191 ----
  			info = makeNode(IndexOptInfo);
  
  			info->indexoid = index->indexrelid;
+ 			info->reltablespace =
+ 				RelationGetForm(indexRelation)->reltablespace;
  			info->rel = rel;
  			info->ncolumns = ncolumns = index->indnatts;
  
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 5687,5692 **** RenameStmt: ALTER AGGREGATE func_name aggr_args RENAME TO name
--- 5687,5711 ----
  					n->newname = $6;
  					$$ = (Node *)n;
  				}
+ 			| ALTER TABLESPACE name SET reloptions
+ 				{
+ 					AlterTableSpaceOptionsStmt *n =
+ 						makeNode(AlterTableSpaceOptionsStmt);
+ 					n->tablespacename = $3;
+ 					n->options = $5;
+ 					n->isReset = FALSE;
+ 					$$ = (Node *)n;
+ 				}
+ 			| ALTER TABLESPACE name RESET reloptions
+ 				{
+ 					AlterTableSpaceOptionsStmt *n =
+ 						makeNode(AlterTableSpaceOptionsStmt);
+ 					ListCell *lc;
+ 					n->tablespacename = $3;
+ 					n->options = $5;
+ 					n->isReset = TRUE;
+ 					$$ = (Node *)n;
+ 				}
  			| ALTER TEXT_P SEARCH PARSER any_name RENAME TO name
  				{
  					RenameStmt *n = makeNode(RenameStmt);
*** a/src/backend/tcop/utility.c
--- b/src/backend/tcop/utility.c
***************
*** 218,223 **** check_xact_readonly(Node *parsetree)
--- 218,224 ----
  		case T_CreateUserMappingStmt:
  		case T_AlterUserMappingStmt:
  		case T_DropUserMappingStmt:
+ 		case T_AlterTableSpaceOptionsStmt:
  			ereport(ERROR,
  					(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
  					 errmsg("transaction is read-only")));
***************
*** 528,533 **** standard_ProcessUtility(Node *parsetree,
--- 529,538 ----
  			DropTableSpace((DropTableSpaceStmt *) parsetree);
  			break;
  
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			AlterTableSpaceOptions((AlterTableSpaceOptionsStmt *) parsetree);
+ 			break;
+ 
  		case T_CreateFdwStmt:
  			CreateForeignDataWrapper((CreateFdwStmt *) parsetree);
  			break;
***************
*** 1456,1461 **** CreateCommandTag(Node *parsetree)
--- 1461,1470 ----
  			tag = "DROP TABLESPACE";
  			break;
  
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			tag = "ALTER TABLESPACE";
+ 			break;
+ 
  		case T_CreateFdwStmt:
  			tag = "CREATE FOREIGN DATA WRAPPER";
  			break;
***************
*** 2238,2243 **** GetCommandLogLevel(Node *parsetree)
--- 2247,2256 ----
  			lev = LOGSTMT_DDL;
  			break;
  
+ 		case T_AlterTableSpaceOptionsStmt:
+ 			lev = LOGSTMT_DDL;
+ 			break;
+ 
  		case T_CreateFdwStmt:
  		case T_AlterFdwStmt:
  		case T_DropFdwStmt:
*** a/src/backend/utils/adt/selfuncs.c
--- b/src/backend/utils/adt/selfuncs.c
***************
*** 119,124 ****
--- 119,125 ----
  #include "utils/nabstime.h"
  #include "utils/pg_locale.h"
  #include "utils/selfuncs.h"
+ #include "utils/spccache.h"
  #include "utils/syscache.h"
  #include "utils/tqual.h"
  
***************
*** 5648,5653 **** genericcostestimate(PlannerInfo *root,
--- 5649,5655 ----
  	QualCost	index_qual_cost;
  	double		qual_op_cost;
  	double		qual_arg_cost;
+ 	double		spc_random_page_cost;
  	List	   *selectivityQuals;
  	ListCell   *l;
  
***************
*** 5756,5761 **** genericcostestimate(PlannerInfo *root,
--- 5758,5768 ----
  	else
  		numIndexPages = 1.0;
  
+ 	/* fetch estimated page cost for tablespace containing index */
+ 	get_tablespace_page_costs(index->reltablespace,
+ 							  &spc_random_page_cost,
+ 							  NULL);
+ 
  	/*
  	 * Now compute the disk access costs.
  	 *
***************
*** 5802,5816 **** genericcostestimate(PlannerInfo *root,
  		 * share for each outer scan.  (Don't pro-rate for ScalarArrayOpExpr,
  		 * since that's internal to the indexscan.)
  		 */
! 		*indexTotalCost = (pages_fetched * random_page_cost) / num_outer_scans;
  	}
  	else
  	{
  		/*
! 		 * For a single index scan, we just charge random_page_cost per page
! 		 * touched.
  		 */
! 		*indexTotalCost = numIndexPages * random_page_cost;
  	}
  
  	/*
--- 5809,5824 ----
  		 * share for each outer scan.  (Don't pro-rate for ScalarArrayOpExpr,
  		 * since that's internal to the indexscan.)
  		 */
! 		*indexTotalCost = (pages_fetched * spc_random_page_cost)
! 							/ num_outer_scans;
  	}
  	else
  	{
  		/*
! 		 * For a single index scan, we just charge spc_random_page_cost per
! 		 * page touched.
  		 */
! 		*indexTotalCost = numIndexPages * spc_random_page_cost;
  	}
  
  	/*
***************
*** 5825,5835 **** genericcostestimate(PlannerInfo *root,
  	 *
  	 * We can deal with this by adding a very small "fudge factor" that
  	 * depends on the index size.  The fudge factor used here is one
! 	 * random_page_cost per 100000 index pages, which should be small enough
! 	 * to not alter index-vs-seqscan decisions, but will prevent indexes of
! 	 * different sizes from looking exactly equally attractive.
  	 */
! 	*indexTotalCost += index->pages * random_page_cost / 100000.0;
  
  	/*
  	 * CPU cost: any complex expressions in the indexquals will need to be
--- 5833,5843 ----
  	 *
  	 * We can deal with this by adding a very small "fudge factor" that
  	 * depends on the index size.  The fudge factor used here is one
! 	 * spc_random_page_cost per 100000 index pages, which should be small
! 	 * enough to not alter index-vs-seqscan decisions, but will prevent
! 	 * indexes of different sizes from looking exactly equally attractive.
  	 */
! 	*indexTotalCost += index->pages * spc_random_page_cost / 100000.0;
  
  	/*
  	 * CPU cost: any complex expressions in the indexquals will need to be
*** a/src/backend/utils/cache/Makefile
--- b/src/backend/utils/cache/Makefile
***************
*** 13,18 **** top_builddir = ../../../..
  include $(top_builddir)/src/Makefile.global
  
  OBJS = catcache.o inval.o plancache.o relcache.o \
! 	syscache.o lsyscache.o typcache.o ts_cache.o
  
  include $(top_srcdir)/src/backend/common.mk
--- 13,18 ----
  include $(top_builddir)/src/Makefile.global
  
  OBJS = catcache.o inval.o plancache.o relcache.o \
! 	spccache.o syscache.o lsyscache.o typcache.o ts_cache.o
  
  include $(top_srcdir)/src/backend/common.mk
*** a/src/backend/utils/cache/lsyscache.c
--- b/src/backend/utils/cache/lsyscache.c
***************
*** 17,22 ****
--- 17,23 ----
  
  #include "access/hash.h"
  #include "access/nbtree.h"
+ #include "access/reloptions.h"
  #include "bootstrap/bootstrap.h"
  #include "catalog/pg_amop.h"
  #include "catalog/pg_amproc.h"
***************
*** 26,34 ****
--- 27,38 ----
  #include "catalog/pg_operator.h"
  #include "catalog/pg_proc.h"
  #include "catalog/pg_statistic.h"
+ #include "catalog/pg_tablespace.h"
  #include "catalog/pg_type.h"
+ #include "commands/tablespace.h"
  #include "miscadmin.h"
  #include "nodes/makefuncs.h"
+ #include "optimizer/cost.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
  #include "utils/datum.h"
*** /dev/null
--- b/src/backend/utils/cache/spccache.c
***************
*** 0 ****
--- 1,183 ----
+ /*-------------------------------------------------------------------------
+  *
+  * spccache.c
+  *	  Tablespace cache management.
+  *
+  * We cache the parsed version of spcoptions for each tablespace to avoid
+  * needing to reparse on every lookup.  Right now, there doesn't appear to
+  * be a measurable performance gain from doing this, but that might change
+  * in the future as we add more options.
+  *
+  * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * IDENTIFICATION
+  *	  $PostgreSQL$
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "postgres.h"
+ #include "access/reloptions.h"
+ #include "catalog/pg_tablespace.h"
+ #include "commands/tablespace.h"
+ #include "miscadmin.h"
+ #include "optimizer/cost.h"
+ #include "utils/catcache.h"
+ #include "utils/hsearch.h"
+ #include "utils/inval.h"
+ #include "utils/spccache.h"
+ #include "utils/syscache.h"
+ 
+ static HTAB *TableSpaceCacheHash = NULL;
+ 
+ typedef struct {
+ 	Oid			oid;
+ 	TableSpaceOpts *opts;
+ } TableSpace;
+ 
+ /*
+  * InvalidateTableSpaceCacheCallback
+  *		Flush all cache entries when pg_tablespace is updated.
+  *
+  * When pg_tablespace is updated, we must flush the cache entry at least
+  * for that tablespace.  Currently, we just flush them all.  This is quick
+  * and easy and doesn't cost much, since there shouldn't be terribly many
+  * tablespaces, nor do we expect them to be frequently modified.
+  */
+ static void
+ InvalidateTableSpaceCacheCallback(Datum arg, int cacheid, ItemPointer tuplePtr)
+ {
+ 	HASH_SEQ_STATUS status;
+ 	TableSpace *spc;
+ 
+ 	hash_seq_init(&status, TableSpaceCacheHash);
+ 	while ((spc = (TableSpace *) hash_seq_search(&status)) != NULL)
+ 	{
+ 		if (spc->opts)
+ 			pfree(spc->opts);
+ 		if (hash_search(TableSpaceCacheHash, (void *) &spc->oid, HASH_REMOVE,
+ 						NULL) == NULL)
+ 			elog(ERROR, "hash table corrupted");
+ 	}
+ }
+ 
+ /*
+  * InitializeTableSpaceCache
+  *		Initialize the tablespace cache.
+  */
+ static void
+ InitializeTableSpaceCache(void)
+ {
+ 	HASHCTL ctl;
+ 
+ 	/* Initialize the hash table. */
+ 	MemSet(&ctl, 0, sizeof(ctl));
+ 	ctl.keysize = sizeof(Oid);
+ 	ctl.entrysize = sizeof(TableSpace);
+ 	ctl.hash = tag_hash;
+ 	TableSpaceCacheHash =
+ 		hash_create("TableSpace cache", 16, &ctl,
+ 				    HASH_ELEM | HASH_FUNCTION);
+ 
+ 	/* Make sure we've initialized CacheMemoryContext. */
+ 	if (!CacheMemoryContext)
+ 		CreateCacheMemoryContext();
+ 
+ 	/* Watch for invalidation events. */
+ 	CacheRegisterSyscacheCallback(TABLESPACEOID,
+ 								  InvalidateTableSpaceCacheCallback,
+ 								  (Datum) 0);
+ }
+ 
+ /*
+  * get_tablespace
+  *		Fetch TableSpace structure for a specified tablespace OID.
+  *
+  * Pointers returned by this function should not be stored, since a cache
+  * flush will invalidate them.
+  */
+ static TableSpace *
+ get_tablespace(Oid spcid)
+ {
+ 	HeapTuple	tp;
+ 	TableSpace *spc;
+ 	bool		found;
+ 
+ 	/*
+ 	 * Since spcid is always from a pg_class tuple, InvalidOid implies the
+ 	 * default.
+ 	 */
+ 	if (spcid == InvalidOid)
+ 		spcid = MyDatabaseTableSpace;
+ 
+ 	/* Find existing cache entry, or create a new one. */
+ 	if (!TableSpaceCacheHash)
+ 		InitializeTableSpaceCache();
+ 	spc = (TableSpace *) hash_search(TableSpaceCacheHash, (void *) &spcid,
+ 									 HASH_ENTER, &found);
+ 	if (found)
+ 		return spc;
+ 
+ 	/*
+ 	 * Not found in TableSpace cache.  Check catcache.  If we don't find a
+ 	 * valid HeapTuple, it must mean someone has managed to request tablespace
+ 	 * details for a non-existent tablespace.  We'll just treat that case as if
+ 	 * no options were specified.
+ 	 */
+ 	tp = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(spcid), 0, 0, 0);
+ 	if (!HeapTupleIsValid(tp))
+ 		spc->opts = NULL;
+ 	else
+ 	{
+ 		Datum	datum;
+ 		bool	isNull;
+ 		MemoryContext octx;
+ 
+ 		datum = SysCacheGetAttr(TABLESPACEOID,
+ 								tp,
+ 								Anum_pg_tablespace_spcoptions,
+ 								&isNull);
+ 		if (isNull)
+ 			spc->opts = NULL;
+ 		else
+ 		{
+ 			octx = MemoryContextSwitchTo(CacheMemoryContext);
+ 			spc->opts = (TableSpaceOpts *) tablespace_reloptions(datum, false);
+ 			MemoryContextSwitchTo(octx);
+ 		}
+ 		ReleaseSysCache(tp);
+ 	}
+ 
+ 	/* Update new TableSpace cache entry with results of option parsing. */
+ 	return spc;
+ }
+ 
+ /*
+  * get_tablespace_page_costs
+  *		Return random and sequential page costs for a given tablespace.
+  */
+ void
+ get_tablespace_page_costs(Oid spcid, double *spc_random_page_cost,
+ 							   double *spc_seq_page_cost)
+ {
+ 	TableSpace *spc = get_tablespace(spcid);
+ 
+ 	Assert(spc != NULL);
+ 
+ 	if (spc_random_page_cost)
+ 	{
+ 		if (!spc->opts || spc->opts->random_page_cost < 0)
+ 			*spc_random_page_cost = random_page_cost;
+ 		else
+ 			*spc_random_page_cost = spc->opts->random_page_cost;
+ 	}
+ 
+ 	if (spc_seq_page_cost)
+ 	{
+ 		if (!spc->opts || spc->opts->seq_page_cost < 0)
+ 			*spc_seq_page_cost = seq_page_cost;
+ 		else
+ 			*spc_seq_page_cost = spc->opts->seq_page_cost;
+ 	}
+ }
*** a/src/backend/utils/cache/syscache.c
--- b/src/backend/utils/cache/syscache.c
***************
*** 43,48 ****
--- 43,49 ----
  #include "catalog/pg_proc.h"
  #include "catalog/pg_rewrite.h"
  #include "catalog/pg_statistic.h"
+ #include "catalog/pg_tablespace.h"
  #include "catalog/pg_ts_config.h"
  #include "catalog/pg_ts_config_map.h"
  #include "catalog/pg_ts_dict.h"
***************
*** 609,614 **** static const struct cachedesc cacheinfo[] = {
--- 610,627 ----
  		},
  		1024
  	},
+ 	{TableSpaceRelationId,		/* TABLESPACEOID */
+ 		TablespaceOidIndexId,
+ 		0,
+ 		1,
+ 		{
+ 			ObjectIdAttributeNumber,
+ 			0,
+ 			0,
+ 			0,
+ 		},
+ 		16
+ 	},
  	{TSConfigMapRelationId,		/* TSCONFIGMAP */
  		TSConfigMapIndexId,
  		0,
*** a/src/bin/pg_dump/pg_dumpall.c
--- b/src/bin/pg_dump/pg_dumpall.c
***************
*** 956,974 **** dumpTablespaces(PGconn *conn)
  	 * Get all tablespaces except built-in ones (which we assume are named
  	 * pg_xxx)
  	 */
! 	if (server_version >= 80200)
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
  						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
  	else
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
! 						   "null "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
--- 956,983 ----
  	 * Get all tablespaces except built-in ones (which we assume are named
  	 * pg_xxx)
  	 */
! 	if (server_version >= 80500)
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
+ 						   "array_to_string(spcoptions, ', '),"
  						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
+ 	else if (server_version >= 80200)
+ 		res = executeQuery(conn, "SELECT spcname, "
+ 						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
+ 						   "spclocation, spcacl, null, "
+ 						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
+ 						   "FROM pg_catalog.pg_tablespace "
+ 						   "WHERE spcname !~ '^pg_' "
+ 						   "ORDER BY 1");
  	else
  		res = executeQuery(conn, "SELECT spcname, "
  						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
  						   "spclocation, spcacl, "
! 						   "null, null "
  						   "FROM pg_catalog.pg_tablespace "
  						   "WHERE spcname !~ '^pg_' "
  						   "ORDER BY 1");
***************
*** 983,989 **** dumpTablespaces(PGconn *conn)
  		char	   *spcowner = PQgetvalue(res, i, 1);
  		char	   *spclocation = PQgetvalue(res, i, 2);
  		char	   *spcacl = PQgetvalue(res, i, 3);
! 		char	   *spccomment = PQgetvalue(res, i, 4);
  		char	   *fspcname;
  
  		/* needed for buildACLCommands() */
--- 992,999 ----
  		char	   *spcowner = PQgetvalue(res, i, 1);
  		char	   *spclocation = PQgetvalue(res, i, 2);
  		char	   *spcacl = PQgetvalue(res, i, 3);
! 		char	   *spcoptions = PQgetvalue(res, i, 4);
! 		char	   *spccomment = PQgetvalue(res, i, 5);
  		char	   *fspcname;
  
  		/* needed for buildACLCommands() */
***************
*** 996,1001 **** dumpTablespaces(PGconn *conn)
--- 1006,1015 ----
  		appendStringLiteralConn(buf, spclocation, conn);
  		appendPQExpBuffer(buf, ";\n");
  
+ 		if (spcoptions && spcoptions[0] != '\0')
+ 			appendPQExpBuffer(buf, "ALTER TABLESPACE %s SET (%s);\n",
+ 							  fspcname, spcoptions);
+ 
  		if (!skip_acls &&
  			!buildACLCommands(fspcname, NULL, "TABLESPACE", spcacl, spcowner,
  							  "", server_version, buf))
*** a/src/include/access/reloptions.h
--- b/src/include/access/reloptions.h
***************
*** 1,7 ****
  /*-------------------------------------------------------------------------
   *
   * reloptions.h
!  *	  Core support for relation options (pg_class.reloptions)
   *
   * Note: the functions dealing with text-array reloptions values declare
   * them as Datum, not ArrayType *, to avoid needing to include array.h
--- 1,8 ----
  /*-------------------------------------------------------------------------
   *
   * reloptions.h
!  *	  Core support for relation and tablespace options (pg_class.reloptions
!  *	  and pg_tablespace.spcoptions)
   *
   * Note: the functions dealing with text-array reloptions values declare
   * them as Datum, not ArrayType *, to avoid needing to include array.h
***************
*** 39,46 **** typedef enum relopt_kind
  	RELOPT_KIND_HASH = (1 << 3),
  	RELOPT_KIND_GIN = (1 << 4),
  	RELOPT_KIND_GIST = (1 << 5),
  	/* if you add a new kind, make sure you update "last_default" too */
! 	RELOPT_KIND_LAST_DEFAULT = RELOPT_KIND_GIST,
  	/* some compilers treat enums as signed ints, so we can't use 1 << 31 */
  	RELOPT_KIND_MAX = (1 << 30)
  } relopt_kind;
--- 40,48 ----
  	RELOPT_KIND_HASH = (1 << 3),
  	RELOPT_KIND_GIN = (1 << 4),
  	RELOPT_KIND_GIST = (1 << 5),
+ 	RELOPT_KIND_TABLESPACE = (1 << 6),
  	/* if you add a new kind, make sure you update "last_default" too */
! 	RELOPT_KIND_LAST_DEFAULT = RELOPT_KIND_TABLESPACE,
  	/* some compilers treat enums as signed ints, so we can't use 1 << 31 */
  	RELOPT_KIND_MAX = (1 << 30)
  } relopt_kind;
***************
*** 264,268 **** extern bytea *default_reloptions(Datum reloptions, bool validate,
--- 266,271 ----
  extern bytea *heap_reloptions(char relkind, Datum reloptions, bool validate);
  extern bytea *index_reloptions(RegProcedure amoptions, Datum reloptions,
  				 bool validate);
+ extern bytea *tablespace_reloptions(Datum reloptions, bool validate);
  
  #endif   /* RELOPTIONS_H */
*** a/src/include/catalog/pg_tablespace.h
--- b/src/include/catalog/pg_tablespace.h
***************
*** 34,39 **** CATALOG(pg_tablespace,1213) BKI_SHARED_RELATION
--- 34,40 ----
  	Oid			spcowner;		/* owner of tablespace */
  	text		spclocation;	/* physical location (VAR LENGTH) */
  	aclitem		spcacl[1];		/* access permissions (VAR LENGTH) */
+ 	text		spcoptions[1];	/* per-tablespace options */
  } FormData_pg_tablespace;
  
  /* ----------------
***************
*** 48,61 **** typedef FormData_pg_tablespace *Form_pg_tablespace;
   * ----------------
   */
  
! #define Natts_pg_tablespace				4
  #define Anum_pg_tablespace_spcname		1
  #define Anum_pg_tablespace_spcowner		2
  #define Anum_pg_tablespace_spclocation	3
  #define Anum_pg_tablespace_spcacl		4
  
! DATA(insert OID = 1663 ( pg_default PGUID "" _null_ ));
! DATA(insert OID = 1664 ( pg_global	PGUID "" _null_ ));
  
  #define DEFAULTTABLESPACE_OID 1663
  #define GLOBALTABLESPACE_OID 1664
--- 49,63 ----
   * ----------------
   */
  
! #define Natts_pg_tablespace				6
  #define Anum_pg_tablespace_spcname		1
  #define Anum_pg_tablespace_spcowner		2
  #define Anum_pg_tablespace_spclocation	3
  #define Anum_pg_tablespace_spcacl		4
+ #define Anum_pg_tablespace_spcoptions	5
  
! DATA(insert OID = 1663 ( pg_default PGUID "" _null_ _null_ ));
! DATA(insert OID = 1664 ( pg_global	PGUID "" _null_ _null_ ));
  
  #define DEFAULTTABLESPACE_OID 1663
  #define GLOBALTABLESPACE_OID 1664
*** a/src/include/commands/tablespace.h
--- b/src/include/commands/tablespace.h
***************
*** 32,42 **** typedef struct xl_tblspc_drop_rec
--- 32,48 ----
  	Oid			ts_id;
  } xl_tblspc_drop_rec;
  
+ typedef struct TableSpaceOpts
+ {
+ 	float8		random_page_cost;
+ 	float8		seq_page_cost;
+ } TableSpaceOpts;
  
  extern void CreateTableSpace(CreateTableSpaceStmt *stmt);
  extern void DropTableSpace(DropTableSpaceStmt *stmt);
  extern void RenameTableSpace(const char *oldname, const char *newname);
  extern void AlterTableSpaceOwner(const char *name, Oid newOwnerId);
+ extern void AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt);
  
  extern void TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo);
  
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
***************
*** 346,351 **** typedef enum NodeTag
--- 346,352 ----
  	T_CreateUserMappingStmt,
  	T_AlterUserMappingStmt,
  	T_DropUserMappingStmt,
+ 	T_AlterTableSpaceOptionsStmt,
  
  	/*
  	 * TAGS FOR PARSE TREE NODES (parsenodes.h)
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 1477,1482 **** typedef struct DropTableSpaceStmt
--- 1477,1490 ----
  	bool		missing_ok;		/* skip error if missing? */
  } DropTableSpaceStmt;
  
+ typedef struct AlterTableSpaceOptionsStmt
+ {
+ 	NodeTag		type;
+ 	char	   *tablespacename;
+ 	List	   *options;
+ 	bool		isReset;
+ } AlterTableSpaceOptionsStmt;
+ 
  /* ----------------------
   *		Create/Drop FOREIGN DATA WRAPPER Statements
   * ----------------------
*** a/src/include/nodes/relation.h
--- b/src/include/nodes/relation.h
***************
*** 371,376 **** typedef struct RelOptInfo
--- 371,377 ----
  
  	/* information about a base rel (not set for join rels!) */
  	Index		relid;
+ 	Oid			reltablespace;	/* containing tablespace */
  	RTEKind		rtekind;		/* RELATION, SUBQUERY, or FUNCTION */
  	AttrNumber	min_attr;		/* smallest attrno of rel (often <0) */
  	AttrNumber	max_attr;		/* largest attrno of rel */
***************
*** 435,440 **** typedef struct IndexOptInfo
--- 436,442 ----
  	NodeTag		type;
  
  	Oid			indexoid;		/* OID of the index relation */
+ 	Oid			reltablespace;	/* tablespace of index (not table) */
  	RelOptInfo *rel;			/* back-link to index's table */
  
  	/* statistics from pg_class */
*** /dev/null
--- b/src/include/utils/spccache.h
***************
*** 0 ****
--- 1,19 ----
+ /*-------------------------------------------------------------------------
+  *
+  * spccache.h
+  *	  Tablespace cache.
+  *
+  * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * $PostgreSQL$
+  *
+  *-------------------------------------------------------------------------
+  */
+ #ifndef SPCCACHE_H
+ #define SPCCACHE_H
+ 
+ void get_tablespace_page_costs(Oid spcid, float8 *spc_random_page_cost,
+ 					     float8 *spc_seq_page_cost);
+ 
+ #endif   /* SPCCACHE_H */
*** a/src/include/utils/syscache.h
--- b/src/include/utils/syscache.h
***************
*** 71,76 **** enum SysCacheIdentifier
--- 71,77 ----
  	RELOID,
  	RULERELNAME,
  	STATRELATTINH,
+ 	TABLESPACEOID,
  	TSCONFIGMAP,
  	TSCONFIGNAMENSP,
  	TSCONFIGOID,
*** a/src/test/regress/input/tablespace.source
--- b/src/test/regress/input/tablespace.source
***************
*** 1,6 ****
--- 1,12 ----
  -- create a tablespace we can use
  CREATE TABLESPACE testspace LOCATION '@testtablespace@';
  
+ -- try setting and resetting some properties for the new tablespace
+ ALTER TABLESPACE testspace SET (random_page_cost = 1.0);
+ ALTER TABLESPACE testspace SET (some_nonexistent_parameter = true);  -- fail
+ ALTER TABLESPACE testspace RESET (random_page_cost = 2.0); -- fail
+ ALTER TABLESPACE testspace RESET (random_page_cost, seq_page_cost); -- ok
+ 
  -- create a schema we can use
  CREATE SCHEMA testschema;
  
*** a/src/test/regress/output/tablespace.source
--- b/src/test/regress/output/tablespace.source
***************
*** 1,5 ****
--- 1,12 ----
  -- create a tablespace we can use
  CREATE TABLESPACE testspace LOCATION '@testtablespace@';
+ -- try setting and resetting some properties for the new tablespace
+ ALTER TABLESPACE testspace SET (random_page_cost = 1.0);
+ ALTER TABLESPACE testspace SET (some_nonexistent_parameter = true);  -- fail
+ ERROR:  unrecognized parameter "some_nonexistent_parameter"
+ ALTER TABLESPACE testspace RESET (random_page_cost = 2.0); -- fail
+ ERROR:  RESET must not include values for parameters
+ ALTER TABLESPACE testspace RESET (random_page_cost, seq_page_cost); -- ok
  -- create a schema we can use
  CREATE SCHEMA testschema;
  -- try a table
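
Editorial sketch, not part of the posted patch: the fallback rule in get_tablespace_page_costs can be reproduced outside the backend roughly as below. It assumes that an unset option is stored as a negative value (the patch tests for `< 0`), and that the GUCs hold their documented defaults of 4.0 and 1.0. The name `resolve_page_costs` and the two static variables standing in for the GUCs are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the planner GUCs (assumption: documented defaults). */
static double random_page_cost = 4.0;
static double seq_page_cost = 1.0;

/* Same shape as TableSpaceOpts in the patch; negative means "not set". */
typedef struct TableSpaceOpts
{
	double		random_page_cost;
	double		seq_page_cost;
} TableSpaceOpts;

/*
 * Hypothetical helper mirroring the patch's logic: use the per-tablespace
 * value when one is set, otherwise fall back to the global setting.
 * Either output pointer may be NULL if the caller does not need it.
 */
static void
resolve_page_costs(const TableSpaceOpts *opts,
				   double *spc_random_page_cost,
				   double *spc_seq_page_cost)
{
	if (spc_random_page_cost)
	{
		if (!opts || opts->random_page_cost < 0)
			*spc_random_page_cost = random_page_cost;
		else
			*spc_random_page_cost = opts->random_page_cost;
	}

	if (spc_seq_page_cost)
	{
		if (!opts || opts->seq_page_cost < 0)
			*spc_seq_page_cost = seq_page_cost;
		else
			*spc_seq_page_cost = opts->seq_page_cost;
	}
}
```

A tablespace with only random_page_cost set would therefore report its own random cost and the global sequential cost, which matches how genericcostestimate passes NULL for the value it does not need.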

#19

Alvaro Herrera

alvherre@commandprompt.com

about 16 years ago

In reply to: Robert Haas (#18)

Re: patch - per-tablespace random_page_cost/seq_page_cost

--- 49,63 ----
* ----------------
*/
! #define Natts_pg_tablespace 6

Should be 5?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#20

Jaime Casanova

jcasanov@systemguards.com.ec

about 16 years ago

In reply to: Robert Haas (#18)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Sun, Jan 3, 2010 at 10:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:

in syscache.c

Hmm, I see this needs to be rebased over Tom's latest changes, but the
conflict I got was in syscache.h, rather than syscache.c. Not sure if
that's what you were going for or if there's another issue. Updated
patch attached.

ah! yeah! it has been a long holiday ;)

--
Sincerely,
Jaime Casanova
PostgreSQL support and training
Systems consulting and development
Guayaquil - Ecuador
Cel. +59387171157

#21

Alvaro Herrera

alvherre@commandprompt.com

about 16 years ago

In reply to: Robert Haas (#18)

Re: patch - per-tablespace random_page_cost/seq_page_cost

Robert Haas wrote:

Hmm, I see this needs to be rebased over Tom's latest changes, but the
conflict I got was in syscache.h, rather than syscache.c. Not sure if
that's what you were going for or if there's another issue. Updated
patch attached.

FWIW I think the reloptions code in this patch is sane enough. The fact
that it was this easily written means that the API for reloptions was
reasonably chosen, thanks :-)

Hmm, it seems we're missing a "need_initialization = false" at the
bottom of initialize_reloptions ... I'm wondering what happened to
that??
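
(Editorial sketch, not from the original message.) To illustrate why the missing assignment matters, here is a minimal lazy-initialization pattern in plain C. It is not the reloptions.c code: `initialize_options`, `lookup_option`, `init_calls` and the `clear_flag` switch are hypothetical names, and the only claim is the general one that a guard flag which is never cleared makes the setup run on every call.

```c
#include <assert.h>
#include <stdbool.h>

static bool need_initialization = true;
static int	init_calls = 0;		/* counts how often the setup work runs */

/*
 * Stand-in for an expensive one-time setup.  clear_flag selects between
 * the behaviour with and without the final "need_initialization = false".
 */
static void
initialize_options(bool clear_flag)
{
	init_calls++;
	if (clear_flag)
		need_initialization = false;
}

/* Every lookup checks the flag first, as a lazy-init guard would. */
static void
lookup_option(bool clear_flag)
{
	if (need_initialization)
		initialize_options(clear_flag);
	/* ... the actual option lookup would follow here ... */
}
```

With the flag left set, two lookups perform the setup twice; with the flag cleared, the second lookup skips it. The result is still correct either way, only slower.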

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#22

Tom Lane

tgl@sss.pgh.pa.us

about 16 years ago

In reply to: Robert Haas (#18)

Re: patch - per-tablespace random_page_cost/seq_page_cost

Robert Haas <robertmhaas@gmail.com> writes:

Hmm, I see this needs to be rebased over Tom's latest changes, but the
conflict I got was in syscache.h, rather than syscache.c. Not sure if
that's what you were going for or if there's another issue. Updated
patch attached.

I'm planning to go look at Naylor's bki refactoring patch now. Assuming
there isn't any showstopper problem with that, do you object to it
getting committed first? Either order is going to create a merge
problem, but it seems like we'd be best off to get Naylor's patch in
so people can resync affected patches before the January commitfest
starts.

regards, tom lane

#23

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Tom Lane (#22)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Mon, Jan 4, 2010 at 1:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Hmm, I see this needs to be rebased over Tom's latest changes, but the
conflict I got was in syscache.h, rather than syscache.c. Not sure if
that's what you were going for or if there's another issue. Updated
patch attached.

I'm planning to go look at Naylor's bki refactoring patch now. Assuming
there isn't any showstopper problem with that, do you object to it
getting committed first? Either order is going to create a merge
problem, but it seems like we'd be best off to get Naylor's patch in
so people can resync affected patches before the January commitfest
starts.

My only objection to that is that if we're going to add attoptions
also, I'd like to get this committed first before I start working on
that, and we're running short on time. If you can commit his patch in
the next day or two, then I am fine with rebasing mine afterwards, but
if it needs more work than that then I would prefer to commit mine so
I can move on. Is that reasonable?

...Robert

#24

Tom Lane

tgl@sss.pgh.pa.us

about 16 years ago

In reply to: Robert Haas (#23)

Re: patch - per-tablespace random_page_cost/seq_page_cost

Robert Haas <robertmhaas@gmail.com> writes:

My only objection to that is that if we're going to add attoptions
also, I'd like to get this committed first before I start working on
that, and we're running short on time. If you can commit his patch in
the next day or two, then I am fine with rebasing mine afterwards, but
if it needs more work than that then I would prefer to commit mine so
I can move on. Is that reasonable?

Fair enough --- if I can't get it done today I will let you know and
hold off.

regards, tom lane

#25

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Alvaro Herrera (#21)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Mon, Jan 4, 2010 at 10:42 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Robert Haas wrote:

Hmm, I see this needs to be rebased over Tom's latest changes, but the
conflict I got was in syscache.h, rather than syscache.c. Not sure if
that's what you were going for or if there's another issue. Updated
patch attached.

FWIW I think the reloptions code in this patch is sane enough. The fact
that it was this easily written means that the API for reloptions was
reasonably chosen, thanks :-)

:-)

Actually, there are some things about it that I'm not entirely happy
with, but I haven't brought them up because I don't have a clear idea
what I think we should do about them. The special-case hack to handle
the "oids" option is one of them.... another, possibly related, is
that I wish we could decouple the options-validation logic from the
backend storage representation. But those are issues for a future
thread. I do think it's pretty well-done overall.

Hmm, it seems we're missing a "need_initialization = false" at the
bottom of initialize_reloptions ... I'm wondering what happened to
that??

It appears that it has never been there.

$ git log -Sneed_initialization master src/backend/access/common/reloptions.c
commit f35e4442a6c9893e72fe870d9e1756262d542027
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Mon Jan 5 17:14:28 2009 +0000

Change the reloptions machinery to use a table-based parser, and provide
a more complete framework for writing custom option processing routines
by user-defined access methods.

Catalog version bumped due to the general API changes, which are going to
affect user-defined "amoptions" routines.

That was the original patch that added need_initialization, and it
didn't add that line.

...Robert

#26

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Alvaro Herrera (#19)

1 attachment(s)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Sun, Jan 3, 2010 at 11:19 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

--- 49,63 ----
   * ----------------
   */
! #define Natts_pg_tablespace 6
Should be 5?

Yep. I also fixed the other two bits of brain fade that you pointed
out to me via private email. Updated patch attached.
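
(Editorial sketch, not part of the patch.) The invariant behind the Natts fix is that the attribute count must equal the highest attribute number. A standalone C illustration, assuming a C11 compiler for `_Static_assert`, might look like the following. The macro values are copied from the corrected header, while `pg_tablespace_attnums` and `highest_attnum` are hypothetical helpers that do not exist in PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>

#define Natts_pg_tablespace				5
#define Anum_pg_tablespace_spcname		1
#define Anum_pg_tablespace_spcowner		2
#define Anum_pg_tablespace_spclocation	3
#define Anum_pg_tablespace_spcacl		4
#define Anum_pg_tablespace_spcoptions	5

/* Compile-time check: the last column number must match the column count. */
_Static_assert(Natts_pg_tablespace == Anum_pg_tablespace_spcoptions,
			   "Natts_pg_tablespace must equal the highest Anum_ value");

/* Hypothetical list of all attribute numbers, for a run-time cross-check. */
static const int pg_tablespace_attnums[] = {
	Anum_pg_tablespace_spcname,
	Anum_pg_tablespace_spcowner,
	Anum_pg_tablespace_spclocation,
	Anum_pg_tablespace_spcacl,
	Anum_pg_tablespace_spcoptions
};

/* Returns the largest attribute number in the list above. */
static int
highest_attnum(void)
{
	int			max = 0;
	size_t		i;

	for (i = 0; i < sizeof(pg_tablespace_attnums) / sizeof(pg_tablespace_attnums[0]); i++)
	{
		if (pg_tablespace_attnums[i] > max)
			max = pg_tablespace_attnums[i];
	}
	return max;
}
```

Had the header still said 6, the static assertion would have rejected it at compile time.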

...Robert

Attachments:

spcoptions-v5.patch (text/x-patch; charset=US-ASCII)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 9c39149..f5df223 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2000,6 +2000,9 @@ archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows
        <para>
         Sets the planner's estimate of the cost of a disk page fetch
         that is part of a series of sequential fetches.  The default is 1.0.
+        This value can be overridden for a particular tablespace by setting
+        the tablespace parameter of the same name
+        (see <xref linkend="sql-altertablespace">).
        </para>
       </listitem>
      </varlistentry>
@@ -2013,6 +2016,12 @@ archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows
        <para>
         Sets the planner's estimate of the cost of a
         non-sequentially-fetched disk page.  The default is 4.0.
+        This value can be overridden for a particular tablespace by setting
+        the tablespace parameter of the same name
+        (see <xref linkend="sql-altertablespace">).
+       </para>
+
+	   <para>
         Reducing this value relative to <varname>seq_page_cost</>
         will cause the system to prefer index scans; raising it will
         make index scans look relatively more expensive.  You can raise
diff --git a/doc/src/sgml/ref/alter_tablespace.sgml b/doc/src/sgml/ref/alter_tablespace.sgml
index 6cd6e2f..f477275 100644
--- a/doc/src/sgml/ref/alter_tablespace.sgml
+++ b/doc/src/sgml/ref/alter_tablespace.sgml
@@ -23,6 +23,8 @@ PostgreSQL documentation
 <synopsis>
 ALTER TABLESPACE <replaceable>name</replaceable> RENAME TO <replaceable>new_name</replaceable>
 ALTER TABLESPACE <replaceable>name</replaceable> OWNER TO <replaceable>new_owner</replaceable>
+ALTER TABLESPACE <replaceable>name</replaceable> SET ( <replaceable class="PARAMETER">tablespace_option</replaceable> = <replaceable class="PARAMETER">value</replaceable> [, ... ] )
+ALTER TABLESPACE <replaceable>name</replaceable> RESET ( <replaceable class="PARAMETER">tablespace_option</replaceable> [, ... ] )
 </synopsis>
  </refsynopsisdiv>
   
@@ -74,6 +76,24 @@ ALTER TABLESPACE <replaceable>name</replaceable> OWNER TO <replaceable>new_owner
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">tablespace_option</replaceable></term>
+    <listitem>
+     <para>
+      A tablespace parameter to be set or reset.  Currently, the only
+      available parameters are <varname>seq_page_cost</> and
+      <varname>random_page_cost</>.  Setting either value for a particular
+      tablespace will override the planner's usual estimate of the cost of
+      reading pages from tables in that tablespace, as established by
+      the configuration parameters of the same name (see
+      <xref linkend="guc-seq-page-cost">,
+      <xref linkend="guc-random-page-cost">).  This may be useful if one
+      tablespace is located on a disk which is faster or slower than the
+      remainder of the I/O subsystem.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </refsect1>
 
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 4def2c1..24480e3 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -21,6 +21,7 @@
 #include "access/reloptions.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
+#include "commands/tablespace.h"
 #include "nodes/makefuncs.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
@@ -179,6 +180,22 @@ static relopt_real realRelOpts[] =
 		},
 		-1, 0.0, 100.0
 	},
+	{
+		{
+			"seq_page_cost",
+			"Sets the planner's estimate of the cost of a sequentially fetched disk page.",
+			RELOPT_KIND_TABLESPACE
+		},
+		-1, 0.0, DBL_MAX
+	},
+	{
+		{
+			"random_page_cost",
+			"Sets the planner's estimate of the cost of a nonsequentially fetched disk page.",
+			RELOPT_KIND_TABLESPACE
+		},
+		-1, 0.0, DBL_MAX
+	},
 	/* list terminator */
 	{{NULL}}
 };
@@ -1168,3 +1185,34 @@ index_reloptions(RegProcedure amoptions, Datum reloptions, bool validate)
 
 	return DatumGetByteaP(result);
 }
+
+/*
+ * Option parser for tablespace reloptions
+ */
+bytea *
+tablespace_reloptions(Datum reloptions, bool validate)
+{
+	relopt_value *options;
+	TableSpaceOpts	*tsopts;
+	int			numoptions;
+	static const relopt_parse_elt tab[] = {
+		{"random_page_cost", RELOPT_TYPE_REAL, offsetof(TableSpaceOpts, random_page_cost)},
+		{"seq_page_cost", RELOPT_TYPE_REAL, offsetof(TableSpaceOpts, seq_page_cost)}
+	};
+
+	options = parseRelOptions(reloptions, validate, RELOPT_KIND_TABLESPACE,
+							  &numoptions);
+
+	/* if none set, we're done */
+	if (numoptions == 0)
+		return NULL;
+
+	tsopts = allocateReloptStruct(sizeof(TableSpaceOpts), options, numoptions);
+
+	fillRelOptions((void *) tsopts, sizeof(TableSpaceOpts), options, numoptions,
+				   validate, tab, lengthof(tab));
+
+	pfree(options);
+
+	return (bytea *) tsopts;
+}
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 4a51612..00244d9 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -2783,18 +2783,11 @@ ExecGrant_Tablespace(InternalGrant *istmt)
 		int			nnewmembers;
 		Oid		   *oldmembers;
 		Oid		   *newmembers;
-		ScanKeyData entry[1];
-		SysScanDesc scan;
 		HeapTuple	tuple;
 
-		/* There's no syscache for pg_tablespace, so must look the hard way */
-		ScanKeyInit(&entry[0],
-					ObjectIdAttributeNumber,
-					BTEqualStrategyNumber, F_OIDEQ,
-					ObjectIdGetDatum(tblId));
-		scan = systable_beginscan(relation, TablespaceOidIndexId, true,
-								  SnapshotNow, 1, entry);
-		tuple = systable_getnext(scan);
+		/* Search syscache for pg_tablespace */
+		tuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(tblId),
+							   0, 0, 0);
 		if (!HeapTupleIsValid(tuple))
 			elog(ERROR, "cache lookup failed for tablespace %u", tblId);
 
@@ -2865,8 +2858,7 @@ ExecGrant_Tablespace(InternalGrant *istmt)
 							  noldmembers, oldmembers,
 							  nnewmembers, newmembers);
 
-		systable_endscan(scan);
-
+		ReleaseSysCache(tuple);
 		pfree(new_acl);
 
 		/* prevent error when processing duplicate objects */
@@ -3696,9 +3688,6 @@ pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
 					  AclMode mask, AclMaskHow how)
 {
 	AclMode		result;
-	Relation	pg_tablespace;
-	ScanKeyData entry[1];
-	SysScanDesc scan;
 	HeapTuple	tuple;
 	Datum		aclDatum;
 	bool		isNull;
@@ -3711,17 +3700,9 @@ pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
 
 	/*
 	 * Get the tablespace's ACL from pg_tablespace
-	 *
-	 * There's no syscache for pg_tablespace, so must look the hard way
 	 */
-	pg_tablespace = heap_open(TableSpaceRelationId, AccessShareLock);
-	ScanKeyInit(&entry[0],
-				ObjectIdAttributeNumber,
-				BTEqualStrategyNumber, F_OIDEQ,
-				ObjectIdGetDatum(spc_oid));
-	scan = systable_beginscan(pg_tablespace, TablespaceOidIndexId, true,
-							  SnapshotNow, 1, entry);
-	tuple = systable_getnext(scan);
+	tuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(spc_oid),
+						   0, 0, 0);
 	if (!HeapTupleIsValid(tuple))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_OBJECT),
@@ -3729,8 +3710,9 @@ pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
 
 	ownerId = ((Form_pg_tablespace) GETSTRUCT(tuple))->spcowner;
 
-	aclDatum = heap_getattr(tuple, Anum_pg_tablespace_spcacl,
-							RelationGetDescr(pg_tablespace), &isNull);
+	aclDatum = SysCacheGetAttr(TABLESPACEOID, tuple,
+								   Anum_pg_tablespace_spcacl,
+								   &isNull);
 
 	if (isNull)
 	{
@@ -3750,8 +3732,7 @@ pg_tablespace_aclmask(Oid spc_oid, Oid roleid,
 	if (acl && (Pointer) acl != DatumGetPointer(aclDatum))
 		pfree(acl);
 
-	systable_endscan(scan);
-	heap_close(pg_tablespace, AccessShareLock);
+	ReleaseSysCache(tuple);
 
 	return result;
 }
@@ -4338,9 +4319,6 @@ pg_namespace_ownercheck(Oid nsp_oid, Oid roleid)
 bool
 pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
 {
-	Relation	pg_tablespace;
-	ScanKeyData entry[1];
-	SysScanDesc scan;
 	HeapTuple	spctuple;
 	Oid			spcowner;
 
@@ -4348,17 +4326,9 @@ pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
 	if (superuser_arg(roleid))
 		return true;
 
-	/* There's no syscache for pg_tablespace, so must look the hard way */
-	pg_tablespace = heap_open(TableSpaceRelationId, AccessShareLock);
-	ScanKeyInit(&entry[0],
-				ObjectIdAttributeNumber,
-				BTEqualStrategyNumber, F_OIDEQ,
-				ObjectIdGetDatum(spc_oid));
-	scan = systable_beginscan(pg_tablespace, TablespaceOidIndexId, true,
-							  SnapshotNow, 1, entry);
-
-	spctuple = systable_getnext(scan);
-
+	/* Search syscache for pg_tablespace */
+	spctuple = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(spc_oid),
+							  0, 0, 0);
 	if (!HeapTupleIsValid(spctuple))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_OBJECT),
@@ -4366,8 +4336,7 @@ pg_tablespace_ownercheck(Oid spc_oid, Oid roleid)
 
 	spcowner = ((Form_pg_tablespace) GETSTRUCT(spctuple))->spcowner;
 
-	systable_endscan(scan);
-	heap_close(pg_tablespace, AccessShareLock);
+	ReleaseSysCache(spctuple);
 
 	return has_privs_of_role(roleid, spcowner);
 }
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index c9925b7..c35a4e0 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -49,6 +49,7 @@
 #include <sys/stat.h>
 
 #include "access/heapam.h"
+#include "access/reloptions.h"
 #include "access/sysattr.h"
 #include "access/transam.h"
 #include "access/xact.h"
@@ -57,6 +58,7 @@
 #include "catalog/indexing.h"
 #include "catalog/pg_tablespace.h"
 #include "commands/comment.h"
+#include "commands/defrem.h"
 #include "commands/tablespace.h"
 #include "miscadmin.h"
 #include "postmaster/bgwriter.h"
@@ -70,6 +72,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
+#include "utils/syscache.h"
 #include "utils/tqual.h"
 
 
@@ -290,6 +293,7 @@ CreateTableSpace(CreateTableSpaceStmt *stmt)
 	values[Anum_pg_tablespace_spclocation - 1] =
 		CStringGetTextDatum(location);
 	nulls[Anum_pg_tablespace_spcacl - 1] = true;
+	nulls[Anum_pg_tablespace_spcoptions - 1] = true;
 
 	tuple = heap_form_tuple(rel->rd_att, values, nulls);
 
@@ -913,6 +917,73 @@ AlterTableSpaceOwner(const char *name, Oid newOwnerId)
 
 
 /*
+ * Alter table space options
+ */
+void
+AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt)
+{
+	Relation	rel;
+	ScanKeyData entry[1];
+	HeapScanDesc scandesc;
+	HeapTuple	tup;
+	Datum		datum;
+	Datum		newOptions;
+	Datum		repl_val[Natts_pg_tablespace];
+	bool		isnull;
+	bool		repl_null[Natts_pg_tablespace];
+	bool		repl_repl[Natts_pg_tablespace];
+	HeapTuple	newtuple;
+
+	/* Search pg_tablespace */
+	rel = heap_open(TableSpaceRelationId, RowExclusiveLock);
+
+	ScanKeyInit(&entry[0],
+				Anum_pg_tablespace_spcname,
+				BTEqualStrategyNumber, F_NAMEEQ,
+				CStringGetDatum(stmt->tablespacename));
+	scandesc = heap_beginscan(rel, SnapshotNow, 1, entry);
+	tup = heap_getnext(scandesc, ForwardScanDirection);
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_OBJECT),
+				 errmsg("tablespace \"%s\" does not exist",
+					stmt->tablespacename)));
+
+	/* Must be owner of the existing object */
+	if (!pg_tablespace_ownercheck(HeapTupleGetOid(tup), GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_TABLESPACE,
+					   stmt->tablespacename);
+
+	/* Generate new proposed spcoptions (text array) */
+	datum = heap_getattr(tup, Anum_pg_tablespace_spcoptions,
+						 RelationGetDescr(rel), &isnull);
+	newOptions = transformRelOptions(isnull ? (Datum) 0 : datum,
+									 stmt->options, NULL, NULL, false,
+									 stmt->isReset);
+	(void) tablespace_reloptions(newOptions, true);
+
+	/* Build new tuple. */
+	memset(repl_null, false, sizeof(repl_null));
+	memset(repl_repl, false, sizeof(repl_repl));
+	if (newOptions != (Datum) 0)
+		repl_val[Anum_pg_tablespace_spcoptions - 1] = newOptions;
+	else
+		repl_null[Anum_pg_tablespace_spcoptions - 1] = true;
+	repl_repl[Anum_pg_tablespace_spcoptions - 1] = true;
+	newtuple = heap_modify_tuple(tup, RelationGetDescr(rel), repl_val,
+								 repl_null, repl_repl);
+
+	/* Update system catalog. */
+	simple_heap_update(rel, &newtuple->t_self, newtuple);
+	CatalogUpdateIndexes(rel, newtuple);
+	heap_freetuple(newtuple);
+
+	/* Conclude heap scan. */
+	heap_endscan(scandesc);
+	heap_close(rel, NoLock);
+}
+
+/*
  * Routines for handling the GUC variable 'default_tablespace'.
  */
 
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 0e22c68..0faa05d 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3064,6 +3064,18 @@ _copyDropTableSpaceStmt(DropTableSpaceStmt *from)
 	return newnode;
 }
 
+static AlterTableSpaceOptionsStmt *
+_copyAlterTableSpaceOptionsStmt(AlterTableSpaceOptionsStmt *from)
+{
+	AlterTableSpaceOptionsStmt *newnode = makeNode(AlterTableSpaceOptionsStmt);
+
+	COPY_STRING_FIELD(tablespacename);
+	COPY_NODE_FIELD(options);
+	COPY_SCALAR_FIELD(isReset);
+
+	return newnode;
+}
+
 static CreateFdwStmt *
 _copyCreateFdwStmt(CreateFdwStmt *from)
 {
@@ -4028,6 +4040,9 @@ copyObject(void *from)
 		case T_DropTableSpaceStmt:
 			retval = _copyDropTableSpaceStmt(from);
 			break;
+		case T_AlterTableSpaceOptionsStmt:
+			retval = _copyAlterTableSpaceOptionsStmt(from);
+			break;
 		case T_CreateFdwStmt:
 			retval = _copyCreateFdwStmt(from);
 			break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f850935..62bf3b1 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1569,6 +1569,17 @@ _equalDropTableSpaceStmt(DropTableSpaceStmt *a, DropTableSpaceStmt *b)
 }
 
 static bool
+_equalAlterTableSpaceOptionsStmt(AlterTableSpaceOptionsStmt *a,
+											 AlterTableSpaceOptionsStmt *b)
+{
+	COMPARE_STRING_FIELD(tablespacename);
+	COMPARE_NODE_FIELD(options);
+	COMPARE_SCALAR_FIELD(isReset);
+
+	return true;
+}
+
+static bool
 _equalCreateFdwStmt(CreateFdwStmt *a, CreateFdwStmt *b)
 {
 	COMPARE_STRING_FIELD(fdwname);
@@ -2720,6 +2731,9 @@ equal(void *a, void *b)
 		case T_DropTableSpaceStmt:
 			retval = _equalDropTableSpaceStmt(a, b);
 			break;
+		case T_AlterTableSpaceOptionsStmt:
+			retval = _equalAlterTableSpaceOptionsStmt(a, b);
+			break;
 		case T_CreateFdwStmt:
 			retval = _equalCreateFdwStmt(a, b);
 			break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 508aa31..d7c62ed 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1590,6 +1590,7 @@ _outRelOptInfo(StringInfo str, RelOptInfo *node)
 	WRITE_NODE_FIELD(cheapest_total_path);
 	WRITE_NODE_FIELD(cheapest_unique_path);
 	WRITE_UINT_FIELD(relid);
+	WRITE_UINT_FIELD(reltablespace);
 	WRITE_ENUM_FIELD(rtekind, RTEKind);
 	WRITE_INT_FIELD(min_attr);
 	WRITE_INT_FIELD(max_attr);
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 09bc624..903be2b 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -27,6 +27,11 @@
  * detail.	Note that all of these parameters are user-settable, in case
  * the default values are drastically off for a particular platform.
  *
+ * seq_page_cost and random_page_cost can also be overridden for an individual
+ * tablespace, in case some data is on a fast disk and other data is on a slow
+ * disk.  Per-tablespace overrides never apply to temporary work files such as
+ * an external sort or a materialize node that overflows work_mem.
+ *
  * We compute two separate costs for each path:
  *		total_cost: total estimated cost to fetch all tuples
  *		startup_cost: cost that is expended before first tuple is fetched
@@ -76,6 +81,7 @@
 #include "parser/parsetree.h"
 #include "utils/lsyscache.h"
 #include "utils/selfuncs.h"
+#include "utils/spccache.h"
 #include "utils/tuplesort.h"
 
 
@@ -164,6 +170,7 @@ void
 cost_seqscan(Path *path, PlannerInfo *root,
 			 RelOptInfo *baserel)
 {
+	double		spc_seq_page_cost;
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
 	Cost		cpu_per_tuple;
@@ -175,10 +182,15 @@ cost_seqscan(Path *path, PlannerInfo *root,
 	if (!enable_seqscan)
 		startup_cost += disable_cost;
 
+	/* fetch estimated page cost for tablespace containing table */
+	get_tablespace_page_costs(baserel->reltablespace,
+							  NULL,
+							  &spc_seq_page_cost);
+
 	/*
 	 * disk costs
 	 */
-	run_cost += seq_page_cost * baserel->pages;
+	run_cost += spc_seq_page_cost * baserel->pages;
 
 	/* CPU costs */
 	startup_cost += baserel->baserestrictcost.startup;
@@ -226,6 +238,8 @@ cost_index(IndexPath *path, PlannerInfo *root,
 	Selectivity indexSelectivity;
 	double		indexCorrelation,
 				csquared;
+	double		spc_seq_page_cost,
+				spc_random_page_cost;
 	Cost		min_IO_cost,
 				max_IO_cost;
 	Cost		cpu_per_tuple;
@@ -272,13 +286,18 @@ cost_index(IndexPath *path, PlannerInfo *root,
 	/* estimate number of main-table tuples fetched */
 	tuples_fetched = clamp_row_est(indexSelectivity * baserel->tuples);
 
+	/* fetch estimated page costs for tablespace containing table */
+	get_tablespace_page_costs(baserel->reltablespace,
+							  &spc_random_page_cost,
+							  &spc_seq_page_cost);
+
 	/*----------
 	 * Estimate number of main-table pages fetched, and compute I/O cost.
 	 *
 	 * When the index ordering is uncorrelated with the table ordering,
 	 * we use an approximation proposed by Mackert and Lohman (see
 	 * index_pages_fetched() for details) to compute the number of pages
-	 * fetched, and then charge random_page_cost per page fetched.
+	 * fetched, and then charge spc_random_page_cost per page fetched.
 	 *
 	 * When the index ordering is exactly correlated with the table ordering
 	 * (just after a CLUSTER, for example), the number of pages fetched should
@@ -286,7 +305,7 @@ cost_index(IndexPath *path, PlannerInfo *root,
 	 * will be sequential fetches, not the random fetches that occur in the
 	 * uncorrelated case.  So if the number of pages is more than 1, we
 	 * ought to charge
-	 *		random_page_cost + (pages_fetched - 1) * seq_page_cost
+	 *		spc_random_page_cost + (pages_fetched - 1) * spc_seq_page_cost
 	 * For partially-correlated indexes, we ought to charge somewhere between
 	 * these two estimates.  We currently interpolate linearly between the
 	 * estimates based on the correlation squared (XXX is that appropriate?).
@@ -309,7 +328,7 @@ cost_index(IndexPath *path, PlannerInfo *root,
 											(double) index->pages,
 											root);
 
-		max_IO_cost = (pages_fetched * random_page_cost) / num_scans;
+		max_IO_cost = (pages_fetched * spc_random_page_cost) / num_scans;
 
 		/*
 		 * In the perfectly correlated case, the number of pages touched by
@@ -328,7 +347,7 @@ cost_index(IndexPath *path, PlannerInfo *root,
 											(double) index->pages,
 											root);
 
-		min_IO_cost = (pages_fetched * random_page_cost) / num_scans;
+		min_IO_cost = (pages_fetched * spc_random_page_cost) / num_scans;
 	}
 	else
 	{
@@ -342,13 +361,13 @@ cost_index(IndexPath *path, PlannerInfo *root,
 											root);
 
 		/* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
-		max_IO_cost = pages_fetched * random_page_cost;
+		max_IO_cost = pages_fetched * spc_random_page_cost;
 
 		/* min_IO_cost is for the perfectly correlated case (csquared=1) */
 		pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
-		min_IO_cost = random_page_cost;
+		min_IO_cost = spc_random_page_cost;
 		if (pages_fetched > 1)
-			min_IO_cost += (pages_fetched - 1) * seq_page_cost;
+			min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;
 	}
 
 	/*
@@ -553,6 +572,8 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 	Cost		cost_per_page;
 	double		tuples_fetched;
 	double		pages_fetched;
+	double		spc_seq_page_cost,
+				spc_random_page_cost;
 	double		T;
 
 	/* Should only be applied to base relations */
@@ -571,6 +592,11 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 
 	startup_cost += indexTotalCost;
 
+	/* Fetch estimated page costs for tablespace containing table. */
+	get_tablespace_page_costs(baserel->reltablespace,
+							  &spc_random_page_cost,
+							  &spc_seq_page_cost);
+
 	/*
 	 * Estimate number of main-table pages fetched.
 	 */
@@ -609,17 +635,18 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 		pages_fetched = ceil(pages_fetched);
 
 	/*
-	 * For small numbers of pages we should charge random_page_cost apiece,
+	 * For small numbers of pages we should charge spc_random_page_cost apiece,
 	 * while if nearly all the table's pages are being read, it's more
-	 * appropriate to charge seq_page_cost apiece.	The effect is nonlinear,
+	 * appropriate to charge spc_seq_page_cost apiece.	The effect is nonlinear,
 	 * too. For lack of a better idea, interpolate like this to determine the
 	 * cost per page.
 	 */
 	if (pages_fetched >= 2.0)
-		cost_per_page = random_page_cost -
-			(random_page_cost - seq_page_cost) * sqrt(pages_fetched / T);
+		cost_per_page = spc_random_page_cost -
+			(spc_random_page_cost - spc_seq_page_cost)
+			* sqrt(pages_fetched / T);
 	else
-		cost_per_page = random_page_cost;
+		cost_per_page = spc_random_page_cost;
 
 	run_cost += pages_fetched * cost_per_page;
 
@@ -783,6 +810,7 @@ cost_tidscan(Path *path, PlannerInfo *root,
 	QualCost	tid_qual_cost;
 	int			ntuples;
 	ListCell   *l;
+	double		spc_random_page_cost;
 
 	/* Should only be applied to base relations */
 	Assert(baserel->relid > 0);
@@ -835,8 +863,13 @@ cost_tidscan(Path *path, PlannerInfo *root,
 	 */
 	cost_qual_eval(&tid_qual_cost, tidquals, root);
 
+	/* fetch estimated page cost for tablespace containing table */
+	get_tablespace_page_costs(baserel->reltablespace,
+							  &spc_random_page_cost,
+							  NULL);
+
 	/* disk costs --- assume each tuple on a different page */
-	run_cost += random_page_cost * ntuples;
+	run_cost += spc_random_page_cost * ntuples;
 
 	/* CPU costs */
 	startup_cost += baserel->baserestrictcost.startup +
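
Reviewer's sketch (not part of the patch): the cost_index() comments above
describe a best case (perfectly correlated index), a worst case (uncorrelated)
and a linear interpolation on the squared correlation. The standalone C below
reproduces that arithmetic with the per-tablespace costs passed in as plain
arguments. All demo_* names are hypothetical, and the worst-case page count,
which the planner derives from the Mackert and Lohman approximation, is simply
taken as an input here.

```c
#include <assert.h>

/* Best case: first page is a random fetch, the rest are sequential. */
static double
demo_min_io_cost(double pages_fetched,
				 double spc_random_page_cost,
				 double spc_seq_page_cost)
{
	double		cost = spc_random_page_cost;

	if (pages_fetched > 1)
		cost += (pages_fetched - 1) * spc_seq_page_cost;
	return cost;
}

/* Worst case: every page fetched is charged as a random fetch. */
static double
demo_max_io_cost(double pages_fetched, double spc_random_page_cost)
{
	return pages_fetched * spc_random_page_cost;
}

/* Interpolate linearly between the two estimates on correlation squared. */
static double
demo_interpolate_io_cost(double min_io, double max_io, double correlation)
{
	double		csquared = correlation * correlation;

	assert(correlation >= -1.0 && correlation <= 1.0);
	return max_io + csquared * (min_io - max_io);
}
```

With random_page_cost = 4 and seq_page_cost = 1, ten correlated pages cost
4 + 9 * 1 = 13, so lowering only the tablespace's random_page_cost shrinks the
worst case far more than the best case.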
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 463d318..ea7d055 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -91,6 +91,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1;
 	rel->max_attr = RelationGetNumberOfAttributes(relation);
+	rel->reltablespace = RelationGetForm(relation)->reltablespace;
 
 	Assert(rel->max_attr >= rel->min_attr);
 	rel->attr_needed = (Relids *)
@@ -183,6 +184,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 			info = makeNode(IndexOptInfo);
 
 			info->indexoid = index->indexrelid;
+			info->reltablespace =
+				RelationGetForm(indexRelation)->reltablespace;
 			info->rel = rel;
 			info->ncolumns = ncolumns = index->indnatts;
 
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 61d47fa..7aa387e 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -5687,6 +5687,24 @@ RenameStmt: ALTER AGGREGATE func_name aggr_args RENAME TO name
 					n->newname = $6;
 					$$ = (Node *)n;
 				}
+			| ALTER TABLESPACE name SET reloptions
+				{
+					AlterTableSpaceOptionsStmt *n =
+						makeNode(AlterTableSpaceOptionsStmt);
+					n->tablespacename = $3;
+					n->options = $5;
+					n->isReset = FALSE;
+					$$ = (Node *)n;
+				}
+			| ALTER TABLESPACE name RESET reloptions
+				{
+					AlterTableSpaceOptionsStmt *n =
+						makeNode(AlterTableSpaceOptionsStmt);
+					n->tablespacename = $3;
+					n->options = $5;
+					n->isReset = TRUE;
+					$$ = (Node *)n;
+				}
 			| ALTER TEXT_P SEARCH PARSER any_name RENAME TO name
 				{
 					RenameStmt *n = makeNode(RenameStmt);
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 58b6737..8b659ff 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -218,6 +218,7 @@ check_xact_readonly(Node *parsetree)
 		case T_CreateUserMappingStmt:
 		case T_AlterUserMappingStmt:
 		case T_DropUserMappingStmt:
+		case T_AlterTableSpaceOptionsStmt:
 			ereport(ERROR,
 					(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
 					 errmsg("transaction is read-only")));
@@ -528,6 +529,10 @@ standard_ProcessUtility(Node *parsetree,
 			DropTableSpace((DropTableSpaceStmt *) parsetree);
 			break;
 
+		case T_AlterTableSpaceOptionsStmt:
+			AlterTableSpaceOptions((AlterTableSpaceOptionsStmt *) parsetree);
+			break;
+
 		case T_CreateFdwStmt:
 			CreateForeignDataWrapper((CreateFdwStmt *) parsetree);
 			break;
@@ -1456,6 +1461,10 @@ CreateCommandTag(Node *parsetree)
 			tag = "DROP TABLESPACE";
 			break;
 
+		case T_AlterTableSpaceOptionsStmt:
+			tag = "ALTER TABLESPACE";
+			break;
+
 		case T_CreateFdwStmt:
 			tag = "CREATE FOREIGN DATA WRAPPER";
 			break;
@@ -2238,6 +2247,10 @@ GetCommandLogLevel(Node *parsetree)
 			lev = LOGSTMT_DDL;
 			break;
 
+		case T_AlterTableSpaceOptionsStmt:
+			lev = LOGSTMT_DDL;
+			break;
+
 		case T_CreateFdwStmt:
 		case T_AlterFdwStmt:
 		case T_DropFdwStmt:
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 2ba871e..9ef0aa9 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -119,6 +119,7 @@
 #include "utils/nabstime.h"
 #include "utils/pg_locale.h"
 #include "utils/selfuncs.h"
+#include "utils/spccache.h"
 #include "utils/syscache.h"
 #include "utils/tqual.h"
 
@@ -5648,6 +5649,7 @@ genericcostestimate(PlannerInfo *root,
 	QualCost	index_qual_cost;
 	double		qual_op_cost;
 	double		qual_arg_cost;
+	double		spc_random_page_cost;
 	List	   *selectivityQuals;
 	ListCell   *l;
 
@@ -5756,6 +5758,11 @@ genericcostestimate(PlannerInfo *root,
 	else
 		numIndexPages = 1.0;
 
+	/* fetch estimated page cost for tablespace containing index */
+	get_tablespace_page_costs(index->reltablespace,
+							  &spc_random_page_cost,
+							  NULL);
+
 	/*
 	 * Now compute the disk access costs.
 	 *
@@ -5802,15 +5809,16 @@ genericcostestimate(PlannerInfo *root,
 		 * share for each outer scan.  (Don't pro-rate for ScalarArrayOpExpr,
 		 * since that's internal to the indexscan.)
 		 */
-		*indexTotalCost = (pages_fetched * random_page_cost) / num_outer_scans;
+		*indexTotalCost = (pages_fetched * spc_random_page_cost)
+							/ num_outer_scans;
 	}
 	else
 	{
 		/*
-		 * For a single index scan, we just charge random_page_cost per page
-		 * touched.
+		 * For a single index scan, we just charge spc_random_page_cost per
+		 * page touched.
 		 */
-		*indexTotalCost = numIndexPages * random_page_cost;
+		*indexTotalCost = numIndexPages * spc_random_page_cost;
 	}
 
 	/*
@@ -5825,11 +5833,11 @@ genericcostestimate(PlannerInfo *root,
 	 *
 	 * We can deal with this by adding a very small "fudge factor" that
 	 * depends on the index size.  The fudge factor used here is one
-	 * random_page_cost per 100000 index pages, which should be small enough
-	 * to not alter index-vs-seqscan decisions, but will prevent indexes of
-	 * different sizes from looking exactly equally attractive.
+	 * spc_random_page_cost per 100000 index pages, which should be small
+	 * enough to not alter index-vs-seqscan decisions, but will prevent
+	 * indexes of different sizes from looking exactly equally attractive.
 	 */
-	*indexTotalCost += index->pages * random_page_cost / 100000.0;
+	*indexTotalCost += index->pages * spc_random_page_cost / 100000.0;
 
 	/*
 	 * CPU cost: any complex expressions in the indexquals will need to be
diff --git a/src/backend/utils/cache/Makefile b/src/backend/utils/cache/Makefile
index e612f9b..83716a2 100644
--- a/src/backend/utils/cache/Makefile
+++ b/src/backend/utils/cache/Makefile
@@ -13,6 +13,6 @@ top_builddir = ../../../..
 include $(top_builddir)/src/Makefile.global
 
 OBJS = catcache.o inval.o plancache.o relcache.o \
-	syscache.o lsyscache.o typcache.o ts_cache.o
+	spccache.o syscache.o lsyscache.o typcache.o ts_cache.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/cache/spccache.c b/src/backend/utils/cache/spccache.c
new file mode 100644
index 0000000..e9cbc0c
--- /dev/null
+++ b/src/backend/utils/cache/spccache.c
@@ -0,0 +1,183 @@
+/*-------------------------------------------------------------------------
+ *
+ * spccache.c
+ *	  Tablespace cache management.
+ *
+ * We cache the parsed version of spcoptions for each tablespace to avoid
+ * needing to reparse on every lookup.  Right now, there doesn't appear to
+ * be a measurable performance gain from doing this, but that might change
+ * in the future as we add more options.
+ *
+ * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  $PostgreSQL$
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+#include "access/reloptions.h"
+#include "catalog/pg_tablespace.h"
+#include "commands/tablespace.h"
+#include "miscadmin.h"
+#include "optimizer/cost.h"
+#include "utils/catcache.h"
+#include "utils/hsearch.h"
+#include "utils/inval.h"
+#include "utils/spccache.h"
+#include "utils/syscache.h"
+
+static HTAB *TableSpaceCacheHash = NULL;
+
+typedef struct {
+	Oid			oid;
+	TableSpaceOpts *opts;
+} TableSpace;
+
+/*
+ * InvalidateTableSpaceCacheCallback
+ *		Flush all cache entries when pg_tablespace is updated.
+ *
+ * When pg_tablespace is updated, we must flush the cache entry at least
+ * for that tablespace.  Currently, we just flush them all.  This is quick
+ * and easy and doesn't cost much, since there shouldn't be terribly many
+ * tablespaces, nor do we expect them to be frequently modified.
+ */
+static void
+InvalidateTableSpaceCacheCallback(Datum arg, int cacheid, ItemPointer tuplePtr)
+{
+	HASH_SEQ_STATUS status;
+	TableSpace *spc;
+
+	hash_seq_init(&status, TableSpaceCacheHash);
+	while ((spc = (TableSpace *) hash_seq_search(&status)) != NULL)
+	{
+		if (spc->opts)
+			pfree(spc->opts);
+		if (hash_search(TableSpaceCacheHash, (void *) &spc->oid, HASH_REMOVE,
+						NULL) == NULL)
+			elog(ERROR, "hash table corrupted");
+	}
+}
+
+/*
+ * InitializeTableSpaceCache
+ *		Initialize the tablespace cache.
+ */
+static void
+InitializeTableSpaceCache(void)
+{
+	HASHCTL ctl;
+
+	/* Initialize the hash table. */
+	MemSet(&ctl, 0, sizeof(ctl));
+	ctl.keysize = sizeof(Oid);
+	ctl.entrysize = sizeof(TableSpace);
+	ctl.hash = tag_hash;
+	TableSpaceCacheHash =
+		hash_create("TableSpace cache", 16, &ctl,
+				    HASH_ELEM | HASH_FUNCTION);
+
+	/* Make sure we've initialized CacheMemoryContext. */
+	if (!CacheMemoryContext)
+		CreateCacheMemoryContext();
+
+	/* Watch for invalidation events. */
+	CacheRegisterSyscacheCallback(TABLESPACEOID,
+								  InvalidateTableSpaceCacheCallback,
+								  (Datum) 0);
+}
+
+/*
+ * get_tablespace
+ *		Fetch TableSpace structure for a specified tablespace OID.
+ *
+ * Pointers returned by this function should not be stored, since a cache
+ * flush will invalidate them.
+ */
+static TableSpace *
+get_tablespace(Oid spcid)
+{
+	HeapTuple	tp;
+	TableSpace *spc;
+	bool		found;
+
+	/*
+	 * Since spcid is always from a pg_class tuple, InvalidOid implies the
+	 * default.
+	 */
+	if (spcid == InvalidOid)
+		spcid = MyDatabaseTableSpace;
+
+	/* Find existing cache entry, or create a new one. */
+	if (!TableSpaceCacheHash)
+		InitializeTableSpaceCache();
+	spc = (TableSpace *) hash_search(TableSpaceCacheHash, (void *) &spcid,
+									 HASH_ENTER, &found);
+	if (found)
+		return spc;
+
+	/*
+	 * Not found in TableSpace cache.  Check catcache.  If we don't find a
+	 * valid HeapTuple, it must mean someone has managed to request tablespace
+	 * details for a non-existent tablespace.  We'll just treat that case as if
+	 * no options were specified.
+	 */
+	tp = SearchSysCache(TABLESPACEOID, ObjectIdGetDatum(spcid), 0, 0, 0);
+	if (!HeapTupleIsValid(tp))
+		spc->opts = NULL;
+	else
+	{
+		Datum	datum;
+		bool	isNull;
+		MemoryContext octx;
+
+		datum = SysCacheGetAttr(TABLESPACEOID,
+								tp,
+								Anum_pg_tablespace_spcoptions,
+								&isNull);
+		if (isNull)
+			spc->opts = NULL;
+		else
+		{
+			octx = MemoryContextSwitchTo(CacheMemoryContext);
+			spc->opts = (TableSpaceOpts *) tablespace_reloptions(datum, false);
+			MemoryContextSwitchTo(octx);
+		}
+		ReleaseSysCache(tp);
+	}
+
+	/* The new TableSpace cache entry now holds the parsed options. */
+	return spc;
+}
+
+/*
+ * get_tablespace_page_costs
+ *		Return random and sequential page costs for a given tablespace.
+ */
+void
+get_tablespace_page_costs(Oid spcid, double *spc_random_page_cost,
+							   double *spc_seq_page_cost)
+{
+	TableSpace *spc = get_tablespace(spcid);
+
+	Assert(spc != NULL);
+
+	if (spc_random_page_cost)
+	{
+		if (!spc->opts || spc->opts->random_page_cost < 0)
+			*spc_random_page_cost = random_page_cost;
+		else
+			*spc_random_page_cost = spc->opts->random_page_cost;
+	}
+
+	if (spc_seq_page_cost)
+	{
+		if (!spc->opts || spc->opts->seq_page_cost < 0)
+			*spc_seq_page_cost = seq_page_cost;
+		else
+			*spc_seq_page_cost = spc->opts->seq_page_cost;
+	}
+}
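
Reviewer's sketch (not part of the patch): get_tablespace_page_costs() above
falls back to the global GUC whenever the tablespace has no parsed options or
the option is negative (the reloption default is -1, meaning "not set"). The
standalone C below mirrors that fallback rule without any backend headers.
DemoSpcOpts, demo_get_page_costs and the two static variables are hypothetical
stand-ins for TableSpaceOpts, the real function and the planner GUCs.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the planner GUCs, set to the documented defaults. */
static double random_page_cost = 4.0;
static double seq_page_cost = 1.0;

/* Hypothetical mirror of TableSpaceOpts; a negative value means "not set". */
typedef struct
{
	double		random_page_cost;
	double		seq_page_cost;
} DemoSpcOpts;

/*
 * Return the per-tablespace costs, falling back to the global values when
 * there are no options at all or the individual option is unset.  Either
 * output pointer may be NULL if the caller does not need that cost.
 */
static void
demo_get_page_costs(const DemoSpcOpts *opts,
					double *spc_random_page_cost,
					double *spc_seq_page_cost)
{
	/* Asking for neither cost would be a caller bug. */
	assert(spc_random_page_cost != NULL || spc_seq_page_cost != NULL);

	if (spc_random_page_cost)
	{
		if (opts == NULL || opts->random_page_cost < 0)
			*spc_random_page_cost = random_page_cost;
		else
			*spc_random_page_cost = opts->random_page_cost;
	}

	if (spc_seq_page_cost)
	{
		if (opts == NULL || opts->seq_page_cost < 0)
			*spc_seq_page_cost = seq_page_cost;
		else
			*spc_seq_page_cost = opts->seq_page_cost;
	}
}
```

A tablespace that sets only random_page_cost therefore keeps the global
seq_page_cost, which is why callers such as cost_seqscan() can pass NULL for
the cost they do not use.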
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index d31335d..a44965f 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -43,6 +43,7 @@
 #include "catalog/pg_proc.h"
 #include "catalog/pg_rewrite.h"
 #include "catalog/pg_statistic.h"
+#include "catalog/pg_tablespace.h"
 #include "catalog/pg_ts_config.h"
 #include "catalog/pg_ts_config_map.h"
 #include "catalog/pg_ts_dict.h"
@@ -609,6 +610,18 @@ static const struct cachedesc cacheinfo[] = {
 		},
 		1024
 	},
+	{TableSpaceRelationId,		/* TABLESPACEOID */
+		TablespaceOidIndexId,
+		0,
+		1,
+		{
+			ObjectIdAttributeNumber,
+			0,
+			0,
+			0,
+		},
+		16
+	},
 	{TSConfigMapRelationId,		/* TSCONFIGMAP */
 		TSConfigMapIndexId,
 		0,
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index e7cc1e0..6cd8055 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -956,19 +956,28 @@ dumpTablespaces(PGconn *conn)
 	 * Get all tablespaces except built-in ones (which we assume are named
 	 * pg_xxx)
 	 */
-	if (server_version >= 80200)
+	if (server_version >= 80500)
 		res = executeQuery(conn, "SELECT spcname, "
 						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
 						   "spclocation, spcacl, "
+						   "array_to_string(spcoptions, ', '),"
 						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
 						   "FROM pg_catalog.pg_tablespace "
 						   "WHERE spcname !~ '^pg_' "
 						   "ORDER BY 1");
+	else if (server_version >= 80200)
+		res = executeQuery(conn, "SELECT spcname, "
+						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
+						   "spclocation, spcacl, null, "
+						"pg_catalog.shobj_description(oid, 'pg_tablespace') "
+						   "FROM pg_catalog.pg_tablespace "
+						   "WHERE spcname !~ '^pg_' "
+						   "ORDER BY 1");
 	else
 		res = executeQuery(conn, "SELECT spcname, "
 						 "pg_catalog.pg_get_userbyid(spcowner) AS spcowner, "
 						   "spclocation, spcacl, "
-						   "null "
+						   "null, null "
 						   "FROM pg_catalog.pg_tablespace "
 						   "WHERE spcname !~ '^pg_' "
 						   "ORDER BY 1");
@@ -983,7 +992,8 @@ dumpTablespaces(PGconn *conn)
 		char	   *spcowner = PQgetvalue(res, i, 1);
 		char	   *spclocation = PQgetvalue(res, i, 2);
 		char	   *spcacl = PQgetvalue(res, i, 3);
-		char	   *spccomment = PQgetvalue(res, i, 4);
+		char	   *spcoptions = PQgetvalue(res, i, 4);
+		char	   *spccomment = PQgetvalue(res, i, 5);
 		char	   *fspcname;
 
 		/* needed for buildACLCommands() */
@@ -996,6 +1006,10 @@ dumpTablespaces(PGconn *conn)
 		appendStringLiteralConn(buf, spclocation, conn);
 		appendPQExpBuffer(buf, ";\n");
 
+		if (spcoptions && spcoptions[0] != '\0')
+			appendPQExpBuffer(buf, "ALTER TABLESPACE %s SET (%s);\n",
+							  fspcname, spcoptions);
+
 		if (!skip_acls &&
 			!buildACLCommands(fspcname, NULL, "TABLESPACE", spcacl, spcowner,
 							  "", server_version, buf))
diff --git a/src/include/access/reloptions.h b/src/include/access/reloptions.h
index 2a7c712..a0087bc 100644
--- a/src/include/access/reloptions.h
+++ b/src/include/access/reloptions.h
@@ -1,7 +1,8 @@
 /*-------------------------------------------------------------------------
  *
  * reloptions.h
- *	  Core support for relation options (pg_class.reloptions)
+ *	  Core support for relation and tablespace options (pg_class.reloptions
+ *	  and pg_tablespace.spcoptions)
  *
  * Note: the functions dealing with text-array reloptions values declare
  * them as Datum, not ArrayType *, to avoid needing to include array.h
@@ -39,8 +40,9 @@ typedef enum relopt_kind
 	RELOPT_KIND_HASH = (1 << 3),
 	RELOPT_KIND_GIN = (1 << 4),
 	RELOPT_KIND_GIST = (1 << 5),
+	RELOPT_KIND_TABLESPACE = (1 << 6),
 	/* if you add a new kind, make sure you update "last_default" too */
-	RELOPT_KIND_LAST_DEFAULT = RELOPT_KIND_GIST,
+	RELOPT_KIND_LAST_DEFAULT = RELOPT_KIND_TABLESPACE,
 	/* some compilers treat enums as signed ints, so we can't use 1 << 31 */
 	RELOPT_KIND_MAX = (1 << 30)
 } relopt_kind;
@@ -264,5 +266,6 @@ extern bytea *default_reloptions(Datum reloptions, bool validate,
 extern bytea *heap_reloptions(char relkind, Datum reloptions, bool validate);
 extern bytea *index_reloptions(RegProcedure amoptions, Datum reloptions,
 				 bool validate);
+extern bytea *tablespace_reloptions(Datum reloptions, bool validate);
 
 #endif   /* RELOPTIONS_H */
diff --git a/src/include/catalog/pg_tablespace.h b/src/include/catalog/pg_tablespace.h
index 0bda18e..1552297 100644
--- a/src/include/catalog/pg_tablespace.h
+++ b/src/include/catalog/pg_tablespace.h
@@ -34,6 +34,7 @@ CATALOG(pg_tablespace,1213) BKI_SHARED_RELATION
 	Oid			spcowner;		/* owner of tablespace */
 	text		spclocation;	/* physical location (VAR LENGTH) */
 	aclitem		spcacl[1];		/* access permissions (VAR LENGTH) */
+	text		spcoptions[1];	/* per-tablespace options */
 } FormData_pg_tablespace;
 
 /* ----------------
@@ -48,14 +49,15 @@ typedef FormData_pg_tablespace *Form_pg_tablespace;
  * ----------------
  */
 
-#define Natts_pg_tablespace				4
+#define Natts_pg_tablespace				5
 #define Anum_pg_tablespace_spcname		1
 #define Anum_pg_tablespace_spcowner		2
 #define Anum_pg_tablespace_spclocation	3
 #define Anum_pg_tablespace_spcacl		4
+#define Anum_pg_tablespace_spcoptions	5
 
-DATA(insert OID = 1663 ( pg_default PGUID "" _null_ ));
-DATA(insert OID = 1664 ( pg_global	PGUID "" _null_ ));
+DATA(insert OID = 1663 ( pg_default PGUID "" _null_ _null_ ));
+DATA(insert OID = 1664 ( pg_global	PGUID "" _null_ _null_ ));
 
 #define DEFAULTTABLESPACE_OID 1663
 #define GLOBALTABLESPACE_OID 1664
diff --git a/src/include/commands/tablespace.h b/src/include/commands/tablespace.h
index bafaeae..b973450 100644
--- a/src/include/commands/tablespace.h
+++ b/src/include/commands/tablespace.h
@@ -32,11 +32,17 @@ typedef struct xl_tblspc_drop_rec
 	Oid			ts_id;
 } xl_tblspc_drop_rec;
 
+typedef struct TableSpaceOpts
+{
+	float8		random_page_cost;
+	float8		seq_page_cost;
+} TableSpaceOpts;
 
 extern void CreateTableSpace(CreateTableSpaceStmt *stmt);
 extern void DropTableSpace(DropTableSpaceStmt *stmt);
 extern void RenameTableSpace(const char *oldname, const char *newname);
 extern void AlterTableSpaceOwner(const char *name, Oid newOwnerId);
+extern void AlterTableSpaceOptions(AlterTableSpaceOptionsStmt *stmt);
 
 extern void TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo);
 
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 2c31674..cc20a25 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -346,6 +346,7 @@ typedef enum NodeTag
 	T_CreateUserMappingStmt,
 	T_AlterUserMappingStmt,
 	T_DropUserMappingStmt,
+	T_AlterTableSpaceOptionsStmt,
 
 	/*
 	 * TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index f9272e8..e05f93b 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1477,6 +1477,14 @@ typedef struct DropTableSpaceStmt
 	bool		missing_ok;		/* skip error if missing? */
 } DropTableSpaceStmt;
 
+typedef struct AlterTableSpaceOptionsStmt
+{
+	NodeTag		type;
+	char	   *tablespacename;
+	List	   *options;
+	bool		isReset;
+} AlterTableSpaceOptionsStmt;
+
 /* ----------------------
  *		Create/Drop FOREIGN DATA WRAPPER Statements
  * ----------------------
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index f2a0a7a..d7241eb 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -371,6 +371,7 @@ typedef struct RelOptInfo
 
 	/* information about a base rel (not set for join rels!) */
 	Index		relid;
+	Oid			reltablespace;	/* containing tablespace */
 	RTEKind		rtekind;		/* RELATION, SUBQUERY, or FUNCTION */
 	AttrNumber	min_attr;		/* smallest attrno of rel (often <0) */
 	AttrNumber	max_attr;		/* largest attrno of rel */
@@ -435,6 +436,7 @@ typedef struct IndexOptInfo
 	NodeTag		type;
 
 	Oid			indexoid;		/* OID of the index relation */
+	Oid			reltablespace;	/* tablespace of index (not table) */
 	RelOptInfo *rel;			/* back-link to index's table */
 
 	/* statistics from pg_class */
diff --git a/src/include/utils/spccache.h b/src/include/utils/spccache.h
new file mode 100644
index 0000000..726eb9a
--- /dev/null
+++ b/src/include/utils/spccache.h
@@ -0,0 +1,19 @@
+/*-------------------------------------------------------------------------
+ *
+ * spccache.h
+ *	  Tablespace cache.
+ *
+ * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * $PostgreSQL$
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SPCCACHE_H
+#define SPCCACHE_H
+
+void get_tablespace_page_costs(Oid spcid, float8 *spc_random_page_cost,
+					     float8 *spc_seq_page_cost);
+
+#endif   /* SPCCACHE_H */
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 943e553..c6459d0 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -71,6 +71,7 @@ enum SysCacheIdentifier
 	RELOID,
 	RULERELNAME,
 	STATRELATTINH,
+	TABLESPACEOID,
 	TSCONFIGMAP,
 	TSCONFIGNAMENSP,
 	TSCONFIGOID,
diff --git a/src/test/regress/input/tablespace.source b/src/test/regress/input/tablespace.source
index df5479d..dba96f4 100644
--- a/src/test/regress/input/tablespace.source
+++ b/src/test/regress/input/tablespace.source
@@ -1,6 +1,12 @@
 -- create a tablespace we can use
 CREATE TABLESPACE testspace LOCATION '@testtablespace@';
 
+-- try setting and resetting some properties for the new tablespace
+ALTER TABLESPACE testspace SET (random_page_cost = 1.0);
+ALTER TABLESPACE testspace SET (some_nonexistent_parameter = true);  -- fail
+ALTER TABLESPACE testspace RESET (random_page_cost = 2.0); -- fail
+ALTER TABLESPACE testspace RESET (random_page_cost, seq_page_cost); -- ok
+
 -- create a schema we can use
 CREATE SCHEMA testschema;
 
diff --git a/src/test/regress/output/tablespace.source b/src/test/regress/output/tablespace.source
index e57ad2b..79b12a8 100644
--- a/src/test/regress/output/tablespace.source
+++ b/src/test/regress/output/tablespace.source
@@ -1,5 +1,12 @@
 -- create a tablespace we can use
 CREATE TABLESPACE testspace LOCATION '@testtablespace@';
+-- try setting and resetting some properties for the new tablespace
+ALTER TABLESPACE testspace SET (random_page_cost = 1.0);
+ALTER TABLESPACE testspace SET (some_nonexistent_parameter = true);  -- fail
+ERROR:  unrecognized parameter "some_nonexistent_parameter"
+ALTER TABLESPACE testspace RESET (random_page_cost = 2.0); -- fail
+ERROR:  RESET must not include values for parameters
+ALTER TABLESPACE testspace RESET (random_page_cost, seq_page_cost); -- ok
 -- create a schema we can use
 CREATE SCHEMA testschema;
 -- try a table
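
The fallback rule in get_tablespace_page_costs above (use the per-tablespace value when one is set, otherwise the global setting) can be sketched in isolation. This is only an illustration under assumptions: SketchOpts, sketch_page_costs and the guc_* variables are hypothetical stand-ins, not names from the patch, and the tablespace cache lookup is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the global planner settings (documented defaults). */
static double guc_random_page_cost = 4.0;
static double guc_seq_page_cost = 1.0;

/* Hypothetical, simplified options struct; a negative value means "unset". */
typedef struct SketchOpts
{
	double		random_page_cost;
	double		seq_page_cost;
} SketchOpts;

/*
 * Fill in whichever outputs the caller asked for.  Either output pointer
 * may be NULL (but not both), and opts may be NULL when the tablespace
 * has no options at all.
 */
static void
sketch_page_costs(const SketchOpts *opts, double *random_out, double *seq_out)
{
	assert(random_out != NULL || seq_out != NULL);

	if (random_out)
	{
		if (!opts || opts->random_page_cost < 0)
			*random_out = guc_random_page_cost;		/* fall back to global */
		else
			*random_out = opts->random_page_cost;	/* per-tablespace value */
	}

	if (seq_out)
	{
		if (!opts || opts->seq_page_cost < 0)
			*seq_out = guc_seq_page_cost;
		else
			*seq_out = opts->seq_page_cost;
	}
}
```

With options {1.1, -1.0}, a call that asks for both costs gets 1.1 for random fetches and the global 1.0 for sequential ones; with no options at all it gets the global values.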

#27

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Tom Lane (#24)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Mon, Jan 4, 2010 at 1:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> My only objection to that is that if we're going to add attoptions
>> also, I'd like to get this committed first before I start working on
>> that, and we're running short on time. If you can commit his patch in
>> the next day or two, then I am fine with rebasing mine afterwards, but
>> if it needs more work than that then I would prefer to commit mine so
>> I can move on. Is that reasonable?
>
> Fair enough --- if I can't get it done today I will let you know and
> hold off.

OK. I just took a really fast look at the bki patch and it looks
pretty nice, so I hope you're able to get it in. Of course, I'm biased
because it's based on earlier work of my own, but biased != wrong.
:-)

A lot more work will need to be done to escape the insanity that is
our current method of handling system catalogs, but this seems like a
good step in the right direction.

I also observe that it applies cleanly over my current spcoptions
branch, so the merge conflicts might be a non-issue.

...Robert

#28

John Naylor

jcnaylor@gmail.com

about 16 years ago

In reply to: Tom Lane (#22)

1 attachment(s)

Re: patch - per-tablespace random_page_cost/seq_page_cost

Tom,

It seems I introduced a couple of errors in src/tools/msvc/clean.bat in
the bki patch. I'm attaching a cumulative fix. I can resend the
complete updated patch, if you like...

Sorry! :-)
John

> I'm planning to go look at Naylor's bki refactoring patch now. Assuming
> there isn't any showstopper problem with that, do you object to it
> getting committed first? Either order is going to create a merge
> problem, but it seems like we'd be best off to get Naylor's patch in
> so people can resync affected patches before the January commitfest
> starts.

Attachments:

bki_revamp_msvc_clean.patchtext/x-patch; charset=US-ASCII; name=bki_revamp_msvc_clean.patchDownload

diff --git a/src/tools/msvc/clean.bat b/src/tools/msvc/clean.bat
index 8cce31e..1d3ea65 100755
*** a/src/tools/msvc/clean.bat
--- b/src/tools/msvc/clean.bat
*************** REM Delete files created with GenerateFi
*** 20,35 ****
  if exist src\include\pg_config.h del /q src\include\pg_config.h
  if exist src\include\pg_config_os.h del /q src\include\pg_config_os.h
  if %DIST%==1 if exist src\backend\parser\gram.h del /q src\backend\parser\gram.h
! if exist src\include\utils\fmgroids.h del /q src\include\utils\fmgroids.h
! if exist src\include\catalog\schemapg.h del /q src\include\catalog\schemapg.h
  if exist src\include\utils\probes.h del /q src\include\utils\probes.h
  
- if %DIST%==1 if exist src\backend\utils\fmgroids.h del /q src\backend\utils\fmgroids.h
  if %DIST%==1 if exist src\backend\utils\fmgrtab.c del /q src\backend\utils\fmgrtab.c
  if %DIST%==1 if exist src\backend\catalog\postgres.bki del /q src\backend\catalog\postgres.bki
  if %DIST%==1 if exist src\backend\catalog\postgres.description del /q src\backend\catalog\postgres.description
  if %DIST%==1 if exist src\backend\catalog\postgres.shdescription del /q src\backend\catalog\postgres.shdescription
- if %DIST%==1 if exist src\backend\catalog\schemapg.h del /q src\backend\catalog\schemapg.h
  if %DIST%==1 if exist src\backend\parser\scan.c del /q src\backend\parser\scan.c
  if %DIST%==1 if exist src\backend\parser\gram.c del /q src\backend\parser\gram.c
  if %DIST%==1 if exist src\backend\bootstrap\bootscanner.c del /q src\backend\bootstrap\bootscanner.c
--- 20,33 ----
  if exist src\include\pg_config.h del /q src\include\pg_config.h
  if exist src\include\pg_config_os.h del /q src\include\pg_config_os.h
  if %DIST%==1 if exist src\backend\parser\gram.h del /q src\backend\parser\gram.h
! if %DIST%==1 if exist src\backend\utils\fmgroids.h del /q src\backend\utils\fmgroids.h
! if %DIST%==1 if exist src\backend\catalog\schemapg.h del /q src\backend\catalog\schemapg.h
  if exist src\include\utils\probes.h del /q src\include\utils\probes.h
  
  if %DIST%==1 if exist src\backend\utils\fmgrtab.c del /q src\backend\utils\fmgrtab.c
  if %DIST%==1 if exist src\backend\catalog\postgres.bki del /q src\backend\catalog\postgres.bki
  if %DIST%==1 if exist src\backend\catalog\postgres.description del /q src\backend\catalog\postgres.description
  if %DIST%==1 if exist src\backend\catalog\postgres.shdescription del /q src\backend\catalog\postgres.shdescription
  if %DIST%==1 if exist src\backend\parser\scan.c del /q src\backend\parser\scan.c
  if %DIST%==1 if exist src\backend\parser\gram.c del /q src\backend\parser\gram.c
  if %DIST%==1 if exist src\backend\bootstrap\bootscanner.c del /q src\backend\bootstrap\bootscanner.c

#29

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Tom Lane (#24)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Mon, Jan 4, 2010 at 1:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> My only objection to that is that if we're going to add attoptions
>> also, I'd like to get this committed first before I start working on
>> that, and we're running short on time. If you can commit his patch in
>> the next day or two, then I am fine with rebasing mine afterwards, but
>> if it needs more work than that then I would prefer to commit mine so
>> I can move on. Is that reasonable?
>
> Fair enough --- if I can't get it done today I will let you know and
> hold off.

OK, so since you got this done, I'm going to go ahead and rebase &
commit mine today, after a final read-through or two, unless you or
anyone else wants to insert some last-minute objections?

...Robert

#30

Robert Haas

robertmhaas@gmail.com

about 16 years ago

In reply to: Robert Haas (#29)

1 attachment(s)

Re: patch - per-tablespace random_page_cost/seq_page_cost

On Tue, Jan 5, 2010 at 10:17 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Jan 4, 2010 at 1:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> My only objection to that is that if we're going to add attoptions
>>> also, I'd like to get this committed first before I start working on
>>> that, and we're running short on time. If you can commit his patch in
>>> the next day or two, then I am fine with rebasing mine afterwards, but
>>> if it needs more work than that then I would prefer to commit mine so
>>> I can move on. Is that reasonable?
>>
>> Fair enough --- if I can't get it done today I will let you know and
>> hold off.
>
> OK, so since you got this done, I'm going to go ahead and rebase &
> commit mine today, after a final read-through or two, unless you or
> anyone else wants to insert some last-minute objections?

I committed this, but then in looking some things over further today,
I realized that I seem to have done something stupid - namely, not
adding a varlena header to TableSpaceOpts. I believe that the
attached patch is needed to fix the problem.

(I am not quite sure why we are using bytea here since AFAICS we don't
actually store parsed reloptions structures in any kind of persistent
storage, but clearly overwriting the first four bytes of
random_page_cost with a varlena header is no good.)

...Robert

Attachments:

spcoptions-fix.patchtext/x-patch; charset=US-ASCII; name=spcoptions-fix.patchDownload

diff --git a/src/include/commands/tablespace.h b/src/include/commands/tablespace.h
index b973450..cf005ee 100644
--- a/src/include/commands/tablespace.h
+++ b/src/include/commands/tablespace.h
@@ -34,6 +34,7 @@ typedef struct xl_tblspc_drop_rec
 
 typedef struct TableSpaceOpts
 {
+	int32		vl_len_;		/* varlena header (do not touch directly!) */
 	float8		random_page_cost;
 	float8		seq_page_cost;
 } TableSpaceOpts;
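
The problem described in the message above, a 4-byte length word landing on top of the first option field, can be reproduced outside the backend with a small layout experiment. This is a sketch under assumptions: OptsNoHeader, OptsWithHeader and stamp_length are hypothetical names, and stamp_length only imitates "write a 4-byte length at the start of the buffer"; it does not reproduce the real varlena header encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout without any space reserved for a length word. */
typedef struct OptsNoHeader
{
	double		random_page_cost;
	double		seq_page_cost;
} OptsNoHeader;

/* Layout with a 4-byte length word in front, as in the fix. */
typedef struct OptsWithHeader
{
	int32_t		vl_len_;
	double		random_page_cost;
	double		seq_page_cost;
} OptsWithHeader;

/* Hypothetical stand-in: stamp a 4-byte length at the start of a buffer. */
static void
stamp_length(void *buf, int32_t len)
{
	assert(buf != NULL);
	memcpy(buf, &len, sizeof(len));
}

/* Returns 1 if stamping the length alters random_page_cost (no header). */
static int
clobbers_without_header(void)
{
	OptsNoHeader before = {4.0, 1.0};
	OptsNoHeader after = before;

	stamp_length(&after, (int32_t) sizeof(after));
	return memcmp(&before.random_page_cost, &after.random_page_cost,
				  sizeof(double)) != 0;
}

/* Returns 1 if stamping the length alters either cost (with header). */
static int
clobbers_with_header(void)
{
	OptsWithHeader before = {0, 4.0, 1.0};
	OptsWithHeader after = before;

	stamp_length(&after, (int32_t) sizeof(after));
	return before.random_page_cost != after.random_page_cost ||
		before.seq_page_cost != after.seq_page_cost;
}
```

In the first layout the length word overlaps the leading bytes of random_page_cost, so the stored value changes; in the second the length lands in vl_len_ and both costs survive.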