hash partitioning based on v10Beta2

Started by yangjie@highgo.comover 8 years ago6 messages

over 8 years ago

1 attachment(s)

Hi all,

Now we have had the range / list partition, but hash partitioning is not implemented yet.
Attached is a POC patch based on the v10Beta2 to add the hash partitioning feature.
Although we will need more discussions about the syntax and other specifications before going ahead the project,
but I think this runnable code might help to discuss what and how we implement this.

Description

The hash partition's implement is on the basis of the original range / list partition,and using similar syntax.

To create a partitioned table ,use:

CREATE TABLE h (id int) PARTITION BY HASH(id);

The partitioning key supports only one value, and I think the partition key can support multiple values,
which may be difficult to implement when querying, but it is not impossible.

A partition table can be create as bellow:

CREATE TABLE h1 PARTITION OF h;
CREATE TABLE h2 PARTITION OF h;
CREATE TABLE h3 PARTITION OF h;

FOR VALUES clause cannot be used, and the partition bound is calclulated automatically as partition index of single integer value.

An inserted record is stored in a partition whose index equals
DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts /* Number of partitions */
;
In the above example, this is DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, id)) % 3;

postgres=# insert into h select generate_series(1,20);
INSERT 0 20
postgres=# select tableoid::regclass,* from h;
tableoid | id
----------+----
h1 | 3
h1 | 5
h1 | 17
h1 | 19
h2 | 2
h2 | 6
h2 | 7
h2 | 11
h2 | 12
h2 | 14
h2 | 15
h2 | 18
h2 | 20
h3 | 1
h3 | 4
h3 | 8
h3 | 9
h3 | 10
h3 | 13
h3 | 16
(20 rows)

The number of partitions here can be dynamically added, and if a new partition is created, the number of partitions changes, the calculated target partitions will change, and the same data is not reasonable in different partitions,So you need to re-calculate the existing data and insert the target partition when you create a new partition.

postgres=# create table h4 partition of h;
CREATE TABLE
postgres=# select tableoid::regclass,* from h;
tableoid | id
----------+----
h1 | 5
h1 | 17
h1 | 19
h1 | 6
h1 | 12
h1 | 8
h1 | 13
h2 | 11
h2 | 14
h3 | 1
h3 | 9
h3 | 2
h3 | 15
h4 | 3
h4 | 7
h4 | 18
h4 | 20
h4 | 4
h4 | 10
h4 | 16
(20 rows)

When querying the data, the hash partition uses the same algorithm as the insertion, and filters out the table that does not need to be scanned.

postgres=# explain analyze select * from h where id = 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Append (cost=0.00..41.88 rows=13 width=4) (actual time=0.020..0.023 rows=1 loops=1)
-> Seq Scan on h3 (cost=0.00..41.88 rows=13 width=4) (actual time=0.013..0.016 rows=1 loops=1)
Filter: (id = 1)
Rows Removed by Filter: 3
Planning time: 0.346 ms
Execution time: 0.061 ms
(6 rows)

postgres=# explain analyze select * from h where id in (1,5);;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Append (cost=0.00..83.75 rows=52 width=4) (actual time=0.016..0.028 rows=2 loops=1)
-> Seq Scan on h1 (cost=0.00..41.88 rows=26 width=4) (actual time=0.015..0.018 rows=1 loops=1)
Filter: (id = ANY ('{1,5}'::integer[]))
Rows Removed by Filter: 6
-> Seq Scan on h3 (cost=0.00..41.88 rows=26 width=4) (actual time=0.005..0.007 rows=1 loops=1)
Filter: (id = ANY ('{1,5}'::integer[]))
Rows Removed by Filter: 3
Planning time: 0.720 ms
Execution time: 0.074 ms
(9 rows)

postgres=# explain analyze select * from h where id = 1 or id = 5;;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Append (cost=0.00..96.50 rows=50 width=4) (actual time=0.017..0.078 rows=2 loops=1)
-> Seq Scan on h1 (cost=0.00..48.25 rows=25 width=4) (actual time=0.015..0.019 rows=1 loops=1)
Filter: ((id = 1) OR (id = 5))
Rows Removed by Filter: 6
-> Seq Scan on h3 (cost=0.00..48.25 rows=25 width=4) (actual time=0.005..0.010 rows=1 loops=1)
Filter: ((id = 1) OR (id = 5))
Rows Removed by Filter: 3
Planning time: 0.396 ms
Execution time: 0.139 ms
(9 rows)

Can not detach / attach / drop partition table.

Best regards,
young

https://yonj1e.github.io/

Attachments:

hash_part_on_beta2_v1.patchapplication/octet-stream; name=hash_part_on_beta2_v1.patchDownload

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 011f2b9..a74081b 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1125,6 +1125,7 @@ relation_open(Oid relationId, LOCKMODE lockmode)
 	r = RelationIdGetRelation(relationId);
 
 	if (!RelationIsValid(r))
+
 		elog(ERROR, "could not open relation with OID %u", relationId);
 
 	/* Make note that we've accessed a temporary relation */
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 43b8924..8852b9c 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -48,6 +48,7 @@
 #include "utils/rel.h"
 #include "utils/ruleutils.h"
 #include "utils/syscache.h"
+#include "utils/typcache.h"
 
 /*
  * Information about bounds of a partitioned relation
@@ -89,6 +90,7 @@ typedef struct PartitionBoundInfoData
 								 * partitioned table) */
 	int			null_index;		/* Index of the null-accepting partition; -1
 								 * if there isn't one */
+	int			serial;
 } PartitionBoundInfoData;
 
 #define partition_bound_accepts_nulls(bi) ((bi)->null_index != -1)
@@ -114,6 +116,12 @@ typedef struct PartitionRangeBound
 	bool		lower;			/* this is the lower (vs upper) bound */
 } PartitionRangeBound;
 
+/* Internal representation of a hash partition bound */
+typedef struct PartitionHashBound
+{
+	int		serial;	/* hash partition serial number */
+} PartitionHashBound;
+
 static int32 qsort_partition_list_value_cmp(const void *a, const void *b,
 							   void *arg);
 static int32 qsort_partition_rbound_cmp(const void *a, const void *b,
@@ -149,6 +157,49 @@ static int partition_bound_bsearch(PartitionKey key,
 						PartitionBoundInfo boundinfo,
 						void *probe, bool probe_is_bound, bool *is_equal);
 
+/* check if the relation is a hash partition */
+bool is_hash_partition(Oid relid)
+{
+        return is_hash_partitioned(get_partition_parent(relid));
+}
+
+/* check if the relation is hash partitioned */
+bool is_hash_partitioned(Oid relid)
+{
+	Relation 	rel;
+	bool 		res = false;
+
+	rel = heap_open(relid, NoLock);
+	if (rel->rd_partkey)
+		if (rel->rd_partkey->strategy == 'h')
+			res = true;
+
+	heap_close(rel, NoLock);
+
+	return res;
+}
+
+/* 
+ * HashSerialGetPartition
+ *			get hash partition oid according to its index
+*/
+Oid 
+HashSerialGetPartition(Oid parentOid, int index)
+{
+	Relation			parent;
+	PartitionDesc		pdesc;
+	Oid			   	child;
+
+	parent = heap_open(parentOid, AccessShareLock);
+	pdesc = RelationGetPartitionDesc(parent);
+	
+	child = pdesc->oids[index];
+
+	heap_close(parent, AccessShareLock);
+
+	return child;
+}
+
 /*
  * RelationBuildPartitionDesc
  *		Form rel's partition descriptor
@@ -179,6 +230,9 @@ RelationBuildPartitionDesc(Relation rel)
 	/* Range partitioning specific */
 	PartitionRangeBound **rbounds = NULL;
 
+	 /* Hash partitiNG specific */
+	PartitionHashBound  **hashserials = NULL; 
+
 	/*
 	 * The following could happen in situations where rel has a pg_class entry
 	 * but not the pg_partitioned_table entry yet.
@@ -310,6 +364,10 @@ RelationBuildPartitionDesc(Relation rel)
 			qsort_arg(all_values, ndatums, sizeof(PartitionListValue *),
 					  qsort_partition_list_value_cmp, (void *) key);
 		}
+		else if (key->strategy == PARTITION_STRATEGY_HASH)
+		{
+			
+		}
 		else if (key->strategy == PARTITION_STRATEGY_RANGE)
 		{
 			int			j,
@@ -511,6 +569,15 @@ RelationBuildPartitionDesc(Relation rel)
 					break;
 				}
 
+			case PARTITION_STRATEGY_HASH:
+				{					
+					for (i = 0; i < nparts; i++)
+					{
+						mapping[i] = next_index++;
+					}
+					break;
+				}
+
 			case PARTITION_STRATEGY_RANGE:
 				{
 					boundinfo->content = (RangeDatumContent **) palloc(ndatums *
@@ -723,6 +790,13 @@ check_new_partition_bound(char *relname, Relation parent,
 				break;
 			}
 
+		case PARTITION_STRATEGY_HASH:
+			{
+				Assert(spec->strategy == PARTITION_STRATEGY_HASH);
+
+				break;
+			}
+
 		case PARTITION_STRATEGY_RANGE:
 			{
 				PartitionRangeBound *lower,
@@ -898,6 +972,11 @@ get_qual_from_partbound(Relation rel, Relation parent,
 			my_qual = get_qual_for_range(key, spec);
 			break;
 
+		case PARTITION_STRATEGY_HASH:
+			Assert(spec->strategy == PARTITION_STRATEGY_HASH);
+			my_qual = get_qual_for_range(key, spec);
+			break;
+
 		default:
 			elog(ERROR, "unexpected partition strategy: %d",
 				 (int) key->strategy);
@@ -1039,6 +1118,7 @@ RelationGetPartitionDispatchInfo(Relation rel, int lockmode,
 	/* Root partitioned table has no parent, so NULL for parent */
 	parted_rel_parents = list_make1(NULL);
 	APPEND_REL_PARTITION_OIDS(rel, all_parts, all_parents);
+
 	forboth(lc1, all_parts, lc2, all_parents)
 	{
 		Relation	partrel = heap_open(lfirst_oid(lc1), lockmode);
@@ -1909,7 +1989,7 @@ FormPartitionKeyDatum(PartitionDispatch pd,
 	if (partexpr_item != NULL)
 		elog(ERROR, "wrong number of partition key expressions");
 }
-
+	
 /*
  * get_partition_for_tuple
  *		Finds a leaf partition for tuple contained in *slot
@@ -1935,6 +2015,7 @@ get_partition_for_tuple(PartitionDispatch *pd,
 				result;
 	ExprContext *ecxt = GetPerTupleExprContext(estate);
 	TupleTableSlot *ecxt_scantuple_old = ecxt->ecxt_scantuple;
+	int		nparts;
 
 	/* start with the root partitioned table */
 	parent = pd[0];
@@ -2016,6 +2097,12 @@ get_partition_for_tuple(PartitionDispatch *pd,
 						cur_index = -1;
 					break;
 
+				case PARTITION_STRATEGY_HASH:
+					Assert(partdesc->nparts > 0);
+					nparts = partdesc->nparts;
+					cur_index = DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts;
+
+					break;
 				case PARTITION_STRATEGY_RANGE:
 
 					/*
@@ -2257,6 +2344,13 @@ partition_bound_cmp(PartitionKey key, PartitionBoundInfo boundinfo,
 													 *(Datum *) probe));
 			break;
 
+		case PARTITION_STRATEGY_HASH:
+			cmpval = DatumGetInt32(FunctionCall2Coll(&key->partsupfunc[0],
+													 key->partcollation[0],
+													 bound_datums[0],
+													 *(Datum *) probe));
+			break;
+
 		case PARTITION_STRATEGY_RANGE:
 			{
 				RangeDatumContent *content = boundinfo->content[offset];
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index bb00858..4e00a60 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -475,6 +475,11 @@ static ObjectAddress ATExecAttachPartition(List **wqueue, Relation rel,
 					  PartitionCmd *cmd);
 static ObjectAddress ATExecDetachPartition(Relation rel, RangeVar *name);
 
+static void ReInsertHashPartition(Relation rel, Oid parentId, Node *bound);
+
+static void InsertIntoTable(Oid relid, HeapTuple tuple, TupleDesc tupdesc);
+static TupleTableSlot *ExecInsertPartition(TupleTableSlot *slot, TupleTableSlot *planSlot, 
+							EState *estate, bool canSetTag);
 
 /* ----------------------------------------------------------------
  *		DefineRelation
@@ -765,6 +770,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 	/* Process and store partition bound, if any. */
 	if (stmt->partbound)
 	{
+
 		PartitionBoundSpec *bound;
 		ParseState *pstate;
 		Oid			parentId = linitial_oid(inheritOids);
@@ -799,6 +805,9 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		/* Update the pg_class entry. */
 		StorePartitionBound(rel, parent, bound);
 
+		/*judge if create new hash partition, if so, re_insert data into all partitions*/
+		ReInsertHashPartition(rel, parentId, (Node *)bound);
+
 		heap_close(parent, NoLock);
 
 		/*
@@ -872,6 +881,210 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 	return address;
 }
 
+/* 
+ * ReInsertHashPartition
+ * 		when create a new hash partition, re_insert data into all partitions
+ */
+static void 
+ReInsertHashPartition(Relation rel, Oid parentId, Node *bound)
+{
+	Relation			parent;
+	PartitionKey		key;
+	PartitionDesc		pdesc;
+	int				nparts;
+
+	parent = heap_open(parentId, ExclusiveLock);
+	key = RelationGetPartitionKey(parent);
+	pdesc = RelationGetPartitionDesc(parent);
+
+	nparts = pdesc->nparts + 1;
+
+	if (key->strategy == PARTITION_STRATEGY_HASH)
+	{
+		int i;
+		Oid	*children = pdesc->oids;
+
+		/*get data from every partition and re_insert*/
+		for (i = 0; i < pdesc->nparts; i++)
+		{
+			Oid child = children[i];
+			Relation hash_rel;
+			HeapScanDesc hash_scan;
+			HeapTuple 	hash_tup;
+			Snapshot	snapshot;
+			Datum 	val;
+			bool		isnull;
+			int   	index;
+			Oid 	insertRel;
+
+			hash_rel = heap_open(child, ExclusiveLock);
+			snapshot = RegisterSnapshot(GetLatestSnapshot());
+	  		hash_scan = heap_beginscan(hash_rel, snapshot, 0, NULL);
+			
+			/*only partition table has tuples then re_insert*/
+			while ((hash_tup = heap_getnext(hash_scan, ForwardScanDirection)) != NULL)
+			{
+				/*get the key's data of the tuple*/
+				val = fastgetattr(hash_tup, key->partattrs[0], hash_rel->rd_att, &isnull);
+				
+				/*get the partition's index of the key*/
+				index = DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, val)) % nparts;
+				
+				/*get partition's oid that match the key's data*/
+				insertRel = HashSerialGetPartition(parentId, index);
+
+				if (index == nparts - 1)
+				{
+					InsertIntoTable(RelationGetRelid(rel), hash_tup, parent->rd_att);
+					simple_heap_delete(hash_rel, &hash_tup->t_self);
+				}
+				else
+				{
+					InsertIntoTable(insertRel, hash_tup, parent->rd_att);
+					simple_heap_delete(hash_rel, &hash_tup->t_self);
+				}
+			}
+
+			heap_endscan(hash_scan);
+			UnregisterSnapshot(snapshot);
+			heap_close(hash_rel, ExclusiveLock);
+		}		
+	}
+
+	heap_close(parent, ExclusiveLock);	
+}
+
+/*
+ * InsertIntoTable
+ *		Find a table relation given it's name and insert the
+ *		tuple on it (updating the indexes and calling the triggers)
+ */
+static void 
+InsertIntoTable(Oid relid, HeapTuple tuple, TupleDesc tupdesc)
+{
+	ResultRelInfo	*resultRelInfo;
+	TupleTableSlot	*slot;
+	EState 		*estate = CreateExecutorState();
+	Relation 		relation;
+
+	/* Lookup the relation */
+	if (relid == InvalidOid)
+		elog(ERROR, "partition_insert_trigger: Invalid child table %s", get_rel_name(relid));
+	relation = RelationIdGetRelation(relid);
+	if (relation == NULL)
+		elog(ERROR, "partition_insert_trigger: Failed to locate relation for child table %s", get_rel_name(relid));
+
+	/*
+	 * We need a ResultRelInfo so we can use the regular executor's
+	 * index-entry-making machinery.
+	 */
+	resultRelInfo = makeNode(ResultRelInfo);
+	resultRelInfo->ri_RangeTableIndex = 1;		/* dummy */
+	resultRelInfo->ri_RelationDesc = relation;
+	resultRelInfo->ri_TrigDesc = CopyTriggerDesc(relation->trigdesc);
+	if (resultRelInfo->ri_TrigDesc)
+	{
+		resultRelInfo->ri_TrigFunctions = (FmgrInfo *)
+			palloc0(resultRelInfo->ri_TrigDesc->numtriggers * sizeof(FmgrInfo));
+		resultRelInfo->ri_TrigWhenExprs = (List **)
+			palloc0(resultRelInfo->ri_TrigDesc->numtriggers * sizeof(List *));
+	}
+	resultRelInfo->ri_TrigInstrument = NULL;
+	ExecOpenIndices(resultRelInfo, false);
+	estate->es_result_relations = resultRelInfo;
+	estate->es_num_result_relations = 1;
+	estate->es_result_relation_info = resultRelInfo;
+	/*added for GetModifiedColumns in ExecConstraints functions*/
+	estate->es_range_table = list_make1(relation);
+
+	/* Set up a tuple slot too */
+	slot = ExecInitExtraTupleSlot(estate);
+	ExecSetSlotDescriptor(slot, tupdesc);
+	/* Triggers might need a slot as well */
+	estate->es_trig_tuple_slot = ExecInitExtraTupleSlot(estate);
+
+	ExecStoreTuple(tuple, slot, InvalidBuffer, false);
+	ExecInsertPartition(slot, slot, estate, false);
+
+	/* Free resources */
+	ExecResetTupleTable(estate->es_tupleTable, false);
+	ExecCloseIndices(resultRelInfo);
+	FreeExecutorState(estate);
+	RelationClose(relation);
+}
+
+/*
+ * ExecInsertPartition:
+ *		For INSERT, we have to insert the tuple into the target relation
+ *		and insert appropriate tuples into the index relations.
+ */
+static TupleTableSlot *
+ExecInsertPartition(TupleTableSlot *slot, TupleTableSlot *planSlot, EState *estate, bool canSetTag)
+{
+	HeapTuple		tuple;
+	ResultRelInfo 	*resultRelInfo;
+	Relation		resultRelationDesc;
+	Oid			newId;
+	List	   		*recheckIndexes = NIL;
+
+	/*
+	 * get the heap tuple out of the tuple table slot, making sure we have a
+	 * writable copy
+	 */
+	tuple = ExecMaterializeSlot(slot);
+
+	/* get information on the (current) result relation*/
+	resultRelInfo = estate->es_result_relation_info;
+	resultRelationDesc = resultRelInfo->ri_RelationDesc;
+
+	if (resultRelationDesc->rd_rel->relhasoids)
+		HeapTupleSetOid(tuple, InvalidOid);
+
+	/* BEFORE ROW INSERT Triggers */
+	if (resultRelInfo->ri_TrigDesc &&
+			resultRelInfo->ri_TrigDesc->trig_insert_before_row)
+	{
+		slot = ExecBRInsertTriggers(estate, resultRelInfo, slot);
+		if (slot == NULL)
+			return NULL;
+		/* trigger might have changed tuple */
+		tuple = ExecMaterializeSlot(slot);
+	}
+
+	/* INSTEAD OF ROW INSERT Triggers */
+	if (resultRelInfo->ri_TrigDesc &&
+			resultRelInfo->ri_TrigDesc->trig_insert_instead_row)
+	{
+		slot = ExecIRInsertTriggers(estate, resultRelInfo, slot);
+		if (slot == NULL)
+			return NULL;
+		/* trigger might have changed tuple */
+		tuple = ExecMaterializeSlot(slot);
+		newId = InvalidOid;
+	}
+	else
+	{
+		/* Check the constraints of the tuple*/
+		if (resultRelationDesc->rd_att->constr)
+			ExecConstraints(resultRelInfo, slot, estate);
+		newId = heap_insert(resultRelationDesc, tuple,
+								estate->es_output_cid, 0, NULL);
+	}
+
+	if (canSetTag)
+	{
+		(estate->es_processed)++;
+		estate->es_lastoid = newId;
+		setLastTid(&(tuple->t_self));
+	}
+
+	/* AFTER ROW INSERT Triggers */
+	ExecARInsertTriggers(estate, resultRelInfo, tuple, recheckIndexes, NULL);
+	list_free(recheckIndexes);
+
+	return NULL;
+}
+
 /*
  * Emit the right error or warning message for a "DROP" command issued on a
  * non-existent relation
@@ -1166,6 +1379,15 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 			LockRelationOid(state->heapOid, heap_lockmode);
 	}
 
+	if (is_partition && relOid != oldRelOid)
+	{
+		if(is_hash_partition(relOid))
+			ereport(ERROR,
+				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+				 errmsg("hash partition \"%s\" can not be dropped",
+						get_rel_name(relOid))));
+	}
+
 	/*
 	 * Similarly, if the relation is a partition, we must acquire lock on its
 	 * parent before locking the partition.  That's because queries lock the
@@ -1745,6 +1967,7 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
 		 * exclusive lock on the parent because its partition descriptor will
 		 * be changed by addition of the new partition.
 		 */
+
 		if (!is_partition)
 			relation = heap_openrv(parent, ShareUpdateExclusiveLock);
 		else
@@ -13146,18 +13369,29 @@ transformPartitionSpec(Relation rel, PartitionSpec *partspec, char *strategy)
 	ParseState *pstate;
 	RangeTblEntry *rte;
 	ListCell   *l;
-
+	
 	newspec = makeNode(PartitionSpec);
 
 	newspec->strategy = partspec->strategy;
 	newspec->partParams = NIL;
 	newspec->location = partspec->location;
-
+	
 	/* Parse partitioning strategy name */
 	if (pg_strcasecmp(partspec->strategy, "list") == 0)
 		*strategy = PARTITION_STRATEGY_LIST;
 	else if (pg_strcasecmp(partspec->strategy, "range") == 0)
 		*strategy = PARTITION_STRATEGY_RANGE;
+	else if (pg_strcasecmp(partspec->strategy, "hash") == 0)
+	{
+		if (list_length(partspec->partParams) == 1)
+		{
+			*strategy = PARTITION_STRATEGY_HASH;
+		}
+		else
+		{
+			elog(ERROR, "The hash partition does not support multiple key values.");
+		}
+	}
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -13434,6 +13668,13 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd)
 	ObjectAddress address;
 	const char *trigger_name;
 
+	/*hash partition do not support attach operation*/
+	if (rel->rd_partkey)
+		if (rel->rd_partkey->strategy == 'h')
+			ereport(ERROR,
+				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+				 errmsg("hash partition do not support attach operation")));
+
 	attachRel = heap_openrv(cmd->name, AccessExclusiveLock);
 
 	/*
@@ -13771,6 +14012,13 @@ ATExecDetachPartition(Relation rel, RangeVar *name)
 				new_repl[Natts_pg_class];
 	ObjectAddress address;
 
+	/*hash partition do not support detach operation*/
+	if (rel->rd_partkey)
+		if (rel->rd_partkey->strategy == 'h')
+			ereport(ERROR,
+				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+				 errmsg("hash partition do not support detach operation")));
+
 	partRel = heap_openrv(name, AccessShareLock);
 
 	/* All inheritance related checks are performed within the function */
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 0f08283..efd66bc 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -3228,6 +3228,7 @@ ExecSetupPartitionTupleRouting(Relation rel,
 	/* Get the tuple-routing information and lock partitions */
 	*pd = RelationGetPartitionDispatchInfo(rel, RowExclusiveLock, num_parted,
 										   &leaf_parts);
+
 	*num_partitions = list_length(leaf_parts);
 	*partitions = (ResultRelInfo *) palloc(*num_partitions *
 										   sizeof(ResultRelInfo));
@@ -3317,6 +3318,7 @@ ExecFindPartition(ResultRelInfo *resultRelInfo, PartitionDispatch *pd,
 
 	result = get_partition_for_tuple(pd, slot, estate,
 									 &failed_at, &failed_slot);
+
 	if (result < 0)
 	{
 		Relation	failed_rel;
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 8d17425..271b0f6 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -1919,6 +1919,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 									   &partition_tupconv_maps,
 									   &partition_tuple_slot,
 									   &num_parted, &num_partitions);
+
 		mtstate->mt_partition_dispatch_info = partition_dispatch_info;
 		mtstate->mt_num_dispatch = num_parted;
 		mtstate->mt_partitions = partitions;
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 67ac814..b86bfe6 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -24,6 +24,7 @@
 
 #include "miscadmin.h"
 #include "nodes/extensible.h"
+#include "nodes/parsenodes.h"
 #include "nodes/plannodes.h"
 #include "nodes/relation.h"
 #include "utils/datum.h"
@@ -4448,6 +4449,7 @@ _copyPartitionBoundSpec(const PartitionBoundSpec *from)
 	COPY_NODE_FIELD(listdatums);
 	COPY_NODE_FIELD(lowerdatums);
 	COPY_NODE_FIELD(upperdatums);
+	COPY_SCALAR_FIELD(hashnumber);
 	COPY_LOCATION_FIELD(location);
 
 	return newnode;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index b0abe9e..73d09cc 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -3565,6 +3565,7 @@ _outPartitionBoundSpec(StringInfo str, const PartitionBoundSpec *node)
 	WRITE_NODE_FIELD(listdatums);
 	WRITE_NODE_FIELD(lowerdatums);
 	WRITE_NODE_FIELD(upperdatums);
+	WRITE_INT_FIELD(hashnumber);
 	WRITE_LOCATION_FIELD(location);
 }
 
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 1380703..74391a7 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -2377,6 +2377,7 @@ _readPartitionBoundSpec(void)
 	READ_NODE_FIELD(listdatums);
 	READ_NODE_FIELD(lowerdatums);
 	READ_NODE_FIELD(upperdatums);
+	READ_INT_FIELD(hashnumber);
 	READ_LOCATION_FIELD(location);
 
 	READ_DONE();
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index dc0b0b0..6333e43 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -28,7 +28,8 @@
 #include "catalog/dependency.h"
 #include "catalog/heap.h"
 #include "catalog/partition.h"
-#include "catalog/pg_am.h"
+#include "catalog/pg_am.h" 
+#include "catalog/pg_partitioned_table.h"
 #include "catalog/pg_statistic_ext.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
@@ -48,7 +49,7 @@
 #include "utils/syscache.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
-
+#include "utils/typcache.h"
 
 /* GUC parameter */
 int			constraint_exclusion = CONSTRAINT_EXCLUSION_PARTITION;
@@ -69,6 +70,10 @@ static List *build_index_tlist(PlannerInfo *root, IndexOptInfo *index,
 				  Relation heapRelation);
 static List *get_relation_statistics(RelOptInfo *rel, Relation relation);
 
+static int get_hash_part_attrs(Oid parentid);
+static char get_hash_part_strategy(Oid parentid);
+static int get_hash_part_number(Oid parentid);
+
 /*
  * get_relation_info -
  *	  Retrieves catalog information for a given relation.
@@ -1441,9 +1446,376 @@ relation_excluded_by_constraints(PlannerInfo *root,
 	if (predicate_refuted_by(safe_constraints, rel->baserestrictinfo, false))
 		return true;
 
+	/* hash partition constraint exclusion. */
+	if (NIL != root->append_rel_list)
+	{
+		Node		*parent = NULL;
+		parent = (Node*)linitial(root->append_rel_list);
+
+		if ((nodeTag(parent) == T_AppendRelInfo) 
+			&& get_hash_part_strategy(((AppendRelInfo*)parent)->parent_reloid) == PARTITION_STRATEGY_HASH 
+			&& (root->parse->jointree->quals != NULL))
+		{
+			int	cur_index = -1, in_index = -1, bool_index = -1;
+			Oid		poid;
+			Relation	relation;
+			PartitionKey	key;
+			RelabelType	*rt;
+
+			poid = ((AppendRelInfo*)parent)->parent_reloid;
+			relation = RelationIdGetRelation(poid);
+			key = RelationGetPartitionKey(relation);
+
+			heap_close(relation, NoLock);
+	
+			BoolExpr *boolexpr;
+			OpExpr 		*subopexpr;
+			Node		*leftSubOp, *rightSubOp;
+			Node		*constExprInSub, *varExprInSub;
+			Const 		*subc;
+			Var 		*subv;
+			int		subpartattr;
+			int			substrat;
+			TypeCacheEntry		*subtce;
+
+			OpExpr		*predicateExpr;
+			Node 		*predicate;
+			List		*list;
+			Node		*leftPredicateOp, *rightPredicateOp;
+			Node		*constExprInPredicate, *varExprInPredicate;
+			Const 		*c;
+			Var 		*v;
+			int		partattr;
+			int			strat;
+			TypeCacheEntry		*tce;
+
+			ScalarArrayOpExpr 	*arrayexpr;
+			Node			*arraynode;
+
+			predicate = rel->baserestrictinfo;
+
+			list = (List *)predicate;
+			
+			if (list_length(list) != 1)
+				return false;
+
+			predicate = linitial(list);
+			if (IsA(predicate, RestrictInfo))
+			{
+				RestrictInfo *info = (RestrictInfo*)predicate;
+				predicate = (Node*)info->clause;
+			}
+
+			switch (nodeTag(predicate))
+			{
+				/* AND, OR, NOT expressions */
+				case T_BoolExpr:
+					
+					boolexpr = (BoolExpr *) predicate;
+					bool 		subflag = true;
+
+					foreach (lc, boolexpr->args)
+					{
+						Expr	*arg = (Expr *) lfirst(lc);
+						OpExpr	*oparg = (OpExpr*)lfirst(lc);
+
+						leftSubOp = get_leftop((Expr*)arg);
+						rightSubOp = get_rightop((Expr*)arg);
+						int		*subidx;
+
+						/* check if one operand is a constant */
+						if (IsA(rightSubOp, Const))
+						{
+							varExprInSub = leftSubOp;
+							constExprInSub = rightSubOp;
+						}
+						else if (IsA(leftSubOp, Const))
+						{
+							constExprInSub = leftSubOp;
+							varExprInSub = rightSubOp;
+						}
+						else
+						{
+							return false;
+						}
+
+						subtce = lookup_type_cache(key->parttypid[0], TYPECACHE_BTREE_OPFAMILY);
+						substrat = get_op_opfamily_strategy(oparg->opno, subtce->btree_opf);
+
+						subpartattr = get_hash_part_attrs(poid);
+						subv = (Var*)varExprInSub;
+
+						switch (boolexpr->boolop)
+						{
+							case OR_EXPR:
+								if (subv->varattno == subpartattr && substrat == BTEqualStrategyNumber)
+								{
+									subc = (Const*)constExprInSub;
+									bool_index = DatumGetUInt32(OidFunctionCall1(
+											lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, 
+											subc->constvalue)) 
+											% list_length(root->append_rel_list);
+									
+									subidx = &bool_index;
+									if (subidx != NULL)
+									{
+										if (get_hash_part_number(rte->relid) == *subidx)
+										{
+											subflag = false;
+										}
+									
+									}
+								}
+								else
+								{
+									/* constraint exclusion in hash partition could not support <, > etc. */
+									return false;
+								}
+
+								break;	
+
+							case AND_EXPR:
+
+								break;
+
+							default:
+
+								break;
+						}
+					}
+
+					if (subflag)
+						return true;
+					else
+						return false;
+
+					break;
+
+				/* =, !=, <, > etc. */
+				case T_OpExpr:
+					predicateExpr = (OpExpr *)predicate;
+					leftPredicateOp = get_leftop((Expr*)predicate);
+					rightPredicateOp = get_rightop((Expr*)predicate);
+					
+					/* check if one operand is a constant */
+					if (IsA(rightPredicateOp, Const))
+					{
+						varExprInPredicate = leftPredicateOp;
+						constExprInPredicate = rightPredicateOp;
+					}
+					else if (IsA(leftPredicateOp, Const))
+					{
+						constExprInPredicate = leftPredicateOp;
+						varExprInPredicate = rightPredicateOp;
+					}
+					else
+					{
+						return false;
+					}
+
+					tce = lookup_type_cache(key->parttypid[0], TYPECACHE_BTREE_OPFAMILY);
+					strat = get_op_opfamily_strategy(predicateExpr->opno, tce->btree_opf);
+
+					partattr = get_hash_part_attrs(poid);
+					v = (Var*)varExprInPredicate;
+
+					/* If strat is "=", select one partiton */
+					if (v->varattno == partattr && strat == BTEqualStrategyNumber)
+					{
+						c = (Const*)constExprInPredicate;
+						cur_index = DatumGetUInt32(OidFunctionCall1(
+								lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, 
+								c->constvalue)) 
+								% list_length(root->append_rel_list);
+					}
+					else
+					{
+						/* constraint exclusion in hash partition could not support <, > etc. */
+						return false;
+					}
+			
+					if (get_hash_part_number(rte->relid) != cur_index && cur_index != -1) 
+					{
+						return true;
+					}
+
+					break;
+
+				/* IN expression */
+				case T_ScalarArrayOpExpr:
+					arrayexpr = (ScalarArrayOpExpr *)predicate;
+					arraynode = (Node *) lsecond(arrayexpr->args);
+					bool 		flag = true;
+
+					if (arraynode && IsA(arraynode, Const) &&
+						!((Const *) arraynode)->constisnull)
+					{
+						ArrayType  *arrayval;
+						int16		elemlen;
+						bool		elembyval;
+						char		elemalign;
+						int			num_elems;
+						Datum	   *elem_values;
+						bool	   *elem_nulls;
+						int			strategy = BTEqualStrategyNumber;
+						int		i;
+						int		*idx;
+						
+
+						/* Extract values from array */
+						arrayval = DatumGetArrayTypeP(((Const *) arraynode)->constvalue);
+						get_typlenbyvalalign(ARR_ELEMTYPE(arrayval),
+											 &elemlen, &elembyval, &elemalign);
+						deconstruct_array(arrayval,
+										  ARR_ELEMTYPE(arrayval),
+										  elemlen, elembyval, elemalign,
+										  &elem_values, &elem_nulls, &num_elems);
+
+						/* Construct OIDs list */
+						for (i = 0; i < num_elems; i++)
+						{
+							if (!elem_nulls[i])
+							{
+								/* Invoke base hash function for value type */
+								in_index = DatumGetUInt32(OidFunctionCall1(
+									lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, 
+									elem_values[i])) 
+									% list_length(root->append_rel_list);
+								idx = &in_index;
+	
+								if (idx != NULL)
+								{
+									if (get_hash_part_number(rte->relid) == *idx)
+									{
+										flag = false;
+									}
+									
+								}
+							}
+						}
+
+						/* Free resources */
+						pfree(elem_values);
+						pfree(elem_nulls);
+					}
+
+					if (flag)
+						return true;
+					else
+						return false;
+
+					break;
+
+				default:
+					elog(ERROR, "Unknown clause type.");
+			}
+
+		}
+
+		return false;
+	}
+
 	return false;
 }
 
+/*
+ * get_hash_part_attrs
+ *
+ */
+int
+get_hash_part_attrs(Oid parentid)
+{
+	Form_pg_partitioned_table	form;
+	HeapTuple		tuple;
+	int			result;
+
+	tuple = SearchSysCache1(PARTRELID, ObjectIdGetDatum(parentid));
+	
+	form = (Form_pg_partitioned_table)GETSTRUCT(tuple);
+	result = form->partattrs.values[0];
+
+	ReleaseSysCache(tuple);
+
+	return result;
+}
+
+/*
+ * get_hash_part_number
+ *
+ */
+int
+get_hash_part_number(Oid parentid)
+{
+	HeapTuple		tuple;
+	int		result;
+
+	tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(parentid));
+
+	Datum		boundDatum;
+	bool		isnull;
+	PartitionBoundSpec 	   *bound;
+
+	boundDatum = SysCacheGetAttr(RELOID, tuple,
+		Anum_pg_class_relpartbound,
+		&isnull);
+
+	ReleaseSysCache(tuple);
+
+	if (boundDatum)
+	{
+		bound = (PartitionBoundSpec *)(stringToNode(TextDatumGetCString(boundDatum)));
+		return	bound->hashnumber;
+	}
+
+	return result;
+}
+
+/*
+ * get_hash_part_strategy
+ *
+ */
+char
+get_hash_part_strategy(Oid parentid)
+{
+	Form_pg_partitioned_table	form;
+	HeapTuple		tuple;
+	char			result;
+
+	tuple = SearchSysCache1(PARTRELID, ObjectIdGetDatum(parentid));
+	if (!HeapTupleIsValid(tuple))
+	{
+		tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(parentid));
+
+		if (!HeapTupleIsValid(tuple))
+		{
+			return '0';
+		}
+		else
+		{
+			Datum		boundDatum;
+			bool		isnull;
+			PartitionBoundSpec 	   *bound;
+
+			boundDatum = SysCacheGetAttr(RELOID, tuple,
+				Anum_pg_class_relpartbound,
+				&isnull);
+
+			ReleaseSysCache(tuple);
+
+			if (boundDatum)
+			{
+				bound = (PartitionBoundSpec *)(stringToNode(TextDatumGetCString(boundDatum)));
+				return	bound->strategy;
+			}
+			return '0';
+		}
+	}
+	form = (Form_pg_partitioned_table)GETSTRUCT(tuple);
+	result = form->partstrat;
+	ReleaseSysCache(tuple);
+
+	return result;
+}
 
 /*
  * build_physical_tlist
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 0f3998f..1575cbe 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -2675,6 +2675,17 @@ ForValues:
 
 					$$ = n;
 				}
+			/* a HASH partition */
+			|  /*EMPTY*/
+				{
+					PartitionBoundSpec *n = makeNode(PartitionBoundSpec);
+
+					n->strategy = PARTITION_STRATEGY_HASH;
+					//n->hashnumber = 1;
+					//n->location = @3;
+
+					$$ = n;
+				}
 		;
 
 partbound_datum:
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index ee5f3a3..1742fb3 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -33,6 +33,7 @@
 #include "catalog/heap.h"
 #include "catalog/index.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_constraint.h"
@@ -3299,6 +3300,8 @@ transformPartitionBound(ParseState *pstate, Relation parent,
 	char		strategy = get_partition_strategy(key);
 	int			partnatts = get_partition_natts(key);
 	List	   *partexprs = get_partition_exprs(key);
+	PartitionDesc	pdesc;
+	pdesc = RelationGetPartitionDesc(parent);
 
 	/* Avoid scribbling on input */
 	result_spec = copyObject(spec);
@@ -3359,6 +3362,28 @@ transformPartitionBound(ParseState *pstate, Relation parent,
 											  value);
 		}
 	}
+	else if (strategy == PARTITION_STRATEGY_HASH)
+	{
+		char	   *colname;
+
+		if (spec->strategy != PARTITION_STRATEGY_HASH)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+					 errmsg("invalid bound specification for a hash partition"),
+					 parser_errposition(pstate, exprLocation((Node *) spec))));
+
+		/* Get the only column's name in case we need to output an error */
+		if (key->partattrs[0] != 0)
+			colname = get_relid_attribute_name(RelationGetRelid(parent),
+											   key->partattrs[0]);
+		else
+			colname = deparse_expression((Node *) linitial(partexprs),
+										 deparse_context_for(RelationGetRelationName(parent),
+															 RelationGetRelid(parent)),
+									 false, false);
+
+		result_spec->hashnumber = pdesc->nparts;
+	}
 	else if (strategy == PARTITION_STRATEGY_RANGE)
 	{
 		ListCell   *cell1,
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 18d9e27..19d216d 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -1660,6 +1660,10 @@ pg_get_partkeydef_worker(Oid relid, int prettyFlags,
 			if (!attrsOnly)
 				appendStringInfo(&buf, "RANGE");
 			break;
+		case PARTITION_STRATEGY_HASH:
+			if (!attrsOnly)
+				appendStringInfo(&buf, "HASH");
+			break;
 		default:
 			elog(ERROR, "unexpected partition strategy: %d",
 				 (int) form->partstrat);
@@ -8666,6 +8670,13 @@ get_rule_expr(Node *node, deparse_context *context,
 						appendStringInfoString(buf, ")");
 						break;
 
+					case PARTITION_STRATEGY_HASH:
+						Assert(spec->hashnumber != -1);
+
+						appendStringInfoString(buf, "SERIAL NUMBER");
+						appendStringInfo(buf, " %d", spec->hashnumber);
+						break;
+
 					case PARTITION_STRATEGY_RANGE:
 						Assert(spec->lowerdatums != NIL &&
 							   spec->upperdatums != NIL &&
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 502d5eb..4d71c21 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15340,12 +15340,33 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
 				appendPQExpBufferStr(q, " (\n)");
 			}
 
-			if (tbinfo->ispartition && !dopt->binary_upgrade)
+			if(tbinfo->ispartition && !dopt->binary_upgrade)
 			{
-				appendPQExpBufferStr(q, "\n");
-				appendPQExpBufferStr(q, tbinfo->partbound);
+				PGresult   *res;
+				PQExpBuffer query = createPQExpBuffer();
+				char	*strategy;
+				char	*s = "h";
+
+				appendPQExpBuffer(query, " select partstrat"
+							" from pg_partitioned_table p"
+							" join pg_inherits i on i.inhparent = p.partrelid"
+							" where i.inhrelid = %d", tbinfo->dobj.catId.oid);
+
+				res = ExecuteSqlQueryForSingleRow(fout, query->data);
+				strategy = pg_strdup(PQgetvalue(res, 0, 0));
+
+				if(!(strcmp(strategy, s) == 0))
+				{
+					appendPQExpBufferStr(q, "\n");
+					appendPQExpBufferStr(q, tbinfo->partbound);
+				}
+
+				PQclear(res);
+				destroyPQExpBuffer(query);
 			}
 
+			
+
 			/* Emit the INHERITS clause, except if this is a partition. */
 			if (numParents > 0 &&
 				!tbinfo->ispartition &&
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index f10879a..ecdd757 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -70,6 +70,10 @@ typedef struct PartitionDispatchData
 
 typedef struct PartitionDispatchData *PartitionDispatch;
 
+extern bool is_hash_partitioned(Oid relid);
+extern bool is_hash_partition(Oid relid);
+extern Oid HashSerialGetPartition(Oid parentOid, int index);
+
 extern void RelationBuildPartitionDesc(Relation relation);
 extern bool partition_bounds_equal(PartitionKey key,
 					   PartitionBoundInfo p1, PartitionBoundInfo p2);
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 1d96169..73b8281 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -785,6 +785,7 @@ typedef struct PartitionSpec
 /* Internal codes for partitioning strategies */
 #define PARTITION_STRATEGY_LIST		'l'
 #define PARTITION_STRATEGY_RANGE	'r'
+#define PARTITION_STRATEGY_HASH		'h'
 
 /*
  * PartitionBoundSpec - a partition bound specification
@@ -805,6 +806,9 @@ typedef struct PartitionBoundSpec
 	List	   *lowerdatums;	/* List of PartitionRangeDatums */
 	List	   *upperdatums;	/* List of PartitionRangeDatums */
 
+	/* Partitioning info for HASH strategy: */
+	int			hashnumber;		/* serial number */
+
 	int			location;		/* token location, or -1 if unknown */
 } PartitionBoundSpec;
 
diff --git a/src/test/regress/expected/create_table.out b/src/test/regress/expected/create_table.out
index b6f794e..e434ae2 100644
--- a/src/test/regress/expected/create_table.out
+++ b/src/test/regress/expected/create_table.out
@@ -340,11 +340,6 @@ CREATE TABLE partitioned (
 ) PARTITION BY RANGE (const_func());
 ERROR:  cannot use constant expression as partition key
 DROP FUNCTION const_func();
--- only accept "list" and "range" as partitioning strategy
-CREATE TABLE partitioned (
-	a int
-) PARTITION BY HASH (a);
-ERROR:  unrecognized partitioning strategy "hash"
 -- specified column must be present in the table
 CREATE TABLE partitioned (
 	a int
diff --git a/src/test/regress/sql/create_table.sql b/src/test/regress/sql/create_table.sql
index cb7aa5b..95b0e23 100644
--- a/src/test/regress/sql/create_table.sql
+++ b/src/test/regress/sql/create_table.sql
@@ -350,11 +350,6 @@ CREATE TABLE partitioned (
 ) PARTITION BY RANGE (const_func());
 DROP FUNCTION const_func();
 
--- only accept "list" and "range" as partitioning strategy
-CREATE TABLE partitioned (
-	a int
-) PARTITION BY HASH (a);
-
 -- specified column must be present in the table
 CREATE TABLE partitioned (
 	a int

Rushabh Lathia

rushabh.lathia@gmail.com

over 8 years ago

In reply to: yangjie@highgo.com (#1)

Re: hash partitioning based on v10Beta2

On Sat, Aug 26, 2017 at 10:10 AM, yangjie@highgo.com <yangjie@highgo.com>
wrote:

Hi all,

Now we have had the range / list partition, but hash partitioning is not
implemented yet.
Attached is a POC patch based on the v10Beta2 to add the
hash partitioning feature.
Although we will need more discussions about the syntax
and other specifications before going ahead the project,
but I think this runnable code might help to discuss
what and how we implement this.

FYI, there is already an existing commitfest entry for this project.

https://commitfest.postgresql.org/14/1059/

Description

The hash partition's implement is on the basis of
the original range / list partition,and using similar syntax.

To create a partitioned table ,use:

CREATE TABLE h (id int) PARTITION BY HASH(id);

The partitioning key supports only one value, and I think
the partition key can support multiple values,
which may be difficult to implement when querying, but
it is not impossible.

A partition table can be create as bellow:

CREATE TABLE h1 PARTITION OF h;
CREATE TABLE h2 PARTITION OF h;
CREATE TABLE h3 PARTITION OF h;

FOR VALUES clause cannot be used, and the partition bound
is calclulated automatically as partition index of single integer value.

An inserted record is stored in a partition whose index equals
DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0],
TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts
/* Number of partitions */
;
In the above example, this is DatumGetUInt32(OidFunctionCall1(lookup_type_
cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, id)) % 3;

postgres=# insert into h select generate_series(1,20);
INSERT 0 20
postgres=# select tableoid::regclass,* from h;
tableoid | id
----------+----
h1 | 3
h1 | 5
h1 | 17
h1 | 19
h2 | 2
h2 | 6
h2 | 7
h2 | 11
h2 | 12
h2 | 14
h2 | 15
h2 | 18
h2 | 20
h3 | 1
h3 | 4
h3 | 8
h3 | 9
h3 | 10
h3 | 13
h3 | 16
(20 rows)

The number of partitions here can be dynamically added, and
if a new partition is created, the number of partitions
changes, the calculated target partitions will change,
and the same data is not reasonable in different
partitions,So you need to re-calculate the existing data
and insert the target partition when you create a new partition.

postgres=# create table h4 partition of h;
CREATE TABLE
postgres=# select tableoid::regclass,* from h;
tableoid | id
----------+----
h1 | 5
h1 | 17
h1 | 19
h1 | 6
h1 | 12
h1 | 8
h1 | 13
h2 | 11
h2 | 14
h3 | 1
h3 | 9
h3 | 2
h3 | 15
h4 | 3
h4 | 7
h4 | 18
h4 | 20
h4 | 4
h4 | 10
h4 | 16
(20 rows)

When querying the data, the hash partition uses the same
algorithm as the insertion, and filters out the table
that does not need to be scanned.

postgres=# explain analyze select * from h where id = 1;
QUERY PLAN

------------------------------------------------------------
----------------------------------------
Append (cost=0.00..41.88 rows=13 width=4) (actual time=
0.020..0.023 rows=1 loops=1)
-> Seq Scan on h3 (cost=0.00..41.88 rows=13 width=4) (
actual time=0.013..0.016 rows=1 loops=1)
Filter: (id = 1)
Rows Removed by Filter: 3
Planning time: 0.346 ms
Execution time: 0.061 ms
(6 rows)

postgres=# explain analyze select * from h where id in (1,5);;
QUERY PLAN

------------------------------------------------------------
----------------------------------------
Append (cost=0.00..83.75 rows=52 width=4) (actual time=
0.016..0.028 rows=2 loops=1)
-> Seq Scan on h1 (cost=0.00..41.88 rows=26 width=4) (
actual time=0.015..0.018 rows=1 loops=1)
Filter: (id = ANY ('{1,5}'::integer[]))
Rows Removed by Filter: 6
-> Seq Scan on h3 (cost=0.00..41.88 rows=26 width=4) (
actual time=0.005..0.007 rows=1 loops=1)
Filter: (id = ANY ('{1,5}'::integer[]))
Rows Removed by Filter: 3
Planning time: 0.720 ms
Execution time: 0.074 ms
(9 rows)

postgres=# explain analyze select * from h where id = 1 or id = 5;;
QUERY PLAN

------------------------------------------------------------
----------------------------------------
Append (cost=0.00..96.50 rows=50 width=4) (actual time=
0.017..0.078 rows=2 loops=1)
-> Seq Scan on h1 (cost=0.00..48.25 rows=25 width=4) (
actual time=0.015..0.019 rows=1 loops=1)
Filter: ((id = 1) OR (id = 5))
Rows Removed by Filter: 6
-> Seq Scan on h3 (cost=0.00..48.25 rows=25 width=4) (
actual time=0.005..0.010 rows=1 loops=1)
Filter: ((id = 1) OR (id = 5))
Rows Removed by Filter: 3
Planning time: 0.396 ms
Execution time: 0.139 ms
(9 rows)

Can not detach / attach / drop partition table.

Best regards,
young

------------------------------

https://yonj1e.github.io/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
Rushabh Lathia

yangjie@highgo.com

over 8 years ago

In reply to: yangjie@highgo.com (#1)

1 attachment(s)

Re: hash partitioning based on v10Beta2

On Sat, Aug 26, 2017 at 10:10 AM, yangjie@highgo.com <yangjie@highgo.com>
wrote:

Hi all,

Now we have had the range / list partition, but hash partitioning is not
implemented yet.
Attached is a POC patch based on the v10Beta2 to add the
hash partitioning feature.
Although we will need more discussions about the syntax
and other specifications before going ahead the project,
but I think this runnable code might help to discuss
what and how we implement this.

FYI, there is already an existing commitfest entry for this project.

https://commitfest.postgresql.org/14/1059/

Description

The hash partition's implement is on the basis of
the original range / list partition,and using similar syntax.

To create a partitioned table ,use:

CREATE TABLE h (id int) PARTITION BY HASH(id);

The partitioning key supports only one value, and I think
the partition key can support multiple values,
which may be difficult to implement when querying, but
it is not impossible.

A partition table can be create as bellow:

CREATE TABLE h1 PARTITION OF h;
CREATE TABLE h2 PARTITION OF h;
CREATE TABLE h3 PARTITION OF h;

FOR VALUES clause cannot be used, and the partition bound
is calclulated automatically as partition index of single integer value.

An inserted record is stored in a partition whose index equals
DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0],
TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts
/* Number of partitions */
;
In the above example, this is DatumGetUInt32(OidFunctionCall1(lookup_type_
cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, id)) % 3;

postgres=# insert into h select generate_series(1,20);
INSERT 0 20
postgres=# select tableoid::regclass,* from h;
tableoid | id
----------+----
h1 | 3
h1 | 5
h1 | 17
h1 | 19
h2 | 2
h2 | 6
h2 | 7
h2 | 11
h2 | 12
h2 | 14
h2 | 15
h2 | 18
h2 | 20
h3 | 1
h3 | 4
h3 | 8
h3 | 9
h3 | 10
h3 | 13
h3 | 16
(20 rows)

The number of partitions here can be dynamically added, and
if a new partition is created, the number of partitions
changes, the calculated target partitions will change,
and the same data is not reasonable in different
partitions,So you need to re-calculate the existing data
and insert the target partition when you create a new partition.

postgres=# create table h4 partition of h;
CREATE TABLE
postgres=# select tableoid::regclass,* from h;
tableoid | id
----------+----
h1 | 5
h1 | 17
h1 | 19
h1 | 6
h1 | 12
h1 | 8
h1 | 13
h2 | 11
h2 | 14
h3 | 1
h3 | 9
h3 | 2
h3 | 15
h4 | 3
h4 | 7
h4 | 18
h4 | 20
h4 | 4
h4 | 10
h4 | 16
(20 rows)

When querying the data, the hash partition uses the same
algorithm as the insertion, and filters out the table
that does not need to be scanned.

postgres=# explain analyze select * from h where id = 1;
QUERY PLAN

------------------------------------------------------------
----------------------------------------
Append (cost=0.00..41.88 rows=13 width=4) (actual time=
0.020..0.023 rows=1 loops=1)
-> Seq Scan on h3 (cost=0.00..41.88 rows=13 width=4) (
actual time=0.013..0.016 rows=1 loops=1)
Filter: (id = 1)
Rows Removed by Filter: 3
Planning time: 0.346 ms
Execution time: 0.061 ms
(6 rows)

postgres=# explain analyze select * from h where id in (1,5);;
QUERY PLAN

------------------------------------------------------------
----------------------------------------
Append (cost=0.00..83.75 rows=52 width=4) (actual time=
0.016..0.028 rows=2 loops=1)
-> Seq Scan on h1 (cost=0.00..41.88 rows=26 width=4) (
actual time=0.015..0.018 rows=1 loops=1)
Filter: (id = ANY ('{1,5}'::integer[]))
Rows Removed by Filter: 6
-> Seq Scan on h3 (cost=0.00..41.88 rows=26 width=4) (
actual time=0.005..0.007 rows=1 loops=1)
Filter: (id = ANY ('{1,5}'::integer[]))
Rows Removed by Filter: 3
Planning time: 0.720 ms
Execution time: 0.074 ms
(9 rows)

postgres=# explain analyze select * from h where id = 1 or id = 5;;
QUERY PLAN

------------------------------------------------------------
----------------------------------------
Append (cost=0.00..96.50 rows=50 width=4) (actual time=
0.017..0.078 rows=2 loops=1)
-> Seq Scan on h1 (cost=0.00..48.25 rows=25 width=4) (
actual time=0.015..0.019 rows=1 loops=1)
Filter: ((id = 1) OR (id = 5))
Rows Removed by Filter: 6
-> Seq Scan on h3 (cost=0.00..48.25 rows=25 width=4) (
actual time=0.005..0.010 rows=1 loops=1)
Filter: ((id = 1) OR (id = 5))
Rows Removed by Filter: 3
Planning time: 0.396 ms
Execution time: 0.139 ms
(9 rows)

Can not detach / attach / drop partition table.

Best regards,
young

------------------------------

https://yonj1e.github.io/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
Rushabh Lathia

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: yangjie@highgo.com (#1)

Re: hash partitioning based on v10Beta2

On Sat, Aug 26, 2017 at 12:40 AM, yangjie@highgo.com <yangjie@highgo.com> wrote:

A partition table can be create as bellow:

CREATE TABLE h1 PARTITION OF h;
CREATE TABLE h2 PARTITION OF h;
CREATE TABLE h3 PARTITION OF h;

This syntax is very problematic for reasons that have been discussed
on the existing hash partitioning thread. Fortunately, a solution has
already been implemented... you're the third person to try to write a
patch for this feature.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

yangjie

yangjie@highgo.com

over 8 years ago

In reply to: Robert Haas (#4)

Re: hash partitioning based on v10Beta2

<html>
<head>
<meta http-equiv='Content-Type' content='text/html; charset=UTF-8'>
</head>
<body>
<style>
font{
line-height: 1.7;
}
</style>
<div style = 'font-family:"微软雅黑"; font-size: 14px; color:#000000; line-height:1.7;'>
<div>
<div><div class="input-wrap">
<div class="textarea-warp">Hi,</div><div class="textarea-warp"><br></div>
</div>
<div class="output-wrap">

<div class="text-output" id="text-output" style="top: 15px;"><span class="text-dst sentence-0">When the number of partitions and the data are more, adding new partitions, there will be some efficiency problems.</span></div></div></div>
<div><div class="output-wrap">

<div class="text-output" id="text-output" style="top: 15px;"><span class="text-dst sentence-0">I don't know how the solution you're talking about is how to implement a hash partition?</span></div></div></div>
<div><span><br></span></div>
<div id="ntes-pcmail-signature" style="font-family:'微软雅黑'">
<style type="text/css">
a#ntes-pcmail-signature-default:hover {
text-decoration: underline;
color: #199cff;
cursor: pointer;
}
a#ntes-pcmail-signature-default:active {
text-decoration: underline;
color: #246fce;
cursor: pointer;
}
</style>

<div style="font-size:14px; padding: 0; margin:0;">
<div style="font-family:"微软雅黑"; font-size: 14px; color:#000000">
<style>
font{
line-height: 1.7;
}
</style>
<div style="font-family: 微软雅黑; font-size: 14px;">
<style>
font{
line-height: 1.7;
}
</style>
<div style="font-family: 微软雅黑; font-size: 14px;">
<style>
font{
line-height: 1.7;
}
</style>
<div style="font-size: 14px;">
<style>
font{
line-height: 1.7;
}
</style>
<div style="font-size: 14px;">
<style>
font{
line-height: 1.7;
}
</style>
<div style="font-size: 14px;">
<style>
font{
line-height: 1.7;
}
</style>
<div style="font-size: 14px;">
<style>
font{
line-height: 1.7;
}
</style>
<div id="ntes-pcmail-signature-default" style="font-size: 14px; text-decoration: none;"><p class="MsoNormal" style="margin: 0cm 0cm 0.0001pt; text-align: justify; line-height: 23.8px;">---</p><p class="MsoNormal" style="margin: 0cm 0cm 0.0001pt; text-align: justify; line-height: 23.8px;">young</p><p class="MsoNormal" style="margin: 0cm 0cm 0.0001pt; text-align: justify; line-height: 23.8px;">HighGo Database: <a href="http://www.highgo.com">http://www.highgo.com</a></p></div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>

</div><br>
</div><div class="J-reply" style="background-color:#f2f2f2;color:black;padding-top:6px;padding-bottom:6px;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;margin-top:45px;margin-bottom:20px;font-family:'微软雅黑';">
<div style="font-size:14px;line-height:1.5;word-break:break-all;margin-left:10px;margin-right:10px">On <span class="mail-date">8/28/2017 22:25</span>，<a class="mail-to" style="text-decoration:none;color:#2a83f2;" href="mailto:robertmhaas@gmail.com">Robert Haas<robertmhaas@gmail.com></a> wrote： </div>
</div>
<blockquote id="ntes-pcmail-quote" style="margin: 0; padding: 0; font-size: 14px; font-family: '微软雅黑';">
On Sat, Aug 26, 2017 at 12:40 AM, yangjie@highgo.com <yangjie@highgo.com> wrote:
<br>> A partition table can be create as bellow:
<br>>
<br>>  CREATE TABLE h1 PARTITION OF h;
<br>>  CREATE TABLE h2 PARTITION OF h;
<br>>  CREATE TABLE h3 PARTITION OF h;
<br>
<br>This syntax is very problematic for reasons that have been discussed
<br>on the existing hash partitioning thread.  Fortunately, a solution has
<br>already been implemented... you're the third person to try to write a
<br>patch for this feature.
<br>
<br>-- 
<br>Robert Haas
<br>EnterpriseDB: http://www.enterprisedb.com
<br>The Enterprise PostgreSQL Company
<br></blockquote>
</div>
</body>
</html>

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: yangjie (#5)

Re: hash partitioning based on v10Beta2

On Mon, Aug 28, 2017 at 10:44 PM, yangjie <yangjie@highgo.com> wrote:

When the number of partitions and the data are more, adding new partitions,
there will be some efficiency problems.
I don't know how the solution you're talking about is how to implement a
hash partition?

I am having difficulty understanding this. There was discussion on
the other thread of how splitting partitions could be done reasonably
efficiently with the proposed design; of course, it's never going to
be super-cheap.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers