Patch for adding DATACUBE operator

Started by sumitover 22 years ago7 messages

sumit@gdit.iiit.net

over 22 years ago

4 attachment(s)

Hi!

We have added the CUBE operator for PostgreSQL. Please find the
attached patch.

Another thing to note is that the file datacube.c should
be placed in src/backend/tcop/ and datacube.h should be in src/include.

The syntax of the query is

SELECT <field list><aggregate list>
INTO <destination table>
FROM <table expression>
WHERE <search condition>
GROUP BY <aggregate list>
HAVING <search condition>
WITH CUBE;

An example along with the output is provided in the
README.datacube file. Kindly have a look. Let us know your response.

Srikanth M
Sumit Kumar Mukherjee

Attachments:

datacube_patchtext/plain; charset=US-ASCII; name=datacube_patchDownload

*** ./src/backend/executor/spi.c.orig	2003-06-30 00:48:37.000000000 +0530
--- ./src/backend/executor/spi.c	2003-06-30 00:51:06.000000000 +0530
***************
*** 1046,1051 ****
--- 1046,1052 ----
  			case T_UpdateStmt:
  				origCmdType = CMD_UPDATE;
  				break;
+ 			case T_CubeSelectStmt:		/*** DATACUBE ***/
  			case T_SelectStmt:
  				origCmdType = CMD_SELECT;
  				break;
*** ./src/backend/nodes/copyfuncs.c.orig	2003-06-30 00:48:48.000000000 +0530
--- ./src/backend/nodes/copyfuncs.c	2003-06-30 00:51:17.000000000 +0530
***************
*** 2944,2949 ****
--- 2944,2950 ----
  		case T_UpdateStmt:
  			retval = _copyUpdateStmt(from);
  			break;
+ 		case T_CubeSelectStmt:			/*** DATACUBE ***/
  		case T_SelectStmt:
  			retval = _copySelectStmt(from);
  			break;
*** ./src/backend/nodes/equalfuncs.c.orig	2003-06-30 00:48:51.000000000 +0530
--- ./src/backend/nodes/equalfuncs.c	2003-06-30 00:51:25.000000000 +0530
***************
*** 2149,2154 ****
--- 2149,2155 ----
  		case T_UpdateStmt:
  			retval = _equalUpdateStmt(a, b);
  			break;
+ 		case T_CubeSelectStmt:			/*** DATACUBE ***/
  		case T_SelectStmt:
  			retval = _equalSelectStmt(a, b);
  			break;
*** ./src/backend/nodes/outfuncs.c.orig	2003-06-30 00:48:55.000000000 +0530
--- ./src/backend/nodes/outfuncs.c	2003-06-30 00:51:31.000000000 +0530
***************
*** 1555,1560 ****
--- 1555,1561 ----
  			case T_NotifyStmt:
  				_outNotifyStmt(str, obj);
  				break;
+ 			case T_CubeSelectStmt:		/*** DATACUBE ***/
  			case T_SelectStmt:
  				_outSelectStmt(str, obj);
  				break;
*** ./src/backend/parser/analyze.c.orig	2003-06-30 00:49:02.000000000 +0530
--- ./src/backend/parser/analyze.c	2003-06-30 01:12:44.000000000 +0530
***************
*** 304,309 ****
--- 304,310 ----
  			break;
  
  		case T_SelectStmt:
+ 		case T_CubeSelectStmt:			/*** DATACUBE ***/
  			if (((SelectStmt *) parseTree)->op == SETOP_NONE)
  				result = transformSelectStmt(pstate,
  											 (SelectStmt *) parseTree);
*** ./src/backend/parser/gram.y.orig	2003-06-30 00:49:06.000000000 +0530
--- ./src/backend/parser/gram.y	2003-06-30 15:31:14.000000000 +0530
***************
*** 212,218 ****
  				any_operator expr_list dotted_name attrs
  				target_list update_target_list insert_column_list
  				insert_target_list def_list opt_indirection
! 				group_clause TriggerFuncArgs select_limit
  				opt_select_limit opclass_item_list trans_options
  				TableFuncElementList
  				prep_type_clause prep_type_list
--- 212,218 ----
  				any_operator expr_list dotted_name attrs
  				target_list update_target_list insert_column_list
  				insert_target_list def_list opt_indirection
! 				group_clause TriggerFuncArgs select_limit cube_group_clause	/*** DATACUBE ***/
  				opt_select_limit opclass_item_list trans_options
  				TableFuncElementList
  				prep_type_clause prep_type_list
***************
*** 331,337 ****
  	CLUSTER COALESCE COLLATE COLUMN COMMENT COMMIT
  	COMMITTED CONSTRAINT CONSTRAINTS CONVERSION_P CONVERT COPY CREATE CREATEDB
  	CREATEUSER CROSS CURRENT_DATE CURRENT_TIME
! 	CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
  
  	DATABASE DAY_P DEALLOCATE DEC DECIMAL DECLARE DEFAULT
  	DEFERRABLE DEFERRED DEFINER DELETE_P DELIMITER DELIMITERS
--- 331,337 ----
  	CLUSTER COALESCE COLLATE COLUMN COMMENT COMMIT
  	COMMITTED CONSTRAINT CONSTRAINTS CONVERSION_P CONVERT COPY CREATE CREATEDB
  	CREATEUSER CROSS CURRENT_DATE CURRENT_TIME
! 	CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE CUBE 		/*** DATACUBE ***/
  
  	DATABASE DAY_P DEALLOCATE DEC DECIMAL DECLARE DEFAULT
  	DEFERRABLE DEFERRED DEFINER DELETE_P DELIMITER DELIMITERS
***************
*** 4234,4239 ****
--- 4234,4254 ----
  			into_clause from_clause where_clause
  			group_clause having_clause
  				{
+ 					SelectStmt *n = makeNode(SelectStmt); 
+ 					n->distinctClause = $2;
+ 					n->targetList = $3;
+ 					n->into = $4;
+ 					n->intoColNames = NIL;
+ 					n->fromClause = $5;
+ 					n->whereClause = $6;
+ 					n->groupClause = $7;
+ 					n->havingClause = $8;
+ 					$$ = (Node *)n;
+ 				}
+ 			| SELECT opt_distinct target_list
+ 			  into_clause from_clause where_clause
+ 			  cube_group_clause having_clause WITH CUBE		/*** DATACUBE ***/
+ 				{
  					SelectStmt *n = makeNode(SelectStmt);
  					n->distinctClause = $2;
  					n->targetList = $3;
***************
*** 4243,4248 ****
--- 4258,4264 ----
  					n->whereClause = $6;
  					n->groupClause = $7;
  					n->havingClause = $8;
+ 					n->type = T_CubeSelectStmt; /*** DATACUBE ***/
  					$$ = (Node *)n;
  				}
  			| select_clause UNION opt_all select_clause
***************
*** 4471,4476 ****
--- 4487,4496 ----
  			| /*EMPTY*/								{ $$ = NIL; }
  		;
  
+ cube_group_clause:										/*** DATACUBE ***/
+ 			GROUP_P BY expr_list					{ $$ = $3; }	/*** DATACUBE ***/
+ 		;										/*** DATACUBE ***/
+ 
  having_clause:
  			HAVING a_expr							{ $$ = $2; }
  			| /*EMPTY*/								{ $$ = NULL; }
***************
*** 6995,7000 ****
--- 7015,7021 ----
  			| COPY
  			| CREATEDB
  			| CREATEUSER
+ 			| CUBE		/*** DATACUBE ***/
  			| CURSOR
  			| CYCLE
  			| DATABASE
*** ./src/backend/parser/keywords.c.orig	2003-06-30 00:49:12.000000000 +0530
--- ./src/backend/parser/keywords.c	2003-06-30 00:51:47.000000000 +0530
***************
*** 88,93 ****
--- 88,94 ----
  	{"createdb", CREATEDB},
  	{"createuser", CREATEUSER},
  	{"cross", CROSS},
+ 	{"cube", CUBE},			/*** DATACUBE ***/
  	{"current_date", CURRENT_DATE},
  	{"current_time", CURRENT_TIME},
  	{"current_timestamp", CURRENT_TIMESTAMP},
*** ./src/backend/tcop/Makefile.orig	2003-06-30 00:50:07.000000000 +0530
--- ./src/backend/tcop/Makefile	2003-06-30 00:52:28.000000000 +0530
***************
*** 12,18 ****
  top_builddir = ../../..
  include $(top_builddir)/src/Makefile.global
  
! OBJS= dest.o fastpath.o postgres.o pquery.o utility.o
  
  all: SUBSYS.o
  
--- 12,18 ----
  top_builddir = ../../..
  include $(top_builddir)/src/Makefile.global
  
! OBJS= dest.o fastpath.o postgres.o pquery.o utility.o datacube.o
  
  all: SUBSYS.o
  
*** ./src/backend/tcop/postgres.c.orig	2003-06-30 00:49:21.000000000 +0530
--- ./src/backend/tcop/postgres.c	2003-06-30 14:50:46.000000000 +0530
***************
*** 18,23 ****
--- 18,24 ----
   */
  
  #include "postgres.h"
+ #include "datacube.h"		/*** DATACUBE ***/
  
  #include <unistd.h>
  #include <signal.h>
***************
*** 69,74 ****
--- 70,82 ----
  extern int	optind;
  extern char *optarg;
  
+ /* ----------------
+  *		Variables for DATACUBE
+  * ----------------
+  */
+ int dropTable = 0;
+ char refTableName[50];
+ 
  char	   *debug_query_string; /* for pgmonitor and
  								 * log_min_error_statement */
  
***************
*** 637,642 ****
--- 645,651 ----
  			case T_UpdateStmt:
  				origCmdType = CMD_UPDATE;
  				break;
+ 			case T_CubeSelectStmt:		/*** DATACUBE ***/
  			case T_SelectStmt:
  				origCmdType = CMD_SELECT;
  				break;
***************
*** 803,808 ****
--- 812,852 ----
  				 */
  				Plan	   *plan;
  
+ 				/* -------------------------------------------------
+ 				 * DATACUBE:
+ 				 * 	Depending on whether the original query is a 
+ 				 * 	CUBE statement, the function constructCube is
+ 				 * 	called. 
+ 				 *
+ 				 * 	cubeTable is the temporary table generated using
+ 				 * 	the gettimeofday() function and is used for
+ 				 * 	displaying the final result.
+ 				 *
+ 				 * 	dropTable is a flag which decides whether or not
+ 				 * 	the temporary table is to be dropped.
+ 				 *
+ 				 * -------------------------------------------------
+ 				 */
+ 				
+ 				if(nodeTag(parsetree) == T_CubeSelectStmt && querytree->hasAggs == false)
+ 					elog(ERROR, "CUBE operator requires atleast one measure attribute [SUM, MAX, MIN, COUNT, AVG]");
+ 				else if (nodeTag(parsetree) == T_CubeSelectStmt && querytree->hasAggs == true)
+ 				{
+ 					char *cubeTable;
+ 					
+ 					cubeTable = palloc(sizeof(char)*30);
+ 					cubeTable = getTableName();
+ 				   strcpy(refTableName, cubeTable);
+ 					dropTable = 1;
+ 
+ 					constructCube (querytree, TopMemoryContext, cubeTable);
+ #ifdef MEMORY_CONTEXT_CHECKING
+ 	  		      MemoryContextCheck(TopMemoryContext);
+ #endif
+ 
+ 					return;
+ 				}
+ 
  				/*
  				 * Initialize snapshot state for query.  This has to
  				 * be done before running the planner, because it might
***************
*** 1924,1932 ****
--- 1968,1994 ----
  		QueryCancelPending = false;
  		CHECK_FOR_INTERRUPTS();
  
+ 		/* --------------------------------------------------
+ 		 * DATACUBE:
+ 		 * 	The following if condition checks whether the
+ 		 * 	flag dropTable is set and if it is, drops the
+ 		 * 	temporary table created for the CUBE.
+ 		 *
+ 		 * --------------------------------------------------
+ 		 */
+ 		if (dropTable == 1)
+ 		{
+ 				  dropCubeTable (refTableName, TopMemoryContext);
+ 				  dropTable = 0;
+ #ifdef MEMORY_CONTEXT_CHECKING
+ 			     MemoryContextCheck(TopMemoryContext);
+ #endif
+ 		}
+ 
  		/*
  		 * (3) read a command (loop blocks here)
  		 */
+ 
  		firstchar = ReadCommand(parser_input);
  
  		/*
***************
*** 2204,2209 ****
--- 2266,2272 ----
  			tag = "UPDATE";
  			break;
  
+ 		case T_CubeSelectStmt:		/*** DATACUBE ***/
  		case T_SelectStmt:
  			tag = "SELECT";
  			break;
*** ./src/backend/utils/adt/ruleutils.c.orig	2003-06-30 00:52:36.000000000 +0530
--- ./src/backend/utils/adt/ruleutils.c	2003-06-30 15:37:38.000000000 +0530
***************
*** 127,133 ****
  							 StringInfo buf);
  static void make_ruledef(StringInfo buf, HeapTuple ruletup, TupleDesc rulettc);
  static void make_viewdef(StringInfo buf, HeapTuple ruletup, TupleDesc rulettc);
! static void get_query_def(Query *query, StringInfo buf, List *parentnamespace,
  			  TupleDesc resultDesc);
  static void get_select_query_def(Query *query, deparse_context *context,
  					 TupleDesc resultDesc);
--- 127,133 ----
  							 StringInfo buf);
  static void make_ruledef(StringInfo buf, HeapTuple ruletup, TupleDesc rulettc);
  static void make_viewdef(StringInfo buf, HeapTuple ruletup, TupleDesc rulettc);
! void get_query_def(Query *query, StringInfo buf, List *parentnamespace,
  			  TupleDesc resultDesc);
  static void get_select_query_def(Query *query, deparse_context *context,
  					 TupleDesc resultDesc);
***************
*** 165,171 ****
  				 StringInfo buf);
  static bool tleIsArrayAssign(TargetEntry *tle);
  static char *generate_relation_name(Oid relid);
! static char *generate_function_name(Oid funcid, int nargs, Oid *argtypes);
  static char *generate_operator_name(Oid operid, Oid arg1, Oid arg2);
  static char *get_relid_attribute_name(Oid relid, AttrNumber attnum);
  
--- 165,171 ----
  				 StringInfo buf);
  static bool tleIsArrayAssign(TargetEntry *tle);
  static char *generate_relation_name(Oid relid);
! char *generate_function_name(Oid funcid, int nargs, Oid *argtypes);
  static char *generate_operator_name(Oid operid, Oid arg1, Oid arg2);
  static char *get_relid_attribute_name(Oid relid, AttrNumber attnum);
  
***************
*** 1262,1268 ****
   * the view represented by a SELECT query.
   * ----------
   */
! static void
  get_query_def(Query *query, StringInfo buf, List *parentnamespace,
  			  TupleDesc resultDesc)
  {
--- 1262,1268 ----
   * the view represented by a SELECT query.
   * ----------
   */
! void
  get_query_def(Query *query, StringInfo buf, List *parentnamespace,
  			  TupleDesc resultDesc)
  {
***************
*** 3146,3152 ****
   *
   * The result includes all necessary quoting and schema-prefixing.
   */
! static char *
  generate_function_name(Oid funcid, int nargs, Oid *argtypes)
  {
  	HeapTuple	proctup;
--- 3146,3152 ----
   *
   * The result includes all necessary quoting and schema-prefixing.
   */
! char *
  generate_function_name(Oid funcid, int nargs, Oid *argtypes)
  {
  	HeapTuple	proctup;
*** ./src/include/nodes/nodes.h.orig	2003-06-30 00:49:40.000000000 +0530
--- ./src/include/nodes/nodes.h	2003-06-30 00:50:40.000000000 +0530
***************
*** 203,208 ****
--- 203,209 ----
  	T_PrepareStmt,
  	T_ExecuteStmt,
  	T_DeallocateStmt,
+ 	T_CubeSelectStmt,		/*** DATACUBE ***/
  
  	T_A_Expr = 700,
  	T_ColumnRef,

datacube.ctext/plain; charset=US-ASCII; name=datacube.cDownload

/*********************************************************
 * datacube.c 
 * 		Utility to build a datacube in PostgreSQL. 
 *
 * 		Currently does not support nested CUBE queries.
 *
 ********************************************************/

#include <sys/time.h>
#include "postgres.h"
#include "datacube.h"
#include "utils/memutils.h"
#include "tcop/tcopprot.h"
#include "parser/parse_expr.h"

/* ----------------
 * Global variables
 * ----------------
 */
int m, Ind = 0, numAtts = 0, **Combinations;
char **c, **projectedAttrs, cubeTableName[30];

/* -----------------------------------------------
 * nCr
 * 
 * 	Function to calculate the number of ways of
 * 	selecting r objects from n objects. 
 *
 * ----------------------------------------------
 */
int nCr (int n, int r)
{
	int num = 1, den = 1, i;
	for (i = n; i >= n-r+1; i--)
		num *= i;
	for (i = r; i >= 1; i--)
		den *= i;
	
	return num/den;
}

/* ----------------------------------------------
 * getMaxCombinations
 *
 * 		This calls the function nCr for values
 * 		of r ranging from 0 to n. The total value
 * 		of their sum is the maximum number of
 * 		combinations possible for given values of
 * 		n and r.
 *
 * This value is required to decide the size of the
 * array required to store the different combinations.
 * 
 * ----------------------------------------------
 */
int getMaxCombinations(int n)
{
	int r = 0;
	int NumCombinations = 0;
	for (r = 0; r <= n; r++)
		NumCombinations += nCr(n,r);

	return NumCombinations;
}

/* ----------------------------------------------
 * Combination
 *
 * 	The actual function where the combinations
 * 	are built and stored into the global array
 * 	called Combinations. This array is later on
 * 	used to generate the actual query for the
 * 	DATACUBE.
 * 	
 * ----------------------------------------------
 */
void Combination(char **arr ,int n, int x, int r)
{
	int i, j;
	static int y = -1;
	y++;
	for(i = x; i < n; ++i)
	{
		c[y] = (char *) malloc( sizeof (arr[i]) );
		strcpy (c[y], arr[i]);
		
		if(n == m)
		{
			for (j = 0; j < r; j++)
				Combinations[Ind][getIndex(c[j])] = 1;
			Ind++;
		}
		else
			Combination (arr+1, n-1, i, r);
	}
	y--;
}

/* ----------------------------------------------
 * getIndex
 *
 * 	A small function to find the index of the
 * 	location of the generated combination in the
 * 	actual array.
 *
 * Never mind if you didnt understand what is
 * written above.
 *
 * ----------------------------------------------
 */
int getIndex(char *str)
{
	int i = 0;
	for( i = 0; i <= numAtts; i++)
		if ( strcmp(projectedAttrs[i],str) == 0 )
			return i;
	return -1;
}

/* ----------------------------------------------
 * generateCombinations
 *
 * 	A simple function which initialises the global
 * 	array Combinations
 * 	
 * ----------------------------------------------
 */
int generateCombinations (void)
{
	int length, i, j;
	
	int NumCombinations = getMaxCombinations(numAtts);
	
	Combinations = (int **)malloc(sizeof(int*) * NumCombinations);

	for(i = 0; i < NumCombinations; i++)
	{
		Combinations[i] = (int *) malloc(sizeof(int)*numAtts);
		for(j = 0; j < numAtts; j++)
			Combinations[i][j] = 0;
	}
	
	for(length = 0; length <= numAtts; length++)
	{
		m = numAtts-length+1;
		c = (char **) malloc(length*sizeof(char*));
		Combination(projectedAttrs,numAtts,0,length);
	}

	return NumCombinations;
}

/* ----------------------------------------------
 * generateQuery
 *
 * 	This is the function where the actual query
 * 	generation takes place. Depending on the
 * 	current combination in the Combinations array,
 * 	the query is generated taking the attribute
 * 	name if the value for that particular attribute
 * 	is 1 or using the same attribute as NULL if 0.
 *
 * ex: Suppose the attributes are type, store. Let
 * 	the current combination be 01. Then the targetlist
 * 	is NULL AS type, store. Similarly, for 10 it is
 * 	type, NULL AS store
 *
 * 	Returns the generated query to the main loop.
 *
 * ----------------------------------------------
 */
StringInfo generateQuery(int numSelect, char *aggregateAttrsList)
{
	StringInfo cubeQueryString = makeStringInfo();
	int i, j, k, flag = 0;

	for (i = 0; i < numSelect; i++)
	{
		flag = 0;
		appendStringInfo(cubeQueryString, "SELECT ");

		for (j = 0; j < numAtts; j++)
		{
			if (Combinations[i][j] == 1)
				appendStringInfo(cubeQueryString, "%s, ", projectedAttrs[j]);
			else if (Combinations[i][j] == 0)
				appendStringInfo(cubeQueryString, "NULL AS %s, ", projectedAttrs[j]);

		}
		appendStringInfo(cubeQueryString, "%s FROM %s", aggregateAttrsList, cubeTableName);

		for (j = 0; j < numAtts; j++)
			if (Combinations[i][j] == 1)
				flag = 1;
		if (flag != 0)
		{
			appendStringInfo(cubeQueryString, " GROUP BY ");
			for (j = 0; j < numAtts; j++)
			{
				flag = 0;
				if (Combinations[i][j] == 1)
				{
					appendStringInfo(cubeQueryString, projectedAttrs[j]);

					for (k = j+1; k < numAtts; k++)
						if (Combinations[i][k] == 1)
							flag = 1;
					if (flag == 1)
						appendStringInfo(cubeQueryString, ", ");
				}
			}
		}
		if (i != numSelect-1)
			appendStringInfo(cubeQueryString, " UNION ");
	}
	
	return cubeQueryString;
}

/* ----------------------------------------------
 * executeQuery
 * 
 * 	Routine to handle the execution of the query
 * 	whose query string is passed along with the
 * 	TopMemoryContext.
 *
 * 	This invokes a call to pg_exec_query_string.
 *
 * ----------------------------------------------
 */
void executeQuery(StringInfo buf, MemoryContext context)
{
	MemoryContext CubeContext;
	
	CubeContext = AllocSetContextCreate(context, "CubeContext", ALLOCSET_DEFAULT_MINSIZE, 
						ALLOCSET_DEFAULT_INITSIZE, ALLOCSET_DEFAULT_MAXSIZE);

	pg_exec_query_string(buf, 2, CubeContext);
}

/* ----------------------------------------------
 * getTableName
 *
 * 	Our function needs to create a temporary table
 * 	to generate the final result. To avoid any
 * 	clashes/errors, a table name is generated using
 * 	the function gettimeofday().
 *
 * 	The name of the table so generated is returned.
 *
 * ----------------------------------------------
 */
char* getTableName(void)
{
	struct timeval tval;
	struct timezone tz;
	char *timestr;

	timestr = palloc(sizeof(char)*30);
	
	tz.tz_minuteswest = 0;
	tz.tz_dsttime = 0;

	if(gettimeofday(&tval, &tz) != 0)
			  elog(ERROR, "Could not get time of day");
	else
			  sprintf(timestr, "cube_%ld_%ld",tval.tv_sec,tval.tv_usec);

	return timestr;
}

/* ------------------------------------------------------------------
 * constructCube
 *
 * 	The main routine which handles the generation of the DATACUBE. 
 * 	This function is invoked from	postgres.c if it is found that the 
 * 	current query has a CubeSelectStmt (defined by us) tag.
 *
 * 	The main processes which take place are
 *
 * 		a) Retrieve the actual query string from the querytree using
 * 			the function get_query_def defined in ruleutils.c
 * 		b) Create a table with the generated name with the result of
 * 			the actual query WITHOUT the CUBE part.
 * 		c) Find out the attributes in the targetlist and extract the
 * 			aggregatelist into aggregateAttrsList.
 * 		d)	Generate the possible combinations of attributes possible
 * 			using the function generateCombinations().
 * 		e) Using these generate combinations, construct the actual 
 * 			query to be run on the table generated in (b).
 * 		f) Execute this query. If the original query had an 'into',
 * 			save the result of the CUBE query to that particular table.
 * 			Else display the result to the user.
 *
 * ------------------------------------------------------------------
 */
void constructCube(Query *querytree, MemoryContext context, char *cubeTable)
{
	StringInfo buf = makeStringInfo();
	StringInfo createTableQuery = makeStringInfo();
	StringInfo cubeQuery = makeStringInfo();
	StringInfo intoCubeQuery = makeStringInfo();

	List *l;

	char *aggregateAttrsList, intoRelation[30];
	int i = 0, count = 0, numSelect = 0;
	numAtts = 0, Ind = 0;

	//Retrieve query string from querytree
	get_query_def(querytree, buf, NIL, NULL);	

	strcpy (cubeTableName,cubeTable);
	
	appendStringInfo(createTableQuery, "CREATE TABLE %s AS %s", cubeTableName, buf->data);
//	printf("CREATE TABLE QUERY IS %s\n\n",createTableQuery->data);
	executeQuery(createTableQuery, context);
	
	foreach(l, querytree->targetList)
	{
		TargetEntry *tle = (TargetEntry *) lfirst(l);
		if(tle->resdom->ressortgroupref > 0)
			numAtts++;
	}

	//Allocate Memory for projectedAttrs
	projectedAttrs = palloc( sizeof(char *) * numAtts );
	aggregateAttrsList = palloc (sizeof (char) * 1000);

	foreach(l, querytree->targetList)
	{
		char *funcname;
		TargetEntry *tle = (TargetEntry *) lfirst(l);
		if(tle->resdom->ressortgroupref > 0)
		{
			projectedAttrs[i] = (char *)malloc(sizeof(char) * sizeof(tle->resdom->resname));
			projectedAttrs[i] = tle->resdom->resname;
			i++;
		}
		
		if(nodeTag(tle->expr) == T_Aggref ) 
		{
			Aggref *aggref = (Aggref *) tle->expr;
			Oid argtype = exprType(aggref->target);
			
			funcname = generate_function_name(aggref->aggfnoid, 1, &argtype);
			
			if(count == 0)
				sprintf(aggregateAttrsList,"%s(%s)", funcname,tle->resdom->resname);
			else
				sprintf(aggregateAttrsList,"%s, %s(%s)", aggregateAttrsList, funcname,tle->resdom->resname);
			count++;
		}
	}

	numSelect = generateCombinations();
	cubeQuery = generateQuery(numSelect, aggregateAttrsList);

	if (querytree->into != NULL)
	{
			  strcpy (intoRelation, querytree->into->relname);
			  intoCubeQuery->data = palloc(sizeof(char));
			  appendStringInfo(intoCubeQuery, "CREATE TABLE %s AS %s", intoRelation, cubeQuery->data);
//			  printf("CUBE QUERY IS %s\n",intoCubeQuery->data);
			  executeQuery(intoCubeQuery, context);
	}
	else
//	{
//			  printf("CUBE QUERY IS %s\n",cubeQuery->data);
			  executeQuery(cubeQuery, context);
//	}
			
}

/* ----------------------------------------------
 * dropCubeTable
 *
 * 	This routine is used to drop the temporary table
 * 	created using getTableName().
 *
 * ----------------------------------------------
 */
void dropCubeTable (char *tableName, MemoryContext context)
{
		  StringInfo dropTableQuery = makeStringInfo();

		  appendStringInfo(dropTableQuery, "DROP TABLE %s", tableName);
//		  printf("dropTableQuery is %s\n\n",dropTableQuery->data);
		  executeQuery(dropTableQuery, context);
}

datacube.htext/plain; charset=US-ASCII; name=datacube.hDownload

README.datacubetext/plain; charset=ISO-8859-1; name=README.datacubeDownload

Andrew Dunstan

andrew@dunslane.net

over 22 years ago

In reply to: sumit (#1)

Re: Patch for adding DATACUBE operator

I'm curious to know why you implement this as a union of queries, since,
unless my understanding is badly awry, you have all the information
necessary for the ALL rows by running the base (ie. without cube) query, Why
not just run that query and then add the ALL rows from examining the
results? ISTM that would be more efficient, since the summary table is in
most real world situations likely to be far, far smaller than the base
table.

andrew

----- Original Message -----
From: "sumit" <sumit@gdit.iiit.net>
To: <pgsql-patches@postgresql.org>
Sent: Monday, June 30, 2003 6:04 AM
Subject: [PATCHES] Patch for adding DATACUBE operator

Hi!

We have added the CUBE operator for PostgreSQL. Please find the
attached patch.

Another thing to note is that the file datacube.c should
be placed in src/backend/tcop/ and datacube.h should be in src/include.

The syntax of the query is

SELECT <field list><aggregate list>
INTO <destination table>
FROM <table expression>
WHERE <search condition>
GROUP BY <aggregate list>
HAVING <search condition>
WITH CUBE;

An example along with the output is provided in the
README.datacube file. Kindly have a look. Let us know your response.

Srikanth M
Sumit Kumar Mukherjee

----------------------------------------------------------------------------
----

Show quoted text

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

sumit

sumit@gdit.iiit.net

over 22 years ago

In reply to: Andrew Dunstan (#2)

Re: Patch for adding DATACUBE operator

Well, it is true that the information for the ALL rows is present
in the base query. But then adding the ALL rows directly after examining
the result, we think, is not as efficient as it sounds.

Consider the situation you are talking about. Suppose we decide to
add the ALL rows directly to the base result. A major drawback would be
that we would have to execute the whole WHERE clause and/or joins over the
original relations, which, in case of large databases, would cause a lot of
overhead. Whereas, executing the UNION of SELECT queries on the base table
obtained from the base query (WITHOUT CUBE) does not have to handle these
situations (WHERE clause, JOIN cases). More importantly, this is faster
(due to the relatively smaller size).

It is true that a summary table is much smaller than the base
table. But a DATACUBE (as per definition) takes into account even the base
query (the one WITHOUT CUBE). We have referred some sites and articles
before planning the code. Since we are building a DATACUBE all the
information is required.

Hope we have made ourselves clear.

Sumit
Srikanth

On Mon, 30 Jun 2003, Andrew Dunstan wrote:

Show quoted text

I'm curious to know why you implement this as a union of queries, since,
unless my understanding is badly awry, you have all the information
necessary for the ALL rows by running the base (ie. without cube) query, Why
not just run that query and then add the ALL rows from examining the
results? ISTM that would be more efficient, since the summary table is in
most real world situations likely to be far, far smaller than the base
table.

andrew

Tom Lane

tgl@sss.pgh.pa.us

over 22 years ago

In reply to: sumit (#1)

Re: Patch for adding DATACUBE operator

sumit <sumit@gdit.iiit.net> writes:

In keeping with this, we first define the syntax of a CUBE statement as follows:

SELECT <field list><aggregate list>
INTO <destination table>
FROM <table expression>
WHERE <search condition>
GROUP BY <aggregate list>
HAVING <search condition>
WITH CUBE;

SQL99 defines a GROUP BY CUBE(...) operation, which seems to do the same
thing as this. Why are you using a nonstandard syntax?

regards, tom lane

sumit

sumit@gdit.iiit.net

over 22 years ago

In reply to: Tom Lane (#4)

Re: Patch for adding DATACUBE operator

Thanks for letting know. Could you also let me know the exact
syntax, I mean, we are not sure whether GROUP BY CUBE(...) is followed by
a HAVING clause. Kindly inform us soon so that we can make the changes and
send you the updated patch and files.

Sumit

On Mon, 30 Jun 2003, Tom Lane wrote:

Show quoted text

SQL99 defines a GROUP BY CUBE(...) operation, which seems to do the same
thing as this. Why are you using a nonstandard syntax?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: sumit (#5)

Re: Patch for adding DATACUBE operator

Moving to pgsql-hackers@

On Mon, 2003-06-30 at 12:03, sumit wrote:

Thanks for letting know. Could you also let me know the exact
syntax, I mean, we are not sure whether GROUP BY CUBE(...) is followed by
a HAVING clause. Kindly inform us soon so that we can make the changes and
send you the updated patch and files.

Sources of the SQL Spec
http://developer.postgresql.org/readtext.php?src/FAQ/FAQ_DEV.html+Developers-FAQ#1.12

The relevent syntax appears to be in section 7.9, and is a part of the
GROUP BY clause, so is potentially followed by a HAVING clause.

7.9 <group by clause>

Function

Specify a grouped table derived by the application of the
<group by
clause> to the result of the previously specified clause.

Format

<group by clause> ::=
GROUP BY <grouping specification>

<rollup list> ::=
ROLLUP <left paren> <grouping column reference list>
<right paren>

<grouping sets list> ::=
GROUPING SETS <left paren> <grouping set list> <right
paren>

<grouping set list> ::=
<grouping set> [ { <comma> <grouping set> }... ]

<grouping column reference list> ::=
<grouping column reference>
[ { <comma> <grouping column reference> }... ]

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

Tom Lane

tgl@sss.pgh.pa.us

over 22 years ago

In reply to: sumit (#5)

Re: Patch for adding DATACUBE operator

sumit <sumit@gdit.iiit.net> writes:

Thanks for letting know. Could you also let me know the exact
syntax, I mean, we are not sure whether GROUP BY CUBE(...) is followed by
a HAVING clause.

Better look at the SQL99 spec for yourself. There's a link to a draft
version in our developer's FAQ, IIRC.

regards, tom lane