add more frame types in window functions (ROWS)

Started by Hitoshi Haradaabout 16 years ago21 messages
#1Hitoshi Harada
umi.tanuki@gmail.com
1 attachment(s)

Attached is the patch against HEAD to support more frame types in
window functions, including these frame types:

- ROWS BETWEEN s PRECEDING AND e PRECEDING
- ROWS BETWEEN s PRECEDING AND CURRENT ROW
- ROWS BETWEEN s PRECEDING AND e FOLLOWING
- ROWS BETWEEN s PRECEDING AND UNBOUNDED FOLLOWING
- ROWS BETWEEN CURRENT ROW AND e FOLLOWING
- ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
- ROWS BETWEEN s FOLLOWING AND e FOLLOWING
- ROWS BETWEEN s FOLLOWING AND UNBOUNDED FOLLOWING
- RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
- ROWS s PRECEDING

which means that for ROWS types we now support almost all types but
for RANGE types we don't have "value PRECEDING" / "value FOLLOWING".
I'm planning to implement them until CommitFest:2010-01 so this is
"Request for Review" phase though I've arranged the patch to be
committable.

Aggregate cache mechanism is sometimes cleared as discussed the other
day. But it should be kept that the original cache mechanism for basic
(i.e. already implemented) frame types.

Some points to be reviewed are:

- Added WindowFrameDef in parsenode.h
- Is A_Const member for startOffset / endOffset is appropriate?
- Are those data types (including gram.y) well designed?
- For basic frame types, are aggregates still optimized as before?
- Added regression tests but enough?
- All error case covered? For example: ROWS s FOLLOWING AND e PRECEDING
- Added members to WindowAggState to track more information of frame
types, but isn't there more efficient way?
- Not modified docs yet

Regards,

--
Hitoshi Harada

Attachments:

rows_frame_types.20091114.patchapplication/octet-stream; name=rows_frame_types.20091114.patchDownload
*** a/src/backend/executor/nodeWindowAgg.c
--- b/src/backend/executor/nodeWindowAgg.c
***************
*** 164,170 **** static void spool_tuples(WindowAggState *winstate, int64 pos);
  static void release_partition(WindowAggState *winstate);
  
  static bool row_is_in_frame(WindowAggState *winstate, int64 pos,
! 				TupleTableSlot *slot);
  static void update_frametailpos(WindowObject winobj, TupleTableSlot *slot);
  
  static WindowStatePerAggData *initialize_peragg(WindowAggState *winstate,
--- 164,171 ----
  static void release_partition(WindowAggState *winstate);
  
  static bool row_is_in_frame(WindowAggState *winstate, int64 pos,
! 							TupleTableSlot *slot);
! static void update_frameheadpos(WindowObject winobj, TupleTableSlot *slot);
  static void update_frametailpos(WindowObject winobj, TupleTableSlot *slot);
  
  static WindowStatePerAggData *initialize_peragg(WindowAggState *winstate,
***************
*** 391,396 **** eval_windowaggregates(WindowAggState *winstate)
--- 392,398 ----
  	MemoryContext oldContext;
  	ExprContext *econtext;
  	TupleTableSlot *agg_row_slot;
+ 	WindowObject agg_winobj;
  
  	numaggs = winstate->numaggs;
  	if (numaggs == 0)
***************
*** 398,405 **** eval_windowaggregates(WindowAggState *winstate)
--- 400,410 ----
  
  	/* final output execution is in ps_ExprContext */
  	econtext = winstate->ss.ps.ps_ExprContext;
+ 	agg_winobj = winstate->agg_winobj;
+ 	agg_row_slot = winstate->agg_row_slot;
  
  	/*
+ 	 * XXX:
  	 * Currently, we support only a subset of the SQL-standard window framing
  	 * rules.  In all the supported cases, the window frame always consists of
  	 * a contiguous group of rows extending forward from the start of the
***************
*** 434,466 **** eval_windowaggregates(WindowAggState *winstate)
  	 */
  
  	/*
! 	 * If we've already aggregated up through current row, reuse the saved
! 	 * result values.  NOTE: this test works for the currently supported
! 	 * framing rules, but will need fixing when more are added.
  	 */
! 	if (winstate->aggregatedupto > winstate->currentpos)
  	{
  		for (i = 0; i < numaggs; i++)
  		{
  			peraggstate = &winstate->peragg[i];
  			wfuncno = peraggstate->wfuncno;
! 			econtext->ecxt_aggvalues[wfuncno] = peraggstate->resultValue;
! 			econtext->ecxt_aggnulls[wfuncno] = peraggstate->resultValueIsNull;
  		}
- 		return;
  	}
  
! 	/* Initialize aggregates on first call for partition */
! 	if (winstate->currentpos == 0)
  	{
  		for (i = 0; i < numaggs; i++)
  		{
  			peraggstate = &winstate->peragg[i];
  			wfuncno = peraggstate->wfuncno;
! 			initialize_windowaggregate(winstate,
! 									   &winstate->perfunc[wfuncno],
! 									   peraggstate);
  		}
  	}
  
  	/*
--- 439,508 ----
  	 */
  
  	/*
! 	 * First, update frame positions. Aggregate framing detection is similar
! 	 * to one in window function APIs, but it optimizes it's own flow to
! 	 * detect frame head and tail, which finds the tail during aggregation
! 	 * rather than before starting. This results in tuplestore buffering
! 	 * be minimized in such simple frame like RANGE BETWEEN UNBOUNDED PRECEDING
! 	 * AND CURRENT ROW (ie default when specifying only ORDER BY clause.)
! 	 */
! 	update_frameheadpos(agg_winobj, winstate->agg_row_slot);
! 
! 	/*
! 	 * Initialize aggregates on first call for partition or when some rows
! 	 * were off from frame since the last call.
  	 */
! 	if (winstate->currentpos == 0 ||
! 		winstate->frameheadpos > winstate->aggregatedbase)
  	{
  		for (i = 0; i < numaggs; i++)
  		{
  			peraggstate = &winstate->peragg[i];
  			wfuncno = peraggstate->wfuncno;
! 			initialize_windowaggregate(winstate,
! 									   &winstate->perfunc[wfuncno],
! 									   peraggstate);
! 		}
! 
! 		winstate->aggregatedbase = winstate->frameheadpos;
! 
! 		/*
! 		 * When frame head is advancing, aggregate mark position should be
! 		 * updated so that tuplestore can discard unnecessary rows.
! 		 * When the condition is false (i.e. FRAMEOPTION_START_UNBOUNDED_PRECEDING),
! 		 * mark_ptr is invalid and we don't care about mark position.
! 		 */
! 		if (agg_winobj->markptr > 0)
! 		{
! 			if (winstate->frameheadpos < winstate->currentpos)
! 				WinSetMarkPosition(agg_winobj, winstate->frameheadpos - 1);
! 
! 			winstate->aggregatedupto = winstate->aggregatedbase;
! 			/*
! 			 * Discard current state, so that subsequent loop starts with null slot
! 			 */
! 			if (!TupIsNull(agg_row_slot))
! 				ExecClearTuple(agg_row_slot);
  		}
  	}
  
! 	/*
! 	 * If we've already aggregated up through current row and frame covers
! 	 * current row, reuse the saved result values. NOTE: this test should get
! 	 * more efficient in some framing modes.
! 	 */
! 	if (winstate->frametail_stable &&
! 		winstate->aggregatedbase <= winstate->currentpos &&
! 		winstate->aggregatedupto > winstate->currentpos)
  	{
  		for (i = 0; i < numaggs; i++)
  		{
  			peraggstate = &winstate->peragg[i];
  			wfuncno = peraggstate->wfuncno;
! 			econtext->ecxt_aggvalues[wfuncno] = peraggstate->resultValue;
! 			econtext->ecxt_aggnulls[wfuncno] = peraggstate->resultValueIsNull;
  		}
+ 		return;
  	}
  
  	/*
***************
*** 470,486 **** eval_windowaggregates(WindowAggState *winstate)
  	 * at position aggregatedupto.	The agg_ptr read pointer must always point
  	 * to the next row to read into agg_row_slot.
  	 */
- 	agg_row_slot = winstate->agg_row_slot;
  	for (;;)
  	{
  		/* Fetch next row if we didn't already */
  		if (TupIsNull(agg_row_slot))
  		{
! 			spool_tuples(winstate, winstate->aggregatedupto);
! 			tuplestore_select_read_pointer(winstate->buffer,
! 										   winstate->agg_ptr);
! 			if (!tuplestore_gettupleslot(winstate->buffer, true, true,
! 										 agg_row_slot))
  				break;			/* must be end of partition */
  		}
  
--- 512,530 ----
  	 * at position aggregatedupto.	The agg_ptr read pointer must always point
  	 * to the next row to read into agg_row_slot.
  	 */
  	for (;;)
  	{
  		/* Fetch next row if we didn't already */
  		if (TupIsNull(agg_row_slot))
  		{
! 			/*
! 			 * NOTE:
! 			 * Here window_gettupleslot() tends to fetch current row
! 			 * (but not always). We should optimize window_gettupleslot()
! 			 * on this case, where reads ahead and back currently.
! 			 */
! 			if(!window_gettupleslot(agg_winobj, winstate->aggregatedupto,
! 									agg_row_slot))
  				break;			/* must be end of partition */
  		}
  
***************
*** 627,634 **** begin_partition(WindowAggState *winstate)
  	winstate->frametail_valid = false;
  	winstate->spooled_rows = 0;
  	winstate->currentpos = 0;
  	winstate->frametailpos = -1;
! 	winstate->aggregatedupto = 0;
  	ExecClearTuple(winstate->agg_row_slot);
  
  	/*
--- 671,679 ----
  	winstate->frametail_valid = false;
  	winstate->spooled_rows = 0;
  	winstate->currentpos = 0;
+ 	winstate->frameheadpos = -1;
  	winstate->frametailpos = -1;
! 
  	ExecClearTuple(winstate->agg_row_slot);
  
  	/*
***************
*** 665,671 **** begin_partition(WindowAggState *winstate)
  
  	/* create a read pointer for aggregates, if needed */
  	if (winstate->numaggs > 0)
! 		winstate->agg_ptr = tuplestore_alloc_read_pointer(winstate->buffer, 0);
  
  	/* create mark and read pointers for each real window function */
  	for (i = 0; i < numfuncs; i++)
--- 710,738 ----
  
  	/* create a read pointer for aggregates, if needed */
  	if (winstate->numaggs > 0)
! 	{
! 		WindowObject agg_winobj = winstate->agg_winobj;
! 		int			readptr_flag = 0;
! 		WindowAgg  *node = (WindowAgg *) winstate->ss.ps.plan;
! 		int			frameOptions = node->frameOptions;
! 
! 		/* create a read pointer for aggregate marking if the frame head may move */
! 		if (!(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING))
! 		{
! 			agg_winobj->markptr = tuplestore_alloc_read_pointer(winstate->buffer, 0);
! 			readptr_flag = EXEC_FLAG_BACKWARD;
! //			winstate->agg_base_ptr = tuplestore_alloc_read_pointer(winstate->buffer, 0);
! 		}
! 
! 		agg_winobj->readptr = tuplestore_alloc_read_pointer(winstate->buffer,
! 															readptr_flag);
! //		winstate->agg_ptr = tuplestore_alloc_read_pointer(winstate->buffer, 0);
! 
! 		agg_winobj->markpos = -1;
! 		agg_winobj->seekpos = -1;
! 		winstate->aggregatedbase = 0;
! 		winstate->aggregatedupto = 0;
! 	}
  
  	/* create mark and read pointers for each real window function */
  	for (i = 0; i < numfuncs; i++)
***************
*** 814,842 **** row_is_in_frame(WindowAggState *winstate, int64 pos, TupleTableSlot *slot)
  
  	Assert(pos >= 0);			/* else caller error */
  
! 	/* We only support frame start mode UNBOUNDED PRECEDING for now */
! 	Assert(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING);
  
  	/* In UNBOUNDED FOLLOWING mode, all partition rows are in frame */
  	if (frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING)
  		return true;
  
! 	/* Else frame tail mode must be CURRENT ROW */
! 	Assert(frameOptions & FRAMEOPTION_END_CURRENT_ROW);
  
! 	/* if row is current row or a predecessor, it must be in frame */
! 	if (pos <= winstate->currentpos)
! 		return true;
  
! 	/* In ROWS mode, *only* such rows are in frame */
! 	if (frameOptions & FRAMEOPTION_ROWS)
! 		return false;
  
! 	/* Else must be RANGE mode */
! 	Assert(frameOptions & FRAMEOPTION_RANGE);
  
! 	/* In frame iff it's a peer of current row */
! 	return are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot);
  }
  
  /*
--- 881,1073 ----
  
  	Assert(pos >= 0);			/* else caller error */
  
! 	if (frameOptions & FRAMEOPTION_START_CURRENT_ROW)
! 	{
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			if (pos < winstate->currentpos)
! 				return false;
! 		}
! 		else
! 		{
! 			Assert(frameOptions & FRAMEOPTION_RANGE);
! 
! 			/* preceding row that is not peer is out of frame */
! 			if (pos < winstate->currentpos &&
! 				!are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot))
! 				return false;
! 		}
! 	}
! 
! 	if (frameOptions & FRAMEOPTION_START_VALUE_PRECEDING)
! 	{
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			A_Const	   *offset = (A_Const *) node->startOffset;
! 
! 			Assert(offset != NULL && IsA(&(offset->val), Integer));
! 			if (pos < winstate->currentpos - intVal(&(offset->val)))
! 				return false;
! 		}
! 		else
! 			elog(ERROR, "RANGE BETWEEN n PRECEDING AND ... is not supported so far");
! 	}
! 
! 	if (frameOptions & FRAMEOPTION_START_VALUE_FOLLOWING)
! 	{
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			A_Const	   *offset = (A_Const *) node->startOffset;
! 
! 			Assert(offset != NULL && IsA(&(offset->val), Integer));
! 			if (pos < winstate->currentpos + intVal(&(offset->val)))
! 				return false;
! 		}
! 		else
! 			elog(ERROR, "RANGE BETWEEN n FOLLOWING AND ... is not supported so far");
! 	}
! 
! 	/* seems ok for starting bound */
  
  	/* In UNBOUNDED FOLLOWING mode, all partition rows are in frame */
  	if (frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING)
  		return true;
  
! 	if (frameOptions & FRAMEOPTION_END_CURRENT_ROW)
! 	{
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			return (pos <= winstate->currentpos);
! 		}
! 		else
! 		{
! 			Assert(frameOptions & FRAMEOPTION_RANGE);
  
! 			/* preceding row or peer is in frame */
! 			return pos <= winstate->currentpos ||
! 				are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot);
! 		}
! 	}
  
! 	if (frameOptions & FRAMEOPTION_END_VALUE_PRECEDING)
! 	{
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			A_Const	   *offset = (A_Const *) node->endOffset;
! 
! 			Assert(offset != NULL && IsA(&(offset->val), Integer));
! 			return (pos <= winstate->currentpos - intVal(&(offset->val)));
! 		}
! 		else
! 			elog(ERROR, "RANGE BETWEEN ... AND n PRECEDING is not supported so far");
! 	}
! 
! 	if (frameOptions & FRAMEOPTION_END_VALUE_FOLLOWING)
! 	{
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			A_Const	   *offset = (A_Const *) node->endOffset;
! 
! 			Assert(offset != NULL && IsA(&(offset->val), Integer));
! 			return (pos <= winstate->currentpos + intVal(&(offset->val)));
! 		}
! 		else
! 			elog(ERROR, "RANGE BETWEEN ... AND n FOLLOWING is not supported so far");
! 	}
! 
! 	/* It is not supposed to reach here */
! 	elog(ERROR, "invalid frame type");
! 
! 	return false; /* keep compiler quiet */
! }
! 
! static void
! update_frameheadpos(WindowObject winobj, TupleTableSlot *slot)
! {
! 	WindowAggState *winstate = winobj->winstate;
! 	WindowAgg  *node = (WindowAgg *) winstate->ss.ps.plan;
! 	int			frameOptions = node->frameOptions;
! 
! 	if (winstate->framehead_valid)
! 		return;
! 
! 	if (frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING)
! 	{
! 		winstate->frameheadpos = 0;
! 		winstate->framehead_valid = true;
! 		return;
! 	}
  
! 	if (frameOptions & FRAMEOPTION_START_CURRENT_ROW)
! 	{
! 		int64		pos = winstate->currentpos;
  
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			winstate->frameheadpos = pos;
! 			winstate->framehead_valid = true;
! 			return;
! 		}
! 		Assert(frameOptions & FRAMEOPTION_RANGE);
! 
! 		if (node->ordNumCols == 0)
! 		{
! 			winstate->frameheadpos = 0;
! 			winstate->framehead_valid = true;
! 			return;
! 		}
! 		/*
! 		 * In RANGE START_CURRENT mode, frame head is
! 		 * the head row of all peers.
! 		 */
! 		for(;;)
! 		{
! 			if (!window_gettupleslot(winobj, pos, slot))
! 				break;
! 			if (!are_peers(winstate, slot,
! 						   winstate->ss.ss_ScanTupleSlot))
! 				break;
! 			pos--;
! 		}
! 		winstate->frameheadpos = pos + 1;
! 		winstate->framehead_valid = true;
! 		return;
! 	}
! 
! 	if (frameOptions & FRAMEOPTION_START_VALUE_PRECEDING)
! 	{
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			A_Const	   *offset = (A_Const *) node->startOffset;
! 
! 			Assert(offset != NULL && IsA(&(offset->val), Integer));
! 			winstate->frameheadpos = winstate->currentpos - intVal(&(offset->val));
! 			if (winstate->frameheadpos < 0)
! 				winstate->frameheadpos = 0;
! 			winstate->framehead_valid = true;
! 			return;
! 		}
! 		elog(ERROR, "RANGE BETWEEN n PRECEDING AND ... is not supported so far");
! 	}
! 
! 	if (frameOptions & FRAMEOPTION_START_VALUE_FOLLOWING)
! 	{
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			A_Const	   *offset = (A_Const *) node->startOffset;
! 
! 			Assert(offset != NULL && IsA(&(offset->val), Integer));
! 			winstate->frameheadpos = winstate->currentpos + intVal(&(offset->val));
! 
! 			/* we have to know end of the partition to cut off overflow */
! 			spool_tuples(winstate, -1);
! 			if (winstate->frameheadpos >= winstate->spooled_rows)
! 				winstate->frameheadpos = winstate->spooled_rows - 1;
! 			winstate->framehead_valid = true;
! 			return;
! 		}
! 		elog(ERROR, "RANGE BETWEEN n FOLLOWING AND ... is not supported so far");
! 	}
  }
  
  /*
***************
*** 858,866 **** update_frametailpos(WindowObject winobj, TupleTableSlot *slot)
  	if (winstate->frametail_valid)
  		return;					/* already known for current row */
  
- 	/* We only support frame start mode UNBOUNDED PRECEDING for now */
- 	Assert(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING);
- 
  	/* In UNBOUNDED FOLLOWING mode, all partition rows are in frame */
  	if (frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING)
  	{
--- 1089,1094 ----
***************
*** 870,916 **** update_frametailpos(WindowObject winobj, TupleTableSlot *slot)
  		return;
  	}
  
! 	/* Else frame tail mode must be CURRENT ROW */
! 	Assert(frameOptions & FRAMEOPTION_END_CURRENT_ROW);
! 
! 	/* In ROWS mode, exactly the rows up to current are in frame */
! 	if (frameOptions & FRAMEOPTION_ROWS)
  	{
! 		winstate->frametailpos = winstate->currentpos;
! 		winstate->frametail_valid = true;
! 		return;
! 	}
  
! 	/* Else must be RANGE mode */
! 	Assert(frameOptions & FRAMEOPTION_RANGE);
  
! 	/* If no ORDER BY, all rows are peers with each other */
! 	if (node->ordNumCols == 0)
  	{
! 		spool_tuples(winstate, -1);
! 		winstate->frametailpos = winstate->spooled_rows - 1;
! 		winstate->frametail_valid = true;
! 		return;
  	}
  
! 	/*
! 	 * Else we have to search for the first non-peer of the current row. We
! 	 * assume the current value of frametailpos is a lower bound on the
! 	 * possible frame tail location, ie, frame tail never goes backward, and
! 	 * that currentpos is also a lower bound, ie, current row is always in
! 	 * frame.
! 	 */
! 	ftnext = Max(winstate->frametailpos, winstate->currentpos) + 1;
! 	for (;;)
  	{
! 		if (!window_gettupleslot(winobj, ftnext, slot))
! 			break;				/* end of partition */
! 		if (!are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot))
! 			break;				/* not peer of current row */
! 		ftnext++;
  	}
- 	winstate->frametailpos = ftnext - 1;
- 	winstate->frametail_valid = true;
  }
  
  
--- 1098,1185 ----
  		return;
  	}
  
! 	if (frameOptions & FRAMEOPTION_END_CURRENT_ROW)
  	{
! 		/* In ROWS mode, exactly the rows up to current are in frame */
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			winstate->frametailpos = winstate->currentpos;
! 			winstate->frametail_valid = true;
! 			return;
! 		}
! 		else
! 		{
! 			/* Else must be RANGE mode */
! 			Assert(frameOptions & FRAMEOPTION_RANGE);
  
! 			/* If no ORDER BY, all rows are peers with each other */
! 			if (node->ordNumCols == 0)
! 			{
! 				spool_tuples(winstate, -1);
! 				winstate->frametailpos = winstate->spooled_rows - 1;
! 				winstate->frametail_valid = true;
! 				return;
! 			}
  
! 			/*
! 			 * Else we have to search for the first non-peer of the current row. We
! 			 * assume the current value of frametailpos is a lower bound on the
! 			 * possible frame tail location, ie, frame tail never goes backward, and
! 			 * that currentpos is also a lower bound, ie, current row is always in
! 			 * frame.
! 			 */
! 			ftnext = Max(winstate->frametailpos, winstate->currentpos) + 1;
! 			for (;;)
! 			{
! 				if (!window_gettupleslot(winobj, ftnext, slot))
! 					break;				/* end of partition */
! 				if (!are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot))
! 					break;				/* not peer of current row */
! 				ftnext++;
! 			}
! 			winstate->frametailpos = ftnext - 1;
! 			winstate->frametail_valid = true;
! 			return;
! 		}
! 	}
! 
! 	if (frameOptions & FRAMEOPTION_END_VALUE_PRECEDING)
  	{
! 		/* In ROWS mode, frame boundary phisically n before current */
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			A_Const	   *offset = (A_Const *) node->endOffset;
! 
! 			Assert(offset != NULL && IsA(&(offset->val), Integer));
! 			winstate->frametailpos = winstate->currentpos - intVal(&(offset->val));
! 			if(winstate->frametailpos < 0)
! 				winstate->frametailpos = 0;
! 			winstate->frametail_valid = true;
! 			return;
! 		}
! 		else
! 			elog(ERROR, "RANGE BETWEEN ... AND n PRECEDING is not supported for now");
  	}
  
! 	if (frameOptions & FRAMEOPTION_END_VALUE_FOLLOWING)
  	{
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			A_Const	   *offset = (A_Const *) node->endOffset;
! 
! 			Assert(offset != NULL && IsA(&(offset->val), Integer));
! 			winstate->frametailpos = winstate->currentpos + intVal(&(offset->val));
! 
! 			/* we have to know end of the partition to cut off overflow */
! 			spool_tuples(winstate, -1);
! 			if (winstate->frametailpos >= winstate->spooled_rows)
! 				winstate->frametailpos = winstate->spooled_rows - 1;
! 			winstate->frametail_valid = true;
! 			return;
! 		}
! 		else
! 			elog(ERROR, "RANGE BETWEEN ... AND n FOLLOWING is not supported for now");
  	}
  }
  
  
***************
*** 964,969 **** restart:
--- 1233,1239 ----
  	{
  		/* Advance current row within partition */
  		winstate->currentpos++;
+ 		winstate->framehead_valid = false;
  		/* This might mean that the frame tail moves, too */
  		winstate->frametail_valid = false;
  	}
***************
*** 1264,1272 **** ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
--- 1534,1558 ----
  	winstate->numfuncs = wfuncno + 1;
  	winstate->numaggs = aggno + 1;
  
+ 	if (winstate->numaggs > 0)
+ 	{
+ 		WindowObject agg_winobj = makeNode(WindowObjectData);
+ 
+ 		agg_winobj->winstate = winstate;
+ 		agg_winobj->argstates = NULL;
+ 		agg_winobj->localmem = NULL;
+ 		/* make sure markptr = -1 to invalidate. it may not be used */
+ 		agg_winobj->markptr = -1;
+ 		agg_winobj->readptr = -1;
+ 		winstate->agg_winobj = agg_winobj;
+ 	}
+ 
  	winstate->partition_spooled = false;
  	winstate->more_partitions = false;
  
+ 	winstate->frametail_stable = !(node->frameOptions &
+ 								  (FRAMEOPTION_END_VALUE_PRECEDING |
+ 								   FRAMEOPTION_END_VALUE_FOLLOWING));
  	return winstate;
  }
  
***************
*** 1749,1754 **** WinGetFuncArgInPartition(WindowObject winobj, int argno,
--- 2035,2042 ----
  						 bool *isnull, bool *isout)
  {
  	WindowAggState *winstate;
+ 	WindowAgg  *node;
+ 	int			frameOptions;
  	ExprContext *econtext;
  	TupleTableSlot *slot;
  	bool		gottuple;
***************
*** 1756,1761 **** WinGetFuncArgInPartition(WindowObject winobj, int argno,
--- 2044,2051 ----
  
  	Assert(WindowObjectIsValid(winobj));
  	winstate = winobj->winstate;
+ 	node = (WindowAgg *) winstate->ss.ps.plan;
+ 	frameOptions = node->frameOptions;
  	econtext = winstate->ss.ps.ps_ExprContext;
  	slot = winstate->temp_slot_1;
  
***************
*** 1791,1797 **** WinGetFuncArgInPartition(WindowObject winobj, int argno,
  		if (isout)
  			*isout = false;
  		if (set_mark)
! 			WinSetMarkPosition(winobj, abs_pos);
  		econtext->ecxt_outertuple = slot;
  		return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
  							econtext, isnull, NULL);
--- 2081,2103 ----
  		if (isout)
  			*isout = false;
  		if (set_mark)
! 		{
! 			int64	mark_pos = abs_pos;
! 
! 			/*
! 			 * In moving RANGE frame mode, we must keep mark at
! 			 * frameheadpos - 1, since that row is to be checked
! 			 * to find frame starting boundary.
! 			 */
! 			if (!(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING) &&
! 				frameOptions & FRAMEOPTION_RANGE)
! 			{
! 				update_frameheadpos(winobj, winstate->temp_slot_2);
! 				if (mark_pos >= winstate->frameheadpos)
! 					mark_pos = winstate->frameheadpos - 1;
! 			}
! 			WinSetMarkPosition(winobj, mark_pos);
! 		}
  		econtext->ecxt_outertuple = slot;
  		return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
  							econtext, isnull, NULL);
***************
*** 1822,1827 **** WinGetFuncArgInFrame(WindowObject winobj, int argno,
--- 2128,2135 ----
  					 bool *isnull, bool *isout)
  {
  	WindowAggState *winstate;
+ 	WindowAgg  *node;
+ 	int			frameOptions;
  	ExprContext *econtext;
  	TupleTableSlot *slot;
  	bool		gottuple;
***************
*** 1829,1834 **** WinGetFuncArgInFrame(WindowObject winobj, int argno,
--- 2137,2144 ----
  
  	Assert(WindowObjectIsValid(winobj));
  	winstate = winobj->winstate;
+ 	node = (WindowAgg *) winstate->ss.ps.plan;
+ 	frameOptions = node->frameOptions;
  	econtext = winstate->ss.ps.ps_ExprContext;
  	slot = winstate->temp_slot_1;
  
***************
*** 1838,1844 **** WinGetFuncArgInFrame(WindowObject winobj, int argno,
  			abs_pos = winstate->currentpos + relpos;
  			break;
  		case WINDOW_SEEK_HEAD:
! 			abs_pos = relpos;
  			break;
  		case WINDOW_SEEK_TAIL:
  			update_frametailpos(winobj, slot);
--- 2148,2155 ----
  			abs_pos = winstate->currentpos + relpos;
  			break;
  		case WINDOW_SEEK_HEAD:
! 			update_frameheadpos(winobj, slot);
! 			abs_pos = winstate->frameheadpos + relpos;
  			break;
  		case WINDOW_SEEK_TAIL:
  			update_frametailpos(winobj, slot);
***************
*** 1865,1872 **** WinGetFuncArgInFrame(WindowObject winobj, int argno,
  	{
  		if (isout)
  			*isout = false;
  		if (set_mark)
! 			WinSetMarkPosition(winobj, abs_pos);
  		econtext->ecxt_outertuple = slot;
  		return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
  							econtext, isnull, NULL);
--- 2176,2200 ----
  	{
  		if (isout)
  			*isout = false;
+ 
  		if (set_mark)
! 		{
! 			int64		mark_pos = abs_pos;
! 
! 			/*
! 			 * In moving RANGE frame mode, we must keep mark at
! 			 * frameheadpos - 1, since that row is to be checked
! 			 * to find frame starting boundary.
! 			 */
! 			if (!(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING) &&
! 				frameOptions & FRAMEOPTION_RANGE)
! 			{
! 				update_frameheadpos(winobj, winstate->temp_slot_2);
! 				if (mark_pos >= winstate->frameheadpos)
! 					mark_pos = winstate->frameheadpos - 1;
! 			}
! 			WinSetMarkPosition(winobj, mark_pos);
! 		}
  		econtext->ecxt_outertuple = slot;
  		return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
  							econtext, isnull, NULL);
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 718,723 **** _copyWindowAgg(WindowAgg *from)
--- 718,725 ----
  		COPY_POINTER_FIELD(ordOperators, from->ordNumCols * sizeof(Oid));
  	}
  	COPY_SCALAR_FIELD(frameOptions);
+ 	COPY_NODE_FIELD(startOffset);
+ 	COPY_NODE_FIELD(endOffset);
  
  	return newnode;
  }
***************
*** 1844,1850 **** _copyWindowClause(WindowClause *from)
  	COPY_STRING_FIELD(refname);
  	COPY_NODE_FIELD(partitionClause);
  	COPY_NODE_FIELD(orderClause);
! 	COPY_SCALAR_FIELD(frameOptions);
  	COPY_SCALAR_FIELD(winref);
  	COPY_SCALAR_FIELD(copiedOrder);
  
--- 1846,1852 ----
  	COPY_STRING_FIELD(refname);
  	COPY_NODE_FIELD(partitionClause);
  	COPY_NODE_FIELD(orderClause);
! 	COPY_NODE_FIELD(frame);
  	COPY_SCALAR_FIELD(winref);
  	COPY_SCALAR_FIELD(copiedOrder);
  
***************
*** 2062,2067 **** _copySortBy(SortBy *from)
--- 2064,2082 ----
  	return newnode;
  }
  
+ static WindowFrameDef *
+ _copyWindowFrameDef(WindowFrameDef *from)
+ {
+ 	WindowFrameDef *newnode = makeNode(WindowFrameDef);
+ 
+ 	COPY_SCALAR_FIELD(options);
+ 	COPY_NODE_FIELD(startOffset);
+ 	COPY_NODE_FIELD(endOffset);
+ 	COPY_LOCATION_FIELD(location);
+ 
+ 	return newnode;
+ }
+ 
  static WindowDef *
  _copyWindowDef(WindowDef *from)
  {
***************
*** 2071,2077 **** _copyWindowDef(WindowDef *from)
  	COPY_STRING_FIELD(refname);
  	COPY_NODE_FIELD(partitionClause);
  	COPY_NODE_FIELD(orderClause);
! 	COPY_SCALAR_FIELD(frameOptions);
  	COPY_LOCATION_FIELD(location);
  
  	return newnode;
--- 2086,2092 ----
  	COPY_STRING_FIELD(refname);
  	COPY_NODE_FIELD(partitionClause);
  	COPY_NODE_FIELD(orderClause);
! 	COPY_NODE_FIELD(frame);
  	COPY_LOCATION_FIELD(location);
  
  	return newnode;
***************
*** 4154,4159 **** copyObject(void *from)
--- 4169,4177 ----
  		case T_SortBy:
  			retval = _copySortBy(from);
  			break;
+ 		case T_WindowFrameDef:
+ 			retval = _copyWindowFrameDef(from);
+ 			break;
  		case T_WindowDef:
  			retval = _copyWindowDef(from);
  			break;
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 2035,2047 **** _equalSortBy(SortBy *a, SortBy *b)
  }
  
  static bool
  _equalWindowDef(WindowDef *a, WindowDef *b)
  {
  	COMPARE_STRING_FIELD(name);
  	COMPARE_STRING_FIELD(refname);
  	COMPARE_NODE_FIELD(partitionClause);
  	COMPARE_NODE_FIELD(orderClause);
! 	COMPARE_SCALAR_FIELD(frameOptions);
  	COMPARE_LOCATION_FIELD(location);
  
  	return true;
--- 2035,2057 ----
  }
  
  static bool
+ _equalWindowFrameDef(WindowFrameDef *a, WindowFrameDef *b)
+ {
+ 	COMPARE_SCALAR_FIELD(options);
+ 	COMPARE_NODE_FIELD(startOffset);
+ 	COMPARE_NODE_FIELD(endOffset);
+ 
+ 	return true;
+ }
+ 
+ static bool
  _equalWindowDef(WindowDef *a, WindowDef *b)
  {
  	COMPARE_STRING_FIELD(name);
  	COMPARE_STRING_FIELD(refname);
  	COMPARE_NODE_FIELD(partitionClause);
  	COMPARE_NODE_FIELD(orderClause);
! 	COMPARE_NODE_FIELD(frame);
  	COMPARE_LOCATION_FIELD(location);
  
  	return true;
***************
*** 2186,2192 **** _equalWindowClause(WindowClause *a, WindowClause *b)
  	COMPARE_STRING_FIELD(refname);
  	COMPARE_NODE_FIELD(partitionClause);
  	COMPARE_NODE_FIELD(orderClause);
! 	COMPARE_SCALAR_FIELD(frameOptions);
  	COMPARE_SCALAR_FIELD(winref);
  	COMPARE_SCALAR_FIELD(copiedOrder);
  
--- 2196,2202 ----
  	COMPARE_STRING_FIELD(refname);
  	COMPARE_NODE_FIELD(partitionClause);
  	COMPARE_NODE_FIELD(orderClause);
! 	COMPARE_NODE_FIELD(frame);
  	COMPARE_SCALAR_FIELD(winref);
  	COMPARE_SCALAR_FIELD(copiedOrder);
  
***************
*** 2847,2852 **** equal(void *a, void *b)
--- 2857,2865 ----
  		case T_SortBy:
  			retval = _equalSortBy(a, b);
  			break;
+ 		case T_WindowFrameDef:
+ 			retval = _equalWindowFrameDef(a, b);
+ 			break;
  		case T_WindowDef:
  			retval = _equalWindowDef(a, b);
  			break;
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 610,615 **** _outWindowAgg(StringInfo str, WindowAgg *node)
--- 610,617 ----
  		appendStringInfo(str, " %u", node->ordOperators[i]);
  
  	WRITE_INT_FIELD(frameOptions);
+ 	WRITE_NODE_FIELD(startOffset);
+ 	WRITE_NODE_FIELD(endOffset);
  }
  
  static void
***************
*** 2024,2030 **** _outWindowClause(StringInfo str, WindowClause *node)
  	WRITE_STRING_FIELD(refname);
  	WRITE_NODE_FIELD(partitionClause);
  	WRITE_NODE_FIELD(orderClause);
! 	WRITE_INT_FIELD(frameOptions);
  	WRITE_UINT_FIELD(winref);
  	WRITE_BOOL_FIELD(copiedOrder);
  }
--- 2026,2032 ----
  	WRITE_STRING_FIELD(refname);
  	WRITE_NODE_FIELD(partitionClause);
  	WRITE_NODE_FIELD(orderClause);
! 	WRITE_NODE_FIELD(frame);
  	WRITE_UINT_FIELD(winref);
  	WRITE_BOOL_FIELD(copiedOrder);
  }
***************
*** 2307,2312 **** _outSortBy(StringInfo str, SortBy *node)
--- 2309,2324 ----
  }
  
  static void
+ _outWindowFrameDef(StringInfo str, WindowFrameDef *node)
+ {
+ 	WRITE_NODE_TYPE("WINDOWFRAMEDEF");
+ 
+ 	WRITE_INT_FIELD(options);
+ 	WRITE_NODE_FIELD(startOffset);
+ 	WRITE_NODE_FIELD(endOffset);
+ }
+ 
+ static void
  _outWindowDef(StringInfo str, WindowDef *node)
  {
  	WRITE_NODE_TYPE("WINDOWDEF");
***************
*** 2315,2321 **** _outWindowDef(StringInfo str, WindowDef *node)
  	WRITE_STRING_FIELD(refname);
  	WRITE_NODE_FIELD(partitionClause);
  	WRITE_NODE_FIELD(orderClause);
! 	WRITE_INT_FIELD(frameOptions);
  	WRITE_LOCATION_FIELD(location);
  }
  
--- 2327,2333 ----
  	WRITE_STRING_FIELD(refname);
  	WRITE_NODE_FIELD(partitionClause);
  	WRITE_NODE_FIELD(orderClause);
! 	WRITE_NODE_FIELD(frame);
  	WRITE_LOCATION_FIELD(location);
  }
  
***************
*** 2848,2853 **** _outNode(StringInfo str, void *obj)
--- 2860,2868 ----
  			case T_SortBy:
  				_outSortBy(str, obj);
  				break;
+ 			case T_WindowFrameDef:
+ 				_outWindowFrameDef(str, obj);
+ 				break;
  			case T_WindowDef:
  				_outWindowDef(str, obj);
  				break;
*** a/src/backend/nodes/readfuncs.c
--- b/src/backend/nodes/readfuncs.c
***************
*** 278,284 **** _readWindowClause(void)
  	READ_STRING_FIELD(refname);
  	READ_NODE_FIELD(partitionClause);
  	READ_NODE_FIELD(orderClause);
! 	READ_INT_FIELD(frameOptions);
  	READ_UINT_FIELD(winref);
  	READ_BOOL_FIELD(copiedOrder);
  
--- 278,284 ----
  	READ_STRING_FIELD(refname);
  	READ_NODE_FIELD(partitionClause);
  	READ_NODE_FIELD(orderClause);
! 	READ_NODE_FIELD(frame);
  	READ_UINT_FIELD(winref);
  	READ_BOOL_FIELD(copiedOrder);
  
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***************
*** 3360,3366 **** make_windowagg(PlannerInfo *root, List *tlist,
  			   int numWindowFuncs, Index winref,
  			   int partNumCols, AttrNumber *partColIdx, Oid *partOperators,
  			   int ordNumCols, AttrNumber *ordColIdx, Oid *ordOperators,
! 			   int frameOptions, Plan *lefttree)
  {
  	WindowAgg  *node = makeNode(WindowAgg);
  	Plan	   *plan = &node->plan;
--- 3360,3367 ----
  			   int numWindowFuncs, Index winref,
  			   int partNumCols, AttrNumber *partColIdx, Oid *partOperators,
  			   int ordNumCols, AttrNumber *ordColIdx, Oid *ordOperators,
! 			   int frameOptions, Node *startOffset, Node *endOffset,
! 			   Plan *lefttree)
  {
  	WindowAgg  *node = makeNode(WindowAgg);
  	Plan	   *plan = &node->plan;
***************
*** 3375,3380 **** make_windowagg(PlannerInfo *root, List *tlist,
--- 3376,3383 ----
  	node->ordColIdx = ordColIdx;
  	node->ordOperators = ordOperators;
  	node->frameOptions = frameOptions;
+ 	node->startOffset = startOffset;
+ 	node->endOffset = endOffset;
  
  	copy_plan_costsize(plan, lefttree); /* only care about copying size */
  	cost_windowagg(&windowagg_path, root,
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
***************
*** 1484,1490 **** grouping_planner(PlannerInfo *root, double tuple_fraction)
  				result_plan = (Plan *)
  					make_windowagg(root,
  								   (List *) copyObject(window_tlist),
! 							   list_length(wflists->windowFuncs[wc->winref]),
  								   wc->winref,
  								   partNumCols,
  								   partColIdx,
--- 1484,1490 ----
  				result_plan = (Plan *)
  					make_windowagg(root,
  								   (List *) copyObject(window_tlist),
! 								   list_length(wflists->windowFuncs[wc->winref]),
  								   wc->winref,
  								   partNumCols,
  								   partColIdx,
***************
*** 1492,1498 **** grouping_planner(PlannerInfo *root, double tuple_fraction)
  								   ordNumCols,
  								   ordColIdx,
  								   ordOperators,
! 								   wc->frameOptions,
  								   result_plan);
  			}
  		}
--- 1492,1500 ----
  								   ordNumCols,
  								   ordColIdx,
  								   ordOperators,
! 								   wc->frame->options,
! 								   wc->frame->startOffset,
! 								   wc->frame->endOffset,
  								   result_plan);
  			}
  		}
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 428,434 **** static TypeName *TableFuncTypeName(List *columns);
  %type <list>	window_clause window_definition_list opt_partition_clause
  %type <windef>	window_definition over_clause window_specification
  %type <str>		opt_existing_window_name
! %type <ival>	opt_frame_clause frame_extent frame_bound
  
  
  /*
--- 428,434 ----
  %type <list>	window_clause window_definition_list opt_partition_clause
  %type <windef>	window_definition over_clause window_specification
  %type <str>		opt_existing_window_name
! %type <node>	opt_frame_clause frame_extent frame_bound
  
  
  /*
***************
*** 9687,9697 **** over_clause: OVER window_specification
  			| OVER ColId
  				{
  					WindowDef *n = makeNode(WindowDef);
  					n->name = $2;
  					n->refname = NULL;
  					n->partitionClause = NIL;
  					n->orderClause = NIL;
! 					n->frameOptions = FRAMEOPTION_DEFAULTS;
  					n->location = @2;
  					$$ = n;
  				}
--- 9687,9701 ----
  			| OVER ColId
  				{
  					WindowDef *n = makeNode(WindowDef);
+ 					WindowFrameDef *frame = makeNode(WindowFrameDef);
+ 					frame->options = FRAMEOPTION_DEFAULTS;
+ 					frame->startOffset = NULL;
+ 					frame->endOffset = NULL;
  					n->name = $2;
  					n->refname = NULL;
  					n->partitionClause = NIL;
  					n->orderClause = NIL;
! 					n->frame = frame;
  					n->location = @2;
  					$$ = n;
  				}
***************
*** 9707,9713 **** window_specification: '(' opt_existing_window_name opt_partition_clause
  					n->refname = $2;
  					n->partitionClause = $3;
  					n->orderClause = $4;
! 					n->frameOptions = $5;
  					n->location = @1;
  					$$ = n;
  				}
--- 9711,9717 ----
  					n->refname = $2;
  					n->partitionClause = $3;
  					n->orderClause = $4;
! 					n->frame = (WindowFrameDef *) $5;
  					n->location = @1;
  					$$ = n;
  				}
***************
*** 9739,9789 **** opt_partition_clause: PARTITION BY expr_list		{ $$ = $3; }
  opt_frame_clause:
  			RANGE frame_extent
  				{
! 					$$ = FRAMEOPTION_NONDEFAULT | FRAMEOPTION_RANGE | $2;
  				}
  			| ROWS frame_extent
  				{
! 					$$ = FRAMEOPTION_NONDEFAULT | FRAMEOPTION_ROWS | $2;
  				}
  			| /*EMPTY*/
! 				{ $$ = FRAMEOPTION_DEFAULTS; }
  		;
  
  frame_extent: frame_bound
  				{
  					/* reject invalid cases */
! 					if ($1 & FRAMEOPTION_START_UNBOUNDED_FOLLOWING)
  						ereport(ERROR,
  								(errcode(ERRCODE_WINDOWING_ERROR),
  								 errmsg("frame start cannot be UNBOUNDED FOLLOWING"),
  								 parser_errposition(@1)));
! 					if ($1 & FRAMEOPTION_START_CURRENT_ROW)
! 						ereport(ERROR,
! 								(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
! 								 errmsg("frame start at CURRENT ROW is not implemented"),
! 								 parser_errposition(@1)));
! 					$$ = $1 | FRAMEOPTION_END_CURRENT_ROW;
  				}
  			| BETWEEN frame_bound AND frame_bound
  				{
  					/* reject invalid cases */
! 					if ($2 & FRAMEOPTION_START_UNBOUNDED_FOLLOWING)
  						ereport(ERROR,
  								(errcode(ERRCODE_WINDOWING_ERROR),
  								 errmsg("frame start cannot be UNBOUNDED FOLLOWING"),
  								 parser_errposition(@2)));
! 					if ($2 & FRAMEOPTION_START_CURRENT_ROW)
! 						ereport(ERROR,
! 								(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
! 								 errmsg("frame start at CURRENT ROW is not implemented"),
! 								 parser_errposition(@2)));
! 					if ($4 & FRAMEOPTION_START_UNBOUNDED_PRECEDING)
  						ereport(ERROR,
  								(errcode(ERRCODE_WINDOWING_ERROR),
  								 errmsg("frame end cannot be UNBOUNDED PRECEDING"),
  								 parser_errposition(@4)));
  					/* shift converts START_ options to END_ options */
! 					$$ = FRAMEOPTION_BETWEEN | $2 | ($4 << 1);
  				}
  		;
  
--- 9743,9826 ----
  opt_frame_clause:
  			RANGE frame_extent
  				{
! 					WindowFrameDef *frame = (WindowFrameDef *) $2;
! 					frame->options |= FRAMEOPTION_NONDEFAULT | FRAMEOPTION_RANGE;
! 					frame->location = @1;
! 					$$ = (Node *) frame;
  				}
  			| ROWS frame_extent
  				{
! 					WindowFrameDef *frame = (WindowFrameDef *) $2;
! 					frame->options |= FRAMEOPTION_NONDEFAULT | FRAMEOPTION_ROWS;
! 					frame->location = @1;
! 					$$ = (Node *) frame;
  				}
  			| /*EMPTY*/
! 				{
! 					WindowFrameDef *frame = makeNode(WindowFrameDef);
! 					frame->options = FRAMEOPTION_DEFAULTS;
! 					frame->startOffset = NULL;
! 					frame->endOffset = NULL;
! 					frame->location = -1;
! 					$$ = (Node *) frame;
! 				}
  		;
  
  frame_extent: frame_bound
  				{
+ 					WindowFrameDef *frame = (WindowFrameDef *) $1;
  					/* reject invalid cases */
! 					if (frame->options & FRAMEOPTION_START_UNBOUNDED_FOLLOWING)
  						ereport(ERROR,
  								(errcode(ERRCODE_WINDOWING_ERROR),
  								 errmsg("frame start cannot be UNBOUNDED FOLLOWING"),
  								 parser_errposition(@1)));
! 					frame->options |= FRAMEOPTION_END_CURRENT_ROW;
! 					$$ = (Node *) frame;
  				}
  			| BETWEEN frame_bound AND frame_bound
  				{
+ 					WindowFrameDef *frame1 = (WindowFrameDef *) $2;
+ 					WindowFrameDef *frame2 = (WindowFrameDef *) $4;
  					/* reject invalid cases */
! 					if (frame1->options & FRAMEOPTION_START_UNBOUNDED_FOLLOWING)
  						ereport(ERROR,
  								(errcode(ERRCODE_WINDOWING_ERROR),
  								 errmsg("frame start cannot be UNBOUNDED FOLLOWING"),
  								 parser_errposition(@2)));
! 					if (frame2->options & FRAMEOPTION_START_UNBOUNDED_PRECEDING)
  						ereport(ERROR,
  								(errcode(ERRCODE_WINDOWING_ERROR),
  								 errmsg("frame end cannot be UNBOUNDED PRECEDING"),
  								 parser_errposition(@4)));
+ 					if (frame1->options & FRAMEOPTION_START_CURRENT_ROW &&
+ 						frame2->options &
+ 							(FRAMEOPTION_START_UNBOUNDED_PRECEDING |
+ 							 FRAMEOPTION_START_VALUE_PRECEDING))
+ 						ereport(ERROR,
+ 								(errcode(ERRCODE_WINDOWING_ERROR),
+ 								 errmsg("frame from current row cannot have preceding rows"),
+ 								 parser_errposition(@4)));
+ 					if (frame1->options & FRAMEOPTION_START_VALUE_FOLLOWING &&
+ 						frame2->options &
+ 							(FRAMEOPTION_START_VALUE_PRECEDING |
+ 							 FRAMEOPTION_START_CURRENT_ROW))
+ 						ereport(ERROR,
+ 								(errcode(ERRCODE_WINDOWING_ERROR),
+ 								 errmsg("frame from following row cannot have preceding rows"),
+ 								 parser_errposition(@4)));
+ 					if (frame1->options & FRAMEOPTION_START_VALUE_FOLLOWING &&
+ 						frame2->options & FRAMEOPTION_END_VALUE_PRECEDING)
+ 						ereport(ERROR,
+ 								(errcode(ERRCODE_WINDOWING_ERROR),
+ 								 errmsg("frame cannot be BETWEEN x FOLLOWING AND y PRECEDING"),
+ 								 parser_errposition(@1)));
  					/* shift converts START_ options to END_ options */
! 					frame1->options |= frame2->options << 1;
! 					frame1->endOffset = frame2->startOffset;
! 					/* frame2 is not necessary anymore */
! 					pfree(frame2);
! 					$$ = (Node *) frame1;
  				}
  		;
  
***************
*** 9795,9809 **** frame_extent: frame_bound
  frame_bound:
  			UNBOUNDED PRECEDING
  				{
! 					$$ = FRAMEOPTION_START_UNBOUNDED_PRECEDING;
  				}
  			| UNBOUNDED FOLLOWING
  				{
! 					$$ = FRAMEOPTION_START_UNBOUNDED_FOLLOWING;
  				}
  			| CURRENT_P ROW
  				{
! 					$$ = FRAMEOPTION_START_CURRENT_ROW;
  				}
  		;
  
--- 9832,9870 ----
  frame_bound:
  			UNBOUNDED PRECEDING
  				{
! 					WindowFrameDef *frame = makeNode(WindowFrameDef);
! 					frame->options = FRAMEOPTION_START_UNBOUNDED_PRECEDING;
! 					frame->startOffset = NULL;
! 					frame->endOffset = NULL;
! 					$$ = (Node *) frame;
  				}
  			| UNBOUNDED FOLLOWING
  				{
! 					WindowFrameDef *frame = makeNode(WindowFrameDef);
! 					frame->options = FRAMEOPTION_START_UNBOUNDED_FOLLOWING;
! 					frame->startOffset = NULL;
! 					frame->endOffset = NULL;
! 					$$ = (Node *) frame;
  				}
  			| CURRENT_P ROW
  				{
! 					WindowFrameDef *frame = makeNode(WindowFrameDef);
! 					frame->options = FRAMEOPTION_START_CURRENT_ROW;
! 					$$ = (Node *) frame;
! 				}
! 			| Iconst PRECEDING
! 				{
! 					WindowFrameDef *frame = makeNode(WindowFrameDef);
! 					frame->options = FRAMEOPTION_START_VALUE_PRECEDING;
! 					frame->startOffset = makeIntConst($1, @1);
! 					$$ = (Node *) frame;
! 				}
! 			| Iconst FOLLOWING
! 				{
! 					WindowFrameDef *frame = makeNode(WindowFrameDef);
! 					frame->options = FRAMEOPTION_START_VALUE_FOLLOWING;
! 					frame->startOffset = makeIntConst($1, @1);
! 					$$ = (Node *) frame;
  				}
  		;
  
*** a/src/backend/parser/parse_agg.c
--- b/src/backend/parser/parse_agg.c
***************
*** 136,142 **** transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
  		Assert(windef->refname == NULL &&
  			   windef->partitionClause == NIL &&
  			   windef->orderClause == NIL &&
! 			   windef->frameOptions == FRAMEOPTION_DEFAULTS);
  
  		foreach(lc, pstate->p_windowdefs)
  		{
--- 136,142 ----
  		Assert(windef->refname == NULL &&
  			   windef->partitionClause == NIL &&
  			   windef->orderClause == NIL &&
! 			   windef->frame->options == FRAMEOPTION_DEFAULTS);
  
  		foreach(lc, pstate->p_windowdefs)
  		{
***************
*** 174,180 **** transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
  				continue;
  			if (equal(refwin->partitionClause, windef->partitionClause) &&
  				equal(refwin->orderClause, windef->orderClause) &&
! 				refwin->frameOptions == windef->frameOptions)
  			{
  				/* found a duplicate window specification */
  				wfunc->winref = winref;
--- 174,180 ----
  				continue;
  			if (equal(refwin->partitionClause, windef->partitionClause) &&
  				equal(refwin->orderClause, windef->orderClause) &&
! 				equal(refwin->frame, windef->frame))
  			{
  				/* found a duplicate window specification */
  				wfunc->winref = winref;
*** a/src/backend/parser/parse_clause.c
--- b/src/backend/parser/parse_clause.c
***************
*** 1657,1669 **** transformWindowDefinitions(ParseState *pstate,
  			wc->orderClause = orderClause;
  			wc->copiedOrder = false;
  		}
! 		if (refwc && refwc->frameOptions != FRAMEOPTION_DEFAULTS)
  			ereport(ERROR,
  					(errcode(ERRCODE_WINDOWING_ERROR),
  					 errmsg("cannot override frame clause of window \"%s\"",
  							windef->refname),
  					 parser_errposition(pstate, windef->location)));
! 		wc->frameOptions = windef->frameOptions;
  		wc->winref = winref;
  
  		result = lappend(result, wc);
--- 1657,1669 ----
  			wc->orderClause = orderClause;
  			wc->copiedOrder = false;
  		}
! 		if (refwc && refwc->frame->options != FRAMEOPTION_DEFAULTS)
  			ereport(ERROR,
  					(errcode(ERRCODE_WINDOWING_ERROR),
  					 errmsg("cannot override frame clause of window \"%s\"",
  							windef->refname),
  					 parser_errposition(pstate, windef->location)));
! 		wc->frame = copyObject(windef->frame);
  		wc->winref = winref;
  
  		result = lappend(result, wc);
*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
***************
*** 2985,2990 **** get_rule_windowspec(WindowClause *wc, List *targetList,
--- 2985,2991 ----
  {
  	StringInfo	buf = context->buf;
  	bool		needspace = false;
+ 	WindowFrameDef *frame = wc->frame;
  	const char *sep;
  	ListCell   *l;
  
***************
*** 3022,3051 **** get_rule_windowspec(WindowClause *wc, List *targetList,
  		needspace = true;
  	}
  	/* framing clause is never inherited, so print unless it's default */
! 	if (wc->frameOptions & FRAMEOPTION_NONDEFAULT)
  	{
  		if (needspace)
  			appendStringInfoChar(buf, ' ');
! 		if (wc->frameOptions & FRAMEOPTION_RANGE)
  			appendStringInfoString(buf, "RANGE ");
! 		else if (wc->frameOptions & FRAMEOPTION_ROWS)
  			appendStringInfoString(buf, "ROWS ");
  		else
  			Assert(false);
! 		if (wc->frameOptions & FRAMEOPTION_BETWEEN)
  			appendStringInfoString(buf, "BETWEEN ");
! 		if (wc->frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING)
  			appendStringInfoString(buf, "UNBOUNDED PRECEDING ");
! 		else if (wc->frameOptions & FRAMEOPTION_START_CURRENT_ROW)
  			appendStringInfoString(buf, "CURRENT ROW ");
  		else
  			Assert(false);
! 		if (wc->frameOptions & FRAMEOPTION_BETWEEN)
  		{
  			appendStringInfoString(buf, "AND ");
! 			if (wc->frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING)
  				appendStringInfoString(buf, "UNBOUNDED FOLLOWING ");
! 			else if (wc->frameOptions & FRAMEOPTION_END_CURRENT_ROW)
  				appendStringInfoString(buf, "CURRENT ROW ");
  			else
  				Assert(false);
--- 3023,3052 ----
  		needspace = true;
  	}
  	/* framing clause is never inherited, so print unless it's default */
! 	if (frame->options & FRAMEOPTION_NONDEFAULT)
  	{
  		if (needspace)
  			appendStringInfoChar(buf, ' ');
! 		if (frame->options & FRAMEOPTION_RANGE)
  			appendStringInfoString(buf, "RANGE ");
! 		else if (frame->options & FRAMEOPTION_ROWS)
  			appendStringInfoString(buf, "ROWS ");
  		else
  			Assert(false);
! 		if (frame->options & FRAMEOPTION_BETWEEN)
  			appendStringInfoString(buf, "BETWEEN ");
! 		if (frame->options & FRAMEOPTION_START_UNBOUNDED_PRECEDING)
  			appendStringInfoString(buf, "UNBOUNDED PRECEDING ");
! 		else if (frame->options & FRAMEOPTION_START_CURRENT_ROW)
  			appendStringInfoString(buf, "CURRENT ROW ");
  		else
  			Assert(false);
! 		if (frame->options & FRAMEOPTION_BETWEEN)
  		{
  			appendStringInfoString(buf, "AND ");
! 			if (frame->options & FRAMEOPTION_END_UNBOUNDED_FOLLOWING)
  				appendStringInfoString(buf, "UNBOUNDED FOLLOWING ");
! 			else if (frame->options & FRAMEOPTION_END_CURRENT_ROW)
  				appendStringInfoString(buf, "CURRENT ROW ");
  			else
  				Assert(false);
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 14,19 ****
--- 14,20 ----
  #ifndef EXECNODES_H
  #define EXECNODES_H
  
+ #include "windowapi.h"
  #include "access/genam.h"
  #include "access/heapam.h"
  #include "access/skey.h"
***************
*** 1585,1594 **** typedef struct WindowAggState
  	FmgrInfo   *ordEqfunctions; /* equality funcs for ordering columns */
  	Tuplestorestate *buffer;	/* stores rows of current partition */
  	int			current_ptr;	/* read pointer # for current */
- 	int			agg_ptr;		/* read pointer # for aggregates */
  	int64		spooled_rows;	/* total # of rows in buffer */
  	int64		currentpos;		/* position of current row in partition */
  	int64		frametailpos;	/* current frame tail position */
  	int64		aggregatedupto; /* rows before this one are aggregated */
  
  	MemoryContext wincontext;	/* context for partition-lifespan data */
--- 1586,1597 ----
  	FmgrInfo   *ordEqfunctions; /* equality funcs for ordering columns */
  	Tuplestorestate *buffer;	/* stores rows of current partition */
  	int			current_ptr;	/* read pointer # for current */
  	int64		spooled_rows;	/* total # of rows in buffer */
  	int64		currentpos;		/* position of current row in partition */
+ 	int64		frameheadpos;	/* current frame head position */
  	int64		frametailpos;	/* current frame tail position */
+ 	WindowObject agg_winobj;
+ 	int64		aggregatedbase; /* row position this aggregate started on */
  	int64		aggregatedupto; /* rows before this one are aggregated */
  
  	MemoryContext wincontext;	/* context for partition-lifespan data */
***************
*** 1600,1607 **** typedef struct WindowAggState
--- 1603,1613 ----
  										 * tuplestore */
  	bool		more_partitions;/* true if there's more partitions after this
  								 * one */
+ 	bool		framehead_valid;/* true if frameheadpos is known up to date
+ 								 * for current row */
  	bool		frametail_valid;/* true if frametailpos is known up to date
  								 * for current row */
+ 	bool		frametail_stable;	/* frame's tail doesn't move */
  
  	TupleTableSlot *first_part_slot;	/* first tuple of current or next
  										 * partition */
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
***************
*** 362,367 **** typedef enum NodeTag
--- 362,368 ----
  	T_ResTarget,
  	T_TypeCast,
  	T_SortBy,
+ 	T_WindowFrameDef,
  	T_WindowDef,
  	T_RangeSubselect,
  	T_RangeFunction,
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 376,381 **** typedef struct SortBy
--- 376,393 ----
  } SortBy;
  
  /*
+  * WindowFrameDef - for FRAME clause description in WINDOW definition
+  */
+ typedef struct WindowFrameDef
+ {
+ 	NodeTag		type;
+ 	int			options;		/* frame_clause options, see below */
+ 	Node	   *startOffset;	/* how far is the starting bound */
+ 	Node	   *endOffset;		/* how far is the ending bound */
+ 	int			location;		/* parse location, or -1 if none/unknown */
+ } WindowFrameDef;
+ 
+ /*
   * WindowDef - raw representation of WINDOW and OVER clauses
   *
   * For entries in a WINDOW list, "name" is the window name being defined.
***************
*** 390,396 **** typedef struct WindowDef
  	char	   *refname;		/* referenced window name, if any */
  	List	   *partitionClause;	/* PARTITION BY expression list */
  	List	   *orderClause;	/* ORDER BY (list of SortBy) */
! 	int			frameOptions;	/* frame_clause options, see below */
  	int			location;		/* parse location, or -1 if none/unknown */
  } WindowDef;
  
--- 402,408 ----
  	char	   *refname;		/* referenced window name, if any */
  	List	   *partitionClause;	/* PARTITION BY expression list */
  	List	   *orderClause;	/* ORDER BY (list of SortBy) */
! 	WindowFrameDef *frame;		/* FRAME clause including boundary values */
  	int			location;		/* parse location, or -1 if none/unknown */
  } WindowDef;
  
***************
*** 412,417 **** typedef struct WindowDef
--- 424,433 ----
  #define FRAMEOPTION_END_UNBOUNDED_FOLLOWING		0x00080 /* end is U. F. */
  #define FRAMEOPTION_START_CURRENT_ROW			0x00100 /* start is C. R. */
  #define FRAMEOPTION_END_CURRENT_ROW				0x00200 /* end is C. R. */
+ #define FRAMEOPTION_START_VALUE_PRECEDING		0x01000 /* start is V. P. */
+ #define FRAMEOPTION_END_VALUE_PRECEDING			0x02000 /* end is V.P. */
+ #define FRAMEOPTION_START_VALUE_FOLLOWING		0x04000 /* start is V. F. */
+ #define FRAMEOPTION_END_VALUE_FOLLOWING			0x08000 /* end is V. F. */
  
  #define FRAMEOPTION_DEFAULTS \
  	(FRAMEOPTION_RANGE | FRAMEOPTION_START_UNBOUNDED_PRECEDING | \
***************
*** 794,800 **** typedef struct WindowClause
  	char	   *refname;		/* referenced window name, if any */
  	List	   *partitionClause;	/* PARTITION BY list */
  	List	   *orderClause;	/* ORDER BY list */
! 	int			frameOptions;	/* frame_clause options, see WindowDef */
  	Index		winref;			/* ID referenced by window functions */
  	bool		copiedOrder;	/* did we copy orderClause from refname? */
  } WindowClause;
--- 810,816 ----
  	char	   *refname;		/* referenced window name, if any */
  	List	   *partitionClause;	/* PARTITION BY list */
  	List	   *orderClause;	/* ORDER BY list */
! 	WindowFrameDef *frame;		/* frame_clause options and boundaries */
  	Index		winref;			/* ID referenced by window functions */
  	bool		copiedOrder;	/* did we copy orderClause from refname? */
  } WindowClause;
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
***************
*** 552,557 **** typedef struct WindowAgg
--- 552,559 ----
  	AttrNumber *ordColIdx;		/* their indexes in the target list */
  	Oid		   *ordOperators;	/* equality operators for ordering columns */
  	int			frameOptions;	/* frame_clause options, see WindowDef */
+ 	Node	   *startOffset;	/* frame start boundary */
+ 	Node	   *endOffset;		/* frame end boundary */
  } WindowAgg;
  
  /* ----------------
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
***************
*** 61,67 **** extern WindowAgg *make_windowagg(PlannerInfo *root, List *tlist,
  			   int numWindowFuncs, Index winref,
  			   int partNumCols, AttrNumber *partColIdx, Oid *partOperators,
  			   int ordNumCols, AttrNumber *ordColIdx, Oid *ordOperators,
! 			   int frameOptions, Plan *lefttree);
  extern Group *make_group(PlannerInfo *root, List *tlist, List *qual,
  		   int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators,
  		   double numGroups,
--- 61,68 ----
  			   int numWindowFuncs, Index winref,
  			   int partNumCols, AttrNumber *partColIdx, Oid *partOperators,
  			   int ordNumCols, AttrNumber *ordColIdx, Oid *ordOperators,
! 			   int frameOptions, Node *startOffset, Node *endOffset,
! 			   Plan *lefttree);
  extern Group *make_group(PlannerInfo *root, List *tlist, List *qual,
  		   int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators,
  		   double numGroups,
*** a/src/test/regress/expected/window.out
--- b/src/test/regress/expected/window.out
***************
*** 728,733 **** FROM (select distinct ten, four from tenk1) ss;
--- 728,871 ----
      3 |   2 |   4 |          2
  (20 rows)
  
+ SELECT sum(unique1) over (order by four range between current row and unbounded following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+  sum | unique1 | four 
+ -----+---------+------
+   45 |       0 |    0
+   45 |       8 |    0
+   45 |       4 |    0
+   33 |       5 |    1
+   33 |       9 |    1
+   33 |       1 |    1
+   18 |       6 |    2
+   18 |       2 |    2
+   10 |       3 |    3
+   10 |       7 |    3
+ (10 rows)
+ 
+ SELECT sum(unique1) over (rows between current row and unbounded following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+  sum | unique1 | four 
+ -----+---------+------
+   45 |       4 |    0
+   41 |       2 |    2
+   39 |       1 |    1
+   38 |       6 |    2
+   32 |       9 |    1
+   23 |       8 |    0
+   15 |       5 |    1
+   10 |       3 |    3
+    7 |       7 |    3
+    0 |       0 |    0
+ (10 rows)
+ 
+ SELECT sum(unique1) over (rows between 2 preceding and 2 following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+  sum | unique1 | four 
+ -----+---------+------
+    7 |       4 |    0
+   13 |       2 |    2
+   22 |       1 |    1
+   26 |       6 |    2
+   29 |       9 |    1
+   31 |       8 |    0
+   32 |       5 |    1
+   23 |       3 |    3
+   15 |       7 |    3
+   10 |       0 |    0
+ (10 rows)
+ 
+ SELECT sum(unique1) over (rows between 2 preceding and 1 preceding),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+  sum | unique1 | four 
+ -----+---------+------
+      |       4 |    0
+    4 |       2 |    2
+    6 |       1 |    1
+    3 |       6 |    2
+    7 |       9 |    1
+   15 |       8 |    0
+   17 |       5 |    1
+   13 |       3 |    3
+    8 |       7 |    3
+   10 |       0 |    0
+ (10 rows)
+ 
+ SELECT sum(unique1) over (rows between 1 following and 3 following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+  sum | unique1 | four 
+ -----+---------+------
+    9 |       4 |    0
+   16 |       2 |    2
+   23 |       1 |    1
+   22 |       6 |    2
+   16 |       9 |    1
+   15 |       8 |    0
+   10 |       5 |    1
+    7 |       3 |    3
+    0 |       7 |    3
+    0 |       0 |    0
+ (10 rows)
+ 
+ SELECT sum(unique1) over (rows between unbounded preceding and 1 following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+  sum | unique1 | four 
+ -----+---------+------
+    6 |       4 |    0
+    7 |       2 |    2
+   13 |       1 |    1
+   22 |       6 |    2
+   30 |       9 |    1
+   35 |       8 |    0
+   38 |       5 |    1
+   45 |       3 |    3
+   45 |       7 |    3
+   45 |       0 |    0
+ (10 rows)
+ 
+ SELECT sum(unique1) over (w range between current row and unbounded following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10 WINDOW w AS (order by four);
+  sum | unique1 | four 
+ -----+---------+------
+   45 |       0 |    0
+   45 |       8 |    0
+   45 |       4 |    0
+   33 |       5 |    1
+   33 |       9 |    1
+   33 |       1 |    1
+   18 |       6 |    2
+   18 |       2 |    2
+   10 |       3 |    3
+   10 |       7 |    3
+ (10 rows)
+ 
+ SELECT first_value(unique1) over w,
+ 	nth_value(unique1, 2) over w AS nth_2,
+ 	last_value(unique1) over w, unique1, four
+ FROM tenk1 WHERE unique1 < 10
+ WINDOW w AS (order by four range between current row and unbounded following);
+  first_value | nth_2 | last_value | unique1 | four 
+ -------------+-------+------------+---------+------
+            0 |     8 |          7 |       0 |    0
+            0 |     8 |          7 |       8 |    0
+            0 |     8 |          7 |       4 |    0
+            5 |     9 |          7 |       5 |    1
+            5 |     9 |          7 |       9 |    1
+            5 |     9 |          7 |       1 |    1
+            6 |     2 |          7 |       6 |    2
+            6 |     2 |          7 |       2 |    2
+            3 |     7 |          7 |       3 |    3
+            3 |     7 |          7 |       7 |    3
+ (10 rows)
+ 
  -- with UNION
  SELECT count(*) OVER (PARTITION BY four) FROM (SELECT * FROM tenk1 UNION ALL SELECT * FROM tenk2)s LIMIT 0;
   count 
*** a/src/test/regress/sql/window.sql
--- b/src/test/regress/sql/window.sql
***************
*** 161,166 **** SELECT four, ten/4 as two,
--- 161,200 ----
  	last_value(ten/4) over (partition by four order by ten/4 rows between unbounded preceding and current row)
  FROM (select distinct ten, four from tenk1) ss;
  
+ SELECT sum(unique1) over (order by four range between current row and unbounded following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+ 
+ SELECT sum(unique1) over (rows between current row and unbounded following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+ 
+ SELECT sum(unique1) over (rows between 2 preceding and 2 following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+ 
+ SELECT sum(unique1) over (rows between 2 preceding and 1 preceding),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+ 
+ SELECT sum(unique1) over (rows between 1 following and 3 following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+ 
+ SELECT sum(unique1) over (rows between unbounded preceding and 1 following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+ 
+ SELECT sum(unique1) over (w range between current row and unbounded following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10 WINDOW w AS (order by four);
+ 
+ SELECT first_value(unique1) over w,
+ 	nth_value(unique1, 2) over w AS nth_2,
+ 	last_value(unique1) over w, unique1, four
+ FROM tenk1 WHERE unique1 < 10
+ WINDOW w AS (order by four range between current row and unbounded following);
+ 
  -- with UNION
  SELECT count(*) OVER (PARTITION BY four) FROM (SELECT * FROM tenk1 UNION ALL SELECT * FROM tenk2)s LIMIT 0;
  
#2Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Hitoshi Harada (#1)
Re: add more frame types in window functions (ROWS)

Hi, I've started reviewing your patch.

I've already found some things that need work:

- missing _readWindowFrameDef function (all nodes that are output
from parse analysis must have both _read and _out functions,
otherwise views can't work)

- the A_Const nodes should probably be transformed to Const nodes in
parse analysis, since A_Const has no _read/_out functions, which
means changing the corresponding code in the executor.

(A question here: the spec allows (by my reading) the use of
parameters in the window frame clause, i.e. BETWEEN $1 PRECEDING AND
$2 FOLLOWING. Wouldn't it therefore make more sense to treat the
values as Exprs, albeit very limited ones, and eval them at startup
rather than assuming we know the node type and digging down into it
all over the place?)

You can see the problem here if you do this:

create view v as
select i,sum(i) over (order by i rows between 1 preceding and 1 following)
from generate_series(1,10) i;
select * from v;

(A regression test for this would probably be good, which reminds me that
I need to add one of those myself in the aggregate order by stuff.)

Also, you're going to need to do some work in ruleutils.c
(get_rule_windowspec) in order to be able to deparse your new frame
definitions.

I'll continue reviewing the stuff that does work, so I'm not marking this
as "waiting for author" yet.

--
Andrew (irc:RhodiumToad)

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#2)
Re: add more frame types in window functions (ROWS)

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

(A question here: the spec allows (by my reading) the use of
parameters in the window frame clause, i.e. BETWEEN $1 PRECEDING AND
$2 FOLLOWING. Wouldn't it therefore make more sense to treat the
values as Exprs, albeit very limited ones, and eval them at startup
rather than assuming we know the node type and digging down into it
all over the place?)

Seems like you might as well allow any expression not containing
local Vars. Compare the handling of LIMIT.

regards, tom lane

#4Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Tom Lane (#3)
Re: add more frame types in window functions (ROWS)

2009/11/15 Tom Lane <tgl@sss.pgh.pa.us>:

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

(A question here: the spec allows (by my reading) the use of
parameters in the window frame clause, i.e. BETWEEN $1 PRECEDING AND
$2 FOLLOWING.  Wouldn't it therefore make more sense to treat the
values as Exprs, albeit very limited ones, and eval them at startup
rather than assuming we know the node type and digging down into it
all over the place?)

Seems like you might as well allow any expression not containing
local Vars.  Compare the handling of LIMIT.

Hmm, I've read it wrong, was assuming a constant for <unsigned value
specification> which actually includes any expression. But it's a
fixed value during execution, right? Otherwise, we cannot predicate
frame boundary.

Regards,

--
Hitoshi Harada

#5Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Andrew Gierth (#2)
Re: add more frame types in window functions (ROWS)

Thanks for your review.

2009/11/15 Andrew Gierth <andrew@tao11.riddles.org.uk>:

Hi, I've started reviewing your patch.

I've already found some things that need work:

 - missing _readWindowFrameDef function (all nodes that are output
  from parse analysis must have both _read and _out functions,
  otherwise views can't work)

I added _outWindowFramedef() but seem to forget _read one. Will add it.

 - the A_Const nodes should probably be transformed to Const nodes in
  parse analysis, since A_Const has no _read/_out functions, which
  means changing the corresponding code in the executor.

Thanks for this comment. I hadn't determined which node should be used
as a value node passed to executor. Including Tom's comment, I must
consider which should be again.

Regards,

--
Hitoshi Harada

#6Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Hitoshi Harada (#4)
Re: add more frame types in window functions (ROWS)

"Hitoshi" == Hitoshi Harada <umi.tanuki@gmail.com> writes:

(A question here: the spec allows (by my reading) the use of
parameters in the window frame clause, i.e. BETWEEN $1 PRECEDING
AND $2 FOLLOWING.  Wouldn't it therefore make more sense to treat
the values as Exprs, albeit very limited ones, and eval them at
startup rather than assuming we know the node type and digging
down into it all over the place?)

Seems like you might as well allow any expression not containing
local Vars.  Compare the handling of LIMIT.

Hitoshi> Hmm, I've read it wrong, was assuming a constant for <unsigned value
Hitoshi> specification> which actually includes any expression. But it's a
Hitoshi> fixed value during execution, right? Otherwise, we cannot predicate
Hitoshi> frame boundary.

The spec doesn't allow arbitrary expressions there, only literals and
parameters. Allowing more than that would be going beyond the spec, but
as with LIMIT, could be useful nonetheless.

For it to be a fixed value during execution, the same rules apply as
for LIMIT; it can't contain Vars of the current query level.

My thinking is that the executor definitely shouldn't be relying on it
being a specific node type, but should just ExecEvalExpr it on the
first call and store the result; then you don't have to know whether
it's a Const or Param node or a more complex expression.

--
Andrew.

#7Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Andrew Gierth (#6)
Re: add more frame types in window functions (ROWS)

2009/11/15 Andrew Gierth <andrew@tao11.riddles.org.uk>:

My thinking is that the executor definitely shouldn't be relying on it
being a specific node type, but should just ExecEvalExpr it on the
first call and store the result; then you don't have to know whether
it's a Const or Param node or a more complex expression.

Yeah, so that we allow something like ROWS BETWEEN 1 + $1 PRECEDING
AND ... And to support RANGE BETWEEN n PRECEDING ..., we should make
datum to add or to subtract from current value on initial call anyway.

Regards,

--
Hitoshi Harada

#8Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Andrew Gierth (#2)
Re: add more frame types in window functions (ROWS)

Here's the rest of the review, as far as I've taken it given the
problems with the code.

The patch applied cleanly and includes regression tests but not docs.

Small nitpicks: there are some comments not updated (e.g. the
big one at the start of eval_windowaggregates). A couple of lines are
commented-out using C++ comments.

The overall approach seems ok, and the parser stuff seems fine to me.

These are the issues I've found that make it not committable in its
present form (including the ones I mentioned before):

- missing _readWindowFrameDef function (all nodes that are output
from parse analysis must have both _read and _out functions,
otherwise views can't work)

- the A_Const nodes should probably be transformed to Const nodes in
parse analysis, since A_Const has no _read/_out functions, which
means changing the corresponding code in the executor.

- ruleutils.c not updated to deparse the newly added window options

- leaks memory like it's going out of style

The memory leakage is caused by not resetting any memory contexts when
throwing away all the aggregate state when advancing the start of the
window frame. This looks like it will require a rethink of the memory
management being used; it's not enough just to pfree copies of the
transition values (which you don't appear to be doing), you have to
reset the memory context that was exposed to the transition functions
via context->wincontext. So the current setup of a single long-lived
context won't work; you'll need a long-lived one, plus an additional
one that you can reset any time the aggregates need to be
re-initialized. (And if you're not going to break existing aggregate
functions, WindowAggState.wincontext needs to be the one that gets
reset.)

Tests for memory leaks:

-- tests for failure to free by-ref transition values
select count(*)
from (select i,max(repeat(i::text,100)) over (order by i rows between 1 preceding and current row)
from generate_series(1,1000000) i) s;

-- tests for failure to reset memory context on window advance
select count(*)
from (select i,array_agg(i) over (order by i rows between 1 preceding and current row)
from generate_series(1,1000000) i) s;

--
Andrew (irc:RhodiumToad)

#9Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Andrew Gierth (#8)
Re: add more frame types in window functions (ROWS)

2009/11/19 Andrew Gierth <andrew@tao11.riddles.org.uk>:

Here's the rest of the review, as far as I've taken it given the
problems with the code.

The patch applied cleanly and includes regression tests but not docs.

Small nitpicks: there are some comments not updated (e.g. the
big one at the start of eval_windowaggregates). A couple of lines are
commented-out using C++ comments.

OK. It's tough for me to rewrite that big part of comment but I'll try it.

The overall approach seems ok, and the parser stuff seems fine to me.

These are the issues I've found that make it not committable in its
present form (including the ones I mentioned before):

 - missing _readWindowFrameDef function (all nodes that are output
  from parse analysis must have both _read and _out functions,
  otherwise views can't work)

 - the A_Const nodes should probably be transformed to Const nodes in
  parse analysis, since A_Const has no _read/_out functions, which
  means changing the corresponding code in the executor.

A_Const/Const will be replace by Expr, to cover any expression without
local Var.

 - ruleutils.c not updated to deparse the newly added window options

 - leaks memory like it's going out of style

The memory leakage is caused by not resetting any memory contexts when
throwing away all the aggregate state when advancing the start of the
window frame. This looks like it will require a rethink of the memory
management being used; it's not enough just to pfree copies of the
transition values (which you don't appear to be doing), you have to
reset the memory context that was exposed to the transition functions
via context->wincontext. So the current setup of a single long-lived
context won't work; you'll need a long-lived one, plus an additional
one that you can reset any time the aggregates need to be
re-initialized. (And if you're not going to break existing aggregate
functions, WindowAggState.wincontext needs to be the one that gets
reset.)

Hmm, good point. Though I doubt we need two contexts for this because
we have not so far (and we already have tmpcontext for that purpose),
memory leakage probably seems to happen. I'll check it out.

Thanks for your elaborate review anyway. All I was worried about is
now clear. It will be lucky if I can update my patch until next week.
So please keep it "Waiting on Author".

Regards,

--
Hitoshi Harada

#10Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Hitoshi Harada (#9)
1 attachment(s)
Re: add more frame types in window functions (ROWS)

2009/11/19 Hitoshi Harada <umi.tanuki@gmail.com>:

2009/11/19 Andrew Gierth <andrew@tao11.riddles.org.uk>:

Small nitpicks: there are some comments not updated (e.g. the
big one at the start of eval_windowaggregates). A couple of lines are
commented-out using C++ comments.

Fixed. Document patch is included as well.

- missing _readWindowFrameDef function (all nodes that are output
from parse analysis must have both _read and _out functions,
otherwise views can't work)

I changed my mind and WindowFrameDef is alive only in initial parser
stage so that _readWindowFrameDef is now unnecessary. Information of
startOffset and endOffset will be copied to WindowClause members.

- the A_Const nodes should probably be transformed to Const nodes in
parse analysis, since A_Const has no _read/_out functions, which
means changing the corresponding code in the executor.

Fixed.

- ruleutils.c not updated to deparse the newly added window options

Fixed.

- leaks memory like it's going out of style

As earlier mail, I added aggcontext to WindowAggState.

Still it doesn't contain RANGE ... PRECEDING / FOLLOWING. If it's not
acceptable for commit without RANGE value support, I'd agree with
that. I'm planning to work on that until the next CommitFest.

Regards,

--
Hitoshi Harada

Attachments:

rows_frame_types.20091129.patchapplication/octet-stream; name=rows_frame_types.20091129.patchDownload
*** a/doc/src/sgml/ref/select.sgml
--- b/doc/src/sgml/ref/select.sgml
***************
*** 619,631 **** WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceabl
      frame (not all do).  It can be one of
  <synopsis>
  RANGE UNBOUNDED PRECEDING
  RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ROWS UNBOUNDED PRECEDING
  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  </synopsis>
!     The first two are equivalent and are also the default: they set the
      frame to be all rows from the partition start up through the current row's
      last peer in the <literal>ORDER BY</> ordering (which means all rows if
      there is no <literal>ORDER BY</>).  The options
--- 619,646 ----
      frame (not all do).  It can be one of
  <synopsis>
  RANGE UNBOUNDED PRECEDING
+ RANGE CURRENT ROW
  RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
+ RANGE BETWEEN CURRENT ROW AND CURRENT ROW
+ RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
  ROWS UNBOUNDED PRECEDING
+ ROWS <replaceable>unsigned value</replaceable> PRECEDING
+ ROWS CURRENT ROW
  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+ ROWS BETWEEN UNBOUNDED PRECEDING AND <replaceable>unsigned value</replaceable> FOLLOWING
  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
+ ROWS BETWEEN <replaceable>unsigned value</replaceable> PRECEDING AND <replaceable>unsigned value</replaceable> PRECEDING
+ ROWS BETWEEN <replaceable>unsigned value</replaceable> PRECEDING AND CURRENT ROW
+ ROWS BETWEEN <replaceable>unsigned value</replaceable> PRECEDING AND <replaceable>unsigned value</replaceable> FOLLOWING
+ ROWS BETWEEN <replaceable>unsigned value</replaceable> PRECEDING AND UNBOUNDED FOLLOWING
+ ROWS BETWEEN CURRENT ROW AND CURRENT ROW
+ ROWS BETWEEN CURRENT ROW AND <replaceable>unsigned value</replaceable> FOLLOWING
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ ROWS BETWEEN <replaceable>unsigned value</replaceable> FOLLOWING AND <replaceable>unsigned value</replaceable> FOLLOWING
+ ROWS BETWEEN <replaceable>unsigned value</replaceable> AND UNBOUNDED FOLLOWING
  </synopsis>
!     The first and the third are equivalent and are also the default: they set the
      frame to be all rows from the partition start up through the current row's
      last peer in the <literal>ORDER BY</> ordering (which means all rows if
      there is no <literal>ORDER BY</>).  The options
***************
*** 637,642 **** ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
--- 652,664 ----
      all rows up through the current row (regardless of duplicates).
      Beware that this option can produce implementation-dependent results
      if the <literal>ORDER BY</> ordering does not order the rows uniquely.
+     The <replaceable>unsigned value</replaceable> in <literal>PRECEDING</>
+     or <literal>FOLLOWING</> means how far the frame bound is off from the
+     current row. In <literal>ROWS</> mode this value means phisical row
+     number from the current row and only integer value is allowed. In
+     <literal>RANGE</> mode this value can also be specified in SQL standard
+     altough PostgreSQL doesn't support it currently. You can specify
+     any expression for this value except for column reference.
     </para>
  
     <para>
*** a/doc/src/sgml/syntax.sgml
--- b/doc/src/sgml/syntax.sgml
***************
*** 1638,1648 **** sqrt(2)
--- 1638,1663 ----
      can be one of
  <synopsis>
  RANGE UNBOUNDED PRECEDING
+ RANGE CURRENT ROW
  RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
+ RANGE BETWEEN CURRENT ROW AND CURRENT ROW
+ RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
  ROWS UNBOUNDED PRECEDING
+ ROWS <replaceable>unsigned value</replaceable> PRECEDING
+ ROWS CURRENT ROW
  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+ ROWS BETWEEN UNBOUNDED PRECEDING AND <replaceable>unsigned value</replaceable> FOLLOWING
  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
+ ROWS BETWEEN <replaceable>unsigned value</replaceable> PRECEDING AND <replaceable>unsigned value</replaceable> PRECEDING
+ ROWS BETWEEN <replaceable>unsigned value</replaceable> PRECEDING AND CURRENT ROW
+ ROWS BETWEEN <replaceable>unsigned value</replaceable> PRECEDING AND <replaceable>unsigned value</replaceable> FOLLOWING
+ ROWS BETWEEN <replaceable>unsigned value</replaceable> PRECEDING AND UNBOUNDED FOLLOWING
+ ROWS BETWEEN CURRENT ROW AND CURRENT ROW
+ ROWS BETWEEN CURRENT ROW AND <replaceable>unsigned value</replaceable> FOLLOWING
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ ROWS BETWEEN <replaceable>unsigned value</replaceable> FOLLOWING AND <replaceable>unsigned value</replaceable> FOLLOWING
+ ROWS BETWEEN <replaceable>unsigned value</replaceable> AND UNBOUNDED FOLLOWING
  </synopsis>
  
      Here, <replaceable>expression</replaceable> represents any value
***************
*** 1682,1687 **** ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
--- 1697,1713 ----
      all rows up through the current row (regardless of duplicates).
      Beware that this option can produce implementation-dependent results
      if the <literal>ORDER BY</> ordering does not order the rows uniquely.
+   </para>
+ 
+   <para>
+     The <replaceable>unsigned value</replaceable> in <literal>PRECEDING</>
+     or <literal>FOLLOWING</> tells how far the frame bound is off from
+     the current row. In <literal>ROWS</> mode this value means phisical
+     row number off from the current row and thus must be zero or positive
+     integer. The syntax of <literal>RANGE</> mode offset value is defined
+     in SQL standard but is not supported currently. You can specify any
+     expression except for column reference that changes row by row for
+     this value.
     </para>
  
     <para>
*** a/src/backend/executor/nodeWindowAgg.c
--- b/src/backend/executor/nodeWindowAgg.c
***************
*** 164,170 **** static void spool_tuples(WindowAggState *winstate, int64 pos);
  static void release_partition(WindowAggState *winstate);
  
  static bool row_is_in_frame(WindowAggState *winstate, int64 pos,
! 				TupleTableSlot *slot);
  static void update_frametailpos(WindowObject winobj, TupleTableSlot *slot);
  
  static WindowStatePerAggData *initialize_peragg(WindowAggState *winstate,
--- 164,171 ----
  static void release_partition(WindowAggState *winstate);
  
  static bool row_is_in_frame(WindowAggState *winstate, int64 pos,
! 							TupleTableSlot *slot);
! static void update_frameheadpos(WindowObject winobj, TupleTableSlot *slot);
  static void update_frametailpos(WindowObject winobj, TupleTableSlot *slot);
  
  static WindowStatePerAggData *initialize_peragg(WindowAggState *winstate,
***************
*** 193,199 **** initialize_windowaggregate(WindowAggState *winstate,
  		peraggstate->transValue = peraggstate->initValue;
  	else
  	{
! 		oldContext = MemoryContextSwitchTo(winstate->wincontext);
  		peraggstate->transValue = datumCopy(peraggstate->initValue,
  											peraggstate->transtypeByVal,
  											peraggstate->transtypeLen);
--- 194,200 ----
  		peraggstate->transValue = peraggstate->initValue;
  	else
  	{
! 		oldContext = MemoryContextSwitchTo(winstate->aggcontext);
  		peraggstate->transValue = datumCopy(peraggstate->initValue,
  											peraggstate->transtypeByVal,
  											peraggstate->transtypeLen);
***************
*** 258,267 **** advance_windowaggregate(WindowAggState *winstate,
  			 * already checked that the agg's input type is binary-compatible
  			 * with its transtype, so straight copy here is OK.)
  			 *
! 			 * We must copy the datum into wincontext if it is pass-by-ref. We
  			 * do not need to pfree the old transValue, since it's NULL.
  			 */
! 			MemoryContextSwitchTo(winstate->wincontext);
  			peraggstate->transValue = datumCopy(fcinfo->arg[1],
  												peraggstate->transtypeByVal,
  												peraggstate->transtypeLen);
--- 259,268 ----
  			 * already checked that the agg's input type is binary-compatible
  			 * with its transtype, so straight copy here is OK.)
  			 *
! 			 * We must copy the datum into aggcontext if it is pass-by-ref. We
  			 * do not need to pfree the old transValue, since it's NULL.
  			 */
! 			MemoryContextSwitchTo(winstate->aggcontext);
  			peraggstate->transValue = datumCopy(fcinfo->arg[1],
  												peraggstate->transtypeByVal,
  												peraggstate->transtypeLen);
***************
*** 294,300 **** advance_windowaggregate(WindowAggState *winstate,
  	newVal = FunctionCallInvoke(fcinfo);
  
  	/*
! 	 * If pass-by-ref datatype, must copy the new value into wincontext and
  	 * pfree the prior transValue.	But if transfn returned a pointer to its
  	 * first input, we don't need to do anything.
  	 */
--- 295,301 ----
  	newVal = FunctionCallInvoke(fcinfo);
  
  	/*
! 	 * If pass-by-ref datatype, must copy the new value into aggcontext and
  	 * pfree the prior transValue.	But if transfn returned a pointer to its
  	 * first input, we don't need to do anything.
  	 */
***************
*** 303,309 **** advance_windowaggregate(WindowAggState *winstate,
  	{
  		if (!fcinfo->isnull)
  		{
! 			MemoryContextSwitchTo(winstate->wincontext);
  			newVal = datumCopy(newVal,
  							   peraggstate->transtypeByVal,
  							   peraggstate->transtypeLen);
--- 304,310 ----
  	{
  		if (!fcinfo->isnull)
  		{
! 			MemoryContextSwitchTo(winstate->aggcontext);
  			newVal = datumCopy(newVal,
  							   peraggstate->transtypeByVal,
  							   peraggstate->transtypeLen);
***************
*** 391,396 **** eval_windowaggregates(WindowAggState *winstate)
--- 392,398 ----
  	MemoryContext oldContext;
  	ExprContext *econtext;
  	TupleTableSlot *agg_row_slot;
+ 	WindowObject agg_winobj;
  
  	numaggs = winstate->numaggs;
  	if (numaggs == 0)
***************
*** 398,417 **** eval_windowaggregates(WindowAggState *winstate)
  
  	/* final output execution is in ps_ExprContext */
  	econtext = winstate->ss.ps.ps_ExprContext;
  
  	/*
  	 * Currently, we support only a subset of the SQL-standard window framing
! 	 * rules.  In all the supported cases, the window frame always consists of
! 	 * a contiguous group of rows extending forward from the start of the
  	 * partition, and rows only enter the frame, never exit it, as the current
  	 * row advances forward.  This makes it possible to use an incremental
! 	 * strategy for evaluating aggregates: we run the transition function for
! 	 * each row added to the frame, and run the final function whenever we
! 	 * need the current aggregate value.  This is considerably more efficient
! 	 * than the naive approach of re-running the entire aggregate calculation
! 	 * for each current row.  It does assume that the final function doesn't
! 	 * damage the running transition value, but we have the same assumption
! 	 * in nodeAgg.c too (when it rescans an existing hash table).
  	 *
  	 * In many common cases, multiple rows share the same frame and hence the
  	 * same aggregate value. (In particular, if there's no ORDER BY in a RANGE
--- 400,426 ----
  
  	/* final output execution is in ps_ExprContext */
  	econtext = winstate->ss.ps.ps_ExprContext;
+ 	agg_winobj = winstate->agg_winobj;
+ 	agg_row_slot = winstate->agg_row_slot;
  
  	/*
  	 * Currently, we support only a subset of the SQL-standard window framing
! 	 * rules.  In some cases, the window frame always consists of a
! 	 * contiguous group of rows extending forward from the start of the
  	 * partition, and rows only enter the frame, never exit it, as the current
  	 * row advances forward.  This makes it possible to use an incremental
! 	 * strategy for evaluating aggregates rather than re-running the entire
! 	 * aggregate calculation for each row, which is more efficient.  To keep
! 	 * these cases efficient, we cannot know the frame tail bound before
! 	 * hitting it.
! 	 *
! 	 * The other cases are re-initializing aggregate on head moving.  Aside
! 	 * from the first cases, if frame head is moving and rows exit the frame
! 	 * as advancing current row, the intermediate aggregate results should
! 	 * be discarded and re-run from the head again.  'aggregatedbase' points
! 	 * to the first row where current aggregate calculation starts.  If the
! 	 * frame head doesn't move, intermediate result will be used similarly to
! 	 * the first cases.
  	 *
  	 * In many common cases, multiple rows share the same frame and hence the
  	 * same aggregate value. (In particular, if there's no ORDER BY in a RANGE
***************
*** 424,466 **** eval_windowaggregates(WindowAggState *winstate)
  	 * accumulated into the aggregate transition values.  Whenever we start a
  	 * new peer group, we accumulate forward to the end of the peer group.
  	 *
! 	 * TODO: In the future, we should implement the full SQL-standard set of
! 	 * framing rules.  We could implement the other cases by recalculating the
! 	 * aggregates whenever a row exits the frame.  That would be pretty slow,
! 	 * though.	For aggregates like SUM and COUNT we could implement a
! 	 * "negative transition function" that would be called for each row as it
! 	 * exits the frame.  We'd have to think about avoiding recalculation of
! 	 * volatile arguments of aggregate functions, too.
  	 */
  
  	/*
! 	 * If we've already aggregated up through current row, reuse the saved
! 	 * result values.  NOTE: this test works for the currently supported
! 	 * framing rules, but will need fixing when more are added.
  	 */
! 	if (winstate->aggregatedupto > winstate->currentpos)
  	{
  		for (i = 0; i < numaggs; i++)
  		{
  			peraggstate = &winstate->peragg[i];
  			wfuncno = peraggstate->wfuncno;
! 			econtext->ecxt_aggvalues[wfuncno] = peraggstate->resultValue;
! 			econtext->ecxt_aggnulls[wfuncno] = peraggstate->resultValueIsNull;
  		}
! 		return;
  	}
  
! 	/* Initialize aggregates on first call for partition */
! 	if (winstate->currentpos == 0)
  	{
  		for (i = 0; i < numaggs; i++)
  		{
  			peraggstate = &winstate->peragg[i];
  			wfuncno = peraggstate->wfuncno;
! 			initialize_windowaggregate(winstate,
! 									   &winstate->perfunc[wfuncno],
! 									   peraggstate);
  		}
  	}
  
  	/*
--- 433,517 ----
  	 * accumulated into the aggregate transition values.  Whenever we start a
  	 * new peer group, we accumulate forward to the end of the peer group.
  	 *
! 	 * TODO: In the future, we should support "negative transition function"
! 	 * to optimize the cases frame head is moving.  For aggregates like SUM
! 	 * and COUNT we could implement another function that would be called
! 	 * for each row as it exits the frame.  We'd have to think about avoiding
! 	 * recalculation of volatile arguments of aggregate functions, too.
  	 */
  
  	/*
! 	 * First, update frame positions. Aggregate framing detection is similar
! 	 * to one in window function APIs, but it optimizes it's own flow to
! 	 * detect frame head and tail, which finds the tail during aggregation
! 	 * rather than before starting. This results in tuplestore buffering
! 	 * be minimized in such simple frame like RANGE BETWEEN UNBOUNDED PRECEDING
! 	 * AND CURRENT ROW (ie default when specifying only ORDER BY clause.)
  	 */
! 	update_frameheadpos(agg_winobj, winstate->agg_row_slot);
! 
! 	/*
! 	 * Initialize aggregates on first call for partition or when some rows
! 	 * were off from frame since the last call.
! 	 */
! 	if (winstate->currentpos == 0 ||
! 		winstate->frameheadpos > winstate->aggregatedbase)
  	{
+ 		/*
+ 		 * When frame head is advancing, aggregate mark position should be
+ 		 * updated so that tuplestore can discard unnecessary rows.
+ 		 * When the condition is false (i.e. start is UNBOUNDED PRECEDING),
+ 		 * mark_ptr is invalid and we don't care about mark position.
+ 		 */
+ 		if (agg_winobj->markptr > 0)
+ 		{
+ 			if (winstate->frameheadpos < winstate->currentpos)
+ 				WinSetMarkPosition(agg_winobj, winstate->frameheadpos - 1);
+ 
+ 			/*
+ 			 * Discard current state, so that subsequent loop starts with null slot
+ 			 */
+ 			if (!TupIsNull(agg_row_slot))
+ 				ExecClearTuple(agg_row_slot);
+ 			
+ 			/*
+ 			 * reset all memory that contains aggregate transient
+ 			 * values with following pointers allocated in individual
+ 			 * functions.
+ 			 */
+ 			MemoryContextResetAndDeleteChildren(winstate->aggcontext);
+ 		}
+ 
  		for (i = 0; i < numaggs; i++)
  		{
  			peraggstate = &winstate->peragg[i];
  			wfuncno = peraggstate->wfuncno;
! 			initialize_windowaggregate(winstate,
! 									   &winstate->perfunc[wfuncno],
! 									   peraggstate);
  		}
! 
! 		winstate->aggregatedbase = winstate->frameheadpos;
! 		winstate->aggregatedupto = winstate->frameheadpos;
  	}
  
! 	/*
! 	 * If we've already aggregated up through current row and frame covers
! 	 * current row, reuse the saved result values. NOTE: this test should get
! 	 * more efficient in some framing modes.
! 	 */
! 	if (winstate->frametail_stable &&
! 		winstate->aggregatedbase <= winstate->currentpos &&
! 		winstate->aggregatedupto > winstate->currentpos)
  	{
  		for (i = 0; i < numaggs; i++)
  		{
  			peraggstate = &winstate->peragg[i];
  			wfuncno = peraggstate->wfuncno;
! 			econtext->ecxt_aggvalues[wfuncno] = peraggstate->resultValue;
! 			econtext->ecxt_aggnulls[wfuncno] = peraggstate->resultValueIsNull;
  		}
+ 		return;
  	}
  
  	/*
***************
*** 470,486 **** eval_windowaggregates(WindowAggState *winstate)
  	 * at position aggregatedupto.	The agg_ptr read pointer must always point
  	 * to the next row to read into agg_row_slot.
  	 */
- 	agg_row_slot = winstate->agg_row_slot;
  	for (;;)
  	{
  		/* Fetch next row if we didn't already */
  		if (TupIsNull(agg_row_slot))
  		{
! 			spool_tuples(winstate, winstate->aggregatedupto);
! 			tuplestore_select_read_pointer(winstate->buffer,
! 										   winstate->agg_ptr);
! 			if (!tuplestore_gettupleslot(winstate->buffer, true, true,
! 										 agg_row_slot))
  				break;			/* must be end of partition */
  		}
  
--- 521,539 ----
  	 * at position aggregatedupto.	The agg_ptr read pointer must always point
  	 * to the next row to read into agg_row_slot.
  	 */
  	for (;;)
  	{
  		/* Fetch next row if we didn't already */
  		if (TupIsNull(agg_row_slot))
  		{
! 			/*
! 			 * NOTE:
! 			 * Here window_gettupleslot() tends to fetch current row
! 			 * (but not always). We should optimize window_gettupleslot()
! 			 * on this case, where reads ahead and back currently.
! 			 */
! 			if(!window_gettupleslot(agg_winobj, winstate->aggregatedupto,
! 									agg_row_slot))
  				break;			/* must be end of partition */
  		}
  
***************
*** 544,554 **** eval_windowaggregates(WindowAggState *winstate)
  				pfree(DatumGetPointer(peraggstate->resultValue));
  
  			/*
! 			 * If pass-by-ref, copy it into our global context.
  			 */
  			if (!*isnull)
  			{
! 				oldContext = MemoryContextSwitchTo(winstate->wincontext);
  				peraggstate->resultValue =
  					datumCopy(*result,
  							  peraggstate->resulttypeByVal,
--- 597,607 ----
  				pfree(DatumGetPointer(peraggstate->resultValue));
  
  			/*
! 			 * If pass-by-ref, copy it into our aggregate context.
  			 */
  			if (!*isnull)
  			{
! 				oldContext = MemoryContextSwitchTo(winstate->aggcontext);
  				peraggstate->resultValue =
  					datumCopy(*result,
  							  peraggstate->resulttypeByVal,
***************
*** 627,634 **** begin_partition(WindowAggState *winstate)
  	winstate->frametail_valid = false;
  	winstate->spooled_rows = 0;
  	winstate->currentpos = 0;
  	winstate->frametailpos = -1;
- 	winstate->aggregatedupto = 0;
  	ExecClearTuple(winstate->agg_row_slot);
  
  	/*
--- 680,687 ----
  	winstate->frametail_valid = false;
  	winstate->spooled_rows = 0;
  	winstate->currentpos = 0;
+ 	winstate->frameheadpos = -1;
  	winstate->frametailpos = -1;
  	ExecClearTuple(winstate->agg_row_slot);
  
  	/*
***************
*** 665,671 **** begin_partition(WindowAggState *winstate)
  
  	/* create a read pointer for aggregates, if needed */
  	if (winstate->numaggs > 0)
! 		winstate->agg_ptr = tuplestore_alloc_read_pointer(winstate->buffer, 0);
  
  	/* create mark and read pointers for each real window function */
  	for (i = 0; i < numfuncs; i++)
--- 718,747 ----
  
  	/* create a read pointer for aggregates, if needed */
  	if (winstate->numaggs > 0)
! 	{
! 		WindowObject agg_winobj = winstate->agg_winobj;
! 		int			readptr_flag = 0;
! 		WindowAgg  *node = (WindowAgg *) winstate->ss.ps.plan;
! 		int			frameOptions = node->frameOptions;
! 
! 		/*
! 		 * create a read pointer for aggregate marking
! 		 * if the frame head may move
! 		 */
! 		if (!(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING))
! 		{
! 			agg_winobj->markptr = tuplestore_alloc_read_pointer(winstate->buffer, 0);
! 			readptr_flag = EXEC_FLAG_BACKWARD;
! 		}
! 
! 		agg_winobj->readptr = tuplestore_alloc_read_pointer(winstate->buffer,
! 															readptr_flag);
! 
! 		agg_winobj->markpos = -1;
! 		agg_winobj->seekpos = -1;
! 		winstate->aggregatedbase = 0;
! 		winstate->aggregatedupto = 0;
! 	}
  
  	/* create mark and read pointers for each real window function */
  	for (i = 0; i < numfuncs; i++)
***************
*** 790,795 **** release_partition(WindowAggState *winstate)
--- 866,872 ----
  	 * aggregates might have allocated data we don't have direct pointers to.
  	 */
  	MemoryContextResetAndDeleteChildren(winstate->wincontext);
+ 	MemoryContextResetAndDeleteChildren(winstate->aggcontext);
  
  	if (winstate->buffer)
  		tuplestore_end(winstate->buffer);
***************
*** 814,842 **** row_is_in_frame(WindowAggState *winstate, int64 pos, TupleTableSlot *slot)
  
  	Assert(pos >= 0);			/* else caller error */
  
! 	/* We only support frame start mode UNBOUNDED PRECEDING for now */
! 	Assert(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING);
  
  	/* In UNBOUNDED FOLLOWING mode, all partition rows are in frame */
  	if (frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING)
  		return true;
  
! 	/* Else frame tail mode must be CURRENT ROW */
! 	Assert(frameOptions & FRAMEOPTION_END_CURRENT_ROW);
  
! 	/* if row is current row or a predecessor, it must be in frame */
! 	if (pos <= winstate->currentpos)
! 		return true;
  
! 	/* In ROWS mode, *only* such rows are in frame */
! 	if (frameOptions & FRAMEOPTION_ROWS)
! 		return false;
  
! 	/* Else must be RANGE mode */
! 	Assert(frameOptions & FRAMEOPTION_RANGE);
  
! 	/* In frame iff it's a peer of current row */
! 	return are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot);
  }
  
  /*
--- 891,1063 ----
  
  	Assert(pos >= 0);			/* else caller error */
  
! 	if (frameOptions & FRAMEOPTION_START_CURRENT_ROW)
! 	{
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			if (pos < winstate->currentpos)
! 				return false;
! 		}
! 		else
! 		{
! 			Assert(frameOptions & FRAMEOPTION_RANGE);
! 
! 			/* preceding row that is not peer is out of frame */
! 			if (pos < winstate->currentpos &&
! 				!are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot))
! 				return false;
! 		}
! 	}
! 
! 	if (frameOptions & FRAMEOPTION_START_VALUE)
! 	{
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			int64	offset = winstate->startRowOffset;
! 
! 			/* In ROWS mode, row within phisical offsets is in frame */
! 			if(frameOptions & FRAMEOPTION_START_VALUE_PRECEDING)
! 				offset = -offset;
! 			else
! 				Assert(frameOptions & FRAMEOPTION_START_VALUE_FOLLOWING);
! 
! 			if (pos < winstate->currentpos + offset)
! 				return false;
! 		}
! 		else
! 			elog(ERROR, "RANGE ... value PRECEDING / FOLLOWING ... is not supported");
! 	}
! 
! 	/* seems ok for starting bound up to now */
  
  	/* In UNBOUNDED FOLLOWING mode, all partition rows are in frame */
  	if (frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING)
  		return true;
  
! 	if (frameOptions & FRAMEOPTION_END_CURRENT_ROW)
! 	{
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			return (pos <= winstate->currentpos);
! 		}
! 		else
! 		{
! 			Assert(frameOptions & FRAMEOPTION_RANGE);
! 
! 			/* preceding row or peer is in frame */
! 			return pos <= winstate->currentpos ||
! 				are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot);
! 		}
! 	}
  
! 	if (frameOptions & FRAMEOPTION_END_VALUE)
! 	{
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			int64	offset = winstate->endRowOffset;
  
! 			/* In ROWS mode, row within phisical offsets is in frame */
! 			if (frameOptions & FRAMEOPTION_END_VALUE_PRECEDING)
! 				offset = -offset;
! 			else
! 				Assert(frameOptions & FRAMEOPTION_END_VALUE_FOLLOWING);
! 
! 			return (pos <= winstate->currentpos + offset);
! 		}
! 		else
! 			elog(ERROR, "RANGE ... value PRECEDING / FOLLOWING is not supported");
! 	}
! 
! 	/* It is not supposed to reach here */
! 	elog(ERROR, "invalid frame option: %d", frameOptions);
! 
! 	return false; /* keep compiler quiet */
! }
! 
! static void
! update_frameheadpos(WindowObject winobj, TupleTableSlot *slot)
! {
! 	WindowAggState *winstate = winobj->winstate;
! 	WindowAgg  *node = (WindowAgg *) winstate->ss.ps.plan;
! 	int			frameOptions = node->frameOptions;
! 
! 	if (winstate->framehead_valid)
! 		return;
! 
! 	if (frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING)
! 	{
! 		winstate->frameheadpos = 0;
! 		winstate->framehead_valid = true;
! 		return;
! 	}
! 
! 	if (frameOptions & FRAMEOPTION_START_CURRENT_ROW)
! 	{
! 		int64		pos = winstate->currentpos;
! 
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			winstate->frameheadpos = pos;
! 			winstate->framehead_valid = true;
! 			return;
! 		}
! 		Assert(frameOptions & FRAMEOPTION_RANGE);
! 
! 		if (node->ordNumCols == 0)
! 		{
! 			winstate->frameheadpos = 0;
! 			winstate->framehead_valid = true;
! 			return;
! 		}
! 		/*
! 		 * In RANGE START_CURRENT mode, frame head is
! 		 * the head row of all peers.
! 		 */
! 		for(;;)
! 		{
! 			if (!window_gettupleslot(winobj, pos, slot))
! 				break;
! 			if (!are_peers(winstate, slot,
! 						   winstate->ss.ss_ScanTupleSlot))
! 				break;
! 			pos--;
! 		}
! 		winstate->frameheadpos = pos + 1;
! 		winstate->framehead_valid = true;
! 		return;
! 	}
  
! 	if (frameOptions & FRAMEOPTION_START_VALUE)
! 	{
! 		/* In ROWS mode, frame bound is phisically n before/after current */
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			int64	offset = winstate->startRowOffset;
! 
! 			if (frameOptions & FRAMEOPTION_START_VALUE_PRECEDING)
! 				offset = -offset;
! 			else
! 				Assert(frameOptions & FRAMEOPTION_START_VALUE_FOLLOWING);
  
! 			winstate->frameheadpos = winstate->currentpos + offset;
! 			if (offset < 0)
! 			{
! 				if (winstate->frameheadpos < 0)
! 					winstate->frameheadpos = 0;
! 			}
! 			else if (offset > 0)
! 			{
! 				/* we have to know end of the partition to cut off oveflow */
! 				spool_tuples(winstate, -1);
! 				if (winstate->frameheadpos >= winstate->spooled_rows)
! 					winstate->frameheadpos = winstate->spooled_rows - 1;
! 			}
! 			winstate->framehead_valid = true;
! 			return;
! 		}
! 		else
! 			elog(ERROR, "RAGE ... value PRECEDING / FOLLOWING is not supported");
! 	}
  }
  
  /*
***************
*** 858,866 **** update_frametailpos(WindowObject winobj, TupleTableSlot *slot)
  	if (winstate->frametail_valid)
  		return;					/* already known for current row */
  
- 	/* We only support frame start mode UNBOUNDED PRECEDING for now */
- 	Assert(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING);
- 
  	/* In UNBOUNDED FOLLOWING mode, all partition rows are in frame */
  	if (frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING)
  	{
--- 1079,1084 ----
***************
*** 870,916 **** update_frametailpos(WindowObject winobj, TupleTableSlot *slot)
  		return;
  	}
  
! 	/* Else frame tail mode must be CURRENT ROW */
! 	Assert(frameOptions & FRAMEOPTION_END_CURRENT_ROW);
! 
! 	/* In ROWS mode, exactly the rows up to current are in frame */
! 	if (frameOptions & FRAMEOPTION_ROWS)
  	{
! 		winstate->frametailpos = winstate->currentpos;
! 		winstate->frametail_valid = true;
! 		return;
! 	}
  
! 	/* Else must be RANGE mode */
! 	Assert(frameOptions & FRAMEOPTION_RANGE);
  
! 	/* If no ORDER BY, all rows are peers with each other */
! 	if (node->ordNumCols == 0)
! 	{
! 		spool_tuples(winstate, -1);
! 		winstate->frametailpos = winstate->spooled_rows - 1;
! 		winstate->frametail_valid = true;
! 		return;
  	}
  
! 	/*
! 	 * Else we have to search for the first non-peer of the current row. We
! 	 * assume the current value of frametailpos is a lower bound on the
! 	 * possible frame tail location, ie, frame tail never goes backward, and
! 	 * that currentpos is also a lower bound, ie, current row is always in
! 	 * frame.
! 	 */
! 	ftnext = Max(winstate->frametailpos, winstate->currentpos) + 1;
! 	for (;;)
  	{
! 		if (!window_gettupleslot(winobj, ftnext, slot))
! 			break;				/* end of partition */
! 		if (!are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot))
! 			break;				/* not peer of current row */
! 		ftnext++;
  	}
- 	winstate->frametailpos = ftnext - 1;
- 	winstate->frametail_valid = true;
  }
  
  
--- 1088,1169 ----
  		return;
  	}
  
! 	if (frameOptions & FRAMEOPTION_END_CURRENT_ROW)
  	{
! 		/* In ROWS mode, exactly the rows up to current are in frame */
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			winstate->frametailpos = winstate->currentpos;
! 			winstate->frametail_valid = true;
! 			return;
! 		}
! 		else
! 		{
! 			/* Else must be RANGE mode */
! 			Assert(frameOptions & FRAMEOPTION_RANGE);
  
! 			/* If no ORDER BY, all rows are peers with each other */
! 			if (node->ordNumCols == 0)
! 			{
! 				spool_tuples(winstate, -1);
! 				winstate->frametailpos = winstate->spooled_rows - 1;
! 				winstate->frametail_valid = true;
! 				return;
! 			}
  
! 			/*
! 			 * Else we have to search for the first non-peer of the current row. We
! 			 * assume the current value of frametailpos is a lower bound on the
! 			 * possible frame tail location, ie, frame tail never goes backward, and
! 			 * that currentpos is also a lower bound, ie, current row is always in
! 			 * frame.
! 			 */
! 			ftnext = Max(winstate->frametailpos, winstate->currentpos) + 1;
! 			for (;;)
! 			{
! 				if (!window_gettupleslot(winobj, ftnext, slot))
! 					break;				/* end of partition */
! 				if (!are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot))
! 					break;				/* not peer of current row */
! 				ftnext++;
! 			}
! 			winstate->frametailpos = ftnext - 1;
! 			winstate->frametail_valid = true;
! 			return;
! 		}
  	}
  
! 	if (frameOptions & FRAMEOPTION_END_VALUE)
  	{
! 		/* In ROWS mode, frame bound is phisically n before/after current */
! 		if (frameOptions & FRAMEOPTION_ROWS)
! 		{
! 			int64	offset = winstate->endRowOffset;
! 
! 			if (frameOptions & FRAMEOPTION_END_VALUE_PRECEDING)
! 				offset = -offset;
! 			else
! 				Assert(frameOptions & FRAMEOPTION_END_VALUE_FOLLOWING);
! 			winstate->frametailpos = winstate->currentpos + offset;
! 
! 			if (offset < 0)
! 			{
! 				if (winstate->frametailpos < 0)
! 					winstate->frametailpos = 0;
! 			}
! 			else if (offset > 0)
! 			{
! 				/* we have to know end of the partition to cut off overflow */
! 				spool_tuples(winstate, -1);
! 				if (winstate->frametailpos >= winstate->spooled_rows)
! 					winstate->frametailpos = winstate->spooled_rows - 1;
! 			}
! 			winstate->frametail_valid = true;
! 			return;
! 		}
! 		else
! 			elog(ERROR, "RANGE ... value PRECEDING / FOLLOWING is not supported");
  	}
  }
  
  
***************
*** 953,958 **** ExecWindowAgg(WindowAggState *winstate)
--- 1206,1262 ----
  		winstate->ss.ps.ps_TupFromTlist = false;
  	}
  
+ 	if (winstate->all_first)
+ 	{
+ 		WindowAgg	   *node = (WindowAgg *) winstate->ss.ps.plan;
+ 		int				frameOptions = node->frameOptions;
+ 		ExprContext	   *econtext = winstate->ss.ps.ps_ExprContext;
+ 		Datum			value;
+ 		bool			isnull;
+ 
+ 		if (frameOptions & FRAMEOPTION_START_VALUE)
+ 		{
+ 			Assert(winstate->startOffset != NULL);
+ 			value = ExecEvalExprSwitchContext(winstate->startOffset,
+ 											  econtext,
+ 											  &isnull,
+ 											  NULL);
+ 			if (isnull)
+ 				elog(ERROR, "frame starting value must not be NULL");
+ 			if (frameOptions & FRAMEOPTION_ROWS)
+ 			{
+ 				int64	offset = DatumGetInt64(value);
+ 
+ 				if (offset < 0)
+ 					elog(ERROR, "frame starting offset must not be negative");
+ 				winstate->startRowOffset = offset;
+ 			}
+ 			else
+ 				elog(ERROR, "frame starting bound cannot be determined");
+ 		}
+ 		if (frameOptions & FRAMEOPTION_END_VALUE)
+ 		{
+ 			Assert(winstate->endOffset != NULL);
+ 			value = ExecEvalExprSwitchContext(winstate->endOffset,
+ 											  econtext,
+ 											  &isnull,
+ 											  NULL);
+ 			if (isnull)
+ 				elog(ERROR, "frame ending value must not be NULL");
+ 			if (frameOptions & FRAMEOPTION_ROWS)
+ 			{
+ 				int64	offset = DatumGetInt64(value);
+ 
+ 				if (offset < 0)
+ 					elog(ERROR, "frame ending value must not be negative");
+ 				winstate->endRowOffset = offset;
+ 			}
+ 			else
+ 				elog(ERROR, "frame ending bound cannot be determined");
+ 		}
+ 		winstate->all_first = false;
+ 	}
+ 
  restart:
  	if (winstate->buffer == NULL)
  	{
***************
*** 964,969 **** restart:
--- 1268,1274 ----
  	{
  		/* Advance current row within partition */
  		winstate->currentpos++;
+ 		winstate->framehead_valid = false;
  		/* This might mean that the frame tail moves, too */
  		winstate->frametail_valid = false;
  	}
***************
*** 1099,1112 **** ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
  	winstate->tmpcontext = tmpcontext;
  	ExecAssignExprContext(estate, &winstate->ss.ps);
  
! 	/* Create long-lived context for storage of aggregate transvalues etc */
  	winstate->wincontext =
  		AllocSetContextCreate(CurrentMemoryContext,
! 							  "WindowAggContext",
  							  ALLOCSET_DEFAULT_MINSIZE,
  							  ALLOCSET_DEFAULT_INITSIZE,
  							  ALLOCSET_DEFAULT_MAXSIZE);
  
  	/*
  	 * tuple table initialization
  	 */
--- 1404,1424 ----
  	winstate->tmpcontext = tmpcontext;
  	ExecAssignExprContext(estate, &winstate->ss.ps);
  
! 	/* Create long-lived context for storage of partition-local memory etc */
  	winstate->wincontext =
  		AllocSetContextCreate(CurrentMemoryContext,
! 							  "WindowAggContextWin",
  							  ALLOCSET_DEFAULT_MINSIZE,
  							  ALLOCSET_DEFAULT_INITSIZE,
  							  ALLOCSET_DEFAULT_MAXSIZE);
  
+ 	/* Create mid-lived context for aggregate trans values etc */
+ 	winstate->aggcontext =
+ 		AllocSetContextCreate(CurrentMemoryContext,
+ 							  "WindowAggContextAgg",
+ 							  ALLOCSET_DEFAULT_MINSIZE,
+ 							  ALLOCSET_DEFAULT_INITSIZE,
+ 							  ALLOCSET_DEFAULT_MAXSIZE);
  	/*
  	 * tuple table initialization
  	 */
***************
*** 1229,1235 **** ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
  		perfuncstate->numArguments = list_length(wfuncstate->args);
  
  		fmgr_info_cxt(wfunc->winfnoid, &perfuncstate->flinfo,
! 					  tmpcontext->ecxt_per_query_memory);
  		perfuncstate->flinfo.fn_expr = (Node *) wfunc;
  		get_typlenbyval(wfunc->wintype,
  						&perfuncstate->resulttypeLen,
--- 1541,1547 ----
  		perfuncstate->numArguments = list_length(wfuncstate->args);
  
  		fmgr_info_cxt(wfunc->winfnoid, &perfuncstate->flinfo,
! 					  econtext->ecxt_per_query_memory);
  		perfuncstate->flinfo.fn_expr = (Node *) wfunc;
  		get_typlenbyval(wfunc->wintype,
  						&perfuncstate->resulttypeLen,
***************
*** 1264,1272 **** ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
--- 1576,1612 ----
  	winstate->numfuncs = wfuncno + 1;
  	winstate->numaggs = aggno + 1;
  
+ 	if (winstate->numaggs > 0)
+ 	{
+ 		WindowObject agg_winobj = makeNode(WindowObjectData);
+ 
+ 		agg_winobj->winstate = winstate;
+ 		agg_winobj->argstates = NULL;
+ 		agg_winobj->localmem = NULL;
+ 		/* make sure markptr = -1 to invalidate. it may not be used */
+ 		agg_winobj->markptr = -1;
+ 		agg_winobj->readptr = -1;
+ 		winstate->agg_winobj = agg_winobj;
+ 	}
+ 
+ 	winstate->all_first = true;
  	winstate->partition_spooled = false;
  	winstate->more_partitions = false;
  
+ 	/*
+ 	 * frame stability means whether frame moves and rows are exiting.
+ 	 * this helps prediction of optimization in such like aggregates
+ 	 */
+ 	winstate->frametail_stable = !(node->frameOptions &
+ 								  (FRAMEOPTION_END_VALUE_PRECEDING |
+ 								   FRAMEOPTION_END_VALUE_FOLLOWING));
+ 
+ 	/* initialize frame bound offsets */
+ 	winstate->startOffset = ExecInitExpr((Expr *) node->startOffset,
+ 										 (PlanState *) winstate);
+ 	winstate->endOffset = ExecInitExpr((Expr *) node->endOffset,
+ 									   (PlanState *) winstate);
+ 
  	return winstate;
  }
  
***************
*** 1298,1303 **** ExecEndWindowAgg(WindowAggState *node)
--- 1638,1644 ----
  	ExecFreeExprContext(&node->ss.ps);
  
  	MemoryContextDelete(node->wincontext);
+ 	MemoryContextDelete(node->aggcontext);
  
  	outerPlan = outerPlanState(node);
  	ExecEndNode(outerPlan);
***************
*** 1315,1320 **** ExecReScanWindowAgg(WindowAggState *node, ExprContext *exprCtxt)
--- 1656,1662 ----
  	node->all_done = false;
  
  	node->ss.ps.ps_TupFromTlist = false;
+ 	node->all_first = true;
  
  	/* release tuplestore et al */
  	release_partition(node);
***************
*** 1598,1604 **** window_gettupleslot(WindowObject winobj, int64 pos, TupleTableSlot *slot)
   * API exposed to window functions
   ***********************************************************************/
  
- 
  /*
   * WinGetPartitionLocalMemory
   *		Get working memory that lives till end of partition processing
--- 1940,1945 ----
***************
*** 1749,1754 **** WinGetFuncArgInPartition(WindowObject winobj, int argno,
--- 2090,2097 ----
  						 bool *isnull, bool *isout)
  {
  	WindowAggState *winstate;
+ 	WindowAgg  *node;
+ 	int			frameOptions;
  	ExprContext *econtext;
  	TupleTableSlot *slot;
  	bool		gottuple;
***************
*** 1756,1761 **** WinGetFuncArgInPartition(WindowObject winobj, int argno,
--- 2099,2106 ----
  
  	Assert(WindowObjectIsValid(winobj));
  	winstate = winobj->winstate;
+ 	node = (WindowAgg *) winstate->ss.ps.plan;
+ 	frameOptions = node->frameOptions;
  	econtext = winstate->ss.ps.ps_ExprContext;
  	slot = winstate->temp_slot_1;
  
***************
*** 1791,1797 **** WinGetFuncArgInPartition(WindowObject winobj, int argno,
  		if (isout)
  			*isout = false;
  		if (set_mark)
! 			WinSetMarkPosition(winobj, abs_pos);
  		econtext->ecxt_outertuple = slot;
  		return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
  							econtext, isnull, NULL);
--- 2136,2158 ----
  		if (isout)
  			*isout = false;
  		if (set_mark)
! 		{
! 			int64	mark_pos = abs_pos;
! 
! 			/*
! 			 * In moving RANGE frame mode, we must keep mark at
! 			 * frameheadpos - 1, since that row is to be checked
! 			 * to find frame starting boundary.
! 			 */
! 			if (!(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING) &&
! 				frameOptions & FRAMEOPTION_RANGE)
! 			{
! 				update_frameheadpos(winobj, winstate->temp_slot_2);
! 				if (mark_pos >= winstate->frameheadpos)
! 					mark_pos = winstate->frameheadpos - 1;
! 			}
! 			WinSetMarkPosition(winobj, mark_pos);
! 		}
  		econtext->ecxt_outertuple = slot;
  		return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
  							econtext, isnull, NULL);
***************
*** 1822,1827 **** WinGetFuncArgInFrame(WindowObject winobj, int argno,
--- 2183,2190 ----
  					 bool *isnull, bool *isout)
  {
  	WindowAggState *winstate;
+ 	WindowAgg  *node;
+ 	int			frameOptions;
  	ExprContext *econtext;
  	TupleTableSlot *slot;
  	bool		gottuple;
***************
*** 1829,1834 **** WinGetFuncArgInFrame(WindowObject winobj, int argno,
--- 2192,2199 ----
  
  	Assert(WindowObjectIsValid(winobj));
  	winstate = winobj->winstate;
+ 	node = (WindowAgg *) winstate->ss.ps.plan;
+ 	frameOptions = node->frameOptions;
  	econtext = winstate->ss.ps.ps_ExprContext;
  	slot = winstate->temp_slot_1;
  
***************
*** 1838,1844 **** WinGetFuncArgInFrame(WindowObject winobj, int argno,
  			abs_pos = winstate->currentpos + relpos;
  			break;
  		case WINDOW_SEEK_HEAD:
! 			abs_pos = relpos;
  			break;
  		case WINDOW_SEEK_TAIL:
  			update_frametailpos(winobj, slot);
--- 2203,2210 ----
  			abs_pos = winstate->currentpos + relpos;
  			break;
  		case WINDOW_SEEK_HEAD:
! 			update_frameheadpos(winobj, slot);
! 			abs_pos = winstate->frameheadpos + relpos;
  			break;
  		case WINDOW_SEEK_TAIL:
  			update_frametailpos(winobj, slot);
***************
*** 1866,1872 **** WinGetFuncArgInFrame(WindowObject winobj, int argno,
  		if (isout)
  			*isout = false;
  		if (set_mark)
! 			WinSetMarkPosition(winobj, abs_pos);
  		econtext->ecxt_outertuple = slot;
  		return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
  							econtext, isnull, NULL);
--- 2232,2254 ----
  		if (isout)
  			*isout = false;
  		if (set_mark)
! 		{
! 			int64		mark_pos = abs_pos;
! 
! 			/*
! 			 * In moving RANGE frame mode, we must keep mark at
! 			 * frameheadpos - 1, since that row is to be checked
! 			 * to find frame starting boundary.
! 			 */
! 			if (!(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING) &&
! 				frameOptions & FRAMEOPTION_RANGE)
! 			{
! 				update_frameheadpos(winobj, winstate->temp_slot_2);
! 				if (mark_pos >= winstate->frameheadpos)
! 					mark_pos = winstate->frameheadpos - 1;
! 			}
! 			WinSetMarkPosition(winobj, mark_pos);
! 		}
  		econtext->ecxt_outertuple = slot;
  		return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
  							econtext, isnull, NULL);
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 718,723 **** _copyWindowAgg(WindowAgg *from)
--- 718,725 ----
  		COPY_POINTER_FIELD(ordOperators, from->ordNumCols * sizeof(Oid));
  	}
  	COPY_SCALAR_FIELD(frameOptions);
+ 	COPY_NODE_FIELD(startOffset);
+ 	COPY_NODE_FIELD(endOffset);
  
  	return newnode;
  }
***************
*** 1845,1850 **** _copyWindowClause(WindowClause *from)
--- 1847,1854 ----
  	COPY_NODE_FIELD(partitionClause);
  	COPY_NODE_FIELD(orderClause);
  	COPY_SCALAR_FIELD(frameOptions);
+ 	COPY_NODE_FIELD(startOffset);
+ 	COPY_NODE_FIELD(endOffset);
  	COPY_SCALAR_FIELD(winref);
  	COPY_SCALAR_FIELD(copiedOrder);
  
***************
*** 2062,2067 **** _copySortBy(SortBy *from)
--- 2066,2084 ----
  	return newnode;
  }
  
+ static WindowFrameDef *
+ _copyWindowFrameDef(WindowFrameDef *from)
+ {
+ 	WindowFrameDef *newnode = makeNode(WindowFrameDef);
+ 
+ 	COPY_SCALAR_FIELD(options);
+ 	COPY_NODE_FIELD(startOffset);
+ 	COPY_NODE_FIELD(endOffset);
+ 	COPY_LOCATION_FIELD(location);
+ 
+ 	return newnode;
+ }
+ 
  static WindowDef *
  _copyWindowDef(WindowDef *from)
  {
***************
*** 2071,2077 **** _copyWindowDef(WindowDef *from)
  	COPY_STRING_FIELD(refname);
  	COPY_NODE_FIELD(partitionClause);
  	COPY_NODE_FIELD(orderClause);
! 	COPY_SCALAR_FIELD(frameOptions);
  	COPY_LOCATION_FIELD(location);
  
  	return newnode;
--- 2088,2094 ----
  	COPY_STRING_FIELD(refname);
  	COPY_NODE_FIELD(partitionClause);
  	COPY_NODE_FIELD(orderClause);
! 	COPY_NODE_FIELD(frame);
  	COPY_LOCATION_FIELD(location);
  
  	return newnode;
***************
*** 4152,4157 **** copyObject(void *from)
--- 4169,4177 ----
  		case T_SortBy:
  			retval = _copySortBy(from);
  			break;
+ 		case T_WindowFrameDef:
+ 			retval = _copyWindowFrameDef(from);
+ 			break;
  		case T_WindowDef:
  			retval = _copyWindowDef(from);
  			break;
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 2033,2045 **** _equalSortBy(SortBy *a, SortBy *b)
  }
  
  static bool
  _equalWindowDef(WindowDef *a, WindowDef *b)
  {
  	COMPARE_STRING_FIELD(name);
  	COMPARE_STRING_FIELD(refname);
  	COMPARE_NODE_FIELD(partitionClause);
  	COMPARE_NODE_FIELD(orderClause);
! 	COMPARE_SCALAR_FIELD(frameOptions);
  	COMPARE_LOCATION_FIELD(location);
  
  	return true;
--- 2033,2055 ----
  }
  
  static bool
+ _equalWindowFrameDef(WindowFrameDef *a, WindowFrameDef *b)
+ {
+ 	COMPARE_SCALAR_FIELD(options);
+ 	COMPARE_NODE_FIELD(startOffset);
+ 	COMPARE_NODE_FIELD(endOffset);
+ 
+ 	return true;
+ }
+ 
+ static bool
  _equalWindowDef(WindowDef *a, WindowDef *b)
  {
  	COMPARE_STRING_FIELD(name);
  	COMPARE_STRING_FIELD(refname);
  	COMPARE_NODE_FIELD(partitionClause);
  	COMPARE_NODE_FIELD(orderClause);
! 	COMPARE_NODE_FIELD(frame);
  	COMPARE_LOCATION_FIELD(location);
  
  	return true;
***************
*** 2185,2190 **** _equalWindowClause(WindowClause *a, WindowClause *b)
--- 2195,2202 ----
  	COMPARE_NODE_FIELD(partitionClause);
  	COMPARE_NODE_FIELD(orderClause);
  	COMPARE_SCALAR_FIELD(frameOptions);
+ 	COMPARE_NODE_FIELD(startOffset);
+ 	COMPARE_NODE_FIELD(endOffset);
  	COMPARE_SCALAR_FIELD(winref);
  	COMPARE_SCALAR_FIELD(copiedOrder);
  
***************
*** 2845,2850 **** equal(void *a, void *b)
--- 2857,2865 ----
  		case T_SortBy:
  			retval = _equalSortBy(a, b);
  			break;
+ 		case T_WindowFrameDef:
+ 			retval = _equalWindowFrameDef(a, b);
+ 			break;
  		case T_WindowDef:
  			retval = _equalWindowDef(a, b);
  			break;
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 610,615 **** _outWindowAgg(StringInfo str, WindowAgg *node)
--- 610,617 ----
  		appendStringInfo(str, " %u", node->ordOperators[i]);
  
  	WRITE_INT_FIELD(frameOptions);
+ 	WRITE_NODE_FIELD(startOffset);
+ 	WRITE_NODE_FIELD(endOffset);
  }
  
  static void
***************
*** 2026,2032 **** _outWindowClause(StringInfo str, WindowClause *node)
  	WRITE_STRING_FIELD(refname);
  	WRITE_NODE_FIELD(partitionClause);
  	WRITE_NODE_FIELD(orderClause);
! 	WRITE_INT_FIELD(frameOptions);
  	WRITE_UINT_FIELD(winref);
  	WRITE_BOOL_FIELD(copiedOrder);
  }
--- 2028,2036 ----
  	WRITE_STRING_FIELD(refname);
  	WRITE_NODE_FIELD(partitionClause);
  	WRITE_NODE_FIELD(orderClause);
! 	WRITE_UINT_FIELD(frameOptions);
! 	WRITE_NODE_FIELD(startOffset);
! 	WRITE_NODE_FIELD(endOffset);
  	WRITE_UINT_FIELD(winref);
  	WRITE_BOOL_FIELD(copiedOrder);
  }
***************
*** 2309,2314 **** _outSortBy(StringInfo str, SortBy *node)
--- 2313,2328 ----
  }
  
  static void
+ _outWindowFrameDef(StringInfo str, WindowFrameDef *node)
+ {
+ 	WRITE_NODE_TYPE("WINDOWFRAMEDEF");
+ 
+ 	WRITE_INT_FIELD(options);
+ 	WRITE_NODE_FIELD(startOffset);
+ 	WRITE_NODE_FIELD(endOffset);
+ }
+ 
+ static void
  _outWindowDef(StringInfo str, WindowDef *node)
  {
  	WRITE_NODE_TYPE("WINDOWDEF");
***************
*** 2317,2323 **** _outWindowDef(StringInfo str, WindowDef *node)
  	WRITE_STRING_FIELD(refname);
  	WRITE_NODE_FIELD(partitionClause);
  	WRITE_NODE_FIELD(orderClause);
! 	WRITE_INT_FIELD(frameOptions);
  	WRITE_LOCATION_FIELD(location);
  }
  
--- 2331,2337 ----
  	WRITE_STRING_FIELD(refname);
  	WRITE_NODE_FIELD(partitionClause);
  	WRITE_NODE_FIELD(orderClause);
! 	WRITE_NODE_FIELD(frame);
  	WRITE_LOCATION_FIELD(location);
  }
  
***************
*** 2850,2855 **** _outNode(StringInfo str, void *obj)
--- 2864,2872 ----
  			case T_SortBy:
  				_outSortBy(str, obj);
  				break;
+ 			case T_WindowFrameDef:
+ 				_outWindowFrameDef(str, obj);
+ 				break;
  			case T_WindowDef:
  				_outWindowDef(str, obj);
  				break;
*** a/src/backend/nodes/readfuncs.c
--- b/src/backend/nodes/readfuncs.c
***************
*** 278,284 **** _readWindowClause(void)
  	READ_STRING_FIELD(refname);
  	READ_NODE_FIELD(partitionClause);
  	READ_NODE_FIELD(orderClause);
! 	READ_INT_FIELD(frameOptions);
  	READ_UINT_FIELD(winref);
  	READ_BOOL_FIELD(copiedOrder);
  
--- 278,286 ----
  	READ_STRING_FIELD(refname);
  	READ_NODE_FIELD(partitionClause);
  	READ_NODE_FIELD(orderClause);
! 	READ_UINT_FIELD(frameOptions);
! 	READ_NODE_FIELD(startOffset);
! 	READ_NODE_FIELD(endOffset);
  	READ_UINT_FIELD(winref);
  	READ_BOOL_FIELD(copiedOrder);
  
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***************
*** 3354,3360 **** make_windowagg(PlannerInfo *root, List *tlist,
  			   int numWindowFuncs, Index winref,
  			   int partNumCols, AttrNumber *partColIdx, Oid *partOperators,
  			   int ordNumCols, AttrNumber *ordColIdx, Oid *ordOperators,
! 			   int frameOptions, Plan *lefttree)
  {
  	WindowAgg  *node = makeNode(WindowAgg);
  	Plan	   *plan = &node->plan;
--- 3354,3361 ----
  			   int numWindowFuncs, Index winref,
  			   int partNumCols, AttrNumber *partColIdx, Oid *partOperators,
  			   int ordNumCols, AttrNumber *ordColIdx, Oid *ordOperators,
! 			   int frameOptions, Node *startOffset, Node *endOffset,
! 			   Plan *lefttree)
  {
  	WindowAgg  *node = makeNode(WindowAgg);
  	Plan	   *plan = &node->plan;
***************
*** 3369,3374 **** make_windowagg(PlannerInfo *root, List *tlist,
--- 3370,3377 ----
  	node->ordColIdx = ordColIdx;
  	node->ordOperators = ordOperators;
  	node->frameOptions = frameOptions;
+ 	node->startOffset = startOffset;
+ 	node->endOffset = endOffset;
  
  	copy_plan_costsize(plan, lefttree); /* only care about copying size */
  	cost_windowagg(&windowagg_path, root,
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
***************
*** 1480,1490 **** grouping_planner(PlannerInfo *root, double tuple_fraction)
  					window_tlist = tlist;
  				}
  
  				/* ... and make the WindowAgg plan node */
  				result_plan = (Plan *)
  					make_windowagg(root,
  								   (List *) copyObject(window_tlist),
! 							   list_length(wflists->windowFuncs[wc->winref]),
  								   wc->winref,
  								   partNumCols,
  								   partColIdx,
--- 1480,1494 ----
  					window_tlist = tlist;
  				}
  
+ 				wc->startOffset = preprocess_expression(root, wc->startOffset,
+ 														EXPRKIND_LIMIT);
+ 				wc->endOffset = preprocess_expression(root, wc->endOffset,
+ 													  EXPRKIND_LIMIT);
  				/* ... and make the WindowAgg plan node */
  				result_plan = (Plan *)
  					make_windowagg(root,
  								   (List *) copyObject(window_tlist),
! 								   list_length(wflists->windowFuncs[wc->winref]),
  								   wc->winref,
  								   partNumCols,
  								   partColIdx,
***************
*** 1493,1498 **** grouping_planner(PlannerInfo *root, double tuple_fraction)
--- 1497,1504 ----
  								   ordColIdx,
  								   ordOperators,
  								   wc->frameOptions,
+ 								   wc->startOffset,
+ 								   wc->endOffset,
  								   result_plan);
  			}
  		}
*** a/src/backend/optimizer/plan/setrefs.c
--- b/src/backend/optimizer/plan/setrefs.c
***************
*** 465,472 **** set_plan_refs(PlannerGlobal *glob, Plan *plan, int rtoffset)
  					fix_scan_expr(glob, splan->limitCount, rtoffset);
  			}
  			break;
- 		case T_Agg:
  		case T_WindowAgg:
  		case T_Group:
  			set_upper_references(glob, plan, rtoffset);
  			break;
--- 465,488 ----
  					fix_scan_expr(glob, splan->limitCount, rtoffset);
  			}
  			break;
  		case T_WindowAgg:
+ 			{
+ 				WindowAgg	   *splan = (WindowAgg *) plan;
+ 
+ 				/*
+ 				 * Like Limit node limit/offset expressions, WindowAgg has
+ 				 * frame offset expressions which cannot contain subplan
+ 				 * variable refs, so fix_scan_expr works for them.
+ 				 */
+ 				splan->startOffset =
+ 					fix_scan_expr(glob, splan->startOffset, rtoffset);
+ 				splan->endOffset =
+ 					fix_scan_expr(glob, splan->endOffset, rtoffset);
+ 
+ 				set_upper_references(glob, plan, rtoffset);
+ 			}
+ 			break;
+ 		case T_Agg:
  		case T_Group:
  			set_upper_references(glob, plan, rtoffset);
  			break;
*** a/src/backend/optimizer/plan/subselect.c
--- b/src/backend/optimizer/plan/subselect.c
***************
*** 2087,2095 **** finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params,
  										 locally_added_param);
  			break;
  
  		case T_Hash:
  		case T_Agg:
- 		case T_WindowAgg:
  		case T_Material:
  		case T_Sort:
  		case T_Unique:
--- 2087,2101 ----
  										 locally_added_param);
  			break;
  
+ 		case T_WindowAgg:
+ 			finalize_primnode(((WindowAgg *) plan)->startOffset,
+ 							  &context);
+ 			finalize_primnode(((WindowAgg *) plan)->endOffset,
+ 							  &context);
+ 			break;
+ 
  		case T_Hash:
  		case T_Agg:
  		case T_Material:
  		case T_Sort:
  		case T_Unique:
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 430,436 **** static TypeName *TableFuncTypeName(List *columns);
  %type <list>	window_clause window_definition_list opt_partition_clause
  %type <windef>	window_definition over_clause window_specification
  %type <str>		opt_existing_window_name
! %type <ival>	opt_frame_clause frame_extent frame_bound
  
  
  /*
--- 430,436 ----
  %type <list>	window_clause window_definition_list opt_partition_clause
  %type <windef>	window_definition over_clause window_specification
  %type <str>		opt_existing_window_name
! %type <node>	opt_frame_clause frame_extent frame_bound
  
  
  /*
***************
*** 564,569 **** static TypeName *TableFuncTypeName(List *columns);
--- 564,570 ----
  %nonassoc	BETWEEN
  %nonassoc	IN_P
  %left		POSTFIXOP		/* dummy for postfix Op rules */
+ %nonassoc	UNBOUNDED		/* needed for frame_extend, frame_bound */
  /*
   * To support target_el without AS, we must give IDENT an explicit priority
   * between POSTFIXOP and Op.  We can safely assign the same priority to
***************
*** 574,580 **** static TypeName *TableFuncTypeName(List *columns);
   * so that they can follow a_expr without creating
   * postfix-operator problems.
   */
! %nonassoc	IDENT PARTITION RANGE ROWS
  %left		Op OPERATOR		/* multi-character ops and user-defined operators */
  %nonassoc	NOTNULL
  %nonassoc	ISNULL
--- 575,581 ----
   * so that they can follow a_expr without creating
   * postfix-operator problems.
   */
! %nonassoc	IDENT PARTITION RANGE ROWS PRECEDING FOLLOWING
  %left		Op OPERATOR		/* multi-character ops and user-defined operators */
  %nonassoc	NOTNULL
  %nonassoc	ISNULL
***************
*** 9735,9745 **** over_clause: OVER window_specification
  			| OVER ColId
  				{
  					WindowDef *n = makeNode(WindowDef);
  					n->name = $2;
  					n->refname = NULL;
  					n->partitionClause = NIL;
  					n->orderClause = NIL;
! 					n->frameOptions = FRAMEOPTION_DEFAULTS;
  					n->location = @2;
  					$$ = n;
  				}
--- 9736,9750 ----
  			| OVER ColId
  				{
  					WindowDef *n = makeNode(WindowDef);
+ 					WindowFrameDef *frame = makeNode(WindowFrameDef);
+ 					frame->options = FRAMEOPTION_DEFAULTS;
+ 					frame->startOffset = NULL;
+ 					frame->endOffset = NULL;
  					n->name = $2;
  					n->refname = NULL;
  					n->partitionClause = NIL;
  					n->orderClause = NIL;
! 					n->frame = frame;
  					n->location = @2;
  					$$ = n;
  				}
***************
*** 9755,9761 **** window_specification: '(' opt_existing_window_name opt_partition_clause
  					n->refname = $2;
  					n->partitionClause = $3;
  					n->orderClause = $4;
! 					n->frameOptions = $5;
  					n->location = @1;
  					$$ = n;
  				}
--- 9760,9766 ----
  					n->refname = $2;
  					n->partitionClause = $3;
  					n->orderClause = $4;
! 					n->frame = (WindowFrameDef *) $5;
  					n->location = @1;
  					$$ = n;
  				}
***************
*** 9787,9837 **** opt_partition_clause: PARTITION BY expr_list		{ $$ = $3; }
  opt_frame_clause:
  			RANGE frame_extent
  				{
! 					$$ = FRAMEOPTION_NONDEFAULT | FRAMEOPTION_RANGE | $2;
  				}
  			| ROWS frame_extent
  				{
! 					$$ = FRAMEOPTION_NONDEFAULT | FRAMEOPTION_ROWS | $2;
  				}
  			| /*EMPTY*/
! 				{ $$ = FRAMEOPTION_DEFAULTS; }
  		;
  
  frame_extent: frame_bound
  				{
  					/* reject invalid cases */
! 					if ($1 & FRAMEOPTION_START_UNBOUNDED_FOLLOWING)
  						ereport(ERROR,
  								(errcode(ERRCODE_WINDOWING_ERROR),
  								 errmsg("frame start cannot be UNBOUNDED FOLLOWING"),
  								 parser_errposition(@1)));
! 					if ($1 & FRAMEOPTION_START_CURRENT_ROW)
  						ereport(ERROR,
! 								(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
! 								 errmsg("frame start at CURRENT ROW is not implemented"),
  								 parser_errposition(@1)));
! 					$$ = $1 | FRAMEOPTION_END_CURRENT_ROW;
  				}
  			| BETWEEN frame_bound AND frame_bound
  				{
  					/* reject invalid cases */
! 					if ($2 & FRAMEOPTION_START_UNBOUNDED_FOLLOWING)
  						ereport(ERROR,
  								(errcode(ERRCODE_WINDOWING_ERROR),
  								 errmsg("frame start cannot be UNBOUNDED FOLLOWING"),
  								 parser_errposition(@2)));
! 					if ($2 & FRAMEOPTION_START_CURRENT_ROW)
! 						ereport(ERROR,
! 								(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
! 								 errmsg("frame start at CURRENT ROW is not implemented"),
! 								 parser_errposition(@2)));
! 					if ($4 & FRAMEOPTION_START_UNBOUNDED_PRECEDING)
  						ereport(ERROR,
  								(errcode(ERRCODE_WINDOWING_ERROR),
  								 errmsg("frame end cannot be UNBOUNDED PRECEDING"),
  								 parser_errposition(@4)));
  					/* shift converts START_ options to END_ options */
! 					$$ = FRAMEOPTION_BETWEEN | $2 | ($4 << 1);
  				}
  		;
  
--- 9792,9881 ----
  opt_frame_clause:
  			RANGE frame_extent
  				{
! 					WindowFrameDef *frame = (WindowFrameDef *) $2;
! 					frame->options |= FRAMEOPTION_NONDEFAULT | FRAMEOPTION_RANGE;
! 					frame->location = @1;
! 					$$ = (Node *) frame;
  				}
  			| ROWS frame_extent
  				{
! 					WindowFrameDef *frame = (WindowFrameDef *) $2;
! 					frame->options |= FRAMEOPTION_NONDEFAULT | FRAMEOPTION_ROWS;
! 					frame->location = @1;
! 					$$ = (Node *) frame;
  				}
  			| /*EMPTY*/
! 				{
! 					WindowFrameDef *frame = makeNode(WindowFrameDef);
! 					frame->options = FRAMEOPTION_DEFAULTS;
! 					frame->startOffset = NULL;
! 					frame->endOffset = NULL;
! 					frame->location = -1;
! 					$$ = (Node *) frame;
! 				}
  		;
  
  frame_extent: frame_bound
  				{
+ 					WindowFrameDef *frame = (WindowFrameDef *) $1;
  					/* reject invalid cases */
! 					if (frame->options & FRAMEOPTION_START_UNBOUNDED_FOLLOWING)
  						ereport(ERROR,
  								(errcode(ERRCODE_WINDOWING_ERROR),
  								 errmsg("frame start cannot be UNBOUNDED FOLLOWING"),
  								 parser_errposition(@1)));
! 					if (frame->options & FRAMEOPTION_START_VALUE_FOLLOWING)
  						ereport(ERROR,
! 								(errcode(ERRCODE_WINDOWING_ERROR),
! 								 errmsg("frame from following row cannot end with current row"),
  								 parser_errposition(@1)));
! 					frame->options |= FRAMEOPTION_END_CURRENT_ROW;
! 					$$ = (Node *) frame;
  				}
  			| BETWEEN frame_bound AND frame_bound
  				{
+ 					WindowFrameDef *frame1 = (WindowFrameDef *) $2;
+ 					WindowFrameDef *frame2 = (WindowFrameDef *) $4;
  					/* reject invalid cases */
! 					if (frame1->options & FRAMEOPTION_START_UNBOUNDED_FOLLOWING)
  						ereport(ERROR,
  								(errcode(ERRCODE_WINDOWING_ERROR),
  								 errmsg("frame start cannot be UNBOUNDED FOLLOWING"),
  								 parser_errposition(@2)));
! 					if (frame2->options & FRAMEOPTION_START_UNBOUNDED_PRECEDING)
  						ereport(ERROR,
  								(errcode(ERRCODE_WINDOWING_ERROR),
  								 errmsg("frame end cannot be UNBOUNDED PRECEDING"),
  								 parser_errposition(@4)));
+ 					if (frame1->options & FRAMEOPTION_START_CURRENT_ROW &&
+ 						frame2->options &
+ 							(FRAMEOPTION_START_UNBOUNDED_PRECEDING |
+ 							 FRAMEOPTION_START_VALUE_PRECEDING))
+ 						ereport(ERROR,
+ 								(errcode(ERRCODE_WINDOWING_ERROR),
+ 								 errmsg("frame from current row cannot have preceding rows"),
+ 								 parser_errposition(@4)));
+ 					if (frame1->options & FRAMEOPTION_START_VALUE_FOLLOWING &&
+ 						frame2->options &
+ 							(FRAMEOPTION_START_VALUE_PRECEDING |
+ 							 FRAMEOPTION_START_CURRENT_ROW))
+ 						ereport(ERROR,
+ 								(errcode(ERRCODE_WINDOWING_ERROR),
+ 								 errmsg("frame from following row cannot have preceding rows"),
+ 								 parser_errposition(@4)));
+ 					if (frame1->options & FRAMEOPTION_START_VALUE_FOLLOWING &&
+ 						frame2->options & FRAMEOPTION_END_VALUE_PRECEDING)
+ 						ereport(ERROR,
+ 								(errcode(ERRCODE_WINDOWING_ERROR),
+ 								 errmsg("frame cannot be BETWEEN x FOLLOWING AND y PRECEDING"),
+ 								 parser_errposition(@1)));
  					/* shift converts START_ options to END_ options */
! 					frame1->options |= frame2->options << 1;
! 					frame1->options |= FRAMEOPTION_BETWEEN;
! 					frame1->endOffset = frame2->startOffset;
! 					/* frame2 is not necessary anymore */
! 					pfree(frame2);
! 					$$ = (Node *) frame1;
  				}
  		;
  
***************
*** 9839,9857 **** frame_extent: frame_bound
   * This is used for both frame start and frame end, with output set up on
   * the assumption it's frame start; the frame_extent productions must reject
   * invalid cases.
   */
  frame_bound:
  			UNBOUNDED PRECEDING
  				{
! 					$$ = FRAMEOPTION_START_UNBOUNDED_PRECEDING;
  				}
  			| UNBOUNDED FOLLOWING
  				{
! 					$$ = FRAMEOPTION_START_UNBOUNDED_FOLLOWING;
  				}
  			| CURRENT_P ROW
  				{
! 					$$ = FRAMEOPTION_START_CURRENT_ROW;
  				}
  		;
  
--- 9883,9931 ----
   * This is used for both frame start and frame end, with output set up on
   * the assumption it's frame start; the frame_extent productions must reject
   * invalid cases.
+  *
+  * Since UNBOUNDED is an unreserved keyword, it could possibly be an a_expr.
+  * We fix that by making UNBOUNDED have slightly lower precedence than the
+  * lookahead tokens PRECEDING and FOLLOWING.  This solution might cause
+  * strange behavior if UNBOUNDED is ever used anyplace else in the grammar,
+  * however.
   */
  frame_bound:
  			UNBOUNDED PRECEDING
  				{
! 					WindowFrameDef *frame = makeNode(WindowFrameDef);
! 					frame->options = FRAMEOPTION_START_UNBOUNDED_PRECEDING;
! 					frame->startOffset = NULL;
! 					frame->endOffset = NULL;
! 					$$ = (Node *) frame;
  				}
  			| UNBOUNDED FOLLOWING
  				{
! 					WindowFrameDef *frame = makeNode(WindowFrameDef);
! 					frame->options = FRAMEOPTION_START_UNBOUNDED_FOLLOWING;
! 					frame->startOffset = NULL;
! 					frame->endOffset = NULL;
! 					$$ = (Node *) frame;
  				}
  			| CURRENT_P ROW
  				{
! 					WindowFrameDef *frame = makeNode(WindowFrameDef);
! 					frame->options = FRAMEOPTION_START_CURRENT_ROW;
! 					$$ = (Node *) frame;
! 				}
! 			| a_expr PRECEDING
! 				{
! 					WindowFrameDef *frame = makeNode(WindowFrameDef);
! 					frame->options = FRAMEOPTION_START_VALUE_PRECEDING;
! 					frame->startOffset = $1;
! 					$$ = (Node *) frame;
! 				}
! 			| a_expr FOLLOWING
! 				{
! 					WindowFrameDef *frame = makeNode(WindowFrameDef);
! 					frame->options = FRAMEOPTION_START_VALUE_FOLLOWING;
! 					frame->startOffset = $1;
! 					$$ = (Node *) frame;
  				}
  		;
  
***************
*** 10809,10815 **** unreserved_keyword:
   * looks too much like a function call for an LR(1) parser.
   */
  col_name_keyword:
! 			  BIGINT
  			| BIT
  			| BOOLEAN_P
  			| CHAR_P
--- 10883,10890 ----
   * looks too much like a function call for an LR(1) parser.
   */
  col_name_keyword:
! 			  BETWEEN
! 			| BIGINT
  			| BIT
  			| BOOLEAN_P
  			| CHAR_P
***************
*** 10868,10874 **** col_name_keyword:
   */
  type_func_name_keyword:
  			  AUTHORIZATION
- 			| BETWEEN
  			| BINARY
  			| CROSS
  			| CURRENT_SCHEMA
--- 10943,10948 ----
*** a/src/backend/parser/parse_agg.c
--- b/src/backend/parser/parse_agg.c
***************
*** 136,142 **** transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
  		Assert(windef->refname == NULL &&
  			   windef->partitionClause == NIL &&
  			   windef->orderClause == NIL &&
! 			   windef->frameOptions == FRAMEOPTION_DEFAULTS);
  
  		foreach(lc, pstate->p_windowdefs)
  		{
--- 136,142 ----
  		Assert(windef->refname == NULL &&
  			   windef->partitionClause == NIL &&
  			   windef->orderClause == NIL &&
! 			   windef->frame->options == FRAMEOPTION_DEFAULTS);
  
  		foreach(lc, pstate->p_windowdefs)
  		{
***************
*** 174,180 **** transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
  				continue;
  			if (equal(refwin->partitionClause, windef->partitionClause) &&
  				equal(refwin->orderClause, windef->orderClause) &&
! 				refwin->frameOptions == windef->frameOptions)
  			{
  				/* found a duplicate window specification */
  				wfunc->winref = winref;
--- 174,180 ----
  				continue;
  			if (equal(refwin->partitionClause, windef->partitionClause) &&
  				equal(refwin->orderClause, windef->orderClause) &&
! 				equal(refwin->frame, windef->frame))
  			{
  				/* found a duplicate window specification */
  				wfunc->winref = winref;
*** a/src/backend/parser/parse_clause.c
--- b/src/backend/parser/parse_clause.c
***************
*** 85,91 **** static List *addTargetToGroupList(ParseState *pstate, TargetEntry *tle,
  					 List *grouplist, List *targetlist, int location,
  					 bool resolveUnknown);
  static WindowClause *findWindowClause(List *wclist, const char *name);
! 
  
  /*
   * transformFromClause -
--- 85,94 ----
  					 List *grouplist, List *targetlist, int location,
  					 bool resolveUnknown);
  static WindowClause *findWindowClause(List *wclist, const char *name);
! static Node *transformFrameOffset(ParseState *pstate, int frameOptions,
! 								  Node *clause, const char *constructName);
! static void checkExprIsLevelStable(ParseState *pstate, Node *n,
! 								   const char *constructName);
  
  /*
   * transformFromClause -
***************
*** 1177,1217 **** transformLimitClause(ParseState *pstate, Node *clause,
  
  	qual = coerce_to_specific_type(pstate, qual, INT8OID, constructName);
  
! 	/*
! 	 * LIMIT can't refer to any vars or aggregates of the current query
! 	 */
! 	if (contain_vars_of_level(qual, 0))
! 	{
! 		ereport(ERROR,
! 				(errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
! 		/* translator: %s is name of a SQL construct, eg LIMIT */
! 				 errmsg("argument of %s must not contain variables",
! 						constructName),
! 				 parser_errposition(pstate,
! 									locate_var_of_level(qual, 0))));
! 	}
! 	if (pstate->p_hasAggs &&
! 		checkExprHasAggs(qual))
! 	{
! 		ereport(ERROR,
! 				(errcode(ERRCODE_GROUPING_ERROR),
! 		/* translator: %s is name of a SQL construct, eg LIMIT */
! 				 errmsg("argument of %s must not contain aggregate functions",
! 						constructName),
! 				 parser_errposition(pstate,
! 									locate_agg_of_level(qual, 0))));
! 	}
! 	if (pstate->p_hasWindowFuncs &&
! 		checkExprHasWindowFuncs(qual))
! 	{
! 		ereport(ERROR,
! 				(errcode(ERRCODE_WINDOWING_ERROR),
! 		/* translator: %s is name of a SQL construct, eg LIMIT */
! 				 errmsg("argument of %s must not contain window functions",
! 						constructName),
! 				 parser_errposition(pstate,
! 									locate_windowfunc(qual))));
! 	}
  
  	return qual;
  }
--- 1180,1186 ----
  
  	qual = coerce_to_specific_type(pstate, qual, INT8OID, constructName);
  
! 	checkExprIsLevelStable(pstate, qual, constructName);
  
  	return qual;
  }
***************
*** 1663,1669 **** transformWindowDefinitions(ParseState *pstate,
  					 errmsg("cannot override frame clause of window \"%s\"",
  							windef->refname),
  					 parser_errposition(pstate, windef->location)));
! 		wc->frameOptions = windef->frameOptions;
  		wc->winref = winref;
  
  		result = lappend(result, wc);
--- 1632,1644 ----
  					 errmsg("cannot override frame clause of window \"%s\"",
  							windef->refname),
  					 parser_errposition(pstate, windef->location)));
! 		wc->frameOptions = windef->frame->options;
! 		wc->startOffset = transformFrameOffset(pstate, windef->frame->options,
! 											   windef->frame->startOffset,
! 											   "Frame Start Offset");
! 		wc->endOffset = transformFrameOffset(pstate, windef->frame->options,
! 											 windef->frame->endOffset,
! 											 "Frame End Offset");
  		wc->winref = winref;
  
  		result = lappend(result, wc);
***************
*** 2160,2162 **** findWindowClause(List *wclist, const char *name)
--- 2135,2206 ----
  
  	return NULL;
  }
+ 
+ /*
+  *	transformFrameOffset
+  */
+ static Node *
+ transformFrameOffset(ParseState *pstate, int frameOptions, Node *clause,
+ 					 const char *constructName)
+ {
+ 	Node	   *node;
+ 
+ 	if (clause == NULL)
+ 		return NULL;
+ 
+ 	node = transformExpr(pstate, clause);
+ 	if (frameOptions & FRAMEOPTION_ROWS)
+ 		node = coerce_to_specific_type(pstate, node, INT8OID, constructName);
+ 	else
+ 	{
+ 		Assert(frameOptions & FRAMEOPTION_RANGE);
+ 		elog(ERROR, "ERROR on parsing frame clause");
+ 	}
+ 	checkExprIsLevelStable(pstate, node, constructName);
+ 
+ 	return node;
+ }
+ 
+ /*
+  * checkExprIsLevelStable
+  *		Check if given expr is stable value in the current query level.
+  *		This check prevent use of any local vars, aggregates or window
+  *		functions in the expression.
+  */
+ static void
+ checkExprIsLevelStable(ParseState *pstate, Node *n,
+ 					   const char *constructName)
+ {
+ 	if (contain_vars_of_level(n, 0))
+ 	{
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ 		/* translator: %s is name of a SQL construct, eg LIMIT */
+ 				 errmsg("argument of %s must not contain variables",
+ 						constructName),
+ 				 parser_errposition(pstate,
+ 									locate_var_of_level(n, 0))));
+ 	}
+ 	if (pstate->p_hasAggs &&
+ 		checkExprHasAggs(n))
+ 	{
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_GROUPING_ERROR),
+ 		/* translator: %s is name of a SQL construct, eg LIMIT */
+ 				 errmsg("argument of %s must not contain aggregate functions",
+ 						constructName),
+ 				 parser_errposition(pstate,
+ 									locate_agg_of_level(n, 0))));
+ 	}
+ 	if (pstate->p_hasWindowFuncs &&
+ 		checkExprHasWindowFuncs(n))
+ 	{
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_WINDOWING_ERROR),
+ 		/* translator: %s is name of a SQL construct, eg LIMIT */
+ 				 errmsg("argument of %s must not contain window functions",
+ 						constructName),
+ 				 parser_errposition(pstate,
+ 									locate_windowfunc(n))));
+ 	}
+ }
*** a/src/backend/utils/adt/array_userfuncs.c
--- b/src/backend/utils/adt/array_userfuncs.c
***************
*** 487,493 **** array_agg_transfn(PG_FUNCTION_ARGS)
  	if (fcinfo->context && IsA(fcinfo->context, AggState))
  		aggcontext = ((AggState *) fcinfo->context)->aggcontext;
  	else if (fcinfo->context && IsA(fcinfo->context, WindowAggState))
! 		aggcontext = ((WindowAggState *) fcinfo->context)->wincontext;
  	else
  	{
  		/* cannot be called directly because of internal-type argument */
--- 487,493 ----
  	if (fcinfo->context && IsA(fcinfo->context, AggState))
  		aggcontext = ((AggState *) fcinfo->context)->aggcontext;
  	else if (fcinfo->context && IsA(fcinfo->context, WindowAggState))
! 		aggcontext = ((WindowAggState *) fcinfo->context)->aggcontext;
  	else
  	{
  		/* cannot be called directly because of internal-type argument */
*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
***************
*** 3089,3094 **** get_rule_windowspec(WindowClause *wc, List *targetList,
--- 3089,3104 ----
  			appendStringInfoString(buf, "UNBOUNDED PRECEDING ");
  		else if (wc->frameOptions & FRAMEOPTION_START_CURRENT_ROW)
  			appendStringInfoString(buf, "CURRENT ROW ");
+ 		else if (wc->frameOptions & FRAMEOPTION_START_VALUE)
+ 		{
+ 			get_rule_expr(wc->startOffset, context, true);
+ 			if (wc->frameOptions & FRAMEOPTION_START_VALUE_PRECEDING)
+ 				appendStringInfoString(buf, " PRECEDING ");
+ 			else if (wc->frameOptions & FRAMEOPTION_START_VALUE_FOLLOWING)
+ 				appendStringInfoString(buf, " FOLLOWING ");
+ 			else
+ 				Assert(false);
+ 		}
  		else
  			Assert(false);
  		if (wc->frameOptions & FRAMEOPTION_BETWEEN)
***************
*** 3098,3103 **** get_rule_windowspec(WindowClause *wc, List *targetList,
--- 3108,3123 ----
  				appendStringInfoString(buf, "UNBOUNDED FOLLOWING ");
  			else if (wc->frameOptions & FRAMEOPTION_END_CURRENT_ROW)
  				appendStringInfoString(buf, "CURRENT ROW ");
+ 			else if(wc->frameOptions & FRAMEOPTION_END_VALUE)
+ 			{
+ 				get_rule_expr(wc->endOffset, context, true);
+ 				if (wc->frameOptions & FRAMEOPTION_END_VALUE_PRECEDING)
+ 					appendStringInfoString(buf, " PRECEDING ");
+ 				else if (wc->frameOptions & FRAMEOPTION_END_VALUE_FOLLOWING)
+ 					appendStringInfoString(buf, " FOLLOWING ");
+ 				else
+ 					Assert(false);
+ 			}
  			else
  				Assert(false);
  		}
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 14,19 ****
--- 14,20 ----
  #ifndef EXECNODES_H
  #define EXECNODES_H
  
+ #include "windowapi.h"
  #include "access/genam.h"
  #include "access/heapam.h"
  #include "access/skey.h"
***************
*** 1588,1610 **** typedef struct WindowAggState
  	FmgrInfo   *ordEqfunctions; /* equality funcs for ordering columns */
  	Tuplestorestate *buffer;	/* stores rows of current partition */
  	int			current_ptr;	/* read pointer # for current */
- 	int			agg_ptr;		/* read pointer # for aggregates */
  	int64		spooled_rows;	/* total # of rows in buffer */
  	int64		currentpos;		/* position of current row in partition */
  	int64		frametailpos;	/* current frame tail position */
  	int64		aggregatedupto; /* rows before this one are aggregated */
  
  	MemoryContext wincontext;	/* context for partition-lifespan data */
  	ExprContext *tmpcontext;	/* short-term evaluation context */
  
  	bool		all_done;		/* true if the scan is finished */
  	bool		partition_spooled;		/* true if all tuples in current
  										 * partition have been spooled into
  										 * tuplestore */
  	bool		more_partitions;/* true if there's more partitions after this
  								 * one */
  	bool		frametail_valid;/* true if frametailpos is known up to date
  								 * for current row */
  
  	TupleTableSlot *first_part_slot;	/* first tuple of current or next
  										 * partition */
--- 1589,1623 ----
  	FmgrInfo   *ordEqfunctions; /* equality funcs for ordering columns */
  	Tuplestorestate *buffer;	/* stores rows of current partition */
  	int			current_ptr;	/* read pointer # for current */
  	int64		spooled_rows;	/* total # of rows in buffer */
  	int64		currentpos;		/* position of current row in partition */
+ 	int64		frameheadpos;	/* current frame head position */
  	int64		frametailpos;	/* current frame tail position */
+ 	WindowObject agg_winobj;	/* window to track moving frame in aggregates */
+ 	int64		aggregatedbase; /* row position this aggregate started on */
  	int64		aggregatedupto; /* rows before this one are aggregated */
  
+ 	ExprState  *startOffset;
+ 	ExprState  *endOffset;
+ 	int64		startRowOffset;	/* In ROWS mode, upper offset from current */
+ 	int64		endRowOffset;	/* In ROWS mode, lower offset from current */
+ 
  	MemoryContext wincontext;	/* context for partition-lifespan data */
+ 	MemoryContext aggcontext;	/* context for each aggregate data */
  	ExprContext *tmpcontext;	/* short-term evaluation context */
  
+ 	bool		all_first;		/* true if the scan is starting */
  	bool		all_done;		/* true if the scan is finished */
  	bool		partition_spooled;		/* true if all tuples in current
  										 * partition have been spooled into
  										 * tuplestore */
  	bool		more_partitions;/* true if there's more partitions after this
  								 * one */
+ 	bool		framehead_valid;/* true if frameheadpos is known up to date
+ 								 * for current row */
  	bool		frametail_valid;/* true if frametailpos is known up to date
  								 * for current row */
+ 	bool		frametail_stable;	/* frame's tail doesn't move */
  
  	TupleTableSlot *first_part_slot;	/* first tuple of current or next
  										 * partition */
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
***************
*** 362,367 **** typedef enum NodeTag
--- 362,368 ----
  	T_ResTarget,
  	T_TypeCast,
  	T_SortBy,
+ 	T_WindowFrameDef,
  	T_WindowDef,
  	T_RangeSubselect,
  	T_RangeFunction,
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 376,381 **** typedef struct SortBy
--- 376,393 ----
  } SortBy;
  
  /*
+  * WindowFrameDef - for FRAME clause description in WINDOW definition
+  */
+ typedef struct WindowFrameDef
+ {
+ 	NodeTag		type;
+ 	int			options;		/* frame_clause options, see below */
+ 	Node	   *startOffset;	/* how far is the starting bound */
+ 	Node	   *endOffset;		/* how far is the ending bound */
+ 	int			location;		/* parse location, or -1 if none/unknown */
+ } WindowFrameDef;
+ 
+ /*
   * WindowDef - raw representation of WINDOW and OVER clauses
   *
   * For entries in a WINDOW list, "name" is the window name being defined.
***************
*** 390,396 **** typedef struct WindowDef
  	char	   *refname;		/* referenced window name, if any */
  	List	   *partitionClause;	/* PARTITION BY expression list */
  	List	   *orderClause;	/* ORDER BY (list of SortBy) */
! 	int			frameOptions;	/* frame_clause options, see below */
  	int			location;		/* parse location, or -1 if none/unknown */
  } WindowDef;
  
--- 402,408 ----
  	char	   *refname;		/* referenced window name, if any */
  	List	   *partitionClause;	/* PARTITION BY expression list */
  	List	   *orderClause;	/* ORDER BY (list of SortBy) */
! 	WindowFrameDef *frame;		/* FRAME clause including boundary values */
  	int			location;		/* parse location, or -1 if none/unknown */
  } WindowDef;
  
***************
*** 412,422 **** typedef struct WindowDef
--- 424,443 ----
  #define FRAMEOPTION_END_UNBOUNDED_FOLLOWING		0x00080 /* end is U. F. */
  #define FRAMEOPTION_START_CURRENT_ROW			0x00100 /* start is C. R. */
  #define FRAMEOPTION_END_CURRENT_ROW				0x00200 /* end is C. R. */
+ #define FRAMEOPTION_START_VALUE_PRECEDING		0x01000 /* start is V. P. */
+ #define FRAMEOPTION_END_VALUE_PRECEDING			0x02000 /* end is V.P. */
+ #define FRAMEOPTION_START_VALUE_FOLLOWING		0x04000 /* start is V. F. */
+ #define FRAMEOPTION_END_VALUE_FOLLOWING			0x08000 /* end is V. F. */
  
  #define FRAMEOPTION_DEFAULTS \
  	(FRAMEOPTION_RANGE | FRAMEOPTION_START_UNBOUNDED_PRECEDING | \
  	 FRAMEOPTION_END_CURRENT_ROW)
  
+ #define FRAMEOPTION_START_VALUE \
+ 	(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
+ #define FRAMEOPTION_END_VALUE \
+ 	(FRAMEOPTION_END_VALUE_PRECEDING | FRAMEOPTION_END_VALUE_FOLLOWING)
+ 
  /*
   * RangeSubselect - subquery appearing in a FROM clause
   */
***************
*** 795,800 **** typedef struct WindowClause
--- 816,823 ----
  	List	   *partitionClause;	/* PARTITION BY list */
  	List	   *orderClause;	/* ORDER BY list */
  	int			frameOptions;	/* frame_clause options, see WindowDef */
+ 	Node	   *startOffset;	/* offset value to start frame */
+ 	Node	   *endOffset;		/* offset value to end frame */
  	Index		winref;			/* ID referenced by window functions */
  	bool		copiedOrder;	/* did we copy orderClause from refname? */
  } WindowClause;
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
***************
*** 552,557 **** typedef struct WindowAgg
--- 552,559 ----
  	AttrNumber *ordColIdx;		/* their indexes in the target list */
  	Oid		   *ordOperators;	/* equality operators for ordering columns */
  	int			frameOptions;	/* frame_clause options, see WindowDef */
+ 	Node	   *startOffset;	/* frame start boundary */
+ 	Node	   *endOffset;		/* frame end boundary */
  } WindowAgg;
  
  /* ----------------
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
***************
*** 61,67 **** extern WindowAgg *make_windowagg(PlannerInfo *root, List *tlist,
  			   int numWindowFuncs, Index winref,
  			   int partNumCols, AttrNumber *partColIdx, Oid *partOperators,
  			   int ordNumCols, AttrNumber *ordColIdx, Oid *ordOperators,
! 			   int frameOptions, Plan *lefttree);
  extern Group *make_group(PlannerInfo *root, List *tlist, List *qual,
  		   int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators,
  		   double numGroups,
--- 61,68 ----
  			   int numWindowFuncs, Index winref,
  			   int partNumCols, AttrNumber *partColIdx, Oid *partOperators,
  			   int ordNumCols, AttrNumber *ordColIdx, Oid *ordOperators,
! 			   int frameOptions, Node *startOffset, Node *endOffset,
! 			   Plan *lefttree);
  extern Group *make_group(PlannerInfo *root, List *tlist, List *qual,
  		   int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators,
  		   double numGroups,
*** a/src/include/parser/kwlist.h
--- b/src/include/parser/kwlist.h
***************
*** 53,59 **** PG_KEYWORD("authorization", AUTHORIZATION, TYPE_FUNC_NAME_KEYWORD)
  PG_KEYWORD("backward", BACKWARD, UNRESERVED_KEYWORD)
  PG_KEYWORD("before", BEFORE, UNRESERVED_KEYWORD)
  PG_KEYWORD("begin", BEGIN_P, UNRESERVED_KEYWORD)
! PG_KEYWORD("between", BETWEEN, TYPE_FUNC_NAME_KEYWORD)
  PG_KEYWORD("bigint", BIGINT, COL_NAME_KEYWORD)
  PG_KEYWORD("binary", BINARY, TYPE_FUNC_NAME_KEYWORD)
  PG_KEYWORD("bit", BIT, COL_NAME_KEYWORD)
--- 53,59 ----
  PG_KEYWORD("backward", BACKWARD, UNRESERVED_KEYWORD)
  PG_KEYWORD("before", BEFORE, UNRESERVED_KEYWORD)
  PG_KEYWORD("begin", BEGIN_P, UNRESERVED_KEYWORD)
! PG_KEYWORD("between", BETWEEN, COL_NAME_KEYWORD)
  PG_KEYWORD("bigint", BIGINT, COL_NAME_KEYWORD)
  PG_KEYWORD("binary", BINARY, TYPE_FUNC_NAME_KEYWORD)
  PG_KEYWORD("bit", BIT, COL_NAME_KEYWORD)
*** a/src/test/regress/expected/window.out
--- b/src/test/regress/expected/window.out
***************
*** 728,733 **** FROM (select distinct ten, four from tenk1) ss;
--- 728,908 ----
      3 |   2 |   4 |          2
  (20 rows)
  
+ SELECT sum(unique1) over (order by four range between current row and unbounded following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+  sum | unique1 | four 
+ -----+---------+------
+   45 |       0 |    0
+   45 |       8 |    0
+   45 |       4 |    0
+   33 |       5 |    1
+   33 |       9 |    1
+   33 |       1 |    1
+   18 |       6 |    2
+   18 |       2 |    2
+   10 |       3 |    3
+   10 |       7 |    3
+ (10 rows)
+ 
+ SELECT sum(unique1) over (rows between current row and unbounded following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+  sum | unique1 | four 
+ -----+---------+------
+   45 |       4 |    0
+   41 |       2 |    2
+   39 |       1 |    1
+   38 |       6 |    2
+   32 |       9 |    1
+   23 |       8 |    0
+   15 |       5 |    1
+   10 |       3 |    3
+    7 |       7 |    3
+    0 |       0 |    0
+ (10 rows)
+ 
+ SELECT sum(unique1) over (rows between 2 preceding and 2 following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+  sum | unique1 | four 
+ -----+---------+------
+    7 |       4 |    0
+   13 |       2 |    2
+   22 |       1 |    1
+   26 |       6 |    2
+   29 |       9 |    1
+   31 |       8 |    0
+   32 |       5 |    1
+   23 |       3 |    3
+   15 |       7 |    3
+   10 |       0 |    0
+ (10 rows)
+ 
+ SELECT sum(unique1) over (rows between 2 preceding and 1 preceding),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+  sum | unique1 | four 
+ -----+---------+------
+      |       4 |    0
+    4 |       2 |    2
+    6 |       1 |    1
+    3 |       6 |    2
+    7 |       9 |    1
+   15 |       8 |    0
+   17 |       5 |    1
+   13 |       3 |    3
+    8 |       7 |    3
+   10 |       0 |    0
+ (10 rows)
+ 
+ SELECT sum(unique1) over (rows between 1 following and 3 following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+  sum | unique1 | four 
+ -----+---------+------
+    9 |       4 |    0
+   16 |       2 |    2
+   23 |       1 |    1
+   22 |       6 |    2
+   16 |       9 |    1
+   15 |       8 |    0
+   10 |       5 |    1
+    7 |       3 |    3
+    0 |       7 |    3
+    0 |       0 |    0
+ (10 rows)
+ 
+ SELECT sum(unique1) over (rows between unbounded preceding and 1 following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+  sum | unique1 | four 
+ -----+---------+------
+    6 |       4 |    0
+    7 |       2 |    2
+   13 |       1 |    1
+   22 |       6 |    2
+   30 |       9 |    1
+   35 |       8 |    0
+   38 |       5 |    1
+   45 |       3 |    3
+   45 |       7 |    3
+   45 |       0 |    0
+ (10 rows)
+ 
+ SELECT sum(unique1) over (w range between current row and unbounded following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10 WINDOW w AS (order by four);
+  sum | unique1 | four 
+ -----+---------+------
+   45 |       0 |    0
+   45 |       8 |    0
+   45 |       4 |    0
+   33 |       5 |    1
+   33 |       9 |    1
+   33 |       1 |    1
+   18 |       6 |    2
+   18 |       2 |    2
+   10 |       3 |    3
+   10 |       7 |    3
+ (10 rows)
+ 
+ SELECT first_value(unique1) over w,
+ 	nth_value(unique1, 2) over w AS nth_2,
+ 	last_value(unique1) over w, unique1, four
+ FROM tenk1 WHERE unique1 < 10
+ WINDOW w AS (order by four range between current row and unbounded following);
+  first_value | nth_2 | last_value | unique1 | four 
+ -------------+-------+------------+---------+------
+            0 |     8 |          7 |       0 |    0
+            0 |     8 |          7 |       8 |    0
+            0 |     8 |          7 |       4 |    0
+            5 |     9 |          7 |       5 |    1
+            5 |     9 |          7 |       9 |    1
+            5 |     9 |          7 |       1 |    1
+            6 |     2 |          7 |       6 |    2
+            6 |     2 |          7 |       2 |    2
+            3 |     7 |          7 |       3 |    3
+            3 |     7 |          7 |       7 |    3
+ (10 rows)
+ 
+ SELECT sum(unique1) over
+ 	(rows (SELECT unique1 FROM tenk1 ORDER BY unique1 LIMIT 1) + 1 PRECEDING),
+ 	unique1
+ FROM tenk1 where unique1 < 10;
+  sum | unique1 
+ -----+---------
+    4 |       4
+    6 |       2
+    3 |       1
+    7 |       6
+   15 |       9
+   17 |       8
+   13 |       5
+    8 |       3
+   10 |       7
+    7 |       0
+ (10 rows)
+ 
+ CREATE VIEW v AS
+ 	SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following)
+ 	FROM generate_series(1, 10) i;
+ SELECT * FROM v;
+  i  | sum 
+ ----+-----
+   1 |   3
+   2 |   6
+   3 |   9
+   4 |  12
+   5 |  15
+   6 |  18
+   7 |  21
+   8 |  24
+   9 |  27
+  10 |  19
+ (10 rows)
+ 
+ DROP VIEW v;
  -- with UNION
  SELECT count(*) OVER (PARTITION BY four) FROM (SELECT * FROM tenk1 UNION ALL SELECT * FROM tenk2)s LIMIT 0;
   count 
*** a/src/test/regress/sql/window.sql
--- b/src/test/regress/sql/window.sql
***************
*** 161,166 **** SELECT four, ten/4 as two,
--- 161,211 ----
  	last_value(ten/4) over (partition by four order by ten/4 rows between unbounded preceding and current row)
  FROM (select distinct ten, four from tenk1) ss;
  
+ SELECT sum(unique1) over (order by four range between current row and unbounded following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+ 
+ SELECT sum(unique1) over (rows between current row and unbounded following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+ 
+ SELECT sum(unique1) over (rows between 2 preceding and 2 following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+ 
+ SELECT sum(unique1) over (rows between 2 preceding and 1 preceding),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+ 
+ SELECT sum(unique1) over (rows between 1 following and 3 following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+ 
+ SELECT sum(unique1) over (rows between unbounded preceding and 1 following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10;
+ 
+ SELECT sum(unique1) over (w range between current row and unbounded following),
+ 	unique1, four
+ FROM tenk1 WHERE unique1 < 10 WINDOW w AS (order by four);
+ 
+ SELECT first_value(unique1) over w,
+ 	nth_value(unique1, 2) over w AS nth_2,
+ 	last_value(unique1) over w, unique1, four
+ FROM tenk1 WHERE unique1 < 10
+ WINDOW w AS (order by four range between current row and unbounded following);
+ 
+ SELECT sum(unique1) over
+ 	(rows (SELECT unique1 FROM tenk1 ORDER BY unique1 LIMIT 1) + 1 PRECEDING),
+ 	unique1
+ FROM tenk1 where unique1 < 10;
+ 
+ CREATE VIEW v AS
+ 	SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following)
+ 	FROM generate_series(1, 10) i;
+ SELECT * FROM v;
+ DROP VIEW v;
+ 
  -- with UNION
  SELECT count(*) OVER (PARTITION BY four) FROM (SELECT * FROM tenk1 UNION ALL SELECT * FROM tenk2)s LIMIT 0;
  
#11Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Hitoshi Harada (#10)
Re: add more frame types in window functions (ROWS)

"Hitoshi" == Hitoshi Harada <umi.tanuki@gmail.com> writes:

Hitoshi> As earlier mail, I added aggcontext to WindowAggState.

This issue (as detailed in this post):
http://archives.postgresql.org/pgsql-hackers/2009-11/msg01871.php

is currently the only significant outstanding issue in my review of this
patch. I think we need to see more feedback on whether it is acceptable
to change the aggregate function API again (and if so, what to do with it)
before I can post a final review on this and mark it ready for committer
(or not).

--
Andrew.

#12Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Hitoshi Harada (#10)
Re: add more frame types in window functions (ROWS)

Functionally this patch looks excellent; correct format, applies
cleanly, passes regression, and I've been unable to find any issues
with the code itself. But I still have a concern over the interface
change, so I'm setting this back to "waiting on author" for now even
though it's really a matter for general discussion on -hackers.

To take the outstanding issues in descending order of importance:

1) Memory context handling for aggregate calls

Having thought about this carefully, I think the solution used in this
patch has to be rejected for one specific reason: if you compile
existing code (i.e. aggregates that use WindowAggState.wincontext)
written for 8.4 against the patched server, the code compiles
successfully and appears to run, but leaks memory at runtime. (And I've
confirmed that there do exist external modules that would be affected.)

If we're going to change the interface in this way, there should, IMO,
be enough of a change that old code fails to compile; e.g. by renaming
wincontext to partition_context or some equivalent change.

But in my opinion we should go further than this: since there's obviously
a need for aggregates to be able to get at a suitable memory context, I
think we should formalize this and actually define some interface functions
for them to use, so that something like

if (fcinfo->context && IsA(fcinfo->context, AggState))
aggcontext = ((AggState *) fcinfo->context)->aggcontext;
else if (fcinfo->context && IsA(fcinfo->context, WindowAggState))
aggcontext = ((WindowAggState *) fcinfo->context)->aggcontext;
else
ereport(...);

becomes something like

aggcontext = AggGetMemoryContext(fcinfo->context);
if (!aggcontext)
ereport(...);

For completeness, there should be two other functions: one to simply
return whether we are in fact being called as an aggregate, and another
one to return whether it's safe to scribble on the first argument
(while it's currently the case that these two are equivalent, it would
be good not to assume that).

Comments? This is the most significant issue as I see it.

(Also, a function in contrib/tsearch2 that accesses wincontext wasn't
updated by the patch.)

2) Keywords

I didn't find any discussion of this previously; did I miss it? The
patch changes BETWEEN from TYPE_FUNC_NAME_KEYWORD to COL_NAME_KEYWORD,
so it's now allowed as a column name (which it previously wasn't),
but now not allowed as a function. I get why it can't be a function now,
but if it didn't work as a column name before, why does it now?

(UNBOUNDED remains an unreserved word, and seems to behave in a
reasonable fashion even if used as an upper-level var in the query;
the interpretation of a bare UNBOUNDED has the standard behaviour
as far as I can see.)

3) Regression tests

Testing that views work is OK as far as it goes, but I think that view
definition should be left in place rather than dropped (possibly with
even more variants) so that the deparse code gets properly tested too.
(see the "rules" test)

4) Deparse output

The code is forcing explicit casting on the offset expressions, i.e.
the deparsed code looks like

... ROWS BETWEEN 1::bigint PRECEDING AND 1::bigint FOLLOWING ...

This looks a bit ugly; is it avoidable? At least for ROWS it should
be, I think, since the type is known; even for RANGE, the type would
be determined by the sort column.

5) Documentation issues

The entry for BETWEEN in the keywords appendix isn't updated.
(Wouldn't it make more sense for this to be generated from the keyword
list somehow?)

Spelling:
-     current row. In <literal>ROWS</> mode this value means phisical row
+     current row. In <literal>ROWS</> mode this value means physical row
(this error appears twice)

The doc could probably do with some wordsmithing but this probably
shouldn't block the patch on its own; that's something that could be
handled separately I guess.

--
Andrew.

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#12)
Re: add more frame types in window functions (ROWS)

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

If we're going to change the interface in this way, there should, IMO,
be enough of a change that old code fails to compile; e.g. by renaming
wincontext to partition_context or some equivalent change.

Agreed --- if we have to break things, break them obviously not
silently. I don't have time right now to think about this issue in
detail, but if those are the alternatives I think the choice is clear.
Quietly adding a memory leak to code that used to work well is not
acceptable.

regards, tom lane

#14Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Andrew Gierth (#12)
Re: add more frame types in window functions (ROWS)

2009/12/5 Andrew Gierth <andrew@tao11.riddles.org.uk>:

1) Memory context handling for aggregate calls

   aggcontext = AggGetMemoryContext(fcinfo->context);
   if (!aggcontext)
       ereport(...);

For completeness, there should be two other functions: one to simply
return whether we are in fact being called as an aggregate, and another
one to return whether it's safe to scribble on the first argument
(while it's currently the case that these two are equivalent, it would
be good not to assume that).

Comments? This is the most significant issue as I see it.

Yep, I agree. The most essential point on this is that external
functions refer to the struct members directly; we should provide
kinds of API.

(Also, a function in contrib/tsearch2 that accesses wincontext wasn't
updated by the patch.)

Thanks for noticing. I didn't know it.

2) Keywords

The discussion is:

http://archives.postgresql.org/message-id/e08cc0400911241703u4bf53ek1c3910605a3d8778@mail.gmail.com

and ideas are extracted from Tom's mail in the last year just before
committing the first window function patch, except for changing to
COL_NAME_KEYWORD rather than RESERVED_KEYWORD as suggested. The reason
I picked it up was only that it works. I cannot tell the side effect
of COL_NAME_KEYWORD but I guess we tend to avoid RESERVED_KEYWORD as
far as possible.

And the reason BETWEEN cannot work as function name any more is based
on bison's output shift/reduce conflict when SCONST follows BETWEEN in
frame_extent. I don't have clear example that makes this happen,
though.

3) Regression tests

Testing that views work is OK as far as it goes, but I think that view
definition should be left in place rather than dropped (possibly with
even more variants) so that the deparse code gets properly tested too.
(see the "rules" test)

OK,

4) Deparse output

The code is forcing explicit casting on the offset expressions, i.e.
the deparsed code looks like

 ... ROWS BETWEEN 1::bigint PRECEDING AND 1::bigint FOLLOWING ...

This looks a bit ugly; is it avoidable? At least for ROWS it should
be, I think, since the type is known; even for RANGE, the type would
be determined by the sort column.

Hmm, I'll change it as LIMIT clause does. Pass false as showimplicit
to get_rule_expr() maybe?

5) Documentation issues

The entry for BETWEEN in the keywords appendix isn't updated.
(Wouldn't it make more sense for this to be generated from the keyword
list somehow?)

I heard that you don't need to send keywords appendix in the patch
because it is auto-generated, if I remember correctly.

Spelling:
-     current row. In <literal>ROWS</> mode this value means phisical row
+     current row. In <literal>ROWS</> mode this value means physical row
(this error appears twice)

Oops, thanks.

I'm now reworking as reviewed. The last issue is whether we accept
extension of frame types without RANGE support.

Regards,

--
Hitoshi Harada

#15Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Hitoshi Harada (#14)
1 attachment(s)
Re: add more frame types in window functions (ROWS)

2009/12/5 Hitoshi Harada <umi.tanuki@gmail.com>:

I'm now reworking as reviewed. The last issue is whether we accept
extension of frame types without RANGE support.

Attached is updated version. I added AggGetMemoryContext() in
executor/nodeAgg.h (though I'm not sure where to go...) and its second
argument "iswindowagg" is output parameter to know whether the call
context is Agg or WindowAgg. Your proposal of APIs to know whether the
function is called as Aggregate or not is also a candidate to be, but
it seems out of this patch scope, so it doesn't touch anything.

Fix tsearch function is also included, as well as typo phisical ->
physical. Pass false to get_rule_expr() of value in
PRECEDING/FOLLOWING.

One thing for rule test, I checked existing regression test cases and
concluded DROP VIEW is necessary, or even VIEW test for a specific
feature is not needed. I remember your aggregate ORDER BY patch
contains "rules" test changes. However, since processing order of
regression tests is not predictable and may change AFAIK, I guess it
shouldn't add those changes in rules.out.

Regards.

--
Hitoshi Harada

Attachments:

rows_frame_types.20091205.patch.gzapplication/x-gzip; name=rows_frame_types.20091205.patch.gzDownload
#16Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Hitoshi Harada (#15)
Re: add more frame types in window functions (ROWS)

"Hitoshi" == Hitoshi Harada <umi.tanuki@gmail.com> writes:

Hitoshi> One thing for rule test, I checked existing regression test
Hitoshi> cases and concluded DROP VIEW is necessary, or even VIEW
Hitoshi> test for a specific feature is not needed. I remember your
Hitoshi> aggregate ORDER BY patch contains "rules" test
Hitoshi> changes. However, since processing order of regression tests
Hitoshi> is not predictable and may change AFAIK, I guess it
Hitoshi> shouldn't add those changes in rules.out.

Actually, looking more closely, the way you have it currently works only
by chance - "rules" and "window" are running in parallel, therefore the
view creation in "window" can break the output of "rules".

The order of regression tests is set in parallel_schedule and
serial_schedule; it's unpredictable only for tests within the same
parallel group.

I think a modification of the schedule is needed here; the only other
option would be to move the view creation into a different test.

--
Andrew.

#17Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Hitoshi Harada (#15)
Re: add more frame types in window functions (ROWS)

"Hitoshi" == Hitoshi Harada <umi.tanuki@gmail.com> writes:

Hitoshi> Attached is updated version. I added AggGetMemoryContext()
Hitoshi> in executor/nodeAgg.h (though I'm not sure where to go...)
Hitoshi> and its second argument "iswindowagg" is output parameter to
Hitoshi> know whether the call context is Agg or WindowAgg. Your
Hitoshi> proposal of APIs to know whether the function is called as
Hitoshi> Aggregate or not is also a candidate to be, but it seems out
Hitoshi> of this patch scope, so it doesn't touch anything.

I don't really like the extra argument; aggregate functions should
almost never have to care about whether they're being called as window
functions rather than aggregate functions. And if it does care, I
don't see why this is the appropriate function for it. At the very
least the function should accept NULL for the "iswindowagg" pointer to
avoid useless variables in the caller.

So for this and the regression test problem mentioned in the other mail,
I'm setting this back to "waiting on author".

--
Andrew.

#18Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Andrew Gierth (#17)
Re: add more frame types in window functions (ROWS)

2009/12/7 Andrew Gierth <andrew@tao11.riddles.org.uk>:

"Hitoshi" == Hitoshi Harada <umi.tanuki@gmail.com> writes:

 Hitoshi> Attached is updated version. I added AggGetMemoryContext()
 Hitoshi> in executor/nodeAgg.h (though I'm not sure where to go...)
 Hitoshi> and its second argument "iswindowagg" is output parameter to
 Hitoshi> know whether the call context is Agg or WindowAgg. Your
 Hitoshi> proposal of APIs to know whether the function is called as
 Hitoshi> Aggregate or not is also a candidate to be, but it seems out
 Hitoshi> of this patch scope, so it doesn't touch anything.

I don't really like the extra argument; aggregate functions should
almost never have to care about whether they're being called as window
functions rather than aggregate functions. And if it does care, I
don't see why this is the appropriate function for it. At the very
least the function should accept NULL for the "iswindowagg" pointer to
avoid useless variables in the caller.

I've just remembered such situation before, around here:
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/array_userfuncs.c.diff?r1=1.30;r2=1.31

Of course it is fixed now. Still thinking about error reporting or
something, there should be a way to know which context you are called.
OK, I agree iswindowagg can be nullable, though most "isnull"
parameter cannot be NULL and forcing caller to put boolean stack value
don't seem a hard work.

So for this and the regression test problem mentioned in the other mail,
I'm setting this back to "waiting on author".

In my humble opinion, view regression test is not necessary in both my
patch and yours. At least window test has not been testing views so
far since it was introduced. Am I missing?

Regards,

--
Hitoshi Harada

#19Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Hitoshi Harada (#18)
Re: add more frame types in window functions (ROWS)

"Hitoshi" == Hitoshi Harada <umi.tanuki@gmail.com> writes:

So for this and the regression test problem mentioned in the other
mail, I'm setting this back to "waiting on author".

Hitoshi> In my humble opinion, view regression test is not necessary
Hitoshi> in both my patch and yours. At least window test has not
Hitoshi> been testing views so far since it was introduced. Am I
Hitoshi> missing?

If you don't want to include views in the regression test, then take
them out; I will object to that, but I'll leave it up to the committer
to decide whether it is appropriate, provided the other concerns are
addressed.

But what you have in the regression tests _now_ is, as far as I can
tell, only working by chance; with "rules" and "window" being in the
same parallel group, and window.sql creating a view, you have an
obvious race condition, unless I'm missing something.

I included view tests in my own patch for the very simple reason that
I found actual bugs that way during development.

--
Andrew (irc:RhodiumToad)

#20Greg Smith
greg@2ndquadrant.com
In reply to: Andrew Gierth (#19)
Re: add more frame types in window functions (ROWS)

Andrew Gierth wrote:

But what you have in the regression tests _now_ is, as far as I can
tell, only working by chance; with "rules" and "window" being in the
same parallel group, and window.sql creating a view, you have an
obvious race condition, unless I'm missing something.

You're right. rules.sql does this, among other things that might get
impacted:

SELECT viewname, definition FROM pg_views WHERE schemaname <>
'information_schema' ORDER BY viewname;

Both rules and window are part of the same parallel group:

test: select_views portals_p2 rules foreign_key cluster dependency guc
bitmapops combocid tsearch tsdicts foreign_data window xmlmap

Which means that views created in the window test could absolutely cause
the rules test to fail given a bad race condition. Either rules or
window needs to be moved to another section of the test schedule. (I
guess you could cut down the scope of "rules" to avoid this particular
problem, but hacking other people's regression tests to work around
issues caused by yours is never good practice). I also agree with
Andrew's sentiment that including a view on top of the new window
implementations is good practice, just for general robustness.

Also, while not a strict requirement, I personally hate seeing things
with simple names used in the regression tests, like how "v" is used in
this patch. The better written regression tests use a preface for all
their names to decrease the chance of a conflict here; for example, the
rules test names all of its views with names like rtest_v1 so they're
very clearly not going to conflict with views created by other tests.
No cleverness there can eliminate the conflict with rules though, when
it's looking at every view in the system.

It looks like a lot of progress has been made on this patch through its
review. But there's enough open issues still that I think it could use
a bit more time to mature before we try to get it committed--the fact
that it's been getting bounced around for weeks now and the regression
tests aren't even completely settled down yet is telling. The feature
seems complete, useful, and functionally solid, but still in need of
some general cleanup and re-testing afterwards. I'm going to mark this
one "Returned with Feedback" for now. Please continue to work on
knocking all these issues out, this should be a lot easier to get
committed in our next CF.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com

#21Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Greg Smith (#20)
Re: add more frame types in window functions (ROWS)

2009/12/7 Greg Smith <greg@2ndquadrant.com>:

Which means that views created in the window test could absolutely cause the
rules test to fail given a bad race condition.  Either rules or window needs
to be moved to another section of the test schedule.  (I guess you could cut
down the scope of "rules" to avoid this particular problem, but hacking
other people's regression tests to work around issues caused by yours is
never good practice).  I also agree with Andrew's sentiment that including a
view on top of the new window implementations is good practice, just for
general robustness.

I've agreed window and rules cannot be in the same group if window has
view test.

It looks like a lot of progress has been made on this patch through its
review.  But there's enough open issues still that I think it could use a
bit more time to mature before we try to get it committed--the fact that
it's been getting bounced around for weeks now and the regression tests
aren't even completely settled down yet is telling.  The feature seems
complete, useful, and functionally solid, but still in need of some general
cleanup and re-testing afterwards.  I'm going to mark this one "Returned
with Feedback" for now.  Please continue to work on knocking all these
issues out, this should be a lot easier to get committed in our next CF.

OK, Andrew, thank you for collaborating on this. This fest made my
patch to progress greatly. More comments from others on the memory
leakage issue are welcome yet.

Regards,

--
Hitoshi Harada