AS OF queries

Started by Konstantin Knizhnik, about 8 years ago, 45 messages
#1 Konstantin Knizhnik
k.knizhnik@postgrespro.ru
1 attachment

I wonder if the Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm).
As far as I know, something similar is now being developed for MariaDB.

It seems to me that it would not be so difficult to implement them in
Postgres - we already have versions of tuples.
It looks like we only need to do three things:
1. Disable autovacuum (autovacuum = off)
2. Enable commit timestamp (track_commit_timestamp = on)
3. Add asofTimestamp to the snapshot and patch XidInMVCCSnapshot to compare
commit timestamps when it is specified in the snapshot.

Attached please find my prototype implementation.
Most of the effort was needed to support the ASOF timestamp in the grammar
and add it to the query plan.
I failed to support an AS OF clause (as in Oracle) because of shift/reduce
conflicts with aliases,
so I had to introduce a new ASOF keyword. Maybe yacc experts can propose
how to resolve this conflict without introducing a new keyword...

Please notice that the ASOF timestamp is currently used only for the data
snapshot, not for the catalog snapshot.
I am not sure that it is possible (or useful) to travel through the
database schema history...

Below is an example of how it works:

postgres=# create table foo(pk serial primary key, ts timestamp default
now(), val text);
CREATE TABLE
postgres=# insert into foo (val) values ('insert');
INSERT 0 1
postgres=# insert into foo (val) values ('insert');
INSERT 0 1
postgres=# insert into foo (val) values ('insert');
INSERT 0 1
postgres=# select * from foo;
 pk |             ts             |  val
----+----------------------------+--------
  1 | 2017-12-20 14:59:17.715453 | insert
  2 | 2017-12-20 14:59:22.933753 | insert
  3 | 2017-12-20 14:59:27.87712  | insert
(3 rows)

postgres=# select * from foo asof timestamp '2017-12-20 14:59:25';
 pk |             ts             |  val
----+----------------------------+--------
  1 | 2017-12-20 14:59:17.715453 | insert
  2 | 2017-12-20 14:59:22.933753 | insert
(2 rows)

postgres=# select * from foo asof timestamp '2017-12-20 14:59:20';
 pk |             ts             |  val
----+----------------------------+--------
  1 | 2017-12-20 14:59:17.715453 | insert
(1 row)

postgres=# update foo set val='upd',ts=now() where pk=1;
UPDATE 1
postgres=# select * from foo asof timestamp '2017-12-20 14:59:20';
 pk |             ts             |  val
----+----------------------------+--------
  1 | 2017-12-20 14:59:17.715453 | insert
(1 row)

postgres=# select * from foo;
 pk |             ts             |  val
----+----------------------------+--------
  2 | 2017-12-20 14:59:22.933753 | insert
  3 | 2017-12-20 14:59:27.87712  | insert
  1 | 2017-12-20 15:09:17.046047 | upd
(3 rows)

postgres=# update foo set val='upd2',ts=now() where pk=1;
UPDATE 1
postgres=# select * from foo asof timestamp '2017-12-20 15:10';
 pk |             ts             |  val
----+----------------------------+--------
  2 | 2017-12-20 14:59:22.933753 | insert
  3 | 2017-12-20 14:59:27.87712  | insert
  1 | 2017-12-20 15:09:17.046047 | upd
(3 rows)

Comments and feedback are welcome :)

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

asof.patch (text/x-patch)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 3de8333..2126847 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2353,6 +2353,7 @@ JumbleQuery(pgssJumbleState *jstate, Query *query)
 	JumbleExpr(jstate, (Node *) query->sortClause);
 	JumbleExpr(jstate, query->limitOffset);
 	JumbleExpr(jstate, query->limitCount);
+	JumbleExpr(jstate, query->asofTimestamp);
 	/* we ignore rowMarks */
 	JumbleExpr(jstate, query->setOperations);
 }
diff --git a/src/backend/executor/Makefile b/src/backend/executor/Makefile
index cc09895..d2e0799 100644
--- a/src/backend/executor/Makefile
+++ b/src/backend/executor/Makefile
@@ -29,6 +29,6 @@ OBJS = execAmi.o execCurrent.o execExpr.o execExprInterp.o \
        nodeCtescan.o nodeNamedtuplestorescan.o nodeWorktablescan.o \
        nodeGroup.o nodeSubplan.o nodeSubqueryscan.o nodeTidscan.o \
        nodeForeignscan.o nodeWindowAgg.o tstoreReceiver.o tqueue.o spi.o \
-       nodeTableFuncscan.o
+       nodeTableFuncscan.o nodeAsof.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/executor/execAmi.c b/src/backend/executor/execAmi.c
index f1636a5..38c79b8 100644
--- a/src/backend/executor/execAmi.c
+++ b/src/backend/executor/execAmi.c
@@ -285,6 +285,10 @@ ExecReScan(PlanState *node)
 			ExecReScanLimit((LimitState *) node);
 			break;
 
+		case T_AsofState:
+			ExecReScanAsof((AsofState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			break;
diff --git a/src/backend/executor/execCurrent.c b/src/backend/executor/execCurrent.c
index a3e962e..1912ae4 100644
--- a/src/backend/executor/execCurrent.c
+++ b/src/backend/executor/execCurrent.c
@@ -329,6 +329,7 @@ search_plan_tree(PlanState *node, Oid table_oid)
 			 */
 		case T_ResultState:
 		case T_LimitState:
+		case T_AsofState:
 			return search_plan_tree(node->lefttree, table_oid);
 
 			/*
diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index fcb8b56..586b5b3 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -75,6 +75,7 @@
 #include "executor/executor.h"
 #include "executor/nodeAgg.h"
 #include "executor/nodeAppend.h"
+#include "executor/nodeAsof.h"
 #include "executor/nodeBitmapAnd.h"
 #include "executor/nodeBitmapHeapscan.h"
 #include "executor/nodeBitmapIndexscan.h"
@@ -364,6 +365,11 @@ ExecInitNode(Plan *node, EState *estate, int eflags)
 												 estate, eflags);
 			break;
 
+		case T_Asof:
+			result = (PlanState *) ExecInitAsof((Asof *) node,
+												estate, eflags);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			result = NULL;		/* keep compiler quiet */
@@ -727,6 +733,10 @@ ExecEndNode(PlanState *node)
 			ExecEndLimit((LimitState *) node);
 			break;
 
+		case T_AsofState:
+			ExecEndAsof((AsofState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			break;
diff --git a/src/backend/executor/nodeAsof.c b/src/backend/executor/nodeAsof.c
new file mode 100644
index 0000000..8957a91
--- /dev/null
+++ b/src/backend/executor/nodeAsof.c
@@ -0,0 +1,157 @@
+/*-------------------------------------------------------------------------
+ *
+ * nodeAsof.c
+ *	  Routines to handle AS OF snapshot filtering of query results
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/executor/nodeAsof.c
+ *
+ *-------------------------------------------------------------------------
+ */
+/*
+ * INTERFACE ROUTINES
+ *		ExecAsof		- run the subplan under the AS OF snapshot
+ *		ExecInitAsof	- initialize node and subnodes
+ *		ExecEndAsof	- shutdown node and subnodes
+ */
+
+#include "postgres.h"
+
+#include "executor/executor.h"
+#include "executor/nodeAsof.h"
+#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
+
+/* ----------------------------------------------------------------
+ *		ExecAsof
+ *
+ *		This is a very simple node which just applies AS OF snapshot
+ *		filtering to the stream of tuples returned by its subplan.
+ * ----------------------------------------------------------------
+ */
+static TupleTableSlot *			/* return: a tuple or NULL */
+ExecAsof(PlanState *pstate)
+{
+	AsofState      *node = castNode(AsofState, pstate);
+	PlanState      *outerPlan = outerPlanState(node);
+	TimestampTz     outerAsofTimestamp;
+	TupleTableSlot *slot;
+
+	if (!node->timestampCalculated)
+	{
+		Datum		val;
+		bool		isNull;
+
+		val = ExecEvalExprSwitchContext(node->asofExpr,
+										pstate->ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		if (isNull)
+			node->asofTimestamp = 0;
+		else
+		{
+			node->asofTimestamp = DatumGetInt64(val);
+		}
+		node->timestampCalculated = true;
+	}
+	outerAsofTimestamp = pstate->state->es_snapshot->asofTimestamp;
+	pstate->state->es_snapshot->asofTimestamp = node->asofTimestamp;
+	slot = ExecProcNode(outerPlan);
+	pstate->state->es_snapshot->asofTimestamp = outerAsofTimestamp;
+	return slot;
+}
+
+
+/* ----------------------------------------------------------------
+ *		ExecInitAsof
+ *
+ *		This initializes the asof node state structures and
+ *		the node's subplan.
+ * ----------------------------------------------------------------
+ */
+AsofState *
+ExecInitAsof(Asof *node, EState *estate, int eflags)
+{
+	AsofState *asofstate;
+	Plan	   *outerPlan;
+
+	/* check for unsupported flags */
+	Assert(!(eflags & EXEC_FLAG_MARK));
+
+	/*
+	 * create state structure
+	 */
+	asofstate = makeNode(AsofState);
+	asofstate->ps.plan = (Plan *) node;
+	asofstate->ps.state = estate;
+	asofstate->ps.ExecProcNode = ExecAsof;
+	asofstate->timestampCalculated = false;
+
+	/*
+	 * Miscellaneous initialization
+	 *
+	 * Asof nodes never call ExecQual or ExecProject, but they need an
+	 * exprcontext anyway to evaluate the AS OF timestamp expression in.
+	 */
+	ExecAssignExprContext(estate, &asofstate->ps);
+
+	/*
+	 * initialize child expressions
+	 */
+	asofstate->asofExpr = ExecInitExpr((Expr *) node->asofTimestamp,
+									   (PlanState *) asofstate);
+	/*
+	 * Tuple table initialization (XXX not actually used...)
+	 */
+	ExecInitResultTupleSlot(estate, &asofstate->ps);
+
+	/*
+	 * then initialize outer plan
+	 */
+	outerPlan = outerPlan(node);
+	outerPlanState(asofstate) = ExecInitNode(outerPlan, estate, eflags);
+
+	/*
+	 * asof nodes do no projections, so initialize projection info for this
+	 * node appropriately
+	 */
+	ExecAssignResultTypeFromTL(&asofstate->ps);
+	asofstate->ps.ps_ProjInfo = NULL;
+
+	return asofstate;
+}
+
+/* ----------------------------------------------------------------
+ *		ExecEndAsof
+ *
+ *		This shuts down the subplan and frees resources allocated
+ *		to this node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecEndAsof(AsofState *node)
+{
+	ExecFreeExprContext(&node->ps);
+	ExecEndNode(outerPlanState(node));
+}
+
+
+void
+ExecReScanAsof(AsofState *node)
+{
+	/*
+	 * Recompute the AS OF timestamp in case parameters changed
+	 */
+	node->timestampCalculated = false;
+
+	/*
+	 * if chgParam of subnode is not null then plan will be re-scanned by
+	 * first ExecProcNode.
+	 */
+	if (node->ps.lefttree->chgParam == NULL)
+		ExecReScan(node->ps.lefttree);
+}
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index 883f46c..ebe0362 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -23,6 +23,7 @@
 
 #include "executor/executor.h"
 #include "executor/nodeLimit.h"
+#include "executor/nodeAsof.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index b1515dd..e142bbc 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1134,6 +1134,27 @@ _copyLimit(const Limit *from)
 }
 
 /*
+ * _copyAsof
+ */
+static Asof *
+_copyAsof(const Asof *from)
+{
+	Asof   *newnode = makeNode(Asof);
+
+	/*
+	 * copy node superclass fields
+	 */
+	CopyPlanFields((const Plan *) from, (Plan *) newnode);
+
+	/*
+	 * copy remainder of node
+	 */
+	COPY_NODE_FIELD(asofTimestamp);
+
+	return newnode;
+}
+
+/*
  * _copyNestLoopParam
  */
 static NestLoopParam *
@@ -2958,6 +2979,7 @@ _copyQuery(const Query *from)
 	COPY_NODE_FIELD(sortClause);
 	COPY_NODE_FIELD(limitOffset);
 	COPY_NODE_FIELD(limitCount);
+	COPY_NODE_FIELD(asofTimestamp);
 	COPY_NODE_FIELD(rowMarks);
 	COPY_NODE_FIELD(setOperations);
 	COPY_NODE_FIELD(constraintDeps);
@@ -3048,6 +3070,7 @@ _copySelectStmt(const SelectStmt *from)
 	COPY_SCALAR_FIELD(all);
 	COPY_NODE_FIELD(larg);
 	COPY_NODE_FIELD(rarg);
+	COPY_NODE_FIELD(asofTimestamp);
 
 	return newnode;
 }
@@ -4840,6 +4863,9 @@ copyObjectImpl(const void *from)
 		case T_Limit:
 			retval = _copyLimit(from);
 			break;
+		case T_Asof:
+			retval = _copyAsof(from);
+			break;
 		case T_NestLoopParam:
 			retval = _copyNestLoopParam(from);
 			break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 2e869a9..6bbbc1c 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -982,6 +982,7 @@ _equalQuery(const Query *a, const Query *b)
 	COMPARE_NODE_FIELD(sortClause);
 	COMPARE_NODE_FIELD(limitOffset);
 	COMPARE_NODE_FIELD(limitCount);
+	COMPARE_NODE_FIELD(asofTimestamp);
 	COMPARE_NODE_FIELD(rowMarks);
 	COMPARE_NODE_FIELD(setOperations);
 	COMPARE_NODE_FIELD(constraintDeps);
@@ -1062,6 +1063,7 @@ _equalSelectStmt(const SelectStmt *a, const SelectStmt *b)
 	COMPARE_SCALAR_FIELD(all);
 	COMPARE_NODE_FIELD(larg);
 	COMPARE_NODE_FIELD(rarg);
+	COMPARE_NODE_FIELD(asofTimestamp);
 
 	return true;
 }
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index c2a93b2..d674ec2 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2267,6 +2267,8 @@ query_tree_walker(Query *query,
 		return true;
 	if (walker(query->limitCount, context))
 		return true;
+	if (walker(query->asofTimestamp, context))
+		return true;
 	if (!(flags & QTW_IGNORE_CTE_SUBQUERIES))
 	{
 		if (walker((Node *) query->cteList, context))
@@ -3089,6 +3091,7 @@ query_tree_mutator(Query *query,
 	MUTATE(query->havingQual, query->havingQual, Node *);
 	MUTATE(query->limitOffset, query->limitOffset, Node *);
 	MUTATE(query->limitCount, query->limitCount, Node *);
+	MUTATE(query->asofTimestamp, query->asofTimestamp, Node *);
 	if (!(flags & QTW_IGNORE_CTE_SUBQUERIES))
 		MUTATE(query->cteList, query->cteList, List *);
 	else						/* else copy CTE list as-is */
@@ -3442,6 +3445,8 @@ raw_expression_tree_walker(Node *node,
 					return true;
 				if (walker(stmt->rarg, context))
 					return true;
+				if (walker(stmt->asofTimestamp, context))
+					return true;
 			}
 			break;
 		case T_A_Expr:
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index b59a521..e59c60d 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -978,6 +978,16 @@ _outLimit(StringInfo str, const Limit *node)
 }
 
 static void
+_outAsof(StringInfo str, const Asof *node)
+{
+	WRITE_NODE_TYPE("ASOF");
+
+	_outPlanInfo(str, (const Plan *) node);
+
+	WRITE_NODE_FIELD(asofTimestamp);
+}
+
+static void
 _outNestLoopParam(StringInfo str, const NestLoopParam *node)
 {
 	WRITE_NODE_TYPE("NESTLOOPPARAM");
@@ -2127,6 +2137,17 @@ _outLimitPath(StringInfo str, const LimitPath *node)
 }
 
 static void
+_outAsofPath(StringInfo str, const AsofPath *node)
+{
+	WRITE_NODE_TYPE("ASOFPATH");
+
+	_outPathInfo(str, (const Path *) node);
+
+	WRITE_NODE_FIELD(subpath);
+	WRITE_NODE_FIELD(asofTimestamp);
+}
+
+static void
 _outGatherMergePath(StringInfo str, const GatherMergePath *node)
 {
 	WRITE_NODE_TYPE("GATHERMERGEPATH");
@@ -2722,6 +2743,7 @@ _outSelectStmt(StringInfo str, const SelectStmt *node)
 	WRITE_BOOL_FIELD(all);
 	WRITE_NODE_FIELD(larg);
 	WRITE_NODE_FIELD(rarg);
+	WRITE_NODE_FIELD(asofTimestamp);
 }
 
 static void
@@ -2925,6 +2947,7 @@ _outQuery(StringInfo str, const Query *node)
 	WRITE_NODE_FIELD(sortClause);
 	WRITE_NODE_FIELD(limitOffset);
 	WRITE_NODE_FIELD(limitCount);
+	WRITE_NODE_FIELD(asofTimestamp);
 	WRITE_NODE_FIELD(rowMarks);
 	WRITE_NODE_FIELD(setOperations);
 	WRITE_NODE_FIELD(constraintDeps);
@@ -3753,6 +3776,9 @@ outNode(StringInfo str, const void *obj)
 			case T_Limit:
 				_outLimit(str, obj);
 				break;
+			case T_Asof:
+				_outAsof(str, obj);
+				break;
 			case T_NestLoopParam:
 				_outNestLoopParam(str, obj);
 				break;
@@ -4002,6 +4028,9 @@ outNode(StringInfo str, const void *obj)
 			case T_LimitPath:
 				_outLimitPath(str, obj);
 				break;
+			case T_AsofPath:
+				_outAsofPath(str, obj);
+				break;
 			case T_GatherMergePath:
 				_outGatherMergePath(str, obj);
 				break;
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 0d17ae8..f805ea3 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -266,6 +266,7 @@ _readQuery(void)
 	READ_NODE_FIELD(sortClause);
 	READ_NODE_FIELD(limitOffset);
 	READ_NODE_FIELD(limitCount);
+	READ_NODE_FIELD(asofTimestamp);
 	READ_NODE_FIELD(rowMarks);
 	READ_NODE_FIELD(setOperations);
 	READ_NODE_FIELD(constraintDeps);
@@ -2272,6 +2273,21 @@ _readLimit(void)
 }
 
 /*
+ * _readAsof
+ */
+static Asof *
+_readAsof(void)
+{
+	READ_LOCALS(Asof);
+
+	ReadCommonPlan(&local_node->plan);
+
+	READ_NODE_FIELD(asofTimestamp);
+
+	READ_DONE();
+}
+
+/*
  * _readNestLoopParam
  */
 static NestLoopParam *
@@ -2655,6 +2671,8 @@ parseNodeString(void)
 		return_value = _readLockRows();
 	else if (MATCH("LIMIT", 5))
 		return_value = _readLimit();
+	else if (MATCH("ASOF", 4))
+		return_value = _readAsof();
 	else if (MATCH("NESTLOOPPARAM", 13))
 		return_value = _readNestLoopParam();
 	else if (MATCH("PLANROWMARK", 11))
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 0e8463e..9c97018 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2756,7 +2756,7 @@ subquery_is_pushdown_safe(Query *subquery, Query *topquery,
 	SetOperationStmt *topop;
 
 	/* Check point 1 */
-	if (subquery->limitOffset != NULL || subquery->limitCount != NULL)
+	if (subquery->limitOffset != NULL || subquery->limitCount != NULL || subquery->asofTimestamp != NULL)
 		return false;
 
 	/* Check points 3, 4, and 5 */
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index f6c83d0..413a7a7 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -114,6 +114,8 @@ static LockRows *create_lockrows_plan(PlannerInfo *root, LockRowsPath *best_path
 static ModifyTable *create_modifytable_plan(PlannerInfo *root, ModifyTablePath *best_path);
 static Limit *create_limit_plan(PlannerInfo *root, LimitPath *best_path,
 				  int flags);
+static Asof *create_asof_plan(PlannerInfo *root, AsofPath *best_path,
+				  int flags);
 static SeqScan *create_seqscan_plan(PlannerInfo *root, Path *best_path,
 					List *tlist, List *scan_clauses);
 static SampleScan *create_samplescan_plan(PlannerInfo *root, Path *best_path,
@@ -483,6 +485,11 @@ create_plan_recurse(PlannerInfo *root, Path *best_path, int flags)
 											  (LimitPath *) best_path,
 											  flags);
 			break;
+		case T_Asof:
+			plan = (Plan *) create_asof_plan(root,
+											 (AsofPath *) best_path,
+											 flags);
+			break;
 		case T_GatherMerge:
 			plan = (Plan *) create_gather_merge_plan(root,
 													 (GatherMergePath *) best_path);
@@ -2410,6 +2417,29 @@ create_limit_plan(PlannerInfo *root, LimitPath *best_path, int flags)
 	return plan;
 }
 
+/*
+ * create_asof_plan
+ *
+ *	  Create an Asof plan for 'best_path' and (recursively) plans
+ *	  for its subpaths.
+ */
+static Asof *
+create_asof_plan(PlannerInfo *root, AsofPath *best_path, int flags)
+{
+	Asof	   *plan;
+	Plan	   *subplan;
+
+	/* Asof doesn't project, so tlist requirements pass through */
+	subplan = create_plan_recurse(root, best_path->subpath, flags);
+
+	plan = make_asof(subplan,
+					 best_path->asofTimestamp);
+
+	copy_generic_path_info(&plan->plan, (Path *) best_path);
+
+	return plan;
+}
+
 
 /*****************************************************************************
  *
@@ -6385,6 +6415,26 @@ make_limit(Plan *lefttree, Node *limitOffset, Node *limitCount)
 }
 
 /*
+ * make_asof
+ *	  Build an Asof plan node
+ */
+Asof *
+make_asof(Plan *lefttree, Node *asofTimestamp)
+{
+	Asof	   *node = makeNode(Asof);
+	Plan	   *plan = &node->plan;
+
+	plan->targetlist = lefttree->targetlist;
+	plan->qual = NIL;
+	plan->lefttree = lefttree;
+	plan->righttree = NULL;
+
+	node->asofTimestamp = asofTimestamp;
+
+	return node;
+}
+
+/*
  * make_result
  *	  Build a Result plan node
  */
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index e8bc15c..e5c867b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -84,6 +84,7 @@ create_upper_paths_hook_type create_upper_paths_hook = NULL;
 #define EXPRKIND_ARBITER_ELEM		10
 #define EXPRKIND_TABLEFUNC			11
 #define EXPRKIND_TABLEFUNC_LATERAL	12
+#define EXPRKIND_ASOF				13
 
 /* Passthrough data for standard_qp_callback */
 typedef struct
@@ -696,6 +697,9 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	parse->limitCount = preprocess_expression(root, parse->limitCount,
 											  EXPRKIND_LIMIT);
 
+	parse->asofTimestamp = preprocess_expression(root, parse->asofTimestamp,
+												 EXPRKIND_ASOF);
+
 	if (parse->onConflict)
 	{
 		parse->onConflict->arbiterElems = (List *)
@@ -2032,12 +2036,13 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 
 	/*
 	 * If the input rel is marked consider_parallel and there's nothing that's
-	 * not parallel-safe in the LIMIT clause, then the final_rel can be marked
+	 * not parallel-safe in the LIMIT and ASOF clauses, then the final_rel can be marked
 	 * consider_parallel as well.  Note that if the query has rowMarks or is
 	 * not a SELECT, consider_parallel will be false for every relation in the
 	 * query.
 	 */
 	if (current_rel->consider_parallel &&
+		is_parallel_safe(root, parse->asofTimestamp) &&
 		is_parallel_safe(root, parse->limitOffset) &&
 		is_parallel_safe(root, parse->limitCount))
 		final_rel->consider_parallel = true;
@@ -2084,6 +2089,15 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		}
 
 		/*
+		 * If there is an AS OF clause, add the ASOF node.
+		 */
+		if (parse->asofTimestamp)
+		{
+			path = (Path *) create_asof_path(root, final_rel, path,
+											 parse->asofTimestamp);
+		}
+
+		/*
 		 * If this is an INSERT/UPDATE/DELETE, and we're not being called from
 		 * inheritance_planner, add the ModifyTable node.
 		 */
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b5c4124..bc79055 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -700,6 +700,23 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 					fix_scan_expr(root, splan->limitCount, rtoffset);
 			}
 			break;
+		case T_Asof:
+			{
+				Asof	   *splan = (Asof *) plan;
+
+				/*
+				 * Like the plan types above, Asof doesn't evaluate its tlist
+				 * or quals.  It does have a live expression for the AS OF
+				 * timestamp, however, which cannot contain subplan variable
+				 * refs, so fix_scan_expr works for it.
+				 */
+				set_dummy_tlist_references(plan, rtoffset);
+				Assert(splan->plan.qual == NIL);
+
+				splan->asofTimestamp =
+					fix_scan_expr(root, splan->asofTimestamp, rtoffset);
+			}
+			break;
 		case T_Agg:
 			{
 				Agg		   *agg = (Agg *) plan;
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 2e3abee..c215d3b 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -1602,6 +1602,7 @@ simplify_EXISTS_query(PlannerInfo *root, Query *query)
 		query->hasModifyingCTE ||
 		query->havingQual ||
 		query->limitOffset ||
+		query->asofTimestamp ||
 		query->rowMarks)
 		return false;
 
@@ -2691,6 +2692,11 @@ finalize_plan(PlannerInfo *root, Plan *plan,
 							  &context);
 			break;
 
+	    case T_Asof:
+			finalize_primnode(((Asof *) plan)->asofTimestamp,
+							  &context);
+			break;
+
 		case T_RecursiveUnion:
 			/* child nodes are allowed to reference wtParam */
 			locally_added_param = ((RecursiveUnion *) plan)->wtParam;
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 1d7e499..a06806e 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -1443,6 +1443,7 @@ is_simple_subquery(Query *subquery, RangeTblEntry *rte,
 		subquery->distinctClause ||
 		subquery->limitOffset ||
 		subquery->limitCount ||
+		subquery->asofTimestamp ||
 		subquery->hasForUpdate ||
 		subquery->cteList)
 		return false;
@@ -1758,6 +1759,7 @@ is_simple_union_all(Query *subquery)
 	if (subquery->sortClause ||
 		subquery->limitOffset ||
 		subquery->limitCount ||
+		subquery->asofTimestamp ||
 		subquery->rowMarks ||
 		subquery->cteList)
 		return false;
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 6a2d5ad..1372fe5 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -4514,6 +4514,7 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid,
 		querytree->sortClause ||
 		querytree->limitOffset ||
 		querytree->limitCount ||
+		querytree->asofTimestamp ||
 		querytree->setOperations ||
 		list_length(querytree->targetList) != 1)
 		goto fail;
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 54126fb..8a6f057 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -3358,6 +3358,37 @@ create_modifytable_path(PlannerInfo *root, RelOptInfo *rel,
 }
 
 /*
+ * create_asof_path
+ *	  Creates a pathnode that represents applying an AS OF clause
+ */
+AsofPath *
+create_asof_path(PlannerInfo *root, RelOptInfo *rel,
+				 Path *subpath,
+				 Node *asofTimestamp)
+{
+	AsofPath  *pathnode = makeNode(AsofPath);
+	pathnode->path.pathtype = T_Asof;
+	pathnode->path.parent = rel;
+	/* Asof doesn't project, so use source path's pathtarget */
+	pathnode->path.pathtarget = subpath->pathtarget;
+	/* For now, assume we are above any joins, so no parameterization */
+	pathnode->path.param_info = NULL;
+	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_safe = rel->consider_parallel &&
+		subpath->parallel_safe;
+	pathnode->path.parallel_workers = subpath->parallel_workers;
+	pathnode->path.rows = subpath->rows;
+	pathnode->path.startup_cost = subpath->startup_cost;
+	pathnode->path.total_cost = subpath->total_cost;
+	pathnode->path.pathkeys = subpath->pathkeys;
+	pathnode->subpath = subpath;
+	pathnode->asofTimestamp = asofTimestamp;
+
+	return pathnode;
+}
+
+
+/*
  * create_limit_path
  *	  Creates a pathnode that represents performing LIMIT/OFFSET
  *
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index d680d22..aa37f74 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -505,6 +505,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
 									  selectStmt->sortClause != NIL ||
 									  selectStmt->limitOffset != NULL ||
 									  selectStmt->limitCount != NULL ||
+									  selectStmt->asofTimestamp != NULL ||
 									  selectStmt->lockingClause != NIL ||
 									  selectStmt->withClause != NULL));
 
@@ -1266,6 +1267,7 @@ transformSelectStmt(ParseState *pstate, SelectStmt *stmt)
 											EXPR_KIND_OFFSET, "OFFSET");
 	qry->limitCount = transformLimitClause(pstate, stmt->limitCount,
 										   EXPR_KIND_LIMIT, "LIMIT");
+	qry->asofTimestamp = transformAsofClause(pstate, stmt->asofTimestamp);
 
 	/* transform window clauses after we have seen all window functions */
 	qry->windowClause = transformWindowDefinitions(pstate,
@@ -1512,6 +1514,7 @@ transformValuesClause(ParseState *pstate, SelectStmt *stmt)
 											EXPR_KIND_OFFSET, "OFFSET");
 	qry->limitCount = transformLimitClause(pstate, stmt->limitCount,
 										   EXPR_KIND_LIMIT, "LIMIT");
+	qry->asofTimestamp = transformAsofClause(pstate, stmt->asofTimestamp);
 
 	if (stmt->lockingClause)
 		ereport(ERROR,
@@ -1553,6 +1556,7 @@ transformSetOperationStmt(ParseState *pstate, SelectStmt *stmt)
 	List	   *sortClause;
 	Node	   *limitOffset;
 	Node	   *limitCount;
+	Node	   *asofTimestamp;
 	List	   *lockingClause;
 	WithClause *withClause;
 	Node	   *node;
@@ -1598,12 +1602,14 @@ transformSetOperationStmt(ParseState *pstate, SelectStmt *stmt)
 	sortClause = stmt->sortClause;
 	limitOffset = stmt->limitOffset;
 	limitCount = stmt->limitCount;
+	asofTimestamp = stmt->asofTimestamp;
 	lockingClause = stmt->lockingClause;
 	withClause = stmt->withClause;
 
 	stmt->sortClause = NIL;
 	stmt->limitOffset = NULL;
 	stmt->limitCount = NULL;
+	stmt->asofTimestamp = NULL;
 	stmt->lockingClause = NIL;
 	stmt->withClause = NULL;
 
@@ -1747,6 +1753,7 @@ transformSetOperationStmt(ParseState *pstate, SelectStmt *stmt)
 											EXPR_KIND_OFFSET, "OFFSET");
 	qry->limitCount = transformLimitClause(pstate, limitCount,
 										   EXPR_KIND_LIMIT, "LIMIT");
+	qry->asofTimestamp = transformAsofClause(pstate, asofTimestamp);
 
 	qry->rtable = pstate->p_rtable;
 	qry->jointree = makeFromExpr(pstate->p_joinlist, NULL);
@@ -1829,7 +1836,7 @@ transformSetOperationTree(ParseState *pstate, SelectStmt *stmt,
 	{
 		Assert(stmt->larg != NULL && stmt->rarg != NULL);
 		if (stmt->sortClause || stmt->limitOffset || stmt->limitCount ||
-			stmt->lockingClause || stmt->withClause)
+			stmt->lockingClause || stmt->withClause || stmt->asofTimestamp)
 			isLeaf = true;
 		else
 			isLeaf = false;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ebfc94f..d28755b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -424,6 +424,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type <ival>	for_locking_strength
 %type <node>	for_locking_item
 %type <list>	for_locking_clause opt_for_locking_clause for_locking_items
+%type <node>    asof_clause
 %type <list>	locked_rels_list
 %type <boolean>	all_or_distinct
 
@@ -605,7 +606,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
 /* ordinary key words in alphabetical order */
 %token <keyword> ABORT_P ABSOLUTE_P ACCESS ACTION ADD_P ADMIN AFTER
-	AGGREGATE ALL ALSO ALTER ALWAYS ANALYSE ANALYZE AND ANY ARRAY AS ASC
+	AGGREGATE ALL ALSO ALTER ALWAYS ANALYSE ANALYZE AND ANY ARRAY AS ASC ASOF
 	ASSERTION ASSIGNMENT ASYMMETRIC AT ATTACH ATTRIBUTE AUTHORIZATION
 
 	BACKWARD BEFORE BEGIN_P BETWEEN BIGINT BINARY BIT
@@ -11129,8 +11130,8 @@ SelectStmt: select_no_parens			%prec UMINUS
 		;
 
 select_with_parens:
-			'(' select_no_parens ')'				{ $$ = $2; }
-			| '(' select_with_parens ')'			{ $$ = $2; }
+			'(' select_no_parens ')'			{ $$ = $2; }
+			| '(' select_with_parens ')'		{ $$ = $2; }
 		;
 
 /*
@@ -11234,7 +11235,7 @@ select_clause:
 simple_select:
 			SELECT opt_all_clause opt_target_list
 			into_clause from_clause where_clause
-			group_clause having_clause window_clause
+			group_clause having_clause window_clause asof_clause
 				{
 					SelectStmt *n = makeNode(SelectStmt);
 					n->targetList = $3;
@@ -11244,11 +11245,12 @@ simple_select:
 					n->groupClause = $7;
 					n->havingClause = $8;
 					n->windowClause = $9;
+					n->asofTimestamp = $10;
 					$$ = (Node *)n;
 				}
 			| SELECT distinct_clause target_list
 			into_clause from_clause where_clause
-			group_clause having_clause window_clause
+			group_clause having_clause window_clause asof_clause
 				{
 					SelectStmt *n = makeNode(SelectStmt);
 					n->distinctClause = $2;
@@ -11259,6 +11261,7 @@ simple_select:
 					n->groupClause = $7;
 					n->havingClause = $8;
 					n->windowClause = $9;
+					n->asofTimestamp = $10;
 					$$ = (Node *)n;
 				}
 			| values_clause							{ $$ = $1; }
@@ -11494,6 +11497,10 @@ opt_select_limit:
 			| /* EMPTY */						{ $$ = list_make2(NULL,NULL); }
 		;
 
+asof_clause: ASOF a_expr						{ $$ = $2; }
+			| /* EMPTY */						{ $$ = NULL; }
+		;
+
 limit_clause:
 			LIMIT select_limit_value
 				{ $$ = $2; }
@@ -15311,6 +15318,7 @@ reserved_keyword:
 			| ARRAY
 			| AS
 			| ASC
+			| ASOF
 			| ASYMMETRIC
 			| BOTH
 			| CASE
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 4c4f4cd..6c3e506 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -439,6 +439,7 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
@@ -856,6 +857,7 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 2828bbf..6cdf1af 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -1729,6 +1729,30 @@ transformWhereClause(ParseState *pstate, Node *clause,
 
 
 /*
+ * transformAsofClause -
+ *	  Transform the expression and make sure it is of type timestamptz.
+ *	  Used for the ASOF clause.
+ *
+ */
+Node *
+transformAsofClause(ParseState *pstate, Node *clause)
+{
+	Node	   *qual;
+
+	if (clause == NULL)
+		return NULL;
+
+	qual = transformExpr(pstate, clause, EXPR_KIND_ASOF);
+
+	qual = coerce_to_specific_type(pstate, qual, TIMESTAMPTZOID, "ASOF");
+
+	/* ASOF can't refer to any variables of the current query */
+	checkExprIsVarFree(pstate, qual, "ASOF");
+
+	return qual;
+}
+
+/*
  * transformLimitClause -
  *	  Transform the expression and make sure it is of type bigint.
  *	  Used for LIMIT and allied clauses.
diff --git a/src/backend/parser/parse_cte.c b/src/backend/parser/parse_cte.c
index 5160fdb..d326937 100644
--- a/src/backend/parser/parse_cte.c
+++ b/src/backend/parser/parse_cte.c
@@ -942,6 +942,8 @@ checkWellFormedSelectStmt(SelectStmt *stmt, CteState *cstate)
 											   cstate);
 				checkWellFormedRecursionWalker((Node *) stmt->limitCount,
 											   cstate);
+				checkWellFormedRecursionWalker((Node *) stmt->asofTimestamp,
+											   cstate);
 				checkWellFormedRecursionWalker((Node *) stmt->lockingClause,
 											   cstate);
 				/* stmt->withClause is intentionally ignored here */
@@ -961,6 +963,8 @@ checkWellFormedSelectStmt(SelectStmt *stmt, CteState *cstate)
 											   cstate);
 				checkWellFormedRecursionWalker((Node *) stmt->limitCount,
 											   cstate);
+				checkWellFormedRecursionWalker((Node *) stmt->asofTimestamp,
+											   cstate);
 				checkWellFormedRecursionWalker((Node *) stmt->lockingClause,
 											   cstate);
 				/* stmt->withClause is intentionally ignored here */
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 29f9da7..94b6b52 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1814,6 +1814,7 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
 		case EXPR_KIND_DISTINCT_ON:
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 		case EXPR_KIND_RETURNING:
 		case EXPR_KIND_VALUES:
 		case EXPR_KIND_VALUES_SINGLE:
@@ -3445,6 +3446,8 @@ ParseExprKindName(ParseExprKind exprKind)
 			return "LIMIT";
 		case EXPR_KIND_OFFSET:
 			return "OFFSET";
+		case EXPR_KIND_ASOF:
+			return "ASOF";
 		case EXPR_KIND_RETURNING:
 			return "RETURNING";
 		case EXPR_KIND_VALUES:
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e6b0856..a6bcfc7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2250,6 +2250,7 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
diff --git a/src/backend/parser/parse_type.c b/src/backend/parser/parse_type.c
index b032651..fec5ac5 100644
--- a/src/backend/parser/parse_type.c
+++ b/src/backend/parser/parse_type.c
@@ -767,6 +767,7 @@ typeStringToTypeName(const char *str)
 		stmt->sortClause != NIL ||
 		stmt->limitOffset != NULL ||
 		stmt->limitCount != NULL ||
+		stmt->asofTimestamp != NULL ||
 		stmt->lockingClause != NIL ||
 		stmt->withClause != NULL ||
 		stmt->op != SETOP_NONE)
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index e93552a..1bb37b2 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -2301,8 +2301,8 @@ view_query_is_auto_updatable(Query *viewquery, bool check_cols)
 	if (viewquery->cteList != NIL)
 		return gettext_noop("Views containing WITH are not automatically updatable.");
 
-	if (viewquery->limitOffset != NULL || viewquery->limitCount != NULL)
-		return gettext_noop("Views containing LIMIT or OFFSET are not automatically updatable.");
+	if (viewquery->limitOffset != NULL || viewquery->limitCount != NULL || viewquery->asofTimestamp != NULL)
+		return gettext_noop("Views containing AS OF, LIMIT or OFFSET are not automatically updatable.");
 
 	/*
 	 * We must not allow window functions or set returning functions in the
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 8514c21..330ebfb 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -5115,6 +5115,12 @@ get_select_query_def(Query *query, deparse_context *context,
 		else
 			get_rule_expr(query->limitCount, context, false);
 	}
+	if (query->asofTimestamp != NULL)
+	{
+		appendContextKeyword(context, " AS OF ",
+							 -PRETTYINDENT_STD, PRETTYINDENT_STD, 0);
+		get_rule_expr(query->asofTimestamp, context, false);
+	}
 
 	/* Add FOR [KEY] UPDATE/SHARE clauses if present */
 	if (query->hasForUpdate)
@@ -5503,10 +5509,11 @@ get_setop_query(Node *setOp, Query *query, deparse_context *context,
 
 		Assert(subquery != NULL);
 		Assert(subquery->setOperations == NULL);
-		/* Need parens if WITH, ORDER BY, FOR UPDATE, or LIMIT; see gram.y */
+		/* Need parens if WITH, ORDER BY, AS OF, FOR UPDATE, or LIMIT; see gram.y */
 		need_paren = (subquery->cteList ||
 					  subquery->sortClause ||
 					  subquery->rowMarks ||
+					  subquery->asofTimestamp ||
 					  subquery->limitOffset ||
 					  subquery->limitCount);
 		if (need_paren)
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 2b218e0..621c6a3 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -69,6 +69,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "access/xlog.h"
+#include "access/commit_ts.h"
 #include "storage/bufmgr.h"
 #include "storage/procarray.h"
 #include "utils/builtins.h"
@@ -81,7 +82,6 @@
 SnapshotData SnapshotSelfData = {HeapTupleSatisfiesSelf};
 SnapshotData SnapshotAnyData = {HeapTupleSatisfiesAny};
 
-
 /*
  * SetHintBits()
  *
@@ -1476,7 +1476,17 @@ XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
 {
 	uint32		i;
 
-	/*
+	if (snapshot->asofTimestamp != 0)
+	{
+		TimestampTz ts;
+		if (TransactionIdGetCommitTsData(xid, &ts, NULL))
+		{
+			return timestamptz_cmp_internal(snapshot->asofTimestamp, ts) < 0;
+		}
+	}
+
+
+	/*
 	 * Make a quick range check to eliminate most XIDs without looking at the
 	 * xip arrays.  Note that this is OK even if we convert a subxact XID to
 	 * its parent below, because a subxact with XID < xmin has surely also got
diff --git a/src/include/executor/nodeAsof.h b/src/include/executor/nodeAsof.h
new file mode 100644
index 0000000..2f8e1a2
--- /dev/null
+++ b/src/include/executor/nodeAsof.h
@@ -0,0 +1,23 @@
+/*-------------------------------------------------------------------------
+ *
+ * nodeAsof.h
+ *
+ *
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/executor/nodeAsof.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef NODEASOF_H
+#define NODEASOF_H
+
+#include "nodes/execnodes.h"
+
+extern AsofState *ExecInitAsof(Asof *node, EState *estate, int eflags);
+extern void ExecEndAsof(AsofState *node);
+extern void ExecReScanAsof(AsofState *node);
+
+#endif							/* NODEASOF_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 1a35c5c..42cd037 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -2109,4 +2109,12 @@ typedef struct LimitState
 	TupleTableSlot *subSlot;	/* tuple last obtained from subplan */
 } LimitState;
 
+typedef struct AsofState
+{
+	PlanState	ps;				/* its first field is NodeTag */
+	ExprState  *asofExpr;	    /* AS OF expression */
+	TimestampTz asofTimestamp;  /* AS OF timestamp or 0 if not set */
+	bool        timestampCalculated; /* whether AS OF timestamp was calculated */
+} AsofState;
+
 #endif							/* EXECNODES_H */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index c5b5115..e69a189 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -83,6 +83,7 @@ typedef enum NodeTag
 	T_SetOp,
 	T_LockRows,
 	T_Limit,
+	T_Asof,
 	/* these aren't subclasses of Plan: */
 	T_NestLoopParam,
 	T_PlanRowMark,
@@ -135,6 +136,7 @@ typedef enum NodeTag
 	T_SetOpState,
 	T_LockRowsState,
 	T_LimitState,
+	T_AsofState,
 
 	/*
 	 * TAGS FOR PRIMITIVE NODES (primnodes.h)
@@ -251,6 +253,7 @@ typedef enum NodeTag
 	T_LockRowsPath,
 	T_ModifyTablePath,
 	T_LimitPath,
+	T_AsofPath,
 	/* these aren't subclasses of Path: */
 	T_EquivalenceClass,
 	T_EquivalenceMember,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2eaa6b2..e592418 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -158,6 +158,7 @@ typedef struct Query
 	Node	   *limitOffset;	/* # of result tuples to skip (int8 expr) */
 	Node	   *limitCount;		/* # of result tuples to return (int8 expr) */
 
+	Node       *asofTimestamp;  /* ASOF timestamp */
 	List	   *rowMarks;		/* a list of RowMarkClause's */
 
 	Node	   *setOperations;	/* set-operation tree if this is top level of
@@ -1552,6 +1553,7 @@ typedef struct SelectStmt
 	struct SelectStmt *larg;	/* left child */
 	struct SelectStmt *rarg;	/* right child */
 	/* Eventually add fields for CORRESPONDING spec here */
+	Node	   *asofTimestamp;	/* AS OF timestamp */
 } SelectStmt;
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 02fb366..60c66b5 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -931,6 +931,19 @@ typedef struct Limit
 } Limit;
 
 
+/* ----------------
+ *		asof node
+ *
+ */
+typedef struct Asof
+{
+	Plan		plan;
+	Node	   *asofTimestamp;
+} Asof;
+
+
+
+
 /*
  * RowMarkType -
  *	  enums for types of row-marking operations
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 1108b6a..d1ca25e9 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -1694,6 +1694,16 @@ typedef struct LimitPath
 	Node	   *limitCount;		/* COUNT parameter, or NULL if none */
 } LimitPath;
 
+/*
+ * AsofPath represents applying AS OF timestamp qualifier
+ */
+typedef struct AsofPath
+{
+	Path		path;
+	Path	   *subpath;		/* path representing input source */
+	Node	   *asofTimestamp;  /* AS OF timestamp */
+} AsofPath;
+
 
 /*
  * Restriction clause info.
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 99f65b4..9da39f1 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -250,6 +250,9 @@ extern LimitPath *create_limit_path(PlannerInfo *root, RelOptInfo *rel,
 				  Path *subpath,
 				  Node *limitOffset, Node *limitCount,
 				  int64 offset_est, int64 count_est);
+extern AsofPath *create_asof_path(PlannerInfo *root, RelOptInfo *rel,
+				  Path *subpath,
+				  Node *asofTimestamp);
 
 extern Path *reparameterize_path(PlannerInfo *root, Path *path,
 					Relids required_outer,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index d613322..f5e5508 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -65,6 +65,7 @@ extern Agg *make_agg(List *tlist, List *qual,
 		 List *groupingSets, List *chain,
 		 double dNumGroups, Plan *lefttree);
 extern Limit *make_limit(Plan *lefttree, Node *limitOffset, Node *limitCount);
+extern Asof *make_asof(Plan *lefttree, Node *asofTimestamp);
 
 /*
  * prototypes for plan/initsplan.c
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index a932400..adaa71d 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -45,6 +45,7 @@ PG_KEYWORD("any", ANY, RESERVED_KEYWORD)
 PG_KEYWORD("array", ARRAY, RESERVED_KEYWORD)
 PG_KEYWORD("as", AS, RESERVED_KEYWORD)
 PG_KEYWORD("asc", ASC, RESERVED_KEYWORD)
+PG_KEYWORD("asof", ASOF, RESERVED_KEYWORD)
 PG_KEYWORD("assertion", ASSERTION, UNRESERVED_KEYWORD)
 PG_KEYWORD("assignment", ASSIGNMENT, UNRESERVED_KEYWORD)
 PG_KEYWORD("asymmetric", ASYMMETRIC, RESERVED_KEYWORD)
diff --git a/src/include/parser/parse_clause.h b/src/include/parser/parse_clause.h
index 1d205c6..d0d3681 100644
--- a/src/include/parser/parse_clause.h
+++ b/src/include/parser/parse_clause.h
@@ -25,6 +25,7 @@ extern Node *transformWhereClause(ParseState *pstate, Node *clause,
 					 ParseExprKind exprKind, const char *constructName);
 extern Node *transformLimitClause(ParseState *pstate, Node *clause,
 					 ParseExprKind exprKind, const char *constructName);
+extern Node *transformAsofClause(ParseState *pstate, Node *clause);
 extern List *transformGroupClause(ParseState *pstate, List *grouplist,
 					 List **groupingSets,
 					 List **targetlist, List *sortClause,
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 565bb3d..46e9c0c 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -68,7 +68,8 @@ typedef enum ParseExprKind
 	EXPR_KIND_TRIGGER_WHEN,		/* WHEN condition in CREATE TRIGGER */
 	EXPR_KIND_POLICY,			/* USING or WITH CHECK expr in policy */
 	EXPR_KIND_PARTITION_EXPRESSION,	/* PARTITION BY expression */
-	EXPR_KIND_CALL				/* CALL argument */
+	EXPR_KIND_CALL,				/* CALL argument */
+	EXPR_KIND_ASOF				/* AS OF */
 } ParseExprKind;
 
 
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index bf51977..a00f0d9 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -111,6 +111,7 @@ typedef struct SnapshotData
 	pairingheap_node ph_node;	/* link in the RegisteredSnapshots heap */
 
 	TimestampTz whenTaken;		/* timestamp when snapshot was taken */
+	TimestampTz asofTimestamp;	/* select AS OF timestamp */
 	XLogRecPtr	lsn;			/* position in the WAL stream when taken */
 } SnapshotData;
 
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index fa4d573..2b65848 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -6513,6 +6513,7 @@ exec_simple_check_plan(PLpgSQL_execstate *estate, PLpgSQL_expr *expr)
 		query->sortClause ||
 		query->limitOffset ||
 		query->limitCount ||
+		query->asofTimestamp ||
 		query->setOperations)
 		return;
 
#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Konstantin Knizhnik (#1)
Re: AS OF queries

Konstantin Knizhnik wrote:

I wonder if Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm).
As far as I know something similar is now developed for MariaDB.

I think that would be a good thing to have that could make
the DBA's work easier - all the requests to restore a table
to the state from an hour ago.

I failed to support AS OF clause (as in Oracle) because of shift-reduce
conflicts with aliases,
so I have to introduce new ASOF keyword. May be yacc experts can propose
how to solve this conflict without introducing new keyword...

I think it would be highly desirable to have AS OF, because that's
the way the SQL standard has it.

Yours,
Laurenz Albe

#3Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Laurenz Albe (#2)
Re: AS OF queries

On 20.12.2017 16:12, Laurenz Albe wrote:

Konstantin Knizhnik wrote:

I wonder if Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm).
As far as I know something similar is now developed for MariaDB.

I think that would be a good thing to have that could make
the DBA's work easier - all the requests to restore a table
to the state from an hour ago.

Please notice that it is necessary to configure Postgres in the proper way
in order to be able to perform time travel.
If you do not disable autovacuum, then old versions will just be cleaned up.
If transaction commit timestamps are not tracked, then it is not
possible to locate the required timeline.

So the DBA should decide in advance whether this feature is needed
or not.
It is not a proper instrument for restoring/auditing an existing database
which was not configured to keep all versions.

Maybe it is better to add a special configuration parameter for this
feature which would implicitly toggle the
autovacuum and track_commit_timestamp parameters.
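For reference, the two settings the prototype relies on (as listed in the
first message) look like this in postgresql.conf:

```
# Keep old tuple versions instead of reclaiming them:
autovacuum = off
# Record commit timestamps so an AS OF timestamp can be mapped to
# transactions (changing this requires a server restart):
track_commit_timestamp = on
```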

The obvious drawbacks of keeping all versions are
1. Increased size of database.
2. Decreased query execution speed, because queries need to traverse a lot
of invisible versions.

So maybe in practice it will be useful to limit the lifetime of versions.

I failed to support AS OF clause (as in Oracle) because of shift-reduce
conflicts with aliases,
so I have to introduce new ASOF keyword. May be yacc experts can propose
how to solve this conflict without introducing new keyword...

I think it would be highly desirable to have AS OF, because that's
the way the SQL standard has it.

Completely agree with you: I just gave up after a few hours of attempts
to make bison resolve these conflicts.

Yours,
Laurenz Albe

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#4Joe Wildish
joe-postgresql.org@elusive.cx
In reply to: Konstantin Knizhnik (#3)
Re: AS OF queries

On 20 Dec 2017, at 13:48, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:

On 20.12.2017 16:12, Laurenz Albe wrote:

Konstantin Knizhnik wrote:

I wonder if Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm).
As far as I know something similar is now developed for MariaDB.

I think that would be a good thing to have that could make
the DBA's work easier - all the requests to restore a table
to the state from an hour ago.

Please notice that it is necessary to configure postgres in proper way in order to be able to perform time travels.
If you do not disable autovacuum, then old versions will be just cleaned-up.
If transaction commit timestamps are not tracked, then it is not possible to locate required timeline.

So DBA should make a decision in advance whether this feature is needed or not.
It is not a proper instrument for restoring/auditing existed database which was not configured to keep all versions.

May be it is better to add special configuration parameter for this feature which should implicitly toggle
autovacuum and track_commit_timestamp parameters.

I seem to recall that Oracle handles this by requiring tables that want the capability to live within a tablespace that supports flashback. That tablespace is obviously configured to retain redo/undo logs. It would be nice if the vacuuming process could be configured in a similar manner. I have no idea if it would make sense on a tablespace basis or not, though — I’m not entirely sure how analogous they are between Postgres & Oracle as I’ve never used tablespaces in Postgres.

-Joe

#5Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Konstantin Knizhnik (#3)
Re: AS OF queries

Konstantin Knizhnik wrote:

Please notice that it is necessary to configure postgres in proper way in order to be able to perform time travels.
If you do not disable autovacuum, then old versions will be just cleaned-up.
If transaction commit timestamps are not tracked, then it is not possible to locate required timeline.

So DBA should make a decision in advance whether this feature is needed or not.
It is not a proper instrument for restoring/auditing existed database which was not configured to keep all versions.

Of course; you'd have to anticipate the need to travel in time,
and you have to pay the price for it.
Anybody who has read science fiction stories knows that time travel
does not come free.

May be it is better to add special configuration parameter for this feature which should implicitly toggle
autovacuum and track_commit_timestamp parameters).

The feature would be most useful with some kind of "moving xid
horizon" that guarantees that only dead tuples whose xmax lies
more than a certain time interval in the past can be vacuumed.

Yours,
Laurenz Albe

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Laurenz Albe (#2)
Re: AS OF queries

Laurenz Albe <laurenz.albe@cybertec.at> writes:

Konstantin Knizhnik wrote:

I failed to support AS OF clause (as in Oracle) because of shift-reduce
conflicts with aliases,
so I have to introduce new ASOF keyword. May be yacc experts can propose
how to solve this conflict without introducing new keyword...

I think it would be highly desirable to have AS OF, because that's
the way the SQL standard has it.

Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous. This is required to work by spec:

regression=# select x as of from (values(1)) t(x);
of
----
1
(1 row)

so it's not possible for us ever to support an expression that includes
top-level "AS OF" (or, pretty much, "AS anything") without some rather
enormous pushups.

If we absolutely had to do it, the path to a solution would involve some
lexer-level lookahead, cf base_yylex() --- but that's messy and tends to
introduce its own set of corner case misbehaviors. I'd much rather use a
syntax that wasn't chosen with blind disregard for SQL's existing
syntactic constraints.

regards, tom lane

#7David Fetter
david@fetter.org
In reply to: Laurenz Albe (#5)
Re: AS OF queries

On Wed, Dec 20, 2017 at 03:03:50PM +0100, Laurenz Albe wrote:

Konstantin Knizhnik wrote:

Please notice that it is necessary to configure postgres in proper
way in order to be able to perform time travels. If you do not
disable autovacuum, then old versions will be just cleaned-up. If
transaction commit timestamps are not tracked, then it is not
possible to locate required timeline.

So DBA should make a decision in advance whether this feature is
needed or not. It is not a proper instrument for
restoring/auditing existed database which was not configured to
keep all versions.

Of course; you'd have to anticipate the need to travel in time, and
you have to pay the price for it. Anybody who has read science
fiction stories know that time travel does not come free.

A few extra terabytes' worth of storage space is a pretty small price
to pay, at least on the scale of time travel penalties.

May be it is better to add special configuration parameter for
this feature which should implicitly toggle autovacuum and
track_commit_timestamp parameters).

The feature would be most useful with some kind of "moving xid
horizon" that guarantees that only dead tuples whose xmax lies more
than a certain time interval in the past can be vacuumed.

+1 for this horizon. It would be very nice, but maybe not strictly
necessary, for this to be adjustable downward without a restart.

It's not clear that adjusting it upward should work at all, but if it
did, the state of dead tuples would have to be known, and they'd have
to be vacuumed a way that was able to establish a guarantee of
gaplessness at least back to the new horizon. Maybe there could be
some kind of "high water mark" for it. Would that impose overhead or
design constraints on vacuum that we don't want?

Also nice but not strictly necessary, making it tunable per relation,
or at least per table. I'm up in the air as to whether queries with
an AS OF older than the horizon[1] should error out or merely throw
warnings.

Best,
David.

[1]: If we allow setting this at granularities coarser than DB
instance, this means going as far back as the relationship with the
newest "last" tuple among the relations involved in the query.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#8Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Tom Lane (#6)
Re: AS OF queries

On 12/20/17 10:29, Tom Lane wrote:

Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous. This is required to work by spec:

regression=# select x as of from (values(1)) t(x);
of
----
1
(1 row)

so it's not possible for us ever to support an expression that includes
top-level "AS OF" (or, pretty much, "AS anything") without some rather
enormous pushups.

The SQL standard syntax appears to be something like

"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]

That's not going to be fun to parse.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#8)
Re: AS OF queries

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

On 12/20/17 10:29, Tom Lane wrote:

Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous.

The SQL standard syntax appears to be something like

"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]

That's not going to be fun to parse.

Bleah. In principle we could look two tokens ahead so as to recognize
"AS OF SYSTEM", but base_yylex is already a horrid mess with one-token
lookahead; I don't much want to try to extend it to that.

Possibly the most workable compromise is to use lookahead to convert
"AS OF" to "AS_LA OF", and then we could either just break using OF
as an alias, or add an extra production that allows "AS_LA OF" to
be treated as "AS alias" if it's not followed by the appropriate
stuff.
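As an illustration only (a toy Python token rewriter, not base_yylex
itself), the lookahead conversion described here would turn the two-token
sequence AS OF into a distinct token:

```python
# Toy sketch of the proposed lookahead: whenever AS is immediately
# followed by OF, emit the special token AS_LA instead, so the grammar
# can distinguish an "AS OF ..." clause from the alias "of".
def relex(tokens):
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "AS" and i + 1 < len(tokens) and tokens[i + 1] == "OF":
            out.append("AS_LA")
        else:
            out.append(tokens[i])
        i += 1
    return out

print(relex(["SELECT", "x", "AS", "OF"]))  # ['SELECT', 'x', 'AS_LA', 'OF']
```

The grammar can then give "AS_LA OF" its own productions, while a plain
AS keeps working for ordinary aliases.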

It's a shame that the SQL committee appears to be so ignorant of
standard parsing technology.

regards, tom lane

#10Magnus Hagander
magnus@hagander.net
In reply to: Peter Eisentraut (#8)
Re: AS OF queries

On Wed, Dec 20, 2017 at 5:17 PM, Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> wrote:

On 12/20/17 10:29, Tom Lane wrote:

Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous. This is required to work by spec:

regression=# select x as of from (values(1)) t(x);
of
----
1
(1 row)

so it's not possible for us ever to support an expression that includes
top-level "AS OF" (or, pretty much, "AS anything") without some rather
enormous pushups.

The SQL standard syntax appears to be something like

"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]

That's not going to be fun to parse.

There was a presentation about this given at FOSDEM PGDay a couple of years
back. Slides at
https://wiki.postgresql.org/images/6/64/Fosdem20150130PostgresqlTemporal.pdf
.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#11Pantelis Theodosiou
ypercube@gmail.com
In reply to: Tom Lane (#9)
Re: AS OF queries

On Wed, Dec 20, 2017 at 4:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

On 12/20/17 10:29, Tom Lane wrote:

Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous.

The SQL standard syntax appears to be something like

"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]

That's not going to be fun to parse.

Examples from DB2 documentation (which may be closer to the standard):

SELECT coverage_amt
FROM policy FOR SYSTEM_TIME AS OF '2010-12-01'
WHERE id = 1111;

SELECT count(*)
FROM policy FOR SYSTEM_TIME FROM '2011-11-30'
TO '9999-12-30'
WHERE vin = 'A1111';

So besides AS .. AS, it could also be FROM .. FROM.

Bleah. In principle we could look two tokens ahead so as to recognize
"AS OF SYSTEM", but base_yylex is already a horrid mess with one-token
lookahead; I don't much want to try to extend it to that.

Possibly the most workable compromise is to use lookahead to convert
"AS OF" to "AS_LA OF", and then we could either just break using OF
as an alias, or add an extra production that allows "AS_LA OF" to
be treated as "AS alias" if it's not followed by the appropriate
stuff.

It's a shame that the SQL committee appears to be so ignorant of
standard parsing technology.

regards, tom lane

#12Alvaro Hernandez
aht@ongres.com
In reply to: Konstantin Knizhnik (#3)
Re: AS OF queries

On 20/12/17 14:48, Konstantin Knizhnik wrote:

On 20.12.2017 16:12, Laurenz Albe wrote:

Konstantin Knizhnik wrote:

I wonder if Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm).
As far as I know something similar is now developed for MariaDB.

I think that would be a good thing to have that could make
the DBA's work easier - all the requests to restore a table
to the state from an hour ago.

Please notice that it is necessary to configure postgres in proper way
in order to be able to perform time travels.

    This makes sense. BTW, I believe this feature would be an amazing
addition to PostgreSQL.

If you do not disable autovacuum, then old versions will be just
cleaned-up.
If transaction commit timestamps are not tracked, then it is not
possible to locate required timeline.

So DBA should make a decision in advance whether this feature is
needed or not.
It is not a proper instrument for restoring/auditing existed database
which was not configured to keep all versions.

May be it is better to add special configuration parameter for this
feature which should implicitly toggle
autovacuum and track_commit_timestamp parameters.

    Downthread a "moving xid horizon" is proposed. I believe this is
not too user friendly. I'd rather use a timestamp horizon (e.g. "up to 2
days ago"). Given that the commit timestamp is tracked, I don't think
this is an issue. This is the same as the undo_retention in Oracle,
which is expressed in seconds.

The obvious drawbacks of keeping all versions are
1. Increased size of database.
2. Decreased query execution speed because them need to traverse a lot
of not visible versions.

    In other words, what is nowadays called "bloat". I have seen a lot of
it in the field; not everybody tunes vacuum to keep up to date. So I
don't expect this feature to be too expensive for many, while at the
same time it would be an awesome addition: no need to fire up a separate
server, exercise PITR, and then find ways to move the old data around.

    Regards,

    Álvaro

--

Alvaro Hernandez

-----------
OnGres

#13Craig Ringer
craig@2ndquadrant.com
In reply to: Peter Eisentraut (#8)
Re: AS OF queries

On 21 December 2017 at 00:17, Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> wrote:

On 12/20/17 10:29, Tom Lane wrote:

Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous. This is required to work by spec:

regression=# select x as of from (values(1)) t(x);
of
----
1
(1 row)

so it's not possible for us ever to support an expression that includes
top-level "AS OF" (or, pretty much, "AS anything") without some rather
enormous pushups.

The SQL standard syntax appears to be something like

"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]

That's not going to be fun to parse.

Well, the SQL committee seem to specialise in parser torture.

Window functions, anybody?

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#14Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Tom Lane (#9)
1 attachment(s)
Re: AS OF queries

On 20.12.2017 19:26, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

On 12/20/17 10:29, Tom Lane wrote:

Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous.

The SQL standard syntax appears to be something like
"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]
That's not going to be fun to parse.

Bleah. In principle we could look two tokens ahead so as to recognize
"AS OF SYSTEM", but base_yylex is already a horrid mess with one-token
lookahead; I don't much want to try to extend it to that.

Possibly the most workable compromise is to use lookahead to convert
"AS OF" to "AS_LA OF", and then we could either just break using OF
as an alias, or add an extra production that allows "AS_LA OF" to
be treated as "AS alias" if it's not followed by the appropriate
stuff.

It's a shame that the SQL committee appears to be so ignorant of
standard parsing technology.

regards, tom lane

Thank you for the suggestion with AS_LA: it really works.
Actually, instead of AS_LA I just return an ASOF token when the token
after AS is OF.
So now it is possible to write the query this way:

    select * from foo as of timestamp '2017-12-21 14:12:15.1867';

There is still one significant difference between my prototype
implementation and the SQL standard: it associates the timestamp with
the select statement, not with a particular table.
Per-table timestamps seem to be more difficult to support, and I am
not sure that joining tables from different timelines makes much sense.
But certainly that can also be fixed.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

asof-2.patch (text/x-patch)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 3de8333..2126847 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2353,6 +2353,7 @@ JumbleQuery(pgssJumbleState *jstate, Query *query)
 	JumbleExpr(jstate, (Node *) query->sortClause);
 	JumbleExpr(jstate, query->limitOffset);
 	JumbleExpr(jstate, query->limitCount);
+	JumbleExpr(jstate, query->asofTimestamp);
 	/* we ignore rowMarks */
 	JumbleExpr(jstate, query->setOperations);
 }
diff --git a/src/backend/executor/Makefile b/src/backend/executor/Makefile
index cc09895..d2e0799 100644
--- a/src/backend/executor/Makefile
+++ b/src/backend/executor/Makefile
@@ -29,6 +29,6 @@ OBJS = execAmi.o execCurrent.o execExpr.o execExprInterp.o \
        nodeCtescan.o nodeNamedtuplestorescan.o nodeWorktablescan.o \
        nodeGroup.o nodeSubplan.o nodeSubqueryscan.o nodeTidscan.o \
        nodeForeignscan.o nodeWindowAgg.o tstoreReceiver.o tqueue.o spi.o \
-       nodeTableFuncscan.o
+       nodeTableFuncscan.o nodeAsof.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/executor/execAmi.c b/src/backend/executor/execAmi.c
index f1636a5..38c79b8 100644
--- a/src/backend/executor/execAmi.c
+++ b/src/backend/executor/execAmi.c
@@ -285,6 +285,10 @@ ExecReScan(PlanState *node)
 			ExecReScanLimit((LimitState *) node);
 			break;
 
+		case T_AsofState:
+			ExecReScanAsof((AsofState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			break;
diff --git a/src/backend/executor/execCurrent.c b/src/backend/executor/execCurrent.c
index a3e962e..1912ae4 100644
--- a/src/backend/executor/execCurrent.c
+++ b/src/backend/executor/execCurrent.c
@@ -329,6 +329,7 @@ search_plan_tree(PlanState *node, Oid table_oid)
 			 */
 		case T_ResultState:
 		case T_LimitState:
+		case T_AsofState:
 			return search_plan_tree(node->lefttree, table_oid);
 
 			/*
diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index fcb8b56..586b5b3 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -75,6 +75,7 @@
 #include "executor/executor.h"
 #include "executor/nodeAgg.h"
 #include "executor/nodeAppend.h"
+#include "executor/nodeAsof.h"
 #include "executor/nodeBitmapAnd.h"
 #include "executor/nodeBitmapHeapscan.h"
 #include "executor/nodeBitmapIndexscan.h"
@@ -364,6 +365,11 @@ ExecInitNode(Plan *node, EState *estate, int eflags)
 												 estate, eflags);
 			break;
 
+		case T_Asof:
+			result = (PlanState *) ExecInitAsof((Asof *) node,
+												estate, eflags);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			result = NULL;		/* keep compiler quiet */
@@ -727,6 +733,10 @@ ExecEndNode(PlanState *node)
 			ExecEndLimit((LimitState *) node);
 			break;
 
+		case T_AsofState:
+			ExecEndAsof((AsofState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			break;
diff --git a/src/backend/executor/nodeAsof.c b/src/backend/executor/nodeAsof.c
new file mode 100644
index 0000000..8957a91
--- /dev/null
+++ b/src/backend/executor/nodeAsof.c
@@ -0,0 +1,157 @@
+/*-------------------------------------------------------------------------
+ *
+ * nodeAsof.c
+ *	  Routines to handle AS OF timestamp filtering of query results
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/executor/nodeAsof.c
+ *
+ *-------------------------------------------------------------------------
+ */
+/*
+ * INTERFACE ROUTINES
+ *		ExecAsof		- fetch tuples under the AS OF snapshot
+ *		ExecInitAsof	- initialize node and subnodes..
+ *		ExecEndAsof	- shutdown node and subnodes
+ */
+
+#include "postgres.h"
+
+#include "executor/executor.h"
+#include "executor/nodeAsof.h"
+#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
+
+/* ----------------------------------------------------------------
+ *		ExecAsof
+ *
+ *		This is a very simple node which just evaluates the AS OF
+ *		timestamp and installs it in the snapshot used by its subplan.
+ * ----------------------------------------------------------------
+ */
+static TupleTableSlot *			/* return: a tuple or NULL */
+ExecAsof(PlanState *pstate)
+{
+	AsofState      *node = castNode(AsofState, pstate);
+	PlanState      *outerPlan = outerPlanState(node);
+	TimestampTz     outerAsofTimestamp;
+	TupleTableSlot *slot;
+
+	if (!node->timestampCalculated)
+	{
+		Datum		val;
+		bool		isNull;
+
+		val = ExecEvalExprSwitchContext(node->asofExpr,
+										pstate->ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		if (isNull)
+			node->asofTimestamp = 0;
+		else
+		{
+			node->asofTimestamp = DatumGetInt64(val);
+		}
+		node->timestampCalculated = true;
+	}
+	outerAsofTimestamp = pstate->state->es_snapshot->asofTimestamp;
+	pstate->state->es_snapshot->asofTimestamp = node->asofTimestamp;
+	slot = ExecProcNode(outerPlan);
+	pstate->state->es_snapshot->asofTimestamp = outerAsofTimestamp;
+	return slot;
+}
+
+
+/* ----------------------------------------------------------------
+ *		ExecInitAsof
+ *
+ *		This initializes the asof node state structures and
+ *		the node's subplan.
+ * ----------------------------------------------------------------
+ */
+AsofState *
+ExecInitAsof(Asof *node, EState *estate, int eflags)
+{
+	AsofState *asofstate;
+	Plan	   *outerPlan;
+
+	/* check for unsupported flags */
+	Assert(!(eflags & EXEC_FLAG_MARK));
+
+	/*
+	 * create state structure
+	 */
+	asofstate = makeNode(AsofState);
+	asofstate->ps.plan = (Plan *) node;
+	asofstate->ps.state = estate;
+	asofstate->ps.ExecProcNode = ExecAsof;
+	asofstate->timestampCalculated = false;
+
+	/*
+	 * Miscellaneous initialization
+	 *
+	 * Asof nodes never call ExecQual or ExecProject, but they need an
+	 * exprcontext anyway to evaluate the AS OF timestamp expression in.
+	 */
+	ExecAssignExprContext(estate, &asofstate->ps);
+
+	/*
+	 * initialize child expressions
+	 */
+	asofstate->asofExpr = ExecInitExpr((Expr *) node->asofTimestamp,
+									   (PlanState *) asofstate);
+	/*
+	 * Tuple table initialization (XXX not actually used...)
+	 */
+	ExecInitResultTupleSlot(estate, &asofstate->ps);
+
+	/*
+	 * then initialize outer plan
+	 */
+	outerPlan = outerPlan(node);
+	outerPlanState(asofstate) = ExecInitNode(outerPlan, estate, eflags);
+
+	/*
+	 * asof nodes do no projections, so initialize projection info for this
+	 * node appropriately
+	 */
+	ExecAssignResultTypeFromTL(&asofstate->ps);
+	asofstate->ps.ps_ProjInfo = NULL;
+
+	return asofstate;
+}
+
+/* ----------------------------------------------------------------
+ *		ExecEndAsof
+ *
+ *		This shuts down the subplan and frees resources allocated
+ *		to this node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecEndAsof(AsofState *node)
+{
+	ExecFreeExprContext(&node->ps);
+	ExecEndNode(outerPlanState(node));
+}
+
+
+void
+ExecReScanAsof(AsofState *node)
+{
+	/*
+	 * Recompute AS OF in case parameters changed, and reset the current snapshot
+	 */
+	node->timestampCalculated = false;
+
+	/*
+	 * if chgParam of subnode is not null then plan will be re-scanned by
+	 * first ExecProcNode.
+	 */
+	if (node->ps.lefttree->chgParam == NULL)
+		ExecReScan(node->ps.lefttree);
+}
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index 883f46c..ebe0362 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -23,6 +23,7 @@
 
 #include "executor/executor.h"
 #include "executor/nodeLimit.h"
+#include "executor/nodeAsof.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index b1515dd..e142bbc 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1134,6 +1134,27 @@ _copyLimit(const Limit *from)
 }
 
 /*
+ * _copyAsof
+ */
+static Asof *
+_copyAsof(const Asof *from)
+{
+	Asof   *newnode = makeNode(Asof);
+
+	/*
+	 * copy node superclass fields
+	 */
+	CopyPlanFields((const Plan *) from, (Plan *) newnode);
+
+	/*
+	 * copy remainder of node
+	 */
+	COPY_NODE_FIELD(asofTimestamp);
+
+	return newnode;
+}
+
+/*
  * _copyNestLoopParam
  */
 static NestLoopParam *
@@ -2958,6 +2979,7 @@ _copyQuery(const Query *from)
 	COPY_NODE_FIELD(sortClause);
 	COPY_NODE_FIELD(limitOffset);
 	COPY_NODE_FIELD(limitCount);
+	COPY_NODE_FIELD(asofTimestamp);
 	COPY_NODE_FIELD(rowMarks);
 	COPY_NODE_FIELD(setOperations);
 	COPY_NODE_FIELD(constraintDeps);
@@ -3048,6 +3070,7 @@ _copySelectStmt(const SelectStmt *from)
 	COPY_SCALAR_FIELD(all);
 	COPY_NODE_FIELD(larg);
 	COPY_NODE_FIELD(rarg);
+	COPY_NODE_FIELD(asofTimestamp);
 
 	return newnode;
 }
@@ -4840,6 +4863,9 @@ copyObjectImpl(const void *from)
 		case T_Limit:
 			retval = _copyLimit(from);
 			break;
+		case T_Asof:
+			retval = _copyAsof(from);
+			break;
 		case T_NestLoopParam:
 			retval = _copyNestLoopParam(from);
 			break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 2e869a9..6bbbc1c 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -982,6 +982,7 @@ _equalQuery(const Query *a, const Query *b)
 	COMPARE_NODE_FIELD(sortClause);
 	COMPARE_NODE_FIELD(limitOffset);
 	COMPARE_NODE_FIELD(limitCount);
+	COMPARE_NODE_FIELD(asofTimestamp);
 	COMPARE_NODE_FIELD(rowMarks);
 	COMPARE_NODE_FIELD(setOperations);
 	COMPARE_NODE_FIELD(constraintDeps);
@@ -1062,6 +1063,7 @@ _equalSelectStmt(const SelectStmt *a, const SelectStmt *b)
 	COMPARE_SCALAR_FIELD(all);
 	COMPARE_NODE_FIELD(larg);
 	COMPARE_NODE_FIELD(rarg);
+	COMPARE_NODE_FIELD(asofTimestamp);
 
 	return true;
 }
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index c2a93b2..d674ec2 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2267,6 +2267,8 @@ query_tree_walker(Query *query,
 		return true;
 	if (walker(query->limitCount, context))
 		return true;
+	if (walker(query->asofTimestamp, context))
+		return true;
 	if (!(flags & QTW_IGNORE_CTE_SUBQUERIES))
 	{
 		if (walker((Node *) query->cteList, context))
@@ -3089,6 +3091,7 @@ query_tree_mutator(Query *query,
 	MUTATE(query->havingQual, query->havingQual, Node *);
 	MUTATE(query->limitOffset, query->limitOffset, Node *);
 	MUTATE(query->limitCount, query->limitCount, Node *);
+	MUTATE(query->asofTimestamp, query->asofTimestamp, Node *);
 	if (!(flags & QTW_IGNORE_CTE_SUBQUERIES))
 		MUTATE(query->cteList, query->cteList, List *);
 	else						/* else copy CTE list as-is */
@@ -3442,6 +3445,8 @@ raw_expression_tree_walker(Node *node,
 					return true;
 				if (walker(stmt->rarg, context))
 					return true;
+				if (walker(stmt->asofTimestamp, context))
+					return true;
 			}
 			break;
 		case T_A_Expr:
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index b59a521..e59c60d 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -978,6 +978,16 @@ _outLimit(StringInfo str, const Limit *node)
 }
 
 static void
+_outAsof(StringInfo str, const Asof *node)
+{
+	WRITE_NODE_TYPE("ASOF");
+
+	_outPlanInfo(str, (const Plan *) node);
+
+	WRITE_NODE_FIELD(asofTimestamp);
+}
+
+static void
 _outNestLoopParam(StringInfo str, const NestLoopParam *node)
 {
 	WRITE_NODE_TYPE("NESTLOOPPARAM");
@@ -2127,6 +2137,17 @@ _outLimitPath(StringInfo str, const LimitPath *node)
 }
 
 static void
+_outAsofPath(StringInfo str, const AsofPath *node)
+{
+	WRITE_NODE_TYPE("ASOFPATH");
+
+	_outPathInfo(str, (const Path *) node);
+
+	WRITE_NODE_FIELD(subpath);
+	WRITE_NODE_FIELD(asofTimestamp);
+}
+
+static void
 _outGatherMergePath(StringInfo str, const GatherMergePath *node)
 {
 	WRITE_NODE_TYPE("GATHERMERGEPATH");
@@ -2722,6 +2743,7 @@ _outSelectStmt(StringInfo str, const SelectStmt *node)
 	WRITE_BOOL_FIELD(all);
 	WRITE_NODE_FIELD(larg);
 	WRITE_NODE_FIELD(rarg);
+	WRITE_NODE_FIELD(asofTimestamp);
 }
 
 static void
@@ -2925,6 +2947,7 @@ _outQuery(StringInfo str, const Query *node)
 	WRITE_NODE_FIELD(sortClause);
 	WRITE_NODE_FIELD(limitOffset);
 	WRITE_NODE_FIELD(limitCount);
+	WRITE_NODE_FIELD(asofTimestamp);
 	WRITE_NODE_FIELD(rowMarks);
 	WRITE_NODE_FIELD(setOperations);
 	WRITE_NODE_FIELD(constraintDeps);
@@ -3753,6 +3776,9 @@ outNode(StringInfo str, const void *obj)
 			case T_Limit:
 				_outLimit(str, obj);
 				break;
+			case T_Asof:
+				_outAsof(str, obj);
+				break;
 			case T_NestLoopParam:
 				_outNestLoopParam(str, obj);
 				break;
@@ -4002,6 +4028,9 @@ outNode(StringInfo str, const void *obj)
 			case T_LimitPath:
 				_outLimitPath(str, obj);
 				break;
+			case T_AsofPath:
+				_outAsofPath(str, obj);
+				break;
 			case T_GatherMergePath:
 				_outGatherMergePath(str, obj);
 				break;
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 0d17ae8..f805ea3 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -266,6 +266,7 @@ _readQuery(void)
 	READ_NODE_FIELD(sortClause);
 	READ_NODE_FIELD(limitOffset);
 	READ_NODE_FIELD(limitCount);
+	READ_NODE_FIELD(asofTimestamp);
 	READ_NODE_FIELD(rowMarks);
 	READ_NODE_FIELD(setOperations);
 	READ_NODE_FIELD(constraintDeps);
@@ -2272,6 +2273,21 @@ _readLimit(void)
 }
 
 /*
+ * _readAsof
+ */
+static Asof *
+_readAsof(void)
+{
+	READ_LOCALS(Asof);
+
+	ReadCommonPlan(&local_node->plan);
+
+	READ_NODE_FIELD(asofTimestamp);
+
+	READ_DONE();
+}
+
+/*
  * _readNestLoopParam
  */
 static NestLoopParam *
@@ -2655,6 +2671,8 @@ parseNodeString(void)
 		return_value = _readLockRows();
 	else if (MATCH("LIMIT", 5))
 		return_value = _readLimit();
+	else if (MATCH("ASOF", 4))
+		return_value = _readAsof();
 	else if (MATCH("NESTLOOPPARAM", 13))
 		return_value = _readNestLoopParam();
 	else if (MATCH("PLANROWMARK", 11))
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 0e8463e..9c97018 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2756,7 +2756,7 @@ subquery_is_pushdown_safe(Query *subquery, Query *topquery,
 	SetOperationStmt *topop;
 
 	/* Check point 1 */
-	if (subquery->limitOffset != NULL || subquery->limitCount != NULL)
+	if (subquery->limitOffset != NULL || subquery->limitCount != NULL || subquery->asofTimestamp != NULL)
 		return false;
 
 	/* Check points 3, 4, and 5 */
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index f6c83d0..413a7a7 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -114,6 +114,8 @@ static LockRows *create_lockrows_plan(PlannerInfo *root, LockRowsPath *best_path
 static ModifyTable *create_modifytable_plan(PlannerInfo *root, ModifyTablePath *best_path);
 static Limit *create_limit_plan(PlannerInfo *root, LimitPath *best_path,
 				  int flags);
+static Asof *create_asof_plan(PlannerInfo *root, AsofPath *best_path,
+				  int flags);
 static SeqScan *create_seqscan_plan(PlannerInfo *root, Path *best_path,
 					List *tlist, List *scan_clauses);
 static SampleScan *create_samplescan_plan(PlannerInfo *root, Path *best_path,
@@ -483,6 +485,11 @@ create_plan_recurse(PlannerInfo *root, Path *best_path, int flags)
 											  (LimitPath *) best_path,
 											  flags);
 			break;
+		case T_Asof:
+			plan = (Plan *) create_asof_plan(root,
+											 (AsofPath *) best_path,
+											 flags);
+			break;
 		case T_GatherMerge:
 			plan = (Plan *) create_gather_merge_plan(root,
 													 (GatherMergePath *) best_path);
@@ -2410,6 +2417,29 @@ create_limit_plan(PlannerInfo *root, LimitPath *best_path, int flags)
 	return plan;
 }
 
+/*
+ * create_asof_plan
+ *
+ *	  Create an Asof plan for 'best_path' and (recursively) a plan
+ *	  for its subpath.
+ */
+static Asof *
+create_asof_plan(PlannerInfo *root, AsofPath *best_path, int flags)
+{
+	Asof	   *plan;
+	Plan	   *subplan;
+
+	/* Asof doesn't project, so tlist requirements pass through */
+	subplan = create_plan_recurse(root, best_path->subpath, flags);
+
+	plan = make_asof(subplan,
+					 best_path->asofTimestamp);
+
+	copy_generic_path_info(&plan->plan, (Path *) best_path);
+
+	return plan;
+}
+
 
 /*****************************************************************************
  *
@@ -6385,6 +6415,26 @@ make_limit(Plan *lefttree, Node *limitOffset, Node *limitCount)
 }
 
 /*
+ * make_asof
+ *	  Build an Asof plan node
+ */
+Asof *
+make_asof(Plan *lefttree, Node *asofTimestamp)
+{
+	Asof	   *node = makeNode(Asof);
+	Plan	   *plan = &node->plan;
+
+	plan->targetlist = lefttree->targetlist;
+	plan->qual = NIL;
+	plan->lefttree = lefttree;
+	plan->righttree = NULL;
+
+	node->asofTimestamp = asofTimestamp;
+
+	return node;
+}
+
+/*
  * make_result
  *	  Build a Result plan node
  */
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index e8bc15c..e5c867b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -84,6 +84,7 @@ create_upper_paths_hook_type create_upper_paths_hook = NULL;
 #define EXPRKIND_ARBITER_ELEM		10
 #define EXPRKIND_TABLEFUNC			11
 #define EXPRKIND_TABLEFUNC_LATERAL	12
+#define EXPRKIND_ASOF				13
 
 /* Passthrough data for standard_qp_callback */
 typedef struct
@@ -696,6 +697,9 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	parse->limitCount = preprocess_expression(root, parse->limitCount,
 											  EXPRKIND_LIMIT);
 
+	parse->asofTimestamp = preprocess_expression(root, parse->asofTimestamp,
+												 EXPRKIND_ASOF);
+
 	if (parse->onConflict)
 	{
 		parse->onConflict->arbiterElems = (List *)
@@ -2032,12 +2036,13 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 
 	/*
 	 * If the input rel is marked consider_parallel and there's nothing that's
-	 * not parallel-safe in the LIMIT clause, then the final_rel can be marked
+	 * not parallel-safe in the LIMIT and ASOF clauses, then the final_rel can be marked
 	 * consider_parallel as well.  Note that if the query has rowMarks or is
 	 * not a SELECT, consider_parallel will be false for every relation in the
 	 * query.
 	 */
 	if (current_rel->consider_parallel &&
+		is_parallel_safe(root, parse->asofTimestamp) &&
 		is_parallel_safe(root, parse->limitOffset) &&
 		is_parallel_safe(root, parse->limitCount))
 		final_rel->consider_parallel = true;
@@ -2084,6 +2089,15 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		}
 
 		/*
+		 * If there is a AS OF clause, add the ASOF node.
+		 */
+		if (parse->asofTimestamp)
+		{
+			path = (Path *) create_asof_path(root, final_rel, path,
+											 parse->asofTimestamp);
+		}
+
+		/*
 		 * If this is an INSERT/UPDATE/DELETE, and we're not being called from
 		 * inheritance_planner, add the ModifyTable node.
 		 */
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b5c4124..bc79055 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -700,6 +700,23 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 					fix_scan_expr(root, splan->limitCount, rtoffset);
 			}
 			break;
+		case T_Asof:
+			{
+				Asof	   *splan = (Asof *) plan;
+
+				/*
+				 * Like the plan types above, Asof doesn't evaluate its tlist
+				 * or quals.  It does have a live expression for the AS OF
+				 * timestamp, however; and that cannot contain subplan variable
+				 * refs, so fix_scan_expr works for it.
+				 */
+				set_dummy_tlist_references(plan, rtoffset);
+				Assert(splan->plan.qual == NIL);
+
+				splan->asofTimestamp =
+					fix_scan_expr(root, splan->asofTimestamp, rtoffset);
+			}
+			break;
 		case T_Agg:
 			{
 				Agg		   *agg = (Agg *) plan;
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 2e3abee..c215d3b 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -1602,6 +1602,7 @@ simplify_EXISTS_query(PlannerInfo *root, Query *query)
 		query->hasModifyingCTE ||
 		query->havingQual ||
 		query->limitOffset ||
+		query->asofTimestamp ||
 		query->rowMarks)
 		return false;
 
@@ -2691,6 +2692,11 @@ finalize_plan(PlannerInfo *root, Plan *plan,
 							  &context);
 			break;
 
+		case T_Asof:
+			finalize_primnode(((Asof *) plan)->asofTimestamp,
+							  &context);
+			break;
+
 		case T_RecursiveUnion:
 			/* child nodes are allowed to reference wtParam */
 			locally_added_param = ((RecursiveUnion *) plan)->wtParam;
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 1d7e499..a06806e 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -1443,6 +1443,7 @@ is_simple_subquery(Query *subquery, RangeTblEntry *rte,
 		subquery->distinctClause ||
 		subquery->limitOffset ||
 		subquery->limitCount ||
+		subquery->asofTimestamp ||
 		subquery->hasForUpdate ||
 		subquery->cteList)
 		return false;
@@ -1758,6 +1759,7 @@ is_simple_union_all(Query *subquery)
 	if (subquery->sortClause ||
 		subquery->limitOffset ||
 		subquery->limitCount ||
+		subquery->asofTimestamp ||
 		subquery->rowMarks ||
 		subquery->cteList)
 		return false;
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 6a2d5ad..1372fe5 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -4514,6 +4514,7 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid,
 		querytree->sortClause ||
 		querytree->limitOffset ||
 		querytree->limitCount ||
+		querytree->asofTimestamp ||
 		querytree->setOperations ||
 		list_length(querytree->targetList) != 1)
 		goto fail;
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 54126fb..8a6f057 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -3358,6 +3358,37 @@ create_modifytable_path(PlannerInfo *root, RelOptInfo *rel,
 }
 
 /*
+ * create_asof_path
+ *	  Creates a pathnode that represents applying an AS OF clause
+ */
+AsofPath *
+create_asof_path(PlannerInfo *root, RelOptInfo *rel,
+				 Path *subpath,
+				 Node *asofTimestamp)
+{
+	AsofPath  *pathnode = makeNode(AsofPath);
+	pathnode->path.pathtype = T_Asof;
+	pathnode->path.parent = rel;
+	/* Asof doesn't project, so use source path's pathtarget */
+	pathnode->path.pathtarget = subpath->pathtarget;
+	/* For now, assume we are above any joins, so no parameterization */
+	pathnode->path.param_info = NULL;
+	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_safe = rel->consider_parallel &&
+		subpath->parallel_safe;
+	pathnode->path.parallel_workers = subpath->parallel_workers;
+	pathnode->path.rows = subpath->rows;
+	pathnode->path.startup_cost = subpath->startup_cost;
+	pathnode->path.total_cost = subpath->total_cost;
+	pathnode->path.pathkeys = subpath->pathkeys;
+	pathnode->subpath = subpath;
+	pathnode->asofTimestamp = asofTimestamp;
+
+	return pathnode;
+}
+
+
+/*
  * create_limit_path
  *	  Creates a pathnode that represents performing LIMIT/OFFSET
  *
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index d680d22..aa37f74 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -505,6 +505,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
 									  selectStmt->sortClause != NIL ||
 									  selectStmt->limitOffset != NULL ||
 									  selectStmt->limitCount != NULL ||
+									  selectStmt->asofTimestamp != NULL ||
 									  selectStmt->lockingClause != NIL ||
 									  selectStmt->withClause != NULL));
 
@@ -1266,6 +1267,7 @@ transformSelectStmt(ParseState *pstate, SelectStmt *stmt)
 											EXPR_KIND_OFFSET, "OFFSET");
 	qry->limitCount = transformLimitClause(pstate, stmt->limitCount,
 										   EXPR_KIND_LIMIT, "LIMIT");
+	qry->asofTimestamp = transformAsofClause(pstate, stmt->asofTimestamp);
 
 	/* transform window clauses after we have seen all window functions */
 	qry->windowClause = transformWindowDefinitions(pstate,
@@ -1512,6 +1514,7 @@ transformValuesClause(ParseState *pstate, SelectStmt *stmt)
 											EXPR_KIND_OFFSET, "OFFSET");
 	qry->limitCount = transformLimitClause(pstate, stmt->limitCount,
 										   EXPR_KIND_LIMIT, "LIMIT");
+	qry->asofTimestamp = transformAsofClause(pstate, stmt->asofTimestamp);
 
 	if (stmt->lockingClause)
 		ereport(ERROR,
@@ -1553,6 +1556,7 @@ transformSetOperationStmt(ParseState *pstate, SelectStmt *stmt)
 	List	   *sortClause;
 	Node	   *limitOffset;
 	Node	   *limitCount;
+	Node	   *asofTimestamp;
 	List	   *lockingClause;
 	WithClause *withClause;
 	Node	   *node;
@@ -1598,12 +1602,14 @@ transformSetOperationStmt(ParseState *pstate, SelectStmt *stmt)
 	sortClause = stmt->sortClause;
 	limitOffset = stmt->limitOffset;
 	limitCount = stmt->limitCount;
+	asofTimestamp = stmt->asofTimestamp;
 	lockingClause = stmt->lockingClause;
 	withClause = stmt->withClause;
 
 	stmt->sortClause = NIL;
 	stmt->limitOffset = NULL;
 	stmt->limitCount = NULL;
+	stmt->asofTimestamp = NULL;
 	stmt->lockingClause = NIL;
 	stmt->withClause = NULL;
 
@@ -1747,6 +1753,7 @@ transformSetOperationStmt(ParseState *pstate, SelectStmt *stmt)
 											EXPR_KIND_OFFSET, "OFFSET");
 	qry->limitCount = transformLimitClause(pstate, limitCount,
 										   EXPR_KIND_LIMIT, "LIMIT");
+	qry->asofTimestamp = transformAsofClause(pstate, asofTimestamp);
 
 	qry->rtable = pstate->p_rtable;
 	qry->jointree = makeFromExpr(pstate->p_joinlist, NULL);
@@ -1829,7 +1836,7 @@ transformSetOperationTree(ParseState *pstate, SelectStmt *stmt,
 	{
 		Assert(stmt->larg != NULL && stmt->rarg != NULL);
 		if (stmt->sortClause || stmt->limitOffset || stmt->limitCount ||
-			stmt->lockingClause || stmt->withClause)
+			stmt->lockingClause || stmt->withClause || stmt->asofTimestamp)
 			isLeaf = true;
 		else
 			isLeaf = false;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ebfc94f..6a9821e 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -424,6 +424,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type <ival>	for_locking_strength
 %type <node>	for_locking_item
 %type <list>	for_locking_clause opt_for_locking_clause for_locking_items
+%type <node>    asof_clause
 %type <list>	locked_rels_list
 %type <boolean>	all_or_distinct
 
@@ -605,7 +606,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
 /* ordinary key words in alphabetical order */
 %token <keyword> ABORT_P ABSOLUTE_P ACCESS ACTION ADD_P ADMIN AFTER
-	AGGREGATE ALL ALSO ALTER ALWAYS ANALYSE ANALYZE AND ANY ARRAY AS ASC
+	AGGREGATE ALL ALSO ALTER ALWAYS ANALYSE ANALYZE AND ANY ARRAY AS ASC ASOF
 	ASSERTION ASSIGNMENT ASYMMETRIC AT ATTACH ATTRIBUTE AUTHORIZATION
 
 	BACKWARD BEFORE BEGIN_P BETWEEN BIGINT BINARY BIT
@@ -11129,8 +11130,8 @@ SelectStmt: select_no_parens			%prec UMINUS
 		;
 
 select_with_parens:
-			'(' select_no_parens ')'				{ $$ = $2; }
-			| '(' select_with_parens ')'			{ $$ = $2; }
+			'(' select_no_parens ')'			{ $$ = $2; }
+			| '(' select_with_parens ')'		{ $$ = $2; }
 		;
 
 /*
@@ -11234,7 +11235,7 @@ select_clause:
 simple_select:
 			SELECT opt_all_clause opt_target_list
 			into_clause from_clause where_clause
-			group_clause having_clause window_clause
+			group_clause having_clause window_clause asof_clause
 				{
 					SelectStmt *n = makeNode(SelectStmt);
 					n->targetList = $3;
@@ -11244,11 +11245,12 @@ simple_select:
 					n->groupClause = $7;
 					n->havingClause = $8;
 					n->windowClause = $9;
+					n->asofTimestamp = $10;
 					$$ = (Node *)n;
 				}
 			| SELECT distinct_clause target_list
 			into_clause from_clause where_clause
-			group_clause having_clause window_clause
+			group_clause having_clause window_clause asof_clause
 				{
 					SelectStmt *n = makeNode(SelectStmt);
 					n->distinctClause = $2;
@@ -11259,6 +11261,7 @@ simple_select:
 					n->groupClause = $7;
 					n->havingClause = $8;
 					n->windowClause = $9;
+					n->asofTimestamp = $10;
 					$$ = (Node *)n;
 				}
 			| values_clause							{ $$ = $1; }
@@ -11494,6 +11497,10 @@ opt_select_limit:
 			| /* EMPTY */						{ $$ = list_make2(NULL,NULL); }
 		;
 
+asof_clause: ASOF a_expr						{ $$ = $2; }
+			| /* EMPTY */						{ $$ = NULL; }
+		;
+
 limit_clause:
 			LIMIT select_limit_value
 				{ $$ = $2; }
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 4c4f4cd..6c3e506 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -439,6 +439,7 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
@@ -856,6 +857,7 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 2828bbf..6cdf1af 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -1729,6 +1729,30 @@ transformWhereClause(ParseState *pstate, Node *clause,
 
 
 /*
+ * transformAsofClause -
+ *	  Transform the expression and make sure it is of type timestamptz.
+ *	  Used for ASOF clause.
+ *
+ */
+Node *
+transformAsofClause(ParseState *pstate, Node *clause)
+{
+	Node	   *qual;
+
+	if (clause == NULL)
+		return NULL;
+
+	qual = transformExpr(pstate, clause, EXPR_KIND_ASOF);
+
+	qual = coerce_to_specific_type(pstate, qual, TIMESTAMPTZOID, "ASOF");
+
+	/* ASOF can't refer to any variables of the current query */
+	checkExprIsVarFree(pstate, qual, "ASOF");
+
+	return qual;
+}
+
+/*
  * transformLimitClause -
  *	  Transform the expression and make sure it is of type bigint.
  *	  Used for LIMIT and allied clauses.
diff --git a/src/backend/parser/parse_cte.c b/src/backend/parser/parse_cte.c
index 5160fdb..d326937 100644
--- a/src/backend/parser/parse_cte.c
+++ b/src/backend/parser/parse_cte.c
@@ -942,6 +942,8 @@ checkWellFormedSelectStmt(SelectStmt *stmt, CteState *cstate)
 											   cstate);
 				checkWellFormedRecursionWalker((Node *) stmt->limitCount,
 											   cstate);
+				checkWellFormedRecursionWalker((Node *) stmt->asofTimestamp,
+											   cstate);
 				checkWellFormedRecursionWalker((Node *) stmt->lockingClause,
 											   cstate);
 				/* stmt->withClause is intentionally ignored here */
@@ -961,6 +963,8 @@ checkWellFormedSelectStmt(SelectStmt *stmt, CteState *cstate)
 											   cstate);
 				checkWellFormedRecursionWalker((Node *) stmt->limitCount,
 											   cstate);
+				checkWellFormedRecursionWalker((Node *) stmt->asofTimestamp,
+											   cstate);
 				checkWellFormedRecursionWalker((Node *) stmt->lockingClause,
 											   cstate);
 				/* stmt->withClause is intentionally ignored here */
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 29f9da7..94b6b52 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1814,6 +1814,7 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
 		case EXPR_KIND_DISTINCT_ON:
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 		case EXPR_KIND_RETURNING:
 		case EXPR_KIND_VALUES:
 		case EXPR_KIND_VALUES_SINGLE:
@@ -3445,6 +3446,8 @@ ParseExprKindName(ParseExprKind exprKind)
 			return "LIMIT";
 		case EXPR_KIND_OFFSET:
 			return "OFFSET";
+		case EXPR_KIND_ASOF:
+			return "ASOF";
 		case EXPR_KIND_RETURNING:
 			return "RETURNING";
 		case EXPR_KIND_VALUES:
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e6b0856..a6bcfc7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2250,6 +2250,7 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
diff --git a/src/backend/parser/parse_type.c b/src/backend/parser/parse_type.c
index b032651..fec5ac5 100644
--- a/src/backend/parser/parse_type.c
+++ b/src/backend/parser/parse_type.c
@@ -767,6 +767,7 @@ typeStringToTypeName(const char *str)
 		stmt->sortClause != NIL ||
 		stmt->limitOffset != NULL ||
 		stmt->limitCount != NULL ||
+		stmt->asofTimestamp != NULL ||
 		stmt->lockingClause != NIL ||
 		stmt->withClause != NULL ||
 		stmt->op != SETOP_NONE)
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 245b4cd..63017b9 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -108,6 +108,9 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
 	 */
 	switch (cur_token)
 	{
+		case AS:
+			cur_token_length = 2;
+			break;
 		case NOT:
 			cur_token_length = 3;
 			break;
@@ -155,6 +158,14 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
 	/* Replace cur_token if needed, based on lookahead */
 	switch (cur_token)
 	{
+		case AS:
+		    if (next_token == OF)
+			{
+				cur_token = ASOF;
+				*(yyextra->lookahead_end) = yyextra->lookahead_hold_char;
+				yyextra->have_lookahead = false;
+			}
+			break;
 		case NOT:
 			/* Replace NOT by NOT_LA if it's followed by BETWEEN, IN, etc */
 			switch (next_token)
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index e93552a..1bb37b2 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -2301,8 +2301,8 @@ view_query_is_auto_updatable(Query *viewquery, bool check_cols)
 	if (viewquery->cteList != NIL)
 		return gettext_noop("Views containing WITH are not automatically updatable.");
 
-	if (viewquery->limitOffset != NULL || viewquery->limitCount != NULL)
-		return gettext_noop("Views containing LIMIT or OFFSET are not automatically updatable.");
+	if (viewquery->limitOffset != NULL || viewquery->limitCount != NULL || viewquery->asofTimestamp != NULL)
+		return gettext_noop("Views containing AS OF, LIMIT or OFFSET are not automatically updatable.");
 
 	/*
 	 * We must not allow window functions or set returning functions in the
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 8514c21..330ebfb 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -5115,6 +5115,12 @@ get_select_query_def(Query *query, deparse_context *context,
 		else
 			get_rule_expr(query->limitCount, context, false);
 	}
+	if (query->asofTimestamp != NULL)
+	{
+		appendContextKeyword(context, " AS OF ",
+							 -PRETTYINDENT_STD, PRETTYINDENT_STD, 0);
+		get_rule_expr(query->asofTimestamp, context, false);
+	}
 
 	/* Add FOR [KEY] UPDATE/SHARE clauses if present */
 	if (query->hasForUpdate)
@@ -5503,10 +5509,11 @@ get_setop_query(Node *setOp, Query *query, deparse_context *context,
 
 		Assert(subquery != NULL);
 		Assert(subquery->setOperations == NULL);
-		/* Need parens if WITH, ORDER BY, FOR UPDATE, or LIMIT; see gram.y */
+		/* Need parens if WITH, ORDER BY, AS OF, FOR UPDATE, or LIMIT; see gram.y */
 		need_paren = (subquery->cteList ||
 					  subquery->sortClause ||
 					  subquery->rowMarks ||
+					  subquery->asofTimestamp ||
 					  subquery->limitOffset ||
 					  subquery->limitCount);
 		if (need_paren)
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 2b218e0..621c6a3 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -69,6 +69,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "access/xlog.h"
+#include "access/commit_ts.h"
 #include "storage/bufmgr.h"
 #include "storage/procarray.h"
 #include "utils/builtins.h"
@@ -81,7 +82,6 @@
 SnapshotData SnapshotSelfData = {HeapTupleSatisfiesSelf};
 SnapshotData SnapshotAnyData = {HeapTupleSatisfiesAny};
 
-
 /*
  * SetHintBits()
  *
@@ -1476,7 +1476,17 @@ XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
 {
 	uint32		i;
 
-	/*
+	if (snapshot->asofTimestamp != 0)
+	{
+		TimestampTz ts;
+		if (TransactionIdGetCommitTsData(xid, &ts, NULL))
+		{
+			return timestamptz_cmp_internal(snapshot->asofTimestamp, ts) < 0;
+		}
+	}
+
+
+    /*
 	 * Make a quick range check to eliminate most XIDs without looking at the
 	 * xip arrays.  Note that this is OK even if we convert a subxact XID to
 	 * its parent below, because a subxact with XID < xmin has surely also got
diff --git a/src/include/executor/nodeAsof.h b/src/include/executor/nodeAsof.h
new file mode 100644
index 0000000..2f8e1a2
--- /dev/null
+++ b/src/include/executor/nodeAsof.h
@@ -0,0 +1,23 @@
+/*-------------------------------------------------------------------------
+ *
+ * nodeAsof.h
+ *
+ *
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/executor/nodeAsof.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef NODEASOF_H
+#define NODEASOF_H
+
+#include "nodes/execnodes.h"
+
+extern AsofState *ExecInitAsof(Asof *node, EState *estate, int eflags);
+extern void ExecEndAsof(AsofState *node);
+extern void ExecReScanAsof(AsofState *node);
+
+#endif							/* NODEASOF_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 1a35c5c..42cd037 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -2109,4 +2109,12 @@ typedef struct LimitState
 	TupleTableSlot *subSlot;	/* tuple last obtained from subplan */
 } LimitState;
 
+typedef struct AsofState
+{
+	PlanState	ps;				/* its first field is NodeTag */
+	ExprState  *asofExpr;	    /* AS OF expression */
+	TimestampTz asofTimestamp;  /* AS OF timestamp or 0 if not set */
+	bool        timestampCalculated; /* whether AS OF timestamp was calculated */
+} AsofState;
+
 #endif							/* EXECNODES_H */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index c5b5115..e69a189 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -83,6 +83,7 @@ typedef enum NodeTag
 	T_SetOp,
 	T_LockRows,
 	T_Limit,
+	T_Asof,
 	/* these aren't subclasses of Plan: */
 	T_NestLoopParam,
 	T_PlanRowMark,
@@ -135,6 +136,7 @@ typedef enum NodeTag
 	T_SetOpState,
 	T_LockRowsState,
 	T_LimitState,
+	T_AsofState,
 
 	/*
 	 * TAGS FOR PRIMITIVE NODES (primnodes.h)
@@ -251,6 +253,7 @@ typedef enum NodeTag
 	T_LockRowsPath,
 	T_ModifyTablePath,
 	T_LimitPath,
+	T_AsofPath,
 	/* these aren't subclasses of Path: */
 	T_EquivalenceClass,
 	T_EquivalenceMember,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2eaa6b2..e592418 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -158,6 +158,7 @@ typedef struct Query
 	Node	   *limitOffset;	/* # of result tuples to skip (int8 expr) */
 	Node	   *limitCount;		/* # of result tuples to return (int8 expr) */
 
+	Node       *asofTimestamp;  /* ASOF timestamp */
 	List	   *rowMarks;		/* a list of RowMarkClause's */
 
 	Node	   *setOperations;	/* set-operation tree if this is top level of
@@ -1552,6 +1553,7 @@ typedef struct SelectStmt
 	struct SelectStmt *larg;	/* left child */
 	struct SelectStmt *rarg;	/* right child */
 	/* Eventually add fields for CORRESPONDING spec here */
+	Node*       asofTimestamp;
 } SelectStmt;
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 02fb366..60c66b5 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -931,6 +931,19 @@ typedef struct Limit
 } Limit;
 
 
+/* ----------------
+ *		asof node
+ *
+ */
+typedef struct Asof
+{
+	Plan		plan;
+	Node	   *asofTimestamp;
+} Asof;
+
+
+
+
 /*
  * RowMarkType -
  *	  enums for types of row-marking operations
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 1108b6a..d1ca25e9 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -1694,6 +1694,16 @@ typedef struct LimitPath
 	Node	   *limitCount;		/* COUNT parameter, or NULL if none */
 } LimitPath;
 
+/*
+ * AsofPath represents applying AS OF timestamp qualifier
+ */
+typedef struct AsofPath
+{
+	Path		path;
+	Path	   *subpath;		/* path representing input source */
+	Node	   *asofTimestamp;  /* AS OF timestamp */
+} AsofPath;
+
 
 /*
  * Restriction clause info.
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 99f65b4..9da39f1 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -250,6 +250,9 @@ extern LimitPath *create_limit_path(PlannerInfo *root, RelOptInfo *rel,
 				  Path *subpath,
 				  Node *limitOffset, Node *limitCount,
 				  int64 offset_est, int64 count_est);
+extern AsofPath *create_asof_path(PlannerInfo *root, RelOptInfo *rel,
+				  Path *subpath,
+				  Node *asofTimestamp);
 
 extern Path *reparameterize_path(PlannerInfo *root, Path *path,
 					Relids required_outer,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index d613322..f5e5508 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -65,6 +65,7 @@ extern Agg *make_agg(List *tlist, List *qual,
 		 List *groupingSets, List *chain,
 		 double dNumGroups, Plan *lefttree);
 extern Limit *make_limit(Plan *lefttree, Node *limitOffset, Node *limitCount);
+extern Asof *make_asof(Plan *lefttree, Node *asofTimestamp);
 
 /*
  * prototypes for plan/initsplan.c
diff --git a/src/include/parser/parse_clause.h b/src/include/parser/parse_clause.h
index 1d205c6..d0d3681 100644
--- a/src/include/parser/parse_clause.h
+++ b/src/include/parser/parse_clause.h
@@ -25,6 +25,7 @@ extern Node *transformWhereClause(ParseState *pstate, Node *clause,
 					 ParseExprKind exprKind, const char *constructName);
 extern Node *transformLimitClause(ParseState *pstate, Node *clause,
 					 ParseExprKind exprKind, const char *constructName);
+extern Node *transformAsofClause(ParseState *pstate, Node *clause);
 extern List *transformGroupClause(ParseState *pstate, List *grouplist,
 					 List **groupingSets,
 					 List **targetlist, List *sortClause,
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 565bb3d..46e9c0c 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -68,7 +68,8 @@ typedef enum ParseExprKind
 	EXPR_KIND_TRIGGER_WHEN,		/* WHEN condition in CREATE TRIGGER */
 	EXPR_KIND_POLICY,			/* USING or WITH CHECK expr in policy */
 	EXPR_KIND_PARTITION_EXPRESSION,	/* PARTITION BY expression */
-	EXPR_KIND_CALL				/* CALL argument */
+	EXPR_KIND_CALL,				/* CALL argument */
+	EXPR_KIND_ASOF              /* AS OF */ 
 } ParseExprKind;
 
 
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index bf51977..a00f0d9 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -111,6 +111,7 @@ typedef struct SnapshotData
 	pairingheap_node ph_node;	/* link in the RegisteredSnapshots heap */
 
 	TimestampTz whenTaken;		/* timestamp when snapshot was taken */
+	TimestampTz asofTimestamp;	/* select AS OF timestamp */
 	XLogRecPtr	lsn;			/* position in the WAL stream when taken */
 } SnapshotData;
 
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index fa4d573..2b65848 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -6513,6 +6513,7 @@ exec_simple_check_plan(PLpgSQL_execstate *estate, PLpgSQL_expr *expr)
 		query->sortClause ||
 		query->limitOffset ||
 		query->limitCount ||
+		query->asofTimestamp ||
 		query->setOperations)
 		return;
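
The core of the prototype is the tqual.c hunk: when the snapshot carries an
AS OF timestamp, XidInMVCCSnapshot treats any xid whose commit timestamp is
later than that point as still in progress, i.e. invisible. A standalone
sketch of that rule (the types and names below are simplified stand-ins, not
the real PostgreSQL definitions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the PostgreSQL type (microseconds). */
typedef int64_t TimestampTz;

/*
 * Sketch of the AS OF visibility rule: when the snapshot has an AS OF
 * timestamp (nonzero here) and the xid's commit timestamp is known, the
 * xid is treated as "in snapshot" (invisible) iff it committed after the
 * AS OF point.  Otherwise the normal snapshot checks apply (represented
 * by the caller falling through).
 */
static bool
xid_invisible_for_asof(TimestampTz asof_ts,
                       bool has_commit_ts, TimestampTz commit_ts)
{
    if (asof_ts != 0 && has_commit_ts)
        return asof_ts < commit_ts;   /* committed after AS OF point */
    return false;                     /* no AS OF filtering applies */
}
```

Note that, as in the patch, an xid with no recorded commit timestamp (e.g.
committed before track_commit_timestamp was enabled) falls through to the
ordinary checks.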
 
#15David Fetter
david@fetter.org
In reply to: Konstantin Knizhnik (#14)
Re: AS OF queries

On Thu, Dec 21, 2017 at 05:00:35PM +0300, Konstantin Knizhnik wrote:

On 20.12.2017 19:26, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

On 12/20/17 10:29, Tom Lane wrote:

Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous.

The SQL standard syntax appears to be something like
"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]
That's not going to be fun to parse.

Bleah. In principle we could look two tokens ahead so as to recognize
"AS OF SYSTEM", but base_yylex is already a horrid mess with one-token
lookahead; I don't much want to try to extend it to that.

Possibly the most workable compromise is to use lookahead to convert
"AS OF" to "AS_LA OF", and then we could either just break using OF
as an alias, or add an extra production that allows "AS_LA OF" to
be treated as "AS alias" if it's not followed by the appropriate
stuff.

It's a shame that the SQL committee appears to be so ignorant of
standard parsing technology.

regards, tom lane

Thank you for the suggestion with AS_LA: it really works.
Actually, instead of AS_LA I just return an ASOF token if the next token
after AS is OF.
So now it is possible to write the query this way:

    select * from foo as of timestamp '2017-12-21 14:12:15.1867';
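
The AS_LA trick described above - returning a single ASOF token when the
token after AS is OF - is essentially a one-token-lookahead filter in front
of the parser. A minimal sketch of such a filter, with hypothetical token
codes standing in for the real grammar symbols:

```c
#include <assert.h>

/* Hypothetical token codes; the real ones come from the bison grammar. */
enum Token { TOK_EOF, TOK_AS, TOK_OF, TOK_IDENT, TOK_ASOF };

/* A toy token stream standing in for the core scanner. */
static const enum Token *input;
static enum Token raw_lex(void) { return *input ? *input++ : TOK_EOF; }

static enum Token lookahead;        /* one pushed-back token */
static int have_lookahead = 0;

/*
 * Filter lexer in the spirit of base_yylex(): fetch one token; if it is
 * AS, peek at the next token, and when that is OF, collapse the pair
 * into a single ASOF token; otherwise stash the peeked token for the
 * next call.
 */
static enum Token filtered_lex(void)
{
    enum Token tok;

    if (have_lookahead)
    {
        have_lookahead = 0;
        return lookahead;
    }
    tok = raw_lex();
    if (tok == TOK_AS)
    {
        enum Token next = raw_lex();

        if (next == TOK_OF)
            return TOK_ASOF;
        lookahead = next;           /* not OF: keep it for the next call */
        have_lookahead = 1;
    }
    return tok;
}
```

The real base_yylex() additionally has to restore the scanner's hold
character and location bookkeeping when it consumes the lookahead token, as
the patch hunk in parser.c shows.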

Thanks for your hard work so far on this! It looks really exciting.

There is still one significant difference between my prototype implementation
and the SQL standard: it associates the timestamp with the select
statement, not with a particular table.
That seems more difficult to support, and I am not sure that joining
tables from different timelines makes much sense.

I can think of a use case right offhand that I suspect would be very
common: comparing the state of a table at multiple times.

But certainly that can also be fixed.

That would be really fantastic.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#16Greg Stark
stark@mit.edu
In reply to: Konstantin Knizhnik (#1)
Re: AS OF queries

On 20 December 2017 at 12:45, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:

It seems to me that it will be not so difficult to implement them in
Postgres - we already have versions of tuples.
Looks like we only need to do three things:
1. Disable autovacuum (autovacuum = off)

"The Wheel of Time turns, and Ages come and pass, leaving memories
that become legend. Legend fades to myth, and even myth is long
forgotten when the Age that gave it birth comes again"

I think you'll find it a lot harder to get this to work than just
disabling autovacuum. Notably HOT updates can get cleaned up (and even
non-HOT updates can now leave tombstone dead line pointers iirc) even
if vacuum hasn't run.

We do have the infrastructure to deal with that. c.f.
vacuum_defer_cleanup_age. So in _theory_ you could create a snapshot
with xmin older than recent_global_xmin as long as it's not more than
vacuum_defer_cleanup_age older. But the devil will be in the details.
It does mean that you'll be making recent_global_xmin move backwards
which it has always been promised to *not* do
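
The constraint described here - a historic snapshot's xmin may lag the
normal cleanup horizon by at most vacuum_defer_cleanup_age - can be sketched
with plain integer arithmetic (illustrative names only; real code must use
the wraparound-aware TransactionId comparison macros):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * A historic snapshot may only use an xmin that is no more than
 * defer_cleanup_age transactions older than recent_global_xmin, because
 * only that many transactions' worth of dead tuples is guaranteed to
 * still be present.  Plain arithmetic here; PostgreSQL itself must use
 * TransactionIdPrecedes() and friends to handle xid wraparound.
 */
static bool
historic_xmin_allowed(TransactionId requested_xmin,
                      TransactionId recent_global_xmin,
                      uint32_t defer_cleanup_age)
{
    if (requested_xmin >= recent_global_xmin)
        return true;    /* not older than the normal cleanup horizon */
    return recent_global_xmin - requested_xmin <= defer_cleanup_age;
}
```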

Then there's another issue that logical replication has had to deal
with -- catalog changes. You can't start looking at tuples that have a
different structure than the current catalog unless you can figure out
how to use the logical replication infrastructure to use the old
catalogs. That's a huge problem to bite off and probably can just be
left for another day if you can find a way to reliably detect the
problem and raise an error if the schema is inconsistent.

Postgres used to have time travel. I think it's come up more than once
in the past as something that can probably never come back due to
other decisions made. If more decisions have made it possible again
that will be fascinating.

--
greg

#17Michael Paquier
michael.paquier@gmail.com
In reply to: Greg Stark (#16)
Re: AS OF queries

On Fri, Dec 22, 2017 at 11:08:02PM +0000, Greg Stark wrote:

On 20 December 2017 at 12:45, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:

It seems to me that it will be not so difficult to implement them in
Postgres - we already have versions of tuples.
Looks like we only need to do three things:
1. Disable autovacuum (autovacuum = off)

"The Wheel of Time turns, and Ages come and pass, leaving memories
that become legend. Legend fades to myth, and even myth is long
forgotten when the Age that gave it birth comes again"

I would be amazed if you have been able to finish the 14 volumes of the
series. There is a lot of content to take in.

Postgres used to have time travel. I think it's come up more than once
in the past as something that can probably never come back due to
other decisions made. If more decisions have made it possible again
that will be fascinating.

This subject has been showing up a couple of times lately; it would be
interesting to see where things go. What I am sure about is that people
are not willing to emulate that with triggers and two extra columns per table.
--
Michael

#18konstantin knizhnik
k.knizhnik@postgrespro.ru
In reply to: Greg Stark (#16)
Re: AS OF queries

On Dec 23, 2017, at 2:08 AM, Greg Stark wrote:

On 20 December 2017 at 12:45, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:

It seems to me that it will be not so difficult to implement them in
Postgres - we already have versions of tuples.
Looks like we only need to do three things:
1. Disable autovacuum (autovacuum = off)

"The Wheel of Time turns, and Ages come and pass, leaving memories
that become legend. Legend fades to myth, and even myth is long
forgotten when the Age that gave it birth comes again"

I think you'll find it a lot harder to get this to work than just
disabling autovacuum. Notably HOT updates can get cleaned up (and even
non-HOT updates can now leave tombstone dead line pointers iirc) even
if vacuum hasn't run.

Yeah, I suspected that just disabling autovacuum was not enough.
I have heard (but do not know much) about microvacuum and HOT updates.
This is why I was a little surprised when my test didn't show loss of updated versions.
Maybe it is because of vacuum_defer_cleanup_age.

We do have the infrastructure to deal with that. c.f.
vacuum_defer_cleanup_age. So in _theory_ you could create a snapshot
with xmin older than recent_global_xmin as long as it's not more than
vacuum_defer_cleanup_age older. But the devil will be in the details.
It does mean that you'll be making recent_global_xmin move backwards
which it has always been promised to *not* do

But what if I just forbid changing recent_global_xmin?
What if it is stalled at FirstNormalTransactionId and never changed?
Will that protect all versions from being deleted?

Then there's another issue that logical replication has had to deal
with -- catalog changes. You can't start looking at tuples that have a
different structure than the current catalog unless you can figure out
how to use the logical replication infrastructure to use the old
catalogs. That's a huge problem to bite off and probably can just be
left for another day if you can find a way to reliably detect the
problem and raise an error if the schema is inconsistent.

Yes, catalog changes are another problem of time travel.
I do not know of any suitable way to handle several different catalog snapshots in one query.
But I think there are a lot of cases where time travel without the possibility of database schema changes will still be useful.
The question is how we should handle such catalog changes if they happen. Ideally we should not allow travel back beyond that point.
Unfortunately it is not so easy to implement.


Postgres used to have time travel. I think it's come up more than once
in the past as something that can probably never come back due to
other decisions made. If more decisions have made it possible again
that will be fascinating.

--
greg

#19Alvaro Hernandez
aht@ongres.com
In reply to: Konstantin Knizhnik (#14)
Re: AS OF queries

On 21/12/17 15:00, Konstantin Knizhnik wrote:

On 20.12.2017 19:26, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

On 12/20/17 10:29, Tom Lane wrote:

Please say that's just an Oracle-ism and not SQL standard, because
it's
formally ambiguous.

The SQL standard syntax appears to be something like
"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]
That's not going to be fun to parse.

Bleah.  In principle we could look two tokens ahead so as to recognize
"AS OF SYSTEM", but base_yylex is already a horrid mess with one-token
lookahead; I don't much want to try to extend it to that.

Possibly the most workable compromise is to use lookahead to convert
"AS OF" to "AS_LA OF", and then we could either just break using OF
as an alias, or add an extra production that allows "AS_LA OF" to
be treated as "AS alias" if it's not followed by the appropriate
stuff.

It's a shame that the SQL committee appears to be so ignorant of
standard parsing technology.

            regards, tom lane

Thank you for the suggestion with AS_LA: it really works.
Actually, instead of AS_LA I just return an ASOF token if the next token
after AS is OF.
So now it is possible to write the query this way:

    select * from foo as of timestamp '2017-12-21 14:12:15.1867';

There is still one significant difference between my prototype
implementation and the SQL standard: it associates the timestamp with the
select statement, not with a particular table.
That seems more difficult to support, and I am not sure that
joining tables from different timelines makes much sense.
But certainly that can also be fixed.

    If the standard is "AS OF SYSTEM TIME" and we're going to deviate
and go for "AS OF TIMESTAMP", I'd recommend then, if possible, to:

- Make "TIMESTAMP" optional, i.e., "AS OF [TIMESTAMP] <timestamp>"

- Augment the syntax to also support a transaction id, similar to
Oracle's "AS OF SCN <scn>": "AS OF TRANSACTION <txid>".

    Merry Christmas,

    Álvaro

--

Alvaro Hernandez

-----------
OnGres

#20Craig Ringer
craig@2ndquadrant.com
In reply to: konstantin knizhnik (#18)
Re: AS OF queries

On 24 December 2017 at 04:53, konstantin knizhnik
<k.knizhnik@postgrespro.ru> wrote:

But what if I just forbid changing recent_global_xmin?
What if it is stalled at FirstNormalTransactionId and never changed?
Will that protect all versions from being deleted?

That's totally impractical, you'd have unbounded bloat and a nonfunctional
system in no time.

You'd need a mechanism - akin to what we have with replication slots - to
set a threshold for age.

Then there's another issue that logical replication has had to deal

with -- catalog changes. You can't start looking at tuples that have a
different structure than the current catalog unless you can figure out
how to use the logical replication infrastructure to use the old
catalogs. That's a huge problem to bite off and probably can just be
left for another day if you can find a way to reliably detect the
problem and raise an error if the schema is inconsistent.

Yes, catalog changes are another problem of time travel.
I do not know of any suitable way to handle several different catalog
snapshots in one query.

I doubt it's practical unless you can extract it to subplans that can be
materialized separately. Even then, UDTs, rowtype results, etc...

The question is how we should handle such catalog changes if they
happen. Ideally we should not allow travel back beyond that point.
Unfortunately it is not so easy to implement.

I think you can learn a lot from studying logical decoding here.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#21Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Craig Ringer (#20)
Re: AS OF queries

On 25.12.2017 06:26, Craig Ringer wrote:

On 24 December 2017 at 04:53, konstantin knizhnik
<k.knizhnik@postgrespro.ru> wrote:

But what if I just forbid changing recent_global_xmin?
What if it is stalled at FirstNormalTransactionId and never changed?
Will that protect all versions from being deleted?

That's totally impractical, you'd have unbounded bloat and a
nonfunctional system in no time.

You'd need a mechanism - akin to what we have with replication slots -
to set a threshold for age.

Well, there are systems with "never delete" and "append only" semantics.
For example, I participated in the SciDB project: a database for
scientific applications.
One of the key requirements for scientific research is reproducibility.
From the database point of view this means that we need to store all raw
data and never delete it.
If you performed some measurements and drew some conclusions based on
the results, then everybody should be able to repeat that, even if you
later find errors in the input data and make corrections or just add more
data.
So one of the SciDB requirements was to store all versions. The delete
operation should just mark data as deleted (although later we had to add
true delete:)

But I agree with you: in most cases a more flexible policy for managing
versions is needed.
I am not sure that it should be similar to a logical replication slot.
There the semantics are quite clear: we preserve segments of WAL until
they are replicated to the subscribers.
With time travel the situation is less clear: we may specify some age
threshold - keep data, for example, for one year.
But what if somebody later wants to access older data? At that point it
is already lost...

It seems to me that the version pinning policy mostly depends on the
source of the data.
If it has "append only" semantics (like raw scientific data,
trading data, measurements from IoT sensors...),
then it will be desirable to keep all versions forever.
If we speak about OLTP tables (like accounts in pgbench), then maybe
time travel is not the proper mechanism for such data at all.

I think that in addition to logged/unlogged tables it would be useful to
support historical/non-historical tables. A historical table would
support time travel, while
a non-historical (default) one acts like a normal table. It is already
possible in Postgres to disable autovacuum for particular tables.
But unfortunately the trick with the snapshot (no matter how we set up
the oldest xmin horizon) affects all tables.
There is a similar (but not the same) problem with logical replication:
assume that we need to replicate only one small table. We still have to
pin in WAL all updates of another huge table which is not involved in
logical replication at all.

Then there's another issue that logical replication has had to deal
with -- catalog changes. You can't start looking at tuples that have a
different structure than the current catalog unless you can figure out
how to use the logical replication infrastructure to use the old
catalogs. That's a huge problem to bite off and probably can just be
left for another day if you can find a way to reliably detect the
problem and raise an error if the schema is inconsistent.

Yes, catalog changes are another problem of time travel.
I do not know of any suitable way to handle several different catalog
snapshots in one query.

I doubt it's practical unless you can extract it to subplans that can
be materialized separately. Even then, UDTs, rowtype results, etc...

Well, I am really not sure about users' demands for time travel. This is
one of the reasons for initiating this discussion on hackers... Maybe it
is not the best place for such a discussion, because there are mostly
Postgres developers here and not users...
At least, from the experience of a few SciDB customers, I can tell that we
didn't have problems with schema evolution: mostly the schema is simple,
static and well defined.
There were problems with incorrect imports of data (this is why we had to
add real delete) and with splitting data into chunks (partitioning)...

The question is how we should handle such catalog changes if they
happen. Ideally we should not allow travel back beyond that
point.
Unfortunately it is not so easy to implement.

I think you can learn a lot from studying logical decoding here.

Working with multimaster and shardman I had to learn a lot about
logical replication.
It is a really powerful and flexible mechanism ... with a lot of
limitations and problems: lack of catalog replication, inefficient bulk
inserts, various race conditions...
But I think that time travel and logical replication really serve
different goals and so require different approaches.

--
 Craig Ringer http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#22Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Alvaro Hernandez (#12)
Re: AS OF queries

On Thu, Dec 21, 2017 at 3:57 AM, Alvaro Hernandez <aht@ongres.com> wrote:

On 20/12/17 14:48, Konstantin Knizhnik wrote:

On 20.12.2017 16:12, Laurenz Albe wrote:

Konstantin Knizhnik wrote:

I wonder if Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm).
As far as I know something similar is now developed for MariaDB.

I think that would be a good thing to have that could make
the DBA's work easier - all the requests to restore a table
to the state from an hour ago.

Please notice that it is necessary to configure Postgres in the proper way in
order to be able to perform time travel.

This makes sense. BTW, I believe this feature would be an amazing
addition to PostgreSQL.

If you do not disable autovacuum, then old versions will just be cleaned up.
If transaction commit timestamps are not tracked, then it is not possible to
locate the required timeline.

So the DBA should decide in advance whether this feature is needed or
not.
It is not a proper instrument for restoring/auditing an existing database
which was not configured to keep all versions.

Maybe it is better to add a special configuration parameter for this feature
which would implicitly toggle the
autovacuum and track_commit_timestamp parameters.

Downthread a "moving xid horizon" is proposed. I believe this is not too
user friendly. I'd rather use a timestamp horizon (e.g. "up to 2 days ago").
Given that the commit timestamp is tracked, I don't think this is an issue.
This is the same as the undo_retention in Oracle, which is expressed in
seconds.

I agree, but since we cannot keep the same xid beyond an xid wraparound, we
would have to remove old tuples even if we're still within the time
interval.
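
The wraparound constraint can be made concrete with a small sketch. This is a toy model of PostgreSQL's circular 32-bit xid comparison (in the spirit of TransactionIdPrecedes), not the actual source code; it shows why row versions older than roughly 2^31 transactions cannot simply be kept around for time travel:

```python
# Toy model of circular 32-bit xid comparison (in the spirit of
# PostgreSQL's TransactionIdPrecedes); not the actual source code.
XID_MASK = 0xFFFFFFFF

def xid_precedes(a, b):
    """True if xid a logically precedes xid b under modulo-2^32 arithmetic."""
    diff = (a - b) & XID_MASK
    return diff >= 0x80000000  # high bit set => (a - b) is negative as int32

# Within half the xid space, ordering behaves as expected:
assert xid_precedes(100, 200)
# But once the distance exceeds 2^31, the comparison wraps: a "very old"
# xid no longer reads as being in the past, which is why old tuples must
# be frozen (losing their version history) before that point.
assert not xid_precedes(100, (100 + 2**31 + 1) & XID_MASK)
```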

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#23Hannu Krosing
hkrosing@gmail.com
In reply to: Konstantin Knizhnik (#1)
Re: AS OF queries

On 20.12.2017 14:45, Konstantin Knizhnik wrote:

I wonder if Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm).
As far as I know something similar is now developed for MariaDB.

It seems to me that it will be not so difficult to implement them in
Postgres - we already have versions of tuples.
Looks like we only need to do three things:
1. Disable autovacuum (autovacuum = off)

In the design of the original University Postgres (which was a full
history database geared towards WORM drives)
it was the task of vacuum to move old tuples to "an archive", from where
the AS OF queries would then fetch
them as needed.

This might also be a good place to do Commit LSN to Commit Timestamp
translation
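
That vacuum-to-archive design can be sketched roughly like this. This is a toy model under my own assumptions (the names `vacuum` and `read_as_of` are illustrative, not the historical Postgres code): vacuum migrates superseded versions into an archive store instead of discarding them, and an AS OF read consults the live heap first and then the archive.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    key: int
    value: str
    committed_at: float           # commit time of the inserting transaction
    deleted_at: Optional[float]   # None while this version is current

live = []      # the ordinary heap
archive = []   # where vacuum moves dead versions instead of discarding them

def vacuum():
    """Migrate superseded versions from the live heap to the archive."""
    global live
    archive.extend(v for v in live if v.deleted_at is not None)
    live = [v for v in live if v.deleted_at is None]

def read_as_of(key, ts):
    """Return the value visible at time ts, checking live then archive."""
    for store in (live, archive):
        for v in store:
            if (v.key == key and v.committed_at <= ts
                    and (v.deleted_at is None or v.deleted_at > ts)):
                return v.value
    return None

live.append(Version(1, "v1", committed_at=1.0, deleted_at=5.0))  # superseded
live.append(Version(1, "v2", committed_at=5.0, deleted_at=None))
vacuum()                            # v1 migrates to the archive
assert read_as_of(1, 3.0) == "v1"   # the AS OF read fetches from the archive
assert read_as_of(1, 6.0) == "v2"   # a current read stays on the live heap
```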

Hannu

2. Enable commit timestamp (track_commit_timestamp = on)
3. Add asofTimestamp to snapshot and patch XidInMVCCSnapshot to
compare commit timestamps when it is specified in snapshot.

Attached please find my prototype implementation of it.
Most of the efforts are needed to support asof timestamp in grammar
and add it to query plan.
I failed to support AS OF clause (as in Oracle) because of
shift-reduce conflicts with aliases,
so I have to introduce new ASOF keyword. May be yacc experts can
propose how to solve this conflict without introducing new keyword...

Please notice that now ASOF timestamp is used only for data snapshot,
not for catalog snapshot.
I am not sure that it is possible (and useful) to travel through
database schema history...

Below is an example of how it works:

postgres=# create table foo(pk serial primary key, ts timestamp
default now(), val text);
CREATE TABLE
postgres=# insert into foo (val) values ('insert');
INSERT 0 1
postgres=# insert into foo (val) values ('insert');
INSERT 0 1
postgres=# insert into foo (val) values ('insert');
INSERT 0 1
postgres=# select * from foo;
 pk |             ts             |  val
----+----------------------------+--------
  1 | 2017-12-20 14:59:17.715453 | insert
  2 | 2017-12-20 14:59:22.933753 | insert
  3 | 2017-12-20 14:59:27.87712  | insert
(3 rows)

postgres=# select * from foo asof timestamp '2017-12-20 14:59:25';
 pk |             ts             |  val
----+----------------------------+--------
  1 | 2017-12-20 14:59:17.715453 | insert
  2 | 2017-12-20 14:59:22.933753 | insert
(2 rows)

postgres=# select * from foo asof timestamp '2017-12-20 14:59:20';
 pk |             ts             |  val
----+----------------------------+--------
  1 | 2017-12-20 14:59:17.715453 | insert
(1 row)

postgres=# update foo set val='upd',ts=now() where pk=1;
UPDATE 1
postgres=# select * from foo asof timestamp '2017-12-20 14:59:20';
 pk |             ts             |  val
----+----------------------------+--------
  1 | 2017-12-20 14:59:17.715453 | insert
(1 row)

postgres=# select * from foo;
 pk |             ts             |  val
----+----------------------------+--------
  2 | 2017-12-20 14:59:22.933753 | insert
  3 | 2017-12-20 14:59:27.87712  | insert
  1 | 2017-12-20 15:09:17.046047 | upd
(3 rows)

postgres=# update foo set val='upd2',ts=now() where pk=1;
UPDATE 1
postgres=# select * from foo asof timestamp '2017-12-20 15:10';
 pk |             ts             |  val
----+----------------------------+--------
  2 | 2017-12-20 14:59:22.933753 | insert
  3 | 2017-12-20 14:59:27.87712  | insert
  1 | 2017-12-20 15:09:17.046047 | upd
(3 rows)

Comments and feedback are welcome:)

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
https://2ndquadrant.com/

#24Jeff Janes
jeff.janes@gmail.com
In reply to: Konstantin Knizhnik (#14)
Re: AS OF queries

On Thu, Dec 21, 2017 at 6:00 AM, Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> wrote:

There is still one significant difference between my prototype implementation
and the SQL standard: it associates the timestamp with the select statement,
not with a particular table.
It seems to be more difficult to support, and I am not sure that joining
tables from different timelines makes much sense.
But certainly it can also be fixed.

I think the main use I would find for this feature is something like:

select * from foo except select * from foo as old_foo as of '<some time>';

So I would be grateful if you can make that work. Also, I think conforming
to the standards is pretty important where it is feasible to do that.

Cheers,

Jeff

#25legrand legrand
legrand_legrand@hotmail.com
In reply to: Jeff Janes (#24)
Re: AS OF queries

Would the actual syntax

WITH old_foo AS
(select * from foo as of '<some time>')
select * from foo except select * from old_foo;

work as a replacement for

select * from foo except select * from foo as old_foo as of '<some time>';

?

Regards
PAscal

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

#26David Fetter
david@fetter.org
In reply to: legrand legrand (#25)
Re: AS OF queries

On Tue, Dec 26, 2017 at 03:43:36PM -0700, legrand legrand wrote:

Would the actual syntax

WITH old_foo AS
(select * from foo as of '<some time>')
select * from foo except select * from old_foo;

work as a replacement for

select * from foo except select * from foo as old_foo as of '<some time>';

?

If there has to be a WITH, or (roughly) equivalently, a sub-select for
each relation, the queries get very hairy very quickly. It would
nevertheless be better for the people who need the feature to have it
this way than not to have it at all.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#27Craig Ringer
craig@2ndquadrant.com
In reply to: Konstantin Knizhnik (#21)
Re: AS OF queries

On 25 December 2017 at 15:59, Konstantin Knizhnik <k.knizhnik@postgrespro.ru

wrote:

On 25.12.2017 06:26, Craig Ringer wrote:

On 24 December 2017 at 04:53, konstantin knizhnik <
k.knizhnik@postgrespro.ru> wrote:

But what if I just forbid to change recent_global_xmin?
If it is stalled at FirstNormalTransactionId and never changed?
Will it protect all versions from being deleted?

That's totally impractical, you'd have unbounded bloat and a nonfunctional
system in no time.

You'd need a mechanism - akin to what we have with replication slots - to
set a threshold for age.

Well, there are systems with "never delete" and "append only" semantics.
For example, I participated in the SciDB project: a database for scientific
applications.
One of the key requirements of scientific research is reproducibility.
From the database point of view this means that we need to store all raw
data and never delete it.

PostgreSQL can't cope with that for more than 2^31 xacts, you have to
"forget" details of which xacts created/updated tuples and the contents of
deleted tuples, or you exceed our xid limit. You'd need 64-bit XIDs, or a
redo-buffer based heap model (like the zheap stuff) with redo buffers
marked with an xid epoch, or something like that.

I am not sure that it should be similar to logical replication slots.

Here the semantics are quite clear: we preserve segments of WAL until they
are replicated to the subscribers.

Er, what?

This isn't to do with restart_lsn. That's why I mentioned *logical*
replication slots.

I'm talking about how they interact with GetOldestXmin using their xmin and
catalog_xmin.

You probably won't want to re-use slots, but you'll want something akin to
that, a transaction age threshold. Otherwise your system has a finite end
date where it can no longer function due to xid count, or if you solve
that, it'll slowly choke on table bloat etc. I guess if you're willing to
accept truly horrible performance...

With time travel the situation is less obscure: we may specify some threshold
for age - keep data, for example, for one year.

Sure. You'd likely do that by mapping commit timestamps => xids and using
an xid threshold though.
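
A rough sketch of that mapping (my own illustration, not PostgreSQL code, and it assumes for simplicity that xid order tracks commit-timestamp order): binary-search a commit log by timestamp to turn a retention threshold such as "one year ago" into an xid horizon up to which vacuum may clean.

```python
import bisect

# Toy commit log ordered by commit timestamp: (commit_ts, xid) pairs.
# Real systems must cope with xid assignment order differing from commit
# order; this sketch ignores that.
commits = [(100.0, 700), (200.0, 750), (300.0, 900), (400.0, 1100)]

def xid_horizon(cutoff_ts):
    """Largest xid committed strictly before cutoff_ts; versions from
    xids at or below the horizon are eligible for cleanup."""
    timestamps = [ts for ts, _ in commits]
    i = bisect.bisect_left(timestamps, cutoff_ts)
    return commits[i - 1][1] if i > 0 else 0

assert xid_horizon(250.0) == 750   # commits at t=100 and t=200 have expired
assert xid_horizon(50.0) == 0      # nothing is old enough to clean up yet
```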

But unfortunately the trick with the snapshot (no matter how we set up the
oldest xmin horizon) affects all tables.

You'd need to be able to pass more info into HeapTupleSatisfiesMVCC etc. I
expect you'd probably add a new snapshot type (like logical decoding did
with historic snapshots), that has a new Satisfies function. But you'd have
to be able to ensure all snapshot Satisfies callers had the required extra
info - like maybe a Relation - which could be awkward for some call sites.

The user would have to be responsible for ensuring sanity of FK
relationships etc when specifying different snapshots for different
relations.

Per-relation time travel doesn't seem totally impractical so long as you
can guarantee that there is some possible snapshot for which the catalogs
defining all the relations and types are simultaneously valid, i.e. there's
no disjoint set of catalog changes. Avoiding messy performance implications
with normal queries might not even be too bad if you use a separate
snapshot model, so long as you can avoid callsites having to do extra work
in the normal case.

Dealing with dropped columns and rewrites would be a pain though. You'd
have to preserve the dropped column data when you re-projected the rewrite
tuples.

There is a similar (but not the same) problem with logical replication:
assume that we need to replicate only one small table. But we have to pin
in WAL all updates of another huge table which is not involved in logical
replication at all.

I don't really see how that's similar. It's concerned with WAL, whereas what
you're looking at is heaps and bloat from old versions. Completely
different, unless you propose to somehow reconstruct data from old WAL to
do historic queries, which would be o_O ...

Well, I am really not sure about users' demand for time travel. This is
one of the reasons for initiating this discussion on hackers... Maybe it is
not the best place for such a discussion, because the audience here is mostly
Postgres developers, not users...
At least, from the experience of a few SciDB customers, I can tell that we
didn't have problems with schema evolution: mostly the schema is simple,
static and well defined.
There were problems with incorrect import of data (this is why we had to
add real delete) and with splitting data into chunks (partitioning)...

Every system I've ever worked with that has a "static" schema has landed up
not being so static after all.

I'm sure there are exceptions, but if you can't cope with catalog changes
you've excluded the immense majority of users. Even the ones who promise
they don't ever need to change anything ... land up changing things.

The question is how we should handle such catalog changes if they

happen. Ideally we should not allow moving back beyond this point.
Unfortunately it is not so easy to implement.

I think you can learn a lot from studying logical decoding here.

Working on multimaster and shardman I had to learn a lot about logical
replication.
It is a really powerful and flexible mechanism ... with a lot of limitations
and problems: lack of catalog replication, inefficient bulk insert, various
race conditions,...
But I think that time travel and logical replication really serve
different goals and so require different approaches.

Of course. I'm pointing out that we solve the catalog-change problem using
historic snapshots, and THAT is what you'd be wanting to look at. Also what
it does with the rewrite map.

However, you'd have a nightmare of a time getting the syscache to deliver
you different data depending on which table's catalogs you're looking for.
And what if there's some UDT that appears in >1 table with different AS OF
times, but with different definitions at different times? Yuck.

More importantly you can't construct a historic snapshot at some arbitrary
point in time. It depends on the maintenance of state that's done with
logical decoding and xlogreader. So I don't know how you'd construct a
historic snapshot for "June 24 at 2:01 am".

Ignoring concerns with catalog changes sounds convenient but in practice
it's a total waste of time IMO. If nothing else there's temp tables to deal
with.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#28Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Craig Ringer (#27)
Re: AS OF queries

On 27.12.2017 10:29, Craig Ringer wrote:

On 25 December 2017 at 15:59, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>> wrote:

On 25.12.2017 06:26, Craig Ringer wrote:

On 24 December 2017 at 04:53, konstantin knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>> wrote:

But what if I just forbid to change recent_global_xmin?
If it is stalled at FirstNormalTransactionId and never changed?
Will it protect all versions from being deleted?

That's totally impractical, you'd have unbounded bloat and a
nonfunctional system in no time.

You'd need a mechanism - akin to what we have with replication
slots - to set a threshold for age.

Well, there are systems with "never delete" and "append only"
semantics.
For example, I participated in the SciDB project: a database for
scientific applications.
One of the key requirements of scientific research is
reproducibility.
From the database point of view this means that we need to store all
raw data and never delete it.

PostgreSQL can't cope with that for more than 2^31 xacts, you have to
"forget" details of which xacts created/updated tuples and the
contents of deleted tuples, or you exceed our xid limit. You'd need
64-bit XIDs, or a redo-buffer based heap model (like the zheap stuff)
with redo buffers marked with an xid epoch, or something like that.

Yes, but PgPro-EE already has 64-bit xids and we have spent a lot of
time trying to push it to the community.

I am not sure that it should be similar to logical replication slots.

Here the semantics are quite clear: we preserve segments of WAL until
they are replicated to the subscribers.

Er, what?

This isn't to do with restart_lsn. That's why I mentioned *logical*
replication slots.

I'm talking about how they interact with GetOldestXmin using their
xmin and catalog_xmin.

You probably won't want to re-use slots, but you'll want something
akin to that, a transaction age threshold. Otherwise your system has a
finite end date where it can no longer function due to xid count, or
if you solve that, it'll slowly choke on table bloat etc. I guess if
you're willing to accept truly horrible performance...

Definitely, supporting time travel through frequently updated data may
cause database bloat and awful performance.
I still think that this feature will be mostly interesting for
append-only/rarely updated data.

In any case, I set vacuum_defer_cleanup_age = 1000000 and ran
pgbench for a while.
There was no significant performance degradation.

Unfortunately neither replication slots nor vacuum_defer_cleanup_age
allow keeping versions just for particular table(s).
And that seems to be the major problem; I do not know how to solve it now.

With time travel the situation is less obscure: we may specify some
threshold for age - keep data, for example, for one year.

Sure. You'd likely do that by mapping commit timestamps => xids and
using an xid threshold though.

But unfortunately the trick with the snapshot (no matter how we set up
the oldest xmin horizon) affects all tables.

You'd need to be able to pass more info into HeapTupleSatisfiesMVCC
etc. I expect you'd probably add a new snapshot type (like logical
decoding did with historic snapshots), that has a new Satisfies
function. But you'd have to be able to ensure all snapshot Satisfies
callers had the required extra info - like maybe a Relation - which
could be awkward for some call sites.

Yes, it seems to be the only possible choice.

The user would have to be responsible for ensuring sanity of FK
relationships etc when specifying different snapshots for different
relations.

Per-relation time travel doesn't seem totally impractical so long as
you can guarantee that there is some possible snapshot for which the
catalogs defining all the relations and types are simultaneously
valid, i.e. there's no disjoint set of catalog changes. Avoiding messy
performance implications with normal queries might not even be too bad
if you use a separate snapshot model, so long as you can avoid
callsites having to do extra work in the normal case.

Dealing with dropped columns and rewrites would be a pain though.
You'd have to preserve the dropped column data when you re-projected
the rewrite tuples.

There is similar (but not the same) problem with logical
replication: assume that we need to replicate only one small
table. But we have to pin in WAL all updates of other huge table
which is not involved in logical replication at all.

I don't really see how that's similar. It's concerned with WAL, whereas
what you're looking at is heaps and bloat from old versions.
Completely different, unless you propose to somehow reconstruct data
from old WAL to do historic queries, which would be o_O ...

Well, I am really not sure about users' demand for time travel.
This is one of the reasons for initiating this discussion on
hackers... Maybe it is not the best place for such a discussion,
because the audience here is mostly Postgres developers, not users...
At least, from the experience of a few SciDB customers, I can tell that
we didn't have problems with schema evolution: mostly the schema is
simple, static and well defined.
There were problems with incorrect import of data (this is why we
had to add real delete) and with splitting data into chunks
(partitioning)...

Every system I've ever worked with that has a "static" schema has
landed up not being so static after all.

I'm sure there are exceptions, but if you can't cope with catalog
changes you've excluded the immense majority of users. Even the ones
who promise they don't ever need to change anything ... land up
changing things.

JSON? :)

The question is how we should handle such catalog changes if
they happen. Ideally we should not allow moving back
beyond this point.
Unfortunately it is not so easy to implement.

I think you can learn a lot from studying logical decoding here.

Working on multimaster and shardman I had to learn a lot about
logical replication.
It is a really powerful and flexible mechanism ... with a lot of
limitations and problems: lack of catalog replication, inefficient
bulk insert, various race conditions,...
But I think that time travel and logical replication really serve
different goals and so require different approaches.

Of course. I'm pointing out that we solve the catalog-change problem
using historic snapshots, and THAT is what you'd be wanting to look
at. Also what it does with the rewrite map.

However, you'd have a nightmare of a time getting the syscache to
deliver you different data depending on which table's catalogs you're
looking for. And what if there's some UDT that appears in >1 table
with different AS OF times, but with different definitions at
different times? Yuck.

More importantly you can't construct a historic snapshot at some
arbitrary point in time. It depends on the maintenance of state that's
done with logical decoding and xlogreader. So I don't know how you'd
construct a historic snapshot for "June 24 at 2:01 am".

Ignoring concerns with catalog changes sounds convenient but in
practice it's a total waste of time IMO. If nothing else there's temp
tables to deal with.

Assume we have the query

select * from A as old_a as of timestamp '2016-12-01', A as new_a as of
timestamp '2017-12-01' where old_a.old_id = new_a.new_id;

where the schema of A was changed during this year. We would have to carefully
specify proper historical snapshots in all the places where the parser and
optimizer deal with these tables...
I am afraid that it will be too complicated.

--
 Craig Ringer http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#29PostgreSQL - Hans-Jürgen Schönig
In reply to: Konstantin Knizhnik (#1)
Re: AS OF queries

On 12/20/2017 01:45 PM, Konstantin Knizhnik wrote:

I wonder if Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm).
As far as I know something similar is now developed for MariaDB.

It seems to me that it will be not so difficult to implement them in
Postgres - we already have versions of tuples.
Looks like we only need to do three things:
1. Disable autovacuum (autovacuum = off)
2. Enable commit timestamp (track_commit_timestamp = on)
3. Add asofTimestamp to snapshot and patch XidInMVCCSnapshot to
compare commit timestamps when it is specified in snapshot.

That sounds really awesome ... I would love to see that.
My question is: while MVCC is fine when a tuple is still there ...
what are you going to do with TRUNCATE and so on?
It is not uncommon for a table to be truncated frequently. In this case
MVCC won't help.
What are your thoughts on this?

    many thanks,

        hans

--
Hans-Jürgen Schönig
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com

#30Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: PostgreSQL - Hans-Jürgen Schönig (#29)
Re: AS OF queries

On 27.12.2017 17:14, PostgreSQL - Hans-Jürgen Schönig wrote:

On 12/20/2017 01:45 PM, Konstantin Knizhnik wrote:

I wonder if Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm).
As far as I know something similar is now developed for MariaDB.

It seems to me that it will be not so difficult to implement them in
Postgres - we already have versions of tuples.
Looks like we only need to do three things:
1. Disable autovacuum (autovacuum = off)
2. Enable commit timestamp (track_commit_timestamp = on)
3. Add asofTimestamp to snapshot and patch XidInMVCCSnapshot to
compare commit timestamps when it is specified in snapshot.

That sounds really awesome ... I would love to see that.
My question is: while MVCC is fine when a tuple is still there ...
what are you going to do with TRUNCATE and so on?
It is not uncommon for a table to be truncated frequently. In this case
MVCC won't help.
What are your thoughts on this?

You should not use drop/truncate if you want to access old versions :)
Yes, truncate is much faster than delete, but that is because it
operates at the file level.
I think that it is quite a natural limitation.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#31Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Jeff Janes (#24)
1 attachment(s)
Re: AS OF queries

On 27.12.2017 00:52, Jeff Janes wrote:

On Thu, Dec 21, 2017 at 6:00 AM, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>> wrote:

There is still one significant difference between my prototype
implementation and the SQL standard: it associates the timestamp
with the select statement, not with a particular table.
It seems to be more difficult to support, and I am not sure that
joining tables from different timelines makes much sense.
But certainly it can also be fixed.

I think the main use I would find for this feature is something like:

select * from foo except select * from foo as old_foo as of '<some time>';

So I would be grateful if you can make that work. Also, I think
conforming to the standards is pretty important where it is feasible
to do that.

Cheers,

Jeff

I attach a new version of the patch which supports the "standard" syntax,
where the AS OF clause is associated with a table reference.
So it is possible to write a query like:

    select * from SomeTable as t as of timestamp '2017-12-27 14:54:40'
where id=100;

Also I introduced a "time_travel" GUC which implicitly assigns some other
GUCs:

        track_commit_timestamp = true;
        vacuum_defer_cleanup_age = 1000000000;
        vacuum_freeze_min_age = 1000000000;
        autovacuum_freeze_max_age = 2000000000;
        autovacuum_multixact_freeze_max_age = 2000000000;
        autovacuum_start_daemon = false;

So it disables autovacuum and microvacuum and enables commit timestamp
tracking.
It provides access to the past for up to a billion transactions.

There is still no way to keep all versions only for particular tables, or
to truncate versions that are too old.
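
For readers following along, the core visibility change the patch makes can be modeled in a few lines. This is a simplification under my own assumptions, not the patch itself: a committed xid is filtered out of the snapshot when its commit timestamp is later than the snapshot's asofTimestamp.

```python
from typing import Optional

# Toy commit-timestamp log, standing in for track_commit_timestamp data.
commit_ts = {10: 100.0, 11: 200.0, 12: 300.0}

def xid_visible(xid, asof_timestamp: Optional[float]):
    """Model of the extra check patched into XidInMVCCSnapshot: with no
    AS OF time set, every committed xid is visible as usual; with one
    set, later commits are treated as if still in progress."""
    if xid not in commit_ts:
        return False                      # aborted/unknown xact: invisible
    if asof_timestamp is None:
        return True
    return commit_ts[xid] <= asof_timestamp

assert xid_visible(10, 150.0)        # committed before the AS OF time
assert not xid_visible(12, 150.0)    # committed after it: filtered out
assert xid_visible(12, None)         # an ordinary snapshot sees everything
```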

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

asof-3.patch (text/x-patch)
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index eb5bbb5..7acaf30 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -78,6 +78,7 @@ BitmapHeapNext(BitmapHeapScanState *node)
 	ExprContext *econtext;
 	HeapScanDesc scan;
 	TIDBitmap  *tbm;
+	EState	   *estate;
 	TBMIterator *tbmiterator = NULL;
 	TBMSharedIterator *shared_tbmiterator = NULL;
 	TBMIterateResult *tbmres;
@@ -85,11 +86,13 @@ BitmapHeapNext(BitmapHeapScanState *node)
 	TupleTableSlot *slot;
 	ParallelBitmapHeapState *pstate = node->pstate;
 	dsa_area   *dsa = node->ss.ps.state->es_query_dsa;
+	TimestampTz     outerAsofTimestamp = 0;
 
 	/*
 	 * extract necessary information from index scan node
 	 */
 	econtext = node->ss.ps.ps_ExprContext;
+	estate = node->ss.ps.state;
 	slot = node->ss.ss_ScanTupleSlot;
 	scan = node->ss.ss_currentScanDesc;
 	tbm = node->tbm;
@@ -99,6 +102,25 @@ BitmapHeapNext(BitmapHeapScanState *node)
 		shared_tbmiterator = node->shared_tbmiterator;
 	tbmres = node->tbmres;
 
+	outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+
+	if (node->ss.asofExpr)
+	{
+		if (!node->ss.asofTimestampSet)
+		{
+			Datum		val;
+			bool		isNull;
+
+			val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+											node->ss.ps.ps_ExprContext,
+											&isNull);
+			/* Interpret NULL timestamp as no timestamp */
+			node->ss.asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+			node->ss.asofTimestampSet = true;
+		}
+		estate->es_snapshot->asofTimestamp = node->ss.asofTimestamp;
+	}
+
 	/*
 	 * If we haven't yet performed the underlying index scan, do it, and begin
 	 * the iteration over the bitmap.
@@ -364,11 +386,21 @@ BitmapHeapNext(BitmapHeapScanState *node)
 			}
 		}
 
-		/* OK to return this tuple */
+		/*
+		 * Restore ASOF timestamp for the current snapshot
+		 */
+		estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
+	    /* OK to return this tuple */
 		return slot;
 	}
 
 	/*
+	 * Restore ASOF timestamp for the current snapshot
+	 */
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
+	/*
 	 * if we get here it means we are at the end of the scan..
 	 */
 	return ExecClearTuple(slot);
@@ -746,6 +778,8 @@ ExecReScanBitmapHeapScan(BitmapHeapScanState *node)
 {
 	PlanState  *outerPlan = outerPlanState(node);
 
+	node->ss.asofTimestampSet = false;
+
 	/* rescan to release any page pin */
 	heap_rescan(node->ss.ss_currentScanDesc, NULL);
 
@@ -902,7 +936,8 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
 	 * most cases it's probably not worth working harder than that.
 	 */
 	scanstate->can_skip_fetch = (node->scan.plan.qual == NIL &&
-								 node->scan.plan.targetlist == NIL);
+								 node->scan.plan.targetlist == NIL &&
+								 node->scan.asofTimestamp == NULL);
 
 	/*
 	 * Miscellaneous initialization
@@ -920,6 +955,18 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
 		ExecInitQual(node->bitmapqualorig, (PlanState *) scanstate);
 
 	/*
+	 * Initialize AS OF expression, if any
+	 */
+	if (node->scan.asofTimestamp)
+	{
+		scanstate->ss.asofExpr = ExecInitExpr((Expr *) node->scan.asofTimestamp,
+										   &scanstate->ss.ps);
+		scanstate->ss.asofTimestampSet = false;
+	}
+	else
+		scanstate->ss.asofExpr = NULL;
+
+	/*
 	 * tuple table initialization
 	 */
 	ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
@@ -1052,11 +1099,27 @@ ExecBitmapHeapInitializeDSM(BitmapHeapScanState *node,
 	ParallelBitmapHeapState *pstate;
 	EState	   *estate = node->ss.ps.state;
 	dsa_area   *dsa = node->ss.ps.state->es_query_dsa;
+	TimestampTz     outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	Scan* scan = (Scan*)node->ss.ps.plan;
 
 	/* If there's no DSA, there are no workers; initialize nothing. */
 	if (dsa == NULL)
 		return;
 
+	if (scan->asofTimestamp)
+	{
+		Datum		val;
+		bool		isNull;
+
+		ExprState* asofExpr = ExecInitExpr((Expr *) scan->asofTimestamp,
+									  &node->ss.ps);
+		val = ExecEvalExprSwitchContext(asofExpr,
+										node->ss.ps.ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		estate->es_snapshot->asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+	}
+
 	pstate = shm_toc_allocate(pcxt->toc, node->pscan_len);
 
 	pstate->tbmiterator = 0;
@@ -1071,6 +1134,8 @@ ExecBitmapHeapInitializeDSM(BitmapHeapScanState *node,
 	ConditionVariableInit(&pstate->cv);
 	SerializeSnapshot(estate->es_snapshot, pstate->phs_snapshot_data);
 
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pstate);
 	node->pstate = pstate;
 }
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index 2ffef23..a0b505c 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -86,7 +86,7 @@ IndexNext(IndexScanState *node)
 	IndexScanDesc scandesc;
 	HeapTuple	tuple;
 	TupleTableSlot *slot;
-
+	TimestampTz outerAsofTimestamp;
 	/*
 	 * extract necessary information from index scan node
 	 */
@@ -104,6 +104,30 @@ IndexNext(IndexScanState *node)
 	econtext = node->ss.ps.ps_ExprContext;
 	slot = node->ss.ss_ScanTupleSlot;
 
+	outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+
+	if (node->ss.asofExpr)
+	{
+		if (!node->ss.asofTimestampSet)
+		{
+			Datum		val;
+			bool		isNull;
+
+			val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+											node->ss.ps.ps_ExprContext,
+											&isNull);
+			/* Interpret NULL timestamp as no timestamp */
+			if (isNull)
+				node->ss.asofTimestamp = 0;
+			else
+			{
+				node->ss.asofTimestamp = DatumGetInt64(val);
+			}
+			node->ss.asofTimestampSet = true;
+		}
+		estate->es_snapshot->asofTimestamp = node->ss.asofTimestamp;
+	}
+
 	if (scandesc == NULL)
 	{
 		/*
@@ -160,9 +184,17 @@ IndexNext(IndexScanState *node)
 				continue;
 			}
 		}
+		/*
+		 * Restore ASOF timestamp for the current snapshot
+		 */
+		estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
 
 		return slot;
 	}
+	/*
+	 * Restore ASOF timestamp for the current snapshot
+	 */
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
 
 	/*
 	 * if we get here it means the index scan failed so we are at the end of
@@ -578,6 +610,8 @@ ExecIndexScan(PlanState *pstate)
 void
 ExecReScanIndexScan(IndexScanState *node)
 {
+	node->ss.asofTimestampSet = false;
+
 	/*
 	 * If we are doing runtime key calculations (ie, any of the index key
 	 * values weren't simple Consts), compute the new key values.  But first,
@@ -918,6 +952,18 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
 		ExecInitExprList(node->indexorderbyorig, (PlanState *) indexstate);
 
 	/*
+	 * Initialize AS OF expression, if any
+	 */
+	if (node->scan.asofTimestamp)
+	{
+		indexstate->ss.asofExpr = ExecInitExpr((Expr *) node->scan.asofTimestamp,
+											&indexstate->ss.ps);
+		indexstate->ss.asofTimestampSet = false;
+	}
+	else
+		indexstate->ss.asofExpr = NULL;
+
+	/*
 	 * tuple table initialization
 	 */
 	ExecInitResultTupleSlot(estate, &indexstate->ss.ps);
@@ -1672,12 +1718,30 @@ ExecIndexScanInitializeDSM(IndexScanState *node,
 {
 	EState	   *estate = node->ss.ps.state;
 	ParallelIndexScanDesc piscan;
+	TimestampTz outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	Scan* scan = (Scan*)node->ss.ps.plan;
+
+	if (scan->asofTimestamp)
+	{
+		Datum		val;
+		bool		isNull;
+
+		node->ss.asofExpr = ExecInitExpr((Expr *) scan->asofTimestamp,
+									  &node->ss.ps);
+		val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+										node->ss.ps.ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		estate->es_snapshot->asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+	}
 
 	piscan = shm_toc_allocate(pcxt->toc, node->iss_PscanLen);
 	index_parallelscan_initialize(node->ss.ss_currentRelation,
 								  node->iss_RelationDesc,
 								  estate->es_snapshot,
 								  piscan);
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, piscan);
 	node->iss_ScanDesc =
 		index_beginscan_parallel(node->ss.ss_currentRelation,
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index a5bd60e..d19d210 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -54,6 +54,7 @@ SeqNext(SeqScanState *node)
 	EState	   *estate;
 	ScanDirection direction;
 	TupleTableSlot *slot;
+	TimestampTz     outerAsofTimestamp;
 
 	/*
 	 * get information from the estate and scan state
@@ -63,6 +64,25 @@ SeqNext(SeqScanState *node)
 	direction = estate->es_direction;
 	slot = node->ss.ss_ScanTupleSlot;
 
+	outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+
+	if (node->ss.asofExpr)
+	{
+		if (!node->ss.asofTimestampSet)
+		{
+			Datum		val;
+			bool		isNull;
+
+			val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+											node->ss.ps.ps_ExprContext,
+											&isNull);
+			/* Interpret NULL timestamp as no timestamp */
+			node->ss.asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+			node->ss.asofTimestampSet = true;
+		}
+		estate->es_snapshot->asofTimestamp = node->ss.asofTimestamp;
+	}
+
 	if (scandesc == NULL)
 	{
 		/*
@@ -81,6 +101,11 @@ SeqNext(SeqScanState *node)
 	tuple = heap_getnext(scandesc, direction);
 
 	/*
+	 * Restore ASOF timestamp for the current snapshot
+	 */
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
+	/*
 	 * save the tuple and the buffer returned to us by the access methods in
 	 * our scan tuple slot and return the slot.  Note: we pass 'false' because
 	 * tuples returned by heap_getnext() are pointers onto disk pages and were
@@ -196,6 +221,19 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
 		ExecInitQual(node->plan.qual, (PlanState *) scanstate);
 
 	/*
+	 * Initialize AS OF expression, if any
+	 */
+	if (node->asofTimestamp)
+	{
+		scanstate->ss.asofExpr = ExecInitExpr((Expr *) node->asofTimestamp,
+											&scanstate->ss.ps);
+		scanstate->ss.asofTimestampSet = false;
+	}
+	else
+		scanstate->ss.asofExpr = NULL;
+
+
+	/*
 	 * tuple table initialization
 	 */
 	ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
@@ -273,6 +311,7 @@ ExecReScanSeqScan(SeqScanState *node)
 	HeapScanDesc scan;
 
 	scan = node->ss.ss_currentScanDesc;
+	node->ss.asofTimestampSet = false;
 
 	if (scan != NULL)
 		heap_rescan(scan,		/* scan desc */
@@ -316,11 +355,30 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
 {
 	EState	   *estate = node->ss.ps.state;
 	ParallelHeapScanDesc pscan;
+	TimestampTz     outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	Scan* scan = (Scan*)node->ss.ps.plan;
+
+	if (scan->asofTimestamp)
+	{
+		Datum		val;
+		bool		isNull;
+
+		node->ss.asofExpr = ExecInitExpr((Expr *) scan->asofTimestamp,
+										 &node->ss.ps);
+		val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+										node->ss.ps.ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		estate->es_snapshot->asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+	}
 
 	pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
 	heap_parallelscan_initialize(pscan,
 								 node->ss.ss_currentRelation,
 								 estate->es_snapshot);
+
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pscan);
 	node->ss.ss_currentScanDesc =
 		heap_beginscan_parallel(node->ss.ss_currentRelation, pscan);
@@ -337,8 +395,24 @@ ExecSeqScanReInitializeDSM(SeqScanState *node,
 						   ParallelContext *pcxt)
 {
 	HeapScanDesc scan = node->ss.ss_currentScanDesc;
+	EState	   *estate = node->ss.ps.state;
+	TimestampTz  outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+
+	if (node->ss.asofExpr)
+	{
+		Datum		val;
+		bool		isNull;
+
+		val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+										node->ss.ps.ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		estate->es_snapshot->asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+	}
 
 	heap_parallelscan_reinitialize(scan->rs_parallel);
+
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
 }
 
 /* ----------------------------------------------------------------
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 84d7171..259d991 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -410,6 +410,7 @@ CopyScanFields(const Scan *from, Scan *newnode)
 	CopyPlanFields((const Plan *) from, (Plan *) newnode);
 
 	COPY_SCALAR_FIELD(scanrelid);
+	COPY_NODE_FIELD(asofTimestamp);
 }
 
 /*
@@ -1216,6 +1217,7 @@ _copyRangeVar(const RangeVar *from)
 	COPY_SCALAR_FIELD(relpersistence);
 	COPY_NODE_FIELD(alias);
 	COPY_LOCATION_FIELD(location);
+	COPY_NODE_FIELD(asofTimestamp);
 
 	return newnode;
 }
@@ -2326,6 +2328,7 @@ _copyRangeTblEntry(const RangeTblEntry *from)
 	COPY_BITMAPSET_FIELD(insertedCols);
 	COPY_BITMAPSET_FIELD(updatedCols);
 	COPY_NODE_FIELD(securityQuals);
+	COPY_NODE_FIELD(asofTimestamp);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 2e869a9..8ee4228 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -112,6 +112,7 @@ _equalRangeVar(const RangeVar *a, const RangeVar *b)
 	COMPARE_SCALAR_FIELD(relpersistence);
 	COMPARE_NODE_FIELD(alias);
 	COMPARE_LOCATION_FIELD(location);
+	COMPARE_NODE_FIELD(asofTimestamp);
 
 	return true;
 }
@@ -2661,6 +2662,7 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
 	COMPARE_BITMAPSET_FIELD(insertedCols);
 	COMPARE_BITMAPSET_FIELD(updatedCols);
 	COMPARE_NODE_FIELD(securityQuals);
+	COMPARE_NODE_FIELD(asofTimestamp);
 
 	return true;
 }
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index c2a93b2..0ace44d 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2338,6 +2338,10 @@ range_table_walker(List *rtable,
 
 		if (walker(rte->securityQuals, context))
 			return true;
+
+		if (walker(rte->asofTimestamp, context))
+			return true;
+
 	}
 	return false;
 }
@@ -3161,6 +3165,7 @@ range_table_mutator(List *rtable,
 				break;
 		}
 		MUTATE(newrte->securityQuals, rte->securityQuals, List *);
+		MUTATE(newrte->asofTimestamp, rte->asofTimestamp, Node *);
 		newrt = lappend(newrt, newrte);
 	}
 	return newrt;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index e468d7c..3ee00f3 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -3105,6 +3105,7 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
 	WRITE_BITMAPSET_FIELD(insertedCols);
 	WRITE_BITMAPSET_FIELD(updatedCols);
 	WRITE_NODE_FIELD(securityQuals);
+	WRITE_NODE_FIELD(asofTimestamp);
 }
 
 static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 1133c70..cf7c637 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1399,6 +1399,7 @@ _readRangeTblEntry(void)
 	READ_BITMAPSET_FIELD(insertedCols);
 	READ_BITMAPSET_FIELD(updatedCols);
 	READ_NODE_FIELD(securityQuals);
+	READ_NODE_FIELD(asofTimestamp);
 
 	READ_DONE();
 }
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 1a9fd82..713f9b3 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -168,10 +168,10 @@ static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
 static SampleScan *make_samplescan(List *qptlist, List *qpqual, Index scanrelid,
 				TableSampleClause *tsc);
 static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
-			   Oid indexid, List *indexqual, List *indexqualorig,
-			   List *indexorderby, List *indexorderbyorig,
-			   List *indexorderbyops,
-			   ScanDirection indexscandir);
+								 Oid indexid, List *indexqual, List *indexqualorig,
+								 List *indexorderby, List *indexorderbyorig,
+								 List *indexorderbyops,
+								 ScanDirection indexscandir);
 static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
 				   Index scanrelid, Oid indexid,
 				   List *indexqual, List *indexorderby,
@@ -509,6 +509,7 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
 	List	   *gating_clauses;
 	List	   *tlist;
 	Plan	   *plan;
+	RangeTblEntry *rte;
 
 	/*
 	 * Extract the relevant restriction clauses from the parent relation. The
@@ -709,6 +710,12 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
 			break;
 	}
 
+	if (plan != NULL)
+	{
+		rte = planner_rt_fetch(rel->relid, root);
+		((Scan*)plan)->asofTimestamp = rte->asofTimestamp;
+	}
+
 	/*
 	 * If there are any pseudoconstant clauses attached to this node, insert a
 	 * gating Result node that evaluates the pseudoconstants as one-time
@@ -2434,7 +2441,7 @@ create_seqscan_plan(PlannerInfo *root, Path *best_path,
 	Assert(scan_relid > 0);
 	Assert(best_path->parent->rtekind == RTE_RELATION);
 
-	/* Sort clauses into best execution order */
+    /* Sort clauses into best execution order */
 	scan_clauses = order_qual_clauses(root, scan_clauses);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 382791f..ceb6542 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -84,6 +84,7 @@ create_upper_paths_hook_type create_upper_paths_hook = NULL;
 #define EXPRKIND_ARBITER_ELEM		10
 #define EXPRKIND_TABLEFUNC			11
 #define EXPRKIND_TABLEFUNC_LATERAL	12
+#define EXPRKIND_ASOF	            13
 
 /* Passthrough data for standard_qp_callback */
 typedef struct
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ebfc94f..a642e28 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -449,7 +449,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
 %type <node>	fetch_args limit_clause select_limit_value
 				offset_clause select_offset_value
-				select_offset_value2 opt_select_fetch_first_value
+				select_offset_value2 opt_select_fetch_first_value opt_asof_clause
 %type <ival>	row_or_rows first_or_next
 
 %type <list>	OptSeqOptList SeqOptList OptParenthesizedSeqOptList
@@ -704,7 +704,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
  * as NOT, at least with respect to their left-hand subexpression.
  * NULLS_LA and WITH_LA are needed to make the grammar LALR(1).
  */
-%token		NOT_LA NULLS_LA WITH_LA
+%token		NOT_LA NULLS_LA WITH_LA AS_LA
 
 
 /* Precedence: lowest to highest */
@@ -11720,9 +11720,10 @@ from_list:
 /*
  * table_ref is where an alias clause can be attached.
  */
-table_ref:	relation_expr opt_alias_clause
+table_ref:	relation_expr opt_alias_clause opt_asof_clause
 				{
 					$1->alias = $2;
+					$1->asofTimestamp = $3;
 					$$ = (Node *) $1;
 				}
 			| relation_expr opt_alias_clause tablesample_clause
@@ -11948,6 +11949,10 @@ opt_alias_clause: alias_clause						{ $$ = $1; }
 			| /*EMPTY*/								{ $$ = NULL; }
 		;
 
+opt_asof_clause: AS_LA OF a_expr                    { $$ = $3; }
+			| /*EMPTY*/								{ $$ = NULL; }
+		;
+
 /*
  * func_alias_clause can include both an Alias and a coldeflist, so we make it
  * return a 2-element list that gets disassembled by calling production.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 4c4f4cd..6c3e506 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -439,6 +439,7 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
@@ -856,6 +857,7 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 2828bbf..a23f3d8 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -426,7 +426,11 @@ transformTableEntry(ParseState *pstate, RangeVar *r)
 
 	/* We need only build a range table entry */
 	rte = addRangeTableEntry(pstate, r, r->alias, r->inh, true);
-
+	if (r->asofTimestamp)
+	{
+		Node* asof = transformExpr(pstate, r->asofTimestamp, EXPR_KIND_ASOF);
+		rte->asofTimestamp = coerce_to_specific_type(pstate, asof, TIMESTAMPTZOID, "ASOF");
+	}
 	return rte;
 }
 
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 29f9da7..cd83fc3 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1818,6 +1818,7 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
 		case EXPR_KIND_VALUES:
 		case EXPR_KIND_VALUES_SINGLE:
 		case EXPR_KIND_CALL:
+		case EXPR_KIND_ASOF:
 			/* okay */
 			break;
 		case EXPR_KIND_CHECK_CONSTRAINT:
@@ -3470,6 +3471,8 @@ ParseExprKindName(ParseExprKind exprKind)
 			return "PARTITION BY";
 		case EXPR_KIND_CALL:
 			return "CALL";
+		case EXPR_KIND_ASOF:
+			return "ASOF";
 
 			/*
 			 * There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e6b0856..a6bcfc7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2250,6 +2250,7 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 58bdb23..ddf6af4 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1206,6 +1206,7 @@ addRangeTableEntry(ParseState *pstate,
 
 	rte->rtekind = RTE_RELATION;
 	rte->alias = alias;
+	rte->asofTimestamp = relation->asofTimestamp;
 
 	/*
 	 * Get the rel's OID.  This access also ensures that we have an up-to-date
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 245b4cd..a3845b5 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -108,6 +108,9 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
 	 */
 	switch (cur_token)
 	{
+		case AS:
+			cur_token_length = 2;
+			break;
 		case NOT:
 			cur_token_length = 3;
 			break;
@@ -155,6 +158,10 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
 	/* Replace cur_token if needed, based on lookahead */
 	switch (cur_token)
 	{
+		case AS:
+		    if (next_token == OF)
+			    cur_token = AS_LA;
+		    break;
 		case NOT:
 			/* Replace NOT by NOT_LA if it's followed by BETWEEN, IN, etc */
 			switch (next_token)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e32901d..9f425b7 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -191,6 +191,7 @@ static void assign_application_name(const char *newval, void *extra);
 static bool check_cluster_name(char **newval, void **extra, GucSource source);
 static const char *show_unix_socket_permissions(void);
 static const char *show_log_file_mode(void);
+static void assign_time_travel_hook(bool newval, void *extra);
 
 /* Private functions in guc-file.l that need to be called from guc.c */
 static ConfigVariable *ProcessConfigFileInternal(GucContext context,
@@ -516,6 +517,7 @@ static int	wal_block_size;
 static bool data_checksums;
 static bool integer_datetimes;
 static bool assert_enabled;
+static bool time_travel;
 
 /* should be static, but commands/variable.c needs to get at this */
 char	   *role_string;
@@ -804,6 +806,15 @@ static const unit_conversion time_unit_conversion_table[] =
 static struct config_bool ConfigureNamesBool[] =
 {
 	{
+		{"time_travel", PGC_POSTMASTER, AUTOVACUUM,
+			gettext_noop("Keep all record versions to support time travel."),
+			NULL
+		},
+		&time_travel,
+		false,
+		NULL, assign_time_travel_hook, NULL
+	},
+	{
 		{"enable_seqscan", PGC_USERSET, QUERY_TUNING_METHOD,
 			gettext_noop("Enables the planner's use of sequential-scan plans."),
 			NULL
@@ -10530,4 +10541,17 @@ show_log_file_mode(void)
 	return buf;
 }
 
+static void assign_time_travel_hook(bool newval, void *extra)
+{
+	if (newval)
+	{
+		track_commit_timestamp = true;
+		vacuum_defer_cleanup_age = 1000000000;
+		vacuum_freeze_min_age = 1000000000;
+		autovacuum_freeze_max_age = 2000000000;
+		autovacuum_multixact_freeze_max_age = 2000000000;
+		autovacuum_start_daemon = false;
+	}
+}
+
 #include "guc-file.c"
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 0b03290..de87bf0 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -244,6 +244,7 @@ typedef struct SerializedSnapshotData
 	bool		takenDuringRecovery;
 	CommandId	curcid;
 	TimestampTz whenTaken;
+	TimestampTz asofTimestamp;
 	XLogRecPtr	lsn;
 } SerializedSnapshotData;
 
@@ -2080,6 +2081,7 @@ SerializeSnapshot(Snapshot snapshot, char *start_address)
 	serialized_snapshot.takenDuringRecovery = snapshot->takenDuringRecovery;
 	serialized_snapshot.curcid = snapshot->curcid;
 	serialized_snapshot.whenTaken = snapshot->whenTaken;
+	serialized_snapshot.asofTimestamp = snapshot->asofTimestamp;
 	serialized_snapshot.lsn = snapshot->lsn;
 
 	/*
@@ -2154,6 +2156,7 @@ RestoreSnapshot(char *start_address)
 	snapshot->takenDuringRecovery = serialized_snapshot.takenDuringRecovery;
 	snapshot->curcid = serialized_snapshot.curcid;
 	snapshot->whenTaken = serialized_snapshot.whenTaken;
+	snapshot->asofTimestamp = serialized_snapshot.asofTimestamp;
 	snapshot->lsn = serialized_snapshot.lsn;
 
 	/* Copy XIDs, if present. */
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 2b218e0..09e067f 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -69,6 +69,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "access/xlog.h"
+#include "access/commit_ts.h"
 #include "storage/bufmgr.h"
 #include "storage/procarray.h"
 #include "utils/builtins.h"
@@ -1476,6 +1477,16 @@ XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
 {
 	uint32		i;
 
+	if (snapshot->asofTimestamp != 0)
+	{
+		TimestampTz ts;
+		if (TransactionIdGetCommitTsData(xid, &ts, NULL))
+		{
+			return timestamptz_cmp_internal(snapshot->asofTimestamp, ts) < 0;
+		}
+	}
+
+
 	/*
 	 * Make a quick range check to eliminate most XIDs without looking at the
 	 * xip arrays.  Note that this is OK even if we convert a subxact XID to
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index c9a5279..ed923ab 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1120,6 +1120,9 @@ typedef struct ScanState
 	Relation	ss_currentRelation;
 	HeapScanDesc ss_currentScanDesc;
 	TupleTableSlot *ss_ScanTupleSlot;
+	ExprState  *asofExpr;	      /* AS OF expression */
+	bool        asofTimestampSet; /* AS OF timestamp already evaluated */
+	TimestampTz asofTimestamp;    /* AS OF timestamp or 0 if not set */
 } ScanState;
 
 /* ----------------
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2eaa6b2..b78c8e2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1062,6 +1062,7 @@ typedef struct RangeTblEntry
 	Bitmapset  *insertedCols;	/* columns needing INSERT permission */
 	Bitmapset  *updatedCols;	/* columns needing UPDATE permission */
 	List	   *securityQuals;	/* security barrier quals to apply, if any */
+	Node       *asofTimestamp;  /* AS OF timestamp */
 } RangeTblEntry;
 
 /*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index d763da6..083dc90 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -327,7 +327,8 @@ typedef struct BitmapOr
 typedef struct Scan
 {
 	Plan		plan;
-	Index		scanrelid;		/* relid is index into the range table */
+	Index		scanrelid;	   /* relid is index into the range table */
+	Node       *asofTimestamp; /* AS OF timestamp */
 } Scan;
 
 /* ----------------
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 074ae0a..11e1a0c 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -70,6 +70,7 @@ typedef struct RangeVar
 								 * on children? */
 	char		relpersistence; /* see RELPERSISTENCE_* in pg_class.h */
 	Alias	   *alias;			/* table alias & optional column aliases */
+	Node       *asofTimestamp;  /* expression with AS OF timestamp */
 	int			location;		/* token location, or -1 if unknown */
 } RangeVar;
 
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 565bb3d..b1efb5c 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -68,7 +68,8 @@ typedef enum ParseExprKind
 	EXPR_KIND_TRIGGER_WHEN,		/* WHEN condition in CREATE TRIGGER */
 	EXPR_KIND_POLICY,			/* USING or WITH CHECK expr in policy */
 	EXPR_KIND_PARTITION_EXPRESSION,	/* PARTITION BY expression */
-	EXPR_KIND_CALL				/* CALL argument */
+	EXPR_KIND_CALL,				/* CALL argument */
+	EXPR_KIND_ASOF              /* AS OF */
 } ParseExprKind;
 
 
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index bf51977..a00f0d9 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -111,6 +111,7 @@ typedef struct SnapshotData
 	pairingheap_node ph_node;	/* link in the RegisteredSnapshots heap */
 
 	TimestampTz whenTaken;		/* timestamp when snapshot was taken */
+	TimestampTz asofTimestamp;	/* select AS OF timestamp */
 	XLogRecPtr	lsn;			/* position in the WAL stream when taken */
 } SnapshotData;
 
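The tuple-visibility rule that the tqual.c hunk above adds to XidInMVCCSnapshot() can be modeled in a few lines: when the snapshot carries an AS OF timestamp, any transaction whose commit timestamp is later than that point is treated as not yet committed, so its effects are invisible. This is an illustrative sketch in Python, not the C implementation.

```python
# Toy model of the patched XidInMVCCSnapshot() logic: with an AS OF
# timestamp set, a committed xid whose commit timestamp is later than
# the snapshot's asofTimestamp is treated as "in progress" (invisible).

def xid_invisible(commit_ts, snapshot_asof_ts):
    """Return True if the xid must be treated as not yet committed."""
    if snapshot_asof_ts is None:         # no AS OF clause: normal MVCC rules
        return False                     # (fall through to xip-array checks)
    return commit_ts > snapshot_asof_ts  # committed "after" the AS OF point

# Tuples committed at t=10 and t=20, snapshot AS OF t=15:
assert xid_invisible(10, 15) is False    # visible: committed before AS OF
assert xid_invisible(20, 15) is True     # invisible: committed after AS OF
assert xid_invisible(20, None) is False  # no time travel: normal visibility
```

Note that, as in the patch, a transaction with no commit-timestamp record simply falls through to the ordinary snapshot checks.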
In reply to: Konstantin Knizhnik (#31)
Re: AS OF queries

On Wed, Dec 27, 2017 at 7:37 AM, Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> wrote:

On 27.12.2017 00:52, Jeff Janes wrote:

On Thu, Dec 21, 2017 at 6:00 AM, Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> wrote:

There is still one significant difference between my prototype implementation
and the SQL standard: it associates the timestamp with the select statement, not
with a particular table.
Per-table timestamps seem more difficult to support, and I am not sure that
joining tables from different timelines makes much sense.
But certainly that can also be fixed.

I think the main use I would find for this feature is something like:

select * from foo except select * from foo as old_foo as of '<some time>';

Just a quick report from the world of ORMs and web applications.

Today the idiomatic approach for an ORM like Ruby on Rails is to support
temporal(ish) queries using three additional TIMESTAMP_TZ columns:
"created_at", "updated_at" and "deleted_at". This idiom is bundled up into
a plugin called "acts_as_paranoid" (see: https://github.com/rubysherpas/paranoia).
We used this extensively at Heroku in our production
code for auditability reasons.

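The three-column idiom described above can be sketched with SQLite for illustration: rows are never physically deleted, and an "as of" view is reconstructed by filtering on the timestamp columns. The `users` table and its data are made up for the example; only the created_at/updated_at/deleted_at column names come from the idiom itself.

```python
import sqlite3

# Sketch of the "acts_as_paranoid" idiom: per-row created_at / updated_at /
# deleted_at columns emulate history at the application level.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id         INTEGER PRIMARY KEY,
        name       TEXT,
        created_at TEXT NOT NULL,
        updated_at TEXT NOT NULL,
        deleted_at TEXT              -- NULL while the row is live
    )""")
conn.execute("INSERT INTO users(name, created_at, updated_at) "
             "VALUES ('alice', '2017-01-10', '2017-01-10')")
conn.execute("INSERT INTO users(name, created_at, updated_at) "
             "VALUES ('bob',   '2017-02-05', '2017-02-05')")
# "Delete" bob the paranoid way: stamp deleted_at instead of removing the row.
conn.execute("UPDATE users SET deleted_at = '2017-03-01' WHERE name = 'bob'")

# Live rows: not yet soft-deleted.
live = conn.execute(
    "SELECT name FROM users WHERE deleted_at IS NULL ORDER BY id").fetchall()

def rows_as_of(ts):
    """Rows visible at time ts: created by then and not yet soft-deleted."""
    return conn.execute(
        "SELECT name FROM users WHERE created_at <= ? "
        "AND (deleted_at IS NULL OR deleted_at > ?) ORDER BY id",
        (ts, ts)).fetchall()

asof_feb = rows_as_of('2017-02-10')   # bob not yet deleted: both rows
asof_mar = rows_as_of('2017-03-15')   # bob already soft-deleted: alice only
```

The same columns also support the reporting queries mentioned below ("users created per month") with an ordinary index on created_at.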
In general, this gets implemented on a per-table basis and usually has no
expiry short of manual cleanup. (It would be interesting to contemplate how
an end-user would clean up a table without losing their entire history in
the event of some kind of bug or bloat.)

I think a quality PostgreSQL-core implementation would be a fantastic
enhancement, though it would obviously introduce a bunch of interesting
decisions around how to handle things like referential integrity.

Personally, I frequently used these columns to query for things like "how
many users were created in each of the last twelve months", and the ability
to index on those dates was often important.

I'm confident that if this feature made it into PostgreSQL there would be
interested people in downstream communities that would take advantage of it.

Hope all that helps,

--
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt."—Kurt Vonnegut

#33Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Konstantin Knizhnik (#31)
1 attachment(s)
Re: AS OF queries

Attached please find a new version of the AS OF patch, which allows specifying
a time travel period.
Tuple versions older than this period may be reclaimed by autovacuum.
This behavior is controlled by the "time_travel_period" parameter.

A zero value of this parameter disables time travel, and Postgres behaves
in the standard way.
You can still use the AS OF construct, but there is no guarantee that the
requested versions have not been reclaimed, i.e. that the query result
actually belongs to the specified time slice.

A value of -1 means infinite history: versions are never reclaimed and
autovacuum is disabled.

A positive value specifies the maximal time travel period in seconds.
As with disabled time travel, you can still specify an AS OF timestamp
older than this period, but there is no guarantee that the requested
versions still exist.
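The retention rules described above can be summarized as a small decision function. This is a sketch of the described time_travel_period semantics, not the patch's actual vacuum code:

```python
def version_reclaimable(version_age_secs, time_travel_period):
    """May vacuum reclaim a dead tuple version of the given age (seconds)?
    Models the described time_travel_period settings:
      0   -> time travel disabled: reclaim as usual
      -1  -> infinite history: never reclaim (autovacuum is disabled)
      N>0 -> reclaim only versions older than N seconds
    """
    if time_travel_period == 0:
        return True
    if time_travel_period == -1:
        return False
    return version_age_secs > time_travel_period

assert version_reclaimable(100, 0) is True      # disabled: normal vacuum
assert version_reclaimable(10**9, -1) is False  # infinite history
assert version_reclaimable(100, 3600) is False  # still inside the period
assert version_reclaimable(7200, 3600) is True  # older than the period
```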

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

asof-4.patch (text/x-patch)
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index eb5bbb5..7acaf30 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -78,6 +78,7 @@ BitmapHeapNext(BitmapHeapScanState *node)
 	ExprContext *econtext;
 	HeapScanDesc scan;
 	TIDBitmap  *tbm;
+	EState	   *estate;
 	TBMIterator *tbmiterator = NULL;
 	TBMSharedIterator *shared_tbmiterator = NULL;
 	TBMIterateResult *tbmres;
@@ -85,11 +86,13 @@ BitmapHeapNext(BitmapHeapScanState *node)
 	TupleTableSlot *slot;
 	ParallelBitmapHeapState *pstate = node->pstate;
 	dsa_area   *dsa = node->ss.ps.state->es_query_dsa;
+	TimestampTz     outerAsofTimestamp = 0;
 
 	/*
 	 * extract necessary information from index scan node
 	 */
 	econtext = node->ss.ps.ps_ExprContext;
+	estate = node->ss.ps.state;
 	slot = node->ss.ss_ScanTupleSlot;
 	scan = node->ss.ss_currentScanDesc;
 	tbm = node->tbm;
@@ -99,6 +102,25 @@ BitmapHeapNext(BitmapHeapScanState *node)
 		shared_tbmiterator = node->shared_tbmiterator;
 	tbmres = node->tbmres;
 
+	outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+
+	if (node->ss.asofExpr)
+	{
+		if (!node->ss.asofTimestampSet)
+		{
+			Datum		val;
+			bool		isNull;
+
+			val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+											node->ss.ps.ps_ExprContext,
+											&isNull);
+			/* Interpret NULL timestamp as no timestamp */
+			node->ss.asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+			node->ss.asofTimestampSet = true;
+		}
+		estate->es_snapshot->asofTimestamp = node->ss.asofTimestamp;
+	}
+
 	/*
 	 * If we haven't yet performed the underlying index scan, do it, and begin
 	 * the iteration over the bitmap.
@@ -364,11 +386,21 @@ BitmapHeapNext(BitmapHeapScanState *node)
 			}
 		}
 
-		/* OK to return this tuple */
+		/*
+		 * Restore ASOF timestamp for the current snapshot
+		 */
+		estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
+	    /* OK to return this tuple */
 		return slot;
 	}
 
 	/*
+	 * Restore ASOF timestamp for the current snapshot
+	 */
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
+	/*
 	 * if we get here it means we are at the end of the scan..
 	 */
 	return ExecClearTuple(slot);
@@ -746,6 +778,8 @@ ExecReScanBitmapHeapScan(BitmapHeapScanState *node)
 {
 	PlanState  *outerPlan = outerPlanState(node);
 
+	node->ss.asofTimestampSet = false;
+
 	/* rescan to release any page pin */
 	heap_rescan(node->ss.ss_currentScanDesc, NULL);
 
@@ -902,7 +936,8 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
 	 * most cases it's probably not worth working harder than that.
 	 */
 	scanstate->can_skip_fetch = (node->scan.plan.qual == NIL &&
-								 node->scan.plan.targetlist == NIL);
+								 node->scan.plan.targetlist == NIL &&
+								 node->scan.asofTimestamp == NULL);
 
 	/*
 	 * Miscellaneous initialization
@@ -920,6 +955,18 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
 		ExecInitQual(node->bitmapqualorig, (PlanState *) scanstate);
 
 	/*
+	 * Initialize AS OF expression, if any
+	 */
+	if (node->scan.asofTimestamp)
+	{
+		scanstate->ss.asofExpr = ExecInitExpr((Expr *) node->scan.asofTimestamp,
+										   &scanstate->ss.ps);
+		scanstate->ss.asofTimestampSet = false;
+	}
+	else
+		scanstate->ss.asofExpr = NULL;
+
+	/*
 	 * tuple table initialization
 	 */
 	ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
@@ -1052,11 +1099,27 @@ ExecBitmapHeapInitializeDSM(BitmapHeapScanState *node,
 	ParallelBitmapHeapState *pstate;
 	EState	   *estate = node->ss.ps.state;
 	dsa_area   *dsa = node->ss.ps.state->es_query_dsa;
+	TimestampTz     outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	Scan* scan = (Scan*)node->ss.ps.plan;
 
 	/* If there's no DSA, there are no workers; initialize nothing. */
 	if (dsa == NULL)
 		return;
 
+	if (scan->asofTimestamp)
+	{
+		Datum		val;
+		bool		isNull;
+
+		ExprState* asofExpr = ExecInitExpr((Expr *) scan->asofTimestamp,
+									  &node->ss.ps);
+		val = ExecEvalExprSwitchContext(asofExpr,
+										node->ss.ps.ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		estate->es_snapshot->asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+	}
+
 	pstate = shm_toc_allocate(pcxt->toc, node->pscan_len);
 
 	pstate->tbmiterator = 0;
@@ -1071,6 +1134,8 @@ ExecBitmapHeapInitializeDSM(BitmapHeapScanState *node,
 	ConditionVariableInit(&pstate->cv);
 	SerializeSnapshot(estate->es_snapshot, pstate->phs_snapshot_data);
 
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pstate);
 	node->pstate = pstate;
 }
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index 2ffef23..a0b505c 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -86,7 +86,7 @@ IndexNext(IndexScanState *node)
 	IndexScanDesc scandesc;
 	HeapTuple	tuple;
 	TupleTableSlot *slot;
-
+	TimestampTz outerAsofTimestamp;
 	/*
 	 * extract necessary information from index scan node
 	 */
@@ -104,6 +104,30 @@ IndexNext(IndexScanState *node)
 	econtext = node->ss.ps.ps_ExprContext;
 	slot = node->ss.ss_ScanTupleSlot;
 
+	outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+
+	if (node->ss.asofExpr)
+	{
+		if (!node->ss.asofTimestampSet)
+		{
+			Datum		val;
+			bool		isNull;
+
+			val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+											node->ss.ps.ps_ExprContext,
+											&isNull);
+			/* Interpret NULL timestamp as no timestamp */
+			if (isNull)
+				node->ss.asofTimestamp = 0;
+			else
+			{
+				node->ss.asofTimestamp = DatumGetInt64(val);
+			}
+			node->ss.asofTimestampSet = true;
+		}
+		estate->es_snapshot->asofTimestamp = node->ss.asofTimestamp;
+	}
+
 	if (scandesc == NULL)
 	{
 		/*
@@ -160,9 +184,17 @@ IndexNext(IndexScanState *node)
 				continue;
 			}
 		}
+		/*
+		 * Restore ASOF timestamp for the current snapshot
+		 */
+		estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
 
 		return slot;
 	}
+	/*
+	 * Restore ASOF timestamp for the current snapshot
+	 */
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
 
 	/*
 	 * if we get here it means the index scan failed so we are at the end of
@@ -578,6 +610,8 @@ ExecIndexScan(PlanState *pstate)
 void
 ExecReScanIndexScan(IndexScanState *node)
 {
+	node->ss.asofTimestampSet = false;
+
 	/*
 	 * If we are doing runtime key calculations (ie, any of the index key
 	 * values weren't simple Consts), compute the new key values.  But first,
@@ -918,6 +952,18 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
 		ExecInitExprList(node->indexorderbyorig, (PlanState *) indexstate);
 
 	/*
+	 * Initialize AS OF expression, if any
+	 */
+	if (node->scan.asofTimestamp)
+	{
+		indexstate->ss.asofExpr = ExecInitExpr((Expr *) node->scan.asofTimestamp,
+											&indexstate->ss.ps);
+		indexstate->ss.asofTimestampSet = false;
+	}
+	else
+		indexstate->ss.asofExpr = NULL;
+
+	/*
 	 * tuple table initialization
 	 */
 	ExecInitResultTupleSlot(estate, &indexstate->ss.ps);
@@ -1672,12 +1718,30 @@ ExecIndexScanInitializeDSM(IndexScanState *node,
 {
 	EState	   *estate = node->ss.ps.state;
 	ParallelIndexScanDesc piscan;
+	TimestampTz outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	Scan* scan = (Scan*)node->ss.ps.plan;
+
+	if (scan->asofTimestamp)
+	{
+		Datum		val;
+		bool		isNull;
+
+		node->ss.asofExpr = ExecInitExpr((Expr *) scan->asofTimestamp,
+									  &node->ss.ps);
+		val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+										node->ss.ps.ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		estate->es_snapshot->asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+	}
 
 	piscan = shm_toc_allocate(pcxt->toc, node->iss_PscanLen);
 	index_parallelscan_initialize(node->ss.ss_currentRelation,
 								  node->iss_RelationDesc,
 								  estate->es_snapshot,
 								  piscan);
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, piscan);
 	node->iss_ScanDesc =
 		index_beginscan_parallel(node->ss.ss_currentRelation,
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index a5bd60e..d19d210 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -54,6 +54,7 @@ SeqNext(SeqScanState *node)
 	EState	   *estate;
 	ScanDirection direction;
 	TupleTableSlot *slot;
+	TimestampTz     outerAsofTimestamp;
 
 	/*
 	 * get information from the estate and scan state
@@ -63,6 +64,25 @@ SeqNext(SeqScanState *node)
 	direction = estate->es_direction;
 	slot = node->ss.ss_ScanTupleSlot;
 
+	outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+
+	if (node->ss.asofExpr)
+	{
+		if (!node->ss.asofTimestampSet)
+		{
+			Datum		val;
+			bool		isNull;
+
+			val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+											node->ss.ps.ps_ExprContext,
+											&isNull);
+			/* Interpret NULL timestamp as no timestamp */
+			node->ss.asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+			node->ss.asofTimestampSet = true;
+		}
+		estate->es_snapshot->asofTimestamp = node->ss.asofTimestamp;
+	}
+
 	if (scandesc == NULL)
 	{
 		/*
@@ -81,6 +101,11 @@ SeqNext(SeqScanState *node)
 	tuple = heap_getnext(scandesc, direction);
 
 	/*
+	 * Restore ASOF timestamp for the current snapshot
+	 */
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
+	/*
 	 * save the tuple and the buffer returned to us by the access methods in
 	 * our scan tuple slot and return the slot.  Note: we pass 'false' because
 	 * tuples returned by heap_getnext() are pointers onto disk pages and were
@@ -196,6 +221,19 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
 		ExecInitQual(node->plan.qual, (PlanState *) scanstate);
 
 	/*
+	 * Initialize AS OF expression, if any
+	 */
+	if (node->asofTimestamp)
+	{
+		scanstate->ss.asofExpr = ExecInitExpr((Expr *) node->asofTimestamp,
+											&scanstate->ss.ps);
+		scanstate->ss.asofTimestampSet = false;
+	}
+	else
+		scanstate->ss.asofExpr = NULL;
+
+
+	/*
 	 * tuple table initialization
 	 */
 	ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
@@ -273,6 +311,7 @@ ExecReScanSeqScan(SeqScanState *node)
 	HeapScanDesc scan;
 
 	scan = node->ss.ss_currentScanDesc;
+	node->ss.asofTimestampSet = false;
 
 	if (scan != NULL)
 		heap_rescan(scan,		/* scan desc */
@@ -316,11 +355,30 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
 {
 	EState	   *estate = node->ss.ps.state;
 	ParallelHeapScanDesc pscan;
+	TimestampTz     outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	Scan* scan = (Scan*)node->ss.ps.plan;
+
+	if (scan->asofTimestamp)
+	{
+		Datum		val;
+		bool		isNull;
+
+		node->ss.asofExpr = ExecInitExpr((Expr *) scan->asofTimestamp,
+										 &node->ss.ps);
+		val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+										node->ss.ps.ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		estate->es_snapshot->asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+	}
 
 	pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
 	heap_parallelscan_initialize(pscan,
 								 node->ss.ss_currentRelation,
 								 estate->es_snapshot);
+
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pscan);
 	node->ss.ss_currentScanDesc =
 		heap_beginscan_parallel(node->ss.ss_currentRelation, pscan);
@@ -337,8 +395,24 @@ ExecSeqScanReInitializeDSM(SeqScanState *node,
 						   ParallelContext *pcxt)
 {
 	HeapScanDesc scan = node->ss.ss_currentScanDesc;
+	EState	   *estate = node->ss.ps.state;
+	TimestampTz  outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+
+	if (node->ss.asofExpr)
+	{
+		Datum		val;
+		bool		isNull;
+
+		val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+										node->ss.ps.ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		estate->es_snapshot->asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+	}
 
 	heap_parallelscan_reinitialize(scan->rs_parallel);
+
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
 }
 
 /* ----------------------------------------------------------------
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 84d7171..259d991 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -410,6 +410,7 @@ CopyScanFields(const Scan *from, Scan *newnode)
 	CopyPlanFields((const Plan *) from, (Plan *) newnode);
 
 	COPY_SCALAR_FIELD(scanrelid);
+	COPY_NODE_FIELD(asofTimestamp);
 }
 
 /*
@@ -1216,6 +1217,7 @@ _copyRangeVar(const RangeVar *from)
 	COPY_SCALAR_FIELD(relpersistence);
 	COPY_NODE_FIELD(alias);
 	COPY_LOCATION_FIELD(location);
+	COPY_NODE_FIELD(asofTimestamp);
 
 	return newnode;
 }
@@ -2326,6 +2328,7 @@ _copyRangeTblEntry(const RangeTblEntry *from)
 	COPY_BITMAPSET_FIELD(insertedCols);
 	COPY_BITMAPSET_FIELD(updatedCols);
 	COPY_NODE_FIELD(securityQuals);
+	COPY_NODE_FIELD(asofTimestamp);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 2e869a9..8ee4228 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -112,6 +112,7 @@ _equalRangeVar(const RangeVar *a, const RangeVar *b)
 	COMPARE_SCALAR_FIELD(relpersistence);
 	COMPARE_NODE_FIELD(alias);
 	COMPARE_LOCATION_FIELD(location);
+	COMPARE_NODE_FIELD(asofTimestamp);
 
 	return true;
 }
@@ -2661,6 +2662,7 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
 	COMPARE_BITMAPSET_FIELD(insertedCols);
 	COMPARE_BITMAPSET_FIELD(updatedCols);
 	COMPARE_NODE_FIELD(securityQuals);
+	COMPARE_NODE_FIELD(asofTimestamp);
 
 	return true;
 }
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index c2a93b2..0ace44d 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2338,6 +2338,10 @@ range_table_walker(List *rtable,
 
 		if (walker(rte->securityQuals, context))
 			return true;
+
+		if (walker(rte->asofTimestamp, context))
+			return true;
+
 	}
 	return false;
 }
@@ -3161,6 +3165,7 @@ range_table_mutator(List *rtable,
 				break;
 		}
 		MUTATE(newrte->securityQuals, rte->securityQuals, List *);
+		MUTATE(newrte->asofTimestamp, rte->asofTimestamp, Node *);
 		newrt = lappend(newrt, newrte);
 	}
 	return newrt;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index e468d7c..3ee00f3 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -3105,6 +3105,7 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
 	WRITE_BITMAPSET_FIELD(insertedCols);
 	WRITE_BITMAPSET_FIELD(updatedCols);
 	WRITE_NODE_FIELD(securityQuals);
+	WRITE_NODE_FIELD(asofTimestamp);
 }
 
 static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 1133c70..cf7c637 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1399,6 +1399,7 @@ _readRangeTblEntry(void)
 	READ_BITMAPSET_FIELD(insertedCols);
 	READ_BITMAPSET_FIELD(updatedCols);
 	READ_NODE_FIELD(securityQuals);
+	READ_NODE_FIELD(asofTimestamp);
 
 	READ_DONE();
 }
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 1a9fd82..713f9b3 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -168,10 +168,10 @@ static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
 static SampleScan *make_samplescan(List *qptlist, List *qpqual, Index scanrelid,
 				TableSampleClause *tsc);
 static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
-			   Oid indexid, List *indexqual, List *indexqualorig,
-			   List *indexorderby, List *indexorderbyorig,
-			   List *indexorderbyops,
-			   ScanDirection indexscandir);
+								 Oid indexid, List *indexqual, List *indexqualorig,
+								 List *indexorderby, List *indexorderbyorig,
+								 List *indexorderbyops,
+								 ScanDirection indexscandir);
 static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
 				   Index scanrelid, Oid indexid,
 				   List *indexqual, List *indexorderby,
@@ -509,6 +509,7 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
 	List	   *gating_clauses;
 	List	   *tlist;
 	Plan	   *plan;
+	RangeTblEntry *rte;
 
 	/*
 	 * Extract the relevant restriction clauses from the parent relation. The
@@ -709,6 +710,12 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
 			break;
 	}
 
+	if (plan != NULL)
+	{
+		rte = planner_rt_fetch(rel->relid, root);
+		((Scan*)plan)->asofTimestamp = rte->asofTimestamp;
+	}
+
 	/*
 	 * If there are any pseudoconstant clauses attached to this node, insert a
 	 * gating Result node that evaluates the pseudoconstants as one-time
@@ -2434,7 +2441,7 @@ create_seqscan_plan(PlannerInfo *root, Path *best_path,
 	Assert(scan_relid > 0);
 	Assert(best_path->parent->rtekind == RTE_RELATION);
 
-	/* Sort clauses into best execution order */
+    /* Sort clauses into best execution order */
 	scan_clauses = order_qual_clauses(root, scan_clauses);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 382791f..ceb6542 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -84,6 +84,7 @@ create_upper_paths_hook_type create_upper_paths_hook = NULL;
 #define EXPRKIND_ARBITER_ELEM		10
 #define EXPRKIND_TABLEFUNC			11
 #define EXPRKIND_TABLEFUNC_LATERAL	12
+#define EXPRKIND_ASOF	            13
 
 /* Passthrough data for standard_qp_callback */
 typedef struct
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ebfc94f..a642e28 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -449,7 +449,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
 %type <node>	fetch_args limit_clause select_limit_value
 				offset_clause select_offset_value
-				select_offset_value2 opt_select_fetch_first_value
+				select_offset_value2 opt_select_fetch_first_value opt_asof_clause
 %type <ival>	row_or_rows first_or_next
 
 %type <list>	OptSeqOptList SeqOptList OptParenthesizedSeqOptList
@@ -704,7 +704,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
  * as NOT, at least with respect to their left-hand subexpression.
  * NULLS_LA and WITH_LA are needed to make the grammar LALR(1).
  */
-%token		NOT_LA NULLS_LA WITH_LA
+%token		NOT_LA NULLS_LA WITH_LA AS_LA
 
 
 /* Precedence: lowest to highest */
@@ -11720,9 +11720,10 @@ from_list:
 /*
  * table_ref is where an alias clause can be attached.
  */
-table_ref:	relation_expr opt_alias_clause
+table_ref:	relation_expr opt_alias_clause opt_asof_clause
 				{
 					$1->alias = $2;
+					$1->asofTimestamp = $3;
 					$$ = (Node *) $1;
 				}
 			| relation_expr opt_alias_clause tablesample_clause
@@ -11948,6 +11949,10 @@ opt_alias_clause: alias_clause						{ $$ = $1; }
 			| /*EMPTY*/								{ $$ = NULL; }
 		;
 
+opt_asof_clause: AS_LA OF a_expr                    { $$ = $3; }
+			| /*EMPTY*/								{ $$ = NULL; }
+		;
+
 /*
  * func_alias_clause can include both an Alias and a coldeflist, so we make it
  * return a 2-element list that gets disassembled by calling production.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 4c4f4cd..6c3e506 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -439,6 +439,7 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
@@ -856,6 +857,7 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 2828bbf..a23f3d8 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -426,7 +426,11 @@ transformTableEntry(ParseState *pstate, RangeVar *r)
 
 	/* We need only build a range table entry */
 	rte = addRangeTableEntry(pstate, r, r->alias, r->inh, true);
-
+	if (r->asofTimestamp)
+	{
+		Node* asof = transformExpr(pstate, r->asofTimestamp, EXPR_KIND_ASOF);
+		rte->asofTimestamp = coerce_to_specific_type(pstate, asof, TIMESTAMPTZOID, "ASOF");
+	}
 	return rte;
 }
 
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 29f9da7..cd83fc3 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1818,6 +1818,7 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
 		case EXPR_KIND_VALUES:
 		case EXPR_KIND_VALUES_SINGLE:
 		case EXPR_KIND_CALL:
+		case EXPR_KIND_ASOF:
 			/* okay */
 			break;
 		case EXPR_KIND_CHECK_CONSTRAINT:
@@ -3470,6 +3471,8 @@ ParseExprKindName(ParseExprKind exprKind)
 			return "PARTITION BY";
 		case EXPR_KIND_CALL:
 			return "CALL";
+		case EXPR_KIND_ASOF:
+			return "ASOF";
 
 			/*
 			 * There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e6b0856..a6bcfc7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2250,6 +2250,7 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 58bdb23..ddf6af4 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1206,6 +1206,7 @@ addRangeTableEntry(ParseState *pstate,
 
 	rte->rtekind = RTE_RELATION;
 	rte->alias = alias;
+	rte->asofTimestamp = relation->asofTimestamp;
 
 	/*
 	 * Get the rel's OID.  This access also ensures that we have an up-to-date
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 245b4cd..a3845b5 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -108,6 +108,9 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
 	 */
 	switch (cur_token)
 	{
+		case AS:
+			cur_token_length = 2;
+			break;
 		case NOT:
 			cur_token_length = 3;
 			break;
@@ -155,6 +158,10 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
 	/* Replace cur_token if needed, based on lookahead */
 	switch (cur_token)
 	{
+		case AS:
+		    if (next_token == OF)
+			    cur_token = AS_LA;
+		    break;
 		case NOT:
 			/* Replace NOT by NOT_LA if it's followed by BETWEEN, IN, etc */
 			switch (next_token)
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index d87799c..c3b3790 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -51,6 +51,7 @@
 #include "access/twophase.h"
 #include "access/xact.h"
 #include "access/xlog.h"
+#include "access/commit_ts.h"
 #include "catalog/catalog.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -91,6 +92,9 @@ typedef struct ProcArrayStruct
 	/* oldest catalog xmin of any replication slot */
 	TransactionId replication_slot_catalog_xmin;
 
+	TransactionId time_travel_xmin;
+	TimestampTz   time_travel_horizon;
+
 	/* indexes into allPgXact[], has PROCARRAY_MAXPROCS entries */
 	int			pgprocnos[FLEXIBLE_ARRAY_MEMBER];
 } ProcArrayStruct;
@@ -1256,6 +1260,87 @@ TransactionIdIsActive(TransactionId xid)
 	return result;
 }
 
+/*
+ * Get the minimal XID which belongs to the time travel period.
+ * This function tries to adjust the current time travel horizon.
+ * It uses the commit_ts SLRU to map XIDs to timestamps. Since the order of XIDs doesn't match the order of timestamps,
+ * this function may produce not quite correct results in the presence of long-living transactions.
+ * So the time travel period specification is not exact and should allow for the maximal transaction duration.
+ *
+ * The passed time_travel_xmin and time_travel_horizon are taken from procArray under lock.
+ */
+static TransactionId
+GetTimeTravelXmin(TransactionId oldestXmin, TransactionId time_travel_xmin, TimestampTz time_travel_horizon)
+{
+	if (time_travel_period < 0)
+	{
+		/* Infinite history */
+		oldestXmin -= MaxTimeTravelPeriod;
+	}
+	else
+	{
+		/* Limited history: check time travel horizon */
+		TimestampTz new_horizon = GetCurrentTimestamp()	- (TimestampTz)time_travel_period*USECS_PER_SEC;
+		TransactionId old_xmin = time_travel_xmin;
+
+		if (time_travel_xmin != InvalidTransactionId)
+		{
+			/* We have already determined time travel horizon: check if it needs to be adjusted */
+			TimestampTz old_horizon = time_travel_horizon;
+			TransactionId xid = old_xmin;
+
+			while (timestamptz_cmp_internal(old_horizon, new_horizon) < 0)
+			{
+				/* Move horizon forward */
+				time_travel_xmin  = xid;
+				time_travel_horizon = old_horizon;
+				do {
+					TransactionIdAdvance(xid);
+					/* Stop if we reach oldest xmin */
+					if (TransactionIdFollowsOrEquals(xid, oldestXmin))
+						goto EndScan;
+				} while (!TransactionIdGetCommitTsData(xid, &old_horizon, NULL));
+			}
+		}
+		else
+		{
+			/* Find out time travel horizon */
+			TransactionId xid = oldestXmin;
+
+			do {
+				TransactionIdRetreat(xid);
+				/*
+				 * Lack of transaction timestamp information in the SLRU means that we have reached a nonexistent or untracked transaction,
+				 * so we need to stop the traversal in this case.
+				 */
+				if (!TransactionIdGetCommitTsData(xid, &time_travel_horizon, NULL))
+					goto EndScan;
+				time_travel_xmin = xid;
+			} while (timestamptz_cmp_internal(time_travel_horizon, new_horizon) > 0);
+		}
+	  EndScan:
+		if (old_xmin != time_travel_xmin)
+		{
+			/* Horizon moved */
+			LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+			/* Recheck under lock that xmin is advanced */
+			if (TransactionIdPrecedes(procArray->time_travel_xmin, time_travel_xmin))
+			{
+				procArray->time_travel_xmin = time_travel_xmin;
+				procArray->time_travel_horizon = time_travel_horizon;
+			}
+			LWLockRelease(ProcArrayLock);
+		}
+		/* Move oldest xmin in the past if it is required for time travel */
+		if (TransactionIdPrecedes(time_travel_xmin, oldestXmin))
+			oldestXmin = time_travel_xmin;
+	}
+
+	if (!TransactionIdIsNormal(oldestXmin))
+		oldestXmin = FirstNormalTransactionId;
+
+	return oldestXmin;
+}
 
 /*
  * GetOldestXmin -- returns oldest transaction that was running
@@ -1321,6 +1406,8 @@ GetOldestXmin(Relation rel, int flags)
 
 	volatile TransactionId replication_slot_xmin = InvalidTransactionId;
 	volatile TransactionId replication_slot_catalog_xmin = InvalidTransactionId;
+	volatile TransactionId time_travel_xmin;
+	TimestampTz time_travel_horizon;
 
 	/*
 	 * If we're not computing a relation specific limit, or if a shared
@@ -1383,6 +1470,9 @@ GetOldestXmin(Relation rel, int flags)
 	replication_slot_xmin = procArray->replication_slot_xmin;
 	replication_slot_catalog_xmin = procArray->replication_slot_catalog_xmin;
 
+	time_travel_xmin = procArray->time_travel_xmin;
+	time_travel_horizon = procArray->time_travel_horizon;
+
 	if (RecoveryInProgress())
 	{
 		/*
@@ -1423,6 +1513,9 @@ GetOldestXmin(Relation rel, int flags)
 			result = FirstNormalTransactionId;
 	}
 
+	if (time_travel_period != 0)
+		result = GetTimeTravelXmin(result, time_travel_xmin, time_travel_horizon);
+
 	/*
 	 * Check whether there are replication slots requiring an older xmin.
 	 */
@@ -1469,6 +1562,7 @@ GetMaxSnapshotSubxidCount(void)
 	return TOTAL_MAX_CACHED_SUBXIDS;
 }
 
+
 /*
  * GetSnapshotData -- returns information about running transactions.
  *
@@ -1518,6 +1612,8 @@ GetSnapshotData(Snapshot snapshot)
 	bool		suboverflowed = false;
 	volatile TransactionId replication_slot_xmin = InvalidTransactionId;
 	volatile TransactionId replication_slot_catalog_xmin = InvalidTransactionId;
+	volatile TransactionId time_travel_xmin;
+	TimestampTz time_travel_horizon;
 
 	Assert(snapshot != NULL);
 
@@ -1707,6 +1803,9 @@ GetSnapshotData(Snapshot snapshot)
 	replication_slot_xmin = procArray->replication_slot_xmin;
 	replication_slot_catalog_xmin = procArray->replication_slot_catalog_xmin;
 
+	time_travel_xmin = procArray->time_travel_xmin;
+	time_travel_horizon = procArray->time_travel_horizon;
+
 	if (!TransactionIdIsValid(MyPgXact->xmin))
 		MyPgXact->xmin = TransactionXmin = xmin;
 
@@ -1730,6 +1829,9 @@ GetSnapshotData(Snapshot snapshot)
 		NormalTransactionIdPrecedes(replication_slot_xmin, RecentGlobalXmin))
 		RecentGlobalXmin = replication_slot_xmin;
 
+	if (time_travel_period != 0)
+		RecentGlobalXmin = GetTimeTravelXmin(RecentGlobalXmin, time_travel_xmin, time_travel_horizon);
+
 	/* Non-catalog tables can be vacuumed if older than this xid */
 	RecentGlobalDataXmin = RecentGlobalXmin;
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e32901d..7d25060 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -191,6 +191,7 @@ static void assign_application_name(const char *newval, void *extra);
 static bool check_cluster_name(char **newval, void **extra, GucSource source);
 static const char *show_unix_socket_permissions(void);
 static const char *show_log_file_mode(void);
+static void assign_time_travel_period_hook(int newval, void *extra);
 
 /* Private functions in guc-file.l that need to be called from guc.c */
 static ConfigVariable *ProcessConfigFileInternal(GucContext context,
@@ -1713,6 +1714,15 @@ static struct config_bool ConfigureNamesBool[] =
 static struct config_int ConfigureNamesInt[] =
 {
 	{
+		{"time_travel_period", PGC_POSTMASTER, AUTOVACUUM,
			gettext_noop("Sets the time travel period in seconds (0 disables, -1 means infinite)."),
+			NULL
+		},
+		&time_travel_period,
+		0, -1, MaxTimeTravelPeriod,
+		NULL, assign_time_travel_period_hook, NULL
+	},
+	{
 		{"archive_timeout", PGC_SIGHUP, WAL_ARCHIVING,
 			gettext_noop("Forces a switch to the next WAL file if a "
 						 "new file has not been started within N seconds."),
@@ -10530,4 +10540,14 @@ show_log_file_mode(void)
 	return buf;
 }
 
+static void assign_time_travel_period_hook(int newval, void *extra)
+{
+	if (newval != 0)
+	{
+		track_commit_timestamp = true;
+		if (newval < 0)
+			autovacuum_start_daemon = false;
+	}
+}
+
 #include "guc-file.c"
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 0b03290..d0bacd4 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -73,6 +73,7 @@
  * GUC parameters
  */
 int			old_snapshot_threshold; /* number of minutes, -1 disables */
+int         time_travel_period;     /* number of seconds, 0 disables, -1 infinite */
 
 /*
  * Structure for dealing with old_snapshot_threshold implementation.
@@ -244,6 +245,7 @@ typedef struct SerializedSnapshotData
 	bool		takenDuringRecovery;
 	CommandId	curcid;
 	TimestampTz whenTaken;
+	TimestampTz asofTimestamp;
 	XLogRecPtr	lsn;
 } SerializedSnapshotData;
 
@@ -2080,6 +2082,7 @@ SerializeSnapshot(Snapshot snapshot, char *start_address)
 	serialized_snapshot.takenDuringRecovery = snapshot->takenDuringRecovery;
 	serialized_snapshot.curcid = snapshot->curcid;
 	serialized_snapshot.whenTaken = snapshot->whenTaken;
+	serialized_snapshot.asofTimestamp = snapshot->asofTimestamp;
 	serialized_snapshot.lsn = snapshot->lsn;
 
 	/*
@@ -2154,6 +2157,7 @@ RestoreSnapshot(char *start_address)
 	snapshot->takenDuringRecovery = serialized_snapshot.takenDuringRecovery;
 	snapshot->curcid = serialized_snapshot.curcid;
 	snapshot->whenTaken = serialized_snapshot.whenTaken;
+	snapshot->asofTimestamp = serialized_snapshot.asofTimestamp;
 	snapshot->lsn = serialized_snapshot.lsn;
 
 	/* Copy XIDs, if present. */
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 2b218e0..09e067f 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -69,6 +69,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "access/xlog.h"
+#include "access/commit_ts.h"
 #include "storage/bufmgr.h"
 #include "storage/procarray.h"
 #include "utils/builtins.h"
@@ -1476,6 +1477,16 @@ XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
 {
 	uint32		i;
 
+	if (snapshot->asofTimestamp != 0)
+	{
+		TimestampTz ts;
+		if (TransactionIdGetCommitTsData(xid, &ts, NULL))
+		{
+			return timestamptz_cmp_internal(snapshot->asofTimestamp, ts) < 0;
+		}
+	}
+
+
 	/*
 	 * Make a quick range check to eliminate most XIDs without looking at the
 	 * xip arrays.  Note that this is OK even if we convert a subxact XID to
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 86076de..e46f4c6 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -33,6 +33,7 @@
 #define FrozenTransactionId			((TransactionId) 2)
 #define FirstNormalTransactionId	((TransactionId) 3)
 #define MaxTransactionId			((TransactionId) 0xFFFFFFFF)
+#define MaxTimeTravelPeriod         ((TransactionId) 0x3FFFFFFF)
 
 /* ----------------
  *		transaction ID manipulation macros
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index c9a5279..ed923ab 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1120,6 +1120,9 @@ typedef struct ScanState
 	Relation	ss_currentRelation;
 	HeapScanDesc ss_currentScanDesc;
 	TupleTableSlot *ss_ScanTupleSlot;
+	ExprState  *asofExpr;	      /* AS OF expression */
+	bool        asofTimestampSet; /* AS OF timestamp evaluated */
+	TimestampTz asofTimestamp;    /* AS OF timestamp or 0 if not set */
 } ScanState;
 
 /* ----------------
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2eaa6b2..b78c8e2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1062,6 +1062,7 @@ typedef struct RangeTblEntry
 	Bitmapset  *insertedCols;	/* columns needing INSERT permission */
 	Bitmapset  *updatedCols;	/* columns needing UPDATE permission */
 	List	   *securityQuals;	/* security barrier quals to apply, if any */
+	Node       *asofTimestamp;  /* AS OF timestamp */
 } RangeTblEntry;
 
 /*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index d763da6..083dc90 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -327,7 +327,8 @@ typedef struct BitmapOr
 typedef struct Scan
 {
 	Plan		plan;
-	Index		scanrelid;		/* relid is index into the range table */
+	Index		scanrelid;	   /* relid is index into the range table */
+	Node       *asofTimestamp; /* AS OF timestamp */
 } Scan;
 
 /* ----------------
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 074ae0a..11e1a0c 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -70,6 +70,7 @@ typedef struct RangeVar
 								 * on children? */
 	char		relpersistence; /* see RELPERSISTENCE_* in pg_class.h */
 	Alias	   *alias;			/* table alias & optional column aliases */
+	Node       *asofTimestamp;  /* expression with AS OF timestamp */
 	int			location;		/* token location, or -1 if unknown */
 } RangeVar;
 
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 565bb3d..b1efb5c 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -68,7 +68,8 @@ typedef enum ParseExprKind
 	EXPR_KIND_TRIGGER_WHEN,		/* WHEN condition in CREATE TRIGGER */
 	EXPR_KIND_POLICY,			/* USING or WITH CHECK expr in policy */
 	EXPR_KIND_PARTITION_EXPRESSION,	/* PARTITION BY expression */
-	EXPR_KIND_CALL				/* CALL argument */
+	EXPR_KIND_CALL,				/* CALL argument */
+	EXPR_KIND_ASOF              /* AS OF */
 } ParseExprKind;
 
 
diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h
index 8585194..1e1414a 100644
--- a/src/include/utils/snapmgr.h
+++ b/src/include/utils/snapmgr.h
@@ -47,7 +47,7 @@
 
 /* GUC variables */
 extern PGDLLIMPORT int old_snapshot_threshold;
-
+extern PGDLLIMPORT int time_travel_period;
 
 extern Size SnapMgrShmemSize(void);
 extern void SnapMgrInit(void);
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index bf51977..a00f0d9 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -111,6 +111,7 @@ typedef struct SnapshotData
 	pairingheap_node ph_node;	/* link in the RegisteredSnapshots heap */
 
 	TimestampTz whenTaken;		/* timestamp when snapshot was taken */
+	TimestampTz asofTimestamp;	/* select AS OF timestamp */
 	XLogRecPtr	lsn;			/* position in the WAL stream when taken */
 } SnapshotData;
 
#34Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Konstantin Knizhnik (#33)
Re: AS OF queries

On 12/28/17 11:36, Konstantin Knizhnik wrote:

Attached please find new version of AS OF patch which allows to specify
time travel period.
Older versions outside this period may be reclaimed by autovacuum.
This behavior is controlled by "time_travel_period" parameter.

So where are we on using quasi SQL-standard syntax for a nonstandard
interpretation? I think we could very well have a variety of standard
and nonstandard AS OF variants, including by commit timestamp, xid,
explicit range columns, etc. But I'd like to see a discussion on that,
perhaps in a documentation update, which this patch is missing.

I have questions about corner cases. What happens when multiple tables
are queried with different AS OF clauses? Can there be apparent RI
violations? What happens when the time_travel_period is changed during
a session? How can we check how much old data is available, and how can
we check how much space it uses? What happens if no old data for the
selected AS OF is available? How does this interact with catalog
changes, such as changes to row-level security settings? (Do we apply
the current or the past settings?)

This patch should probably include a bunch of tests to cover these and
other scenarios.

(Maybe "period" isn't the best name, because it implies a start and an
end. How about something with "age"?)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#35Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Peter Eisentraut (#34)
1 attachment(s)
Re: AS OF queries

On 28.12.2017 20:28, Peter Eisentraut wrote:

On 12/28/17 11:36, Konstantin Knizhnik wrote:

Attached please find new version of AS OF patch which allows to specify
time travel period.
Older versions outside this period may be reclaimed by autovacuum.
This behavior is controlled by "time_travel_period" parameter.

So where are we on using quasi SQL-standard syntax for a nonstandard
interpretation? I think we could very well have a variety of standard
and nonstandard AS OF variants, including by commit timestamp, xid,
explicit range columns, etc. But I'd like to see a discussion on that,
perhaps in a documentation update, which this patch is missing.

SQL:2011 defines rules for creation and querying of temporal tables.
I have not read this standard myself; I just take information about it
from Wikipedia:
https://en.wikipedia.org/wiki/SQL:2011
According to this standard, time-sliced queries are specified using
AS OF SYSTEM TIME and VERSIONS BETWEEN SYSTEM TIME ... AND ... clauses.

Looks like it is supported now only by Oracle. IBM DB2 and MS SQL Server
provide similar functionality in slightly different ways.
I am not sure whether strict support of the SQL:2011 standard is critical
and which other functionality we need.
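For comparison, the standard syntax would look roughly like this (a sketch based on my reading of the Wikipedia page, not something the attached patch implements):

```sql
-- SQL:2011 system-versioned table (standard syntax, for comparison only)
CREATE TABLE foo (
    pk        int PRIMARY KEY,
    val       text,
    sys_start timestamp(6) GENERATED ALWAYS AS ROW START,
    sys_end   timestamp(6) GENERATED ALWAYS AS ROW END,
    PERIOD FOR SYSTEM_TIME (sys_start, sys_end)
) WITH SYSTEM VERSIONING;

-- Standard time-sliced queries
SELECT * FROM foo FOR SYSTEM_TIME AS OF TIMESTAMP '2017-12-20 14:59:25';
SELECT * FROM foo FOR SYSTEM_TIME BETWEEN TIMESTAMP '2017-12-20 14:00:00'
                                     AND TIMESTAMP '2017-12-20 15:00:00';
```

The patch instead reuses the existing tuple versions, so no extra period columns are needed.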

I have questions about corner cases. What happens when multiple tables
are queried with different AS OF clauses?

It is possible.

Can there be apparent RI
violations?

Right now AS OF is used only in selects, not in update statements, so I
do not see how integrity constraints could be violated.
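At most, a query joining different timeslices could produce a result that looks inconsistent, although no constraint is actually violated. A hypothetical example (table names and timestamps invented for illustration):

```sql
CREATE TABLE parent(id int PRIMARY KEY);
CREATE TABLE child(id int PRIMARY KEY,
                   parent_id int REFERENCES parent(id));
-- t1: both a parent row and a child row referencing it exist
-- t2: the child row is deleted, then the parent row is deleted

-- Reading the child as of t1 but the parent with the current snapshot
-- shows a child whose parent is not visible -- an apparent FK violation
-- in the query result, although the stored data was always consistent:
SELECT c.*
  FROM child c ASOF TIMESTAMP '2017-12-20 14:59:25'
  LEFT JOIN parent p ON p.id = c.parent_id
 WHERE p.id IS NULL;
```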

What happens when the time_travel_period is changed during
a session?

Right now it depends on autovacuum: how fast it is able to reclaim the
old versions.
Actually, I do not see much sense in changing the time travel period
during a session.
In asof-4.patch, time_travel_period is a postmaster-level GUC which cannot
be changed within a session.
But I have changed its policy to SIGHUP to make experimenting with it
easier.
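So, as I understand the patch, the relevant settings would be configured once in postgresql.conf, for example (values invented):

```
# postgresql.conf -- settings assumed by the AS OF patch
track_commit_timestamp = on   # needed to map XIDs to commit timestamps
time_travel_period = 3600     # retain ~1 hour of old versions; -1 = unlimited
```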

How can we check how much old data is available, and how can
we check how much space it uses?

Physical space used by a database/relation can be determined using
standard functions, for example pg_total_relation_size.
I do not know any simple way to get the total number of all stored versions.

What happens if no old data for the
selected AS OF is available?

It will just return the version closest to the specified timestamp.
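For example (a hypothetical continuation of the session from the first message):

```sql
-- No retained versions are that old; rather than raising an error, the
-- query returns the closest versions still available:
SELECT * FROM foo ASOF TIMESTAMP '2016-01-01 00:00:00';
```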

How does this interact with catalog
changes, such as changes to row-level security settings? (Do we apply
the current or the past settings?)

Catalog changes are not currently supported.
And I do not have a good understanding of how to support them if a query
involves two different timeslices with different versions of the table.
Too many places in the parser/optimizer would have to be changed to
support such "historical collisions".

This patch should probably include a bunch of tests to cover these and
other scenarios.

Right now I have added just one test: asof.sql.
It requires the "track_commit_timestamp" option to be switched on, which
is a postmaster-level GUC.
So I have added postgresql.asof.conf and asof_schedule for it.
This test should be launched using the following command:

make check EXTRA_REGRESS_OPTS="--schedule=asof_schedule
--temp-config=postgresql.asof.config"

If there is some better way to include this test in standard regression
tests, please let me know.

(Maybe "period" isn't the best name, because it implies a start and an
end. How about something with "age"?)

Well, I am not a native English speaker, so I cannot judge which is more
natural.
"Period" is widely used in topics related to temporal tables (just count
the occurrences of this word at https://en.wikipedia.org/wiki/SQL:2011);
"age" is not used there at all.
From my point of view, age is something applicable to a person, a
building, a monument, ...
It is not possible to speak of the "age of time travel". In science
fiction, "time machines" frequently have limitations: you cannot go more
than N years into the past.
How can we name this N? Is it "period", "age", or something else?

I have attached yet another version of the patch which includes a test
for AS OF queries.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

asof-5.patchtext/x-patch; name=asof-5.patchDownload
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index eb5bbb5..7acaf30 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -78,6 +78,7 @@ BitmapHeapNext(BitmapHeapScanState *node)
 	ExprContext *econtext;
 	HeapScanDesc scan;
 	TIDBitmap  *tbm;
+	EState	   *estate;
 	TBMIterator *tbmiterator = NULL;
 	TBMSharedIterator *shared_tbmiterator = NULL;
 	TBMIterateResult *tbmres;
@@ -85,11 +86,13 @@ BitmapHeapNext(BitmapHeapScanState *node)
 	TupleTableSlot *slot;
 	ParallelBitmapHeapState *pstate = node->pstate;
 	dsa_area   *dsa = node->ss.ps.state->es_query_dsa;
+	TimestampTz     outerAsofTimestamp = 0;
 
 	/*
 	 * extract necessary information from index scan node
 	 */
 	econtext = node->ss.ps.ps_ExprContext;
+	estate = node->ss.ps.state;
 	slot = node->ss.ss_ScanTupleSlot;
 	scan = node->ss.ss_currentScanDesc;
 	tbm = node->tbm;
@@ -99,6 +102,25 @@ BitmapHeapNext(BitmapHeapScanState *node)
 		shared_tbmiterator = node->shared_tbmiterator;
 	tbmres = node->tbmres;
 
+	outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+
+	if (node->ss.asofExpr)
+	{
+		if (!node->ss.asofTimestampSet)
+		{
+			Datum		val;
+			bool		isNull;
+
+			val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+											node->ss.ps.ps_ExprContext,
+											&isNull);
+			/* Interpret NULL timestamp as no timestamp */
+			node->ss.asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+			node->ss.asofTimestampSet = true;
+		}
+		estate->es_snapshot->asofTimestamp = node->ss.asofTimestamp;
+	}
+
 	/*
 	 * If we haven't yet performed the underlying index scan, do it, and begin
 	 * the iteration over the bitmap.
@@ -364,11 +386,21 @@ BitmapHeapNext(BitmapHeapScanState *node)
 			}
 		}
 
-		/* OK to return this tuple */
+		/*
+		 * Restore ASOF timestamp for the current snapshot
+		 */
+		estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
+		/* OK to return this tuple */
 		return slot;
 	}
 
 	/*
+	 * Restore ASOF timestamp for the current snapshot
+	 */
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
+	/*
 	 * if we get here it means we are at the end of the scan..
 	 */
 	return ExecClearTuple(slot);
@@ -746,6 +778,8 @@ ExecReScanBitmapHeapScan(BitmapHeapScanState *node)
 {
 	PlanState  *outerPlan = outerPlanState(node);
 
+	node->ss.asofTimestampSet = false;
+
 	/* rescan to release any page pin */
 	heap_rescan(node->ss.ss_currentScanDesc, NULL);
 
@@ -902,7 +936,8 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
 	 * most cases it's probably not worth working harder than that.
 	 */
 	scanstate->can_skip_fetch = (node->scan.plan.qual == NIL &&
-								 node->scan.plan.targetlist == NIL);
+								 node->scan.plan.targetlist == NIL &&
+								 node->scan.asofTimestamp == NULL);
 
 	/*
 	 * Miscellaneous initialization
@@ -920,6 +955,18 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
 		ExecInitQual(node->bitmapqualorig, (PlanState *) scanstate);
 
 	/*
+	 * Initialize AS OF expression, if any
+	 */
+	if (node->scan.asofTimestamp)
+	{
+		scanstate->ss.asofExpr = ExecInitExpr((Expr *) node->scan.asofTimestamp,
+										   &scanstate->ss.ps);
+		scanstate->ss.asofTimestampSet = false;
+	}
+	else
+		scanstate->ss.asofExpr = NULL;
+
+	/*
 	 * tuple table initialization
 	 */
 	ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
@@ -1052,11 +1099,27 @@ ExecBitmapHeapInitializeDSM(BitmapHeapScanState *node,
 	ParallelBitmapHeapState *pstate;
 	EState	   *estate = node->ss.ps.state;
 	dsa_area   *dsa = node->ss.ps.state->es_query_dsa;
+	TimestampTz     outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	Scan* scan = (Scan*)node->ss.ps.plan;
 
 	/* If there's no DSA, there are no workers; initialize nothing. */
 	if (dsa == NULL)
 		return;
 
+	if (scan->asofTimestamp)
+	{
+		Datum		val;
+		bool		isNull;
+
+		ExprState* asofExpr = ExecInitExpr((Expr *) scan->asofTimestamp,
+									  &node->ss.ps);
+		val = ExecEvalExprSwitchContext(asofExpr,
+										node->ss.ps.ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		estate->es_snapshot->asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+	}
+
 	pstate = shm_toc_allocate(pcxt->toc, node->pscan_len);
 
 	pstate->tbmiterator = 0;
@@ -1071,6 +1134,8 @@ ExecBitmapHeapInitializeDSM(BitmapHeapScanState *node,
 	ConditionVariableInit(&pstate->cv);
 	SerializeSnapshot(estate->es_snapshot, pstate->phs_snapshot_data);
 
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pstate);
 	node->pstate = pstate;
 }
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index 2ffef23..a0b505c 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -86,7 +86,7 @@ IndexNext(IndexScanState *node)
 	IndexScanDesc scandesc;
 	HeapTuple	tuple;
 	TupleTableSlot *slot;
-
+	TimestampTz outerAsofTimestamp;
 	/*
 	 * extract necessary information from index scan node
 	 */
@@ -104,6 +104,30 @@ IndexNext(IndexScanState *node)
 	econtext = node->ss.ps.ps_ExprContext;
 	slot = node->ss.ss_ScanTupleSlot;
 
+	outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+
+	if (node->ss.asofExpr)
+	{
+		if (!node->ss.asofTimestampSet)
+		{
+			Datum		val;
+			bool		isNull;
+
+			val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+											node->ss.ps.ps_ExprContext,
+											&isNull);
+			/* Interpret NULL timestamp as no timestamp */
+			if (isNull)
+				node->ss.asofTimestamp = 0;
+			else
+			{
+				node->ss.asofTimestamp = DatumGetInt64(val);
+			}
+			node->ss.asofTimestampSet = true;
+		}
+		estate->es_snapshot->asofTimestamp = node->ss.asofTimestamp;
+	}
+
 	if (scandesc == NULL)
 	{
 		/*
@@ -160,9 +184,17 @@ IndexNext(IndexScanState *node)
 				continue;
 			}
 		}
+		/*
+		 * Restore ASOF timestamp for the current snapshot
+		 */
+		estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
 
 		return slot;
 	}
+	/*
+	 * Restore ASOF timestamp for the current snapshot
+	 */
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
 
 	/*
 	 * if we get here it means the index scan failed so we are at the end of
@@ -578,6 +610,8 @@ ExecIndexScan(PlanState *pstate)
 void
 ExecReScanIndexScan(IndexScanState *node)
 {
+	node->ss.asofTimestampSet = false;
+
 	/*
 	 * If we are doing runtime key calculations (ie, any of the index key
 	 * values weren't simple Consts), compute the new key values.  But first,
@@ -918,6 +952,18 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
 		ExecInitExprList(node->indexorderbyorig, (PlanState *) indexstate);
 
 	/*
+	 * Initialize AS OF expression, if any
+	 */
+	if (node->scan.asofTimestamp)
+	{
+		indexstate->ss.asofExpr = ExecInitExpr((Expr *) node->scan.asofTimestamp,
+											&indexstate->ss.ps);
+		indexstate->ss.asofTimestampSet = false;
+	}
+	else
+		indexstate->ss.asofExpr = NULL;
+
+	/*
 	 * tuple table initialization
 	 */
 	ExecInitResultTupleSlot(estate, &indexstate->ss.ps);
@@ -1672,12 +1718,30 @@ ExecIndexScanInitializeDSM(IndexScanState *node,
 {
 	EState	   *estate = node->ss.ps.state;
 	ParallelIndexScanDesc piscan;
+	TimestampTz outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	Scan* scan = (Scan*)node->ss.ps.plan;
+
+	if (scan->asofTimestamp)
+	{
+		Datum		val;
+		bool		isNull;
+
+		node->ss.asofExpr = ExecInitExpr((Expr *) scan->asofTimestamp,
+									  &node->ss.ps);
+		val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+										node->ss.ps.ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		estate->es_snapshot->asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+	}
 
 	piscan = shm_toc_allocate(pcxt->toc, node->iss_PscanLen);
 	index_parallelscan_initialize(node->ss.ss_currentRelation,
 								  node->iss_RelationDesc,
 								  estate->es_snapshot,
 								  piscan);
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, piscan);
 	node->iss_ScanDesc =
 		index_beginscan_parallel(node->ss.ss_currentRelation,
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index a5bd60e..d19d210 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -54,6 +54,7 @@ SeqNext(SeqScanState *node)
 	EState	   *estate;
 	ScanDirection direction;
 	TupleTableSlot *slot;
+	TimestampTz     outerAsofTimestamp;
 
 	/*
 	 * get information from the estate and scan state
@@ -63,6 +64,25 @@ SeqNext(SeqScanState *node)
 	direction = estate->es_direction;
 	slot = node->ss.ss_ScanTupleSlot;
 
+	outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+
+	if (node->ss.asofExpr)
+	{
+		if (!node->ss.asofTimestampSet)
+		{
+			Datum		val;
+			bool		isNull;
+
+			val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+											node->ss.ps.ps_ExprContext,
+											&isNull);
+			/* Interpret NULL timestamp as no timestamp */
+			node->ss.asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+			node->ss.asofTimestampSet = true;
+		}
+		estate->es_snapshot->asofTimestamp = node->ss.asofTimestamp;
+	}
+
 	if (scandesc == NULL)
 	{
 		/*
@@ -81,6 +101,11 @@ SeqNext(SeqScanState *node)
 	tuple = heap_getnext(scandesc, direction);
 
 	/*
+	 * Restore ASOF timestamp for the current snapshot
+	 */
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
+	/*
 	 * save the tuple and the buffer returned to us by the access methods in
 	 * our scan tuple slot and return the slot.  Note: we pass 'false' because
 	 * tuples returned by heap_getnext() are pointers onto disk pages and were
@@ -196,6 +221,19 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
 		ExecInitQual(node->plan.qual, (PlanState *) scanstate);
 
 	/*
+	 * Initialize AS OF expression, if any
+	 */
+	if (node->asofTimestamp)
+	{
+		scanstate->ss.asofExpr = ExecInitExpr((Expr *) node->asofTimestamp,
+											&scanstate->ss.ps);
+		scanstate->ss.asofTimestampSet = false;
+	}
+	else
+		scanstate->ss.asofExpr = NULL;
+
+
+	/*
 	 * tuple table initialization
 	 */
 	ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
@@ -273,6 +311,7 @@ ExecReScanSeqScan(SeqScanState *node)
 	HeapScanDesc scan;
 
 	scan = node->ss.ss_currentScanDesc;
+	node->ss.asofTimestampSet = false;
 
 	if (scan != NULL)
 		heap_rescan(scan,		/* scan desc */
@@ -316,11 +355,30 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
 {
 	EState	   *estate = node->ss.ps.state;
 	ParallelHeapScanDesc pscan;
+	TimestampTz     outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	Scan* scan = (Scan*)node->ss.ps.plan;
+
+	if (scan->asofTimestamp)
+	{
+		Datum		val;
+		bool		isNull;
+
+		node->ss.asofExpr = ExecInitExpr((Expr *) scan->asofTimestamp,
+										 &node->ss.ps);
+		val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+										node->ss.ps.ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		estate->es_snapshot->asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+	}
 
 	pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
 	heap_parallelscan_initialize(pscan,
 								 node->ss.ss_currentRelation,
 								 estate->es_snapshot);
+
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pscan);
 	node->ss.ss_currentScanDesc =
 		heap_beginscan_parallel(node->ss.ss_currentRelation, pscan);
@@ -337,8 +395,24 @@ ExecSeqScanReInitializeDSM(SeqScanState *node,
 						   ParallelContext *pcxt)
 {
 	HeapScanDesc scan = node->ss.ss_currentScanDesc;
+	EState	   *estate = node->ss.ps.state;
+	TimestampTz  outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+
+	if (node->ss.asofExpr)
+	{
+		Datum		val;
+		bool		isNull;
+
+		val = ExecEvalExprSwitchContext(node->ss.asofExpr,
+										node->ss.ps.ps_ExprContext,
+										&isNull);
+		/* Interpret NULL timestamp as no timestamp */
+		estate->es_snapshot->asofTimestamp = isNull ? 0 : DatumGetInt64(val);
+	}
 
 	heap_parallelscan_reinitialize(scan->rs_parallel);
+
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
 }
 
 /* ----------------------------------------------------------------
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 84d7171..259d991 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -410,6 +410,7 @@ CopyScanFields(const Scan *from, Scan *newnode)
 	CopyPlanFields((const Plan *) from, (Plan *) newnode);
 
 	COPY_SCALAR_FIELD(scanrelid);
+	COPY_NODE_FIELD(asofTimestamp);
 }
 
 /*
@@ -1216,6 +1217,7 @@ _copyRangeVar(const RangeVar *from)
 	COPY_SCALAR_FIELD(relpersistence);
 	COPY_NODE_FIELD(alias);
 	COPY_LOCATION_FIELD(location);
+	COPY_NODE_FIELD(asofTimestamp);
 
 	return newnode;
 }
@@ -2326,6 +2328,7 @@ _copyRangeTblEntry(const RangeTblEntry *from)
 	COPY_BITMAPSET_FIELD(insertedCols);
 	COPY_BITMAPSET_FIELD(updatedCols);
 	COPY_NODE_FIELD(securityQuals);
+	COPY_NODE_FIELD(asofTimestamp);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 2e869a9..8ee4228 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -112,6 +112,7 @@ _equalRangeVar(const RangeVar *a, const RangeVar *b)
 	COMPARE_SCALAR_FIELD(relpersistence);
 	COMPARE_NODE_FIELD(alias);
 	COMPARE_LOCATION_FIELD(location);
+	COMPARE_NODE_FIELD(asofTimestamp);
 
 	return true;
 }
@@ -2661,6 +2662,7 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
 	COMPARE_BITMAPSET_FIELD(insertedCols);
 	COMPARE_BITMAPSET_FIELD(updatedCols);
 	COMPARE_NODE_FIELD(securityQuals);
+	COMPARE_NODE_FIELD(asofTimestamp);
 
 	return true;
 }
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index c2a93b2..0ace44d 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2338,6 +2338,10 @@ range_table_walker(List *rtable,
 
 		if (walker(rte->securityQuals, context))
 			return true;
+
+		if (walker(rte->asofTimestamp, context))
+			return true;
+
 	}
 	return false;
 }
@@ -3161,6 +3165,7 @@ range_table_mutator(List *rtable,
 				break;
 		}
 		MUTATE(newrte->securityQuals, rte->securityQuals, List *);
+		MUTATE(newrte->asofTimestamp, rte->asofTimestamp, Node *);
 		newrt = lappend(newrt, newrte);
 	}
 	return newrt;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index e468d7c..3ee00f3 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -3105,6 +3105,7 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
 	WRITE_BITMAPSET_FIELD(insertedCols);
 	WRITE_BITMAPSET_FIELD(updatedCols);
 	WRITE_NODE_FIELD(securityQuals);
+	WRITE_NODE_FIELD(asofTimestamp);
 }
 
 static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 1133c70..cf7c637 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1399,6 +1399,7 @@ _readRangeTblEntry(void)
 	READ_BITMAPSET_FIELD(insertedCols);
 	READ_BITMAPSET_FIELD(updatedCols);
 	READ_NODE_FIELD(securityQuals);
+	READ_NODE_FIELD(asofTimestamp);
 
 	READ_DONE();
 }
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 1a9fd82..713f9b3 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -168,10 +168,10 @@ static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
 static SampleScan *make_samplescan(List *qptlist, List *qpqual, Index scanrelid,
 				TableSampleClause *tsc);
 static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
-			   Oid indexid, List *indexqual, List *indexqualorig,
-			   List *indexorderby, List *indexorderbyorig,
-			   List *indexorderbyops,
-			   ScanDirection indexscandir);
+								 Oid indexid, List *indexqual, List *indexqualorig,
+								 List *indexorderby, List *indexorderbyorig,
+								 List *indexorderbyops,
+								 ScanDirection indexscandir);
 static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
 				   Index scanrelid, Oid indexid,
 				   List *indexqual, List *indexorderby,
@@ -509,6 +509,7 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
 	List	   *gating_clauses;
 	List	   *tlist;
 	Plan	   *plan;
+	RangeTblEntry *rte;
 
 	/*
 	 * Extract the relevant restriction clauses from the parent relation. The
@@ -709,6 +710,12 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
 			break;
 	}
 
+	if (plan != NULL)
+	{
+		rte = planner_rt_fetch(rel->relid, root);
+		((Scan*)plan)->asofTimestamp = rte->asofTimestamp;
+	}
+
 	/*
 	 * If there are any pseudoconstant clauses attached to this node, insert a
 	 * gating Result node that evaluates the pseudoconstants as one-time
@@ -2434,7 +2441,7 @@ create_seqscan_plan(PlannerInfo *root, Path *best_path,
 	Assert(scan_relid > 0);
 	Assert(best_path->parent->rtekind == RTE_RELATION);
 
-	/* Sort clauses into best execution order */
+	/* Sort clauses into best execution order */
 	scan_clauses = order_qual_clauses(root, scan_clauses);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 382791f..ceb6542 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -84,6 +84,7 @@ create_upper_paths_hook_type create_upper_paths_hook = NULL;
 #define EXPRKIND_ARBITER_ELEM		10
 #define EXPRKIND_TABLEFUNC			11
 #define EXPRKIND_TABLEFUNC_LATERAL	12
+#define EXPRKIND_ASOF	            13
 
 /* Passthrough data for standard_qp_callback */
 typedef struct
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ebfc94f..a642e28 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -449,7 +449,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
 %type <node>	fetch_args limit_clause select_limit_value
 				offset_clause select_offset_value
-				select_offset_value2 opt_select_fetch_first_value
+				select_offset_value2 opt_select_fetch_first_value opt_asof_clause
 %type <ival>	row_or_rows first_or_next
 
 %type <list>	OptSeqOptList SeqOptList OptParenthesizedSeqOptList
@@ -704,7 +704,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
  * as NOT, at least with respect to their left-hand subexpression.
  * NULLS_LA and WITH_LA are needed to make the grammar LALR(1).
  */
-%token		NOT_LA NULLS_LA WITH_LA
+%token		NOT_LA NULLS_LA WITH_LA AS_LA
 
 
 /* Precedence: lowest to highest */
@@ -11720,9 +11720,10 @@ from_list:
 /*
  * table_ref is where an alias clause can be attached.
  */
-table_ref:	relation_expr opt_alias_clause
+table_ref:	relation_expr opt_alias_clause opt_asof_clause
 				{
 					$1->alias = $2;
+					$1->asofTimestamp = $3;
 					$$ = (Node *) $1;
 				}
 			| relation_expr opt_alias_clause tablesample_clause
@@ -11948,6 +11949,10 @@ opt_alias_clause: alias_clause						{ $$ = $1; }
 			| /*EMPTY*/								{ $$ = NULL; }
 		;
 
+opt_asof_clause: AS_LA OF a_expr                    { $$ = $3; }
+			| /*EMPTY*/								{ $$ = NULL; }
+		;
+
 /*
  * func_alias_clause can include both an Alias and a coldeflist, so we make it
  * return a 2-element list that gets disassembled by calling production.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 4c4f4cd..6c3e506 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -439,6 +439,7 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
@@ -856,6 +857,7 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 2828bbf..a23f3d8 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -426,7 +426,11 @@ transformTableEntry(ParseState *pstate, RangeVar *r)
 
 	/* We need only build a range table entry */
 	rte = addRangeTableEntry(pstate, r, r->alias, r->inh, true);
-
+	if (r->asofTimestamp)
+	{
+		Node* asof = transformExpr(pstate, r->asofTimestamp, EXPR_KIND_ASOF);
+		rte->asofTimestamp = coerce_to_specific_type(pstate, asof, TIMESTAMPTZOID, "ASOF");
+	}
 	return rte;
 }
 
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 29f9da7..cd83fc3 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1818,6 +1818,7 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
 		case EXPR_KIND_VALUES:
 		case EXPR_KIND_VALUES_SINGLE:
 		case EXPR_KIND_CALL:
+		case EXPR_KIND_ASOF:
 			/* okay */
 			break;
 		case EXPR_KIND_CHECK_CONSTRAINT:
@@ -3470,6 +3471,8 @@ ParseExprKindName(ParseExprKind exprKind)
 			return "PARTITION BY";
 		case EXPR_KIND_CALL:
 			return "CALL";
+		case EXPR_KIND_ASOF:
+			return "ASOF";
 
 			/*
 			 * There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e6b0856..a6bcfc7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2250,6 +2250,7 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 58bdb23..ddf6af4 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1206,6 +1206,7 @@ addRangeTableEntry(ParseState *pstate,
 
 	rte->rtekind = RTE_RELATION;
 	rte->alias = alias;
+	rte->asofTimestamp = relation->asofTimestamp;
 
 	/*
 	 * Get the rel's OID.  This access also ensures that we have an up-to-date
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 245b4cd..a3845b5 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -108,6 +108,9 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
 	 */
 	switch (cur_token)
 	{
+		case AS:
+			cur_token_length = 2;
+			break;
 		case NOT:
 			cur_token_length = 3;
 			break;
@@ -155,6 +158,10 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
 	/* Replace cur_token if needed, based on lookahead */
 	switch (cur_token)
 	{
+		case AS:
+		    if (next_token == OF)
+			    cur_token = AS_LA;
+		    break;
 		case NOT:
 			/* Replace NOT by NOT_LA if it's followed by BETWEEN, IN, etc */
 			switch (next_token)
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index d87799c..945f782 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -51,6 +51,7 @@
 #include "access/twophase.h"
 #include "access/xact.h"
 #include "access/xlog.h"
+#include "access/commit_ts.h"
 #include "catalog/catalog.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -91,6 +92,9 @@ typedef struct ProcArrayStruct
 	/* oldest catalog xmin of any replication slot */
 	TransactionId replication_slot_catalog_xmin;
 
+	TransactionId time_travel_xmin;
+	TimestampTz   time_travel_horizon;
+
 	/* indexes into allPgXact[], has PROCARRAY_MAXPROCS entries */
 	int			pgprocnos[FLEXIBLE_ARRAY_MEMBER];
 } ProcArrayStruct;
@@ -1256,6 +1260,87 @@ TransactionIdIsActive(TransactionId xid)
 	return result;
 }
 
+/*
+ * Get the minimal XID which belongs to the time travel period.
+ * This function tries to adjust the current time travel horizon.
+ * It uses the commit_ts SLRU to map XIDs to timestamps. Since the order of XIDs does not necessarily match the order of timestamps,
+ * this function may produce not quite correct results in the presence of long-living transactions.
+ * So the time travel period specification is not exact and should allow for the maximal transaction duration.
+ *
+ * The passed time_travel_xmin & time_travel_horizon are taken from procArray under lock.
+ */
+static TransactionId
+GetTimeTravelXmin(TransactionId oldestXmin, TransactionId time_travel_xmin, TimestampTz time_travel_horizon)
+{
+	if (time_travel_period < 0)
+	{
+		/* Infinite history */
+		oldestXmin -= MaxTimeTravelPeriod;
+	}
+	else
+	{
+		/* Limited history: check time travel horizon */
+		TimestampTz new_horizon = GetCurrentTimestamp()	- (TimestampTz)time_travel_period*USECS_PER_SEC;
+		TransactionId old_xmin = time_travel_xmin;
+
+		if (time_travel_xmin != InvalidTransactionId)
+		{
+			/* We have already determined time travel horizon: check if it needs to be adjusted */
+			TimestampTz old_horizon = time_travel_horizon;
+			TransactionId xid = old_xmin;
+
+			while (timestamptz_cmp_internal(old_horizon, new_horizon) < 0)
+			{
+				/* Move horizon forward */
+				time_travel_xmin  = xid;
+				time_travel_horizon = old_horizon;
+				do {
+					TransactionIdAdvance(xid);
+					/* Stop if we reach oldest xmin */
+					if (TransactionIdFollowsOrEquals(xid, oldestXmin))
+						goto EndScan;
+				} while (!TransactionIdGetCommitTsData(xid, &old_horizon, NULL));
+			}
+		}
+		else
+		{
+			/* Find out time travel horizon */
+			TransactionId xid = oldestXmin;
+
+			do {
+				TransactionIdRetreat(xid);
+				/*
+				 * Lack of commit timestamp information in the SLRU means we have
+				 * reached a nonexistent or untracked transaction, so stop the traversal here
+				 */
+				if (!TransactionIdGetCommitTsData(xid, &time_travel_horizon, NULL))
+					goto EndScan;
+				time_travel_xmin = xid;
+			} while (timestamptz_cmp_internal(time_travel_horizon, new_horizon) > 0);
+		}
+	  EndScan:
+		if (old_xmin != time_travel_xmin)
+		{
+			/* Horizon moved */
+			LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+			/* Recheck under lock that xmin is advanced */
+			if (TransactionIdPrecedes(procArray->time_travel_xmin, time_travel_xmin))
+			{
+				procArray->time_travel_xmin = time_travel_xmin;
+				procArray->time_travel_horizon = time_travel_horizon;
+			}
+			LWLockRelease(ProcArrayLock);
+		}
+		/* Move oldest xmin in the past if it is required for time travel */
+		if (TransactionIdPrecedes(time_travel_xmin, oldestXmin))
+			oldestXmin = time_travel_xmin;
+	}
+
+	if (!TransactionIdIsNormal(oldestXmin))
+		oldestXmin = FirstNormalTransactionId;
+
+	return oldestXmin;
+}
 
 /*
  * GetOldestXmin -- returns oldest transaction that was running
@@ -1321,6 +1406,8 @@ GetOldestXmin(Relation rel, int flags)
 
 	volatile TransactionId replication_slot_xmin = InvalidTransactionId;
 	volatile TransactionId replication_slot_catalog_xmin = InvalidTransactionId;
+	volatile TransactionId time_travel_xmin;
+	TimestampTz time_travel_horizon;
 
 	/*
 	 * If we're not computing a relation specific limit, or if a shared
@@ -1383,6 +1470,9 @@ GetOldestXmin(Relation rel, int flags)
 	replication_slot_xmin = procArray->replication_slot_xmin;
 	replication_slot_catalog_xmin = procArray->replication_slot_catalog_xmin;
 
+	time_travel_xmin = procArray->time_travel_xmin;
+	time_travel_horizon = procArray->time_travel_horizon;
+
 	if (RecoveryInProgress())
 	{
 		/*
@@ -1423,6 +1513,9 @@ GetOldestXmin(Relation rel, int flags)
 			result = FirstNormalTransactionId;
 	}
 
+	if (time_travel_period != 0)
+		result = GetTimeTravelXmin(result, time_travel_xmin, time_travel_horizon);
+
 	/*
 	 * Check whether there are replication slots requiring an older xmin.
 	 */
@@ -1469,6 +1562,7 @@ GetMaxSnapshotSubxidCount(void)
 	return TOTAL_MAX_CACHED_SUBXIDS;
 }
 
+
 /*
  * GetSnapshotData -- returns information about running transactions.
  *
@@ -1518,6 +1612,8 @@ GetSnapshotData(Snapshot snapshot)
 	bool		suboverflowed = false;
 	volatile TransactionId replication_slot_xmin = InvalidTransactionId;
 	volatile TransactionId replication_slot_catalog_xmin = InvalidTransactionId;
+	volatile TransactionId time_travel_xmin;
+	TimestampTz time_travel_horizon;
 
 	Assert(snapshot != NULL);
 
@@ -1707,6 +1803,9 @@ GetSnapshotData(Snapshot snapshot)
 	replication_slot_xmin = procArray->replication_slot_xmin;
 	replication_slot_catalog_xmin = procArray->replication_slot_catalog_xmin;
 
+	time_travel_xmin = procArray->time_travel_xmin;
+	time_travel_horizon = procArray->time_travel_horizon;
+
 	if (!TransactionIdIsValid(MyPgXact->xmin))
 		MyPgXact->xmin = TransactionXmin = xmin;
 
@@ -1730,6 +1829,9 @@ GetSnapshotData(Snapshot snapshot)
 		NormalTransactionIdPrecedes(replication_slot_xmin, RecentGlobalXmin))
 		RecentGlobalXmin = replication_slot_xmin;
 
+	if (time_travel_period != 0)
+		RecentGlobalXmin = GetTimeTravelXmin(RecentGlobalXmin, time_travel_xmin, time_travel_horizon);
+
 	/* Non-catalog tables can be vacuumed if older than this xid */
 	RecentGlobalDataXmin = RecentGlobalXmin;
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e32901d..286b855 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -191,6 +191,7 @@ static void assign_application_name(const char *newval, void *extra);
 static bool check_cluster_name(char **newval, void **extra, GucSource source);
 static const char *show_unix_socket_permissions(void);
 static const char *show_log_file_mode(void);
+static void assign_time_travel_period_hook(int newval, void *extra);
 
 /* Private functions in guc-file.l that need to be called from guc.c */
 static ConfigVariable *ProcessConfigFileInternal(GucContext context,
@@ -1713,6 +1714,15 @@ static struct config_bool ConfigureNamesBool[] =
 static struct config_int ConfigureNamesInt[] =
 {
 	{
+		{"time_travel_period", PGC_SIGHUP, AUTOVACUUM,
			gettext_noop("Specifies the time travel period in seconds (0 disables, -1 means infinite)."),
+			NULL
+		},
+		&time_travel_period,
+		0, -1, MaxTimeTravelPeriod,
+		NULL, assign_time_travel_period_hook, NULL
+	},
+	{
 		{"archive_timeout", PGC_SIGHUP, WAL_ARCHIVING,
 			gettext_noop("Forces a switch to the next WAL file if a "
 						 "new file has not been started within N seconds."),
@@ -10530,4 +10540,21 @@ show_log_file_mode(void)
 	return buf;
 }
 
+static void assign_time_travel_period_hook(int newval, void *extra)
+{
+	if (newval != 0)
+	{
+		track_commit_timestamp = true;
+		if (newval < 0)
+		{
+			autovacuum_start_daemon = false;
+			/* Do we actually need to adjust freeze horizon? 
+			vacuum_freeze_min_age = MaxTimeTravelPeriod;
+			autovacuum_freeze_max_age = MaxTimeTravelPeriod*2;
+			autovacuum_multixact_freeze_max_age = MaxTimeTravelPeriod*2;
+			*/
+		}
+	}
+}
+
 #include "guc-file.c"
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 0b03290..d0bacd4 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -73,6 +73,7 @@
  * GUC parameters
  */
 int			old_snapshot_threshold; /* number of minutes, -1 disables */
+int         time_travel_period;     /* number of seconds, 0 disables, -1 infinite */
 
 /*
  * Structure for dealing with old_snapshot_threshold implementation.
@@ -244,6 +245,7 @@ typedef struct SerializedSnapshotData
 	bool		takenDuringRecovery;
 	CommandId	curcid;
 	TimestampTz whenTaken;
+	TimestampTz asofTimestamp;
 	XLogRecPtr	lsn;
 } SerializedSnapshotData;
 
@@ -2080,6 +2082,7 @@ SerializeSnapshot(Snapshot snapshot, char *start_address)
 	serialized_snapshot.takenDuringRecovery = snapshot->takenDuringRecovery;
 	serialized_snapshot.curcid = snapshot->curcid;
 	serialized_snapshot.whenTaken = snapshot->whenTaken;
+	serialized_snapshot.asofTimestamp = snapshot->asofTimestamp;
 	serialized_snapshot.lsn = snapshot->lsn;
 
 	/*
@@ -2154,6 +2157,7 @@ RestoreSnapshot(char *start_address)
 	snapshot->takenDuringRecovery = serialized_snapshot.takenDuringRecovery;
 	snapshot->curcid = serialized_snapshot.curcid;
 	snapshot->whenTaken = serialized_snapshot.whenTaken;
+	snapshot->asofTimestamp = serialized_snapshot.asofTimestamp;
 	snapshot->lsn = serialized_snapshot.lsn;
 
 	/* Copy XIDs, if present. */
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 2b218e0..09e067f 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -69,6 +69,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "access/xlog.h"
+#include "access/commit_ts.h"
 #include "storage/bufmgr.h"
 #include "storage/procarray.h"
 #include "utils/builtins.h"
@@ -1476,6 +1477,16 @@ XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
 {
 	uint32		i;
 
+	if (snapshot->asofTimestamp != 0)
+	{
+		TimestampTz ts;
+		if (TransactionIdGetCommitTsData(xid, &ts, NULL))
+		{
+			return timestamptz_cmp_internal(snapshot->asofTimestamp, ts) < 0;
+		}
+	}
+
+
 	/*
 	 * Make a quick range check to eliminate most XIDs without looking at the
 	 * xip arrays.  Note that this is OK even if we convert a subxact XID to
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 86076de..e46f4c6 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -33,6 +33,7 @@
 #define FrozenTransactionId			((TransactionId) 2)
 #define FirstNormalTransactionId	((TransactionId) 3)
 #define MaxTransactionId			((TransactionId) 0xFFFFFFFF)
+#define MaxTimeTravelPeriod         ((TransactionId) 0x3FFFFFFF)
 
 /* ----------------
  *		transaction ID manipulation macros
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index c9a5279..ed923ab 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1120,6 +1120,9 @@ typedef struct ScanState
 	Relation	ss_currentRelation;
 	HeapScanDesc ss_currentScanDesc;
 	TupleTableSlot *ss_ScanTupleSlot;
+	ExprState  *asofExpr;	      /* AS OF expression */
+	bool        asofTimestampSet; /* AS OF timestamp evaluated */
+	TimestampTz asofTimestamp;    /* AS OF timestamp or 0 if not set */
 } ScanState;
 
 /* ----------------
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2eaa6b2..b78c8e2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1062,6 +1062,7 @@ typedef struct RangeTblEntry
 	Bitmapset  *insertedCols;	/* columns needing INSERT permission */
 	Bitmapset  *updatedCols;	/* columns needing UPDATE permission */
 	List	   *securityQuals;	/* security barrier quals to apply, if any */
+	Node       *asofTimestamp;  /* AS OF timestamp */
 } RangeTblEntry;
 
 /*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index d763da6..083dc90 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -327,7 +327,8 @@ typedef struct BitmapOr
 typedef struct Scan
 {
 	Plan		plan;
-	Index		scanrelid;		/* relid is index into the range table */
+	Index		scanrelid;	   /* relid is index into the range table */
+	Node       *asofTimestamp; /* AS OF timestamp */
 } Scan;
 
 /* ----------------
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 074ae0a..11e1a0c 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -70,6 +70,7 @@ typedef struct RangeVar
 								 * on children? */
 	char		relpersistence; /* see RELPERSISTENCE_* in pg_class.h */
 	Alias	   *alias;			/* table alias & optional column aliases */
+	Node       *asofTimestamp;  /* expression with AS OF timestamp */
 	int			location;		/* token location, or -1 if unknown */
 } RangeVar;
 
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 565bb3d..b1efb5c 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -68,7 +68,8 @@ typedef enum ParseExprKind
 	EXPR_KIND_TRIGGER_WHEN,		/* WHEN condition in CREATE TRIGGER */
 	EXPR_KIND_POLICY,			/* USING or WITH CHECK expr in policy */
 	EXPR_KIND_PARTITION_EXPRESSION,	/* PARTITION BY expression */
-	EXPR_KIND_CALL				/* CALL argument */
+	EXPR_KIND_CALL,				/* CALL argument */
+	EXPR_KIND_ASOF              /* AS OF */
 } ParseExprKind;
 
 
diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h
index 8585194..1e1414a 100644
--- a/src/include/utils/snapmgr.h
+++ b/src/include/utils/snapmgr.h
@@ -47,7 +47,7 @@
 
 /* GUC variables */
 extern PGDLLIMPORT int old_snapshot_threshold;
-
+extern PGDLLIMPORT int time_travel_period;
 
 extern Size SnapMgrShmemSize(void);
 extern void SnapMgrInit(void);
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index bf51977..a00f0d9 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -111,6 +111,7 @@ typedef struct SnapshotData
 	pairingheap_node ph_node;	/* link in the RegisteredSnapshots heap */
 
 	TimestampTz whenTaken;		/* timestamp when snapshot was taken */
+	TimestampTz asofTimestamp;	/* select AS OF timestamp */
 	XLogRecPtr	lsn;			/* position in the WAL stream when taken */
 } SnapshotData;
 
diff --git a/src/test/regress/asof_schedule b/src/test/regress/asof_schedule
new file mode 100644
index 0000000..9e77b91
--- /dev/null
+++ b/src/test/regress/asof_schedule
@@ -0,0 +1,2 @@
+# src/test/regress/asof_schedule
+test: asof
diff --git a/src/test/regress/expected/asof.out b/src/test/regress/expected/asof.out
new file mode 100644
index 0000000..c2c46ac
--- /dev/null
+++ b/src/test/regress/expected/asof.out
@@ -0,0 +1,185 @@
+-- This test requires postgres to be configured with track_commit_timestamp = on
+-- Please run it using make check EXTRA_REGRESS_OPTS="--schedule=asof_schedule --temp-config=postgresql.asof.config"
+alter system set time_travel_period = 10;
+select pg_reload_conf();
+ pg_reload_conf 
+----------------
+ t
+(1 row)
+
+create table foo(pk int primary key, val int);
+insert into foo values (1,10);
+insert into foo values (2,20);
+insert into foo values (3,30);
+select * from foo;
+ pk | val 
+----+-----
+  1 |  10
+  2 |  20
+  3 |  30
+(3 rows)
+
+select pg_sleep(1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+update foo set val=val+1 where pk=1;
+select pg_sleep(1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+update foo set val=val+1 where pk=2;
+select pg_sleep(1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+update foo set val=val+1 where pk=3;
+select pg_sleep(1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+update foo set val=val+1 where pk=1;
+select pg_sleep(1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+update foo set val=val+1 where pk=2;
+select pg_sleep(1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+update foo set val=val+1 where pk=3;
+select * from foo as of now() - interval '1 second';
+ pk | val 
+----+-----
+  3 |  31
+  1 |  12
+  2 |  22
+(3 rows)
+
+select * from foo as of now() - interval '1 second' where pk=3;
+ pk | val 
+----+-----
+  3 |  31
+(1 row)
+
+select new_foo.val - old_foo.val from foo as old_foo as of now() - interval '1 second' join foo as new_foo on old_foo.pk=new_foo.pk where old_foo.pk=3;
+ ?column? 
+----------
+        1
+(1 row)
+
+select * from foo as of now() - interval '2 seconds';
+ pk | val 
+----+-----
+  2 |  21
+  3 |  31
+  1 |  12
+(3 rows)
+
+select * from foo as of now() - interval '2 seconds' where pk=2;
+ pk | val 
+----+-----
+  2 |  21
+(1 row)
+
+select * from foo as of now() - interval '3 seconds';
+ pk | val 
+----+-----
+  1 |  11
+  2 |  21
+  3 |  31
+(3 rows)
+
+select * from foo as of now() - interval '3 seconds' where pk=1;
+ pk | val 
+----+-----
+  1 |  11
+(1 row)
+
+select * from foo as of now() - interval '4 seconds';
+ pk | val 
+----+-----
+  3 |  30
+  1 |  11
+  2 |  21
+(3 rows)
+
+select * from foo as of now() - interval '4 seconds' where pk=3;
+ pk | val 
+----+-----
+  3 |  30
+(1 row)
+
+select * from foo as of now() - interval '5 seconds';
+ pk | val 
+----+-----
+  2 |  20
+  3 |  30
+  1 |  11
+(3 rows)
+
+select * from foo as of now() - interval '5 seconds' where pk=2;
+ pk | val 
+----+-----
+  2 |  20
+(1 row)
+
+select * from foo as of now() - interval '6 seconds';
+ pk | val 
+----+-----
+  1 |  10
+  2 |  20
+  3 |  30
+(3 rows)
+
+select * from foo as of now() - interval '6 seconds' where pk=1;
+ pk | val 
+----+-----
+  1 |  10
+(1 row)
+
+vacuum foo;
+select * from foo as of now() - interval '6 seconds';
+ pk | val 
+----+-----
+  1 |  10
+  2 |  20
+  3 |  30
+(3 rows)
+
+select pg_sleep(10);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+vacuum foo;
+select * from foo as of now() - interval '10 seconds';
+ pk | val 
+----+-----
+  1 |  12
+  2 |  22
+  3 |  32
+(3 rows)
+
+alter system set time_travel_period = 0;
+select pg_reload_conf();
+ pg_reload_conf 
+----------------
+ t
+(1 row)
+
+drop table foo;
diff --git a/src/test/regress/sql/asof.sql b/src/test/regress/sql/asof.sql
new file mode 100644
index 0000000..baa134c
--- /dev/null
+++ b/src/test/regress/sql/asof.sql
@@ -0,0 +1,43 @@
+-- This test requires postgres to be configured with track_commit_timestamp = on
+-- Please run it using make check EXTRA_REGRESS_OPTS="--schedule=asof_schedule --temp-config=postgresql.asof.config"
+alter system set time_travel_period = 10;
+select pg_reload_conf();
+create table foo(pk int primary key, val int);
+insert into foo values (1,10);
+insert into foo values (2,20);
+insert into foo values (3,30);
+select * from foo;
+select pg_sleep(1);
+update foo set val=val+1 where pk=1;
+select pg_sleep(1);
+update foo set val=val+1 where pk=2;
+select pg_sleep(1);
+update foo set val=val+1 where pk=3;
+select pg_sleep(1);
+update foo set val=val+1 where pk=1;
+select pg_sleep(1);
+update foo set val=val+1 where pk=2;
+select pg_sleep(1);
+update foo set val=val+1 where pk=3;
+select * from foo as of now() - interval '1 second';
+select * from foo as of now() - interval '1 second' where pk=3;
+select new_foo.val - old_foo.val from foo as old_foo as of now() - interval '1 second' join foo as new_foo on old_foo.pk=new_foo.pk where old_foo.pk=3;
+select * from foo as of now() - interval '2 seconds';
+select * from foo as of now() - interval '2 seconds' where pk=2;
+select * from foo as of now() - interval '3 seconds';
+select * from foo as of now() - interval '3 seconds' where pk=1;
+select * from foo as of now() - interval '4 seconds';
+select * from foo as of now() - interval '4 seconds' where pk=3;
+select * from foo as of now() - interval '5 seconds';
+select * from foo as of now() - interval '5 seconds' where pk=2;
+select * from foo as of now() - interval '6 seconds';
+select * from foo as of now() - interval '6 seconds' where pk=1;
+vacuum foo;
+select * from foo as of now() - interval '6 seconds';
+select pg_sleep(10);
+vacuum foo;
+select * from foo as of now() - interval '10 seconds';
+
+alter system set time_travel_period = 0;
+select pg_reload_conf();
+drop table foo;
#36 Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Konstantin Knizhnik (#35)
Re: AS OF queries

On 12/29/17 06:28, Konstantin Knizhnik wrote:

Can there be apparent RI
violations?

Right now AS OF is used only in selects, not in update statements. So I
do not understand how integrity constraints can be violated.

I mean, if you join tables connected by a foreign key, you can expect a
certain shape of result, for example at least one match per PK row. But
if you select from each table "as of" a different timestamp, then that
won't hold. That could also throw off any optimizations we might come
up with in that area, such as cross-table statistics. Not saying it
can't or shouldn't be done, but there might be some questions.

What happens if no old data for the
selected AS OF is available?

It will just return the version closest to the specified timestamp.

That seems strange. Shouldn't that be an error?

How does this interact with catalog
changes, such as changes to row-level security settings? (Do we apply
the current or the past settings?)

Catalog changes are not currently supported.
And I do not have a good understanding of how to support them if a query
involves two different timeslices with different versions of the table.
Too many places in the parser/optimizer would have to be changed to support
such "historical collisions".

Right, it's probably very hard to do. But I think it somehow should be
recognized that catalog changes took place between the selected
timestamp(s) and now and an error or notice should be produced.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#37 legrand legrand
legrand_legrand@hotmail.com
In reply to: Peter Eisentraut (#36)
Re: AS OF queries

Maybe a simple check of the asof_timestamp value like:

asof_timestamp >= now() - time_travel_period
AND
asof_timestamp >= latest_table_ddl

would permit raising a warning or an error message saying that the query
result cannot be guaranteed with this asof_timestamp value.

latest_table_ddl being found with

SELECT greatest( max(pg_xact_commit_timestamp( rel.xmin )),
max(pg_xact_commit_timestamp( att.xmin ))) as latest_table_ddl
FROM pg_catalog.pg_attribute att
INNER JOIN pg_catalog.pg_class rel
ON att.attrelid = rel.oid WHERE rel.relname = '<asof_tablename>' and
rel.relowner= ...

(tested with add/alter/drop column and drop/create/truncate table)

Regards
PAscal

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

#38 Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Peter Eisentraut (#36)
Re: AS OF queries

On 02.01.2018 21:12, Peter Eisentraut wrote:

On 12/29/17 06:28, Konstantin Knizhnik wrote:

Can there be apparent RI
violations?

Right now AS OF is used only in selects, not in update statements. So I
do not understand how integrity constraints can be violated.

I mean, if you join tables connected by a foreign key, you can expect a
certain shape of result, for example at least one match per PK row. But
if you select from each table "as of" a different timestamp, then that
won't hold. That could also throw off any optimizations we might come
up with in that area, such as cross-table statistics. Not saying it
can't or shouldn't be done, but there might be some questions.

Now I understand your statement. Yes, combining different timelines in
the same query can violate integrity constraints.
In theory there can be some query plans that will be executed
incorrectly because of this constraint violation.
I do not know of concrete examples of such plans right now, but I cannot
prove that such a problem cannot happen.

What happens if no old data for the
selected AS OF is available?

It will just return the version closest to the specified timestamp.

That seems strange. Shouldn't that be an error?

I will add an option for raising an error in this case.
I do not want to always throw an error, because Postgres is very
conservative in reclaiming old space, and the fact that a version is not
used by any snapshot doesn't mean that it will be immediately deleted.
So there is still a chance to pick up old data even though it is outside
the specified time travel period.

How does this interact with catalog
changes, such as changes to row-level security settings? (Do we apply
the current or the past settings?)

Catalog changes are not currently supported.
And I do not have a good understanding of how to support them if a query
involves two different timeslices with different versions of the table.
Too many places in the parser/optimizer would have to be changed to support
such "historical collisions".

Right, it's probably very hard to do. But I think it somehow should be
recognized that catalog changes took place between the selected
timestamp(s) and now and an error or notice should be produced.

There is one challenge: right now AS OF timestamps are not required to
be constants: they can be calculated dynamically during query execution.
So at query compilation time it is not possible to check whether the
specified timestamps fall before or after catalog changes.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#39 Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: legrand legrand (#37)
Re: AS OF queries

On 03.01.2018 23:49, legrand legrand wrote:

Maybe a simple check of the asof_timestamp value like:

asof_timestamp >= now() - time_travel_period
AND
asof_timestamp >= latest_table_ddl

would permit raising a warning or an error message saying that the query
result cannot be guaranteed with this asof_timestamp value.

latest_table_ddl being found with

SELECT greatest( max(pg_xact_commit_timestamp( rel.xmin )),
max(pg_xact_commit_timestamp( att.xmin ))) as latest_table_ddl
FROM pg_catalog.pg_attribute att
INNER JOIN pg_catalog.pg_class rel
ON att.attrelid = rel.oid WHERE rel.relname = '<asof_tablename>' and
rel.relowner= ...

(tested with add/alter/drop column and drop/create/truncate table)

Well, it can be done.
But performing this query on each access to the table seems to be a bad
idea: in case of a nested loop join it can cause significant performance
degradation.
The obvious solution is to calculate this latest_table_ddl timestamp
once and store it somewhere (in ScanState?).
But I am not sure that this check is actually needed.
If the table was changed in some incompatible way, then we will get an
error in any case.
If the table change is not critical for this query (for example, a
column not used in this query was added or removed),
then should we really throw an error in this case?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#40 legrand legrand
legrand_legrand@hotmail.com
In reply to: Konstantin Knizhnik (#39)
Re: AS OF queries

But performing this query on each access to the table seems to be a bad
idea: in case of a nested loop join it can cause significant performance
degradation.

this could be a pre-plan / pre-exec check, no more.

But I am not sure that this check is actually needed.
If the table was changed in some incompatible way, then we will get an
error in any case.

It seems that with patch v3, a query with asof_timestamp
set before a truncate or alter table doesn't throw any error,
just gives an empty result (even if there was data).

If the table change is not critical for this query (for example, a
column not used in this query was added or removed),
then should we really throw an error in this case?

No error is needed if the result is correct.

Regards
PAscal

--

#41 Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: legrand legrand (#40)
1 attachment(s)
Re: AS OF queries

On 10.01.2018 16:02, legrand legrand wrote:

But performing this query on each access to the table seems to be a bad
idea: in case of a nested loop join it can cause significant performance
degradation.

this could be a pre-plan / pre-exec check, no more.

The AS OF timestamp can be taken from an outer table, so it would be
necessary to repeat this check at each nested loop join iteration.

But I am not sure that this check is actually needed.
If the table was changed in some incompatible way, then we will get an
error in any case.

It seems that with patch v3, a query with asof_timestamp
set before a truncate or alter table doesn't throw any error,
just gives an empty result (even if there was data).

Sorry, TRUNCATE is not compatible with AS OF. It is performed at the
file level and deletes the old versions.
So if you want to use time travel, you should not use TRUNCATE.

If the table change is not critical for this query (for example, a
column not used in this query was added or removed),
then should we really throw an error in this case?

No error is needed if the result is correct.

Does it mean that no explicit check is needed that the table metadata
was not changed after the specified timeslice?

Attached please find a new version of the AS OF patch, which throws an
error if the specified AS OF timestamp is older than the time travel
horizon and the "check_asof_timestamp" parameter is set to true (by
default it is switched off).

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

asof-6.patchtext/x-patch; name=asof-6.patchDownload
diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 837abc0..3ac7868 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -21,7 +21,8 @@
 #include "executor/executor.h"
 #include "miscadmin.h"
 #include "utils/memutils.h"
-
+#include "utils/snapmgr.h"
+#include "utils/timestamp.h"
 
 
 /*
@@ -296,3 +297,43 @@ ExecScanReScan(ScanState *node)
 		}
 	}
 }
+
+/*
+ * Evaluate the AS OF timestamp,
+ * check that it belongs to the time travel period (if specified),
+ * and assign it to the snapshot.
+ * This function throws an error if the specified timestamp is outside
+ * time_travel_period and the check_asof_timestamp parameter is true.
+ */
+void ExecAsofTimestamp(EState* estate, ScanState* ss)
+{
+	if (ss->asofExpr)
+	{
+		if (!ss->asofTimestampSet)
+		{
+			Datum		val;
+			bool		isNull;
+
+			val = ExecEvalExprSwitchContext(ss->asofExpr,
+											ss->ps.ps_ExprContext,
+											&isNull);
+			if (isNull)
+			{
+				/* Interpret NULL timestamp as no timestamp */
+				ss->asofTimestamp = 0;
+			}
+			else
+			{
+				ss->asofTimestamp = DatumGetInt64(val);
+				if (check_asof_timestamp && time_travel_period > 0)
+				{
+					TimestampTz horizon = GetCurrentTimestamp()	- (TimestampTz)time_travel_period*USECS_PER_SEC;
+					if (timestamptz_cmp_internal(horizon, ss->asofTimestamp) > 0)
+						elog(ERROR, "Specified AS OF timestamp is outside the time travel horizon");
+				}
+			}
+			ss->asofTimestampSet = true;
+		}
+		estate->es_snapshot->asofTimestamp = ss->asofTimestamp;
+	}
+}
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index eb5bbb5..b880c18 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -78,6 +78,7 @@ BitmapHeapNext(BitmapHeapScanState *node)
 	ExprContext *econtext;
 	HeapScanDesc scan;
 	TIDBitmap  *tbm;
+	EState	   *estate;
 	TBMIterator *tbmiterator = NULL;
 	TBMSharedIterator *shared_tbmiterator = NULL;
 	TBMIterateResult *tbmres;
@@ -85,11 +86,13 @@ BitmapHeapNext(BitmapHeapScanState *node)
 	TupleTableSlot *slot;
 	ParallelBitmapHeapState *pstate = node->pstate;
 	dsa_area   *dsa = node->ss.ps.state->es_query_dsa;
+	TimestampTz     outerAsofTimestamp = 0;
 
 	/*
 	 * extract necessary information from index scan node
 	 */
 	econtext = node->ss.ps.ps_ExprContext;
+	estate = node->ss.ps.state;
 	slot = node->ss.ss_ScanTupleSlot;
 	scan = node->ss.ss_currentScanDesc;
 	tbm = node->tbm;
@@ -99,6 +102,9 @@ BitmapHeapNext(BitmapHeapScanState *node)
 		shared_tbmiterator = node->shared_tbmiterator;
 	tbmres = node->tbmres;
 
+	outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	ExecAsofTimestamp(estate, &node->ss);
+
 	/*
 	 * If we haven't yet performed the underlying index scan, do it, and begin
 	 * the iteration over the bitmap.
@@ -364,11 +370,21 @@ BitmapHeapNext(BitmapHeapScanState *node)
 			}
 		}
 
-		/* OK to return this tuple */
+		/*
+		 * Restore ASOF timestamp for the current snapshot
+		 */
+		estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
+		/* OK to return this tuple */
 		return slot;
 	}
 
 	/*
+	 * Restore ASOF timestamp for the current snapshot
+	 */
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
+	/*
 	 * if we get here it means we are at the end of the scan..
 	 */
 	return ExecClearTuple(slot);
@@ -746,6 +762,8 @@ ExecReScanBitmapHeapScan(BitmapHeapScanState *node)
 {
 	PlanState  *outerPlan = outerPlanState(node);
 
+	node->ss.asofTimestampSet = false;
+
 	/* rescan to release any page pin */
 	heap_rescan(node->ss.ss_currentScanDesc, NULL);
 
@@ -902,7 +920,8 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
 	 * most cases it's probably not worth working harder than that.
 	 */
 	scanstate->can_skip_fetch = (node->scan.plan.qual == NIL &&
-								 node->scan.plan.targetlist == NIL);
+								 node->scan.plan.targetlist == NIL &&
+								 node->scan.asofTimestamp == NULL);
 
 	/*
 	 * Miscellaneous initialization
@@ -920,6 +939,18 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
 		ExecInitQual(node->bitmapqualorig, (PlanState *) scanstate);
 
 	/*
+	 * Initialize AS OF expression if any
+	 */
+	if (node->scan.asofTimestamp)
+	{
+		scanstate->ss.asofExpr = ExecInitExpr((Expr *) node->scan.asofTimestamp,
+										   &scanstate->ss.ps);
+		scanstate->ss.asofTimestampSet = false;
+	}
+	else
+		scanstate->ss.asofExpr = NULL;
+
+	/*
 	 * tuple table initialization
 	 */
 	ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
@@ -1052,11 +1083,21 @@ ExecBitmapHeapInitializeDSM(BitmapHeapScanState *node,
 	ParallelBitmapHeapState *pstate;
 	EState	   *estate = node->ss.ps.state;
 	dsa_area   *dsa = node->ss.ps.state->es_query_dsa;
+	TimestampTz     outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	Scan* scan = (Scan*)node->ss.ps.plan;
 
 	/* If there's no DSA, there are no workers; initialize nothing. */
 	if (dsa == NULL)
 		return;
 
+	if (scan->asofTimestamp)
+	{
+		node->ss.asofExpr = ExecInitExpr((Expr *) scan->asofTimestamp,
+										 &node->ss.ps);
+		node->ss.asofTimestampSet = false;
+		ExecAsofTimestamp(estate, &node->ss);
+	}
+
 	pstate = shm_toc_allocate(pcxt->toc, node->pscan_len);
 
 	pstate->tbmiterator = 0;
@@ -1071,6 +1112,8 @@ ExecBitmapHeapInitializeDSM(BitmapHeapScanState *node,
 	ConditionVariableInit(&pstate->cv);
 	SerializeSnapshot(estate->es_snapshot, pstate->phs_snapshot_data);
 
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pstate);
 	node->pstate = pstate;
 }
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index 2ffef23..58d20bf 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -86,7 +86,7 @@ IndexNext(IndexScanState *node)
 	IndexScanDesc scandesc;
 	HeapTuple	tuple;
 	TupleTableSlot *slot;
-
+	TimestampTz outerAsofTimestamp;
 	/*
 	 * extract necessary information from index scan node
 	 */
@@ -104,6 +104,9 @@ IndexNext(IndexScanState *node)
 	econtext = node->ss.ps.ps_ExprContext;
 	slot = node->ss.ss_ScanTupleSlot;
 
+	outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	ExecAsofTimestamp(estate, &node->ss);
+
 	if (scandesc == NULL)
 	{
 		/*
@@ -160,9 +163,17 @@ IndexNext(IndexScanState *node)
 				continue;
 			}
 		}
+		/*
+		 * Restore ASOF timestamp for the current snapshot
+		 */
+		estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
 
 		return slot;
 	}
+	/*
+	 * Restore ASOF timestamp for the current snapshot
+	 */
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
 
 	/*
 	 * if we get here it means the index scan failed so we are at the end of
@@ -578,6 +589,8 @@ ExecIndexScan(PlanState *pstate)
 void
 ExecReScanIndexScan(IndexScanState *node)
 {
+	node->ss.asofTimestampSet = false;
+
 	/*
 	 * If we are doing runtime key calculations (ie, any of the index key
 	 * values weren't simple Consts), compute the new key values.  But first,
@@ -918,6 +931,18 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
 		ExecInitExprList(node->indexorderbyorig, (PlanState *) indexstate);
 
 	/*
+	 * Initialize AS OF expression, if any
+	 */
+	if (node->scan.asofTimestamp)
+	{
+		indexstate->ss.asofExpr = ExecInitExpr((Expr *) node->scan.asofTimestamp,
+											&indexstate->ss.ps);
+		indexstate->ss.asofTimestampSet = false;
+	}
+	else
+		indexstate->ss.asofExpr = NULL;
+
+	/*
 	 * tuple table initialization
 	 */
 	ExecInitResultTupleSlot(estate, &indexstate->ss.ps);
@@ -1672,12 +1697,24 @@ ExecIndexScanInitializeDSM(IndexScanState *node,
 {
 	EState	   *estate = node->ss.ps.state;
 	ParallelIndexScanDesc piscan;
+	TimestampTz outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	Scan* scan = (Scan*)node->ss.ps.plan;
+
+	if (scan->asofTimestamp)
+	{
+		node->ss.asofExpr = ExecInitExpr((Expr *) scan->asofTimestamp,
+									  &node->ss.ps);
+		node->ss.asofTimestampSet = false;
+		ExecAsofTimestamp(estate, &node->ss);
+	}
 
 	piscan = shm_toc_allocate(pcxt->toc, node->iss_PscanLen);
 	index_parallelscan_initialize(node->ss.ss_currentRelation,
 								  node->iss_RelationDesc,
 								  estate->es_snapshot,
 								  piscan);
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, piscan);
 	node->iss_ScanDesc =
 		index_beginscan_parallel(node->ss.ss_currentRelation,
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index a5bd60e..3fbf46d 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -54,6 +54,7 @@ SeqNext(SeqScanState *node)
 	EState	   *estate;
 	ScanDirection direction;
 	TupleTableSlot *slot;
+	TimestampTz     outerAsofTimestamp;
 
 	/*
 	 * get information from the estate and scan state
@@ -63,6 +64,9 @@ SeqNext(SeqScanState *node)
 	direction = estate->es_direction;
 	slot = node->ss.ss_ScanTupleSlot;
 
+	outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	ExecAsofTimestamp(estate, &node->ss);
+
 	if (scandesc == NULL)
 	{
 		/*
@@ -81,6 +85,11 @@ SeqNext(SeqScanState *node)
 	tuple = heap_getnext(scandesc, direction);
 
 	/*
+	 * Restore ASOF timestamp for the current snapshot
+	 */
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
+	/*
 	 * save the tuple and the buffer returned to us by the access methods in
 	 * our scan tuple slot and return the slot.  Note: we pass 'false' because
 	 * tuples returned by heap_getnext() are pointers onto disk pages and were
@@ -196,6 +205,19 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
 		ExecInitQual(node->plan.qual, (PlanState *) scanstate);
 
 	/*
+	 * Initialize AS OF expression, if any
+	 */
+	if (node->asofTimestamp)
+	{
+		scanstate->ss.asofExpr = ExecInitExpr((Expr *) node->asofTimestamp,
+											&scanstate->ss.ps);
+		scanstate->ss.asofTimestampSet = false;
+	}
+	else
+		scanstate->ss.asofExpr = NULL;
+
+
+	/*
 	 * tuple table initialization
 	 */
 	ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
@@ -273,6 +295,7 @@ ExecReScanSeqScan(SeqScanState *node)
 	HeapScanDesc scan;
 
 	scan = node->ss.ss_currentScanDesc;
+	node->ss.asofTimestampSet = false;
 
 	if (scan != NULL)
 		heap_rescan(scan,		/* scan desc */
@@ -316,11 +339,24 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
 {
 	EState	   *estate = node->ss.ps.state;
 	ParallelHeapScanDesc pscan;
+	TimestampTz     outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
+	Scan* scan = (Scan*)node->ss.ps.plan;
+
+	if (scan->asofTimestamp)
+	{
+		node->ss.asofExpr = ExecInitExpr((Expr *) scan->asofTimestamp,
+										 &node->ss.ps);
+		node->ss.asofTimestampSet = false;
+		ExecAsofTimestamp(estate, &node->ss);
+	}
 
 	pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
 	heap_parallelscan_initialize(pscan,
 								 node->ss.ss_currentRelation,
 								 estate->es_snapshot);
+
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
+
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pscan);
 	node->ss.ss_currentScanDesc =
 		heap_beginscan_parallel(node->ss.ss_currentRelation, pscan);
@@ -337,8 +373,13 @@ ExecSeqScanReInitializeDSM(SeqScanState *node,
 						   ParallelContext *pcxt)
 {
 	HeapScanDesc scan = node->ss.ss_currentScanDesc;
+	EState	   *estate = node->ss.ps.state;
+	TimestampTz  outerAsofTimestamp = estate->es_snapshot->asofTimestamp;
 
+	ExecAsofTimestamp(estate, &node->ss);
 	heap_parallelscan_reinitialize(scan->rs_parallel);
+
+	estate->es_snapshot->asofTimestamp = outerAsofTimestamp;
 }
 
 /* ----------------------------------------------------------------
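The seq scan, index scan and bitmap heap scan nodes above all repeat the same choreography: evaluate the AS OF expression once per (re)scan, install the result into the shared snapshot around the tuple fetch, and restore the caller's value afterwards so nested scans with different AS OF clauses don't clobber each other. A minimal Python sketch of that pattern (all names are mine, not from the patch):

```python
# Conceptual model of the SeqNext/IndexNext changes: the scan state caches
# the evaluated AS OF timestamp (asofTimestampSet flag), pushes it into the
# snapshot before fetching, and restores the outer value before returning.

class Snapshot:
    def __init__(self):
        self.asof_timestamp = 0   # 0 means "no AS OF"

class ScanState:
    def __init__(self, asof_expr=None):
        self.asof_expr = asof_expr
        self.asof_set = False
        self.asof_timestamp = 0

    def exec_asof(self, snapshot):
        """Mirror of ExecAsofTimestamp: evaluate once per rescan, then reuse."""
        if self.asof_expr is not None:
            if not self.asof_set:
                self.asof_timestamp = self.asof_expr()  # ExecEvalExpr
                self.asof_set = True
            snapshot.asof_timestamp = self.asof_timestamp

def next_tuple(scan, snapshot, fetch):
    outer = snapshot.asof_timestamp     # save caller's AS OF timestamp
    scan.exec_asof(snapshot)
    tup = fetch(snapshot)               # heap_getnext / index_getnext
    snapshot.asof_timestamp = outer     # restore before returning
    return tup

snap = Snapshot()
scan = ScanState(asof_expr=lambda: 12345)
seen = []
next_tuple(scan, snap, lambda s: seen.append(s.asof_timestamp))
print(seen[0], snap.asof_timestamp)
```

The restore step is what makes the snapshot mutation safe: the snapshot object is shared, so the AS OF value must only be in effect for the duration of the fetch itself.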
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 84d7171..259d991 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -410,6 +410,7 @@ CopyScanFields(const Scan *from, Scan *newnode)
 	CopyPlanFields((const Plan *) from, (Plan *) newnode);
 
 	COPY_SCALAR_FIELD(scanrelid);
+	COPY_NODE_FIELD(asofTimestamp);
 }
 
 /*
@@ -1216,6 +1217,7 @@ _copyRangeVar(const RangeVar *from)
 	COPY_SCALAR_FIELD(relpersistence);
 	COPY_NODE_FIELD(alias);
 	COPY_LOCATION_FIELD(location);
+	COPY_NODE_FIELD(asofTimestamp);
 
 	return newnode;
 }
@@ -2326,6 +2328,7 @@ _copyRangeTblEntry(const RangeTblEntry *from)
 	COPY_BITMAPSET_FIELD(insertedCols);
 	COPY_BITMAPSET_FIELD(updatedCols);
 	COPY_NODE_FIELD(securityQuals);
+	COPY_NODE_FIELD(asofTimestamp);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 2e869a9..8ee4228 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -112,6 +112,7 @@ _equalRangeVar(const RangeVar *a, const RangeVar *b)
 	COMPARE_SCALAR_FIELD(relpersistence);
 	COMPARE_NODE_FIELD(alias);
 	COMPARE_LOCATION_FIELD(location);
+	COMPARE_NODE_FIELD(asofTimestamp);
 
 	return true;
 }
@@ -2661,6 +2662,7 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
 	COMPARE_BITMAPSET_FIELD(insertedCols);
 	COMPARE_BITMAPSET_FIELD(updatedCols);
 	COMPARE_NODE_FIELD(securityQuals);
+	COMPARE_NODE_FIELD(asofTimestamp);
 
 	return true;
 }
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index c2a93b2..0ace44d 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2338,6 +2338,10 @@ range_table_walker(List *rtable,
 
 		if (walker(rte->securityQuals, context))
 			return true;
+
+		if (walker(rte->asofTimestamp, context))
+			return true;
+
 	}
 	return false;
 }
@@ -3161,6 +3165,7 @@ range_table_mutator(List *rtable,
 				break;
 		}
 		MUTATE(newrte->securityQuals, rte->securityQuals, List *);
+		MUTATE(newrte->asofTimestamp, rte->asofTimestamp, Node *);
 		newrt = lappend(newrt, newrte);
 	}
 	return newrt;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index e468d7c..3ee00f3 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -3105,6 +3105,7 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
 	WRITE_BITMAPSET_FIELD(insertedCols);
 	WRITE_BITMAPSET_FIELD(updatedCols);
 	WRITE_NODE_FIELD(securityQuals);
+	WRITE_NODE_FIELD(asofTimestamp);
 }
 
 static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 1133c70..cf7c637 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1399,6 +1399,7 @@ _readRangeTblEntry(void)
 	READ_BITMAPSET_FIELD(insertedCols);
 	READ_BITMAPSET_FIELD(updatedCols);
 	READ_NODE_FIELD(securityQuals);
+	READ_NODE_FIELD(asofTimestamp);
 
 	READ_DONE();
 }
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 1a9fd82..713f9b3 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -168,10 +168,10 @@ static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
 static SampleScan *make_samplescan(List *qptlist, List *qpqual, Index scanrelid,
 				TableSampleClause *tsc);
 static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
-			   Oid indexid, List *indexqual, List *indexqualorig,
-			   List *indexorderby, List *indexorderbyorig,
-			   List *indexorderbyops,
-			   ScanDirection indexscandir);
+								 Oid indexid, List *indexqual, List *indexqualorig,
+								 List *indexorderby, List *indexorderbyorig,
+								 List *indexorderbyops,
+								 ScanDirection indexscandir);
 static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
 				   Index scanrelid, Oid indexid,
 				   List *indexqual, List *indexorderby,
@@ -509,6 +509,7 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
 	List	   *gating_clauses;
 	List	   *tlist;
 	Plan	   *plan;
+	RangeTblEntry *rte;
 
 	/*
 	 * Extract the relevant restriction clauses from the parent relation. The
@@ -709,6 +710,12 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
 			break;
 	}
 
+	if (plan != NULL)
+	{
+		rte = planner_rt_fetch(rel->relid, root);
+		((Scan*)plan)->asofTimestamp = rte->asofTimestamp;
+	}
+
 	/*
 	 * If there are any pseudoconstant clauses attached to this node, insert a
 	 * gating Result node that evaluates the pseudoconstants as one-time
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 382791f..ceb6542 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -84,6 +84,7 @@ create_upper_paths_hook_type create_upper_paths_hook = NULL;
 #define EXPRKIND_ARBITER_ELEM		10
 #define EXPRKIND_TABLEFUNC			11
 #define EXPRKIND_TABLEFUNC_LATERAL	12
+#define EXPRKIND_ASOF				13
 
 /* Passthrough data for standard_qp_callback */
 typedef struct
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ebfc94f..a642e28 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -449,7 +449,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
 %type <node>	fetch_args limit_clause select_limit_value
 				offset_clause select_offset_value
-				select_offset_value2 opt_select_fetch_first_value
+				select_offset_value2 opt_select_fetch_first_value opt_asof_clause
 %type <ival>	row_or_rows first_or_next
 
 %type <list>	OptSeqOptList SeqOptList OptParenthesizedSeqOptList
@@ -704,7 +704,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
  * as NOT, at least with respect to their left-hand subexpression.
  * NULLS_LA and WITH_LA are needed to make the grammar LALR(1).
  */
-%token		NOT_LA NULLS_LA WITH_LA
+%token		NOT_LA NULLS_LA WITH_LA AS_LA
 
 
 /* Precedence: lowest to highest */
@@ -11720,9 +11720,10 @@ from_list:
 /*
  * table_ref is where an alias clause can be attached.
  */
-table_ref:	relation_expr opt_alias_clause
+table_ref:	relation_expr opt_alias_clause opt_asof_clause
 				{
 					$1->alias = $2;
+					$1->asofTimestamp = $3;
 					$$ = (Node *) $1;
 				}
 			| relation_expr opt_alias_clause tablesample_clause
@@ -11948,6 +11949,10 @@ opt_alias_clause: alias_clause						{ $$ = $1; }
 			| /*EMPTY*/								{ $$ = NULL; }
 		;
 
+opt_asof_clause: AS_LA OF a_expr                    { $$ = $3; }
+			| /*EMPTY*/								{ $$ = NULL; }
+		;
+
 /*
  * func_alias_clause can include both an Alias and a coldeflist, so we make it
  * return a 2-element list that gets disassembled by calling production.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 4c4f4cd..6c3e506 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -439,6 +439,7 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
@@ -856,6 +857,7 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 2828bbf..a23f3d8 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -426,7 +426,11 @@ transformTableEntry(ParseState *pstate, RangeVar *r)
 
 	/* We need only build a range table entry */
 	rte = addRangeTableEntry(pstate, r, r->alias, r->inh, true);
-
+	if (r->asofTimestamp)
+	{
+		Node* asof = transformExpr(pstate, r->asofTimestamp, EXPR_KIND_ASOF);
+		rte->asofTimestamp = coerce_to_specific_type(pstate, asof, TIMESTAMPTZOID, "ASOF");
+	}
 	return rte;
 }
 
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 29f9da7..cd83fc3 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1818,6 +1818,7 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
 		case EXPR_KIND_VALUES:
 		case EXPR_KIND_VALUES_SINGLE:
 		case EXPR_KIND_CALL:
+		case EXPR_KIND_ASOF:
 			/* okay */
 			break;
 		case EXPR_KIND_CHECK_CONSTRAINT:
@@ -3470,6 +3471,8 @@ ParseExprKindName(ParseExprKind exprKind)
 			return "PARTITION BY";
 		case EXPR_KIND_CALL:
 			return "CALL";
+		case EXPR_KIND_ASOF:
+			return "ASOF";
 
 			/*
 			 * There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e6b0856..a6bcfc7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2250,6 +2250,7 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
 			break;
 		case EXPR_KIND_LIMIT:
 		case EXPR_KIND_OFFSET:
+		case EXPR_KIND_ASOF:
 			errkind = true;
 			break;
 		case EXPR_KIND_RETURNING:
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 58bdb23..ddf6af4 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1206,6 +1206,7 @@ addRangeTableEntry(ParseState *pstate,
 
 	rte->rtekind = RTE_RELATION;
 	rte->alias = alias;
+	rte->asofTimestamp = relation->asofTimestamp;
 
 	/*
 	 * Get the rel's OID.  This access also ensures that we have an up-to-date
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 245b4cd..a3845b5 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -108,6 +108,9 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
 	 */
 	switch (cur_token)
 	{
+		case AS:
+			cur_token_length = 2;
+			break;
 		case NOT:
 			cur_token_length = 3;
 			break;
@@ -155,6 +158,10 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
 	/* Replace cur_token if needed, based on lookahead */
 	switch (cur_token)
 	{
+		case AS:
+			if (next_token == OF)
+				cur_token = AS_LA;
+			break;
 		case NOT:
 			/* Replace NOT by NOT_LA if it's followed by BETWEEN, IN, etc */
 			switch (next_token)
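The lexer change above sidesteps the shift/reduce conflict mentioned in the cover note: rather than teaching the grammar about plain AS before OF, the base_yylex filter rewrites AS into the special lookahead token AS_LA when OF follows. A simplified sketch of the substitution (the real filter keeps a one-token pushback buffer rather than rewriting a list):

```python
# AS becomes AS_LA only when immediately followed by OF, which is what lets
# opt_asof_clause accept "AS OF" without conflicting with ordinary alias
# clauses such as "FROM foo AS f".

def filter_tokens(tokens):
    out = []
    for i, tok in enumerate(tokens):
        if tok == "AS" and i + 1 < len(tokens) and tokens[i + 1] == "OF":
            out.append("AS_LA")   # lookahead variant consumed by opt_asof_clause
        else:
            out.append(tok)       # plain AS still feeds alias_clause
    return out

print(filter_tokens(["SELECT", "*", "FROM", "foo", "AS", "OF", "TS"]))
print(filter_tokens(["SELECT", "*", "FROM", "foo", "AS", "f"]))
```

This is the same trick the grammar already uses for NOT_LA, NULLS_LA and WITH_LA, so it keeps the grammar LALR(1) without reserving a new keyword.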
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index d87799c..945f782 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -51,6 +51,7 @@
 #include "access/twophase.h"
 #include "access/xact.h"
 #include "access/xlog.h"
+#include "access/commit_ts.h"
 #include "catalog/catalog.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -91,6 +92,9 @@ typedef struct ProcArrayStruct
 	/* oldest catalog xmin of any replication slot */
 	TransactionId replication_slot_catalog_xmin;
 
+	TransactionId time_travel_xmin;
+	TimestampTz   time_travel_horizon;
+
 	/* indexes into allPgXact[], has PROCARRAY_MAXPROCS entries */
 	int			pgprocnos[FLEXIBLE_ARRAY_MEMBER];
 } ProcArrayStruct;
@@ -1256,6 +1260,87 @@ TransactionIdIsActive(TransactionId xid)
 	return result;
 }
 
+/*
+ * Get the minimal XID that belongs to the time travel period, advancing the current
+ * time travel horizon if possible.  It uses the commit_ts SLRU to map XIDs to commit
+ * timestamps.  Since XID order does not necessarily match commit-timestamp order, the
+ * result may be inexact in the presence of long-running transactions, so the time
+ * travel period should account for the maximal transaction duration.
+ *
+ * The passed time_travel_xmin and time_travel_horizon were read from the procarray under lock.
+ */
+static TransactionId
+GetTimeTravelXmin(TransactionId oldestXmin, TransactionId time_travel_xmin, TimestampTz time_travel_horizon)
+{
+	if (time_travel_period < 0)
+	{
+		/* Infinite history */
+		oldestXmin -= MaxTimeTravelPeriod;
+	}
+	else
+	{
+		/* Limited history: check time travel horizon */
+		TimestampTz new_horizon = GetCurrentTimestamp() - (TimestampTz) time_travel_period * USECS_PER_SEC;
+		TransactionId old_xmin = time_travel_xmin;
+
+		if (time_travel_xmin != InvalidTransactionId)
+		{
+			/* We have already determined time travel horizon: check if it needs to be adjusted */
+			TimestampTz old_horizon = time_travel_horizon;
+			TransactionId xid = old_xmin;
+
+			while (timestamptz_cmp_internal(old_horizon, new_horizon) < 0)
+			{
+				/* Move horizon forward */
+				time_travel_xmin  = xid;
+				time_travel_horizon = old_horizon;
+				do {
+					TransactionIdAdvance(xid);
+					/* Stop if we reach oldest xmin */
+					if (TransactionIdFollowsOrEquals(xid, oldestXmin))
+						goto EndScan;
+				} while (!TransactionIdGetCommitTsData(xid, &old_horizon, NULL));
+			}
+		}
+		else
+		{
+			/* Find out time travel horizon */
+			TransactionId xid = oldestXmin;
+
+			do {
+				TransactionIdRetreat(xid);
+				/*
+				 * Lack of commit timestamp information in the SLRU means we have reached a
+				 * nonexistent or untracked transaction, so we must stop the traversal here.
+				 */
+				if (!TransactionIdGetCommitTsData(xid, &time_travel_horizon, NULL))
+					goto EndScan;
+				time_travel_xmin = xid;
+			} while (timestamptz_cmp_internal(time_travel_horizon, new_horizon) > 0);
+		}
+	  EndScan:
+		if (old_xmin != time_travel_xmin)
+		{
+			/* Horizon moved */
+			LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+			/* Recheck under lock that xmin is advanced */
+			if (TransactionIdPrecedes(procArray->time_travel_xmin, time_travel_xmin))
+			{
+				procArray->time_travel_xmin = time_travel_xmin;
+				procArray->time_travel_horizon = time_travel_horizon;
+			}
+			LWLockRelease(ProcArrayLock);
+		}
+		/* Move oldest xmin in the past if it is required for time travel */
+		if (TransactionIdPrecedes(time_travel_xmin, oldestXmin))
+			oldestXmin = time_travel_xmin;
+	}
+
+	if (!TransactionIdIsNormal(oldestXmin))
+		oldestXmin = FirstNormalTransactionId;
+
+	return oldestXmin;
+}
 
 /*
  * GetOldestXmin -- returns oldest transaction that was running
@@ -1321,6 +1406,8 @@ GetOldestXmin(Relation rel, int flags)
 
 	volatile TransactionId replication_slot_xmin = InvalidTransactionId;
 	volatile TransactionId replication_slot_catalog_xmin = InvalidTransactionId;
+	volatile TransactionId time_travel_xmin;
+	TimestampTz time_travel_horizon;
 
 	/*
 	 * If we're not computing a relation specific limit, or if a shared
@@ -1383,6 +1470,9 @@ GetOldestXmin(Relation rel, int flags)
 	replication_slot_xmin = procArray->replication_slot_xmin;
 	replication_slot_catalog_xmin = procArray->replication_slot_catalog_xmin;
 
+	time_travel_xmin = procArray->time_travel_xmin;
+	time_travel_horizon = procArray->time_travel_horizon;
+
 	if (RecoveryInProgress())
 	{
 		/*
@@ -1423,6 +1513,9 @@ GetOldestXmin(Relation rel, int flags)
 			result = FirstNormalTransactionId;
 	}
 
+	if (time_travel_period != 0)
+		result = GetTimeTravelXmin(result, time_travel_xmin, time_travel_horizon);
+
 	/*
 	 * Check whether there are replication slots requiring an older xmin.
 	 */
@@ -1518,6 +1612,8 @@ GetSnapshotData(Snapshot snapshot)
 	bool		suboverflowed = false;
 	volatile TransactionId replication_slot_xmin = InvalidTransactionId;
 	volatile TransactionId replication_slot_catalog_xmin = InvalidTransactionId;
+	volatile TransactionId time_travel_xmin;
+	TimestampTz time_travel_horizon;
 
 	Assert(snapshot != NULL);
 
@@ -1707,6 +1803,9 @@ GetSnapshotData(Snapshot snapshot)
 	replication_slot_xmin = procArray->replication_slot_xmin;
 	replication_slot_catalog_xmin = procArray->replication_slot_catalog_xmin;
 
+	time_travel_xmin = procArray->time_travel_xmin;
+	time_travel_horizon = procArray->time_travel_horizon;
+
 	if (!TransactionIdIsValid(MyPgXact->xmin))
 		MyPgXact->xmin = TransactionXmin = xmin;
 
@@ -1730,6 +1829,9 @@ GetSnapshotData(Snapshot snapshot)
 		NormalTransactionIdPrecedes(replication_slot_xmin, RecentGlobalXmin))
 		RecentGlobalXmin = replication_slot_xmin;
 
+	if (time_travel_period != 0)
+		RecentGlobalXmin = GetTimeTravelXmin(RecentGlobalXmin, time_travel_xmin, time_travel_horizon);
+
 	/* Non-catalog tables can be vacuumed if older than this xid */
 	RecentGlobalDataXmin = RecentGlobalXmin;
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e32901d..a155f51 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -191,6 +191,7 @@ static void assign_application_name(const char *newval, void *extra);
 static bool check_cluster_name(char **newval, void **extra, GucSource source);
 static const char *show_unix_socket_permissions(void);
 static const char *show_log_file_mode(void);
+static void assign_time_travel_period_hook(int newval, void *extra);
 
 /* Private functions in guc-file.l that need to be called from guc.c */
 static ConfigVariable *ProcessConfigFileInternal(GucContext context,
@@ -1702,7 +1703,15 @@ static struct config_bool ConfigureNamesBool[] =
 		true,
 		NULL, NULL, NULL
 	},
-
+	{
+		{"check_asof_timestamp", PGC_USERSET, AUTOVACUUM,
+			gettext_noop("Controls whether the AS OF timestamp specified in a query is checked to lie within the time travel period."),
+			gettext_noop("There is no guarantee that versions outside the time travel period have not been reclaimed. But Postgres performs cleanup lazily, so a version outside the time travel interval is still likely to be alive. This check also adds some runtime overhead, because it needs to obtain the current system time.")
+		},
+		&check_asof_timestamp,
+		false,
+		NULL, NULL, NULL
+	},
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, false, NULL, NULL, NULL
@@ -1713,6 +1722,15 @@ static struct config_bool ConfigureNamesBool[] =
 static struct config_int ConfigureNamesInt[] =
 {
 	{
+		{"time_travel_period", PGC_SIGHUP, AUTOVACUUM,
+			gettext_noop("Sets the time travel period in seconds (0 disables, -1 means infinite)."),
+			NULL
+		},
+		&time_travel_period,
+		0, -1, MaxTimeTravelPeriod,
+		NULL, assign_time_travel_period_hook, NULL
+	},
+	{
 		{"archive_timeout", PGC_SIGHUP, WAL_ARCHIVING,
 			gettext_noop("Forces a switch to the next WAL file if a "
 						 "new file has not been started within N seconds."),
@@ -10530,4 +10548,21 @@ show_log_file_mode(void)
 	return buf;
 }
 
+static void assign_time_travel_period_hook(int newval, void *extra)
+{
+	if (newval != 0)
+	{
+		track_commit_timestamp = true;
+		if (newval < 0)
+		{
+			autovacuum_start_daemon = false;
+			/* Do we actually need to adjust freeze horizon? 
+			vacuum_freeze_min_age = MaxTimeTravelPeriod;
+			autovacuum_freeze_max_age = MaxTimeTravelPeriod*2;
+			autovacuum_multixact_freeze_max_age = MaxTimeTravelPeriod*2;
+			*/
+		}
+	}
+}
+
 #include "guc-file.c"
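Taken together, the GUC changes above suggest a configuration along these lines to enable the feature (track_commit_timestamp is an existing core GUC, which the assign hook also forces on; the other two settings are introduced by this patch):

```
track_commit_timestamp = on   # required to map XIDs to commit times
time_travel_period = 3600     # keep one hour of history; 0 disables, -1 infinite
check_asof_timestamp = on     # reject AS OF timestamps outside the period
```

Note that forcing track_commit_timestamp and autovacuum settings from an assign hook is a prototype shortcut; normally such dependencies would be validated at startup instead.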
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 0b03290..fd66b83 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -73,6 +73,9 @@
  * GUC parameters
  */
 int			old_snapshot_threshold; /* number of minutes, -1 disables */
+int         time_travel_period;     /* number of seconds, 0 disables, -1 infinite */
+bool        check_asof_timestamp;   /* raise an error when the AS OF timestamp falls outside time_travel_period */
+
 
 /*
  * Structure for dealing with old_snapshot_threshold implementation.
@@ -244,6 +247,7 @@ typedef struct SerializedSnapshotData
 	bool		takenDuringRecovery;
 	CommandId	curcid;
 	TimestampTz whenTaken;
+	TimestampTz asofTimestamp;
 	XLogRecPtr	lsn;
 } SerializedSnapshotData;
 
@@ -2080,6 +2084,7 @@ SerializeSnapshot(Snapshot snapshot, char *start_address)
 	serialized_snapshot.takenDuringRecovery = snapshot->takenDuringRecovery;
 	serialized_snapshot.curcid = snapshot->curcid;
 	serialized_snapshot.whenTaken = snapshot->whenTaken;
+	serialized_snapshot.asofTimestamp = snapshot->asofTimestamp;
 	serialized_snapshot.lsn = snapshot->lsn;
 
 	/*
@@ -2154,6 +2159,7 @@ RestoreSnapshot(char *start_address)
 	snapshot->takenDuringRecovery = serialized_snapshot.takenDuringRecovery;
 	snapshot->curcid = serialized_snapshot.curcid;
 	snapshot->whenTaken = serialized_snapshot.whenTaken;
+	snapshot->asofTimestamp = serialized_snapshot.asofTimestamp;
 	snapshot->lsn = serialized_snapshot.lsn;
 
 	/* Copy XIDs, if present. */
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 2b218e0..09e067f 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -69,6 +69,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "access/xlog.h"
+#include "access/commit_ts.h"
 #include "storage/bufmgr.h"
 #include "storage/procarray.h"
 #include "utils/builtins.h"
@@ -1476,6 +1477,16 @@ XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
 {
 	uint32		i;
 
+	if (snapshot->asofTimestamp != 0)
+	{
+		TimestampTz ts;
+
+		if (TransactionIdGetCommitTsData(xid, &ts, NULL))
+		{
+			return timestamptz_cmp_internal(snapshot->asofTimestamp, ts) < 0;
+		}
+	}
+
 	/*
 	 * Make a quick range check to eliminate most XIDs without looking at the
 	 * xip arrays.  Note that this is OK even if we convert a subxact XID to
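The XidInMVCCSnapshot hook above is the heart of the patch: when the snapshot carries an AS OF timestamp and the XID has a recorded commit timestamp, the transaction is treated as still-in-progress (its effects invisible) exactly when it committed after the AS OF point. A conceptual sketch (names are mine; the dict stands in for TransactionIdGetCommitTsData):

```python
# Returns True when the xid should be considered "in progress" for the
# AS OF snapshot, i.e. its tuples are not yet visible at that timestamp.

def xid_invisible_as_of(xid, asof_ts, commit_ts):
    """asof_ts is None when the snapshot has no AS OF timestamp."""
    ts = commit_ts.get(xid)
    if asof_ts is not None and ts is not None:
        return asof_ts < ts   # committed after AS OF: not yet visible
    return False              # fall through to the regular snapshot checks

commit_ts = {100: 50.0, 101: 60.0}
print(xid_invisible_as_of(100, 55.0, commit_ts))  # committed before AS OF
print(xid_invisible_as_of(101, 55.0, commit_ts))  # committed after AS OF
```

One consequence worth noting: a found commit timestamp short-circuits the ordinary xip-array check entirely, so the AS OF comparison overrides the snapshot's normal xmin/xmax bounds for any XID that commit_ts still remembers.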
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 86076de..e46f4c6 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -33,6 +33,7 @@
 #define FrozenTransactionId			((TransactionId) 2)
 #define FirstNormalTransactionId	((TransactionId) 3)
 #define MaxTransactionId			((TransactionId) 0xFFFFFFFF)
+#define MaxTimeTravelPeriod         ((TransactionId) 0x3FFFFFFF)
 
 /* ----------------
  *		transaction ID manipulation macros
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 2cc74da..13ebd7c 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -553,4 +553,6 @@ extern void CheckCmdReplicaIdentity(Relation rel, CmdType cmd);
 extern void CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname);
 
+extern void ExecAsofTimestamp(EState* estate, ScanState* ss);
+
 #endif							/* EXECUTOR_H  */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index c9a5279..ed923ab 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1120,6 +1120,9 @@ typedef struct ScanState
 	Relation	ss_currentRelation;
 	HeapScanDesc ss_currentScanDesc;
 	TupleTableSlot *ss_ScanTupleSlot;
+	ExprState  *asofExpr;	      /* AS OF expression */
+	bool        asofTimestampSet; /* AS OF timestamp has been evaluated */
+	TimestampTz asofTimestamp;    /* AS OF timestamp or 0 if not set */
 } ScanState;
 
 /* ----------------
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2eaa6b2..b78c8e2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1062,6 +1062,7 @@ typedef struct RangeTblEntry
 	Bitmapset  *insertedCols;	/* columns needing INSERT permission */
 	Bitmapset  *updatedCols;	/* columns needing UPDATE permission */
 	List	   *securityQuals;	/* security barrier quals to apply, if any */
+	Node       *asofTimestamp;  /* AS OF timestamp */
 } RangeTblEntry;
 
 /*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index d763da6..083dc90 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -327,7 +327,8 @@ typedef struct BitmapOr
 typedef struct Scan
 {
 	Plan		plan;
-	Index		scanrelid;		/* relid is index into the range table */
+	Index		scanrelid;	   /* relid is index into the range table */
+	Node       *asofTimestamp; /* AS OF timestamp */
 } Scan;
 
 /* ----------------
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 074ae0a..11e1a0c 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -70,6 +70,7 @@ typedef struct RangeVar
 								 * on children? */
 	char		relpersistence; /* see RELPERSISTENCE_* in pg_class.h */
 	Alias	   *alias;			/* table alias & optional column aliases */
+	Node       *asofTimestamp;  /* expression with AS OF timestamp */
 	int			location;		/* token location, or -1 if unknown */
 } RangeVar;
 
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 565bb3d..b1efb5c 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -68,7 +68,8 @@ typedef enum ParseExprKind
 	EXPR_KIND_TRIGGER_WHEN,		/* WHEN condition in CREATE TRIGGER */
 	EXPR_KIND_POLICY,			/* USING or WITH CHECK expr in policy */
 	EXPR_KIND_PARTITION_EXPRESSION,	/* PARTITION BY expression */
-	EXPR_KIND_CALL				/* CALL argument */
+	EXPR_KIND_CALL,				/* CALL argument */
+	EXPR_KIND_ASOF				/* AS OF timestamp */
 } ParseExprKind;
 
 
diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h
index 8585194..bda3a85 100644
--- a/src/include/utils/snapmgr.h
+++ b/src/include/utils/snapmgr.h
@@ -47,7 +47,8 @@
 
 /* GUC variables */
 extern PGDLLIMPORT int old_snapshot_threshold;
-
+extern PGDLLIMPORT int time_travel_period;
+extern PGDLLIMPORT bool check_asof_timestamp;
 
 extern Size SnapMgrShmemSize(void);
 extern void SnapMgrInit(void);
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index bf51977..a00f0d9 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -111,6 +111,7 @@ typedef struct SnapshotData
 	pairingheap_node ph_node;	/* link in the RegisteredSnapshots heap */
 
 	TimestampTz whenTaken;		/* timestamp when snapshot was taken */
+	TimestampTz asofTimestamp;	/* select AS OF timestamp */
 	XLogRecPtr	lsn;			/* position in the WAL stream when taken */
 } SnapshotData;
 
diff --git a/src/test/regress/asof_schedule b/src/test/regress/asof_schedule
new file mode 100644
index 0000000..9e77b91
--- /dev/null
+++ b/src/test/regress/asof_schedule
@@ -0,0 +1,2 @@
+# src/test/regress/asof_schedule
+test: asof
diff --git a/src/test/regress/expected/asof.out b/src/test/regress/expected/asof.out
new file mode 100644
index 0000000..c2c46ac
--- /dev/null
+++ b/src/test/regress/expected/asof.out
@@ -0,0 +1,185 @@
+-- This test requires postgres to be configured with track_commit_timestamp = on
+-- Please run it using make check EXTRA_REGRESS_OPTS="--schedule=asof_schedule --temp-config=postgresql.asof.config"
+alter system set time_travel_period = 10;
+select pg_reload_conf();
+ pg_reload_conf 
+----------------
+ t
+(1 row)
+
+create table foo(pk int primary key, val int);
+insert into foo values (1,10);
+insert into foo values (2,20);
+insert into foo values (3,30);
+select * from foo;
+ pk | val 
+----+-----
+  1 |  10
+  2 |  20
+  3 |  30
+(3 rows)
+
+select pg_sleep(1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+update foo set val=val+1 where pk=1;
+select pg_sleep(1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+update foo set val=val+1 where pk=2;
+select pg_sleep(1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+update foo set val=val+1 where pk=3;
+select pg_sleep(1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+update foo set val=val+1 where pk=1;
+select pg_sleep(1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+update foo set val=val+1 where pk=2;
+select pg_sleep(1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+update foo set val=val+1 where pk=3;
+select * from foo as of now() - interval '1 second';
+ pk | val 
+----+-----
+  3 |  31
+  1 |  12
+  2 |  22
+(3 rows)
+
+select * from foo as of now() - interval '1 second' where pk=3;
+ pk | val 
+----+-----
+  3 |  31
+(1 row)
+
+select new_foo.val - old_foo.val from foo as old_foo as of now() - interval '1 second' join foo as new_foo on old_foo.pk=new_foo.pk where old_foo.pk=3;
+ ?column? 
+----------
+        1
+(1 row)
+
+select * from foo as of now() - interval '2 seconds';
+ pk | val 
+----+-----
+  2 |  21
+  3 |  31
+  1 |  12
+(3 rows)
+
+select * from foo as of now() - interval '2 seconds' where pk=2;
+ pk | val 
+----+-----
+  2 |  21
+(1 row)
+
+select * from foo as of now() - interval '3 seconds';
+ pk | val 
+----+-----
+  1 |  11
+  2 |  21
+  3 |  31
+(3 rows)
+
+select * from foo as of now() - interval '3 seconds' where pk=1;
+ pk | val 
+----+-----
+  1 |  11
+(1 row)
+
+select * from foo as of now() - interval '4 seconds';
+ pk | val 
+----+-----
+  3 |  30
+  1 |  11
+  2 |  21
+(3 rows)
+
+select * from foo as of now() - interval '4 seconds' where pk=3;
+ pk | val 
+----+-----
+  3 |  30
+(1 row)
+
+select * from foo as of now() - interval '5 seconds';
+ pk | val 
+----+-----
+  2 |  20
+  3 |  30
+  1 |  11
+(3 rows)
+
+select * from foo as of now() - interval '5 seconds' where pk=2;
+ pk | val 
+----+-----
+  2 |  20
+(1 row)
+
+select * from foo as of now() - interval '6 seconds';
+ pk | val 
+----+-----
+  1 |  10
+  2 |  20
+  3 |  30
+(3 rows)
+
+select * from foo as of now() - interval '6 seconds' where pk=1;
+ pk | val 
+----+-----
+  1 |  10
+(1 row)
+
+vacuum foo;
+select * from foo as of now() - interval '6 seconds';
+ pk | val 
+----+-----
+  1 |  10
+  2 |  20
+  3 |  30
+(3 rows)
+
+select pg_sleep(10);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+vacuum foo;
+select * from foo as of now() - interval '10 seconds';
+ pk | val 
+----+-----
+  1 |  12
+  2 |  22
+  3 |  32
+(3 rows)
+
+alter system set time_travel_period = 0;
+select pg_reload_conf();
+ pg_reload_conf 
+----------------
+ t
+(1 row)
+
+drop table foo;
diff --git a/src/test/regress/sql/asof.sql b/src/test/regress/sql/asof.sql
new file mode 100644
index 0000000..baa134c
--- /dev/null
+++ b/src/test/regress/sql/asof.sql
@@ -0,0 +1,43 @@
+-- This test requires postgres to be configured with track_commit_timestamp = on
+-- Please run it using make check EXTRA_REGRESS_OPTS="--schedule=asof_schedule --temp-config=postgresql.asof.config"
+alter system set time_travel_period = 10;
+select pg_reload_conf();
+create table foo(pk int primary key, val int);
+insert into foo values (1,10);
+insert into foo values (2,20);
+insert into foo values (3,30);
+select * from foo;
+select pg_sleep(1);
+update foo set val=val+1 where pk=1;
+select pg_sleep(1);
+update foo set val=val+1 where pk=2;
+select pg_sleep(1);
+update foo set val=val+1 where pk=3;
+select pg_sleep(1);
+update foo set val=val+1 where pk=1;
+select pg_sleep(1);
+update foo set val=val+1 where pk=2;
+select pg_sleep(1);
+update foo set val=val+1 where pk=3;
+select * from foo as of now() - interval '1 second';
+select * from foo as of now() - interval '1 second' where pk=3;
+select new_foo.val - old_foo.val from foo as old_foo as of now() - interval '1 second' join foo as new_foo on old_foo.pk=new_foo.pk where old_foo.pk=3;
+select * from foo as of now() - interval '2 seconds';
+select * from foo as of now() - interval '2 seconds' where pk=2;
+select * from foo as of now() - interval '3 seconds';
+select * from foo as of now() - interval '3 seconds' where pk=1;
+select * from foo as of now() - interval '4 seconds';
+select * from foo as of now() - interval '4 seconds' where pk=3;
+select * from foo as of now() - interval '5 seconds';
+select * from foo as of now() - interval '5 seconds' where pk=2;
+select * from foo as of now() - interval '6 seconds';
+select * from foo as of now() - interval '6 seconds' where pk=1;
+vacuum foo;
+select * from foo as of now() - interval '6 seconds';
+select pg_sleep(10);
+vacuum foo;
+select * from foo as of now() - interval '10 seconds';
+
+alter system set time_travel_period = 0;
+select pg_reload_conf();
+drop table foo;
#42legrand legrand
legrand_legrand@hotmail.com
In reply to: Konstantin Knizhnik (#41)
Re: AS OF queries

Sorry, truncate is not compatible with AS OF. It is performed at the file
level and deletes old versions.
So if you want to use time travel, you should not use truncate.

As time travel doesn't support truncate, I would prefer that to be checked.
If no check is performed, AS OF queries (with a timestamp before the truncate)
would return no data even when there was some: this could be considered a
wrong result.

If a truncate is detected, an error should be raised, saying data is no longer
available before the truncate timestamp.

Does it mean that no explicit check is needed that table metadata was
not changed after the specified timeslice?

Not sure, it would depend on the type of metadata modification...
Adding/dropping columns seems to work,
but what about altering a column type or dropping/recreating a table?

Regards
PAscal

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

#43Bruce Momjian
bruce@momjian.us
In reply to: konstantin knizhnik (#18)
Re: AS OF queries

On Sat, Dec 23, 2017 at 11:53:19PM +0300, konstantin knizhnik wrote:

On Dec 23, 2017, at 2:08 AM, Greg Stark wrote:

On 20 December 2017 at 12:45, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:

It seems to me that it will be not so difficult to implement them in
Postgres - we already have versions of tuples.
Looks like we only need to do three things:
1. Disable autovacuum (autovacuum = off)

"The Wheel of Time turns, and Ages come and pass, leaving memories
that become legend. Legend fades to myth, and even myth is long
forgotten when the Age that gave it birth comes again"

I think you'll find it a lot harder to get this to work than just
disabling autovacuum. Notably HOT updates can get cleaned up (and even
non-HOT updates can now leave tombstone dead line pointers iirc) even
if vacuum hasn't run.

Yeah, I suspected that just disabling autovacuum was not enough.
I heard (but do not know too much) about microvacuum and hot updates.
This is why I was a little bit surprised when my test didn't show loss of updated versions.
Maybe it is because of vacuum_defer_cleanup_age.

Well vacuum and single-page pruning do 3 things:

1. remove expired updated rows
2. remove deleted rows
3. remove rows from aborted transactions

While time travel doesn't want #1 and #2, it probably wants #3.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
#44Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Bruce Momjian (#43)
Re: AS OF queries

On 26.01.2018 03:55, Bruce Momjian wrote:

On Sat, Dec 23, 2017 at 11:53:19PM +0300, konstantin knizhnik wrote:

On Dec 23, 2017, at 2:08 AM, Greg Stark wrote:

On 20 December 2017 at 12:45, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:

It seems to me that it will be not so difficult to implement them in
Postgres - we already have versions of tuples.
Looks like we only need to do three things:
1. Disable autovacuum (autovacuum = off)

"The Wheel of Time turns, and Ages come and pass, leaving memories
that become legend. Legend fades to myth, and even myth is long
forgotten when the Age that gave it birth comes again"

I think you'll find it a lot harder to get this to work than just
disabling autovacuum. Notably HOT updates can get cleaned up (and even
non-HOT updates can now leave tombstone dead line pointers iirc) even
if vacuum hasn't run.

Yeah, I suspected that just disabling autovacuum was not enough.
I heard (but do not know too much) about microvacuum and hot updates.
This is why I was a little bit surprised when my test didn't show loss of updated versions.
Maybe it is because of vacuum_defer_cleanup_age.

Well vacuum and single-page pruning do 3 things:

1. remove expired updated rows
2. remove deleted row
3. remove rows from aborted transactions

While time travel doesn't want #1 and #2, it probably wants #3.

Rows of aborted transactions are in any case excluded by visibility checks.
Certainly skipping them costs some time, so a large percentage of aborted
transactions may affect query speed.
But query speed is reduced in any case if, in order to support time
travel, we prohibit or postpone vacuum.

What is the expected ratio of committed to aborted transactions? I
expect it to be much bigger than one (especially if we take into
account only read-write transactions which have actually updated the
database). In that case the number of versions created by aborted
transactions should be much smaller than the number of versions created
by updates/deletes of successful transactions, so they should not have a
significant impact on performance.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#45Bruce Momjian
bruce@momjian.us
In reply to: Konstantin Knizhnik (#44)
Re: AS OF queries

On Fri, Jan 26, 2018 at 10:56:06AM +0300, Konstantin Knizhnik wrote:

Yeah, I suspected that just disabling autovacuum was not enough.
I heard (but do not know too much) about microvacuum and hot updates.
This is why I was a little bit surprised when my test didn't show loss of updated versions.
Maybe it is because of vacuum_defer_cleanup_age.

Well vacuum and single-page pruning do 3 things:

1. remove expired updated rows
2. remove deleted rows
3. remove rows from aborted transactions

While time travel doesn't want #1 and #2, it probably wants #3.

Rows of aborted transactions are in any case excluded by visibility checks.
Certainly skipping them costs some time, so a large percentage of aborted
transactions may affect query speed.
But query speed is reduced in any case if, in order to support time travel, we
prohibit or postpone vacuum.

What is the expected ratio of committed to aborted transactions? I
expect it to be much bigger than one (especially if we take into account
only read-write transactions which have actually updated the database).
In that case the number of versions created by aborted transactions should be
much smaller than the number of versions created by updates/deletes of
successful transactions, so they should not have a significant impact on
performance.

Uh, I think the big question is whether we are ready to agree that a
time-travel database will _never_ have aborted rows removed. The
aborted rows are clearly useless for time travel, so the question is
whether we ever want to remove them. I would think at some point we do.

Also, I am not sure we have any statistics on how many aborted rows are
in each table.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +