sorting table columns
Hi,
I've been trying to implement the holy grail of decoupling
logical/physical column sort order representation, i.e., the feature
that lets the server have one physical order, for storage compactness,
and a different "output" order that can be tweaked by the user. This
has been discussed many times; most recently, I believe, here:
http://archives.postgresql.org/pgsql-hackers/2007-02/msg01235.php
with implementation details here:
http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php
The idea described there by Tom, and upon which I formed a vague
implementation plan in my head, is that I was to look for all uses of
an "attnum", and then replace it by either "attlognum" (i.e. the
user-visible sort identifier) or "attphysnum" (i.e. the order of
attributes as stored on disk). This turned out to be far from the
truth; the way things really work is that tupledescs are constructed
from catalogs and then converted into target lists, and those are
turned back into tupledescs in some places or into tupleslots in others.
So the real implementation is about making sure that we read the column
order ids, and preserve them appropriately while the query travels
through parser, rewriter, planner, executor, and to client. To this
end, I added members to nodes Var and TargetEntry; this lets me carry
the catalog data down. This isn't particularly complex, though it was
quite a challenge figuring out exactly the changes that made sense.
Soon thereafter I noticed that the column sort order needs to be
preserved in RangeTblEntry too, so I added a list of ints there to map
from logical column number to canonical attnum.
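To make the logical-to-canonical mapping concrete, here is a minimal standalone sketch. It is not taken from the patch: DemoAttr and the demo_* functions are hypothetical stand-ins for pg_attribute rows and for the list of ints kept in the RangeTblEntry, and it uses plain malloc/qsort rather than backend facilities.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pg_attribute row; only the fields needed here. */
typedef struct DemoAttr
{
    const char *name;
    int         attnum;     /* canonical (catalog) number, 1-based */
    int         attlognum;  /* user-visible position, 1-based */
} DemoAttr;

/* qsort callback: order attribute pointers by logical position. */
static int
demo_cmp_lognum(const void *a, const void *b)
{
    const DemoAttr *x = *(const DemoAttr *const *) a;
    const DemoAttr *y = *(const DemoAttr *const *) b;

    return (x->attlognum > y->attlognum) - (x->attlognum < y->attlognum);
}

/*
 * Fill map[] so that map[lognum - 1] is the attnum of the column shown at
 * logical position lognum.  attrs[] is in canonical order and is not
 * modified.  Returns 0 on success, -1 on allocation failure.
 */
static int
demo_build_lognum_map(const DemoAttr *attrs, int natts, int *map)
{
    const DemoAttr **sorted;
    int         i;

    assert(natts > 0);
    sorted = malloc(sizeof(*sorted) * natts);
    if (sorted == NULL)
        return -1;
    for (i = 0; i < natts; i++)
        sorted[i] = &attrs[i];
    qsort(sorted, natts, sizeof(*sorted), demo_cmp_lognum);
    for (i = 0; i < natts; i++)
        map[i] = sorted[i]->attnum;
    free(sorted);
    return 0;
}

/* Translate a logical column position into the canonical attnum. */
static int
demo_attnum_by_lognum(const int *map, int natts, int lognum)
{
    assert(lognum >= 1 && lognum <= natts);
    return map[lognum - 1];
}
```

With three columns a, b, c whose attlognum values are 3, 1, 2, the map comes out as {2, 3, 1}: the first column the user sees is b (attnum 2).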
So far so good. I made a few simple cases work: "select * from foo"
works correctly, of course, as do joins using ON, USING and NATURAL.
See attached patch (very much a WIP). (Note that for business reasons
there is no SQL syntax to fool around with logical column numbers; what
I do to test this is create a table and then UPDATE the
pg_attribute.attlognum entries to create a different order. Also I
haven't gotten into the business of handling a different physical
order.)
My next test case was a SQL function. There, things crashed and burned
immediately and it took me some time to realize that the reason for this
is the DestReceiver stuff: the patch I wrote to handle the basic cases
simply sorts attrs in logical order to pass to the receiveSlot function
(printtup in those basic cases), but this obviously affects how the
tuple is passed to other DestReceivers too. So the function DR is
getting the attributes in logical order, and then trying to stuff them
into a tuplestore as a minimaltuple. But the underlying code tries to
compute datum lengths using the TupleDesc and it doesn't use logical
order, but just canonical (catalog) order, which doesn't match the data
values. So it crashes.
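The mismatch can be sketched in isolation. The toy code below is an assumption-laden illustration, not backend code: attlen[] plays the role of the TupleDesc (a positive value is a fixed length, -1 stands in for a variable-length value, here simply a NUL-terminated string), and the demo_* names are hypothetical. The length computation walks the descriptor in canonical order, so values that arrive in logical order have to be put back into canonical order first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * lognum_to_attnum[i] holds the 1-based canonical attnum of the column at
 * logical position i + 1.  Copy value pointers that arrive in logical order
 * into canonical order, so that they line up with a descriptor indexed by
 * attnum.
 */
static void
demo_logical_to_canonical(const void *const *logical_values,
                          const int *lognum_to_attnum,
                          int natts,
                          const void **canonical_values)
{
    int         i;

    for (i = 0; i < natts; i++)
    {
        int         attnum = lognum_to_attnum[i];

        assert(attnum >= 1 && attnum <= natts);
        canonical_values[attnum - 1] = logical_values[i];
    }
}

/*
 * Sum the data lengths, walking the descriptor in canonical order.
 * attlen[i] > 0 means a fixed-length value; attlen[i] == -1 means a
 * NUL-terminated string (toy stand-in for a varlena).  values[] must be in
 * the same order as attlen[]; with values in logical order a string column
 * could be read as a fixed-length one or vice versa.
 */
static size_t
demo_data_size(const int *attlen, const void *const *values, int natts)
{
    size_t      total = 0;
    int         i;

    for (i = 0; i < natts; i++)
    {
        if (attlen[i] == -1)
            total += strlen((const char *) values[i]) + 1;
        else
            total += (size_t) attlen[i];
    }
    return total;
}
```

Calling demo_data_size directly on the logically ordered array is the failure mode described above: the descriptor says "4-byte integer" for a slot that actually holds a string pointer, or the other way around.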
So at this point I'm at a crossroads. One idea was to avoid sending
tuples in logical order unless the DR explicitly requests it. So
the printtup DR would set a flag so that ExecutePlan would send tuples
in logical order; other DRs would not set this flag, and executor would
behave normally. What's not clear to me is whether this is feasible at
all, because the order in which attrs are sent out is defined pretty
early in the parser stages, so maybe we don't know enough about the DR yet.
Another idea was to modify the rest of the DRs so that they are aware
that the tuples they are being passed are in logical order.
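As a rough sketch of the first idea (a per-receiver flag), assuming the ordering decision could be deferred until the tuple is handed to the receiver, which is exactly the open question above: DemoReceiver, wants_logical_order and demo_send_tuple are hypothetical names, and tuples are reduced to int arrays.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_ATTS 16

typedef struct DemoReceiver DemoReceiver;
struct DemoReceiver
{
    bool        wants_logical_order;    /* hypothetical flag set by the DR */
    void        (*receive) (DemoReceiver *self, const int *values, int natts);
    int         seen[DEMO_MAX_ATTS];    /* last tuple received, for inspection */
};

/* Toy receive callback: remember the values in the order they arrived. */
static void
demo_store_values(DemoReceiver *self, const int *values, int natts)
{
    int         i;

    for (i = 0; i < natts; i++)
        self->seen[i] = values[i];
}

/*
 * Hand one tuple to a receiver.  canonical_values[] is in attnum order;
 * lognum_to_attnum[i] is the 1-based attnum shown at logical position i + 1.
 * Only receivers that set the flag get the reordered tuple.
 */
static void
demo_send_tuple(DemoReceiver *dr, const int *canonical_values,
                const int *lognum_to_attnum, int natts)
{
    int         reordered[DEMO_MAX_ATTS];
    int         i;

    assert(natts >= 0 && natts <= DEMO_MAX_ATTS);

    if (!dr->wants_logical_order)
    {
        dr->receive(dr, canonical_values, natts);
        return;
    }

    for (i = 0; i < natts; i++)
        reordered[i] = canonical_values[lognum_to_attnum[i] - 1];
    dr->receive(dr, reordered, natts);
}
```

A printtup-like receiver would set the flag and see the values in user-visible order; a tuplestore-like receiver would leave it unset and keep getting canonical order, matching its descriptor.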
Maybe this is all wrong and I need to take a completely different
approach. In particular, if I'm completely on the wrong track about
this, I want to know as soon as possible!
Ideas? Opinions?
--
Álvaro Herrera <alvherre@alvh.no-ip.org>
Attachments:
alter-column-order.patch (application/octet-stream)
*** a/src/backend/access/common/tupdesc.c
--- b/src/backend/access/common/tupdesc.c
***************
*** 22,27 ****
--- 22,28 ----
#include "catalog/pg_type.h"
#include "parser/parse_type.h"
#include "utils/builtins.h"
+ #include "utils/memutils.h"
#include "utils/resowner.h"
#include "utils/syscache.h"
***************
*** 84,89 **** CreateTemplateTupleDesc(int natts, bool hasoid)
--- 85,91 ----
* Initialize other fields of the tupdesc.
*/
desc->natts = natts;
+ desc->logattrs = NULL;
desc->constr = NULL;
desc->tdtypeid = RECORDOID;
desc->tdtypmod = -1;
***************
*** 117,122 **** CreateTupleDesc(int natts, bool hasoid, Form_pg_attribute *attrs)
--- 119,125 ----
desc = (TupleDesc) palloc(sizeof(struct tupleDesc));
desc->attrs = attrs;
desc->natts = natts;
+ desc->logattrs = NULL;
desc->constr = NULL;
desc->tdtypeid = RECORDOID;
desc->tdtypmod = -1;
***************
*** 151,156 **** CreateTupleDescCopy(TupleDesc tupdesc)
--- 154,161 ----
desc->tdtypeid = tupdesc->tdtypeid;
desc->tdtypmod = tupdesc->tdtypmod;
+ Assert(desc->logattrs == NULL);
+
return desc;
}
***************
*** 256,261 **** FreeTupleDesc(TupleDesc tupdesc)
--- 261,269 ----
pfree(tupdesc->constr);
}
+ if (tupdesc->logattrs)
+ pfree(tupdesc->logattrs);
+
pfree(tupdesc);
}
***************
*** 300,306 **** DecrTupleDescRefCount(TupleDesc tupdesc)
* Note: we deliberately do not check the attrelid and tdtypmod fields.
* This allows typcache.c to use this routine to see if a cached record type
* matches a requested type, and is harmless for relcache.c's uses.
! * We don't compare tdrefcount, either.
*/
bool
equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
--- 308,314 ----
* Note: we deliberately do not check the attrelid and tdtypmod fields.
* This allows typcache.c to use this routine to see if a cached record type
* matches a requested type, and is harmless for relcache.c's uses.
! * We don't compare tdrefcount nor logattrs, either.
*/
bool
equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
***************
*** 341,346 **** equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
--- 349,361 ----
return false;
if (attr1->attlen != attr2->attlen)
return false;
+ if (attr1->attphysnum != attr2->attphysnum)
+ return false;
+ /* intentionally do not compare attlognum */
+ #if 0
+ if (attr1->attlognum != attr2->attlognum)
+ return false;
+ #endif
if (attr1->attndims != attr2->attndims)
return false;
if (attr1->atttypmod != attr2->atttypmod)
***************
*** 476,481 **** TupleDescInitEntry(TupleDesc desc,
--- 491,498 ----
att->atttypmod = typmod;
att->attnum = attributeNumber;
+ att->attphysnum = attributeNumber;
+ att->attlognum = attributeNumber;
att->attndims = attdim;
att->attnotnull = false;
***************
*** 521,526 **** TupleDescInitEntryCollation(TupleDesc desc,
--- 538,564 ----
desc->attrs[attributeNumber - 1]->attcollation = collationid;
}
+ /*
+ * TupleDescInitEntryLognum
+ *
+ * Assign a nondefault lognum to a previously initialized tuple descriptor
+ * entry.
+ */
+ void
+ TupleDescInitEntryLognum(TupleDesc desc,
+ AttrNumber attributeNumber,
+ int attlognum)
+ {
+ /*
+ * sanity checks
+ */
+ AssertArg(PointerIsValid(desc));
+ AssertArg(attributeNumber >= 1);
+ AssertArg(attributeNumber <= desc->natts);
+
+ desc->attrs[attributeNumber - 1]->attlognum = attlognum;
+ }
+
/*
* BuildDescForRelation
***************
*** 607,612 **** BuildDescForRelation(List *schema)
--- 645,652 ----
desc->constr = NULL;
}
+ Assert(desc->logattrs == NULL);
+
return desc;
}
***************
*** 667,671 **** BuildDescFromLists(List *names, List *types, List *typmods, List *collations)
--- 707,755 ----
TupleDescInitEntryCollation(desc, attnum, attcollation);
}
+ Assert(desc->logattrs == NULL);
return desc;
}
+
+ /*
+ * qsort callback for TupleDescGetSortedAttrs
+ */
+ static int
+ cmplognum(const void *attr1, const void *attr2)
+ {
+ Form_pg_attribute att1 = *(Form_pg_attribute *) attr1;
+ Form_pg_attribute att2 = *(Form_pg_attribute *) attr2;
+
+ if (att1->attlognum < att2->attlognum)
+ return -1;
+ if (att1->attlognum > att2->attlognum)
+ return 1;
+ return 0;
+ }
+
+ /*
+ * Return the array of attrs sorted by logical position
+ */
+ Form_pg_attribute *
+ TupleDescGetSortedAttrs(TupleDesc desc)
+ {
+ if (desc->logattrs == NULL)
+ {
+ Form_pg_attribute *attrs;
+
+ /*
+ * logattrs must be allocated in the same memcxt as the tupdesc it
+ * belongs to, so that it isn't reset ahead of time.
+ */
+ attrs = MemoryContextAlloc(GetMemoryChunkContext(desc),
+ sizeof(Form_pg_attribute) * desc->natts);
+ memcpy(attrs, desc->attrs,
+ sizeof(Form_pg_attribute) * desc->natts);
+
+ qsort(attrs, desc->natts, sizeof(Form_pg_attribute), cmplognum);
+
+ desc->logattrs = attrs;
+ }
+
+ return desc->logattrs;
+ }
*** a/src/backend/bootstrap/bootstrap.c
--- b/src/backend/bootstrap/bootstrap.c
***************
*** 703,709 **** DefineAttr(char *name, char *type, int attnum)
namestrcpy(&attrtypes[attnum]->attname, name);
elog(DEBUG4, "column %s %s", NameStr(attrtypes[attnum]->attname), type);
! attrtypes[attnum]->attnum = attnum + 1; /* fillatt */
typeoid = gettype(type);
--- 703,711 ----
namestrcpy(&attrtypes[attnum]->attname, name);
elog(DEBUG4, "column %s %s", NameStr(attrtypes[attnum]->attname), type);
! attrtypes[attnum]->attnum = attnum + 1;
! attrtypes[attnum]->attphysnum = attnum + 1;
! attrtypes[attnum]->attlognum = attnum + 1;
typeoid = gettype(type);
*** a/src/backend/catalog/genbki.pl
--- b/src/backend/catalog/genbki.pl
***************
*** 190,195 **** foreach my $catname ( @{ $catalogs->{names} } )
--- 190,197 ----
$attnum++;
my $row = emit_pgattr_row($table_name, $attr, $priornotnull);
$row->{attnum} = $attnum;
+ $row->{attphysnum} = $attnum;
+ $row->{attlognum} = $attnum;
$row->{attstattarget} = '-1';
$priornotnull &= ($row->{attnotnull} eq 't');
***************
*** 225,230 **** foreach my $catname ( @{ $catalogs->{names} } )
--- 227,234 ----
$attnum--;
my $row = emit_pgattr_row($table_name, $attr, 1);
$row->{attnum} = $attnum;
+ $row->{attphysnum} = $attnum;
+ $row->{attlognum} = $attnum;
$row->{attstattarget} = '0';
# some catalogs don't have oids
*** a/src/backend/catalog/heap.c
--- b/src/backend/catalog/heap.c
***************
*** 127,163 **** static List *insert_ordered_unique_oid(List *list, Oid datum);
static FormData_pg_attribute a1 = {
0, {"ctid"}, TIDOID, 0, sizeof(ItemPointerData),
! SelfItemPointerAttributeNumber, 0, -1, -1,
false, 'p', 's', true, false, false, true, 0
};
static FormData_pg_attribute a2 = {
0, {"oid"}, OIDOID, 0, sizeof(Oid),
! ObjectIdAttributeNumber, 0, -1, -1,
true, 'p', 'i', true, false, false, true, 0
};
static FormData_pg_attribute a3 = {
0, {"xmin"}, XIDOID, 0, sizeof(TransactionId),
! MinTransactionIdAttributeNumber, 0, -1, -1,
true, 'p', 'i', true, false, false, true, 0
};
static FormData_pg_attribute a4 = {
0, {"cmin"}, CIDOID, 0, sizeof(CommandId),
! MinCommandIdAttributeNumber, 0, -1, -1,
true, 'p', 'i', true, false, false, true, 0
};
static FormData_pg_attribute a5 = {
0, {"xmax"}, XIDOID, 0, sizeof(TransactionId),
! MaxTransactionIdAttributeNumber, 0, -1, -1,
true, 'p', 'i', true, false, false, true, 0
};
static FormData_pg_attribute a6 = {
0, {"cmax"}, CIDOID, 0, sizeof(CommandId),
! MaxCommandIdAttributeNumber, 0, -1, -1,
true, 'p', 'i', true, false, false, true, 0
};
--- 127,175 ----
static FormData_pg_attribute a1 = {
0, {"ctid"}, TIDOID, 0, sizeof(ItemPointerData),
! SelfItemPointerAttributeNumber, SelfItemPointerAttributeNumber,
! SelfItemPointerAttributeNumber,
! 0, -1, -1,
false, 'p', 's', true, false, false, true, 0
};
static FormData_pg_attribute a2 = {
0, {"oid"}, OIDOID, 0, sizeof(Oid),
! ObjectIdAttributeNumber, ObjectIdAttributeNumber,
! ObjectIdAttributeNumber,
! 0, -1, -1,
true, 'p', 'i', true, false, false, true, 0
};
static FormData_pg_attribute a3 = {
0, {"xmin"}, XIDOID, 0, sizeof(TransactionId),
! MinTransactionIdAttributeNumber, MinTransactionIdAttributeNumber,
! MinTransactionIdAttributeNumber,
! 0, -1, -1,
true, 'p', 'i', true, false, false, true, 0
};
static FormData_pg_attribute a4 = {
0, {"cmin"}, CIDOID, 0, sizeof(CommandId),
! MinCommandIdAttributeNumber, MinCommandIdAttributeNumber,
! MinCommandIdAttributeNumber,
! 0, -1, -1,
true, 'p', 'i', true, false, false, true, 0
};
static FormData_pg_attribute a5 = {
0, {"xmax"}, XIDOID, 0, sizeof(TransactionId),
! MaxTransactionIdAttributeNumber, MaxTransactionIdAttributeNumber,
! MaxTransactionIdAttributeNumber,
! 0, -1, -1,
true, 'p', 'i', true, false, false, true, 0
};
static FormData_pg_attribute a6 = {
0, {"cmax"}, CIDOID, 0, sizeof(CommandId),
! MaxCommandIdAttributeNumber, MaxCommandIdAttributeNumber,
! MaxCommandIdAttributeNumber,
! 0, -1, -1,
true, 'p', 'i', true, false, false, true, 0
};
***************
*** 169,175 **** static FormData_pg_attribute a6 = {
*/
static FormData_pg_attribute a7 = {
0, {"tableoid"}, OIDOID, 0, sizeof(Oid),
! TableOidAttributeNumber, 0, -1, -1,
true, 'p', 'i', true, false, false, true, 0
};
--- 181,189 ----
*/
static FormData_pg_attribute a7 = {
0, {"tableoid"}, OIDOID, 0, sizeof(Oid),
! TableOidAttributeNumber, TableOidAttributeNumber,
! TableOidAttributeNumber,
! 0, -1, -1,
true, 'p', 'i', true, false, false, true, 0
};
***************
*** 598,603 **** InsertPgAttributeTuple(Relation pg_attribute_rel,
--- 612,619 ----
values[Anum_pg_attribute_attstattarget - 1] = Int32GetDatum(new_attribute->attstattarget);
values[Anum_pg_attribute_attlen - 1] = Int16GetDatum(new_attribute->attlen);
values[Anum_pg_attribute_attnum - 1] = Int16GetDatum(new_attribute->attnum);
+ values[Anum_pg_attribute_attphysnum - 1] = Int16GetDatum(new_attribute->attphysnum);
+ values[Anum_pg_attribute_attlognum - 1] = Int16GetDatum(new_attribute->attlognum);
values[Anum_pg_attribute_attndims - 1] = Int32GetDatum(new_attribute->attndims);
values[Anum_pg_attribute_attcacheoff - 1] = Int32GetDatum(new_attribute->attcacheoff);
values[Anum_pg_attribute_atttypmod - 1] = Int32GetDatum(new_attribute->atttypmod);
*** a/src/backend/executor/execQual.c
--- b/src/backend/executor/execQual.c
***************
*** 4638,4644 **** ExecInitExpr(Expr *node, PlanState *parent)
}
/* Set up evaluation, skipping any deleted columns */
Assert(list_length(rowexpr->args) <= rstate->tupdesc->natts);
! attrs = rstate->tupdesc->attrs;
i = 0;
foreach(l, rowexpr->args)
{
--- 4638,4644 ----
}
/* Set up evaluation, skipping any deleted columns */
Assert(list_length(rowexpr->args) <= rstate->tupdesc->natts);
! attrs = TupleDescGetSortedAttrs(rstate->tupdesc);
i = 0;
foreach(l, rowexpr->args)
{
*** a/src/backend/executor/execTuples.c
--- b/src/backend/executor/execTuples.c
***************
*** 945,950 **** ExecTypeFromTLInternal(List *targetList, bool hasoid, bool skipjunk)
--- 945,953 ----
TupleDescInitEntryCollation(typeInfo,
cur_resno,
exprCollation((Node *) tle->expr));
+ TupleDescInitEntryLognum(typeInfo,
+ cur_resno,
+ tle->resoriglogcol);
cur_resno++;
}
*** a/src/backend/executor/functions.c
--- b/src/backend/executor/functions.c
***************
*** 1401,1406 **** check_sql_fn_retval(Oid func_id, Oid rettype, List *queryTreeList,
--- 1401,1407 ----
{
/* Returns a rowtype */
TupleDesc tupdesc;
+ Form_pg_attribute *attrs;
int tupnatts; /* physical number of columns in tuple */
int tuplogcols; /* # of nondeleted columns in tuple */
int colindex; /* physical column index */
***************
*** 1465,1470 **** check_sql_fn_retval(Oid func_id, Oid rettype, List *queryTreeList,
--- 1466,1472 ----
* result columns if the caller asked for that.
*/
tupnatts = tupdesc->natts;
+ attrs = TupleDescGetSortedAttrs(tupdesc);
tuplogcols = 0; /* we'll count nondeleted cols as we go */
colindex = 0;
newtlist = NIL; /* these are only used if modifyTargetList */
***************
*** 1493,1499 **** check_sql_fn_retval(Oid func_id, Oid rettype, List *queryTreeList,
errmsg("return type mismatch in function declared to return %s",
format_type_be(rettype)),
errdetail("Final statement returns too many columns.")));
! attr = tupdesc->attrs[colindex - 1];
if (attr->attisdropped && modifyTargetList)
{
Expr *null_expr;
--- 1495,1501 ----
errmsg("return type mismatch in function declared to return %s",
format_type_be(rettype)),
errdetail("Final statement returns too many columns.")));
! attr = attrs[colindex - 1];
if (attr->attisdropped && modifyTargetList)
{
Expr *null_expr;
***************
*** 1550,1556 **** check_sql_fn_retval(Oid func_id, Oid rettype, List *queryTreeList,
/* remaining columns in tupdesc had better all be dropped */
for (colindex++; colindex <= tupnatts; colindex++)
{
! if (!tupdesc->attrs[colindex - 1]->attisdropped)
ereport(ERROR,
(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
errmsg("return type mismatch in function declared to return %s",
--- 1552,1558 ----
/* remaining columns in tupdesc had better all be dropped */
for (colindex++; colindex <= tupnatts; colindex++)
{
! if (!attrs[colindex - 1]->attisdropped)
ereport(ERROR,
(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
errmsg("return type mismatch in function declared to return %s",
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 1063,1068 **** _copyVar(Var *from)
--- 1063,1069 ----
COPY_SCALAR_FIELD(varno);
COPY_SCALAR_FIELD(varattno);
+ COPY_SCALAR_FIELD(varlogno);
COPY_SCALAR_FIELD(vartype);
COPY_SCALAR_FIELD(vartypmod);
COPY_SCALAR_FIELD(varcollid);
***************
*** 1762,1767 **** _copyTargetEntry(TargetEntry *from)
--- 1763,1769 ----
COPY_SCALAR_FIELD(ressortgroupref);
COPY_SCALAR_FIELD(resorigtbl);
COPY_SCALAR_FIELD(resorigcol);
+ COPY_SCALAR_FIELD(resoriglogcol);
COPY_SCALAR_FIELD(resjunk);
return newnode;
***************
*** 1964,1969 **** _copyRangeTblEntry(RangeTblEntry *from)
--- 1966,1972 ----
COPY_SCALAR_FIELD(rtekind);
COPY_SCALAR_FIELD(relid);
COPY_SCALAR_FIELD(relkind);
+ COPY_NODE_FIELD(lognums);
COPY_NODE_FIELD(subquery);
COPY_SCALAR_FIELD(jointype);
COPY_NODE_FIELD(joinaliasvars);
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 136,141 **** _equalVar(Var *a, Var *b)
--- 136,142 ----
{
COMPARE_SCALAR_FIELD(varno);
COMPARE_SCALAR_FIELD(varattno);
+ /* intentionally do not compare varlogno */
COMPARE_SCALAR_FIELD(vartype);
COMPARE_SCALAR_FIELD(vartypmod);
COMPARE_SCALAR_FIELD(varcollid);
***************
*** 740,745 **** _equalTargetEntry(TargetEntry *a, TargetEntry *b)
--- 741,747 ----
COMPARE_SCALAR_FIELD(ressortgroupref);
COMPARE_SCALAR_FIELD(resorigtbl);
COMPARE_SCALAR_FIELD(resorigcol);
+ /* intentionally do not compare resoriglogcol */
COMPARE_SCALAR_FIELD(resjunk);
return true;
*** a/src/backend/nodes/makefuncs.c
--- b/src/backend/nodes/makefuncs.c
***************
*** 88,94 **** makeVar(Index varno,
var->varnoold = varno;
var->varoattno = varattno;
! /* Likewise, we just set location to "unknown" here */
var->location = -1;
return var;
--- 88,97 ----
var->varnoold = varno;
var->varoattno = varattno;
! /*
! * Likewise, we just set varlogno to Invalid and location to "unknown" here
! */
! var->varlogno = InvalidAttrNumber;
var->location = -1;
return var;
***************
*** 228,233 **** makeTargetEntry(Expr *expr,
--- 231,237 ----
tle->ressortgroupref = 0;
tle->resorigtbl = InvalidOid;
tle->resorigcol = 0;
+ tle->resoriglogcol = 0;
tle->resjunk = resjunk;
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 909,914 **** _outVar(StringInfo str, Var *node)
--- 909,915 ----
WRITE_UINT_FIELD(varno);
WRITE_INT_FIELD(varattno);
+ WRITE_INT_FIELD(varlogno);
WRITE_OID_FIELD(vartype);
WRITE_INT_FIELD(vartypmod);
WRITE_OID_FIELD(varcollid);
***************
*** 1422,1431 **** _outTargetEntry(StringInfo str, TargetEntry *node)
--- 1423,1449 ----
WRITE_UINT_FIELD(ressortgroupref);
WRITE_OID_FIELD(resorigtbl);
WRITE_INT_FIELD(resorigcol);
+ WRITE_INT_FIELD(resoriglogcol);
WRITE_BOOL_FIELD(resjunk);
}
static void
+ _outGenericExprState(StringInfo str, GenericExprState *node)
+ {
+ WRITE_NODE_TYPE("GENERICEXPRSTATE");
+
+ WRITE_NODE_FIELD(arg);
+ }
+
+ static void
+ _outExprState(StringInfo str, ExprState *node)
+ {
+ WRITE_NODE_TYPE("EXPRSTATE");
+
+ WRITE_NODE_FIELD(expr);
+ }
+
+ static void
_outRangeTblRef(StringInfo str, RangeTblRef *node)
{
WRITE_NODE_TYPE("RANGETBLREF");
***************
*** 2318,2323 **** _outRangeTblEntry(StringInfo str, RangeTblEntry *node)
--- 2336,2342 ----
case RTE_RELATION:
WRITE_OID_FIELD(relid);
WRITE_CHAR_FIELD(relkind);
+ WRITE_NODE_FIELD(lognums);
break;
case RTE_SUBQUERY:
WRITE_NODE_FIELD(subquery);
***************
*** 2942,2947 **** _outNode(StringInfo str, void *obj)
--- 2961,2972 ----
case T_FromExpr:
_outFromExpr(str, obj);
break;
+ case T_GenericExprState:
+ _outGenericExprState(str, obj);
+ break;
+ case T_ExprState:
+ _outExprState(str, obj);
+ break;
case T_Path:
_outPath(str, obj);
*** a/src/backend/nodes/readfuncs.c
--- b/src/backend/nodes/readfuncs.c
***************
*** 409,414 **** _readVar(void)
--- 409,415 ----
READ_UINT_FIELD(varno);
READ_INT_FIELD(varattno);
+ READ_INT_FIELD(varlogno);
READ_OID_FIELD(vartype);
READ_INT_FIELD(vartypmod);
READ_OID_FIELD(varcollid);
***************
*** 1114,1119 **** _readTargetEntry(void)
--- 1115,1121 ----
READ_UINT_FIELD(ressortgroupref);
READ_OID_FIELD(resorigtbl);
READ_INT_FIELD(resorigcol);
+ READ_INT_FIELD(resoriglogcol);
READ_BOOL_FIELD(resjunk);
READ_DONE();
***************
*** 1189,1194 **** _readRangeTblEntry(void)
--- 1191,1197 ----
case RTE_RELATION:
READ_OID_FIELD(relid);
READ_CHAR_FIELD(relkind);
+ READ_NODE_FIELD(lognums);
break;
case RTE_SUBQUERY:
READ_NODE_FIELD(subquery);
*** a/src/backend/optimizer/prep/prepjointree.c
--- b/src/backend/optimizer/prep/prepjointree.c
***************
*** 1343,1349 **** pullup_replace_vars_callback(Var *var,
* expansion with varlevelsup = 0, and then adjust if needed.
*/
expandRTE(rcon->target_rte,
var->varno, 0 /* not varlevelsup */ , var->location,
! (var->vartype != RECORDOID),
&colnames, &fields);
/* Adjust the generated per-field Vars, but don't insert PHVs */
--- 1343,1349 ----
* expansion with varlevelsup = 0, and then adjust if needed.
*/
expandRTE(rcon->target_rte,
var->varno, 0 /* not varlevelsup */ , var->location,
! (var->vartype != RECORDOID), false,
&colnames, &fields);
/* Adjust the generated per-field Vars, but don't insert PHVs */
*** a/src/backend/optimizer/util/plancat.c
--- b/src/backend/optimizer/util/plancat.c
***************
*** 850,855 **** build_physical_tlist(PlannerInfo *root, RelOptInfo *rel)
--- 850,856 ----
for (attrno = 1; attrno <= numattrs; attrno++)
{
Form_pg_attribute att_tup = relation->rd_att->attrs[attrno - 1];
+ TargetEntry *te;
if (att_tup->attisdropped)
{
***************
*** 864,875 **** build_physical_tlist(PlannerInfo *root, RelOptInfo *rel)
att_tup->atttypmod,
att_tup->attcollation,
0);
! tlist = lappend(tlist,
! makeTargetEntry((Expr *) var,
! attrno,
! NULL,
! false));
}
heap_close(relation, NoLock);
--- 865,875 ----
att_tup->atttypmod,
att_tup->attcollation,
0);
+ var->varlogno = att_tup->attlognum;
+ te = makeTargetEntry((Expr *) var, attrno, NULL, false);
+ te->resoriglogcol = var->varlogno;
! tlist = lappend(tlist, te);
}
heap_close(relation, NoLock);
***************
*** 899,905 **** build_physical_tlist(PlannerInfo *root, RelOptInfo *rel)
case RTE_VALUES:
case RTE_CTE:
/* Not all of these can have dropped cols, but share code anyway */
! expandRTE(rte, varno, 0, -1, true /* include dropped */ ,
NULL, &colvars);
foreach(l, colvars)
{
--- 899,905 ----
case RTE_VALUES:
case RTE_CTE:
/* Not all of these can have dropped cols, but share code anyway */
! expandRTE(rte, varno, 0, -1, true /* include dropped */ , false,
NULL, &colvars);
foreach(l, colvars)
{
*** a/src/backend/parser/analyze.c
--- b/src/backend/parser/analyze.c
***************
*** 635,641 **** transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
/*
* Generate list of Vars referencing the RTE
*/
! expandRTE(rte, rtr->rtindex, 0, -1, false, NULL, &exprList);
}
else
{
--- 635,641 ----
/*
* Generate list of Vars referencing the RTE
*/
! expandRTE(rte, rtr->rtindex, 0, -1, false, false, NULL, &exprList);
}
else
{
***************
*** 1163,1169 **** transformValuesClause(ParseState *pstate, SelectStmt *stmt)
* Generate a targetlist as though expanding "*"
*/
Assert(pstate->p_next_resno == 1);
! qry->targetList = expandRelAttrs(pstate, rte, rtr->rtindex, 0, -1);
/*
* The grammar allows attaching ORDER BY, LIMIT, and FOR UPDATE to a
--- 1163,1169 ----
* Generate a targetlist as though expanding "*"
*/
Assert(pstate->p_next_resno == 1);
! qry->targetList = expandRelAttrs(pstate, rte, rtr->rtindex, 0, false, -1);
/*
* The grammar allows attaching ORDER BY, LIMIT, and FOR UPDATE to a
*** a/src/backend/parser/parse_clause.c
--- b/src/backend/parser/parse_clause.c
***************
*** 789,797 **** transformFromClauseItem(ParseState *pstate, Node *n,
*
* Note: expandRTE returns new lists, safe for me to modify
*/
! expandRTE(l_rte, l_rtindex, 0, -1, false,
&l_colnames, &l_colvars);
! expandRTE(r_rte, r_rtindex, 0, -1, false,
&r_colnames, &r_colvars);
/*
--- 789,797 ----
*
* Note: expandRTE returns new lists, safe for me to modify
*/
! expandRTE(l_rte, l_rtindex, 0, -1, false, true,
&l_colnames, &l_colvars);
! expandRTE(r_rte, r_rtindex, 0, -1, false, true,
&r_colnames, &r_colvars);
/*
*** a/src/backend/parser/parse_coerce.c
--- b/src/backend/parser/parse_coerce.c
***************
*** 897,903 **** coerce_record_to_complex(ParseState *pstate, Node *node,
RangeTblEntry *rte;
rte = GetRTEByRangeTablePosn(pstate, rtindex, sublevels_up);
! expandRTE(rte, rtindex, sublevels_up, vlocation, false,
NULL, &args);
}
else
--- 897,903 ----
RangeTblEntry *rte;
rte = GetRTEByRangeTablePosn(pstate, rtindex, sublevels_up);
! expandRTE(rte, rtindex, sublevels_up, vlocation, false, false,
NULL, &args);
}
else
*** a/src/backend/parser/parse_relation.c
--- b/src/backend/parser/parse_relation.c
***************
*** 40,50 **** static void markRTEForSelectPriv(ParseState *pstate, RangeTblEntry *rte,
int rtindex, AttrNumber col);
static void expandRelation(Oid relid, Alias *eref,
int rtindex, int sublevels_up,
! int location, bool include_dropped,
List **colnames, List **colvars);
static void expandTupleDesc(TupleDesc tupdesc, Alias *eref,
int rtindex, int sublevels_up,
! int location, bool include_dropped,
List **colnames, List **colvars);
static int specialAttNum(const char *attname);
--- 40,50 ----
int rtindex, AttrNumber col);
static void expandRelation(Oid relid, Alias *eref,
int rtindex, int sublevels_up,
! int location, bool include_dropped, bool logical_sort,
List **colnames, List **colvars);
static void expandTupleDesc(TupleDesc tupdesc, Alias *eref,
int rtindex, int sublevels_up,
! int location, bool include_dropped, bool logical_sort,
List **colnames, List **colvars);
static int specialAttNum(const char *attname);
***************
*** 442,447 **** GetCTEForRTE(ParseState *pstate, RangeTblEntry *rte, int rtelevelsup)
--- 442,453 ----
return NULL; /* keep compiler quiet */
}
+ static int16
+ get_attnum_by_lognum(RangeTblEntry *rte, int16 attlognum)
+ {
+ return list_nth_int(rte->lognums, attlognum - 1);
+ }
+
/*
* scanRTEForColumn
* Search the column names of a single RTE for the given name.
***************
*** 484,489 **** scanRTEForColumn(ParseState *pstate, RangeTblEntry *rte, char *colname,
--- 490,497 ----
errmsg("column reference \"%s\" is ambiguous",
colname),
parser_errposition(pstate, location)));
+ if (rte->lognums)
+ attnum = get_attnum_by_lognum(rte, attnum);
var = make_var(pstate, rte, attnum, location);
/* Require read access to the column */
markVarForSelectPriv(pstate, var, rte);
***************
*** 699,711 **** markVarForSelectPriv(ParseState *pstate, Var *var, RangeTblEntry *rte)
* eref->colnames is filled in. Also, alias->colnames is rebuilt to insert
* empty strings for any dropped columns, so that it will be one-to-one with
* physical column numbers.
*/
static void
! buildRelationAliases(TupleDesc tupdesc, Alias *alias, Alias *eref)
{
int maxattrs = tupdesc->natts;
ListCell *aliaslc;
int numaliases;
int varattno;
int numdropped = 0;
--- 707,724 ----
* eref->colnames is filled in. Also, alias->colnames is rebuilt to insert
* empty strings for any dropped columns, so that it will be one-to-one with
* physical column numbers.
+ *
+ * If lognums is not NULL, it will be filled with a map from logical column
+ * numbers to attnum; that way, the nth element of eref->colnames corresponds
+ * to the attnum found in the nth element of lognums.
*/
static void
! buildRelationAliases(TupleDesc tupdesc, Alias *alias, Alias *eref, List **lognums)
{
int maxattrs = tupdesc->natts;
ListCell *aliaslc;
int numaliases;
+ Form_pg_attribute *attrs;
int varattno;
int numdropped = 0;
***************
*** 724,732 **** buildRelationAliases(TupleDesc tupdesc, Alias *alias, Alias *eref)
numaliases = 0;
}
for (varattno = 0; varattno < maxattrs; varattno++)
{
! Form_pg_attribute attr = tupdesc->attrs[varattno];
Value *attrname;
if (attr->attisdropped)
--- 737,747 ----
numaliases = 0;
}
+ attrs = TupleDescGetSortedAttrs(tupdesc);
+
for (varattno = 0; varattno < maxattrs; varattno++)
{
! Form_pg_attribute attr = attrs[varattno];
Value *attrname;
if (attr->attisdropped)
***************
*** 751,756 **** buildRelationAliases(TupleDesc tupdesc, Alias *alias, Alias *eref)
--- 766,774 ----
}
eref->colnames = lappend(eref->colnames, attrname);
+
+ if (lognums)
+ *lognums = lappend_int(*lognums, attr->attnum);
}
/* Too many user-supplied aliases? */
***************
*** 907,913 **** addRangeTableEntry(ParseState *pstate,
* and/or actual column names.
*/
rte->eref = makeAlias(refname, NIL);
! buildRelationAliases(rel->rd_att, alias, rte->eref);
/*
* Drop the rel refcount, but keep the access lock till end of transaction
--- 925,931 ----
* and/or actual column names.
*/
rte->eref = makeAlias(refname, NIL);
! buildRelationAliases(rel->rd_att, alias, rte->eref, &rte->lognums);
/*
* Drop the rel refcount, but keep the access lock till end of transaction
***************
*** 970,976 **** addRangeTableEntryForRelation(ParseState *pstate,
* and/or actual column names.
*/
rte->eref = makeAlias(refname, NIL);
! buildRelationAliases(rel->rd_att, alias, rte->eref);
/*----------
* Flags:
--- 988,994 ----
* and/or actual column names.
*/
rte->eref = makeAlias(refname, NIL);
! buildRelationAliases(rel->rd_att, alias, rte->eref, &rte->lognums);
/*----------
* Flags:
***************
*** 1145,1151 **** addRangeTableEntryForFunction(ParseState *pstate,
/* Composite data type, e.g. a table's row type */
Assert(tupdesc);
/* Build the column alias list */
! buildRelationAliases(tupdesc, alias, eref);
}
else if (functypclass == TYPEFUNC_SCALAR)
{
--- 1163,1169 ----
/* Composite data type, e.g. a table's row type */
Assert(tupdesc);
/* Build the column alias list */
! buildRelationAliases(tupdesc, alias, eref, NULL);
}
else if (functypclass == TYPEFUNC_SCALAR)
{
***************
*** 1552,1564 **** addRTEtoQuery(ParseState *pstate, RangeTblEntry *rte,
* values to use in the created Vars. Ordinarily rtindex should match the
* actual position of the RTE in its rangetable.
*
* The output lists go into *colnames and *colvars.
* If only one of the two kinds of output list is needed, pass NULL for the
* output pointer for the unwanted one.
*/
void
expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
! int location, bool include_dropped,
List **colnames, List **colvars)
{
int varattno;
--- 1570,1585 ----
* values to use in the created Vars. Ordinarily rtindex should match the
* actual position of the RTE in its rangetable.
*
+ * If logical_sort is true, then the resulting lists are sorted by logical
+ * column number (attlognum); otherwise use regular attnum.
+ *
* The output lists go into *colnames and *colvars.
* If only one of the two kinds of output list is needed, pass NULL for the
* output pointer for the unwanted one.
*/
void
expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
! int location, bool include_dropped, bool logical_sort,
List **colnames, List **colvars)
{
int varattno;
***************
*** 1573,1580 **** expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
case RTE_RELATION:
/* Ordinary relation RTE */
expandRelation(rte->relid, rte->eref,
! rtindex, sublevels_up, location,
! include_dropped, colnames, colvars);
break;
case RTE_SUBQUERY:
{
--- 1594,1601 ----
case RTE_RELATION:
/* Ordinary relation RTE */
expandRelation(rte->relid, rte->eref,
! rtindex, sublevels_up, location, include_dropped,
! logical_sort, colnames, colvars);
break;
case RTE_SUBQUERY:
{
***************
*** 1632,1638 **** expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
/* Composite data type, e.g. a table's row type */
Assert(tupdesc);
expandTupleDesc(tupdesc, rte->eref,
! rtindex, sublevels_up, location,
include_dropped, colnames, colvars);
}
else if (functypclass == TYPEFUNC_SCALAR)
--- 1653,1659 ----
/* Composite data type, e.g. a table's row type */
Assert(tupdesc);
expandTupleDesc(tupdesc, rte->eref,
! rtindex, sublevels_up, location, false,
include_dropped, colnames, colvars);
}
else if (functypclass == TYPEFUNC_SCALAR)
***************
*** 1844,1850 **** expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
*/
static void
expandRelation(Oid relid, Alias *eref, int rtindex, int sublevels_up,
! int location, bool include_dropped,
List **colnames, List **colvars)
{
Relation rel;
--- 1865,1871 ----
*/
static void
expandRelation(Oid relid, Alias *eref, int rtindex, int sublevels_up,
! int location, bool include_dropped, bool logical_sort,
List **colnames, List **colvars)
{
Relation rel;
***************
*** 1852,1858 **** expandRelation(Oid relid, Alias *eref, int rtindex, int sublevels_up,
/* Get the tupledesc and turn it over to expandTupleDesc */
rel = relation_open(relid, AccessShareLock);
expandTupleDesc(rel->rd_att, eref, rtindex, sublevels_up,
! location, include_dropped,
colnames, colvars);
relation_close(rel, AccessShareLock);
}
--- 1873,1879 ----
/* Get the tupledesc and turn it over to expandTupleDesc */
rel = relation_open(relid, AccessShareLock);
expandTupleDesc(rel->rd_att, eref, rtindex, sublevels_up,
! location, include_dropped, logical_sort,
colnames, colvars);
relation_close(rel, AccessShareLock);
}
***************
*** 1863,1878 **** expandRelation(Oid relid, Alias *eref, int rtindex, int sublevels_up,
static void
expandTupleDesc(TupleDesc tupdesc, Alias *eref,
int rtindex, int sublevels_up,
! int location, bool include_dropped,
List **colnames, List **colvars)
{
int maxattrs = tupdesc->natts;
int numaliases = list_length(eref->colnames);
int varattno;
for (varattno = 0; varattno < maxattrs; varattno++)
{
! Form_pg_attribute attr = tupdesc->attrs[varattno];
if (attr->attisdropped)
{
--- 1884,1905 ----
static void
expandTupleDesc(TupleDesc tupdesc, Alias *eref,
int rtindex, int sublevels_up,
! int location, bool include_dropped, bool logical_sort,
List **colnames, List **colvars)
{
int maxattrs = tupdesc->natts;
int numaliases = list_length(eref->colnames);
int varattno;
+ Form_pg_attribute *attrs;
+
+ if (logical_sort)
+ attrs = TupleDescGetSortedAttrs(tupdesc);
+ else
+ attrs = tupdesc->attrs;
for (varattno = 0; varattno < maxattrs; varattno++)
{
! Form_pg_attribute attr = attrs[varattno];
if (attr->attisdropped)
{
***************
*** 1912,1917 **** expandTupleDesc(TupleDesc tupdesc, Alias *eref,
--- 1939,1945 ----
attr->atttypid, attr->atttypmod,
attr->attcollation,
sublevels_up);
+ varnode->varlogno = attr->attlognum;
varnode->location = location;
*colvars = lappend(*colvars, varnode);
***************
*** 1931,1937 **** expandTupleDesc(TupleDesc tupdesc, Alias *eref,
*/
List *
expandRelAttrs(ParseState *pstate, RangeTblEntry *rte,
! int rtindex, int sublevels_up, int location)
{
List *names,
*vars;
--- 1959,1965 ----
*/
List *
expandRelAttrs(ParseState *pstate, RangeTblEntry *rte,
! int rtindex, int sublevels_up, bool logical_sort, int location)
{
List *names,
*vars;
***************
*** 1939,1945 **** expandRelAttrs(ParseState *pstate, RangeTblEntry *rte,
*var;
List *te_list = NIL;
! expandRTE(rte, rtindex, sublevels_up, location, false,
&names, &vars);
/*
--- 1967,1973 ----
*var;
List *te_list = NIL;
! expandRTE(rte, rtindex, sublevels_up, location, false, logical_sort,
&names, &vars);
/*
***************
*** 1959,1964 **** expandRelAttrs(ParseState *pstate, RangeTblEntry *rte,
--- 1987,1993 ----
(AttrNumber) pstate->p_next_resno++,
label,
false);
+ te->resoriglogcol = varnode->varlogno;
te_list = lappend(te_list, te);
/* Require read access to each column */
*** a/src/backend/parser/parse_target.c
--- b/src/backend/parser/parse_target.c
***************
*** 282,287 **** markTargetListOrigin(ParseState *pstate, TargetEntry *tle,
--- 282,288 ----
/* It's a table or view, report it */
tle->resorigtbl = rte->relid;
tle->resorigcol = attnum;
+ tle->resoriglogcol = var->varlogno;
break;
case RTE_SUBQUERY:
/* Subselect-in-FROM: copy up from the subselect */
***************
*** 1134,1140 **** ExpandAllTables(ParseState *pstate, int location)
target = list_concat(target,
expandRelAttrs(pstate, rte, rtindex, 0,
! location));
}
return target;
--- 1135,1141 ----
target = list_concat(target,
expandRelAttrs(pstate, rte, rtindex, 0,
! true, location));
}
return target;
***************
*** 1189,1202 **** ExpandSingleTable(ParseState *pstate, RangeTblEntry *rte,
{
/* expandRelAttrs handles permissions marking */
return expandRelAttrs(pstate, rte, rtindex, sublevels_up,
! location);
}
else
{
List *vars;
ListCell *l;
! expandRTE(rte, rtindex, sublevels_up, location, false,
NULL, &vars);
/*
--- 1190,1203 ----
{
/* expandRelAttrs handles permissions marking */
return expandRelAttrs(pstate, rte, rtindex, sublevels_up,
! true, location);
}
else
{
List *vars;
ListCell *l;
! expandRTE(rte, rtindex, sublevels_up, location, false, true,
NULL, &vars);
/*
***************
*** 1304,1309 **** ExpandRowReference(ParseState *pstate, Node *expr,
--- 1305,1311 ----
(AttrNumber) pstate->p_next_resno++,
pstrdup(NameStr(att->attname)),
false);
+ te->resoriglogcol = att->attlognum;
result = lappend(result, te);
}
else
***************
*** 1350,1356 **** expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
*lvar;
int i;
! expandRTE(rte, var->varno, 0, var->location, false,
&names, &vars);
tupleDesc = CreateTemplateTupleDesc(list_length(vars), false);
--- 1352,1358 ----
*lvar;
int i;
! expandRTE(rte, var->varno, 0, var->location, false, false,
&names, &vars);
tupleDesc = CreateTemplateTupleDesc(list_length(vars), false);
*** a/src/backend/rewrite/rewriteManip.c
--- b/src/backend/rewrite/rewriteManip.c
***************
*** 1263,1269 **** ResolveNew_callback(Var *var,
*/
expandRTE(rcon->target_rte,
var->varno, var->varlevelsup, var->location,
! (var->vartype != RECORDOID),
&colnames, &fields);
/* Adjust the generated per-field Vars... */
fields = (List *) replace_rte_variables_mutator((Node *) fields,
--- 1263,1269 ----
*/
expandRTE(rcon->target_rte,
var->varno, var->varlevelsup, var->location,
! (var->vartype != RECORDOID), false,
&colnames, &fields);
/* Adjust the generated per-field Vars... */
fields = (List *) replace_rte_variables_mutator((Node *) fields,
*** a/src/include/access/tupdesc.h
--- b/src/include/access/tupdesc.h
***************
*** 59,64 **** typedef struct tupleConstr
--- 59,69 ----
* row type, or a value >= 0 to allow the rowtype to be looked up in the
* typcache.c type cache.
*
+ * We keep an array of attributes sorted by attlognum. This helps *-expansion.
+ * The array is initially set to NULL, and is only populated on first access;
+ * those wanting to access it should always do it through
+ * TupleDescGetSortedAttrs.
+ *
* Tuple descriptors that live in caches (relcache or typcache, at present)
* are reference-counted: they can be deleted when their reference count goes
* to zero. Tuple descriptors created by the executor need no reference
***************
*** 72,77 **** typedef struct tupleDesc
--- 77,83 ----
int natts; /* number of attributes in the tuple */
Form_pg_attribute *attrs;
/* attrs[N] is a pointer to the description of Attribute Number N+1 */
+ Form_pg_attribute *logattrs; /* array of attributes sorted by attlognum */
TupleConstr *constr; /* constraints, or NULL if none */
Oid tdtypeid; /* composite type ID for tuple type */
int32 tdtypmod; /* typmod for tuple type */
***************
*** 119,126 **** extern void TupleDescInitEntryCollation(TupleDesc desc,
--- 125,138 ----
AttrNumber attributeNumber,
Oid collationid);
+ extern void TupleDescInitEntryLognum(TupleDesc desc,
+ AttrNumber attributeNumber,
+ int attlognum);
+
extern TupleDesc BuildDescForRelation(List *schema);
extern TupleDesc BuildDescFromLists(List *names, List *types, List *typmods, List *collations);
+ extern Form_pg_attribute *TupleDescGetSortedAttrs(TupleDesc desc);
+
#endif /* TUPDESC_H */
*** a/src/include/catalog/pg_attribute.h
--- b/src/include/catalog/pg_attribute.h
***************
*** 63,81 **** CATALOG(pg_attribute,1249) BKI_BOOTSTRAP BKI_WITHOUT_OIDS BKI_ROWTYPE_OID(75) BK
int2 attlen;
/*
! * attnum is the "attribute number" for the attribute: A value that
! * uniquely identifies this attribute within its class. For user
! * attributes, Attribute numbers are greater than 0 and not greater than
! * the number of attributes in the class. I.e. if the Class pg_class says
! * that Class XYZ has 10 attributes, then the user attribute numbers in
! * Class pg_attribute must be 1-10.
*
! * System attributes have attribute numbers less than 0 that are unique
! * within the class, but not constrained to any particular range.
*
* Note that (attnum - 1) is often used as the index to an array.
*/
int2 attnum;
/*
* attndims is the declared number of dimensions, if an array type,
--- 63,92 ----
int2 attlen;
/*
! * We previously had a single "attnum" attribute here, which has been
! * broken up into three parts:
*
! * attnum uniquely identifies the column within its class, throughout its
! * lifetime. For user attributes, Attribute numbers are greater than 0 and
! * not greater than the number of attributes in the class. I.e. if the
! * Class pg_class says that Class XYZ has 10 attributes, then the user
! * attribute numbers in Class pg_attribute must be 1-10. System attributes
! * have attribute numbers less than 0 that are unique within the class, but
! * not constrained to any particular range.
! *
! * attphysnum (physical position) specifies the position in which the
! * column is stored in physical tuples. This might differ from attnum if
! * there are useful optimizations in storage space, for example alignment
! * considerations.
! *
! * attlognum (logical position) specifies the position in which the column
! * is expanded in "SELECT * FROM rel" type queries.
*
* Note that (attnum - 1) is often used as the index to an array.
*/
int2 attnum;
+ int2 attphysnum;
+ int2 attlognum;
/*
* attndims is the declared number of dimensions, if an array type,
***************
*** 182,209 **** typedef FormData_pg_attribute *Form_pg_attribute;
* ----------------
*/
! #define Natts_pg_attribute 21
#define Anum_pg_attribute_attrelid 1
#define Anum_pg_attribute_attname 2
#define Anum_pg_attribute_atttypid 3
#define Anum_pg_attribute_attstattarget 4
#define Anum_pg_attribute_attlen 5
#define Anum_pg_attribute_attnum 6
! #define Anum_pg_attribute_attndims 7
! #define Anum_pg_attribute_attcacheoff 8
! #define Anum_pg_attribute_atttypmod 9
! #define Anum_pg_attribute_attbyval 10
! #define Anum_pg_attribute_attstorage 11
! #define Anum_pg_attribute_attalign 12
! #define Anum_pg_attribute_attnotnull 13
! #define Anum_pg_attribute_atthasdef 14
! #define Anum_pg_attribute_attisdropped 15
! #define Anum_pg_attribute_attislocal 16
! #define Anum_pg_attribute_attinhcount 17
! #define Anum_pg_attribute_attcollation 18
! #define Anum_pg_attribute_attacl 19
! #define Anum_pg_attribute_attoptions 20
! #define Anum_pg_attribute_attfdwoptions 21
/* ----------------
--- 193,222 ----
* ----------------
*/
! #define Natts_pg_attribute 23
#define Anum_pg_attribute_attrelid 1
#define Anum_pg_attribute_attname 2
#define Anum_pg_attribute_atttypid 3
#define Anum_pg_attribute_attstattarget 4
#define Anum_pg_attribute_attlen 5
#define Anum_pg_attribute_attnum 6
! #define Anum_pg_attribute_attphysnum 7
! #define Anum_pg_attribute_attlognum 8
! #define Anum_pg_attribute_attndims 9
! #define Anum_pg_attribute_attcacheoff 10
! #define Anum_pg_attribute_atttypmod 11
! #define Anum_pg_attribute_attbyval 12
! #define Anum_pg_attribute_attstorage 13
! #define Anum_pg_attribute_attalign 14
! #define Anum_pg_attribute_attnotnull 15
! #define Anum_pg_attribute_atthasdef 16
! #define Anum_pg_attribute_attisdropped 17
! #define Anum_pg_attribute_attislocal 18
! #define Anum_pg_attribute_attinhcount 19
! #define Anum_pg_attribute_attcollation 20
! #define Anum_pg_attribute_attacl 21
! #define Anum_pg_attribute_attoptions 22
! #define Anum_pg_attribute_attfdwoptions 23
/* ----------------
*** a/src/include/catalog/pg_class.h
--- b/src/include/catalog/pg_class.h
***************
*** 135,141 **** typedef FormData_pg_class *Form_pg_class;
/* Note: "3" in the relfrozenxid column stands for FirstNormalTransactionId */
DATA(insert OID = 1247 ( pg_type PGNSP 71 0 PGUID 0 0 0 0 0 0 0 0 f f p r 29 0 t f f f f 3 _null_ _null_ ));
DESCR("");
! DATA(insert OID = 1249 ( pg_attribute PGNSP 75 0 PGUID 0 0 0 0 0 0 0 0 f f p r 21 0 f f f f f 3 _null_ _null_ ));
DESCR("");
DATA(insert OID = 1255 ( pg_proc PGNSP 81 0 PGUID 0 0 0 0 0 0 0 0 f f p r 26 0 t f f f f 3 _null_ _null_ ));
DESCR("");
--- 135,141 ----
/* Note: "3" in the relfrozenxid column stands for FirstNormalTransactionId */
DATA(insert OID = 1247 ( pg_type PGNSP 71 0 PGUID 0 0 0 0 0 0 0 0 f f p r 29 0 t f f f f 3 _null_ _null_ ));
DESCR("");
! DATA(insert OID = 1249 ( pg_attribute PGNSP 75 0 PGUID 0 0 0 0 0 0 0 0 f f p r 23 0 f f f f f 3 _null_ _null_ ));
DESCR("");
DATA(insert OID = 1255 ( pg_proc PGNSP 81 0 PGUID 0 0 0 0 0 0 0 0 f f p r 26 0 t f f f f 3 _null_ _null_ ));
DESCR("");
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 701,706 **** typedef struct RangeTblEntry
--- 701,707 ----
*/
Oid relid; /* OID of the relation */
char relkind; /* relation kind (see pg_class.relkind) */
+ List *lognums; /* int list of logical column numbers */
/*
* Fields valid for a subquery RTE (else NULL):
*** a/src/include/nodes/primnodes.h
--- b/src/include/nodes/primnodes.h
***************
*** 142,147 **** typedef struct Var
--- 142,148 ----
* table, or INNER_VAR/OUTER_VAR/INDEX_VAR */
AttrNumber varattno; /* attribute number of this var, or zero for
* all */
+ AttrNumber varlogno; /* logical position of column in table XXX invalid value? */
Oid vartype; /* pg_type OID for the type of this var */
int32 vartypmod; /* pg_attribute typmod value */
Oid varcollid; /* OID of collation, or InvalidOid if none */
***************
*** 1166,1171 **** typedef struct TargetEntry
--- 1167,1173 ----
* clause */
Oid resorigtbl; /* OID of column's source table */
AttrNumber resorigcol; /* column's number in source table */
+ AttrNumber resoriglogcol; /* column's logical number in source table */
bool resjunk; /* set to true to eliminate the attribute from
* final target list */
} TargetEntry;
*** a/src/include/parser/parse_relation.h
--- b/src/include/parser/parse_relation.h
***************
*** 83,92 **** extern void addRTEtoQuery(ParseState *pstate, RangeTblEntry *rte,
bool addToRelNameSpace, bool addToVarNameSpace);
extern void errorMissingRTE(ParseState *pstate, RangeVar *relation);
extern void expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
! int location, bool include_dropped,
List **colnames, List **colvars);
extern List *expandRelAttrs(ParseState *pstate, RangeTblEntry *rte,
! int rtindex, int sublevels_up, int location);
extern int attnameAttNum(Relation rd, const char *attname, bool sysColOK);
extern Name attnumAttName(Relation rd, int attid);
extern Oid attnumTypeId(Relation rd, int attid);
--- 83,92 ----
bool addToRelNameSpace, bool addToVarNameSpace);
extern void errorMissingRTE(ParseState *pstate, RangeVar *relation);
extern void expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
! int location, bool include_dropped, bool logical_sort,
List **colnames, List **colvars);
extern List *expandRelAttrs(ParseState *pstate, RangeTblEntry *rte,
! int rtindex, int sublevels_up, bool logical_sort, int location);
extern int attnameAttNum(Relation rd, const char *attname, bool sysColOK);
extern Name attnumAttName(Relation rd, int attid);
extern Oid attnumTypeId(Relation rd, int attid);
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
I've been trying to implement the holy grail of decoupling
logical/physical column sort order representation, i.e., the feature
that lets the server have one physical order, for storage compactness,
and a different "output" order that can be tweaked by the user. This
has been discussed many times; most recently, I believe, here:
http://archives.postgresql.org/pgsql-hackers/2007-02/msg01235.php
with implementation details here:
http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php
The idea described there by Tom, and upon which I formed a vague
implementation plan in my head, is that I was to look for all uses of
an "attnum", and then replace it by either "attlognum" (i.e. the
user-visible sort identifier) or "attphysnum" (i.e. the order of
attributes as stored on disk).
I thought we'd concluded that we really need three values: attnum should
be a permanent logical ID for each column, and then the user-visible
column order would be determined by a different number, and the on-disk
column order by a third. If we're going to do this at all, it seems
like a seriously bad idea to only go halfway, because then we'll just
have to revisit all the same code again later.
You do *not* want to store either of the latter two numbers in
parse-time Var nodes, because then you can't rearrange columns without
having to update stored rules. But it might be useful to decree that
one thing setrefs.c does is renumber Vars in scan nodes to use the
physical column numbers instead of the permanent IDs.
I haven't looked into any of the details, but I would guess that
targetlists should always be constructed in logical (user-visible)
column order. TupleDescs need to match the physical order, most
likely. Note that all three orderings are always going to be the same
everywhere above the table scan level. (And I suppose COPY will need
some hack or other.)
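To make the three-number scheme above concrete, here is a minimal sketch. It is not code from the patch: DemoAttr and demo_sort_attrs are hypothetical stand-ins for a pg_attribute row and for whatever helper ends up producing an ordered view. The point is only that attnum stays fixed while the logical and physical orderings are derived views over the same columns.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a pg_attribute row. */
typedef struct DemoAttr
{
    const char *name;
    short       attnum;     /* permanent identity of the column */
    short       attlognum;  /* user-visible (output) position */
    short       attphysnum; /* on-disk (storage) position */
} DemoAttr;

/* qsort comparator: order attribute pointers by logical position. */
static int
demo_cmp_lognum(const void *a, const void *b)
{
    const DemoAttr *x = *(const DemoAttr *const *) a;
    const DemoAttr *y = *(const DemoAttr *const *) b;

    return (int) x->attlognum - (int) y->attlognum;
}

/* qsort comparator: order attribute pointers by physical position. */
static int
demo_cmp_physnum(const void *a, const void *b)
{
    const DemoAttr *x = *(const DemoAttr *const *) a;
    const DemoAttr *y = *(const DemoAttr *const *) b;

    return (int) x->attphysnum - (int) y->attphysnum;
}

/*
 * Fill out[] with pointers into attrs[], sorted by physical position if
 * by_physical is nonzero, else by logical position.  attrs[] itself is
 * left untouched, so attrs[attnum - 1] keeps working as an index.
 */
static void
demo_sort_attrs(const DemoAttr *attrs, int natts,
                const DemoAttr **out, int by_physical)
{
    int         i;

    assert(natts >= 0);
    for (i = 0; i < natts; i++)
        out[i] = &attrs[i];
    qsort(out, (size_t) natts, sizeof(out[0]),
          by_physical ? demo_cmp_physnum : demo_cmp_lognum);
}
```

With columns x, y, z carrying (attnum, attlognum, attphysnum) of (1,3,2), (2,2,3), (3,1,1), the logical view comes out as z, y, x and the physical view as z, x, y, while attrs[0] is still x.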
regards, tom lane
Excerpts from Tom Lane's message of Tue Dec 20 18:24:29 -0300 2011:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
I've been trying to implement the holy grail of decoupling
logical/physical column sort order representation, i.e., the feature
that lets the server have one physical order, for storage compactness,
and a different "output" order that can be tweaked by the user. This
has been discussed many times; most recently, I believe, here:
http://archives.postgresql.org/pgsql-hackers/2007-02/msg01235.php
with implementation details here:
http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php
The idea described there by Tom, and upon which I formed a vague
implementation plan in my head, is that I was to look for all uses of
an "attnum", and then replace it by either "attlognum" (i.e. the
user-visible sort identifier) or "attphysnum" (i.e. the order of
attributes as stored on disk).
I thought we'd concluded that we really need three values: attnum should
be a permanent logical ID for each column, and then the user-visible
column order would be determined by a different number, and the on-disk
column order by a third. If we're going to do this at all, it seems
like a seriously bad idea to only go halfway, because then we'll just
have to revisit all the same code again later.
Yeah, I was unclear -- that's what I'm doing (or, rather, attempting to
do).
You do *not* want to store either of the latter two numbers in
parse-time Var nodes, because then you can't rearrange columns without
having to update stored rules. But it might be useful to decree that
one thing setrefs.c does is renumber Vars in scan nodes to use the
physical column numbers instead of the permanent IDs.
Hmm, having the numbers in Var nodes seems a fundamental part of the way
I'm attacking the problem. Hopefully after I give setrefs.c a read I
will have a clearer picture of the way to do it without that.
I haven't looked into any of the details, but I would guess that
targetlists should always be constructed in logical (user-visible)
column order. TupleDescs need to match the physical order, most
likely. Note that all three orderings are always going to be the same
everywhere above the table scan level. (And I suppose COPY will need
some hack or other.)
Okay. AFAICS this shoots down the idea of modifying destreceivers,
which is good because I was coming to that conclusion for a different
reason.
Thanks for the pointers.
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
Excerpts from Tom Lane's message of Tue Dec 20 18:24:29 -0300 2011:
You do *not* want to store either of the latter two numbers in
parse-time Var nodes, because then you can't rearrange columns without
having to update stored rules. But it might be useful to decree that
one thing setrefs.c does is renumber Vars in scan nodes to use the
physical column numbers instead of the permanent IDs.
Hmm, having the numbers in Var nodes seems a fundamental part of the way
I'm attacking the problem. Hopefully after I give setrefs.c a read I
will have a clearer picture of the way to do it without that.
To clarify a bit: one thing that setrefs.c already does is to renumber
Var nodes above the scan level, so that their attnums refer not to
original table column attnums but to column numbers in the output of the
next plan level down. Vars in scan nodes currently don't need any
renumbering, but it'd be easy enough to extend the logic to do something
to them as well. I'm visualizing the run-time transformation from
physical to logical column ordering as a sort of projection, much like
the mapping that happens in a join node.
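As a rough illustration of the "sort of projection" idea, and not of anything setrefs.c or the executor does today, the run-time step could look like the sketch below. DemoColMap and demo_project_to_logical are hypothetical names; the only assumption is that each column knows its 1-based physical and logical positions.

```c
#include <assert.h>

/* Hypothetical per-column mapping; both positions are 1-based. */
typedef struct DemoColMap
{
    int         attphysnum;  /* where the value sits in the stored tuple */
    int         attlognum;   /* where the value must appear in the output */
} DemoColMap;

/*
 * Copy values from a physically ordered array into a logically ordered
 * one.  This is a pure permutation, much like a join node picking its
 * output columns from the rows of its inputs.
 */
static void
demo_project_to_logical(const DemoColMap *cols, int natts,
                        const long *phys_values, long *log_values)
{
    int         i;

    for (i = 0; i < natts; i++)
    {
        assert(cols[i].attphysnum >= 1 && cols[i].attphysnum <= natts);
        assert(cols[i].attlognum >= 1 && cols[i].attlognum <= natts);
        log_values[cols[i].attlognum - 1] =
            phys_values[cols[i].attphysnum - 1];
    }
}
```

Everything above the scan node would then see only the logically ordered array, which is consistent with the remark that the orderings coincide above the table scan level.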
regards, tom lane
On Tue, Dec 20, 2011 at 9:47 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
The idea described there by Tom, and upon which I formed a vague
implementation plan in my head, is that I was to look for all uses of
an "attnum", and then replace it by either "attlognum" (i.e. the
user-visible sort identifier) or "attphysnum" (i.e. the order of
attributes as stored on disk).
I thought we'd concluded that we really need three values: attnum should
be a permanent logical ID for each column, and then the user-visible
column order would be determined by a different number, and the on-disk
column order by a third. If we're going to do this at all, it seems
like a seriously bad idea to only go halfway, because then we'll just
have to revisit all the same code again later.
Yeah, I was unclear -- that's what I'm doing (or, rather, attempting to
do).
Sounds great.
While you're doing this, I'd like to think about future requirements,
to see if that changes anything.
Having a unique logical column id is a great thing because it allows
the physical storage to differ. This is the first part to allowing
these features...
* "column-based storage" where the data for some column(s) lives in a
dedicated heap
* "vertical partitioning" where defined groups of columns live in
separate heaps for performance and/or security
* "generated columns" where the column exists only logically and is
derived at run-time (per SQL Standard)
* "key/value columns" where we retrieve the column value from an hstore
* "very large number of columns" for statistical data sets where we
automatically vertically partition the heap when faced with large
numbers of column definitions
So when you store the physical representation please also store a
storage method, that currently has just one method SM_HEAP and a
relfilenode.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Excerpts from Simon Riggs's message of Wed Dec 21 09:44:04 -0300 2011:
Sounds great.
While you're doing this, I'd like to think about future requirements,
to see if that changes anything.
Having a unique logical column id is a great thing because it allows
the physical storage to differ. This is the first part to allowing
these features...
Great ideas. This one I'm not sure about at all:
* "very large number of columns" for statistical data sets where we
automatically vertically partition the heap when faced with large
numbers of column definitions
So when you store the physical representation please also store a
storage method, that currently has just one method SM_HEAP and a
relfilenode.
Well, for the patch I'm working on right now, I'm just going to store an
ID as "physical representation", which will mean the sort order used for
the on-disk representation of our current heap storage; the idea here is
to allow columns to be sorted internally by the system so that alignment
padding is reduced; nothing more. Of course, we can work on more
complex representations later that allow different storage strategies,
such as the ones you propose.
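To show why reordering helps with alignment padding, here is a small sketch under simplifying assumptions: fixed-width columns only, no tuple header, no null bitmap, no varlena, and no trailing padding. DemoCol and the demo_* functions are hypothetical and are not part of the patch; sorting by descending alignment is just one plausible way to choose attphysnum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-width column: length and required alignment in bytes. */
typedef struct DemoCol
{
    size_t      len;
    size_t      align;
} DemoCol;

/* Round off up to the next multiple of align. */
static size_t
demo_align_up(size_t off, size_t align)
{
    assert(align > 0);
    return (off + align - 1) / align * align;
}

/* Bytes needed to store the columns in the given order, padding included. */
static size_t
demo_row_size(const DemoCol *cols, int n)
{
    size_t      off = 0;
    int         i;

    for (i = 0; i < n; i++)
    {
        off = demo_align_up(off, cols[i].align);
        off += cols[i].len;
    }
    return off;
}

/* qsort comparator: widest alignment first. */
static int
demo_cmp_align_desc(const void *a, const void *b)
{
    const DemoCol *x = (const DemoCol *) a;
    const DemoCol *y = (const DemoCol *) b;

    if (x->align != y->align)
        return (x->align > y->align) ? -1 : 1;
    return 0;
}

/* Reorder columns in place so that no padding is needed between them. */
static void
demo_sort_by_alignment(DemoCol *cols, int n)
{
    qsort(cols, (size_t) n, sizeof(DemoCol), demo_cmp_align_desc);
}
```

For a table declared as (1-byte, 8-byte, 1-byte, 4-byte) columns, the declared order needs 24 bytes under these assumptions, while the alignment-sorted order needs 14.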
--
Álvaro Herrera <alvherre@commandprompt.com>
On Wed, Dec 21, 2011 at 1:42 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
This one I'm not sure about at all:
* "very large number of columns" for statistical data sets where we
automatically vertically partition the heap when faced with large
numbers of column definitions
We currently have pg_attribute.attnum as an int2, so we can store up
to 32767 columns without changing that size, as long as we have some
place to put the data.
Was there something you're working on likely to prevent >240 cols?
Just worth documenting what you see at this stage.
--
Simon Riggs http://www.2ndQuadrant.com/
Excerpts from Simon Riggs's message of Wed Dec 21 15:53:20 -0300 2011:
On Wed, Dec 21, 2011 at 1:42 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
This one I'm not sure about at all:
* "very large number of columns" for statistical data sets where we
automatically vertically partition the heap when faced with large
numbers of column definitions
We currently have pg_attribute.attnum as an int2, so we can store up
to 32767 columns without changing that size, as long as we have some
place to put the data.
Hm, right.
Was there something you're working on likely to prevent >240 cols?
No, not at all.
Just worth documenting what you see at this stage.
I'll keep my eyes open :-)
--
Álvaro Herrera <alvherre@commandprompt.com>
Excerpts from Tom Lane's message of Tue Dec 20 22:23:36 -0300 2011:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Excerpts from Tom Lane's message of Tue Dec 20 18:24:29 -0300 2011:
You do *not* want to store either of the latter two numbers in
parse-time Var nodes, because then you can't rearrange columns without
having to update stored rules. But it might be useful to decree that
one thing setrefs.c does is renumber Vars in scan nodes to use the
physical column numbers instead of the permanent IDs.
Hmm, having the numbers in Var nodes seems a fundamental part of the way
I'm attacking the problem. Hopefully after I give setrefs.c a read I
will have a clearer picture of the way to do it without that.
To clarify a bit: one thing that setrefs.c already does is to renumber
Var nodes above the scan level, so that their attnums refer not to
original table column attnums but to column numbers in the output of the
next plan level down. Vars in scan nodes currently don't need any
renumbering, but it'd be easy enough to extend the logic to do something
to them as well. I'm visualizing the run-time transformation from
physical to logical column ordering as a sort of projection, much like
the mapping that happens in a join node.
After more playing with this, it turned out that those logical numbers
stored in Var and TargetEntry are actually completely useless; after
they served their purpose in helping me track down that I actually
needed to sort columns at the RangeTblEntry level, I was able to revert
all those bits and things work fine (actually they work better). So
far, I have had no need to touch setrefs.c that I see. The reason is
that * expansion happens much earlier than setrefs.c is involved, at the
parse analysis level; the target lists generated at that point must
already follow the logical column order. So that part of the patch
becomes this:
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 9e277c5..f640bd8 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -701,6 +701,7 @@ typedef struct RangeTblEntry
*/
Oid relid; /* OID of the relation */
char relkind; /* relation kind (see pg_class.relkind) */
+ List *lognums; /* int list of logical column numbers */
/*
* Fields valid for a subquery RTE (else NULL):
Note that the eref->colnames list is built in logical column order
(which is what it should be, because it then matches the alias->colnames
list). With all that, it's easy to map the attnums to the logical
numbers when the target list is being constructed. And things work fine
from that point onwards, because we still keep track of the original
attnum to reference the TupleDesc.
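A sketch of how that lookup might go, with loud caveats: the patch only says that lognums is an int list kept next to eref->colnames, so the exact meaning of each entry here is an assumption. In this sketch both arrays are in logical order and lognums[i] holds the permanent attnum of the column shown at logical position i + 1. demo_attnum_for_name is a hypothetical helper, not a function in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Resolve a column name to its permanent attnum.
 *
 * Assumption (not confirmed by the patch): colnames[] and lognums[] are
 * parallel arrays in logical order, and lognums[i] is the attnum of the
 * column displayed at logical position i + 1.
 *
 * Returns 0 (never a valid user attnum) when the name is not found.
 */
static int
demo_attnum_for_name(const char *const *colnames, const int *lognums,
                     int ncols, const char *name)
{
    int         i;

    assert(colnames != NULL && lognums != NULL && name != NULL);
    for (i = 0; i < ncols; i++)
    {
        if (strcmp(colnames[i], name) == 0)
            return lognums[i];
    }
    return 0;
}
```

With colnames ("z" "y" "x") and lognums (3 2 1), looking up "z" gives attnum 3, which can still be used as attrs[attnum - 1] against the TupleDesc.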
A RTE in a stored rule looks like this:
{RTE :alias <> :eref {ALIAS :aliasname bar :colnames ("z" "y" "x")} :rtekind 0
:relid 16404 :relkind r :lognums (i 3 2 1) :inh true :inFromCl true
:requiredPerms 2 :checkAsUser 0 :selectedCols (b 9 10 11) :modifiedCols (b)}
The original table was created with columns "x, y, z", and then I
reversed the order. So if I change the column order in the original
table, the rule does not need any change and it continues to return the
logical order that the table had when the view was originally defined.
(I wonder what the other two RTEs, those named "new" and "old", are
for.)
One thing I'm finding necessary (for COPY support as well as things that
travel through different DestReceivers, such as SQL functions) is that
TupleTableSlots need to keep track of logical vs. physical order, and
form/deform tuples using the correct ordering. So the values/isnull
arrays may be in either order depending on what the caller is doing. At
some point a MinimalTuple might be constructed in logical order, for
example, and the caller must be aware of this so that it can be
deconstructed correctly later on. I mention this so that there's time
for bells to ring ...
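One way to picture the slot bookkeeping, purely as a sketch: tag the slot with the ordering its arrays use, and have accessors consult the tag. DemoSlot, DemoOrder and demo_slot_getattr_logical are hypothetical; the real TupleTableSlot has no such fields, and the fixed-size arrays are only there to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_ATTS 8

/* Which ordering the values[]/isnull[] arrays of a slot follow. */
typedef enum DemoOrder
{
    DEMO_ORDER_PHYSICAL,
    DEMO_ORDER_LOGICAL
} DemoOrder;

/* Hypothetical, heavily simplified slot. */
typedef struct DemoSlot
{
    int         natts;
    DemoOrder   order;
    long        values[DEMO_MAX_ATTS];
    bool        isnull[DEMO_MAX_ATTS];
    int         log_to_phys[DEMO_MAX_ATTS]; /* 0-based physical index of
                                             * each logical column */
} DemoSlot;

/*
 * Fetch the column at 1-based logical position lognum, regardless of the
 * ordering in which the slot happens to store its arrays.
 */
static long
demo_slot_getattr_logical(const DemoSlot *slot, int lognum, bool *isnull)
{
    int         idx;

    assert(lognum >= 1 && lognum <= slot->natts);
    idx = lognum - 1;
    if (slot->order == DEMO_ORDER_PHYSICAL)
        idx = slot->log_to_phys[idx];
    *isnull = slot->isnull[idx];
    return slot->values[idx];
}
```

The caller never has to know which ordering was used when the slot was filled, which is the property the paragraph above is after; forgetting to set the tag is exactly the kind of bug it warns about.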
--
Álvaro Herrera <alvherre@commandprompt.com>