Global temporary tables
The current Postgres implementation of temporary tables causes a number of
problems:
1. Catalog bloat: if a client creates and deletes many temporary
tables, autovacuum gets stuck on the catalog (see the example after this list).
2. Parallel queries: right now, use of a temporary table in a query
disables parallel plans.
3. Temporary tables cannot be used on a replica. A hot standby
configuration is frequently used to run OLAP queries on the replica,
and the results of such queries would naturally be saved in temporary tables.
Right now this is not possible (except for "hackers" solutions such as
storing the results in file_fdw).
4. Temporary tables cannot be used in prepared transactions.
5. Inefficient memory usage and possible memory overflow: each backend
maintains its own local buffers for working with temporary tables.
The default size of temporary buffers (the temp_buffers GUC) is 8MB.
That seems too small for modern servers with hundreds of gigabytes of RAM,
causing extra copying of data between the OS cache and local buffers.
But if there are thousands of backends, each executing queries with
temporary tables, then the total amount of memory used for temporary
buffers can exceed several tens of gigabytes.
6. A connection pooler cannot reschedule a session that has created
temporary tables to some other backend,
because their data is stored in local buffers.
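As a minimal illustration of problem 1 (a sketch: the loop count and the
catalog being inspected are arbitrary), each create/drop cycle leaves dead
rows in pg_class, pg_type, pg_attribute and pg_depend that autovacuum must
later clean up:

    do $$
    begin
        for i in 1..100000 loop
            execute 'create temp table bloat_demo(x integer)';
            execute 'drop table bloat_demo';
        end loop;
    end $$;
    -- the catalogs keep growing until (auto)vacuum catches up:
    select pg_size_pretty(pg_total_relation_size('pg_attribute'));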
There have been several attempts to address these problems.
For example, Alexandr Alekseev implemented a patch which allows creating
fast temporary tables without accessing the system catalog:
/messages/by-id/20160301182500.2c81c3dc@fujitsu
Unfortunately, this patch was too invasive and was rejected by the community.
There were also attempts to allow, under some conditions, the use of
temporary tables in 2PC transactions:
/messages/by-id/m2d0pllvqy.fsf@dimitris-macbook-pro.home
/messages/by-id/3a4b3c88-4fa5-1edb-a878-1ed76fa1c82b@postgrespro.ru
They were also rejected.
I am making yet another attempt to address these problems, first of all
1), 2), 5) and 6).
To solve them, I propose the notion of "global temporary" tables,
similar to the ones in Oracle.
The definition of such a table (its metadata) is shared by all backends, but
its data is private to each backend. After session termination the data is
obviously lost.
Suggested syntax for creating global temporary tables:
create global temp table
or
create session table
Once created, such a table can be used by all backends.
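For example, mirroring the regression test included in the patch (\c
reconnects, i.e. starts a new backend):

    create session table my_private_table(x integer primary key, y integer);
    insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
    select count(*) from my_private_table;   -- 10000
    \c
    select count(*) from my_private_table;   -- 0: the definition survived,
                                             -- but the data was private to the old backend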
Global temporary tables are accessed through shared buffers (to solve
problem 2).
Cleanup of temporary table data (releasing shared buffers and deleting
relation files) is performed on backend termination.
In case of abnormal server termination, the files of global temporary tables
are cleaned up in the same way as those of local temporary tables.
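The per-backend relation files reuse the t<backendId>_<relfilenode> naming of
local temp relations (see the GetRelationPath change in the patch), which is
what lets the existing cleanup code handle them; the output below is
illustrative only:

    select pg_relation_filepath('my_private_table');
     pg_relation_filepath
    ----------------------
     base/13593/t3_16392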
Certainly there are cases where global temporary tables cannot be used,
e.g. when an application dynamically constructs the name and columns of a
temporary table.
Also, access to local buffers is more efficient than access to shared
buffers because it doesn't require any synchronization.
But please note that it is always possible to create old-style (local)
temporary tables, which preserve the current behavior.
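So both forms coexist (table names here are illustrative):

    create local temp table scratch(x integer);   -- current behavior: per-session table in pg_temp, local buffers
    create global temp table results(x integer);  -- shared definition, per-backend data in shared buffers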
The problem with replicas is still not solved, but shared metadata is a
step in this direction.
I am also thinking about reimplementing temporary tables using the new table
access method API.
The drawback of such an approach is that it would be necessary to
reimplement a large bulk of the heapam code.
But this approach would allow eliminating visibility checks for temporary
table tuples and decreasing the size of the tuple header.
I am still not sure whether implementing a special table access method for
temporary tables is a good idea.
A patch for global temporary tables is attached to this mail.
A known limitation is that it currently supports only B-Tree indexes.
Any feedback is welcome.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
session_tables.patch (text/x-patch)
commit 4d7a1f51405cdcb4d9dc421b0cb3a0d3439ba362
Author: Konstantin Knizhnik <knizhnik@garret.ru>
Date: Tue Jun 25 14:49:50 2019 +0300
Add session tables
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index c945b28..14d4e48 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &rnode, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
}
}
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 92ea1d1..2994687 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -674,6 +674,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index de4d4ef..c96b722 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -804,7 +804,11 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* A session temporary relation may not yet be initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && PageIsNew(BufferGetPage(buf)) && IsSessionRelationBackendId(rel->rd_backend))
+ _bt_initmetapage(BufferGetPage(buf), P_NONE, 0);
+ else
+ _bt_checkpage(rel, buf);
}
else
{
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index d3c0a93..5e2dd63 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -213,6 +213,7 @@ void
XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
{
registered_buffer *regbuf;
+ RelFileNodeBackend rnode;
/* NO_IMAGE doesn't make sense with FORCE_IMAGE */
Assert(!((flags & REGBUF_FORCE_IMAGE) && (flags & (REGBUF_NO_IMAGE))));
@@ -227,7 +228,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &rnode, &regbuf->forkno, &regbuf->block);
+ regbuf->rnode = rnode.node;
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -919,7 +921,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blkno;
@@ -948,7 +950,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
flags |= REGBUF_STANDARD;
BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ XLogRegisterBlock(0, &rnode.node, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1009,7 +1011,7 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkNum;
BlockNumber blkno;
@@ -1018,7 +1020,7 @@ log_newpage_buffer(Buffer buffer, bool page_std)
BufferGetTag(buffer, &rnode, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rnode.node, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 11936a6..82108d3 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -409,6 +409,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index d2e4f53..8709ff1 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3550,7 +3550,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 3cc886f..a111ddc 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index ebaec4f..2719e2a 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1400,7 +1400,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 95af5ec..788d6cb 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -7617,6 +7617,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -13968,6 +13974,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 8311b1d..3f011c6 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3265,20 +3265,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index e7c32f2..d19d7bc 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3494,6 +3494,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
{
ReorderBufferTupleCidKey key;
ReorderBufferTupleCidEnt *ent;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blockno;
bool updated_mapping = false;
@@ -3507,7 +3508,8 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
* get relfilenode from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &rnode, &forkno, &blockno);
+ key.relnode = rnode.node;
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 7332e6b..44213f5 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -555,7 +555,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
int buf_id;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -709,7 +709,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
Block bufBlock;
bool found;
bool isExtend;
- bool isLocalBuf = SmgrIsTemp(smgr);
+ bool isLocalBuf = SmgrIsTemp(smgr) && relpersistence == RELPERSISTENCE_TEMP;
*hit = false;
@@ -1009,7 +1009,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1531,7 +1531,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1542,7 +1543,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -1844,8 +1846,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rnode.node.spcNode;
+ item->relNode = bufHdr->tag.rnode.node.relNode;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2558,7 +2560,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rnode.node, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2630,7 +2632,7 @@ BufferGetBlockNumber(Buffer buffer)
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2695,7 +2697,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rnode.node, buf->tag.rnode.backend);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -2929,7 +2931,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
int i;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileNodeBackendIsLocalTemp(rnode))
{
if (rnode.backend == MyBackendId)
DropRelFileNodeLocalBuffers(rnode.node, forkNum, firstDelBlock);
@@ -2957,11 +2959,11 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
* We could check forkNum and blockNum as well as the rnode, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode))
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -2984,24 +2986,24 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
{
int i,
n = 0;
- RelFileNode *nodes;
+ RelFileNodeBackend *nodes;
bool use_bsearch;
if (nnodes == 0)
return;
- nodes = palloc(sizeof(RelFileNode) * nnodes); /* non-local relations */
+ nodes = palloc(sizeof(RelFileNodeBackend) * nnodes); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
for (i = 0; i < nnodes; i++)
{
- if (RelFileNodeBackendIsTemp(rnodes[i]))
+ if (RelFileNodeBackendIsLocalTemp(rnodes[i]))
{
if (rnodes[i].backend == MyBackendId)
DropRelFileNodeAllLocalBuffers(rnodes[i].node);
}
else
- nodes[n++] = rnodes[i].node;
+ nodes[n++] = rnodes[i];
}
/*
@@ -3024,11 +3026,11 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
/* sort the list of rnodes if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(nodes, n, sizeof(RelFileNodeBackend), rnode_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileNodeBackend *rnode = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
@@ -3043,7 +3045,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, nodes[j]))
{
rnode = &nodes[j];
break;
@@ -3053,7 +3055,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
else
{
rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
+ nodes, n, sizeof(RelFileNodeBackend),
rnode_comparator);
}
@@ -3062,7 +3064,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, (*rnode)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3101,11 +3103,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rnode.node.dbNode == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3135,7 +3137,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpath(buf->tag.rnode, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3203,7 +3205,8 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3250,13 +3253,15 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node)
+ || bufHdr->tag.rnode.backend != rel->rd_backend)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3304,13 +3309,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rnode.node.dbNode == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4050,7 +4055,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpath(buf->tag.rnode, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4074,7 +4079,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpath(bufHdr->tag.rnode, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4092,7 +4097,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4107,22 +4112,27 @@ local_buffer_write_error_callback(void *arg)
static int
rnode_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileNodeBackend n1 = *(const RelFileNodeBackend *) p1;
+ RelFileNodeBackend n2 = *(const RelFileNodeBackend *) p2;
- if (n1.relNode < n2.relNode)
+ if (n1.node.relNode < n2.node.relNode)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.node.relNode > n2.node.relNode)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.node.dbNode < n2.node.dbNode)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.node.dbNode > n2.node.dbNode)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.node.spcNode < n2.node.spcNode)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.node.spcNode > n2.node.spcNode)
+ return 1;
+
+ if (n1.backend < n2.backend)
+ return -1;
+ else if (n1.backend > n2.backend)
return 1;
else
return 0;
@@ -4358,7 +4368,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileNodeBackendEquals(cur->tag.rnode, next->tag.rnode) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4377,7 +4387,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rnode.node, tag.rnode.backend);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index c462ea8..d197c46 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ LocalPrefetchBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -111,7 +111,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -209,7 +209,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rnode.node, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -331,14 +331,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -377,12 +377,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index cf7f03f..65eb422 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
BufferGetTag(buf, &rnode, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index bbcd18d..f4dbd0a 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -32,6 +32,7 @@
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
@@ -86,6 +87,18 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to collect information about session relations created by this backend.
+ * Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -151,6 +164,48 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Remove relation pages from shared buffers */
+ DropRelFileNodesAllBuffers(&rel->rnode, 1);
+
+ /* Delete relation files */
+ mdunlink(rel->rnode, InvalidForkNumber, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or when it is opened and its file doesn't yet exist.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We are using a shared-memory exit hook, because
+ * cleanup should be performed before the backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -204,6 +259,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln);
pfree(path);
@@ -451,6 +508,19 @@ mdopen(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * In case of session relation access, this backend may not yet have files for this relation.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -462,6 +532,7 @@ mdopen(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -627,8 +698,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* In case of a session relation we need to write the zeroed page so that subsequent mdnblocks returns a correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -713,12 +789,18 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopen(reln, forknum, EXTENSION_FAIL);
+ /*
+ * If we access a session relation, this backend may not yet have files for it.
+ * Pass EXTENSION_RETURN_NULL to make mdopen return NULL in this case instead of reporting an error.
+ */
+ MdfdVec *v = mdopen(reln, forknum, RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode)
+ ? EXTENSION_RETURN_NULL : EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2b992d7..b9fbf01 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1099,6 +1099,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3295,6 +3299,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 9f59cc7..159321f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15582,8 +15582,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract SessionRelFirstBackendId
+ * to avoid overly long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary relations
+ * to make it possible to clean them up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..f226e7c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add a special bias value to the backend id (since
+ * the maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,10 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index df2dda7..7adb96b 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,16 +90,17 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileNodeBackend rnode; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rnode.node.spcNode = InvalidOid, \
+ (a).rnode.node.dbNode = InvalidOid, \
+ (a).rnode.node.relNode = InvalidOid, \
+ (a).rnode.backend = InvalidBackendId, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
@@ -113,7 +114,7 @@ typedef struct buftag
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileNodeBackendEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 509f4b7..3315fa0 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -205,7 +205,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether this is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether this is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether this is a local temporary relation, which exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index f23fe8d..13da565 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -106,7 +106,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index ca200eb..f9b17e0 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -171,6 +171,7 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
On Wed, 31 Jul 2019 at 23:05, Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
wrote:
The current Postgres implementation of temporary tables causes a number of
problems:
1. Catalog bloat: if a client creates and deletes many temporary
tables, autovacuum gets stuck on the catalog.
This also upsets logical decoding a little - AFAICS it still has to treat
transactions that use temporary tables as catalog-modifying transactions,
tracking them in its historic catalog snapshots and doing extra cache
flushes etc. when decoding them.
This will become even more important as we work to support eager/optimistic
output plugin processing of in-progress transactions. We'd have to switch
snapshots more, and that can get quite expensive so using temp tables could
really hurt performance. Or we'd have to serialize on catalog-changing
transactions, in which case using temp tables would negate the benefits of
optimistic streaming of in-progress transactions.
3. Temporary tables cannot be used on a replica.
For physical replicas, yes.
A hot standby
configuration is frequently used to run OLAP queries on the replica,
and the results of such queries would naturally be saved in temporary tables.
Right now this is not possible (except for "hackers" solutions such as
storing the results in file_fdw).
Right. Because we cannot modify pg_class, pg_attribute, etc., even though we
could reasonably enough write to local-only relfilenodes on a replica if we
didn't have to change WAL-logged catalog tables.
I've seen some hacks suggested around this where we have an unlogged fork
of each of the needed catalog tables, allowing replicas to write temp table
info to them. We'd scan both the logged and unlogged forks when doing
relcache management etc. But there are plenty of ugly issues with this.
We'd have to reserve oid ranges for them which is ugly; to make it BC
friendly those reservations would probably have to take the form of some
kind of placeholder entry in the real pg_class. And it gets ickier from
there. It hardly seems worth it when we should probably just implement
global temp tables instead.
5. Inefficient memory usage and possible memory overflow: each backend
maintains its own local buffers for working with temporary tables.
Is there any reason that would change with global temp tables? We'd still
be creating a backend-local relfilenode for each backend that actually
writes to the temp table, and I don't see how it'd be useful or practical
to keep those in shared_buffers.
Using local buffers has big advantages too. It saves shared_buffers space
for data where there's actually some possibility of getting cache hits, or
for where we can benefit from lazy/async writeback and write combining. I
wouldn't want to keep temp data there if I had the option.
If you're concerned about the memory use of backend local temp buffers, or
about how we account for and limit those, that's worth looking into. But I
don't think it'd be something that should be affected by global-temp vs
backend-local-temp tables.
The default size of temporary buffers is 8MB. It seems too small for
modern servers having hundreds of gigabytes of RAM, causing extra
copying of data between the OS cache and local buffers. But if there are
thousands of backends, each executing queries with temporary tables,
then the total amount of memory used for temporary buffers can exceed
several tens of gigabytes.
Right. But what solution do you propose for this? Putting that in
shared_buffers will do nothing except deprive shared_buffers of space that
can be used for other more useful things. A server-wide temp buffer would
add IPC and locking overheads and AFAICS little benefit. One of the big
appeals of temp tables is that we don't need any of that.
If you want to improve server-wide temp buffer memory accounting and
management that makes sense. I can see it being useful to have things like
a server-wide DSM/DSA pool of temp buffers that backends borrow from and
return to based on memory pressure on a LRU-ish basis, maybe. But I can
also see how that'd be complex and hard to get right. It'd also be prone to
priority inversion problems where an idle/inactive backend must be woken up
to release memory or release locks, depriving an actively executing backend
of runtime. And it'd be as likely to create inefficiencies with copying and
eviction as solve them since backends could easily land up taking turns
kicking each other out of memory and re-reading their own data.
I don't think this is something that should be tackled as part of work on
global temp tables personally.
6. A connection pooler cannot reschedule a session that has created
temporary tables to some other backend, because their data is stored in local
buffers.
Yeah, if you're using transaction-associative pooling. That's just part of
a more general problem though, there are piles of related issues with temp
tables, session GUCs, session advisory locks and more.
I don't see how global temp tables will do you the slightest bit of good
here as the data in them will still be backend-local. If it isn't then you
should just be using unlogged tables.
The definition of such a table (its metadata) is shared by all backends, but
its data is private to each backend. After session termination the data is
obviously lost.
+1 that's what a global temp table should be, and it's IIRC pretty much how
the SQL standard specifies temp tables.
I suspect I'm overlooking some complexities here, because to me it seems
like we could implement these fairly simply. A new relkind would identify
it as a global temp table and the relfilenode would be 0. Same for indexes
on temp tables. We'd extend the relfilenode mapper to support a
backend-local non-persistent relfilenode map that's used to track temp
table and index relfilenodes. If no relfilenode is defined for the table,
the mapper would allocate one. We already happily create missing
relfilenodes on write so we don't even have to pre-create the actual file.
We'd register the relfilenode as a tempfile and use existing tempfile
cleanup mechanisms, and we'd use the temp tablespace to store it.
I must be missing something important because it doesn't seem hard.
Global temporary tables are accessed through shared buffers (to solve
problem 2).
I'm far from convinced of the wisdom or necessity of that, but I haven't
spent as much time digging into this problem as you have.
The drawback of such an approach is that it would be necessary to
reimplement a large bulk of the heapam code.
But it would allow eliminating visibility checks for temporary table
tuples and decreasing the size of the tuple header.
That sounds potentially cool, but perhaps a "next step" thing? Allow the
creation of global temp tables to specify reloptions, and you can add it as
a reloption later. You can't actually eliminate visibility checks anyway
because they're still MVCC heaps. Savepoints can create invisible tuples
even if you're using temp tables that are cleared on commit, and of course
so can DELETEs or UPDATEs. So I'm not sure how much use it'd really be in
practice.
--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
On 01.08.2019 6:10, Craig Ringer wrote:
3. Temporary tables cannot be used on a replica.
For physical replicas, yes.
Yes, logical replicas (for example our PgPro-EE multimaster, which is based
on logical replication) definitely do not suffer from this problem.
But in the multimaster case we have another problem related to temporary
tables: we have to use 2PC for each transaction, and using temporary tables
in prepared transactions is currently prohibited.
This was the motivation for the patch proposed by Stas Kelvich which
allows the use of temporary tables in prepared transactions under some
conditions.
5. Inefficient memory usage and possible memory overflow: each backend
maintains its own local buffers for working with temporary tables.
Is there any reason that would change with global temp tables? We'd
still be creating a backend-local relfilenode for each backend that
actually writes to the temp table, and I don't see how it'd be useful
or practical to keep those in shared_buffers.
Yes, my implementation of global temp tables uses shared buffers.
This is not strictly needed, since the data is local: it is possible to
have shared metadata and private data accessed through local buffers.
But I have done it for three reasons:
1. Make it possible to use parallel plans for temp tables.
2. Eliminate the memory overflow problem.
3. Make it possible to reschedule a session to another backend (connection
pooling).
Using local buffers has big advantages too. It saves shared_buffers
space for data where there's actually some possibility of getting
cache hits, or for where we can benefit from lazy/async writeback and
write combining. I wouldn't want to keep temp data there if I had the
option.
Local buffers definitely have some advantages:
- they do not require synchronization
- they avoid flushing data from shared buffers
But global temp tables do not exclude the use of the original (local) temp
tables.
So you will have a choice: either use local temp tables, which can
easily be created on demand and are accessed through local buffers,
or create global temp tables, which eliminate catalog bloat,
allow parallel queries, and whose data is governed by the same cache
replacement discipline as normal tables...
The default size of temporary buffers is 8MB. It seems too small
for
modern servers having hundreds of gigabytes of RAM, causing extra
copying of data between the OS cache and local buffers. But if there are
thousands of backends, each executing queries with temporary tables,
then the total amount of memory used for temporary buffers can exceed
several tens of gigabytes.
Right. But what solution do you propose for this? Putting that in
shared_buffers will do nothing except deprive shared_buffers of space
that can be used for other more useful things. A server-wide temp
buffer would add IPC and locking overheads and AFAICS little benefit.
One of the big appeals of temp tables is that we don't need any of that.
I do not think that parallel execution and efficient connection pooling
are "little benefit".
If you want to improve server-wide temp buffer memory accounting and
management that makes sense. I can see it being useful to have things
like a server-wide DSM/DSA pool of temp buffers that backends borrow
from and return to based on memory pressure on a LRU-ish basis, maybe.
But I can also see how that'd be complex and hard to get right. It'd
also be prone to priority inversion problems where an idle/inactive
backend must be woken up to release memory or release locks, depriving
an actively executing backend of runtime. And it'd be as likely to
create inefficiencies with copying and eviction as solve them since
backends could easily land up taking turns kicking each other out of
memory and re-reading their own data.
I don't think this is something that should be tackled as part of work
on global temp tables personally.
My assumptions are the following: temporary tables are mostly used in
OLAP queries, and an OLAP workload means there are few concurrent
queries working with large datasets.
So the size of the produced temporary tables can be quite big. For OLAP it
seems very important to be able to use parallel query execution and to
apply the same cache eviction rule to both persistent and temp tables
(otherwise you get either swapping or extra copying of data between the
OS and Postgres caches).
6. A connection pooler cannot reschedule a session that has created
temporary tables to some other backend, because their data is stored
in local buffers.
Yeah, if you're using transaction-associative pooling. That's just
part of a more general problem though, there are piles of related
issues with temp tables, session GUCs, session advisory locks and more.
I don't see how global temp tables will do you the slightest bit of
good here as the data in them will still be backend-local. If it isn't
then you should just be using unlogged tables.
You cannot use the same unlogged table to save intermediate query
results in two parallel sessions.
The definition of such a table (its metadata) is shared by all backends, but
its data is private to each backend. After session termination the data is
obviously lost.
+1 that's what a global temp table should be, and it's IIRC pretty
much how the SQL standard specifies temp tables.
I suspect I'm overlooking some complexities here, because to me it
seems like we could implement these fairly simply. A new relkind would
identify it as a global temp table and the relfilenode would be 0.
Same for indexes on temp tables. We'd extend the relfilenode mapper to
support a backend-local non-persistent relfilenode map that's used to
track temp table and index relfilenodes. If no relfilenode is defined
for the table, the mapper would allocate one. We already happily
create missing relfilenodes on write so we don't even have to
pre-create the actual file. We'd register the relfilenode as a
tempfile and use existing tempfile cleanup mechanisms, and we'd use
the temp tablespace to store it.
I must be missing something important because it doesn't seem hard.
As I already wrote, I tried to kill two birds with one stone: eliminate
catalog bloating and allow access to temp tables from multiple backends
(to be able to perform parallel queries and connection pooling).
This is why I had to use shared buffers for global temp tables.
Maybe it was not such a good idea, but one of my primary intentions in
publishing this patch was to learn the opinions of other people.
In PG-Pro some of my colleagues think that the most critical problem is
the inability to use temporary tables at a replica.
Others think that it is not a problem at all if you are using logical
replication.
From my point of view the most critical problem is the inability to use
parallel plans for temporary tables.
But it looks like you don't think so.
I see three different activities related to temporary tables:
1. Shared metadata
2. Shared buffers
3. Alternative concurrency control & reducing tuple header size
(specialized table access method for temporary tables)
In my proposal I combined 1 and 2, leaving 3 for the next step.
I will be interested to hear other suggestions.
One more thing - 1 and 2 are really independent: you can share metadata
without sharing buffers.
But introducing yet another kind of temporary table seems to be real
overkill:
- local temp tables (private namespace and local buffers)
- tables with shared metadata but local buffers
- tables with shared metadata and buffers
The drawback of such approach is that it will be necessary to
reimplement large bulk of heapam code.
But this approach allows to eliminate visibility check for temporary
table tuples and decrease size of tuple header.

That sounds potentially cool, but perhaps a "next step" thing? Allow
the creation of global temp tables to specify reloptions, and you can
add it as a reloption later. You can't actually eliminate visibility
checks anyway because they're still MVCC heaps.
Sorry? I mean the elimination of MVCC overhead (visibility checks) for
temp tables only.
I am not sure that we can really fully eliminate it if we support use of
temp tables in prepared transactions and autonomous transactions (yet
another awful feature we have in PgPro-EE).
Also, it looks like we need some analogue of CID to be able to
correctly execute queries like "insert into T (select from T ...)"
where T is a global temp table.
I didn't think much about it yet, but I am really considering a new
table access method API for reducing per-tuple storage overhead for
temporary and append-only tables.
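To make the CID point concrete, here is a rough sketch of the minimal
per-tuple bookkeeping a reduced header would still need (invented names
and layout, not the actual patch): without such a command counter, the
scan in "insert into T (select from T ...)" would see the rows the same
command has just inserted.

#include <stdbool.h>
#include <stdint.h>

typedef uint32_t LocalCommandId;

/*
 * Hypothetical reduced tuple header for a session table: xmin/xmax are
 * gone, but a command counter survives so that a command never sees the
 * tuples it has written itself.
 */
typedef struct SessionTupleHeader
{
    LocalCommandId cid;   /* command that wrote this tuple */
    uint16_t       len;   /* tuple length */
} SessionTupleHeader;

static LocalCommandId current_command = 1;  /* bumped once per statement */

static bool
session_tuple_visible(const SessionTupleHeader *tup)
{
    /* Visible only if written by an earlier command of this session. */
    return tup->cid < current_command;
}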
Savepoints can create invisible tuples even if you're using temp
tables that are cleared on commit, and of course so can DELETEs or
UPDATEs. So I'm not sure how much use it'd really be in practice.
Yeah, subtransactions can also be a problem for eliminating xmin/xmax
for temp tables. Thanks for noticing it.
I noticed that I had not patched some extensions; a fixed and rebased
version of the patch is attached.
Also you can find this version in our github repository:
https://github.com/postgrespro/postgresql.builtin_pool.git
branch global_temp.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
session_tables-1.patch (text/x-patch)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..2d93f6f 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenode = bufHdr->tag.rnode.node.relNode;
+ fctx->record[i].reltablespace = bufHdr->tag.rnode.node.spcNode;
+ fctx->record[i].reldatabase = bufHdr->tag.rnode.node.dbNode;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 38ae240..8a04954 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -608,9 +608,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.rnode.node.dbNode;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.node.spcNode;
+ block_info_array[num_blocks].filenode = bufHdr->tag.rnode.node.relNode;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index c945b28..14d4e48 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &rnode, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
}
}
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 09bc6fe..c60effd 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -671,6 +671,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 5962126..60f4696 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -763,7 +763,11 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* Session temporary relation may not yet be initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && PageIsNew(BufferGetPage(buf)) && IsSessionRelationBackendId(rel->rd_backend))
+ _bt_initmetapage(BufferGetPage(buf), P_NONE, 0);
+ else
+ _bt_checkpage(rel, buf);
}
else
{
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 3ec67d4..edec8ca 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -213,6 +213,7 @@ void
XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
{
registered_buffer *regbuf;
+ RelFileNodeBackend rnode;
/* NO_IMAGE doesn't make sense with FORCE_IMAGE */
Assert(!((flags & REGBUF_FORCE_IMAGE) && (flags & (REGBUF_NO_IMAGE))));
@@ -227,7 +228,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &rnode, &regbuf->forkno, &regbuf->block);
+ regbuf->rnode = rnode.node;
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -919,7 +921,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blkno;
@@ -948,7 +950,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
flags |= REGBUF_STANDARD;
BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ XLogRegisterBlock(0, &rnode.node, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1009,7 +1011,7 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkNum;
BlockNumber blkno;
@@ -1018,7 +1020,7 @@ log_newpage_buffer(Buffer buffer, bool page_std)
BufferGetTag(buffer, &rnode, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rnode.node, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index a065419..8814afb 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -409,6 +409,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 99ae159..24b2438 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3612,7 +3612,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 3cc886f..a111ddc 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cedb4ee..d11c5b3 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1400,7 +1400,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index fb2be10..372c9a5 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -7678,6 +7678,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14082,6 +14088,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c97bb36..f9b2000 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3265,20 +3265,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index e8ffa04..2004d2f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3483,6 +3483,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
{
ReorderBufferTupleCidKey key;
ReorderBufferTupleCidEnt *ent;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blockno;
bool updated_mapping = false;
@@ -3496,7 +3497,8 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
* get relfilenode from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &rnode, &forkno, &blockno);
+ key.relnode = rnode.node;
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 6f3a402..76ce953 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -556,7 +556,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
int buf_id;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -710,7 +710,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
Block bufBlock;
bool found;
bool isExtend;
- bool isLocalBuf = SmgrIsTemp(smgr);
+ bool isLocalBuf = SmgrIsTemp(smgr) && relpersistence == RELPERSISTENCE_TEMP;
*hit = false;
@@ -1010,7 +1010,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1532,7 +1532,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1543,7 +1544,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -1845,8 +1847,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rnode.node.spcNode;
+ item->relNode = bufHdr->tag.rnode.node.relNode;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2559,7 +2561,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rnode.node, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2631,7 +2633,7 @@ BufferGetBlockNumber(Buffer buffer)
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2696,7 +2698,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rnode.node, buf->tag.rnode.backend);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -2930,7 +2932,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
int i;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileNodeBackendIsLocalTemp(rnode))
{
if (rnode.backend == MyBackendId)
DropRelFileNodeLocalBuffers(rnode.node, forkNum, firstDelBlock);
@@ -2958,11 +2960,11 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
* We could check forkNum and blockNum as well as the rnode, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode))
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -2985,24 +2987,24 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
{
int i,
n = 0;
- RelFileNode *nodes;
+ RelFileNodeBackend *nodes;
bool use_bsearch;
if (nnodes == 0)
return;
- nodes = palloc(sizeof(RelFileNode) * nnodes); /* non-local relations */
+ nodes = palloc(sizeof(RelFileNodeBackend) * nnodes); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
for (i = 0; i < nnodes; i++)
{
- if (RelFileNodeBackendIsTemp(rnodes[i]))
+ if (RelFileNodeBackendIsLocalTemp(rnodes[i]))
{
if (rnodes[i].backend == MyBackendId)
DropRelFileNodeAllLocalBuffers(rnodes[i].node);
}
else
- nodes[n++] = rnodes[i].node;
+ nodes[n++] = rnodes[i];
}
/*
@@ -3025,11 +3027,11 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
/* sort the list of rnodes if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(nodes, n, sizeof(RelFileNodeBackend), rnode_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileNodeBackend *rnode = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
@@ -3044,7 +3046,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, nodes[j]))
{
rnode = &nodes[j];
break;
@@ -3054,7 +3056,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
else
{
rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
+ nodes, n, sizeof(RelFileNodeBackend),
rnode_comparator);
}
@@ -3063,7 +3065,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, (*rnode)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3102,11 +3104,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rnode.node.dbNode == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3136,7 +3138,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpath(buf->tag.rnode, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3204,7 +3206,8 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3251,13 +3254,15 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node)
+ || bufHdr->tag.rnode.backend != rel->rd_backend)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3305,13 +3310,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rnode.node.dbNode == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4051,7 +4056,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpath(buf->tag.rnode, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4075,7 +4080,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpath(bufHdr->tag.rnode, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4093,7 +4098,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4108,22 +4113,27 @@ local_buffer_write_error_callback(void *arg)
static int
rnode_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileNodeBackend n1 = *(const RelFileNodeBackend *) p1;
+ RelFileNodeBackend n2 = *(const RelFileNodeBackend *) p2;
- if (n1.relNode < n2.relNode)
+ if (n1.node.relNode < n2.node.relNode)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.node.relNode > n2.node.relNode)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.node.dbNode < n2.node.dbNode)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.node.dbNode > n2.node.dbNode)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.node.spcNode < n2.node.spcNode)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.node.spcNode > n2.node.spcNode)
+ return 1;
+
+ if (n1.backend < n2.backend)
+ return -1;
+ else if (n1.backend > n2.backend)
return 1;
else
return 0;
@@ -4359,7 +4369,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileNodeBackendEquals(cur->tag.rnode, next->tag.rnode) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4378,7 +4388,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rnode.node, tag.rnode.backend);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index f5f6a29..6bd5ecb 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ LocalPrefetchBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -111,7 +111,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -209,7 +209,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rnode.node, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -331,14 +331,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -377,12 +377,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index cf7f03f..65eb422 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
BufferGetTag(buf, &rnode, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 07f3c93..204c4cb 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -33,6 +33,7 @@
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
@@ -87,6 +88,18 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to collect information about session relations created by this backend.
+ * Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +165,48 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Remove relation pages from shared buffers */
+ DropRelFileNodesAllBuffers(&rel->rnode, 1);
+
+ /* Delete relation files */
+ mdunlink(rel->rnode, InvalidForkNumber, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or opened when its file doesn't exist.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We are using a shared memory hook, because
+ * cleanup should be performed before the backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +273,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln);
pfree(path);
@@ -465,6 +522,19 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * In case of session relation access, there may not yet be files of this relation for this backend.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +546,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +723,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* In case of a session relation we need to write the zero page so that subsequent mdnblocks returns the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -738,12 +814,18 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
+ /*
+ * If we access a session relation, there may be no files of this relation for this backend yet.
+ * Pass EXTENSION_RETURN_NULL to make mdopen return NULL in this case instead of reporting an error.
+ */
+ MdfdVec *v = mdopenfork(reln, forknum, RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode)
+ ? EXTENSION_RETURN_NULL : EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2488607..86e8fca 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1098,6 +1098,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3301,6 +3305,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0cc9ede..1dff0c8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15593,8 +15593,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract SessionRelFirstBackendId
+ * to avoid too long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary relations
+ * to make it possible to clean them up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..f226e7c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add some special bias value to backend id (since
+ * the maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,10 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index df2dda7..7adb96b 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,16 +90,17 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileNodeBackend rnode; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rnode.node.spcNode = InvalidOid, \
+ (a).rnode.node.dbNode = InvalidOid, \
+ (a).rnode.node.relNode = InvalidOid, \
+ (a).rnode.backend = InvalidBackendId, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
@@ -113,7 +114,7 @@ typedef struct buftag
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileNodeBackendEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 509f4b7..3315fa0 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -205,7 +205,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether it is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether it is a global temporary relation whose metadata is shared by all sessions,
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether it is a local temporary relation that exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fc0f141..3a6c78d 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 68ac56a..b7fcc99 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,7 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
A new version of the patch with several fixes is attached.
Many thanks to Roman Zharkov for testing.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
session_tables-2.patch (text/x-patch)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..2d93f6f 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenode = bufHdr->tag.rnode.node.relNode;
+ fctx->record[i].reltablespace = bufHdr->tag.rnode.node.spcNode;
+ fctx->record[i].reldatabase = bufHdr->tag.rnode.node.dbNode;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 38ae240..8a04954 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -608,9 +608,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.rnode.node.dbNode;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.node.spcNode;
+ block_info_array[num_blocks].filenode = bufHdr->tag.rnode.node.relNode;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index c945b28..14d4e48 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &rnode, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
}
}
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 9726020..389466e 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -1028,7 +1028,8 @@ gistGetFakeLSN(Relation rel)
{
static XLogRecPtr counter = FirstNormalUnloggedLSN;
- if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/*
* Temporary relations are only accessible in our session, so a simple
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 09bc6fe..c60effd 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -671,6 +671,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 5962126..60f4696 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -763,7 +763,11 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* Session temporary relation may not yet be initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && PageIsNew(BufferGetPage(buf)) && IsSessionRelationBackendId(rel->rd_backend))
+ _bt_initmetapage(BufferGetPage(buf), P_NONE, 0);
+ else
+ _bt_checkpage(rel, buf);
}
else
{
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 3ec67d4..edec8ca 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -213,6 +213,7 @@ void
XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
{
registered_buffer *regbuf;
+ RelFileNodeBackend rnode;
/* NO_IMAGE doesn't make sense with FORCE_IMAGE */
Assert(!((flags & REGBUF_FORCE_IMAGE) && (flags & (REGBUF_NO_IMAGE))));
@@ -227,7 +228,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &rnode, &regbuf->forkno, &regbuf->block);
+ regbuf->rnode = rnode.node;
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -919,7 +921,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blkno;
@@ -948,7 +950,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
flags |= REGBUF_STANDARD;
BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ XLogRegisterBlock(0, &rnode.node, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1009,7 +1011,7 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkNum;
BlockNumber blkno;
@@ -1018,7 +1020,7 @@ log_newpage_buffer(Buffer buffer, bool page_std)
BufferGetTag(buffer, &rnode, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rnode.node, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index a065419..8814afb 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -409,6 +409,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 99ae159..24b2438 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3612,7 +3612,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 3cc886f..a111ddc 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cedb4ee..d11c5b3 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1400,7 +1400,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 0960b33..3604e4af 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,16 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ HeapTuple tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index fb2be10..38a25cf 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -586,7 +586,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && stmt->relation->relpersistence != RELPERSISTENCE_TEMP
+ && stmt->relation->relpersistence != RELPERSISTENCE_SESSION)
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -7678,6 +7679,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14082,6 +14089,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -14569,14 +14583,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c97bb36..f9b2000 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3265,20 +3265,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index a9b2f8b..2f261b9 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why not always use the parent table's persistence? Because
+ * unlogged sequences are prohibited, while unlogged tables
+ * with SERIAL columns are accepted.
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313..ae8b7fd 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2069,7 +2069,8 @@ do_autovacuum(void)
* Check if it is a temp table (presumably, of some other backend's).
* We cannot safely process other backends' temp tables.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (classForm->relpersistence == RELPERSISTENCE_TEMP ||
+ classForm->relpersistence == RELPERSISTENCE_SESSION)
{
/*
* We just ignore it if the owning backend is still active and
@@ -2154,7 +2155,8 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (classForm->relpersistence == RELPERSISTENCE_TEMP ||
+ classForm->relpersistence == RELPERSISTENCE_SESSION)
continue;
relid = classForm->oid;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index e8ffa04..2004d2f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3483,6 +3483,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
{
ReorderBufferTupleCidKey key;
ReorderBufferTupleCidEnt *ent;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blockno;
bool updated_mapping = false;
@@ -3496,7 +3497,8 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
* get relfilenode from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &rnode, &forkno, &blockno);
+ key.relnode = rnode.node;
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 6f3a402..76ce953 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -556,7 +556,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
int buf_id;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -710,7 +710,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
Block bufBlock;
bool found;
bool isExtend;
- bool isLocalBuf = SmgrIsTemp(smgr);
+ bool isLocalBuf = SmgrIsTemp(smgr) && relpersistence == RELPERSISTENCE_TEMP;
*hit = false;
@@ -1010,7 +1010,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1532,7 +1532,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1543,7 +1544,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -1845,8 +1847,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rnode.node.spcNode;
+ item->relNode = bufHdr->tag.rnode.node.relNode;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2559,7 +2561,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rnode.node, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2631,7 +2633,7 @@ BufferGetBlockNumber(Buffer buffer)
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2696,7 +2698,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rnode.node, buf->tag.rnode.backend);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -2930,7 +2932,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
int i;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileNodeBackendIsLocalTemp(rnode))
{
if (rnode.backend == MyBackendId)
DropRelFileNodeLocalBuffers(rnode.node, forkNum, firstDelBlock);
@@ -2958,11 +2960,11 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
* We could check forkNum and blockNum as well as the rnode, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode))
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -2985,24 +2987,24 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
{
int i,
n = 0;
- RelFileNode *nodes;
+ RelFileNodeBackend *nodes;
bool use_bsearch;
if (nnodes == 0)
return;
- nodes = palloc(sizeof(RelFileNode) * nnodes); /* non-local relations */
+ nodes = palloc(sizeof(RelFileNodeBackend) * nnodes); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
for (i = 0; i < nnodes; i++)
{
- if (RelFileNodeBackendIsTemp(rnodes[i]))
+ if (RelFileNodeBackendIsLocalTemp(rnodes[i]))
{
if (rnodes[i].backend == MyBackendId)
DropRelFileNodeAllLocalBuffers(rnodes[i].node);
}
else
- nodes[n++] = rnodes[i].node;
+ nodes[n++] = rnodes[i];
}
/*
@@ -3025,11 +3027,11 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
/* sort the list of rnodes if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(nodes, n, sizeof(RelFileNodeBackend), rnode_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileNodeBackend *rnode = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
@@ -3044,7 +3046,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, nodes[j]))
{
rnode = &nodes[j];
break;
@@ -3054,7 +3056,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
else
{
rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
+ nodes, n, sizeof(RelFileNodeBackend),
rnode_comparator);
}
@@ -3063,7 +3065,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, (*rnode)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3102,11 +3104,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rnode.node.dbNode == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3136,7 +3138,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpath(buf->tag.rnode, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3204,7 +3206,8 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3251,13 +3254,15 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node)
+ || bufHdr->tag.rnode.backend != rel->rd_backend)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3305,13 +3310,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rnode.node.dbNode == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4051,7 +4056,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpath(buf->tag.rnode, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4075,7 +4080,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpath(bufHdr->tag.rnode, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4093,7 +4098,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4108,22 +4113,27 @@ local_buffer_write_error_callback(void *arg)
static int
rnode_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileNodeBackend n1 = *(const RelFileNodeBackend *) p1;
+ RelFileNodeBackend n2 = *(const RelFileNodeBackend *) p2;
- if (n1.relNode < n2.relNode)
+ if (n1.node.relNode < n2.node.relNode)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.node.relNode > n2.node.relNode)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.node.dbNode < n2.node.dbNode)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.node.dbNode > n2.node.dbNode)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.node.spcNode < n2.node.spcNode)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.node.spcNode > n2.node.spcNode)
+ return 1;
+
+ if (n1.backend < n2.backend)
+ return -1;
+ else if (n1.backend > n2.backend)
return 1;
else
return 0;
@@ -4359,7 +4369,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileNodeBackendEquals(cur->tag.rnode, next->tag.rnode) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4378,7 +4388,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rnode.node, tag.rnode.backend);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index f5f6a29..6bd5ecb 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ LocalPrefetchBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -111,7 +111,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -209,7 +209,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rnode.node, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -331,14 +331,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -377,12 +377,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index cf7f03f..65eb422 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
BufferGetTag(buf, &rnode, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 07f3c93..204c4cb 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -33,6 +33,7 @@
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
@@ -87,6 +88,18 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to track session relations created by this backend.
+ * Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +165,48 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Remove relation pages from shared buffers */
+ DropRelFileNodesAllBuffers(&rel->rnode, 1);
+
+ /* Delete relation files */
+ mdunlink(rel->rnode, InvalidForkNumber, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or when it is opened and its file does not yet exist.
+ * This guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We use a shared-memory exit hook because
+ * cleanup must be performed before the backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +273,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln);
pfree(path);
@@ -465,6 +522,19 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * When accessing a session relation, this backend may not yet have any files for it.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +546,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +723,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* For a session relation, write the zeroed page so that subsequent mdnblocks calls return the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -738,12 +814,18 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
+ /*
+ * When accessing a session relation, this backend may not yet have any files for it.
+ * Pass EXTENSION_RETURN_NULL to make mdopenfork return NULL in this case instead of reporting an error.
+ */
+ MdfdVec *v = mdopenfork(reln, forknum, RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode)
+ ? EXTENSION_RETURN_NULL : EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2488607..86e8fca 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1098,6 +1098,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3301,6 +3305,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0cc9ede..1dff0c8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15593,8 +15593,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * the SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract
+ * SessionRelFirstBackendId to avoid overly long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary
+ * relations, so they can be cleaned up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..f226e7c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add a special bias value to the backend id (since
+ * the maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,10 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index df2dda7..7adb96b 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,16 +90,17 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileNodeBackend rnode; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rnode.node.spcNode = InvalidOid, \
+ (a).rnode.node.dbNode = InvalidOid, \
+ (a).rnode.node.relNode = InvalidOid, \
+ (a).rnode.backend = InvalidBackendId, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
@@ -113,7 +114,7 @@ typedef struct buftag
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileNodeBackendEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 509f4b7..3315fa0 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -205,7 +205,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether this is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether this is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether it is local temporary relation which exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..ae1adb6
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,247 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fc0f141..507cf7d 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 68ac56a..3890777 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..3058b9b
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,151 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
On Tue, 6 Aug 2019 at 16:32, Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
wrote:
New version of the patch with several fixes is attached.
Many thanks to Roman Zharkov for testing.
FWIW I still don't understand your argument with regards to using
shared_buffers for temp tables having connection pooling benefits. Are you
assuming the presence of some other extension in your extended version of
PostgreSQL ? In community PostgreSQL a temp table's contents in one backend
will not be visible in another backend. So if your connection pooler in
transaction pooling mode runs txn 1 on backend 42 and it populates temp
table X, then the pooler runs the same app session's txn 2 on backend 45,
the contents of temp table X are not visible anymore.
Can you explain? Because AFAICS so long as temp table contents are
backend-private there's absolutely no point ever using shared buffers for
their contents.
Perhaps you mean that in a connection pooling case, each backend may land
up filling up temp buffers with contents from *multiple different temp
tables*? If so, sure, I get that, but the answer there seems to be to
improve eviction and memory accounting, not make backends waste precious
shared_buffers space on non-shareable data.
Anyhow, I strongly suggest you simplify the feature to add the basic global
temp table feature so the need to change pg_class, pg_attribute etc to use
temp tables is removed, but separate changes to temp table memory handling
etc into a follow-up patch. That'll make it smaller and easier to review
and merge too. The two changes are IMO logically quite separate anyway.
Come to think of it, I think connection poolers might benefit from an
extension to the DISCARD command, say "DISCARD TEMP_BUFFERS", which evicts
temp table buffers from memory *without* dropping the temp tables. If
they're currently in-memory tuplestores they'd be written out and evicted.
That way a connection pooler could "clean" the backend, at the cost of some
probably pretty cheap buffered writes to the system buffer cache. The
kernel might not even bother to write out the buffercache and it won't be
forced to do so by fsync, checkpoints, etc, nor will the writes go via WAL
so such evictions could be pretty cheap - and if not under lots of memory
pressure the backend could often read the temp table back in from system
buffer cache without disk I/O.
That's my suggestion for how to solve your pooler problem, assuming I've
understood it correctly.
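To make this concrete, a pooler's server-reset script could then be as
simple as the following sketch (hypothetical, since DISCARD TEMP_BUFFERS
doesn't exist yet):

    -- run between pooled client sessions; DISCARD TEMP_BUFFERS is proposed syntax
    DISCARD TEMP_BUFFERS;
    -- existing command, shown for context: drop cached plans from the previous session
    DISCARD PLANS;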
Along these lines I suggest adding the following to DISCARD at some point,
obviously not as part of your patch:
* DISCARD TEMP_BUFFERS
* DISCARD SHARED_BUFFERS
* DISCARD TEMP_FILES
* DISCARD CATALOG_CACHE
* DISCARD HOLD_CURSORS
* DISCARD ADVISORY_LOCKS
where obviously DISCARD SHARED_BUFFERS would be superuser-only and evict
only clean buffers.
(Also, if we extend DISCARD, let's also allow it to be written as DISCARD (LIST,
OF, THINGS, TO, DISCARD) so that we can make the syntax extensible for
plugins in the future.)
Thoughts?
Would DISCARD TEMP_BUFFERS meet your needs?
On 08.08.2019 5:40, Craig Ringer wrote:
On Tue, 6 Aug 2019 at 16:32, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
New version of the patch with several fixes is attached.
Many thanks to Roman Zharkov for testing.
FWIW I still don't understand your argument with regards to using
shared_buffers for temp tables having connection pooling benefits. Are
you assuming the presence of some other extension in your extended
version of PostgreSQL? In community PostgreSQL a temp table's
contents in one backend will not be visible in another backend. So if
your connection pooler in transaction pooling mode runs txn 1 on
backend 42 and it populates temp table X, then the pooler runs the
same app session's txn 2 on backend 45, the contents of temp table X
are not visible anymore.
Certainly, here I mean the built-in connection pooler, which is not
currently present in Postgres; it is part of PgPRO-EE, and there is my
patch for vanilla at the commitfest:
https://commitfest.postgresql.org/24/2067
Can you explain? Because AFAICS so long as temp table contents are
backend-private there's absolutely no point ever using shared buffers
for their contents.
Sure, there is no such problem with temporary tables now.
There is another problem: you cannot use temporary tables with any
existing connection pooler (pgbouncer, ...) at a pooling level other than
session, unless the temporary table is used inside a single transaction.
One of the advantages of a built-in connection pooler is that it can
provide session semantics (GUCs, prepared statements, temporary
tables, ...) with a limited number of backends (smaller than the number
of sessions).
In PgPRO-EE this problem was solved by binding sessions to backends:
one backend can manage multiple sessions, but a session cannot migrate
to another backend. The drawback of such a solution is obvious: one long-living
transaction can block the transactions of all other sessions scheduled
to that backend.
The possibility to migrate a session to another backend is one of the obvious
solutions to the problem, but the main showstopper for it is temporary
tables. This is why I consider moving temporary tables to shared buffers
a very important step.
In the vanilla version of the built-in connection pooler the situation is
slightly different.
Right now, if a client uses temporary tables without an "ON COMMIT DROP"
clause, the backend is marked as "tainted" and is pinned to this session.
So it is actually excluded from the connection pool and serves only this
session. Once again: if I were able to access temporary table data
from another backend, there would be no need to mark the backend as tainted
in this case.
Certainly it also requires shared metadata. And here we come to the
concept of global temp tables (if we forget for a moment that global
temp tables were "invented" long ago by Oracle and many other DBMSes :)
Perhaps you mean that in a connection pooling case, each backend may
land up filling up temp buffers with contents from *multiple different
temp tables*? If so, sure, I get that, but the answer there seems to
be to improve eviction and memory accounting, not make backends waste
precious shared_buffers space on non-shareable data.
Anyhow, I strongly suggest you simplify the feature to add the basic
global temp table feature so the need to change pg_class, pg_attribute
etc to use temp tables is removed, but separate changes to temp table
memory handling etc into a follow-up patch. That'll make it smaller
and easier to review and merge too. The two changes are IMO logically
quite separate anyway.
I agree that they are separate.
But even if we forget about the built-in connection pooler, don't you think
that the possibility to use parallel query plans for temporary tables is
by itself strong enough motivation to access global temp tables through
shared buffers (while still supporting a private page pool for local temp
tables)? So both approaches (shared vs. private buffers) have their pros and
cons. This is why it seems reasonable to support both of them
and let the user make the choice most suitable for the concrete application.
Certainly it is possible to provide both "global shared temp tables" and
"global private temp tables", but IMHO that is overkill.
Come to think of it, I think connection poolers might benefit from an
extension to the DISCARD command, say "DISCARD TEMP_BUFFERS", which
evicts temp table buffers from memory *without* dropping the temp
tables. If they're currently in-memory tuplestores they'd be written
out and evicted. That way a connection pooler could "clean" the
backend, at the cost of some probably pretty cheap buffered writes to
the system buffer cache. The kernel might not even bother to write out
the buffercache and it won't be forced to do so by fsync, checkpoints,
etc, nor will the writes go via WAL so such evictions could be pretty
cheap - and if not under lots of memory pressure the backend could
often read the temp table back in from system buffer cache without
disk I/O.
Yes, this is one of the possible solutions for session migration. But
frankly speaking, flushing local buffers on each session reschedule does
not seem like a good solution, even if the OS file cache is large enough and
the flushed buffers are still present in memory (they will nevertheless be
written to disk in this case, even though the temp table data is not
intended to be persisted).
That's my suggestion for how to solve your pooler problem, assuming
I've understood it correctly.
Along these lines I suggest adding the following to DISCARD at some
point, obviously not as part of your patch:
* DISCARD TEMP_BUFFERS
* DISCARD SHARED_BUFFERS
* DISCARD TEMP_FILES
* DISCARD CATALOG_CACHE
* DISCARD HOLD_CURSORS
* DISCARD ADVISORY_LOCKS
where obviously DISCARD SHARED_BUFFERS would be superuser-only and
evict only clean buffers.
(Also, if we extend DISCARD, let's also allow it to be written as DISCARD
(LIST, OF, THINGS, TO, DISCARD) so that we can make the syntax
extensible for plugins in the future.)
Thoughts?
Would DISCARD TEMP_BUFFERS meet your needs?
Actually I have already implemented a DropLocalBuffers function (three
lines of code:)
void
DropLocalBuffers(void)
{
	RelFileNode rnode;

	rnode.relNode = InvalidOid; /* drop all local buffers */
	DropRelFileNodeAllLocalBuffers(rnode);
}
for yet another Postgres extension which is not yet included even in
PgPRO-EE - SnapFS: support for database snapshots.
I do not think that we need such a command at user level (i.e. a
corresponding SQL command).
But, as I already wrote above, I do not consider flushing all buffers on
session reschedule an acceptable solution.
Moreover, just flushing buffers is not enough. There is still some
smgr state associated with the relation which is local to the backend.
We in any case have to make some changes to be able to access temporary
data from another backend, even if the data is flushed to the file system.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Thu, 8 Aug 2019 at 15:03, Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
wrote:
On 08.08.2019 5:40, Craig Ringer wrote:
On Tue, 6 Aug 2019 at 16:32, Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> wrote:

New version of the patch with several fixes is attached.
Many thanks to Roman Zharkov for testing.

FWIW I still don't understand your argument with regards to using
shared_buffers for temp tables having connection pooling benefits. Are you
assuming the presence of some other extension in your extended version of
PostgreSQL ? In community PostgreSQL a temp table's contents in one backend
will not be visible in another backend. So if your connection pooler in
transaction pooling mode runs txn 1 on backend 42 and it populates temp
table X, then the pooler runs the same app session's txn 2 on backend 45,
the contents of temp table X are not visible anymore.

Certainly here I mean the built-in connection pooler which is not currently
present in Postgres,
but it is part of PgPRO-EE and there is my patch for vanilla at commitfest:
https://commitfest.postgresql.org/24/2067
OK, that's what I assumed.
You're trying to treat this change as if it's a given that the other
functionality you want/propose is present in core or will be present in
core. That's far from given. My suggestion is to split it up so that the
parts can be reviewed and committed separately.
In PgPRO-EE this problem was solved by binding a session to a backend, i.e.
one backend can manage multiple sessions,
but a session can not migrate to another backend. The drawback of such a
solution is obvious: one long-living transaction can block transactions of
all other sessions scheduled to this backend.
The possibility to migrate a session to another backend is one of the
obvious solutions to the problem. But the main showstopper for it is
temporary tables.
This is why I consider moving temporary tables to shared buffers a very
important step.
I can see why it's important for your use case.
I am not disagreeing.
I am however strongly suggesting that your patch has two fairly distinct
functional changes in it, and you should separate them out.
* Introduce global temp tables, a new relkind that works like a temp table
but doesn't require catalog changes. Uses per-backend relfilenode and
cleanup like existing temp tables. You could extend the relmapper to handle
the mapping of relation oid to per-backend relfilenode (a rough sketch of
such a mapping follows after this list).
* Associate global temp tables with session state and manage them in
shared_buffers so they can work with the in-core connection pooler (if
committed)
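
Along the lines of the relmapper suggestion above, a minimal sketch of a
per-backend map from relation oid to a private relfilenode (the dynahash
approach, all names, and the use of GetNewRelFileNode() with the patch's
RELPERSISTENCE_SESSION are illustrative assumptions, not code from either
posted patch):

#include "postgres.h"
#include "catalog/catalog.h"	/* GetNewRelFileNode() */
#include "catalog/pg_class.h"	/* RELPERSISTENCE_SESSION (added by the patch) */
#include "utils/hsearch.h"

typedef struct SessionRelMapEntry
{
	Oid			relid;			/* hash key: shared relation oid */
	Oid			relfilenode;	/* this backend's private relfilenode */
} SessionRelMapEntry;

static HTAB *SessionRelMap;		/* backend-local map */

static Oid
GetSessionRelFileNode(Oid relid, Oid reltablespace)
{
	SessionRelMapEntry *entry;
	bool		found;

	if (SessionRelMap == NULL)
	{
		HASHCTL		ctl;

		MemSet(&ctl, 0, sizeof(ctl));
		ctl.keysize = sizeof(Oid);
		ctl.entrysize = sizeof(SessionRelMapEntry);
		SessionRelMap = hash_create("session relfilenode map", 64, &ctl,
									HASH_ELEM | HASH_BLOBS);
	}

	entry = (SessionRelMapEntry *) hash_search(SessionRelMap, &relid,
											   HASH_ENTER, &found);
	if (!found)
	{
		/* First access in this backend: allocate a private relfilenode. */
		entry->relfilenode = GetNewRelFileNode(reltablespace, NULL,
											   RELPERSISTENCE_SESSION);
	}
	return entry->relfilenode;
}

Such a map would live for the lifetime of the backend, matching the
lifetime of the backend-private data files.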
Historically we've had a few efforts to get in-core connection pooling that
haven't gone anywhere. Without your pooler patch the changes you make to
use shared_buffers etc are going to be unhelpful at best, if not actively
harmful to performance, and will add unnecessary complexity. So I think
there's a logical series of patches here:
* global temp table relkind and support for it
* session state separation
* connection pooling
* pooler-friendly temp tables in shared_buffers
Make sense?
But even if we forget about the built-in connection pooler, don't you think
that the possibility to use parallel query plans for temporary tables is by
itself strong enough motivation to access global temp tables through shared
buffers?
I can see a way to share temp tables across parallel query backends being
very useful for DW/OLAP workloads, yes. But I don't know if putting them in
shared_buffers is the right answer for that. We have DSM/DSA, we have
shm_mq, various options for making temp buffers share-able with parallel
worker backends.
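
For illustration, the shm_mq route might look roughly like this on the
leader side (a sketch only, not code from either patch; the queue size,
the names, and the assumption that tuples arrive pre-serialized are all
placeholders):

#include "postgres.h"
#include "storage/dsm.h"
#include "storage/proc.h"		/* MyProc */
#include "storage/shm_mq.h"

#define TEMP_TUPLE_QUEUE_SIZE	((Size) 1024 * 1024)

/* Leader: create a DSM segment containing one message queue and attach. */
static shm_mq_handle *
setup_temp_tuple_queue(dsm_segment **segp)
{
	dsm_segment *seg = dsm_create(TEMP_TUPLE_QUEUE_SIZE, 0);
	shm_mq	   *mq = shm_mq_create(dsm_segment_address(seg),
								   TEMP_TUPLE_QUEUE_SIZE);

	shm_mq_set_sender(mq, MyProc);	/* the worker sets itself as receiver */
	*segp = seg;
	return shm_mq_attach(mq, seg, NULL);
}

/* Leader: stream one pre-serialized temp-table tuple to the worker. */
static void
send_temp_tuple(shm_mq_handle *mqh, const void *data, Size len)
{
	if (shm_mq_send(mqh, len, data, false) != SHM_MQ_SUCCESS)
		elog(ERROR, "parallel worker detached from temp tuple queue");
}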
I'm suggesting that you not tie the whole (very useful) global temp tables
feature to this, but instead split it up into logical units that can be
understood, reviewed and committed separately.
I would gladly participate in review.
Would DISCARD TEMP_BUFFERS meet your needs?
Actually I have already implemented a DropLocalBuffers function (three
lines of code:) [...]
I do not think that we need such a command at user level (i.e. a
corresponding SQL command).
I'd be very happy to have it personally, but don't think it needs to be
tied in with your patch set here. Maybe I can cook up a patch soon.
--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
On 09.08.2019 8:34, Craig Ringer wrote:
[...]
Ok, here it is: global_private_temp-1.patch
I have also attached an updated version of the global temp tables with
shared buffers - global_shared_temp-1.patch.
It is certainly larger (~2k lines vs. 1.5k lines) because it changes
BufferTag and related functions.
But I do not think that this difference is so critical.
I still have a wish to kill two birds with one stone:)
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
global_shared_temp-1.patch (text/x-patch)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..2d93f6f 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenode = bufHdr->tag.rnode.node.relNode;
+ fctx->record[i].reltablespace = bufHdr->tag.rnode.node.spcNode;
+ fctx->record[i].reldatabase = bufHdr->tag.rnode.node.dbNode;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 38ae240..8a04954 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -608,9 +608,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.rnode.node.dbNode;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.node.spcNode;
+ block_info_array[num_blocks].filenode = bufHdr->tag.rnode.node.relNode;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index c945b28..14d4e48 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &rnode, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
}
}
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 9726020..389466e 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -1028,7 +1028,8 @@ gistGetFakeLSN(Relation rel)
{
static XLogRecPtr counter = FirstNormalUnloggedLSN;
- if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/*
* Temporary relations are only accessible in our session, so a simple
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 09bc6fe..c60effd 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -671,6 +671,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 5962126..60f4696 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -763,7 +763,11 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* Session temporary relation may not yet be initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && PageIsNew(BufferGetPage(buf)) && IsSessionRelationBackendId(rel->rd_backend))
+ _bt_initmetapage(BufferGetPage(buf), P_NONE, 0);
+ else
+ _bt_checkpage(rel, buf);
}
else
{
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 3ec67d4..edec8ca 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -213,6 +213,7 @@ void
XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
{
registered_buffer *regbuf;
+ RelFileNodeBackend rnode;
/* NO_IMAGE doesn't make sense with FORCE_IMAGE */
Assert(!((flags & REGBUF_FORCE_IMAGE) && (flags & (REGBUF_NO_IMAGE))));
@@ -227,7 +228,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = ®istered_buffers[block_id];
- BufferGetTag(buffer, ®buf->rnode, ®buf->forkno, ®buf->block);
+ BufferGetTag(buffer, &rnode, ®buf->forkno, ®buf->block);
+ regbuf->rnode = rnode.node;
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) ®buf->rdata_head;
@@ -919,7 +921,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blkno;
@@ -948,7 +950,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
flags |= REGBUF_STANDARD;
BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ XLogRegisterBlock(0, &rnode.node, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1009,7 +1011,7 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkNum;
BlockNumber blkno;
@@ -1018,7 +1020,7 @@ log_newpage_buffer(Buffer buffer, bool page_std)
BufferGetTag(buffer, &rnode, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rnode.node, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index a065419..8814afb 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -409,6 +409,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 99ae159..24b2438 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3612,7 +3612,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 3cc886f..a111ddc 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cedb4ee..d11c5b3 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1400,7 +1400,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 0960b33..d700650 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index fb2be10..a7d0e99 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -586,7 +586,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && stmt->relation->relpersistence != RELPERSISTENCE_TEMP
+ && stmt->relation->relpersistence != RELPERSISTENCE_SESSION)
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -1772,7 +1773,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -7678,6 +7680,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14082,6 +14090,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -14569,14 +14584,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c97bb36..f9b2000 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3265,20 +3265,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index a9b2f8b..2f261b9 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why should we not always use the persistence of the parent table?
+ * Although it is prohibited to have unlogged sequences,
+ * unlogged tables with SERIAL fields are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313..ae8b7fd 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2069,7 +2069,8 @@ do_autovacuum(void)
* Check if it is a temp table (presumably, of some other backend's).
* We cannot safely process other backends' temp tables.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (classForm->relpersistence == RELPERSISTENCE_TEMP ||
+ classForm->relpersistence == RELPERSISTENCE_SESSION)
{
/*
* We just ignore it if the owning backend is still active and
@@ -2154,7 +2155,8 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (classForm->relpersistence == RELPERSISTENCE_TEMP ||
+ classForm->relpersistence == RELPERSISTENCE_SESSION)
continue;
relid = classForm->oid;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index e8ffa04..2004d2f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3483,6 +3483,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
{
ReorderBufferTupleCidKey key;
ReorderBufferTupleCidEnt *ent;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blockno;
bool updated_mapping = false;
@@ -3496,7 +3497,8 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
* get relfilenode from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &rnode, &forkno, &blockno);
+ key.relnode = rnode.node;
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 6f3a402..76ce953 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -556,7 +556,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
int buf_id;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -710,7 +710,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
Block bufBlock;
bool found;
bool isExtend;
- bool isLocalBuf = SmgrIsTemp(smgr);
+ bool isLocalBuf = SmgrIsTemp(smgr) && relpersistence == RELPERSISTENCE_TEMP;
*hit = false;
@@ -1010,7 +1010,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1532,7 +1532,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1543,7 +1544,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -1845,8 +1847,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rnode.node.spcNode;
+ item->relNode = bufHdr->tag.rnode.node.relNode;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2559,7 +2561,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rnode.node, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2631,7 +2633,7 @@ BufferGetBlockNumber(Buffer buffer)
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2696,7 +2698,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rnode.node, buf->tag.rnode.backend);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -2930,7 +2932,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
int i;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileNodeBackendIsLocalTemp(rnode))
{
if (rnode.backend == MyBackendId)
DropRelFileNodeLocalBuffers(rnode.node, forkNum, firstDelBlock);
@@ -2958,11 +2960,11 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
* We could check forkNum and blockNum as well as the rnode, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode))
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -2985,24 +2987,24 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
{
int i,
n = 0;
- RelFileNode *nodes;
+ RelFileNodeBackend *nodes;
bool use_bsearch;
if (nnodes == 0)
return;
- nodes = palloc(sizeof(RelFileNode) * nnodes); /* non-local relations */
+ nodes = palloc(sizeof(RelFileNodeBackend) * nnodes); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
for (i = 0; i < nnodes; i++)
{
- if (RelFileNodeBackendIsTemp(rnodes[i]))
+ if (RelFileNodeBackendIsLocalTemp(rnodes[i]))
{
if (rnodes[i].backend == MyBackendId)
DropRelFileNodeAllLocalBuffers(rnodes[i].node);
}
else
- nodes[n++] = rnodes[i].node;
+ nodes[n++] = rnodes[i];
}
/*
@@ -3025,11 +3027,11 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
/* sort the list of rnodes if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(nodes, n, sizeof(RelFileNodeBackend), rnode_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileNodeBackend *rnode = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
@@ -3044,7 +3046,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, nodes[j]))
{
rnode = &nodes[j];
break;
@@ -3054,7 +3056,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
else
{
rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
+ nodes, n, sizeof(RelFileNodeBackend),
rnode_comparator);
}
@@ -3063,7 +3065,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, (*rnode)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3102,11 +3104,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rnode.node.dbNode == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3136,7 +3138,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpath(buf->tag.rnode, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3204,7 +3206,8 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3251,13 +3254,15 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node)
+ || bufHdr->tag.rnode.backend != rel->rd_backend)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3305,13 +3310,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rnode.node.dbNode == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4051,7 +4056,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpath(buf->tag.rnode, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4075,7 +4080,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpath(bufHdr->tag.rnode, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4093,7 +4098,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4108,22 +4113,27 @@ local_buffer_write_error_callback(void *arg)
static int
rnode_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileNodeBackend n1 = *(const RelFileNodeBackend *) p1;
+ RelFileNodeBackend n2 = *(const RelFileNodeBackend *) p2;
- if (n1.relNode < n2.relNode)
+ if (n1.node.relNode < n2.node.relNode)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.node.relNode > n2.node.relNode)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.node.dbNode < n2.node.dbNode)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.node.dbNode > n2.node.dbNode)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.node.spcNode < n2.node.spcNode)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.node.spcNode > n2.node.spcNode)
+ return 1;
+
+ if (n1.backend < n2.backend)
+ return -1;
+ else if (n1.backend > n2.backend)
return 1;
else
return 0;
@@ -4359,7 +4369,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileNodeBackendEquals(cur->tag.rnode, next->tag.rnode) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4378,7 +4388,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rnode.node, tag.rnode.backend);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index f5f6a29..6bd5ecb 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ LocalPrefetchBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -111,7 +111,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -209,7 +209,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rnode.node, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -331,14 +331,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -377,12 +377,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index cf7f03f..65eb422 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
BufferGetTag(buf, &rnode, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 07f3c93..204c4cb 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -33,6 +33,7 @@
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
@@ -87,6 +88,18 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to collect information about session relations created by this backend.
+ * Data of such relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +165,48 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Remove relation pages from shared buffers */
+ DropRelFileNodesAllBuffers(&rel->rnode, 1);
+
+ /* Delete relation files */
+ mdunlink(rel->rnode, InvalidForkNumber, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when the relation is created, or opened when its file doesn't exist yet.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We are using the shared-memory exit hook
+ * because cleanup should be performed before the backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +273,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln);
pfree(path);
@@ -465,6 +522,19 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * In case of session relation access, there may not yet be any files of this relation for this backend.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +546,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +723,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* In case of a session relation we need to write the zero page to provide a correct result for subsequent mdnblocks() calls */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -738,12 +814,18 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
+ /*
+ * If we access a session relation, there may be no files of this relation yet for this backend.
+ * Pass EXTENSION_RETURN_NULL to make mdopenfork() return NULL in this case instead of reporting an error.
+ */
+ MdfdVec *v = mdopenfork(reln, forknum, RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode)
+ ? EXTENSION_RETURN_NULL : EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2488607..86e8fca 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1098,6 +1098,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3301,6 +3305,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0cc9ede..1dff0c8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15593,8 +15593,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * the SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract SessionRelFirstBackendId
+ * to avoid overly long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary relations
+ * to make it possible to clean them up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..f226e7c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add some special bias value to the backend id (since
+ * the maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,10 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index df2dda7..7adb96b 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,16 +90,17 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileNodeBackend rnode; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rnode.node.spcNode = InvalidOid, \
+ (a).rnode.node.dbNode = InvalidOid, \
+ (a).rnode.node.relNode = InvalidOid, \
+ (a).rnode.backend = InvalidBackendId, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
@@ -113,7 +114,7 @@ typedef struct buftag
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileNodeBackendEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 509f4b7..3315fa0 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -205,7 +205,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether it is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether it is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether it is a local temporary relation which exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 74b5077..44df4e0 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -85,3 +85,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test with minor changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to a temporary child relation
+# of another session, but blocks when scanning the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..ae1adb6
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,247 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fc0f141..507cf7d 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 68ac56a..3890777 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..3058b9b
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,151 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
Attachment: global_private_temp-1.patch (text/x-patch)
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 9726020..389466e 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -1028,7 +1028,8 @@ gistGetFakeLSN(Relation rel)
{
static XLogRecPtr counter = FirstNormalUnloggedLSN;
- if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/*
* Temporary relations are only accessible in our session, so a simple
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index f1ff01e..e92d324 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -673,6 +673,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 9c1f7de..e4a56f6 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -763,7 +763,11 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* A session temporary relation may not yet be initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && PageIsNew(BufferGetPage(buf)) && IsSessionRelationBackendId(rel->rd_backend))
+ _bt_initmetapage(BufferGetPage(buf), P_NONE, 0);
+ else
+ _bt_checkpage(rel, buf);
}
else
{
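
Since a session relation's files are created lazily in each backend, a btree
index can be opened in a backend where its metapage has never been written;
the hunk above initializes the metapage on first access instead of failing in
_bt_checkpage. A minimal illustration, assuming the patch is applied (names
are arbitrary):

    create session table t1(x integer primary key);
    \c
    -- The first index access from the new backend finds an empty file;
    -- _bt_getbuf now initializes the btree metapage on the fly.
    insert into t1 values (1);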
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index a065419..8814afb 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -409,6 +409,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 3e1d406..aaa2c49 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3590,7 +3590,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 3cc886f..a111ddc 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cedb4ee..d11c5b3 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1400,7 +1400,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 0960b33..d700650 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index fb2be10..a7d0e99 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -586,7 +586,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && stmt->relation->relpersistence != RELPERSISTENCE_TEMP
+ && stmt->relation->relpersistence != RELPERSISTENCE_SESSION)
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -1772,7 +1773,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -7678,6 +7680,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14082,6 +14090,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -14569,14 +14584,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
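
The tablecmds.c hunks above allow ON COMMIT on session tables, truncate them
immediately and non-transactionally (their data is private to the backend),
forbid switching their logged status, and require foreign keys to stay within
the session persistence class. A sketch of the new FK restriction (table
names are illustrative):

    CREATE GLOBAL TEMP TABLE parent_s (id int PRIMARY KEY);
    CREATE GLOBAL TEMP TABLE child_s (pid int REFERENCES parent_s);  -- accepted
    CREATE TABLE parent_p (id int PRIMARY KEY);
    CREATE GLOBAL TEMP TABLE child_x (pid int REFERENCES parent_p);
    ERROR:  constraints on session tables may reference only session tables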
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c97bb36..f9b2000 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3265,20 +3265,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
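
After this grammar change all of the following spellings produce
RELPERSISTENCE_SESSION (GLOBAL TEMP no longer degrades to a plain temporary
table with a deprecation warning), while LOCAL TEMP keeps its old meaning:

    CREATE GLOBAL TEMPORARY TABLE t1 (x int);
    CREATE GLOBAL TEMP TABLE t2 (x int);
    CREATE SESSION TABLE t3 (x int);
    CREATE SESSION TEMP TABLE t4 (x int);
    -- unchanged: a backend-local temporary table
    CREATE LOCAL TEMP TABLE t5 (x int);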
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 6e5768c..ea6989b 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why not always use the persistence of the parent table? Because, although
+ * unlogged sequences are prohibited, unlogged tables with SERIAL fields
+ * are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313..ae8b7fd 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2069,7 +2069,8 @@ do_autovacuum(void)
* Check if it is a temp table (presumably, of some other backend's).
* We cannot safely process other backends' temp tables.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (classForm->relpersistence == RELPERSISTENCE_TEMP ||
+ classForm->relpersistence == RELPERSISTENCE_SESSION)
{
/*
* We just ignore it if the owning backend is still active and
@@ -2154,7 +2155,8 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (classForm->relpersistence == RELPERSISTENCE_TEMP ||
+ classForm->relpersistence == RELPERSISTENCE_SESSION)
continue;
relid = classForm->oid;
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 07f3c93..5db79ec 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -33,6 +33,7 @@
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
@@ -87,6 +88,18 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to collect information about session relations created by this backend.
+ * Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +165,45 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Delete relation files */
+ mdunlink(rel->rnode, InvalidForkNumber, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or when it is opened and its file doesn't exist yet.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We use a shared-memory exit hook because
+ * cleanup must be performed before the backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +270,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln);
pfree(path);
@@ -465,6 +519,19 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * When a session relation is accessed, this backend may not yet have any files for it.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +543,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +720,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* For a session relation we need to write the zeroed page so that a subsequent mdnblocks returns the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -738,12 +811,18 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
+ /*
+ * When accessing a session relation, this backend may not yet have any files for it.
+ * Pass EXTENSION_RETURN_NULL to make mdopenfork return NULL in this case instead of reporting an error.
+ */
+ MdfdVec *v = mdopenfork(reln, forknum, RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode)
+ ? EXTENSION_RETURN_NULL : EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2488607..86e8fca 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1098,6 +1098,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3301,6 +3305,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0cc9ede..1dff0c8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15593,8 +15593,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
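
With this change a session table should round-trip through pg_dump with the
SESSION keyword; the emitted DDL would look roughly like this (illustrative
output):

    CREATE SESSION TABLE public.my_private_table (
        x integer,
        y integer
    );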
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract SessionRelFirstBackendId
+ * to avoid overly long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary relations
+ * so that they can be cleaned up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
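
The bias added in backendid.h is subtracted again here, so with MyBackendId = 3
a session relation carries backend id 0x40000003 in its RelFileNodeBackend but
still maps to a file prefixed t3_, letting the existing cleanup of orphaned
temp files cover session relations too. This can be observed through
pg_relation_filepath (OIDs and backend id below are illustrative):

    CREATE SESSION TABLE st (x int);
    SELECT pg_relation_filepath('st');
     pg_relation_filepath
    ----------------------
     base/13593/t3_16385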
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..f226e7c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add a special bias value to the backend id (since
+ * the maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,10 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether this is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether this is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether this is a local temporary relation that exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 69ae227..95919f8 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -87,3 +87,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test with minor changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to a temporary child relation
+# of another session, but blocks when scanning the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..ae1adb6
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,247 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fc0f141..507cf7d 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 68ac56a..3890777 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..3058b9b
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,151 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
On Fri, 9 Aug 2019 at 22:07, Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
wrote:
Ok, here it is: global_private_temp-1.patch
Fantastic.
I'll put that high on my queue.
I'd love to see something like this get in.
Doubly so if it brings us closer to being able to use temp tables on
physical read replicas, though I know there are plenty of other barriers
there (not least of which being temp tables using persistent txns not
vtxids)
Does it have a CF entry?
Also I have attached updated version of the global temp tables with shared
buffers - global_shared_temp-1.patch
Nice to see that split out. In addition to giving the first patch more hope
of being committed this time around, it'll help with readability and
testability too.
To be clear, I have long wanted to see PostgreSQL have the "session" state
abstraction you have implemented. I think it's really important for high
client count OLTP workloads, working with the endless collection of ORMs
out there, etc. So I'm all in favour of it in principle so long as it can
be made to work reliably with limited performance impact on existing
workloads and without making life lots harder when adding new core
functionality, for extension authors etc. The same goes for built-in
pooling. I think PostgreSQL has needed some sort of separation of
"connection", "backend", "session" and "executor" for a long time and I'm
glad to see you working on it.
With that said: How do you intend to address the likelihood that this will
cause performance regressions for existing workloads that use temp tables
*without* relying on your session state and connection pooler? Consider
workloads that use temp tables for mid-long txns where txn pooling is
unimportant, where they also do plenty of read and write activity on
persistent tables. Classic OLAP/DW stuff. e.g.:
* four clients, four backends, four connections, session-level connections
that stay busy with minimal client sleeps
* All sessions run the same bench code
* transactions all read plenty of data from a medium to large persistent
table (think fact tables, etc)
* transactions store a filtered, joined dataset with some pre-computed
window results or something in temp tables
* benchmark workload makes big-ish temp tables to store intermediate data
for its medium-length transactions
* transactions also write to some persistent relations, say to record their
summarised results
How does it perform with and without your patch? I'm concerned that:
* the extra buffer locking and various IPC may degrade performance of temp
tables
* the temp table data in shared_buffers may put pressure on shared_buffers
space, cached pages for persistent tables all sessions are sharing;
* the temp table data in shared_buffers may put pressure on shared_buffers
space for dirty buffers, forcing writes of persistent tables out earlier
therefore reducing write-combining opportunities;
--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
On 10.08.2019 5:12, Craig Ringer wrote:
How does it perform with and without your patch? I'm concerned that the
extra buffer locking and various IPC may degrade performance of temp
tables, and that temp table data in shared_buffers may put pressure on
shared_buffers space for both clean cached pages and dirty buffers.
I agree that access to local buffers is cheaper than access to shared
buffers because there is no lock overhead.
And the fact that access to local tables cannot affect cached data of
persistent tables is also important.
But most Postgres tables are still normal (persistent) tables accessed
through shared buffers.
And a huge amount of effort was made to make this access as efficient as
possible (the clock algorithm which doesn't require a global lock,
atomic operations, ...). Also, using the same replacement discipline for
all tables may be preferable for some workloads.
So it is not obvious to me that in the described scenario a local
buffer cache for temporary tables will really provide significant advantages.
It will be interesting to perform some benchmarking - I am going to do it.
What I have observed right now is that in a typical scenario (dumping the
results of a huge query to a temporary table with a subsequent traversal
of this table, as sketched below) old (local) temporary tables provide
better performance, maybe because of the small size of the local buffer
cache and its different eviction policy.
But subsequent accesses to the global shared table are faster (because it
completely fits in the large shared buffer cache).
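To make the scenario concrete, here is a minimal sketch of this access
pattern (table and column names are invented for illustration):

-- dump the result of a large query into a temporary table ...
CREATE TEMP TABLE agg_results AS   -- or CREATE GLOBAL TEMP TABLE with the patch
    SELECT customer_id, sum(amount) AS total
    FROM big_fact_table
    GROUP BY customer_id;
-- ... then traverse it repeatedly
SELECT count(*) FROM agg_results WHERE total > 1000;
SELECT * FROM agg_results ORDER BY total DESC LIMIT 10;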
There is one more problem with global temporary tables for which I do
not know a good solution now: collecting statistics.
Since each backend has its own data, backends may generally need
different query plans.
Right now if you perform "analyze table" in one backend, it will
affect plans in all backends.
It can be considered not a bug but a feature if we assume that the
distribution of data in all backends is similar.
But if this assumption is not true, it can be a problem. The example
below illustrates the effect.
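For illustration, the cross-backend effect could look like this (a sketch
assuming the patch's syntax; the point is that session 2's plan is built
from session 1's statistics):

-- session 1
CREATE GLOBAL TEMP TABLE gtt(x int);
INSERT INTO gtt SELECT generate_series(1, 1000000);
ANALYZE gtt;   -- updates the shared pg_class/pg_statistic entries

-- session 2: its private copy of gtt holds only a few rows,
-- but EXPLAIN plans with session 1's statistics
INSERT INTO gtt VALUES (1);
EXPLAIN SELECT * FROM gtt WHERE x = 1;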
Hi
The last point (collecting statistics) is probably the most difficult
issue, and I have thought about it for years.
My experience with customers is that 99% of temp table usage is without
statistics - with good information only about row counts. Only a few
customers know that a manual ANALYZE is necessary for temp tables (when
it is really necessary).
Sharing metadata about global temporary tables can be a real problem -
probably not the statistics, but surely the number of pages and number of rows.
There are two requirements:
a) we need some special metadata for any instance (per session) of a global
temporary table (rows, pages, statistics, maybe multicolumn statistics, ...)
b) we should not use the persistent global catalogue (to avoid catalogue
bloating)
I see two possible solutions:
1. hold these data only in memory in special buffers
2. hold these data in global temporary tables - we can use global temp
tables for their metadata just like classic persistent tables are used for
the metadata of classic persistent tables. The syscache can then be
enhanced to work with the union of two system tables.
I prefer 2 because the changes can be implemented at a deeper level.
Sharing metadata for global temp tables (the current state, if I understand
it well) is good enough for the development stage, but it is hard to expect
it to work generally in a production environment.
Regards
p.s. I am very happy that you are working on this topic. It is an
interesting and important problem.
Pavel
Hi,
On 11.08.2019 10:14, Pavel Stehule wrote:
Sharing metadata about global temporary tables can be a real problem -
probably not the statistics, but surely the number of pages and number of rows.
But Postgres is not storing this information now anywhere else except
the statistics, is it?
There was a proposal to cache relation size, but it is not implemented
yet. If such a cache existed, we could use it to store local information
about global temporary tables.
So if 99% of users do not perform analyze for temporary tables, then
they will not be faced with this problem, right?
I see two possible solutions: 1. hold these data only in memory in
special buffers; 2. hold these data in global temporary tables.
I think that it is not possible to assume that temporary data will always
fit in memory.
So 1) does not seem to be an acceptable solution.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Mon, 12 Aug 2019 at 18:19, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
But Postgres is not storing this information now anywhere else except
the statistics, is it?
not only - critical numbers are reltuples, relpages from pg_class
So if 99% of users do not perform analyze for temporary tables, then
they will not be faced with this problem, right?
They use default statistics based on relpages (see the sketch below). But
for 1% of applications statistics are critical - almost always for OLAP
applications.
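For context, a small sketch of what the relpages-based defaults mean in
practice (as I understand the current planner behavior: with no
pg_statistic entry, the row estimate is scaled from the relation's current
size in pages and an assumed tuple density):

CREATE TEMP TABLE t AS SELECT generate_series(1, 100000) AS x;
-- no ANALYZE has been run yet
EXPLAIN SELECT * FROM t WHERE x < 10;
-- the total row count is estimated from relpages and tuple width,
-- but the selectivity of "x < 10" falls back to hardwired defaults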
I think that it is not possible to assume that temporary data will always
fit in memory. So 1) does not seem to be an acceptable solution.
I spoke only about metadata. Data should be stored in temp buffers (and
possibly in temp files).
Pavel
On Tue, 13 Aug 2019 at 00:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
But Postgres is not storing this information now anywhere else except
the statistics, is it?
not only - critical numbers are reltuples, relpages from pg_class
That's a very good point. relallvisible too. How's the global temp table
impl handling that right now, since you won't be changing the pg_class row?
AFAICS relpages doesn't need to be up to date (and reltuples certainly
doesn't) so presumably you're just leaving them as zero?
What happens right now if you ANALYZE or VACUUM ANALYZE a global temp
table? Is it just disallowed?
I'll need to check, but I wonder if periodically updating those fields in
pg_class impacts logical decoding too. Logical decoding must treat
transactions with catalog changes as special cases where it creates custom
snapshots and does other expensive additional work.
(See ReorderBufferXidSetCatalogChanges in reorderbuffer.c and its
callsites). We don't actually need to know relpages and reltuples during
logical decoding. It can probably ignore relfrozenxid and relminmxid
changes too, maybe even pg_statistic changes though I'd be less confident
about that one.
At some point I need to patch in a bunch of extra tracepoints and do some
perf tracing to see how often we do potentially unnecessary snapshot
related work in logical decoding.
They use default statistics based on relpages. But for 1% of applications
statistics are critical - almost always for OLAP applications.
Agreed. It's actually quite a common solution to user problem reports /
support queries about temp table performance: "Run ANALYZE. Consider
creating indexes too."
Which reminds me - if global temp tables do get added, it'll further
increase the desirability of exposing a feature to let users
disable+invalidate and then later reindex+enable indexes without icky
catalog hacking. So they can disable indexes for their local copy, load
data, re-enable indexes. That'd be "interesting" to implement for global
temp tables given that index state is part of the pg_index row associated
with an index rel though.
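For reference, the "icky catalog hacking" in question is roughly the
following (a sketch only, using a hypothetical index my_idx; direct
pg_index updates are unsupported and shown purely to illustrate what a
real disable/re-enable feature would replace):

-- make the index invisible to the planner and skipped by writes
UPDATE pg_index SET indisvalid = false, indisready = false
WHERE indexrelid = 'my_idx'::regclass;
-- ... bulk load data ...
-- rebuild the index and make it usable again
REINDEX INDEX my_idx;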
1. hold these data only in memory in special buffers
I don't see that working well for pg_statistic or anything else that holds
nontrivial user data though.
2. hold these data in global temporary tables - it is similar to normal
tables - we can use global temp tables for metadata like classic persistent
tables are used for metadata of classic persistent tables. Next syscache
can be enhanced to work with union of two system tables.
Very meta. Syscache and relcache are extremely performance critical but
could probably skip scans entirely on backends that haven't used any global
temp tables.
I don't know the relevant caches well enough to have a useful opinion here.
I think that it is not possible to assume that temporary data will always
fit in memory. So 1) does not seem to be an acceptable solution.
It'd only be the metadata, but if it includes things like column histograms
and most frequent value data that'd still be undesirable to have pinned in
backend memory.
--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
On 13.08.2019 8:34, Craig Ringer wrote:

That's a very good point. relallvisible too. How's the global temp
table impl handling that right now, since you won't be changing the
pg_class row? AFAICS relpages doesn't need to be up to date (and
reltuples certainly doesn't) so presumably you're just leaving them as
zero?
As far as I understand, relpages and reltuples are set only when you
perform "analyze" on the table.
What happens right now if you ANALYZE or VACUUM ANALYZE a global temp
table? Is it just disallowed?
No, it is not disallowed now.
It updates the statistics and also the fields in pg_class, which are
shared by all backends.
So all backends will now build plans according to these statistics.
Certainly it may lead to less efficient plans if there are large
differences in the number of tuples stored in this table in different backends.
But that seems critical mostly when indexes are present on the
temporary table. And it seems to me that users create indexes on
temporary tables even more rarely than they analyze them.
I'll need to check, but I wonder if periodically updating those fields
in pg_class impacts logical decoding too. Logical decoding must treat
transactions with catalog changes as special cases where it creates
custom snapshots and does other expensive additional work.
Temporary tables (both local and global) as well as unlogged tables are
not subject to logical replication, are they?
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Fri, 9 Aug 2019 at 22:07, Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
wrote:
Ok, here it is: global_private_temp-1.patch
Initial pass review follows.
Relation name "SESSION" is odd. I guess you're avoiding "global" because
the data is session-scoped, not globally temporary. But I'm not sure
"session" fits either - after all regular temp tables are also session
temporary tables. I won't focus on naming further beyond asking that it be
consistent though, right now there's a mix of "global" in some places and
"session" in others.
Why do you need to do all this indirection with changing RelFileNode to
RelFileNodeBackend in the bufmgr, changing BufferGetTag etc? Similarly,
your changes of RelFileNodeBackendIsTemp to RelFileNodeBackendIsLocalTemp .
Did you look into my suggestion of extending the relmapper so that global
temp tables would have a relfilenode of 0 like pg_class etc, and use a
backend-local map of oid-to-relfilenode mappings? I'm guessing you did it
the way you did instead to lay the groundwork for cross-backend sharing,
but if so it should IMO be in that patch. Maybe my understanding of the
existing temp table mechanics is just insufficient as I
see RelFileNodeBackendIsTemp is already used in some aspects of existing
temp relation handling.
Similarly, TruncateSessionRelations probably shouldn't need to exist in
this patch in its current form; there's no shared_buffers use to clean and
the same file cleanup mechanism should handle both session-temp and
local-temp relfilenodes.
A number of places make a change like this:
rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION
and I'd like to see a test macro or inline static for it since it's
repeated so much. Mostly to make the intent clear: "is this a relation with
temporary backend-scoped data?"
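A minimal sketch of such a helper (the name here is illustrative; the
refactored shared-buffers patch later in the thread does use a
RelationHasSessionScope() helper for this, visible in its gistutil.c hunk):

/* Does this relation hold backend-scoped (temporary or session) data? */
static inline bool
RelationHasBackendScopedData(Relation rel)
{
    return rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
           rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION;
}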
This test:
+ if (blkno == BTREE_METAPAGE && PageIsNew(BufferGetPage(buf)) &&
IsSessionRelationBackendId(rel->rd_backend))
+ _bt_initmetapage(BufferGetPage(buf), P_NONE, 0);
seems sensible but I'm wondering if there's a way to channel initialization
of global-temp objects through a bit more of a common path, so it reads
more obviously as a common test applying to all global-temp tables. "Global
temp table not initialized in session yet? Initialize it." So we don't have
to have every object type do an object type specific test for "am I
actually uninitialized?" in all paths it might hit. I guess I expected to
see something more like a
if (RelGlobalTempUninitialized(rel))
but maybe I've been doing too much Java ;)
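Something like the following, perhaps (a sketch under the assumption that
"uninitialized" always manifests as a new page of a backend-scoped relation):

/* Has this backend not yet initialized its private copy of a global temp rel? */
#define RelGlobalTempUninitialized(rel, page) \
    (IsSessionRelationBackendId((rel)->rd_backend) && PageIsNew(page))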
A similar test reappears here for sequences:
+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION &&
PageIsNew(page))
Why is this written differently?
Sequence initialization ignores sequence startval/firstval settings. Why?
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
Doesn't this change the test outcome for RELPERSISTENCE_UNLOGGED?
In PreCommit_on_commit_actions, in the the ONCOMMIT_DELETE_ROWS case, is
there any way to still respect the XACT_FLAGS_ACCESSEDTEMPNAMESPACE flag if
the oid is for a backend-temp table not a global-temp table?
+ bool isLocalBuf = SmgrIsTemp(smgr) && relpersistence ==
RELPERSISTENCE_TEMP;
Won't session-temp tables have local buffers too? Until your next patch
that adds shared_buffers storage for them anyway?
+ * These is no need to separate them at file system level, so just
subtract SessionRelFirstBackendId
+ * to avoid too long file names.
I agree that there's no reason to be able to differentiate between
local-temp and session-temp relfilenodes at the filesystem level.
Also I have attached updated version of the global temp tables with shared
buffers - global_shared_temp-1.patch
It is certainly larger (~2k lines vs. 1.5k lines) because it is changing
BufferTag and related functions.
But I do not think that this difference is so critical.
Still have a wish to kill two birds with one stone :)
--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
On Tue, 13 Aug 2019 at 16:19, Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
wrote:

As far as I understand, relpages and reltuples are set only when you
perform "analyze" on the table.
Also autovacuum's autoanalyze.
What happens right now if you ANALYZE or VACUUM ANALYZE a global temp
table? Is it just disallowed?

No, it is not disallowed now. It updates the statistics and also the
fields in pg_class, which are shared by all backends. So all backends
will now build plans according to these statistics.
That doesn't seem too bad TBH. Hacky but it doesn't seem dangerously wrong
and as likely to be helpful as not if clearly documented.
Temporary tables (both local and global) as well as unlogged tables are
not subject to logical replication, are they?
Right. But in the same way that they're still present in the catalogs, I
think they still affect catalog snapshots maintained by logical decoding's
historic snapshot manager as temp table creation/drop will still AFAIK
cause catalog invalidations to be written on commit. I need to double check
that.
--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
On 13.08.2019 11:21, Craig Ringer wrote:

Relation name "SESSION" is odd. I guess you're avoiding "global"
because the data is session-scoped, not globally temporary. But I'm
not sure "session" fits either - after all regular temp tables are
also session temporary tables. I won't focus on naming further beyond
asking that it be consistent though, right now there's a mix of
"global" in some places and "session" in others.
I have supported both forms: "create session table" and "create global temp table".
Both "session" and "global" are already PostgreSQL keywords, so we do
not need to introduce a new one (unlike "public" or "shared").
The form "global temp" is used in Oracle, so it seems natural to
provide similar syntax.
I am not insisting on this syntax and will consider all other suggestions.
But IMHO almost any SQL keyword is overloaded and has different meanings.
Temporary tables have the session as their lifetime (or the transaction,
if created with the "ON COMMIT DROP" clause).
So calling them "session tables" is actually clearer than just
"temporary tables".
But "local temp tables" can also be considered session tables, so maybe
it is really not such a good idea to use the "session" keyword for them.
Why do you need to do all this indirection with changing RelFileNode
to RelFileNodeBackend in the bufmgr, changing BufferGetTag etc?
Similarly, your changes of RelFileNodeBackendIsTemp
to RelFileNodeBackendIsLocalTemp. Did you look into my suggestion of
extending the relmapper so that global temp tables would have a
relfilenode of 0 like pg_class etc, and use a backend-local map of
oid-to-relfilenode mappings?
Sorry, are you really speaking about global_private_temp-1.patch?
This patch doesn't change the bufmgr file at all.
Maybe you looked at the other patch, global_shared_temp-1.patch,
which accesses shared tables through shared buffers and so has to
change the buffer tag to include the backend ID.
Similarly, TruncateSessionRelations probably shouldn't need to exist
in this patch in its current form; there's no shared_buffers use to
clean and the same file cleanup mechanism should handle both
session-temp and local-temp relfilenodes.
In global_private_temp-1.patch, TruncateSessionRelations does nothing
with shared buffers; it just deletes relation files.
A number of places make a change like this:

rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION

and I'd like to see a test macro or inline static for it since it's
repeated so much. Mostly to make the intent clear: "is this a relation
with temporary backend-scoped data?"
I considered calling such a macro IsSessionRelation() or something like
that, but you do not like the notion of "session".
Is IsBackendScopedRelation() a better choice?
I guess I expected to see something more like a

if (RelGlobalTempUninitialized(rel))

A similar test reappears here for sequences:

+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))

Why is this written differently?
Just because I wrote them at different moments in time :)
I think that adding RelGlobalTempUninitialized is a really good idea.
Sequence initialization ignores sequence startval/firstval settings. Why?
I am handling only the case of implicitly created sequences for
SERIAL/BIGSERIAL columns.
Is it possible to explicitly specify the initial value and step for them?
If so, this place should definitely be rewritten.
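(For reference, the implicitly created sequence of a SERIAL column can be
adjusted after creation, so explicit start values and steps are reachable
for such sequences, e.g.:

CREATE GLOBAL TEMP TABLE t(id serial);
ALTER SEQUENCE t_id_seq INCREMENT BY 10 RESTART WITH 100;
)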
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)

Doesn't this change the test outcome for RELPERSISTENCE_UNLOGGED?
The RELPERSISTENCE_UNLOGGED case is handled in the previous IF branch.
In PreCommit_on_commit_actions, in the the ONCOMMIT_DELETE_ROWS case,
is there any way to still respect the XACT_FLAGS_ACCESSEDTEMPNAMESPACE
flag if the oid is for a backend-temp table not a global-temp table?
If it is a local temp table, then XACT_FLAGS_ACCESSEDTEMPNAMESPACE is set,
so there is no need to check this flag.
And since XACT_FLAGS_ACCESSEDTEMPNAMESPACE is not set now for
global temp tables, I had to remove this check.
So for local temp tables the original behavior is preserved.
The question is why I do not set XACT_FLAGS_ACCESSEDTEMPNAMESPACE for
global temp tables.
I wanted to avoid the current limitation on using temp tables in prepared
transactions.
Global metadata makes it possible to eliminate some problems related to
using temp tables in 2PC.
But I am not sure that it eliminates ALL problems and that there are no
strange effects related to prepared transactions & global temp tables.
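For context, the current limitation being worked around looks like this in
unpatched Postgres (an illustrative session; the exact error wording may
vary between versions):

BEGIN;
CREATE TEMP TABLE tt(a int);
PREPARE TRANSACTION 'p1';
ERROR:  cannot PREPARE a transaction that has operated on temporary objects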
+ bool isLocalBuf = SmgrIsTemp(smgr) && relpersistence ==
RELPERSISTENCE_TEMP;Won't session-temp tables have local buffers too? Until your next
patch that adds shared_buffers storage for them anyway?
Once again, this is a change from global_shared_temp-1.patch, not from
global_private_temp-1.patch.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 13.08.2019 11:27, Craig Ringer wrote:

As far as I understand, relpages and reltuples are set only when you
perform "analyze" on the table.

Also autovacuum's autoanalyze.
When does that happen?
I created a normal table, populated it with some data, and then waited
several hours, but pg_class was not updated for this table.
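(For reference, autoanalyze is triggered per table once the number of
tuples changed since the last analyze exceeds roughly
autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * reltuples,
i.e. 50 + 10% of the table with default settings, checked every
autovacuum_naptime; it also requires autovacuum to be enabled:

SHOW autovacuum;                        -- must be on
SHOW autovacuum_analyze_threshold;      -- default 50
SHOW autovacuum_analyze_scale_factor;   -- default 0.1
)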
I attach to this mail slightly refactored versions of these patches with
fixes for the issues reported in your review.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
global_shared_temp-2.patch (text/x-patch)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..2d93f6f 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenode = bufHdr->tag.rnode.node.relNode;
+ fctx->record[i].reltablespace = bufHdr->tag.rnode.node.spcNode;
+ fctx->record[i].reldatabase = bufHdr->tag.rnode.node.dbNode;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 38ae240..8a04954 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -608,9 +608,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.rnode.node.dbNode;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.node.spcNode;
+ block_info_array[num_blocks].filenode = bufHdr->tag.rnode.node.relNode;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index c945b28..14d4e48 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &rnode, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
}
}
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 9726020..c99701d 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -1028,7 +1028,7 @@ gistGetFakeLSN(Relation rel)
{
static XLogRecPtr counter = FirstNormalUnloggedLSN;
- if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ if (RelationHasSessionScope(rel))
{
/*
* Temporary relations are only accessible in our session, so a simple
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 09bc6fe..c60effd 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -671,6 +671,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 5962126..bdb6c95 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -763,7 +763,11 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* Session temporary relation may be not yet initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ _bt_initmetapage(BufferGetPage(buf), P_NONE, 0);
+ else
+ _bt_checkpage(rel, buf);
}
else
{
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 3ec67d4..edec8ca 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -213,6 +213,7 @@ void
XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
{
registered_buffer *regbuf;
+ RelFileNodeBackend rnode;
/* NO_IMAGE doesn't make sense with FORCE_IMAGE */
Assert(!((flags & REGBUF_FORCE_IMAGE) && (flags & (REGBUF_NO_IMAGE))));
@@ -227,7 +228,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &rnode, &regbuf->forkno, &regbuf->block);
+ regbuf->rnode = rnode.node;
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -919,7 +921,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blkno;
@@ -948,7 +950,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
flags |= REGBUF_STANDARD;
BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ XLogRegisterBlock(0, &rnode.node, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1009,7 +1011,7 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkNum;
BlockNumber blkno;
@@ -1018,7 +1020,7 @@ log_newpage_buffer(Buffer buffer, bool page_std)
BufferGetTag(buffer, &rnode, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rnode.node, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index a065419..8814afb 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -409,6 +409,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 99ae159..24b2438 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3612,7 +3612,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 3cc886f..a111ddc 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cedb4ee..d11c5b3 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1400,7 +1400,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 0960b33..6c3998f 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (GlobalTempRelationPageIsNotInitialized(rel, page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
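To illustrate what the sequence.c changes do (a sketch, not part of the patch):
fill_seq_with_data() gains a Buffer parameter so that read_seq_tuple(), which
already holds the sequence page exclusively locked, can seed an uninitialized
page in place without taking the lock a second time. The seeded tuple starts
the sequence at 1, which is why SERIAL counters of a global temporary table
restart in each new session while an ordinary sequence keeps counting (see the
regression output further below). A minimal standalone model of the
lock-passing convention, with invented names:

#include <stdio.h>
#include <stdbool.h>

static bool page_locked;            /* stands in for the buffer content lock */

static void
fill_page(bool caller_holds_lock)
{
    /* like fill_seq_with_data(rel, tuple, buf): lock only if the caller
     * has not locked the page already */
    if (!caller_holds_lock)
        page_locked = true;
    printf("seeding page, lock held: %d\n", page_locked);
    if (!caller_holds_lock)
        page_locked = false;        /* like UnlockReleaseBuffer() */
}

int
main(void)
{
    /* DefineSequence()/ResetSequence() path: InvalidBuffer is passed,
     * so the function locks and unlocks for itself */
    fill_page(false);

    /* read_seq_tuple() path: the caller already locked the page */
    page_locked = true;
    fill_page(true);
    page_locked = false;
    return 0;
}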
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index fb2be10..a31f775 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -586,7 +586,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && !IsLocalRelpersistence(stmt->relation->relpersistence))
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -1772,7 +1772,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -7678,6 +7679,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14082,6 +14089,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -14569,14 +14583,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c97bb36..f9b2000 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3265,20 +3265,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index a9b2f8b..2f261b9 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why not always use the persistence of the parent table?
+ * Because unlogged sequences are prohibited, while unlogged
+ * tables with SERIAL columns are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313..3383c35 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2069,7 +2069,7 @@ do_autovacuum(void)
* Check if it is a temp table (presumably, of some other backend's).
* We cannot safely process other backends' temp tables.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
{
/*
* We just ignore it if the owning backend is still active and
@@ -2154,7 +2154,7 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
continue;
relid = classForm->oid;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index e8ffa04..2004d2f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3483,6 +3483,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
{
ReorderBufferTupleCidKey key;
ReorderBufferTupleCidEnt *ent;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blockno;
bool updated_mapping = false;
@@ -3496,7 +3497,8 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
* get relfilenode from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &rnode, &forkno, &blockno);
+ key.relnode = rnode.node;
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 6f3a402..76ce953 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -556,7 +556,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
int buf_id;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -710,7 +710,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
Block bufBlock;
bool found;
bool isExtend;
- bool isLocalBuf = SmgrIsTemp(smgr);
+ bool isLocalBuf = SmgrIsTemp(smgr) && relpersistence == RELPERSISTENCE_TEMP;
*hit = false;
@@ -1010,7 +1010,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1532,7 +1532,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1543,7 +1544,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -1845,8 +1847,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rnode.node.spcNode;
+ item->relNode = bufHdr->tag.rnode.node.relNode;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2559,7 +2561,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rnode.node, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2631,7 +2633,7 @@ BufferGetBlockNumber(Buffer buffer)
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2696,7 +2698,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rnode.node, buf->tag.rnode.backend);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -2930,7 +2932,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
int i;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileNodeBackendIsLocalTemp(rnode))
{
if (rnode.backend == MyBackendId)
DropRelFileNodeLocalBuffers(rnode.node, forkNum, firstDelBlock);
@@ -2958,11 +2960,11 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
* We could check forkNum and blockNum as well as the rnode, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode))
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -2985,24 +2987,24 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
{
int i,
n = 0;
- RelFileNode *nodes;
+ RelFileNodeBackend *nodes;
bool use_bsearch;
if (nnodes == 0)
return;
- nodes = palloc(sizeof(RelFileNode) * nnodes); /* non-local relations */
+ nodes = palloc(sizeof(RelFileNodeBackend) * nnodes); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
for (i = 0; i < nnodes; i++)
{
- if (RelFileNodeBackendIsTemp(rnodes[i]))
+ if (RelFileNodeBackendIsLocalTemp(rnodes[i]))
{
if (rnodes[i].backend == MyBackendId)
DropRelFileNodeAllLocalBuffers(rnodes[i].node);
}
else
- nodes[n++] = rnodes[i].node;
+ nodes[n++] = rnodes[i];
}
/*
@@ -3025,11 +3027,11 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
/* sort the list of rnodes if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(nodes, n, sizeof(RelFileNodeBackend), rnode_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileNodeBackend *rnode = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
@@ -3044,7 +3046,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, nodes[j]))
{
rnode = &nodes[j];
break;
@@ -3054,7 +3056,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
else
{
rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
+ nodes, n, sizeof(RelFileNodeBackend),
rnode_comparator);
}
@@ -3063,7 +3065,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, (*rnode)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3102,11 +3104,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rnode.node.dbNode == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3136,7 +3138,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpath(buf->tag.rnode, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3204,7 +3206,8 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3251,13 +3254,15 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node)
+ || bufHdr->tag.rnode.backend != rel->rd_backend)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3305,13 +3310,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rnode.node.dbNode == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4051,7 +4056,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpath(buf->tag.rnode, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4075,7 +4080,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpath(bufHdr->tag.rnode, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4093,7 +4098,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4108,22 +4113,27 @@ local_buffer_write_error_callback(void *arg)
static int
rnode_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileNodeBackend n1 = *(const RelFileNodeBackend *) p1;
+ RelFileNodeBackend n2 = *(const RelFileNodeBackend *) p2;
- if (n1.relNode < n2.relNode)
+ if (n1.node.relNode < n2.node.relNode)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.node.relNode > n2.node.relNode)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.node.dbNode < n2.node.dbNode)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.node.dbNode > n2.node.dbNode)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.node.spcNode < n2.node.spcNode)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.node.spcNode > n2.node.spcNode)
+ return 1;
+
+ if (n1.backend < n2.backend)
+ return -1;
+ else if (n1.backend > n2.backend)
return 1;
else
return 0;
@@ -4359,7 +4369,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileNodeBackendEquals(cur->tag.rnode, next->tag.rnode) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4378,7 +4388,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rnode.node, tag.rnode.backend);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index f5f6a29..6bd5ecb 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ LocalPrefetchBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -111,7 +111,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -209,7 +209,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rnode.node, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -331,14 +331,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -377,12 +377,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index cf7f03f..65eb422 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
BufferGetTag(buf, &rnode, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 07f3c93..204c4cb 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -33,6 +33,7 @@
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
@@ -87,6 +88,18 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to track session relations accessed by this backend.
+ * Data of these relations must be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +165,48 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Remove relation pages from shared buffers */
+ DropRelFileNodesAllBuffers(&rel->rnode, 1);
+
+ /* Delete relation files */
+ mdunlink(rel->rnode, InvalidForkNumber, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or when it is opened and its file doesn't yet exist.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We use a shared-memory exit hook
+ * because cleanup must happen before the backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +273,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln);
pfree(path);
@@ -465,6 +522,19 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * When accessing a session relation, its files may not yet exist for this backend.
+ * If so, create the file and register the session relation for cleanup on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +546,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +723,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* For a session relation we must write the zeroed page so that a subsequent mdnblocks call returns the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -738,12 +814,18 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
+ /*
+ * If we access a session relation, its files may not yet exist for this backend.
+ * Pass EXTENSION_RETURN_NULL so that mdopenfork returns NULL in this case instead of reporting an error.
+ */
+ MdfdVec *v = mdopenfork(reln, forknum, RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode)
+ ? EXTENSION_RETURN_NULL : EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
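The md.c changes above implement a "create on first access, drop at backend
exit" lifecycle for session relation files. A minimal standalone sketch of the
same pattern (not part of the patch; register_once and cleanup_at_exit stand
in for RegisterSessionRelation and TruncateSessionRelations, atexit for
on_shmem_exit, and the relation ids are invented):

#include <stdio.h>
#include <stdlib.h>

typedef struct Node
{
    int relId;
    struct Node *next;
} Node;

static Node *registered;            /* like SessionRelations */

static void
cleanup_at_exit(void)
{
    /* like TruncateSessionRelations(): drop buffers, then unlink files */
    Node *n;
    for (n = registered; n != NULL; n = n->next)
        printf("dropping buffers and files of relation %d\n", n->relId);
}

static void
register_once(int relId)
{
    Node *n;

    /* the exit hook is installed once, before the first registration */
    if (registered == NULL)
        atexit(cleanup_at_exit);

    /* called only when the file is first created for this backend,
     * so each relation enters the list exactly once */
    n = malloc(sizeof(Node));
    n->relId = relId;
    n->next = registered;
    registered = n;
}

int
main(void)
{
    register_once(16384);           /* mdcreate() path */
    register_once(16390);           /* mdopenfork() path: missing file created lazily */
    return 0;                       /* cleanup runs at process exit */
}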
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2488607..86e8fca 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1098,6 +1098,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3301,6 +3305,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0cc9ede..1dff0c8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15593,8 +15593,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * the SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract
+ * SessionRelFirstBackendId to avoid overly long file names.
+ * Segments of session relations use the same prefix (t%d_) as local temporary
+ * relations, so that they can be cleaned up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..f226e7c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add a special bias value to the backend id (since
+ * the maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,10 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
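A worked example of the bias arithmetic (not part of the patch; ids and OIDs
are invented): with MyBackendId = 5 the session backend id becomes 0x40000005,
IsSessionRelationBackendId() is true, and GetRelationPath() subtracts the bias
again so the relation file keeps the short t5_ prefix shared with local
temporary relations:

#include <stdio.h>

#define SessionRelFirstBackendId (0x40000000)

int
main(void)
{
    int MyBackendId = 5;            /* ordinary backend id */
    int sessionId = MyBackendId + SessionRelFirstBackendId;

    /* IsSessionRelationBackendId() reduces to this comparison */
    printf("session relation: %d\n", sessionId >= SessionRelFirstBackendId);

    /* GetRelationPath() subtracts the bias, so the main fork of relation
     * 16384 in database 13593 would live in base/13593/t5_16384 */
    printf("base/13593/t%d_16384\n", sessionId - SessionRelFirstBackendId);
    return 0;
}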
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index df2dda7..7adb96b 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,16 +90,17 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileNodeBackend rnode; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rnode.node.spcNode = InvalidOid, \
+ (a).rnode.node.dbNode = InvalidOid, \
+ (a).rnode.node.relNode = InvalidOid, \
+ (a).rnode.backend = InvalidBackendId, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
@@ -113,7 +114,7 @@ typedef struct buftag
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileNodeBackendEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
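Extending BufferTag with the backend id is what allows two sessions to keep
pages of the same global temporary table in shared buffers without collisions:
the tags differ in the backend component even though the RelFileNode is
identical. A standalone sketch (not part of the patch) with a simplified tag
layout and invented OIDs:

#include <stdio.h>
#include <stdbool.h>

typedef unsigned int Oid;
typedef int BackendId;

typedef struct { Oid spcNode, dbNode, relNode; } RelFileNode;
typedef struct { RelFileNode node; BackendId backend; } RelFileNodeBackend;

static bool
tags_equal(RelFileNodeBackend a, RelFileNodeBackend b)
{
    return a.node.spcNode == b.node.spcNode &&
           a.node.dbNode == b.node.dbNode &&
           a.node.relNode == b.node.relNode &&
           a.backend == b.backend;          /* the new component */
}

int
main(void)
{
    /* the same catalog relation, seen by two different sessions */
    RelFileNodeBackend s1 = {{1663, 13593, 16384}, 0x40000005};
    RelFileNodeBackend s2 = {{1663, 13593, 16384}, 0x40000007};

    printf("tags equal: %d\n", tags_equal(s1, s2));  /* 0: distinct buffers */
    return 0;
}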
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 509f4b7..3315fa0 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -205,7 +205,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 4ef6d8d..bac7a31 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -229,6 +229,13 @@ typedef PageHeaderData *PageHeader;
#define PageIsNew(page) (((PageHeader) (page))->pd_upper == 0)
/*
+ * Check whether a page of a global temporary relation has not yet been initialized in this backend
+ */
+#define GlobalTempRelationPageIsNotInitialized(rel, page) \
+ ((rel)->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+
+
+/*
* PageGetItemId
* Returns an item identifier of a page.
*/
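GlobalTempRelationPageIsNotInitialized() is the trigger for the lazy
per-backend initialization used earlier in the patch (the btree metapage in
_bt_getbuf and the sequence page in read_seq_tuple): mdread() zero-fills a
missing block of a session relation, and an all-zero page header has
pd_upper == 0, which is exactly what PageIsNew() tests. A sketch (not part of
the patch) with a simplified header layout; the real PageHeaderData has more
fields:

#include <stdio.h>
#include <stdint.h>

typedef struct
{
    uint64_t pd_lsn_placeholder;    /* simplified leading fields */
    uint16_t pd_checksum;
    uint16_t pd_flags;
    uint16_t pd_lower;
    uint16_t pd_upper;              /* 0 on an all-zero page */
} FakePageHeader;

int
main(void)
{
    static char block[8192];        /* what mdread() hands back: all zeroes */
    FakePageHeader *ph = (FakePageHeader *) block;

    printf("PageIsNew: %d\n", ph->pd_upper == 0);   /* 1: lazy init kicks in */
    return 0;
}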
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether it is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether it is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether it is a local temporary relation, which exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b0fe19e..b361851 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -328,6 +328,17 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * Relation persistence is either TEMP or SESSION
+ */
+#define IsLocalRelpersistence(relpersistence) \
+ ((relpersistence) == RELPERSISTENCE_TEMP || (relpersistence) == RELPERSISTENCE_SESSION)
+
+/*
+ * Relation is either a global or a local temp table
+ */
+#define RelationHasSessionScope(relation) \
+ IsLocalRelpersistence(((relation)->rd_rel->relpersistence))
/*
* ViewOptions
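After this patch there are four persistence codes, two of which keep their
data private to a backend. A tiny standalone check of the new predicate (not
part of the patch; it just restates the macros from pg_class.h and rel.h):

#include <stdio.h>

#define RELPERSISTENCE_PERMANENT 'p'
#define RELPERSISTENCE_UNLOGGED  'u'
#define RELPERSISTENCE_TEMP      't'
#define RELPERSISTENCE_SESSION   's'

#define IsLocalRelpersistence(relpersistence) \
    ((relpersistence) == RELPERSISTENCE_TEMP || \
     (relpersistence) == RELPERSISTENCE_SESSION)

int
main(void)
{
    char kinds[] = {RELPERSISTENCE_PERMANENT, RELPERSISTENCE_UNLOGGED,
                    RELPERSISTENCE_TEMP, RELPERSISTENCE_SESSION};
    int i;

    for (i = 0; i < 4; i++)
        printf("%c: backend-local data: %s\n", kinds[i],
               IsLocalRelpersistence(kinds[i]) ? "yes" : "no");
    return 0;
}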
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 74b5077..44df4e0 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -85,3 +85,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test, with small changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to a temporary child relation
+# of another session, but blocks when scanning the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..ae1adb6
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,247 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fc0f141..507cf7d 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 68ac56a..3890777 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..3058b9b
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,151 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
[Attachment: global_private_temp-2.patch (text/x-patch)]
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 9726020..c99701d 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -1028,7 +1028,7 @@ gistGetFakeLSN(Relation rel)
{
static XLogRecPtr counter = FirstNormalUnloggedLSN;
- if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ if (RelationHasSessionScope(rel))
{
/*
* Temporary relations are only accessible in our session, so a simple
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index f1ff01e..e92d324 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -673,6 +673,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 9c1f7de..97cc9e4 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -763,7 +763,11 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* A session temporary relation may not yet be initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ _bt_initmetapage(BufferGetPage(buf), P_NONE, 0);
+ else
+ _bt_checkpage(rel, buf);
}
else
{
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index a065419..8814afb 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -409,6 +409,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 3e1d406..aaa2c49 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3590,7 +3590,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 3cc886f..a111ddc 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cedb4ee..d11c5b3 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1400,7 +1400,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 0960b33..6c3998f 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (GlobalTempRelationPageIsNotInitialized(rel, page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index fb2be10..a31f775 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -586,7 +586,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && !IsLocalRelpersistence(stmt->relation->relpersistence))
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -1772,7 +1772,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -7678,6 +7679,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14082,6 +14089,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -14569,14 +14583,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c97bb36..f9b2000 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3265,20 +3265,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 6e5768c..ea6989b 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why not always use the persistence of the parent table?
+ * Because although unlogged sequences are prohibited,
+ * unlogged tables with SERIAL fields are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313..3383c35 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2069,7 +2069,7 @@ do_autovacuum(void)
* Check if it is a temp table (presumably, of some other backend's).
* We cannot safely process other backends' temp tables.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
{
/*
* We just ignore it if the owning backend is still active and
@@ -2154,7 +2154,7 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
continue;
relid = classForm->oid;
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 07f3c93..5db79ec 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -33,6 +33,7 @@
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
@@ -87,6 +88,18 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to collect information about session relations created by this backend.
+ * Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +165,45 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Delete relation files */
+ mdunlink(rel->rnode, InvalidForkNumber, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or when it is opened and its file doesn't yet exist.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We use a shared-memory exit hook, because
+ * cleanup must be performed before the backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +270,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln);
pfree(path);
@@ -465,6 +519,19 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * When accessing a session relation, this backend may not yet have any files for it.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +543,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +720,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* For a session relation we need to write the zero page to provide a correct result for subsequent mdnblocks calls */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -738,12 +811,18 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
+ /*
+ * When accessing a session relation, this backend may not yet have any files for it.
+ * Pass EXTENSION_RETURN_NULL to make mdopen return NULL in this case instead of reporting an error.
+ */
+ MdfdVec *v = mdopenfork(reln, forknum, RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode)
+ ? EXTENSION_RETURN_NULL : EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2488607..86e8fca 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1098,6 +1098,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3301,6 +3305,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0cc9ede..1dff0c8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15593,8 +15593,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * the SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract
+ * SessionRelFirstBackendId to avoid overly long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary relations,
+ * so they can be cleaned up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..f226e7c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add a special bias value to the backend id (since
+ * the maximum number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,10 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 4ef6d8d..bac7a31 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -229,6 +229,13 @@ typedef PageHeaderData *PageHeader;
#define PageIsNew(page) (((PageHeader) (page))->pd_upper == 0)
/*
+ * Check whether a page of a global temporary relation has not yet been initialized
+ */
+#define GlobalTempRelationPageIsNotInitialized(rel, page) \
+ ((rel)->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+
+
+/*
* PageGetItemId
* Returns an item identifier of a page.
*/
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether it is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether it is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether it is a local temporary relation that exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b0fe19e..dfa2044 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -328,6 +328,17 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * Relation persistence is either TEMP or SESSION
+ */
+#define IsLocalRelpersistence(relpersistence) \
+ ((relpersistence) == RELPERSISTENCE_TEMP || (relpersistence) == RELPERSISTENCE_SESSION)
+
+/*
+ * Relation is either a global or a local temp table
+ */
+#define RelationHasSessionScope(relation) \
+ IsLocalRelpersistence(((relation)->rd_rel->relpersistence))
/*
* ViewOptions
@@ -524,7 +535,7 @@ typedef struct ViewOptions
* True if relation's pages are stored in local buffers.
*/
#define RelationUsesLocalBuffers(relation) \
- ((relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ RelationHasSessionScope(relation)
/*
* RELATION_IS_LOCAL
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 69ae227..95919f8 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -87,3 +87,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test with small changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to temporary child relation
+# of another session, and blocks when scanning the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..ae1adb6
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,247 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fc0f141..507cf7d 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 68ac56a..3890777 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..3058b9b
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,151 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
On Tue, 13 Aug 2019 at 21:50, Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
wrote:

> As far as I understand relpages and reltuples are set only when you
> perform "analyze" of the table.

Also autovacuum's autoanalyze.

> When does it happen?
> I have created a normal table, populated it with some data and then waited
> several hours, but pg_class was not updated for this table.

heap_vacuum_rel() in src/backend/access/heap/vacuumlazy.c, below

    * Update statistics in pg_class.

which I'm pretty sure is common to explicit vacuum and autovacuum. I
haven't run up a test to verify 100%, but most DBs would never have relpages
etc. set if autovacuum didn't do it, since most aren't explicitly VACUUMed
at all.

I thought it was done when autovacuum ran an analyze, but it looks like it's
all autovacuum. Try setting very aggressive autovacuum thresholds and
inserting + deleting a bunch of tuples, maybe.

> I attach to this mail slightly refactored versions of these patches with
> fixes for the issues reported in your review.

Thanks.

Did you have a chance to consider my questions too? I see a couple of
things where there's no patch change, which is fine, but I'd be interested
in your thoughts on the question/issue in those cases.

--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
On 16.08.2019 9:25, Craig Ringer wrote:

> heap_vacuum_rel() in src/backend/access/heap/vacuumlazy.c, below
>
>     * Update statistics in pg_class.
>
> which I'm pretty sure is common to explicit vacuum and autovacuum.

Sorry, I already understood it myself.
But to make vacuum process the table it is necessary to remove or update
some rows in it.
It seems to be yet another Postgres problem, which was noticed by
Darafei Praliaskouski some time ago: append-only tables are never
processed by autovacuum.

> Did you have a chance to consider my questions too? I see a couple of
> things where there's no patch change, which is fine, but I'd be
> interested in your thoughts on the question/issue in those cases.

Sorry, maybe I didn't notice some of your questions. I have a feeling that
I have replied to all of your comments/questions.
Right now I have reread all of this thread and see two open issues:

1. Statistics for global temporary tables (including number of tuples,
number of pages and the all-visible flag).
My position is the following: while in most cases it should not be a
problem, because users rarely create indexes on or analyze temporary
tables, there can be situations when differences between the data sets of a
global temporary table in different backends can really be a problem.
Unfortunately I can not propose a good solution for this problem. It is
certainly possible to create some private (per-backend) cache for this
metadata, but it seems to require changes in many places.

2. Your concerns about the performance penalty of global temp tables
accessed through shared buffers compared with local temp tables accessed
through local buffers.
I think that this concern is no longer relevant, because there is now an
implementation of global temp tables using local buffers.
In any case, my experiments don't show a significant difference in access
speed between shared and local buffers. As shared buffers are usually much
larger than local buffers, there are more chances to hold the whole temp
relation in memory without spilling it to disk. In that case access to a
global temp table will be much faster than access to a local temp table.
But the fact is that right now, in the most frequent scenario of temp table
usage:

SELECT ... INTO TempTable FROM PersistentTable WHERE ...;
SELECT * FROM TempTable;

local temp tables are more efficient than global temp tables accessed
through shared buffers.
I think this is explained by caching and eviction policies.
When all content of the temp table is pulled into memory (pg_prewarm), the
global temp table with shared buffers becomes faster.
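For reference, pg_prewarm is a standard contrib module; a minimal sketch of
warming a whole relation into shared buffers before scanning it (assuming
the gt table used in the experiments later in this thread):

create extension if not exists pg_prewarm;
-- read every block of the relation into shared buffers
select pg_prewarm('gt');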
If I forgot or did not notice some of your questions, would you be so kind
as to repeat them?
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Fri, 16 Aug 2019 at 15:30, Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
wrote:

> It is certainly possible to create some private (per-backend) cache for
> this metadata, but it seems to require changes in many places.

Yeah. I don't really like just sharing them, but it's not that bad either.

> As shared buffers are usually much larger than local buffers, there are
> more chances to hold the whole temp relation in memory without spilling
> it to disk. In that case access to a global temp table will be much
> faster than access to a local temp table.

You ignore the costs of evicting non-temporary data from shared_buffers,
i.e. contention for space. Also the increased chance of backends being
forced to do direct write-out due to lack of s_b space for dirty buffers.

> When all content of the temp table is pulled into memory (pg_prewarm),
> the global temp table with shared buffers becomes faster.

Who would ever do that?

> If I forgot or did not notice some of your questions, would you be so
> kind as to repeat them?

--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
On Fri, 16 Aug 2019 at 15:30, Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
wrote:

> If I forgot or did not notice some of your questions, would you be so
> kind as to repeat them?

Sent early by accident.

Repeating questions:

Why do you need to do all this indirection with changing RelFileNode to
RelFileNodeBackend in the bufmgr, changing BufferGetTag etc.? Similarly,
your changes of RelFileNodeBackendIsTemp to RelFileNodeBackendIsLocalTemp.
I'm guessing you did it the way you did to lay the groundwork for
cross-backend sharing, but if so it should IMO be in your second patch that
adds support for using shared_buffers for temp tables, not in the first
patch that adds a minimal global temp tables implementation. Maybe my
understanding of the existing temp table mechanics is just insufficient, as
I see RelFileNodeBackendIsTemp is already used in some aspects of existing
temp relation handling.

Did you look into my suggestion of extending the relmapper so that global
temp tables would have a relfilenode of 0 like pg_class etc., and use a
backend-local map of oid-to-relfilenode mappings?

Similarly, TruncateSessionRelations probably shouldn't need to exist in
this patch in its current form; there's no shared_buffers use to clean and
the same file cleanup mechanism should handle both session-temp and
local-temp relfilenodes.

Sequence initialization ignores sequence startval/firstval settings. Why?

+       value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */

Doesn't this change the test outcome for RELPERSISTENCE_UNLOGGED?:

-       else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+       else if (newrelpersistence != RELPERSISTENCE_TEMP)

--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
On 16.08.2019 11:37, Craig Ringer wrote:

> Sent early by accident.
>
> Repeating questions:

Sorry, but I have answered them (my e-mail from 13.08)!
It looks like you have looked at the wrong version of the patch:
global_shared_temp-1.patch instead of global_private_temp-1.patch, which
implements global tables accessed through local buffers.

> Why do you need to do all this indirection with changing RelFileNode to
> RelFileNodeBackend in the bufmgr, changing BufferGetTag etc.?

Sorry, are you really speaking about global_private_temp-1.patch?
This patch doesn't change the bufmgr file at all.
Maybe you looked at the other patch - global_shared_temp-1.patch - which
accesses shared tables through shared buffers and so has to change the
buffer tag to include the backend ID in it.

> Did you look into my suggestion of extending the relmapper so that global
> temp tables would have a relfilenode of 0 like pg_class etc., and use a
> backend-local map of oid-to-relfilenode mappings?
>
> Similarly, TruncateSessionRelations probably shouldn't need to exist in
> this patch in its current form; there's no shared_buffers use to clean
> and the same file cleanup mechanism should handle both session-temp and
> local-temp relfilenodes.

In global_private_temp-1.patch TruncateSessionRelations does nothing
with shared buffers; it just deletes relation files.

> Sequence initialization ignores sequence startval/firstval settings. Why?
>
> +       value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */

I am handling only the case of implicitly created sequences for
SERIAL/BIGSERIAL columns.
Is it possible to explicitly specify the initial value and step for them?
If so, this place should definitely be rewritten.
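(For the record, the implicit sequence behind a SERIAL column can indeed be
given an explicit start value and step, just not in the column definition
itself; e.g., assuming the default <table>_<column>_seq naming:

create table t(id serial);
alter sequence t_id_seq increment by 10;
alter sequence t_id_seq restart with 100;

so the initial value and step can not simply be assumed to be 1.)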
> Doesn't this change the test outcome for RELPERSISTENCE_UNLOGGED?:
>
> -       else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
> +       else if (newrelpersistence != RELPERSISTENCE_TEMP)

The RELPERSISTENCE_UNLOGGED case is handled in the previous IF branch.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 16.08.2019 11:32, Craig Ringer wrote:

> You ignore the costs of evicting non-temporary data from shared_buffers,
> i.e. contention for space. Also the increased chance of backends being
> forced to do direct write-out due to lack of s_b space for dirty buffers.
>
>> When all content of the temp table is pulled into memory (pg_prewarm),
>> the global temp table with shared buffers becomes faster.
>
> Who would ever do that?

I decided to redo my experiments and now get different results, which
illustrate the advantages of global temp tables with shared buffers.

I performed the following test on my desktop with an SSD and 16GB of RAM,
and Postgres with the default configuration except shared_buffers increased
to 1GB.
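The buffer_usage view referenced in the transcript below is not a built-in;
here is a plausible definition over the contrib pg_buffercache extension
that yields the three columns shown (an assumption - the exact view used
was not posted):

create extension if not exists pg_buffercache;
create view buffer_usage as
select c.relname,
       pg_size_pretty(count(*) * 8192) as buffered,  -- assumes 8kB blocks
       round(100.0 * count(*) /
             (select setting from pg_settings
               where name = 'shared_buffers')::integer, 1) as buffer_percent,
       round(100.0 * count(*) * 8192 / pg_relation_size(c.oid), 1)
             as percent_of_relation
from pg_buffercache b
join pg_class c on b.relfilenode = pg_relation_filenode(c.oid)
join pg_database d on b.reldatabase = d.oid and d.datname = current_database()
group by c.oid, c.relname
order by count(*) desc;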
postgres=# create table big(pk bigint primary key, val bigint);
CREATE TABLE
postgres=# insert into big values
(generate_series(1,100000000),generate_series(1,100000000)/100);
INSERT 0 100000000
postgres=# select * from buffer_usage limit 3;
relname | buffered | buffer_percent | percent_of_relation
----------------+------------+----------------+---------------------
big | 678 MB | 66.2 | 16.1
big_pkey | 344 MB | 33.6 | 16.1
pg_am | 8192 bytes | 0.0 | 20.0
postgres=# create temp table lt(key bigint, count bigint);
postgres=# \timing
Timing is on.
postgres=# insert into lt (select count(*),val as key from big group by
val);
INSERT 0 1000001
Time: 43265.491 ms (00:43.265)
postgres=# select sum(count) from lt;
sum
--------------
500000500000
(1 row)
Time: 94.194 ms
postgres=# create global temp table gt(key bigint, count bigint);
postgres=# insert into gt (select count(*),val as key from big group by
val);
INSERT 0 1000001
Time: 42952.671 ms (00:42.953)
postgres=# select sum(count) from gt;
sum
--------------
500000500000
(1 row)
Time: 35.906 ms
postgres=# select * from buffer_usage limit 3;
relname | buffered | buffer_percent | percent_of_relation
----------+----------+----------------+---------------------
big | 679 MB | 66.3 | 16.1
big_pkey | 300 MB | 29.3 | 14.0
gt | 42 MB | 4.1 | 100.0
So the time to store the result in the global temp table is slightly lower
than for the local temp table, and scanning the global temp table is about
2.6 times faster (35.9 ms vs. 94.2 ms)!
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
I did more investigation of the performance of global temp tables with
shared buffers vs. vanilla (local) temp tables.

1. Combination of persistent and temporary tables in the same query.

Preparation:
create table big(pk bigint primary key, val bigint);
insert into big values
(generate_series(1,100000000),generate_series(1,100000000));
create temp table lt(key bigint, count bigint);
create global temp table gt(key bigint, count bigint);

The size of the table is about 6GB; I ran this test on a desktop with 16GB
of RAM and Postgres with 1GB shared buffers.
I ran two queries:

insert into T (select count(*),pk/P as key from big group by key);
select sum(count) from T;

where P is 100, 10 or 1 and T is the name of the temp table (lt or gt).
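For example, with P = 100 (the 1% case) and the local table, the concrete
pair of statements would be:

insert into lt (select count(*),pk/100 as key from big group by key);
select sum(count) from lt;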
The table below contains the times of both queries in msec, shown as
insert time / select time:

Percent of selected data      1%            10%            100%
Local temp table              44610 / 90    47920 / 891    63414 / 21612
Global temp table             44669 / 35    47939 / 298    59159 / 26015
As you can see, the insertion time is almost the same for both, while
traversal of the temporary table is two to three times faster for the
global temp table when it fits in RAM together with the persistent table,
and slightly worse when it doesn't fit.
2. Temporary-table-only access.

The same system, but Postgres is configured with shared_buffers=10GB,
max_parallel_workers = 4, max_parallel_workers_per_gather = 4.
Local temp tables:
create temp table local_temp(x1 bigint, x2 bigint, x3 bigint, x4 bigint,
x5 bigint, x6 bigint, x7 bigint, x8 bigint, x9 bigint);
insert into local_temp values
(generate_series(1,100000000),0,0,0,0,0,0,0,0);
select sum(x1) from local_temp;
Global temp tables:
create global temporary table global_temp(x1 bigint, x2 bigint, x3
bigint, x4 bigint, x5 bigint, x6 bigint, x7 bigint, x8 bigint, x9 bigint);
insert into global_temp values
(generate_series(1,100000000),0,0,0,0,0,0,0,0);
select sum(x1) from global_temp;
Results (msec):

                     Insert    Select
Local temp table     37489     48322
Global temp table    44358      3003

So insertion into the local temp table is slightly faster, but select is
16 times slower!
Conclusion:
Assuming that the temp table fits in memory, global temp tables with
shared buffers provide better performance than local temp tables.
I didn't consider global temp tables with local buffers here, because for
them the results should be similar to local temp tables.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
pá 16. 8. 2019 v 16:12 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:

> Conclusion:
> Assuming that the temp table fits in memory, global temp tables with
> shared buffers provide better performance than local temp tables.

Probably there is not a reason why shared buffers should be slower than
local buffers when the system is under low load.

Access to shared memory is protected by spin locks (which are cheap for a
few processes), so tests with one or a few processes are not too important
(or they are just one side of the space).

Another topic can be performance on MS Windows - there are stories about
the not-perfect performance of shared memory there.

Regards
Pavel
On 16.08.2019 20:17, Pavel Stehule wrote:

> Access to shared memory is protected by spin locks (which are cheap for a
> few processes), so tests with one or a few processes are not too
> important (or they are just one side of the space).

Here is one more test, which simulates access to temp tables under high
load: "upsert" into a temp table from multiple connections.

create global temp table gtemp (x integer primary key, y bigint);

upsert.sql:
insert into gtemp values (random() * 1000000, 0) on conflict(x) do
update set y=gtemp.y+1;

pgbench -c 10 -M prepared -T 100 -P 1 -n -f upsert.sql postgres

I failed to find a standard way in pgbench to perform per-session
initialization to create the local temp table, so I just inserted this code
into the pgbench source:
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 570cf33..af6a431 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -5994,6 +5994,7 @@ threadRun(void *arg)
{
if ((state[i].con = doConnect()) == NULL)
goto done;
+                       executeStatement(state[i].con, "create temp table ltemp(x integer primary key, y bigint)");
}
}
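The upsert script for the local-temp run is not shown; presumably it was
the same statement targeting ltemp, i.e.:

insert into ltemp values (random() * 1000000, 0) on conflict(x) do
update set y=ltemp.y+1;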
The results are the following:

Global temp table: 117526 TPS
Local temp table: 107802 TPS

So even for this workload global temp tables with shared buffers are a
little bit faster.
I will be pleased if you can propose some other testing scenario.
ne 18. 8. 2019 v 9:02 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:

> Global temp table: 117526 TPS
> Local temp table: 107802 TPS
>
> So even for this workload global temp tables with shared buffers are a
> little bit faster.
> I will be pleased if you can propose some other testing scenario.

please, try to increase number of connections.

Regards
Pavel
On 18.08.2019 11:28, Pavel Stehule wrote:

> please, try to increase number of connections.
With 20 connections and 4 pgbench threads the results are similar: 119k TPS
for global temp tables and 115k TPS for local temp tables.

I have tried yet another scenario: read-only access to temp tables:

\set id random(1,10000000)
select sum(y) from ltemp where x=:id;

Tables are created and initialized at pgbench session startup:
knizhnik@knizhnik:~/postgresql$ git diff
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 570cf33..95295b0 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -5994,6 +5994,8 @@ threadRun(void *arg)
{
if ((state[i].con = doConnect()) == NULL)
goto done;
+                       executeStatement(state[i].con, "create temp table ltemp(x integer primary key, y bigint)");
+                       executeStatement(state[i].con, "insert into ltemp values (generate_series(1,1000000), generate_series(1,1000000))");
}
}
Results for 10 connections with 10 million inserted records per table,
and for 100 connections with 1 million inserted records per table (TPS):

#connections:                        10     100
local temp                           68k    90k
global temp, shared_buffers=1G       63k    61k
global temp, shared_buffers=10G      150k   150k
So temporary tables with local buffers are slightly faster when the data
doesn't fit in shared buffers, but significantly slower when it fits.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 19.08.2019 11:51, Konstantin Knizhnik wrote:

> So temporary tables with local buffers are slightly faster when the data
> doesn't fit in shared buffers, but significantly slower when it fits.
All previously reported results were produced on my desktop.
I also ran this read-only test on a huge IBM server (POWER9, 2 NUMA nodes,
176 CPUs, 1TB RAM).
Here the difference between local and global tables is not so large:

Local temp: 739k TPS
Global temp: 924k TPS
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
po 19. 8. 2019 v 13:16 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:

> Here the difference between local and global tables is not so large:
>
> Local temp: 739k TPS
> Global temp: 924k TPS

Isn't the difference between local temp buffers and global temp buffers
caused by a too-low value of temp_buffers?

Pavel
On 19.08.2019 14:25, Pavel Stehule wrote:

> Isn't the difference between local temp buffers and global temp buffers
> caused by a too-low value of temp_buffers?
Certainly, the default (small) temp buffer size plays a role.
But on this IBM host the difference is not so important.
Result with local temp tables and temp_buffers = 1GB: 859k TPS.
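(Note for anyone reproducing this: temp_buffers takes effect only if it is
set before the first use of temporary tables in a session; later changes
within the same session are ignored. For example:

set temp_buffers = '1GB';  -- must run before the session touches any temp table
)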
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
> Certainly, the default (small) temp buffer size plays a role.
> But on this IBM host the difference is not so important.
> Result with local temp tables and temp_buffers = 1GB: 859k TPS.

It is a little bit unexpected result. I understand that it is partially a
generic problem of access to smaller dedicated caches versus access to a
bigger shared cache.

But it is hard to imagine that access to the local cache is 10% slower than
access to the shared cache. Maybe there is some bottleneck - maybe our
implementation of local buffers is suboptimal.

Using local buffers for global temporary tables can be interesting for
another reason - they use temporary files, and temporary files can be
placed on ephemeral storage in the Amazon cloud (with much better
performance than persistent storage).
On 19.08.2019 18:53, Pavel Stehule wrote:

> But it is hard to imagine that access to the local cache is 10% slower
> than access to the shared cache. Maybe there is some bottleneck - maybe
> our implementation of local buffers is suboptimal.

It may be caused by the system memory allocator - in the case of shared
buffers we do not need to ask the OS to allocate more memory.
> Using local buffers for global temporary tables can be interesting for
> another reason - they use temporary files, and temporary files can be
> placed on ephemeral storage in the Amazon cloud (with much better
> performance than persistent storage).
My assumption is that temporary tables almost always fit in memory. So in
most cases there is no need to write data to files at all.

As I wrote at the beginning of this thread, one of the problems with
temporary tables is that it is not possible to use them at a replica.
Global temp tables allow sharing metadata between master and replica.

I performed a small investigation of how difficult it would be to support
inserts into temp tables at a replica.
My first impression was that it could be done in a tricky but simple way,
by changing just three places:
1. The check prohibiting non-select statements in read-only transactions
2. Xid assignment (return FrozenTransactionId)
3. Transaction commit/abort
With these changes I managed to make global temp tables work normally at a
replica. But there is one problem with this approach: it is not possible to
undo changes in temp tables, so rollback doesn't work.

I tried another solution, assigning dummy Xids to standby transactions.
But this approach requires many more changes:
- Initialize a page for such transactions in the CLOG
- Mark transactions as committed/aborted in the CLOG
- Change the snapshot checks in the visibility functions
And still I didn't find a safe way to clean up the CLOG space.

An alternative solution is to implement a "local CLOG" for such
transactions. The straightforward implementation is a hash table, but that
may cause memory overflow if we have a long-living backend which performs a
huge number of transactions. Also in this case we need to change the
visibility check functions.

So I have implemented the simplest solution: frozen xids plus forced
backend termination in case of transaction rollback (so the user will not
see inconsistent behavior).
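To illustrate the behavior this approach implies at a hot standby (a
hypothetical session sketch, not actual patch output, using the gt table
from the earlier examples):

begin;
insert into gt values (1, 1);  -- allowed: tuples are stamped with FrozenTransactionId
commit;                        -- nothing needs to be recorded in the CLOG

begin;
insert into gt values (2, 2);
rollback;                      -- changes can not be undone, so the backend is terminated with FATAL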
The attached global_private_temp_replica.patch implements this approach.
It would be nice if somebody could suggest a better solution for temporary
tables at a replica.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
global_private_temp_replica.patch (text/x-patch)
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 9726020..c99701d 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -1028,7 +1028,7 @@ gistGetFakeLSN(Relation rel)
{
static XLogRecPtr counter = FirstNormalUnloggedLSN;
- if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ if (RelationHasSessionScope(rel))
{
/*
* Temporary relations are only accessible in our session, so a simple
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index f1ff01e..e92d324 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -673,6 +673,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 9c1f7de..97cc9e4 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -763,7 +763,11 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* A session temporary relation may not yet be initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ _bt_initmetapage(BufferGetPage(buf), P_NONE, 0);
+ else
+ _bt_checkpage(rel, buf);
}
else
{
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 5b759ec..fda7573 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -69,9 +69,10 @@ GetNewTransactionId(bool isSubXact)
return FullTransactionIdFromEpochAndXid(0, BootstrapTransactionId);
}
- /* safety check, we should never get this far in a HS standby */
+ /* Make it possible to access global temporary tables at standby */
if (RecoveryInProgress())
- elog(ERROR, "cannot assign TransactionIds during recovery");
+ return FullTransactionIdFromEpochAndXid(0, FrozenTransactionId);
+ /* elog(ERROR, "cannot assign TransactionIds during recovery"); */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 1bbaeee..9857f1c 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1206,7 +1206,7 @@ static TransactionId
RecordTransactionCommit(void)
{
TransactionId xid = GetTopTransactionIdIfAny();
- bool markXidCommitted = TransactionIdIsValid(xid);
+ bool markXidCommitted = TransactionIdIsNormal(xid);
TransactionId latestXid = InvalidTransactionId;
int nrels;
RelFileNode *rels;
@@ -1624,7 +1624,7 @@ RecordTransactionAbort(bool isSubXact)
* rels to delete (note that this routine is not responsible for actually
* deleting 'em). We cannot have any child XIDs, either.
*/
- if (!TransactionIdIsValid(xid))
+ if (!TransactionIdIsNormal(xid))
{
/* Reset XactLastRecEnd until the next transaction writes something */
if (!isSubXact)
@@ -2991,6 +2991,9 @@ CommitTransactionCommand(void)
* and then clean up.
*/
case TBLOCK_ABORT_PENDING:
+ if (GetCurrentTransactionIdIfAny() == FrozenTransactionId)
+ elog(FATAL, "Transaction is aborted at standby");
+
AbortTransaction();
CleanupTransaction();
s->blockState = TBLOCK_DEFAULT;
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index a065419..8814afb 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -409,6 +409,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 3e1d406..aaa2c49 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3590,7 +3590,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 3cc886f..a111ddc 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cedb4ee..d11c5b3 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1400,7 +1400,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 0960b33..6c3998f 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (GlobalTempRelationPageIsNotInitialized(rel, page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index fb2be10..a31f775 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -586,7 +586,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && !IsLocalRelpersistence(stmt->relation->relpersistence))
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -1772,7 +1772,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -7678,6 +7679,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14082,6 +14089,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -14569,14 +14583,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index dbd7dd9..efe6f21 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -788,6 +788,9 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
if (isTempNamespace(get_rel_namespace(rte->relid)))
continue;
+ if (get_rel_persistence(rte->relid) == RELPERSISTENCE_SESSION)
+ continue;
+
PreventCommandIfReadOnly(CreateCommandTag((Node *) plannedstmt));
}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 98e9948..1a9170b 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -124,7 +124,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
relation = table_open(relationObjectId, NoLock);
/* Temporary and unlogged relations are inaccessible during recovery. */
- if (!RelationNeedsWAL(relation) && RecoveryInProgress())
+ if (!RelationNeedsWAL(relation) && RecoveryInProgress() && relation->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot access temporary or unlogged relations during recovery")));
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c97bb36..f9b2000 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3265,20 +3265,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 6e5768c..ea6989b 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why should we not always use the persistence of the parent table?
+ * Although unlogged sequences are prohibited,
+ * unlogged tables with SERIAL fields are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313..3383c35 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2069,7 +2069,7 @@ do_autovacuum(void)
* Check if it is a temp table (presumably, of some other backend's).
* We cannot safely process other backends' temp tables.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
{
/*
* We just ignore it if the owning backend is still active and
@@ -2154,7 +2154,7 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
continue;
relid = classForm->oid;
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 07f3c93..5db79ec 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -33,6 +33,7 @@
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
@@ -87,6 +88,18 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to collect information about session relations created by this backend.
+ * Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +165,45 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Delete relation files */
+ mdunlink(rel->rnode, InvalidForkNumber, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or opened when its file doesn't exist.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We are using shared memory hook, because
+ * cleanup should be performed before backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +270,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln);
pfree(path);
@@ -465,6 +519,19 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * In case of session relation access, the files of this relation may not yet exist for this backend.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +543,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +720,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* In case of a session relation we need to write the zero page so that subsequent mdnblocks calls return the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -738,12 +811,18 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
+ /*
+ * If we access a session relation, there may be no files of this relation for this backend yet.
+ * Pass EXTENSION_RETURN_NULL to make mdopenfork return NULL in this case instead of reporting an error.
+ */
+ MdfdVec *v = mdopenfork(reln, forknum, RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode)
+ ? EXTENSION_RETURN_NULL : EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2488607..86e8fca 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1098,6 +1098,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3301,6 +3305,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0cc9ede..1dff0c8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15593,8 +15593,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract SessionRelFirstBackendId
+ * to avoid too long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary relations
+ * to make it possible to clean them up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..f226e7c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add a special bias value to the backend id (since
+ * the maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,10 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 4ef6d8d..bac7a31 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -229,6 +229,13 @@ typedef PageHeaderData *PageHeader;
#define PageIsNew(page) (((PageHeader) (page))->pd_upper == 0)
/*
+ * Check whether a page of a session (global temporary) relation is not yet initialized
+ */
+#define GlobalTempRelationPageIsNotInitialized(rel, page) \
+ ((rel)->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+
+
+/*
* PageGetItemId
* Returns an item identifier of a page.
*/
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether it is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether it is a global temporary relation whose metadata is shared by all sessions,
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether it is a local temporary relation that exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b0fe19e..dfa2044 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -328,6 +328,17 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * Relation persistence is either TEMP or SESSION
+ */
+#define IsLocalRelpersistence(relpersistence) \
+ ((relpersistence) == RELPERSISTENCE_TEMP || (relpersistence) == RELPERSISTENCE_SESSION)
+
+/*
+ * Relation is either a global or a local temp table
+ */
+#define RelationHasSessionScope(relation) \
+ IsLocalRelpersistence(((relation)->rd_rel->relpersistence))
/*
* ViewOptions
@@ -524,7 +535,7 @@ typedef struct ViewOptions
* True if relation's pages are stored in local buffers.
*/
#define RelationUsesLocalBuffers(relation) \
- ((relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ RelationHasSessionScope(relation)
/*
* RELATION_IS_LOCAL
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 69ae227..95919f8 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -87,3 +87,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test with small changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to temporary child relation
+# of another session, and blocks when scanning the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..ae1adb6
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,247 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fc0f141..507cf7d 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 68ac56a..3890777 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..3058b9b
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,151 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
On Tue, Aug 20, 2019 at 4:51 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
On 19.08.2019 18:53, Pavel Stehule wrote:

Certainly, the default (small) temp buffer size plays a role. But on this IPC host the difference is not so important. Result with local temp tables and temp_buffers = 1GB: 859k TPS.

It is a little bit unexpected result. I understand that it is partially a generic problem of access to smaller dedicated caches versus access to a bigger shared cache. But it is hard to imagine that access to the local cache is 10% slower than access to the shared cache. Maybe there is some bottleneck - maybe our implementation of local buffers is suboptimal.

It may be caused by the system memory allocator - in case of using shared buffers we do not need to ask the OS to allocate more memory.

Maybe, but with shared buffers you have some overhead from searching for free buffers and some overhead from synchronization between processes.
Using local buffers for global temporary tables can be interesting for another reason - they use temporary files, and temporary files can be placed on ephemeral storage in the Amazon cloud (with much better performance than persistent IO).

My assumption is that temporary tables almost always fit in memory, so in most cases there is no need to write data to files at all.

As I wrote at the beginning of this thread, one of the problems with temporary tables is that it is not possible to use them at a replica. Global temp tables allow sharing metadata between master and replica.

I am not sure if I understand the last sentence. Global temp tables should be replicated on replica servers, but the content should not be replicated; it should be session specific.
I performed a small investigation of how difficult it would be to support inserts into temp tables at a replica. My first impression was that it could be done in a tricky but simple way, by changing just three places:
1. Prohibit non-select statements in read-only transactions
2. Xid assignment (return FrozenTransactionId)
3. Transaction commit/abort

I managed to make global temp tables work normally at a replica. But there is one problem with this approach: it is not possible to undo changes in temp tables, so rollback doesn't work.

I tried another solution: assigning some dummy Xids to standby transactions. But this approach requires many more changes:
- Initialize pages for such transactions in the CLOG
- Mark transactions as committed/aborted in the CLOG
- Change the snapshot checks in the visibility functions
And still I didn't find a safe way to clean up the CLOG space.

An alternative solution is to implement a "local CLOG" for such transactions. The straightforward implementation is a hash table, but it may cause memory overflow if a long-lived backend performs a huge number of transactions. Also, in this case we need to change the visibility check functions.

So I have implemented the simplest solution: frozen xids plus forced backend termination in case of transaction rollback (so the user will not see inconsistent behavior).

Attached please find global_private_temp_replica.patch, which implements this approach.

It would be nice if somebody could suggest a better solution for temporary tables at a replica.
This is another hard issue. Probably backend termination would be an acceptable solution. I don't understand this area well, but if a replica allows writing (to global temp tables), then the replica has to have a local CLOG.

The CLOG for global temp tables can be simpler than the standard CLOG. The data are not shared, and the lifetime of the data (and the number of transactions) can be low.

Another solution is to wait for ZHeap storage, so that a replica can have its own UNDO log.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 20.08.2019 19:06, Pavel Stehule wrote:
As I wrote at the beginning of this thread, one of the problems with temporary tables is that it is not possible to use them at a replica. Global temp tables allow sharing metadata between master and replica.

I am not sure if I understand the last sentence. Global temp tables should be replicated on replica servers, but the content should not be replicated; it should be session specific.

Obviously.
When we run OLAP queries at a replica, it will be great if we can do

insert into temp_table (select ...);

With local temp tables it is not possible just because you cannot create a temp table at a replica. But a global temp table can be created at the master and populated with data at the replica.
I performed a small investigation of how difficult it would be to support inserts into temp tables at a replica. My first impression was that it could be done in a tricky but simple way, by changing just three places:
1. Prohibit non-select statements in read-only transactions
2. Xid assignment (return FrozenTransactionId)
3. Transaction commit/abort

I managed to make global temp tables work normally at a replica. But there is one problem with this approach: it is not possible to undo changes in temp tables, so rollback doesn't work.

I tried another solution: assigning some dummy Xids to standby transactions. But this approach requires many more changes:
- Initialize pages for such transactions in the CLOG
- Mark transactions as committed/aborted in the CLOG
- Change the snapshot checks in the visibility functions
And still I didn't find a safe way to clean up the CLOG space.

An alternative solution is to implement a "local CLOG" for such transactions. The straightforward implementation is a hash table, but it may cause memory overflow if a long-lived backend performs a huge number of transactions. Also, in this case we need to change the visibility check functions.

So I have implemented the simplest solution: frozen xids plus forced backend termination in case of transaction rollback (so the user will not see inconsistent behavior).

Attached please find global_private_temp_replica.patch, which implements this approach.

It would be nice if somebody could suggest a better solution for temporary tables at a replica.

This is another hard issue. Probably backend termination would be an acceptable solution. I don't understand this area well, but if a replica allows writing (to global temp tables), then the replica has to have a local CLOG.
There are several problems:
1. How to choose an XID for a writing transaction at standby. The
simplest solution is to just add 0x7fffffff to the current XID. It
eliminates the possibility of conflict with normal XIDs (received from
master), but it requires changes in the visibility functions: a
visibility check function does not know the OID of the tuple's owner,
only the XID stored in the tuple header, so it must make its decision
based on this XID alone (see the sketch after this list).
2. How to perform cleanup of XIDs that are no longer needed. Right now
there is quite complex logic for freeing CLOG pages.
3. How to implement visibility rules for such XIDs.
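To make problem 1 concrete, the offset scheme would look something like
this (the macro and helper names are illustrative; they are not in the
patch):

    #include "access/transam.h"

    /* The upper half of the 32-bit XID space is reserved for
     * standby-local transactions, so a visibility function can classify
     * an XID with a single comparison. */
    #define STANDBY_XID_BASE ((TransactionId) 0x7fffffff)

    static inline bool
    TransactionIdIsStandbyLocal(TransactionId xid)
    {
        return xid > STANDBY_XID_BASE;
    }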
Another solution is to wait for ZHeap storage, so that a replica can have
its own UNDO log.
I thought about implementing a special table access method for temporary
tables.
I am now trying to understand whether it is the only possible approach or
whether there are simpler solutions.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Tue, 20. 8. 2019 at 18:42, Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> wrote:
But a global temp table can be created at master and populated with data
at replica.
yes
How to perform cleanup of XIDs that are no longer needed? Right now there
is quite complex logic for freeing CLOG pages.
In theory every session can have its own CLOG. When you finish the
session, you can truncate this file.
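A minimal sketch of that idea, assuming a backend-private structure
rather than a shared SLRU (all names here are hypothetical):

    #include "access/transam.h"
    #include "nodes/bitmapset.h"

    /* A per-session "CLOG" only needs to remember the outcome of
     * session-local XIDs and can simply be discarded at session end. */
    typedef struct SessionClog
    {
        Bitmapset    *aborted;  /* session-local XIDs known to be aborted */
        TransactionId nextXid;  /* next session-local XID to hand out */
    } SessionClog;

    static void
    SessionClogShutdown(SessionClog *clog)
    {
        bms_free(clog->aborted);    /* the analogue of truncating the file */
        clog->aborted = NULL;
        clog->nextXid = FirstNormalTransactionId;
    }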
I thought about implementing a special table access method for temporary
tables.
+1
On 20.08.2019 20:01, Pavel Stehule wrote:
I thought about implementing a special table access method for temporary
tables.
+1
Unfortunately, implementing a special table access method for temporary
tables doesn't solve all the problems.
XID generation is not part of the table access method API.
So we still need to assign to a write transaction at replica some XID
which will not conflict with the XIDs received from master.
Actually, only global temp tables can be updated at replica, so the
assigned XIDs can be stored only in tuples of such relations.
But I am still not sure that we can use arbitrary XIDs for such
transactions at replica.
Also, I am upset by the amount of functionality which has to be
reimplemented for global temp tables if we really want to provide an
access method for them:
1. CLOG
2. vacuum
3. MVCC visibility
And still it is not possible to encapsulate all the changes needed to
support writes to temp tables at replica inside a table access method.
XID assignment, transaction commit and abort, subtransactions: all these
places need to be patched.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 21.08.2019 11:54, Konstantin Knizhnik wrote:
I was able to fully support work with global temp tables at replica
(including subtransactions).
The patch is attached. You can also find this version at
https://github.com/postgrespro/postgresql.builtin_pool/tree/global_temp_hot
Right now, transactions at replica which update a global temp table are
assigned a special kind of pseudo-XIDs which are not related to the XIDs
received from master.
So special visibility rules are used for such tables at replica. I also
had to patch the TransactionIdIsInProgress, TransactionIdDidCommit,
TransactionIdGetCurrent functions to correctly handle such XIDs. In
principle it is possible to implement global temp tables as a special
heap access method, but it would require copying a lot of code (heapam.c),
so I prefer to add a few checks to the existing functions (sketched
below).
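For reference, those checks boil down to two backend-local predicates,
shown here slightly simplified from xact.c in the attached patch:

    #include "access/transam.h"
    #include "nodes/bitmapset.h"

    static TransactionId replicaTopTransId;   /* saved at StartTransaction */
    static Bitmapset    *replicaAbortedXids;  /* in-memory CLOG analogue */

    /* Pseudo-XIDs are handed out sequentially per backend, so anything
     * above the value captured at transaction start belongs to the
     * current transaction or one of its subtransactions. */
    bool
    IsReplicaCurrentTransactionId(TransactionId xid)
    {
        return xid > replicaTopTransId;
    }

    /* Only aborts are recorded; any other XID is treated as committed or
     * as one of our own in-progress (sub)transactions. */
    bool
    IsReplicaTransactionAborted(TransactionId xid)
    {
        return bms_is_member(xid, replicaAbortedXids);
    }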
There are still some limitations:
- The number of transactions at replica which update temp tables is
limited to 2^32 (the wraparound problem is not addressed).
- I have to maintain an in-memory analog of the CLOG for such
transactions, which is also never trimmed. It means that for 2^32
transactions the size of the bitmap can grow up to 0.5GB (2^32 bits =
2^29 bytes).
I am trying to understand what the next steps in global temp table
support should be.
This is why I want to run a short survey: what do people expect from
global temp tables?
1. I do not need them at all.
2. Eliminate catalog bloating.
3. Mostly needed for compatibility with Oracle (simplify porting,...).
4. Parallel query execution.
5. Can be used at replica.
6. More efficient use of resources (first of all, memory).
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
global_shared_temp_replica.patch (text/x-patch)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..2d93f6f 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenode = bufHdr->tag.rnode.node.relNode;
+ fctx->record[i].reltablespace = bufHdr->tag.rnode.node.spcNode;
+ fctx->record[i].reldatabase = bufHdr->tag.rnode.node.dbNode;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 38ae240..8a04954 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -608,9 +608,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.rnode.node.dbNode;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.node.spcNode;
+ block_info_array[num_blocks].filenode = bufHdr->tag.rnode.node.relNode;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/contrib/pgrowlocks/pgrowlocks.c b/contrib/pgrowlocks/pgrowlocks.c
index a2c44a9..43b4c66 100644
--- a/contrib/pgrowlocks/pgrowlocks.c
+++ b/contrib/pgrowlocks/pgrowlocks.c
@@ -158,7 +158,8 @@ pgrowlocks(PG_FUNCTION_ARGS)
/* must hold a buffer lock to call HeapTupleSatisfiesUpdate */
LockBuffer(hscan->rs_cbuf, BUFFER_LOCK_SHARE);
- htsu = HeapTupleSatisfiesUpdate(tuple,
+ htsu = HeapTupleSatisfiesUpdate(mydata->rel,
+ tuple,
GetCurrentCommandId(false),
hscan->rs_cbuf);
xmax = HeapTupleHeaderGetRawXmax(tuple->t_data);
diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index 70af43e..9cce720 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -349,7 +349,7 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
/* must hold a buffer lock to call HeapTupleSatisfiesVisibility */
LockBuffer(hscan->rs_cbuf, BUFFER_LOCK_SHARE);
- if (HeapTupleSatisfiesVisibility(tuple, &SnapshotDirty, hscan->rs_cbuf))
+ if (HeapTupleSatisfiesVisibility(rel, tuple, &SnapshotDirty, hscan->rs_cbuf))
{
stat.tuple_len += tuple->t_len;
stat.tuple_count++;
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index c945b28..14d4e48 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &rnode, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
}
}
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 9726020..c99701d 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -1028,7 +1028,7 @@ gistGetFakeLSN(Relation rel)
{
static XLogRecPtr counter = FirstNormalUnloggedLSN;
- if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ if (RelationHasSessionScope(rel))
{
/*
* Temporary relations are only accessible in our session, so a simple
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 9430994..181efde 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -444,7 +444,7 @@ heapgetpage(TableScanDesc sscan, BlockNumber page)
if (all_visible)
valid = true;
else
- valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(scan->rs_base.rs_rd, &loctup, snapshot, buffer);
CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
&loctup, buffer, snapshot);
@@ -664,7 +664,8 @@ heapgettup(HeapScanDesc scan,
/*
* if current tuple qualifies, return it.
*/
- valid = HeapTupleSatisfiesVisibility(tuple,
+ valid = HeapTupleSatisfiesVisibility(scan->rs_base.rs_rd,
+ tuple,
snapshot,
scan->rs_cbuf);
@@ -1474,7 +1475,7 @@ heap_fetch(Relation relation,
/*
* check tuple visibility, then release lock
*/
- valid = HeapTupleSatisfiesVisibility(tuple, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(relation, tuple, snapshot, buffer);
if (valid)
PredicateLockTuple(relation, tuple, snapshot);
@@ -1612,7 +1613,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer), offnum);
/* If it's visible per the snapshot, we must return it */
- valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(relation, heapTuple, snapshot, buffer);
CheckForSerializableConflictOut(valid, relation, heapTuple,
buffer, snapshot);
/* reset to original, non-redirected, tid */
@@ -1754,7 +1755,7 @@ heap_get_latest_tid(TableScanDesc sscan,
* Check tuple visibility; if visible, set it as the new result
* candidate.
*/
- valid = HeapTupleSatisfiesVisibility(&tp, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(relation, &tp, snapshot, buffer);
CheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
if (valid)
*tid = ctid;
@@ -1851,6 +1852,14 @@ ReleaseBulkInsertStatePin(BulkInsertState bistate)
}
+static TransactionId
+GetTransactionId(Relation relation)
+{
+ return relation->rd_rel->relpersistence == RELPERSISTENCE_SESSION && RecoveryInProgress()
+ ? GetReplicaTransactionId()
+ : GetCurrentTransactionId();
+}
+
/*
* heap_insert - insert tuple into a heap
*
@@ -1873,7 +1882,7 @@ void
heap_insert(Relation relation, HeapTuple tup, CommandId cid,
int options, BulkInsertState bistate)
{
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
HeapTuple heaptup;
Buffer buffer;
Buffer vmbuffer = InvalidBuffer;
@@ -2110,7 +2119,7 @@ void
heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
CommandId cid, int options, BulkInsertState bistate)
{
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
HeapTuple *heaptuples;
int i;
int ndone;
@@ -2449,7 +2458,7 @@ heap_delete(Relation relation, ItemPointer tid,
TM_FailureData *tmfd, bool changingPart)
{
TM_Result result;
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
ItemId lp;
HeapTupleData tp;
Page page;
@@ -2514,7 +2523,7 @@ heap_delete(Relation relation, ItemPointer tid,
tp.t_self = *tid;
l1:
- result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
+ result = HeapTupleSatisfiesUpdate(relation, &tp, cid, buffer);
if (result == TM_Invisible)
{
@@ -2633,7 +2642,7 @@ l1:
if (crosscheck != InvalidSnapshot && result == TM_Ok)
{
/* Perform additional check for transaction-snapshot mode RI updates */
- if (!HeapTupleSatisfiesVisibility(&tp, crosscheck, buffer))
+ if (!HeapTupleSatisfiesVisibility(relation, &tp, crosscheck, buffer))
result = TM_Updated;
}
@@ -2900,7 +2909,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
TM_FailureData *tmfd, LockTupleMode *lockmode)
{
TM_Result result;
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
Bitmapset *hot_attrs;
Bitmapset *key_attrs;
Bitmapset *id_attrs;
@@ -3070,7 +3079,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
l2:
checked_lockers = false;
locker_remains = false;
- result = HeapTupleSatisfiesUpdate(&oldtup, cid, buffer);
+ result = HeapTupleSatisfiesUpdate(relation, &oldtup, cid, buffer);
/* see below about the "no wait" case */
Assert(result != TM_BeingModified || wait);
@@ -3262,7 +3271,7 @@ l2:
if (crosscheck != InvalidSnapshot && result == TM_Ok)
{
/* Perform additional check for transaction-snapshot mode RI updates */
- if (!HeapTupleSatisfiesVisibility(&oldtup, crosscheck, buffer))
+ if (!HeapTupleSatisfiesVisibility(relation, &oldtup, crosscheck, buffer))
{
result = TM_Updated;
Assert(!ItemPointerEquals(&oldtup.t_self, &oldtup.t_data->t_ctid));
@@ -4018,7 +4027,7 @@ heap_lock_tuple(Relation relation, HeapTuple tuple,
tuple->t_tableOid = RelationGetRelid(relation);
l3:
- result = HeapTupleSatisfiesUpdate(tuple, cid, *buffer);
+ result = HeapTupleSatisfiesUpdate(relation, tuple, cid, *buffer);
if (result == TM_Invisible)
{
@@ -4193,7 +4202,7 @@ l3:
TM_Result res;
res = heap_lock_updated_tuple(relation, tuple, &t_ctid,
- GetCurrentTransactionId(),
+ GetTransactionId(relation),
mode);
if (res != TM_Ok)
{
@@ -4441,7 +4450,7 @@ l3:
TM_Result res;
res = heap_lock_updated_tuple(relation, tuple, &t_ctid,
- GetCurrentTransactionId(),
+ GetTransactionId(relation),
mode);
if (res != TM_Ok)
{
@@ -4550,7 +4559,7 @@ failed:
* state if multixact.c elogs.
*/
compute_new_xmax_infomask(xmax, old_infomask, tuple->t_data->t_infomask2,
- GetCurrentTransactionId(), mode, false,
+ GetTransactionId(relation), mode, false,
&xid, &new_infomask, &new_infomask2);
START_CRIT_SECTION();
@@ -5570,7 +5579,7 @@ heap_finish_speculative(Relation relation, ItemPointer tid)
void
heap_abort_speculative(Relation relation, ItemPointer tid)
{
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
ItemId lp;
HeapTupleData tp;
Page page;
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 09bc6fe..a189834 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -226,7 +226,8 @@ heapam_tuple_satisfies_snapshot(Relation rel, TupleTableSlot *slot,
* Caller should be holding pin, but not lock.
*/
LockBuffer(bslot->buffer, BUFFER_LOCK_SHARE);
- res = HeapTupleSatisfiesVisibility(bslot->base.tuple, snapshot,
+
+ res = HeapTupleSatisfiesVisibility(rel, bslot->base.tuple, snapshot,
bslot->buffer);
LockBuffer(bslot->buffer, BUFFER_LOCK_UNLOCK);
@@ -671,6 +672,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
@@ -2160,7 +2162,7 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,
loctup.t_len = ItemIdGetLength(lp);
loctup.t_tableOid = scan->rs_rd->rd_id;
ItemPointerSet(&loctup.t_self, page, offnum);
- valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(scan->rs_rd, &loctup, snapshot, buffer);
if (valid)
{
hscan->rs_vistuples[ntup++] = offnum;
@@ -2480,7 +2482,7 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
else
{
/* Otherwise, we have to check the tuple individually. */
- return HeapTupleSatisfiesVisibility(tuple, scan->rs_snapshot,
+ return HeapTupleSatisfiesVisibility(scan->rs_rd, tuple, scan->rs_snapshot,
buffer);
}
}
diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c
index 537e681..3076f6a 100644
--- a/src/backend/access/heap/heapam_visibility.c
+++ b/src/backend/access/heap/heapam_visibility.c
@@ -77,6 +77,7 @@
#include "utils/combocid.h"
#include "utils/snapmgr.h"
+static bool TempTupleSatisfiesVisibility(HeapTuple htup, CommandId curcid, Buffer buffer);
/*
* SetHintBits()
@@ -454,7 +455,7 @@ HeapTupleSatisfiesToast(HeapTuple htup, Snapshot snapshot,
* test for it themselves.)
*/
TM_Result
-HeapTupleSatisfiesUpdate(HeapTuple htup, CommandId curcid,
+HeapTupleSatisfiesUpdate(Relation relation, HeapTuple htup, CommandId curcid,
Buffer buffer)
{
HeapTupleHeader tuple = htup->t_data;
@@ -462,6 +463,13 @@ HeapTupleSatisfiesUpdate(HeapTuple htup, CommandId curcid,
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
+ if (relation->rd_rel->relpersistence == RELPERSISTENCE_SESSION && RecoveryInProgress())
+ {
+ AccessTempRelationAtReplica = true;
+ return TempTupleSatisfiesVisibility(htup, curcid, buffer) ? TM_Ok : TM_Invisible;
+ }
+ AccessTempRelationAtReplica = false;
+
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -1677,6 +1685,59 @@ HeapTupleSatisfiesHistoricMVCC(HeapTuple htup, Snapshot snapshot,
}
/*
+ * TempTupleSatisfiesVisibility
+ * True iff global temp table tuple is visible for the current transaction.
+ *
+ * Temporary tables are visible only for current backend, so there is no need to
+ * handle cases with tuples committed by other backends. We only need to exclude
+ * modifications done by aborted transactions or after start of table scan.
+ *
+ */
+static bool
+TempTupleSatisfiesVisibility(HeapTuple htup, CommandId curcid, Buffer buffer)
+{
+ HeapTupleHeader tuple = htup->t_data;
+ TransactionId xmin;
+ TransactionId xmax;
+
+ Assert(ItemPointerIsValid(&htup->t_self));
+ Assert(htup->t_tableOid != InvalidOid);
+
+ if (HeapTupleHeaderXminInvalid(tuple))
+ return false;
+
+ xmin = HeapTupleHeaderGetRawXmin(tuple);
+
+ if (IsReplicaTransactionAborted(xmin))
+ return false;
+
+ if (IsReplicaCurrentTransactionId(xmin)
+ && HeapTupleHeaderGetCmin(tuple) >= curcid)
+ {
+ return false; /* inserted after scan started */
+ }
+
+ if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid */
+ return true;
+
+ if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask)) /* not deleter */
+ return true;
+
+ xmax = (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
+ ? HeapTupleGetUpdateXid(tuple)
+ : HeapTupleHeaderGetRawXmax(tuple);
+
+ if (IsReplicaTransactionAborted(xmax))
+ return true; /* updating subtransaction aborted */
+
+ if (!IsReplicaCurrentTransactionId(xmax))
+ return false; /* updating transaction committed */
+
+ return (HeapTupleHeaderGetCmax(tuple) >= curcid); /* updated after scan started */
+}
+
+
+/*
* HeapTupleSatisfiesVisibility
* True iff heap tuple satisfies a time qual.
*
@@ -1687,8 +1748,15 @@ HeapTupleSatisfiesHistoricMVCC(HeapTuple htup, Snapshot snapshot,
* if so, the indicated buffer is marked dirty.
*/
bool
-HeapTupleSatisfiesVisibility(HeapTuple tup, Snapshot snapshot, Buffer buffer)
+HeapTupleSatisfiesVisibility(Relation relation, HeapTuple tup, Snapshot snapshot, Buffer buffer)
{
+ if (relation->rd_rel->relpersistence == RELPERSISTENCE_SESSION && RecoveryInProgress())
+ {
+ AccessTempRelationAtReplica = true;
+ return TempTupleSatisfiesVisibility(tup, snapshot->curcid, buffer);
+ }
+ AccessTempRelationAtReplica = false;
+
switch (snapshot->snapshot_type)
{
case SNAPSHOT_MVCC:
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 5962126..bdb6c95 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -763,7 +763,11 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* Session temporary relation may be not yet initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ _bt_initmetapage(BufferGetPage(buf), P_NONE, 0);
+ else
+ _bt_checkpage(rel, buf);
}
else
{
diff --git a/src/backend/access/transam/transam.c b/src/backend/access/transam/transam.c
index 365ddfb..bce9c4a 100644
--- a/src/backend/access/transam/transam.c
+++ b/src/backend/access/transam/transam.c
@@ -22,6 +22,7 @@
#include "access/clog.h"
#include "access/subtrans.h"
#include "access/transam.h"
+#include "access/xact.h"
#include "utils/snapmgr.h"
/*
@@ -126,6 +127,9 @@ TransactionIdDidCommit(TransactionId transactionId)
{
XidStatus xidstatus;
+ if (AccessTempRelationAtReplica)
+ return !IsReplicaCurrentTransactionId(transactionId) && !IsReplicaTransactionAborted(transactionId);
+
xidstatus = TransactionLogFetch(transactionId);
/*
@@ -182,6 +186,9 @@ TransactionIdDidAbort(TransactionId transactionId)
{
XidStatus xidstatus;
+ if (AccessTempRelationAtReplica)
+ return IsReplicaTransactionAborted(transactionId);
+
xidstatus = TransactionLogFetch(transactionId);
/*
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 5b759ec..388faae 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -71,7 +71,7 @@ GetNewTransactionId(bool isSubXact)
/* safety check, we should never get this far in a HS standby */
if (RecoveryInProgress())
- elog(ERROR, "cannot assign TransactionIds during recovery");
+ elog(ERROR, "cannot assign TransactionIds during recovery");
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 1bbaeee..ab1bef9 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -192,6 +192,7 @@ typedef struct TransactionStateData
int parallelModeLevel; /* Enter/ExitParallelMode counter */
bool chain; /* start a new block after this one */
struct TransactionStateData *parent; /* back link to parent */
+ TransactionId replicaTransactionId; /* pseudo XID for inserting data in global temp tables at replica */
} TransactionStateData;
typedef TransactionStateData *TransactionState;
@@ -286,6 +287,12 @@ typedef struct XactCallbackItem
static XactCallbackItem *Xact_callbacks = NULL;
+static TransactionId replicaTransIdCount = FirstNormalTransactionId;
+static TransactionId replicaTopTransId;
+static Bitmapset* replicaAbortedXids;
+
+bool AccessTempRelationAtReplica;
+
/*
* List of add-on start- and end-of-subxact callbacks
*/
@@ -443,6 +450,48 @@ GetCurrentTransactionIdIfAny(void)
}
/*
+ * Transactions at replica can update only global temporary tables.
+ * They are assigned backend-local XIDs which are independent of the normal XIDs received from the primary node.
+ * So tuples of temporary tables at replica requires special visibility rules.
+ *
+ * XIDs for such transactions at replica are created on demand (when tuple of temp table is updated).
+ * XID wrap-around and adjusting the XID horizon are not supported, so the number of such transactions at replica is
+ * limited to 2^32, requiring an in-memory bitmap of up to 2^29 bytes for marking aborted transactions.
+ */
+TransactionId
+GetReplicaTransactionId(void)
+{
+ TransactionState s = CurrentTransactionState;
+ if (!TransactionIdIsValid(s->replicaTransactionId))
+ s->replicaTransactionId = ++replicaTransIdCount;
+ return s->replicaTransactionId;
+}
+
+/*
+ * At replica, a transaction can update only temporary tables,
+ * and they are assigned special XIDs (not related to the normal XIDs received from the primary node).
+ * As far as we see only own transaction it is not necessary to mark committed transactions.
+ * Only marking aborted ones is enough. All transactions which are not marked as aborted are treated as
+ * committed or self in-progress transactions.
+ */
+bool
+IsReplicaTransactionAborted(TransactionId xid)
+{
+ return bms_is_member(xid, replicaAbortedXids);
+}
+
+/*
+ * Since XIDs for transactions at replica are generated individually for each backend,
+ * we can check that XID belongs to the current transaction or any of its subtransactions by
+ * just comparing it with XID of top transaction.
+ */
+bool
+IsReplicaCurrentTransactionId(TransactionId xid)
+{
+ return xid > replicaTopTransId;
+}
+
+/*
* GetTopFullTransactionId
*
* This will return the FullTransactionId of the main transaction, assigning
@@ -855,6 +904,9 @@ TransactionIdIsCurrentTransactionId(TransactionId xid)
{
TransactionState s;
+ if (AccessTempRelationAtReplica)
+ return IsReplicaCurrentTransactionId(xid);
+
/*
* We always say that BootstrapTransactionId is "not my transaction ID"
* even when it is (ie, during bootstrap). Along with the fact that
@@ -1206,7 +1258,7 @@ static TransactionId
RecordTransactionCommit(void)
{
TransactionId xid = GetTopTransactionIdIfAny();
- bool markXidCommitted = TransactionIdIsValid(xid);
+ bool markXidCommitted = TransactionIdIsNormal(xid);
TransactionId latestXid = InvalidTransactionId;
int nrels;
RelFileNode *rels;
@@ -1624,7 +1676,7 @@ RecordTransactionAbort(bool isSubXact)
* rels to delete (note that this routine is not responsible for actually
* deleting 'em). We cannot have any child XIDs, either.
*/
- if (!TransactionIdIsValid(xid))
+ if (!TransactionIdIsNormal(xid))
{
/* Reset XactLastRecEnd until the next transaction writes something */
if (!isSubXact)
@@ -1892,6 +1944,8 @@ StartTransaction(void)
s = &TopTransactionStateData;
CurrentTransactionState = s;
+ replicaTopTransId = replicaTransIdCount;
+
Assert(!FullTransactionIdIsValid(XactTopFullTransactionId));
/* check the current transaction state */
@@ -1905,6 +1959,7 @@ StartTransaction(void)
*/
s->state = TRANS_START;
s->fullTransactionId = InvalidFullTransactionId; /* until assigned */
+ s->replicaTransactionId = InvalidTransactionId; /* until assigned */
/* Determine if statements are logged in this transaction */
xact_is_sampled = log_xact_sample_rate != 0 &&
@@ -2570,6 +2625,14 @@ AbortTransaction(void)
/* Prevent cancel/die interrupt while cleaning up */
HOLD_INTERRUPTS();
+ /* Mark transactions involved global temp table at replica as aborted */
+ if (TransactionIdIsValid(s->replicaTransactionId))
+ {
+ MemoryContext ctx = MemoryContextSwitchTo(TopMemoryContext);
+ replicaAbortedXids = bms_add_member(replicaAbortedXids, s->replicaTransactionId);
+ MemoryContextSwitchTo(ctx);
+ }
+
/* Make sure we have a valid memory context and resource owner */
AtAbort_Memory();
AtAbort_ResourceOwner();
@@ -2991,6 +3054,9 @@ CommitTransactionCommand(void)
* and then clean up.
*/
case TBLOCK_ABORT_PENDING:
+ if (GetCurrentTransactionIdIfAny() == FrozenTransactionId)
+ elog(FATAL, "Transaction is aborted at standby");
+
AbortTransaction();
CleanupTransaction();
s->blockState = TBLOCK_DEFAULT;
@@ -4856,6 +4922,14 @@ AbortSubTransaction(void)
/* Prevent cancel/die interrupt while cleaning up */
HOLD_INTERRUPTS();
+ /* Mark transactions involved global temp table at replica as aborted */
+ if (TransactionIdIsValid(s->replicaTransactionId))
+ {
+ MemoryContext ctx = MemoryContextSwitchTo(TopMemoryContext);
+ replicaAbortedXids = bms_add_member(replicaAbortedXids, s->replicaTransactionId);
+ MemoryContextSwitchTo(ctx);
+ }
+
/* Make sure we have a valid memory context and resource owner */
AtSubAbort_Memory();
AtSubAbort_ResourceOwner();
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 3ec67d4..edec8ca 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -213,6 +213,7 @@ void
XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
{
registered_buffer *regbuf;
+ RelFileNodeBackend rnode;
/* NO_IMAGE doesn't make sense with FORCE_IMAGE */
Assert(!((flags & REGBUF_FORCE_IMAGE) && (flags & (REGBUF_NO_IMAGE))));
@@ -227,7 +228,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &rnode, &regbuf->forkno, &regbuf->block);
+ regbuf->rnode = rnode.node;
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -919,7 +921,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blkno;
@@ -948,7 +950,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
flags |= REGBUF_STANDARD;
BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ XLogRegisterBlock(0, &rnode.node, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1009,7 +1011,7 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkNum;
BlockNumber blkno;
@@ -1018,7 +1020,7 @@ log_newpage_buffer(Buffer buffer, bool page_std)
BufferGetTag(buffer, &rnode, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rnode.node, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index a065419..8814afb 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -409,6 +409,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 99ae159..24b2438 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3612,7 +3612,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 3cc886f..a111ddc 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cedb4ee..d11c5b3 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1400,7 +1400,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 0960b33..6c3998f 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (GlobalTempRelationPageIsNotInitialized(rel, page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index fb2be10..a31f775 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -586,7 +586,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && !IsLocalRelpersistence(stmt->relation->relpersistence))
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -1772,7 +1772,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -7678,6 +7679,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14082,6 +14089,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -14569,14 +14583,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index dbd7dd9..efe6f21 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -788,6 +788,9 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
if (isTempNamespace(get_rel_namespace(rte->relid)))
continue;
+ if (get_rel_persistence(rte->relid) == RELPERSISTENCE_SESSION)
+ continue;
+
PreventCommandIfReadOnly(CreateCommandTag((Node *) plannedstmt));
}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 98e9948..1a9170b 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -124,7 +124,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
relation = table_open(relationObjectId, NoLock);
/* Temporary and unlogged relations are inaccessible during recovery. */
- if (!RelationNeedsWAL(relation) && RecoveryInProgress())
+ if (!RelationNeedsWAL(relation) && RecoveryInProgress() && relation->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot access temporary or unlogged relations during recovery")));
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c97bb36..f9b2000 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3265,20 +3265,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index a9b2f8b..2f261b9 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why we should not always use persistence of parent table?
+ * Although it is prohibited to have unlogged sequences,
+ * unlogged tables with SERIAL fields are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313..5760a9c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2154,7 +2154,7 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
continue;
relid = classForm->oid;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index e8ffa04..2004d2f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3483,6 +3483,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
{
ReorderBufferTupleCidKey key;
ReorderBufferTupleCidEnt *ent;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blockno;
bool updated_mapping = false;
@@ -3496,7 +3497,8 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
* get relfilenode from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &rnode, &forkno, &blockno);
+ key.relnode = rnode.node;
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 6f3a402..76ce953 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -556,7 +556,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
int buf_id;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -710,7 +710,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
Block bufBlock;
bool found;
bool isExtend;
- bool isLocalBuf = SmgrIsTemp(smgr);
+ bool isLocalBuf = SmgrIsTemp(smgr) && relpersistence == RELPERSISTENCE_TEMP;
*hit = false;
@@ -1010,7 +1010,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1532,7 +1532,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1543,7 +1544,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -1845,8 +1847,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rnode.node.spcNode;
+ item->relNode = bufHdr->tag.rnode.node.relNode;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2559,7 +2561,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rnode.node, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2631,7 +2633,7 @@ BufferGetBlockNumber(Buffer buffer)
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2696,7 +2698,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rnode.node, buf->tag.rnode.backend);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -2930,7 +2932,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
int i;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileNodeBackendIsLocalTemp(rnode))
{
if (rnode.backend == MyBackendId)
DropRelFileNodeLocalBuffers(rnode.node, forkNum, firstDelBlock);
@@ -2958,11 +2960,11 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
* We could check forkNum and blockNum as well as the rnode, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode))
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -2985,24 +2987,24 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
{
int i,
n = 0;
- RelFileNode *nodes;
+ RelFileNodeBackend *nodes;
bool use_bsearch;
if (nnodes == 0)
return;
- nodes = palloc(sizeof(RelFileNode) * nnodes); /* non-local relations */
+ nodes = palloc(sizeof(RelFileNodeBackend) * nnodes); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
for (i = 0; i < nnodes; i++)
{
- if (RelFileNodeBackendIsTemp(rnodes[i]))
+ if (RelFileNodeBackendIsLocalTemp(rnodes[i]))
{
if (rnodes[i].backend == MyBackendId)
DropRelFileNodeAllLocalBuffers(rnodes[i].node);
}
else
- nodes[n++] = rnodes[i].node;
+ nodes[n++] = rnodes[i];
}
/*
@@ -3025,11 +3027,11 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
/* sort the list of rnodes if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(nodes, n, sizeof(RelFileNodeBackend), rnode_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileNodeBackend *rnode = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
@@ -3044,7 +3046,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, nodes[j]))
{
rnode = &nodes[j];
break;
@@ -3054,7 +3056,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
else
{
rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
+ nodes, n, sizeof(RelFileNodeBackend),
rnode_comparator);
}
@@ -3063,7 +3065,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, (*rnode)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3102,11 +3104,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rnode.node.dbNode == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3136,7 +3138,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpath(buf->tag.rnode, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3204,7 +3206,8 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3251,13 +3254,15 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node)
+ || bufHdr->tag.rnode.backend != rel->rd_backend)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3305,13 +3310,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rnode.node.dbNode == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4051,7 +4056,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpath(buf->tag.rnode, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4075,7 +4080,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpath(bufHdr->tag.rnode, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4093,7 +4098,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4108,22 +4113,27 @@ local_buffer_write_error_callback(void *arg)
static int
rnode_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileNodeBackend n1 = *(const RelFileNodeBackend *) p1;
+ RelFileNodeBackend n2 = *(const RelFileNodeBackend *) p2;
- if (n1.relNode < n2.relNode)
+ if (n1.node.relNode < n2.node.relNode)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.node.relNode > n2.node.relNode)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.node.dbNode < n2.node.dbNode)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.node.dbNode > n2.node.dbNode)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.node.spcNode < n2.node.spcNode)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.node.spcNode > n2.node.spcNode)
+ return 1;
+
+ if (n1.backend < n2.backend)
+ return -1;
+ else if (n1.backend > n2.backend)
return 1;
else
return 0;
@@ -4359,7 +4369,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileNodeBackendEquals(cur->tag.rnode, next->tag.rnode) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4378,7 +4388,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rnode.node, tag.rnode.backend);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index f5f6a29..6bd5ecb 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ LocalPrefetchBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -111,7 +111,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -209,7 +209,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rnode.node, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -331,14 +331,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -377,12 +377,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index cf7f03f..65eb422 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
BufferGetTag(buf, &rnode, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index fadab62..055ec6b 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -994,6 +994,9 @@ TransactionIdIsInProgress(TransactionId xid)
int i,
j;
+ if (AccessTempRelationAtReplica)
+ return IsReplicaCurrentTransactionId(xid) && !IsReplicaTransactionAborted(xid);
+
/*
* Don't bother checking a transaction older than RecentXmin; it could not
* possibly still be running. (Note: in particular, this guarantees that
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 07f3c93..204c4cb 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -33,6 +33,7 @@
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
@@ -87,6 +88,18 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to keep track of session relations created or accessed by this backend.
+ * Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +165,48 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Remove relation pages from shared buffers */
+ DropRelFileNodesAllBuffers(&rel->rnode, 1);
+
+ /* Delete relation files */
+ mdunlink(rel->rnode, InvalidForkNumber, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or opened while its file doesn't exist yet.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We use a shared memory exit hook, because
+ * cleanup must be performed before the backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +273,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln);
pfree(path);
@@ -465,6 +522,19 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * When accessing a session relation, there may be no files of this relation for this backend yet.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +546,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +723,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* For a session relation we must write the zeroed page so that a subsequent mdnblocks returns the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -738,12 +814,18 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
+ /*
+ * When accessing a session relation, there may be no files of this relation for this backend yet.
+ * Pass EXTENSION_RETURN_NULL to make mdopenfork return NULL in this case instead of reporting an error.
+ */
+ MdfdVec *v = mdopenfork(reln, forknum, RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode)
+ ? EXTENSION_RETURN_NULL : EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2488607..86e8fca 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1098,6 +1098,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3301,6 +3305,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0cc9ede..1dff0c8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15593,8 +15593,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * the SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract SessionRelFirstBackendId
+ * to avoid overly long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary relations,
+ * so they can be cleaned up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6..2f16c58 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -195,9 +195,9 @@ extern void heap_vacuum_rel(Relation onerel,
struct VacuumParams *params, BufferAccessStrategy bstrategy);
/* in heap/heapam_visibility.c */
-extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
+extern bool HeapTupleSatisfiesVisibility(Relation relation, HeapTuple stup, Snapshot snapshot,
Buffer buffer);
-extern TM_Result HeapTupleSatisfiesUpdate(HeapTuple stup, CommandId curcid,
+extern TM_Result HeapTupleSatisfiesUpdate(Relation relation, HeapTuple stup, CommandId curcid,
Buffer buffer);
extern HTSV_Result HeapTupleSatisfiesVacuum(HeapTuple stup, TransactionId OldestXmin,
Buffer buffer);
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index d714551..cbe6760 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -41,6 +41,9 @@
extern int DefaultXactIsoLevel;
extern PGDLLIMPORT int XactIsoLevel;
+extern bool AccessTempRelationAtReplica;
+
+
/*
* We implement three isolation levels internally.
* The two stronger ones use one snapshot per database transaction;
@@ -440,4 +443,8 @@ extern void EnterParallelMode(void);
extern void ExitParallelMode(void);
extern bool IsInParallelMode(void);
+extern TransactionId GetReplicaTransactionId(void);
+extern bool IsReplicaTransactionAborted(TransactionId xid);
+extern bool IsReplicaCurrentTransactionId(TransactionId xid);
+
#endif /* XACT_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..f226e7c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add a special bias value to the backend id (since
+ * the maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,10 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index df2dda7..7adb96b 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,16 +90,17 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileNodeBackend rnode; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rnode.node.spcNode = InvalidOid, \
+ (a).rnode.node.dbNode = InvalidOid, \
+ (a).rnode.node.relNode = InvalidOid, \
+ (a).rnode.backend = InvalidBackendId, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
@@ -113,7 +114,7 @@ typedef struct buftag
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileNodeBackendEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 509f4b7..3315fa0 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -205,7 +205,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 4ef6d8d..bac7a31 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -229,6 +229,13 @@ typedef PageHeaderData *PageHeader;
#define PageIsNew(page) (((PageHeader) (page))->pd_upper == 0)
/*
+ * True if a page of a global temporary relation has not been initialized yet
+ */
+#define GlobalTempRelationPageIsNotInitialized(rel, page) \
+ ((rel)->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+
+
+/*
* PageGetItemId
* Returns an item identifier of a page.
*/
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether this is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether this is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether this is a local temporary relation, which exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b0fe19e..b361851 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -328,6 +328,17 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * Relation persistence is either TEMP or SESSION
+ */
+#define IsLocalRelpersistence(relpersistence) \
+ ((relpersistence) == RELPERSISTENCE_TEMP || (relpersistence) == RELPERSISTENCE_SESSION)
+
+/*
+ * Relation is either a global or a local temp table
+ */
+#define RelationHasSessionScope(relation) \
+ IsLocalRelpersistence(((relation)->rd_rel->relpersistence))
/*
* ViewOptions
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 74b5077..44df4e0 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -85,3 +85,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test with minor changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to the temporary child relation
+# of another session, but blocks when scanning the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..b7bf067
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,323 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest2;
+DROP TABLE global_temptest1;
+-- Unsupported ON COMMIT and foreign key combination
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
+-- Test two phase commit
+CREATE TABLE global_temptest_persistent(col int);
+CREATE GLOBAL TEMP TABLE global_temptest(col int);
+INSERT INTO global_temptest VALUES (1);
+BEGIN;
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+PREPARE TRANSACTION 'global_temp1';
+-- We can't see anything from an uncommitted transaction
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+BEGIN;
+INSERT INTO global_temptest VALUES (3);
+INSERT INTO global_temptest_persistent SELECT * FROM global_temptest;
+PREPARE TRANSACTION 'global_temp2';
+COMMIT PREPARED 'global_temp1';
+-- 1, 2
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+-- Nothing
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+\c
+-- The temp table is empty now.
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+-- Still nothing in global_temptest_persistent table;
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+INSERT INTO global_temptest VALUES (4);
+COMMIT PREPARED 'global_temp2';
+-- Only 4
+SELECT * FROM global_temptest;
+ col
+-----
+ 4
+(1 row)
+
+-- 1, 3
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+ 1
+ 3
+(2 rows)
+
+\c
+DROP TABLE global_temptest;
+DROP TABLE global_temptest_persistent;
diff --git a/src/test/regress/expected/global_temp_0.out b/src/test/regress/expected/global_temp_0.out
new file mode 100644
index 0000000..934e751
--- /dev/null
+++ b/src/test/regress/expected/global_temp_0.out
@@ -0,0 +1,326 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest2;
+DROP TABLE global_temptest1;
+-- Unsupported ON COMMIT and foreign key combination
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
+-- Test two phase commit
+CREATE TABLE global_temptest_persistent(col int);
+CREATE GLOBAL TEMP TABLE global_temptest(col int);
+INSERT INTO global_temptest VALUES (1);
+BEGIN;
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+PREPARE TRANSACTION 'global_temp1';
+ERROR: prepared transactions are disabled
+HINT: Set max_prepared_transactions to a nonzero value.
+-- We can't see anything from an uncommitted transaction
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+BEGIN;
+INSERT INTO global_temptest VALUES (3);
+INSERT INTO global_temptest_persistent SELECT * FROM global_temptest;
+PREPARE TRANSACTION 'global_temp2';
+ERROR: prepared transactions are disabled
+HINT: Set max_prepared_transactions to a nonzero value.
+COMMIT PREPARED 'global_temp1';
+ERROR: prepared transaction with identifier "global_temp1" does not exist
+-- 1, 2
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+-- Nothing
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+\c
+-- The temp table is empty now.
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+-- Still nothing in global_temptest_persistent table;
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+INSERT INTO global_temptest VALUES (4);
+COMMIT PREPARED 'global_temp2';
+ERROR: prepared transaction with identifier "global_temp2" does not exist
+-- Only 4
+SELECT * FROM global_temptest;
+ col
+-----
+ 4
+(1 row)
+
+-- 1, 3
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+\c
+DROP TABLE global_temptest;
+DROP TABLE global_temptest_persistent;
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fc0f141..507cf7d 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 68ac56a..3890777 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..4d2da8d
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,191 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+DROP TABLE global_temptest2;
+DROP TABLE global_temptest1;
+
+-- Unsupported ON COMMIT and foreign key combination
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
+
+-- Test two phase commit
+CREATE TABLE global_temptest_persistent(col int);
+CREATE GLOBAL TEMP TABLE global_temptest(col int);
+INSERT INTO global_temptest VALUES (1);
+
+BEGIN;
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+PREPARE TRANSACTION 'global_temp1';
+-- We can't see anything from an uncommitted transaction
+SELECT * FROM global_temptest;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (3);
+INSERT INTO global_temptest_persistent SELECT * FROM global_temptest;
+PREPARE TRANSACTION 'global_temp2';
+COMMIT PREPARED 'global_temp1';
+-- 1, 2
+SELECT * FROM global_temptest;
+-- Nothing
+SELECT * FROM global_temptest_persistent;
+\c
+-- The temp table is empty now.
+SELECT * FROM global_temptest;
+-- Still nothing in global_temptest_persistent table;
+SELECT * FROM global_temptest_persistent;
+INSERT INTO global_temptest VALUES (4);
+COMMIT PREPARED 'global_temp2';
+-- Only 4
+SELECT * FROM global_temptest;
+-- 1, 3
+SELECT * FROM global_temptest_persistent;
+\c
+DROP TABLE global_temptest;
+DROP TABLE global_temptest_persistent;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
I have added support for all index types (brin, btree, gin, gist, hash,
spgist) for global temp tables (previously only the B-Tree index was
supported).
It would be nice to have some generic mechanism for this, but I do not
see what it could look like.
The problem is that normal relations are initialized at the moment of
their creation, but for global temp relations the metadata already exists
while the data is absent. We should somehow catch access to a
not-yet-initialized page (not all pages, just the first page of the
relation) and perform initialization on demand, as the sketch below shows.
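To make the idea concrete, here is the pattern each index AM now follows
when it reads its metapage, using the B-Tree case as the example (a sketch
based on the attached patch; GlobalTempRelationPageIsNotInitialized() is
the helper the patch introduces):

    /* In _bt_getbuf(), after reading an existing block: */
    buf = ReadBuffer(rel, blkno);
    LockBuffer(buf, access);
    /* A session (global temp) relation may not yet be initialized in this
       backend: its metadata is shared, but its data is created lazily. */
    if (blkno == BTREE_METAPAGE &&
        GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
        _bt_initmetapage(BufferGetPage(buf), P_NONE, 0);
    else
        _bt_checkpage(rel, buf);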
A new patch for global temp tables with shared buffers is attached.
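For the replica case, the intended usage looks like this (a sketch of the
expected behavior with the patch applied, not output of an actual run; the
table name is made up):

    -- on the primary (so the shared metadata is replicated):
    create global temp table standby_report(id integer, val text);

    -- on a hot standby, where writes are otherwise rejected:
    insert into standby_report values (1, 'computed on replica');
    select * from standby_report;  -- data is private to this standby session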
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
global_shared_temp_replica-2.patch (text/x-patch)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..2d93f6f 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenode = bufHdr->tag.rnode.node.relNode;
+ fctx->record[i].reltablespace = bufHdr->tag.rnode.node.spcNode;
+ fctx->record[i].reldatabase = bufHdr->tag.rnode.node.dbNode;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 38ae240..8a04954 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -608,9 +608,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.rnode.node.dbNode;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.node.spcNode;
+ block_info_array[num_blocks].filenode = bufHdr->tag.rnode.node.relNode;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/contrib/pgrowlocks/pgrowlocks.c b/contrib/pgrowlocks/pgrowlocks.c
index a2c44a9..43b4c66 100644
--- a/contrib/pgrowlocks/pgrowlocks.c
+++ b/contrib/pgrowlocks/pgrowlocks.c
@@ -158,7 +158,8 @@ pgrowlocks(PG_FUNCTION_ARGS)
/* must hold a buffer lock to call HeapTupleSatisfiesUpdate */
LockBuffer(hscan->rs_cbuf, BUFFER_LOCK_SHARE);
- htsu = HeapTupleSatisfiesUpdate(tuple,
+ htsu = HeapTupleSatisfiesUpdate(mydata->rel,
+ tuple,
GetCurrentCommandId(false),
hscan->rs_cbuf);
xmax = HeapTupleHeaderGetRawXmax(tuple->t_data);
diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index 70af43e..9cce720 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -349,7 +349,7 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
/* must hold a buffer lock to call HeapTupleSatisfiesVisibility */
LockBuffer(hscan->rs_cbuf, BUFFER_LOCK_SHARE);
- if (HeapTupleSatisfiesVisibility(tuple, &SnapshotDirty, hscan->rs_cbuf))
+ if (HeapTupleSatisfiesVisibility(rel, tuple, &SnapshotDirty, hscan->rs_cbuf))
{
stat.tuple_len += tuple->t_len;
stat.tuple_count++;
diff --git a/src/backend/access/brin/brin_revmap.c b/src/backend/access/brin/brin_revmap.c
index e2bfbf8..97041a8 100644
--- a/src/backend/access/brin/brin_revmap.c
+++ b/src/backend/access/brin/brin_revmap.c
@@ -25,6 +25,7 @@
#include "access/brin_revmap.h"
#include "access/brin_tuple.h"
#include "access/brin_xlog.h"
+#include "access/brin.h"
#include "access/rmgr.h"
#include "access/xloginsert.h"
#include "miscadmin.h"
@@ -79,6 +80,11 @@ brinRevmapInitialize(Relation idxrel, BlockNumber *pagesPerRange,
meta = ReadBuffer(idxrel, BRIN_METAPAGE_BLKNO);
LockBuffer(meta, BUFFER_LOCK_SHARE);
page = BufferGetPage(meta);
+
+ if (GlobalTempRelationPageIsNotInitialized(idxrel, page))
+ brin_metapage_init(page, BrinGetPagesPerRange(idxrel),
+ BRIN_CURRENT_VERSION);
+
TestForOldSnapshot(snapshot, idxrel, page);
metadata = (BrinMetaPageData *) PageGetContents(page);
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 439a91b..8a6ac71 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -241,6 +241,16 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
metapage = BufferGetPage(metabuffer);
+ if (GlobalTempRelationPageIsNotInitialized(index, metapage))
+ {
+ Buffer rootbuffer = ReadBuffer(index, GIN_ROOT_BLKNO);
+ LockBuffer(rootbuffer, BUFFER_LOCK_EXCLUSIVE);
+ GinInitMetabuffer(metabuffer);
+ GinInitBuffer(rootbuffer, GIN_LEAF);
+ MarkBufferDirty(rootbuffer);
+ UnlockReleaseBuffer(rootbuffer);
+ }
+
/*
* An insertion to the pending list could logically belong anywhere in the
* tree, so it conflicts with all serializable scans. All scans acquire a
diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c
index b18ae2b..41bab5d 100644
--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -1750,7 +1750,7 @@ collectMatchesForHeapRow(IndexScanDesc scan, pendingPosition *pos)
/*
* Collect all matched rows from pending list into bitmap.
*/
-static void
+static bool
scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
{
GinScanOpaque so = (GinScanOpaque) scan->opaque;
@@ -1774,6 +1774,12 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
LockBuffer(metabuffer, GIN_SHARE);
page = BufferGetPage(metabuffer);
TestForOldSnapshot(scan->xs_snapshot, scan->indexRelation, page);
+
+ if (GlobalTempRelationPageIsNotInitialized(scan->indexRelation, page))
+ {
+ UnlockReleaseBuffer(metabuffer);
+ return false;
+ }
blkno = GinPageGetMeta(page)->head;
/*
@@ -1784,7 +1790,7 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
{
/* No pending list, so proceed with normal scan */
UnlockReleaseBuffer(metabuffer);
- return;
+ return true;
}
pos.pendingBuffer = ReadBuffer(scan->indexRelation, blkno);
@@ -1840,6 +1846,7 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
}
pfree(pos.hasMatchKey);
+ return true;
}
@@ -1875,7 +1882,8 @@ gingetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
* to scan the main index before the pending list, since concurrent
* cleanup could then make us miss entries entirely.
*/
- scanPendingInsert(scan, tbm, &ntids);
+ if (!scanPendingInsert(scan, tbm, &ntids))
+ return 0;
/*
* Now scan the main index.
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index c945b28..14d4e48 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &rnode, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
}
}
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index e9ca4b8..d3ea072 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -677,7 +677,10 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace,
if (!xlocked)
{
LockBuffer(stack->buffer, GIST_SHARE);
- gistcheckpage(state.r, stack->buffer);
+ if (stack->blkno == GIST_ROOT_BLKNO && GlobalTempRelationPageIsNotInitialized(state.r, BufferGetPage(stack->buffer)))
+ GISTInitBuffer(stack->buffer, F_LEAF);
+ else
+ gistcheckpage(state.r, stack->buffer);
}
stack->page = (Page) BufferGetPage(stack->buffer);
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 4e0c500..cced239 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -339,7 +339,10 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
buffer = ReadBuffer(scan->indexRelation, pageItem->blkno);
LockBuffer(buffer, GIST_SHARE);
PredicateLockPage(r, BufferGetBlockNumber(buffer), scan->xs_snapshot);
- gistcheckpage(scan->indexRelation, buffer);
+ if (pageItem->blkno == GIST_ROOT_BLKNO && GlobalTempRelationPageIsNotInitialized(r, BufferGetPage(buffer)))
+ GISTInitBuffer(buffer, F_LEAF);
+ else
+ gistcheckpage(scan->indexRelation, buffer);
page = BufferGetPage(buffer);
TestForOldSnapshot(scan->xs_snapshot, r, page);
opaque = GistPageGetOpaque(page);
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 9726020..c99701d 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -1028,7 +1028,7 @@ gistGetFakeLSN(Relation rel)
{
static XLogRecPtr counter = FirstNormalUnloggedLSN;
- if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ if (RelationHasSessionScope(rel))
{
/*
* Temporary relations are only accessible in our session, so a simple
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index defdc9b..35de5fa 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -75,13 +75,20 @@ _hash_getbuf(Relation rel, BlockNumber blkno, int access, int flags)
buf = ReadBuffer(rel, blkno);
- if (access != HASH_NOLOCK)
- LockBuffer(buf, access);
-
/* ref count and lock type are correct */
- _hash_checkpage(rel, buf, flags);
-
+ if (blkno == HASH_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ {
+ _hash_init(rel, 0, MAIN_FORKNUM);
+ if (access != HASH_NOLOCK)
+ LockBuffer(buf, access);
+ }
+ else
+ {
+ if (access != HASH_NOLOCK)
+ LockBuffer(buf, access);
+ _hash_checkpage(rel, buf, flags);
+ }
return buf;
}
@@ -339,7 +346,7 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
bool use_wal;
/* safety check */
- if (RelationGetNumberOfBlocksInFork(rel, forkNum) != 0)
+ if (rel->rd_rel->relpersistence != RELPERSISTENCE_SESSION && RelationGetNumberOfBlocksInFork(rel, forkNum) != 0)
elog(ERROR, "cannot initialize non-empty hash index \"%s\"",
RelationGetRelationName(rel));
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 9430994..181efde 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -444,7 +444,7 @@ heapgetpage(TableScanDesc sscan, BlockNumber page)
if (all_visible)
valid = true;
else
- valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(scan->rs_base.rs_rd, &loctup, snapshot, buffer);
CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
&loctup, buffer, snapshot);
@@ -664,7 +664,8 @@ heapgettup(HeapScanDesc scan,
/*
* if current tuple qualifies, return it.
*/
- valid = HeapTupleSatisfiesVisibility(tuple,
+ valid = HeapTupleSatisfiesVisibility(scan->rs_base.rs_rd,
+ tuple,
snapshot,
scan->rs_cbuf);
@@ -1474,7 +1475,7 @@ heap_fetch(Relation relation,
/*
* check tuple visibility, then release lock
*/
- valid = HeapTupleSatisfiesVisibility(tuple, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(relation, tuple, snapshot, buffer);
if (valid)
PredicateLockTuple(relation, tuple, snapshot);
@@ -1612,7 +1613,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer), offnum);
/* If it's visible per the snapshot, we must return it */
- valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(relation, heapTuple, snapshot, buffer);
CheckForSerializableConflictOut(valid, relation, heapTuple,
buffer, snapshot);
/* reset to original, non-redirected, tid */
@@ -1754,7 +1755,7 @@ heap_get_latest_tid(TableScanDesc sscan,
* Check tuple visibility; if visible, set it as the new result
* candidate.
*/
- valid = HeapTupleSatisfiesVisibility(&tp, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(relation, &tp, snapshot, buffer);
CheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
if (valid)
*tid = ctid;
@@ -1851,6 +1852,14 @@ ReleaseBulkInsertStatePin(BulkInsertState bistate)
}
+static TransactionId
+GetTransactionId(Relation relation)
+{
+ return relation->rd_rel->relpersistence == RELPERSISTENCE_SESSION && RecoveryInProgress()
+ ? GetReplicaTransactionId()
+ : GetCurrentTransactionId();
+}
+
/*
* heap_insert - insert tuple into a heap
*
@@ -1873,7 +1882,7 @@ void
heap_insert(Relation relation, HeapTuple tup, CommandId cid,
int options, BulkInsertState bistate)
{
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
HeapTuple heaptup;
Buffer buffer;
Buffer vmbuffer = InvalidBuffer;
@@ -2110,7 +2119,7 @@ void
heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
CommandId cid, int options, BulkInsertState bistate)
{
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
HeapTuple *heaptuples;
int i;
int ndone;
@@ -2449,7 +2458,7 @@ heap_delete(Relation relation, ItemPointer tid,
TM_FailureData *tmfd, bool changingPart)
{
TM_Result result;
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
ItemId lp;
HeapTupleData tp;
Page page;
@@ -2514,7 +2523,7 @@ heap_delete(Relation relation, ItemPointer tid,
tp.t_self = *tid;
l1:
- result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
+ result = HeapTupleSatisfiesUpdate(relation, &tp, cid, buffer);
if (result == TM_Invisible)
{
@@ -2633,7 +2642,7 @@ l1:
if (crosscheck != InvalidSnapshot && result == TM_Ok)
{
/* Perform additional check for transaction-snapshot mode RI updates */
- if (!HeapTupleSatisfiesVisibility(&tp, crosscheck, buffer))
+ if (!HeapTupleSatisfiesVisibility(relation, &tp, crosscheck, buffer))
result = TM_Updated;
}
@@ -2900,7 +2909,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
TM_FailureData *tmfd, LockTupleMode *lockmode)
{
TM_Result result;
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
Bitmapset *hot_attrs;
Bitmapset *key_attrs;
Bitmapset *id_attrs;
@@ -3070,7 +3079,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
l2:
checked_lockers = false;
locker_remains = false;
- result = HeapTupleSatisfiesUpdate(&oldtup, cid, buffer);
+ result = HeapTupleSatisfiesUpdate(relation, &oldtup, cid, buffer);
/* see below about the "no wait" case */
Assert(result != TM_BeingModified || wait);
@@ -3262,7 +3271,7 @@ l2:
if (crosscheck != InvalidSnapshot && result == TM_Ok)
{
/* Perform additional check for transaction-snapshot mode RI updates */
- if (!HeapTupleSatisfiesVisibility(&oldtup, crosscheck, buffer))
+ if (!HeapTupleSatisfiesVisibility(relation, &oldtup, crosscheck, buffer))
{
result = TM_Updated;
Assert(!ItemPointerEquals(&oldtup.t_self, &oldtup.t_data->t_ctid));
@@ -4018,7 +4027,7 @@ heap_lock_tuple(Relation relation, HeapTuple tuple,
tuple->t_tableOid = RelationGetRelid(relation);
l3:
- result = HeapTupleSatisfiesUpdate(tuple, cid, *buffer);
+ result = HeapTupleSatisfiesUpdate(relation, tuple, cid, *buffer);
if (result == TM_Invisible)
{
@@ -4193,7 +4202,7 @@ l3:
TM_Result res;
res = heap_lock_updated_tuple(relation, tuple, &t_ctid,
- GetCurrentTransactionId(),
+ GetTransactionId(relation),
mode);
if (res != TM_Ok)
{
@@ -4441,7 +4450,7 @@ l3:
TM_Result res;
res = heap_lock_updated_tuple(relation, tuple, &t_ctid,
- GetCurrentTransactionId(),
+ GetTransactionId(relation),
mode);
if (res != TM_Ok)
{
@@ -4550,7 +4559,7 @@ failed:
* state if multixact.c elogs.
*/
compute_new_xmax_infomask(xmax, old_infomask, tuple->t_data->t_infomask2,
- GetCurrentTransactionId(), mode, false,
+ GetTransactionId(relation), mode, false,
&xid, &new_infomask, &new_infomask2);
START_CRIT_SECTION();
@@ -5570,7 +5579,7 @@ heap_finish_speculative(Relation relation, ItemPointer tid)
void
heap_abort_speculative(Relation relation, ItemPointer tid)
{
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
ItemId lp;
HeapTupleData tp;
Page page;
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 09bc6fe..a189834 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -226,7 +226,8 @@ heapam_tuple_satisfies_snapshot(Relation rel, TupleTableSlot *slot,
* Caller should be holding pin, but not lock.
*/
LockBuffer(bslot->buffer, BUFFER_LOCK_SHARE);
- res = HeapTupleSatisfiesVisibility(bslot->base.tuple, snapshot,
+
+ res = HeapTupleSatisfiesVisibility(rel, bslot->base.tuple, snapshot,
bslot->buffer);
LockBuffer(bslot->buffer, BUFFER_LOCK_UNLOCK);
@@ -671,6 +672,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
@@ -2160,7 +2162,7 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,
loctup.t_len = ItemIdGetLength(lp);
loctup.t_tableOid = scan->rs_rd->rd_id;
ItemPointerSet(&loctup.t_self, page, offnum);
- valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(scan->rs_rd, &loctup, snapshot, buffer);
if (valid)
{
hscan->rs_vistuples[ntup++] = offnum;
@@ -2480,7 +2482,7 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
else
{
/* Otherwise, we have to check the tuple individually. */
- return HeapTupleSatisfiesVisibility(tuple, scan->rs_snapshot,
+ return HeapTupleSatisfiesVisibility(scan->rs_rd, tuple, scan->rs_snapshot,
buffer);
}
}
diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c
index 537e681..3076f6a 100644
--- a/src/backend/access/heap/heapam_visibility.c
+++ b/src/backend/access/heap/heapam_visibility.c
@@ -77,6 +77,7 @@
#include "utils/combocid.h"
#include "utils/snapmgr.h"
+static bool TempTupleSatisfiesVisibility(HeapTuple htup, CommandId curcid, Buffer buffer);
/*
* SetHintBits()
@@ -454,7 +455,7 @@ HeapTupleSatisfiesToast(HeapTuple htup, Snapshot snapshot,
* test for it themselves.)
*/
TM_Result
-HeapTupleSatisfiesUpdate(HeapTuple htup, CommandId curcid,
+HeapTupleSatisfiesUpdate(Relation relation, HeapTuple htup, CommandId curcid,
Buffer buffer)
{
HeapTupleHeader tuple = htup->t_data;
@@ -462,6 +463,13 @@ HeapTupleSatisfiesUpdate(HeapTuple htup, CommandId curcid,
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
+ if (relation->rd_rel->relpersistence == RELPERSISTENCE_SESSION && RecoveryInProgress())
+ {
+ AccessTempRelationAtReplica = true;
+ return TempTupleSatisfiesVisibility(htup, curcid, buffer) ? TM_Ok : TM_Invisible;
+ }
+ AccessTempRelationAtReplica = false;
+
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -1677,6 +1685,59 @@ HeapTupleSatisfiesHistoricMVCC(HeapTuple htup, Snapshot snapshot,
}
/*
+ * TempTupleSatisfiesVisibility
+ * True iff a global temp table tuple is visible to the current transaction.
+ *
+ * Temporary tables are visible only to the current backend, so there is no need to
+ * handle tuples committed by other backends. We only need to exclude
+ * modifications done by aborted transactions or after the start of the table scan.
+ *
+ */
+static bool
+TempTupleSatisfiesVisibility(HeapTuple htup, CommandId curcid, Buffer buffer)
+{
+ HeapTupleHeader tuple = htup->t_data;
+ TransactionId xmin;
+ TransactionId xmax;
+
+ Assert(ItemPointerIsValid(&htup->t_self));
+ Assert(htup->t_tableOid != InvalidOid);
+
+ if (HeapTupleHeaderXminInvalid(tuple))
+ return false;
+
+ xmin = HeapTupleHeaderGetRawXmin(tuple);
+
+ if (IsReplicaTransactionAborted(xmin))
+ return false;
+
+ if (IsReplicaCurrentTransactionId(xmin)
+ && HeapTupleHeaderGetCmin(tuple) >= curcid)
+ {
+ return false; /* inserted after scan started */
+ }
+
+ if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid */
+ return true;
+
+ if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask)) /* not deleter */
+ return true;
+
+ xmax = (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
+ ? HeapTupleGetUpdateXid(tuple)
+ : HeapTupleHeaderGetRawXmax(tuple);
+
+ if (IsReplicaTransactionAborted(xmax))
+ return true; /* updating subtransaction aborted */
+
+ if (!IsReplicaCurrentTransactionId(xmax))
+ return false; /* updating transaction committed */
+
+ return (HeapTupleHeaderGetCmax(tuple) >= curcid); /* updated after scan started */
+}
+
+
+/*
* HeapTupleSatisfiesVisibility
* True iff heap tuple satisfies a time qual.
*
@@ -1687,8 +1748,15 @@ HeapTupleSatisfiesHistoricMVCC(HeapTuple htup, Snapshot snapshot,
* if so, the indicated buffer is marked dirty.
*/
bool
-HeapTupleSatisfiesVisibility(HeapTuple tup, Snapshot snapshot, Buffer buffer)
+HeapTupleSatisfiesVisibility(Relation relation, HeapTuple tup, Snapshot snapshot, Buffer buffer)
{
+ if (relation->rd_rel->relpersistence == RELPERSISTENCE_SESSION && RecoveryInProgress())
+ {
+ AccessTempRelationAtReplica = true;
+ return TempTupleSatisfiesVisibility(tup, snapshot->curcid, buffer);
+ }
+ AccessTempRelationAtReplica = false;
+
switch (snapshot->snapshot_type)
{
case SNAPSHOT_MVCC:
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 5962126..bdb6c95 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -763,7 +763,11 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* A session temporary relation may not yet be initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ _bt_initmetapage(BufferGetPage(buf), P_NONE, 0);
+ else
+ _bt_checkpage(rel, buf);
}
else
{
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 45472db..a8497a2 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -106,6 +106,7 @@ spgGetCache(Relation index)
spgConfigIn in;
FmgrInfo *procinfo;
Buffer metabuffer;
+ Page metapage;
SpGistMetaPageData *metadata;
cache = MemoryContextAllocZero(index->rd_indexcxt,
@@ -155,12 +156,32 @@ spgGetCache(Relation index)
metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
LockBuffer(metabuffer, BUFFER_LOCK_SHARE);
- metadata = SpGistPageGetMeta(BufferGetPage(metabuffer));
+ metapage = BufferGetPage(metabuffer);
+ metadata = SpGistPageGetMeta(metapage);
if (metadata->magicNumber != SPGIST_MAGIC_NUMBER)
- elog(ERROR, "index \"%s\" is not an SP-GiST index",
- RelationGetRelationName(index));
+ {
+ if (GlobalTempRelationPageIsNotInitialized(index, metapage))
+ {
+ Buffer rootbuffer = ReadBuffer(index, SPGIST_ROOT_BLKNO);
+ Buffer nullbuffer = ReadBuffer(index, SPGIST_NULL_BLKNO);
+
+ SpGistInitMetapage(metapage);
+
+ LockBuffer(rootbuffer, BUFFER_LOCK_EXCLUSIVE);
+ SpGistInitPage(BufferGetPage(rootbuffer), SPGIST_LEAF);
+ MarkBufferDirty(rootbuffer);
+ UnlockReleaseBuffer(rootbuffer);
+ LockBuffer(nullbuffer, BUFFER_LOCK_EXCLUSIVE);
+ SpGistInitPage(BufferGetPage(nullbuffer), SPGIST_LEAF | SPGIST_NULLS);
+ MarkBufferDirty(nullbuffer);
+ UnlockReleaseBuffer(nullbuffer);
+ }
+ else
+ elog(ERROR, "index \"%s\" is not an SP-GiST index",
+ RelationGetRelationName(index));
+ }
cache->lastUsedPages = metadata->lastUsedPages;
UnlockReleaseBuffer(metabuffer);
diff --git a/src/backend/access/transam/transam.c b/src/backend/access/transam/transam.c
index 365ddfb..bce9c4a 100644
--- a/src/backend/access/transam/transam.c
+++ b/src/backend/access/transam/transam.c
@@ -22,6 +22,7 @@
#include "access/clog.h"
#include "access/subtrans.h"
#include "access/transam.h"
+#include "access/xact.h"
#include "utils/snapmgr.h"
/*
@@ -126,6 +127,9 @@ TransactionIdDidCommit(TransactionId transactionId)
{
XidStatus xidstatus;
+ if (AccessTempRelationAtReplica)
+ return !IsReplicaCurrentTransactionId(transactionId) && !IsReplicaTransactionAborted(transactionId);
+
xidstatus = TransactionLogFetch(transactionId);
/*
@@ -182,6 +186,9 @@ TransactionIdDidAbort(TransactionId transactionId)
{
XidStatus xidstatus;
+ if (AccessTempRelationAtReplica)
+ return IsReplicaTransactionAborted(transactionId);
+
xidstatus = TransactionLogFetch(transactionId);
/*
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 5b759ec..388faae 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -71,7 +71,7 @@ GetNewTransactionId(bool isSubXact)
/* safety check, we should never get this far in a HS standby */
if (RecoveryInProgress())
- elog(ERROR, "cannot assign TransactionIds during recovery");
+ elog(ERROR, "cannot assign TransactionIds during recovery");
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 1bbaeee..ab1bef9 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -192,6 +192,7 @@ typedef struct TransactionStateData
int parallelModeLevel; /* Enter/ExitParallelMode counter */
bool chain; /* start a new block after this one */
struct TransactionStateData *parent; /* back link to parent */
+ TransactionId replicaTransactionId; /* pseudo XID for inserting data into global temp tables at a replica */
} TransactionStateData;
typedef TransactionStateData *TransactionState;
@@ -286,6 +287,12 @@ typedef struct XactCallbackItem
static XactCallbackItem *Xact_callbacks = NULL;
+static TransactionId replicaTransIdCount = FirstNormalTransactionId;
+static TransactionId replicaTopTransId;
+static Bitmapset* replicaAbortedXids;
+
+bool AccessTempRelationAtReplica;
+
/*
* List of add-on start- and end-of-subxact callbacks
*/
@@ -443,6 +450,48 @@ GetCurrentTransactionIdIfAny(void)
}
/*
+ * Transactions at a replica can update only global temporary tables.
+ * They are assigned backend-local XIDs which are independent of the normal XIDs received from the primary node,
+ * so tuples of temporary tables at a replica require special visibility rules.
+ *
+ * XIDs for such transactions at a replica are created on demand (when a tuple of a temp table is updated).
+ * XID wrap-around and adjusting the XID horizon are not supported, so the number of such transactions at a replica
+ * is limited to 2^32, and marking aborted ones requires an in-memory bitmap of up to 2^29 bytes.
+ */
+TransactionId
+GetReplicaTransactionId(void)
+{
+ TransactionState s = CurrentTransactionState;
+ if (!TransactionIdIsValid(s->replicaTransactionId))
+ s->replicaTransactionId = ++replicaTransIdCount;
+ return s->replicaTransactionId;
+}
+
+/*
+ * At a replica a transaction can update only temporary tables,
+ * and such transactions are assigned special XIDs (not related to the normal XIDs received from the primary node).
+ * Since we see only our own transactions, it is not necessary to mark committed transactions;
+ * marking aborted ones is enough. All transactions which are not marked as aborted are treated as
+ * committed or as our own in-progress transactions.
+ */
+bool
+IsReplicaTransactionAborted(TransactionId xid)
+{
+ return bms_is_member(xid, replicaAbortedXids);
+}
+
+/*
+ * Since XIDs for transactions at a replica are generated individually by each backend,
+ * we can check whether an XID belongs to the current transaction or any of its subtransactions
+ * just by comparing it with the XID of the top transaction.
+ */
+bool
+IsReplicaCurrentTransactionId(TransactionId xid)
+{
+ return xid > replicaTopTransId;
+}
+
+/*
* GetTopFullTransactionId
*
* This will return the FullTransactionId of the main transaction, assigning
@@ -855,6 +904,9 @@ TransactionIdIsCurrentTransactionId(TransactionId xid)
{
TransactionState s;
+ if (AccessTempRelationAtReplica)
+ return IsReplicaCurrentTransactionId(xid);
+
/*
* We always say that BootstrapTransactionId is "not my transaction ID"
* even when it is (ie, during bootstrap). Along with the fact that
@@ -1206,7 +1258,7 @@ static TransactionId
RecordTransactionCommit(void)
{
TransactionId xid = GetTopTransactionIdIfAny();
- bool markXidCommitted = TransactionIdIsValid(xid);
+ bool markXidCommitted = TransactionIdIsNormal(xid);
TransactionId latestXid = InvalidTransactionId;
int nrels;
RelFileNode *rels;
@@ -1624,7 +1676,7 @@ RecordTransactionAbort(bool isSubXact)
* rels to delete (note that this routine is not responsible for actually
* deleting 'em). We cannot have any child XIDs, either.
*/
- if (!TransactionIdIsValid(xid))
+ if (!TransactionIdIsNormal(xid))
{
/* Reset XactLastRecEnd until the next transaction writes something */
if (!isSubXact)
@@ -1892,6 +1944,8 @@ StartTransaction(void)
s = &TopTransactionStateData;
CurrentTransactionState = s;
+ replicaTopTransId = replicaTransIdCount;
+
Assert(!FullTransactionIdIsValid(XactTopFullTransactionId));
/* check the current transaction state */
@@ -1905,6 +1959,7 @@ StartTransaction(void)
*/
s->state = TRANS_START;
s->fullTransactionId = InvalidFullTransactionId; /* until assigned */
+ s->replicaTransactionId = InvalidTransactionId; /* until assigned */
/* Determine if statements are logged in this transaction */
xact_is_sampled = log_xact_sample_rate != 0 &&
@@ -2570,6 +2625,14 @@ AbortTransaction(void)
/* Prevent cancel/die interrupt while cleaning up */
HOLD_INTERRUPTS();
+ /* Mark transactions that modified global temp tables at the replica as aborted */
+ if (TransactionIdIsValid(s->replicaTransactionId))
+ {
+ MemoryContext ctx = MemoryContextSwitchTo(TopMemoryContext);
+ replicaAbortedXids = bms_add_member(replicaAbortedXids, s->replicaTransactionId);
+ MemoryContextSwitchTo(ctx);
+ }
+
/* Make sure we have a valid memory context and resource owner */
AtAbort_Memory();
AtAbort_ResourceOwner();
@@ -2991,6 +3054,9 @@ CommitTransactionCommand(void)
* and then clean up.
*/
case TBLOCK_ABORT_PENDING:
+ if (GetCurrentTransactionIdIfAny() == FrozenTransactionId)
+ elog(FATAL, "transaction is aborted at standby");
+
AbortTransaction();
CleanupTransaction();
s->blockState = TBLOCK_DEFAULT;
@@ -4856,6 +4922,14 @@ AbortSubTransaction(void)
/* Prevent cancel/die interrupt while cleaning up */
HOLD_INTERRUPTS();
+ /* Mark transactions that modified global temp tables at the replica as aborted */
+ if (TransactionIdIsValid(s->replicaTransactionId))
+ {
+ MemoryContext ctx = MemoryContextSwitchTo(TopMemoryContext);
+ replicaAbortedXids = bms_add_member(replicaAbortedXids, s->replicaTransactionId);
+ MemoryContextSwitchTo(ctx);
+ }
+
/* Make sure we have a valid memory context and resource owner */
AtSubAbort_Memory();
AtSubAbort_ResourceOwner();
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 3ec67d4..edec8ca 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -213,6 +213,7 @@ void
XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
{
registered_buffer *regbuf;
+ RelFileNodeBackend rnode;
/* NO_IMAGE doesn't make sense with FORCE_IMAGE */
Assert(!((flags & REGBUF_FORCE_IMAGE) && (flags & (REGBUF_NO_IMAGE))));
@@ -227,7 +228,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &rnode, &regbuf->forkno, &regbuf->block);
+ regbuf->rnode = rnode.node;
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -919,7 +921,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blkno;
@@ -948,7 +950,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
flags |= REGBUF_STANDARD;
BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ XLogRegisterBlock(0, &rnode.node, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1009,7 +1011,7 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkNum;
BlockNumber blkno;
@@ -1018,7 +1020,7 @@ log_newpage_buffer(Buffer buffer, bool page_std)
BufferGetTag(buffer, &rnode, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rnode.node, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index a065419..8814afb 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -409,6 +409,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 99ae159..24b2438 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3612,7 +3612,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 3cc886f..a111ddc 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index cedb4ee..d11c5b3 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1400,7 +1400,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 0960b33..6c3998f 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (GlobalTempRelationPageIsNotInitialized(rel, page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index fb2be10..a31f775 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -586,7 +586,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && !IsLocalRelpersistence(stmt->relation->relpersistence))
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -1772,7 +1772,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -7678,6 +7679,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14082,6 +14089,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -14569,14 +14583,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index dbd7dd9..efe6f21 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -788,6 +788,9 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
if (isTempNamespace(get_rel_namespace(rte->relid)))
continue;
+ if (get_rel_persistence(rte->relid) == RELPERSISTENCE_SESSION)
+ continue;
+
PreventCommandIfReadOnly(CreateCommandTag((Node *) plannedstmt));
}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 98e9948..1a9170b 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -124,7 +124,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
relation = table_open(relationObjectId, NoLock);
/* Temporary and unlogged relations are inaccessible during recovery. */
- if (!RelationNeedsWAL(relation) && RecoveryInProgress())
+ if (!RelationNeedsWAL(relation) && RecoveryInProgress() && relation->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot access temporary or unlogged relations during recovery")));
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c97bb36..f9b2000 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3265,20 +3265,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index a9b2f8b..2f261b9 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why should we not always use the persistence of the parent table?
+ * Although unlogged sequences are prohibited,
+ * unlogged tables with SERIAL fields are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313..5760a9c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2154,7 +2154,7 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
continue;
relid = classForm->oid;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index e8ffa04..2004d2f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3483,6 +3483,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
{
ReorderBufferTupleCidKey key;
ReorderBufferTupleCidEnt *ent;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blockno;
bool updated_mapping = false;
@@ -3496,7 +3497,8 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
* get relfilenode from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &rnode, &forkno, &blockno);
+ key.relnode = rnode.node;
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 6f3a402..76ce953 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -556,7 +556,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
int buf_id;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -710,7 +710,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
Block bufBlock;
bool found;
bool isExtend;
- bool isLocalBuf = SmgrIsTemp(smgr);
+ bool isLocalBuf = SmgrIsTemp(smgr) && relpersistence == RELPERSISTENCE_TEMP;
*hit = false;
@@ -1010,7 +1010,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1532,7 +1532,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1543,7 +1544,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -1845,8 +1847,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rnode.node.spcNode;
+ item->relNode = bufHdr->tag.rnode.node.relNode;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2559,7 +2561,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rnode.node, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2631,7 +2633,7 @@ BufferGetBlockNumber(Buffer buffer)
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2696,7 +2698,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rnode.node, buf->tag.rnode.backend);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -2930,7 +2932,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
int i;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileNodeBackendIsLocalTemp(rnode))
{
if (rnode.backend == MyBackendId)
DropRelFileNodeLocalBuffers(rnode.node, forkNum, firstDelBlock);
@@ -2958,11 +2960,11 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
* We could check forkNum and blockNum as well as the rnode, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode))
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
InvalidateBuffer(bufHdr); /* releases spinlock */
@@ -2985,24 +2987,24 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
{
int i,
n = 0;
- RelFileNode *nodes;
+ RelFileNodeBackend *nodes;
bool use_bsearch;
if (nnodes == 0)
return;
- nodes = palloc(sizeof(RelFileNode) * nnodes); /* non-local relations */
+ nodes = palloc(sizeof(RelFileNodeBackend) * nnodes); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
for (i = 0; i < nnodes; i++)
{
- if (RelFileNodeBackendIsTemp(rnodes[i]))
+ if (RelFileNodeBackendIsLocalTemp(rnodes[i]))
{
if (rnodes[i].backend == MyBackendId)
DropRelFileNodeAllLocalBuffers(rnodes[i].node);
}
else
- nodes[n++] = rnodes[i].node;
+ nodes[n++] = rnodes[i];
}
/*
@@ -3025,11 +3027,11 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
/* sort the list of rnodes if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(nodes, n, sizeof(RelFileNodeBackend), rnode_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileNodeBackend *rnode = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
@@ -3044,7 +3046,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, nodes[j]))
{
rnode = &nodes[j];
break;
@@ -3054,7 +3056,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
else
{
rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
+ nodes, n, sizeof(RelFileNodeBackend),
rnode_comparator);
}
@@ -3063,7 +3065,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, (*rnode)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3102,11 +3104,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rnode.node.dbNode == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3136,7 +3138,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpath(buf->tag.rnode, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3204,7 +3206,8 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3251,13 +3254,15 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node)
+ || bufHdr->tag.rnode.backend != rel->rd_backend)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3305,13 +3310,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rnode.node.dbNode == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4051,7 +4056,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpath(buf->tag.rnode, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4075,7 +4080,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpath(bufHdr->tag.rnode, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4093,7 +4098,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4108,22 +4113,27 @@ local_buffer_write_error_callback(void *arg)
static int
rnode_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileNodeBackend n1 = *(const RelFileNodeBackend *) p1;
+ RelFileNodeBackend n2 = *(const RelFileNodeBackend *) p2;
- if (n1.relNode < n2.relNode)
+ if (n1.node.relNode < n2.node.relNode)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.node.relNode > n2.node.relNode)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.node.dbNode < n2.node.dbNode)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.node.dbNode > n2.node.dbNode)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.node.spcNode < n2.node.spcNode)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.node.spcNode > n2.node.spcNode)
+ return 1;
+
+ if (n1.backend < n2.backend)
+ return -1;
+ else if (n1.backend > n2.backend)
return 1;
else
return 0;
@@ -4359,7 +4369,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileNodeBackendEquals(cur->tag.rnode, next->tag.rnode) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4378,7 +4388,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rnode.node, tag.rnode.backend);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index f5f6a29..6bd5ecb 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ LocalPrefetchBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -111,7 +111,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -209,7 +209,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rnode.node, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -331,14 +331,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -377,12 +377,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index cf7f03f..65eb422 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
BufferGetTag(buf, &rnode, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index fadab62..055ec6b 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -994,6 +994,9 @@ TransactionIdIsInProgress(TransactionId xid)
int i,
j;
+ if (AccessTempRelationAtReplica)
+ return IsReplicaCurrentTransactionId(xid) && !IsReplicaTransactionAborted(xid);
+
/*
* Don't bother checking a transaction older than RecentXmin; it could not
* possibly still be running. (Note: in particular, this guarantees that
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 07f3c93..204c4cb 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -33,6 +33,7 @@
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
@@ -87,6 +88,18 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to collect information about session relations created by this backend.
+ * Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +165,48 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Remove relation pages from shared buffers */
+ DropRelFileNodesAllBuffers(&rel->rnode, 1);
+
+ /* Delete relation files */
+ mdunlink(rel->rnode, InvalidForkNumber, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or when it is opened and its file doesn't exist yet.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We use the shared memory exit hook because
+ * cleanup must be performed before the backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +273,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln);
pfree(path);
@@ -465,6 +522,19 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * When a session relation is accessed, its files may not yet exist for this backend.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +546,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +723,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* For a session relation we need to write the zeroed page so that a subsequent mdnblocks returns the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -738,12 +814,18 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
+ /*
+ * If we access a session relation, its files may not yet exist for this backend.
+ * Pass EXTENSION_RETURN_NULL to make mdopenfork return NULL in this case instead of reporting an error.
+ */
+ MdfdVec *v = mdopenfork(reln, forknum, RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode)
+ ? EXTENSION_RETURN_NULL : EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
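Stepping back from the hunks above: the registration and cleanup scheme in md.c
boils down to a process-local linked list plus a one-time exit hook. Below is a
standalone simulation of that flow (plain C; the PostgreSQL APIs are replaced by
stand-ins, so this is illustrative only, not code from the patch):

#include <stdio.h>
#include <stdlib.h>

typedef struct SessionRelation
{
    int rel_id;                       /* stand-in for RelFileNodeBackend */
    struct SessionRelation *next;
} SessionRelation;

static SessionRelation *SessionRelations;

/* Stand-in for TruncateSessionRelations: drop buffers and files of every
 * session relation touched by this backend. */
static void
cleanup_session_relations(void)
{
    for (SessionRelation *rel = SessionRelations; rel != NULL; rel = rel->next)
        printf("dropping buffers and files of session relation %d\n", rel->rel_id);
}

/* Stand-in for RegisterSessionRelation: link a relation into the list on
 * first touch and install the exit hook exactly once.
 * Error handling omitted for brevity. */
static void
register_session_relation(int rel_id)
{
    SessionRelation *rel = malloc(sizeof(SessionRelation));

    if (SessionRelations == NULL)
        atexit(cleanup_session_relations);   /* stands in for on_shmem_exit */

    rel->rel_id = rel_id;
    rel->next = SessionRelations;
    SessionRelations = rel;
}

int
main(void)
{
    register_session_relation(16384);   /* created in this session */
    register_session_relation(16385);   /* first opened in this session */
    return 0;                           /* hook runs at process exit */
}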
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2488607..86e8fca 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1098,6 +1098,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3301,6 +3305,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0cc9ede..1dff0c8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15593,8 +15593,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * the SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract SessionRelFirstBackendId
+ * to avoid overly long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary relations
+ * to make it possible to clean them up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
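As an aside, to illustrate the biasing scheme in the comment above: a session
relation of backend 5 ends up with the same t5_ file name prefix as that
backend's local temporary relations. A standalone sketch (plain C; simplified
MAIN_FORKNUM branch of GetRelationPath, with invented OIDs):

#include <stdio.h>

#define SessionRelFirstBackendId 0x40000000

/* Simplified sketch of the MAIN_FORKNUM branch of GetRelationPath */
static void
print_relation_path(unsigned dbNode, int backendId, unsigned relNode)
{
    /* Session relations carry the bias; strip it for the file name */
    if (backendId >= SessionRelFirstBackendId)
        backendId -= SessionRelFirstBackendId;
    printf("base/%u/t%d_%u\n", dbNode, backendId, relNode);
}

int
main(void)
{
    int MyBackendId = 5;

    /* local temp relation of backend 5: base/13593/t5_16384 */
    print_relation_path(13593, MyBackendId, 16384);
    /* session (global temp) relation of the same backend: base/13593/t5_16385 */
    print_relation_path(13593, MyBackendId + SessionRelFirstBackendId, 16385);
    return 0;
}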
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6..2f16c58 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -195,9 +195,9 @@ extern void heap_vacuum_rel(Relation onerel,
struct VacuumParams *params, BufferAccessStrategy bstrategy);
/* in heap/heapam_visibility.c */
-extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
+extern bool HeapTupleSatisfiesVisibility(Relation relation, HeapTuple stup, Snapshot snapshot,
Buffer buffer);
-extern TM_Result HeapTupleSatisfiesUpdate(HeapTuple stup, CommandId curcid,
+extern TM_Result HeapTupleSatisfiesUpdate(Relation relation, HeapTuple stup, CommandId curcid,
Buffer buffer);
extern HTSV_Result HeapTupleSatisfiesVacuum(HeapTuple stup, TransactionId OldestXmin,
Buffer buffer);
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index d714551..cbe6760 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -41,6 +41,9 @@
extern int DefaultXactIsoLevel;
extern PGDLLIMPORT int XactIsoLevel;
+extern bool AccessTempRelationAtReplica;
+
+
/*
* We implement three isolation levels internally.
* The two stronger ones use one snapshot per database transaction;
@@ -440,4 +443,8 @@ extern void EnterParallelMode(void);
extern void ExitParallelMode(void);
extern bool IsInParallelMode(void);
+extern TransactionId GetReplicaTransactionId(void);
+extern bool IsReplicaTransactionAborted(TransactionId xid);
+extern bool IsReplicaCurrentTransactionId(TransactionId xid);
+
#endif /* XACT_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..f226e7c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add a special bias value to the backend id (since
+ * the maximum number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,10 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index df2dda7..7adb96b 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,16 +90,17 @@
*/
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileNodeBackend rnode; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rnode.node.spcNode = InvalidOid, \
+ (a).rnode.node.dbNode = InvalidOid, \
+ (a).rnode.node.relNode = InvalidOid, \
+ (a).rnode.backend = InvalidBackendId, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
@@ -113,7 +114,7 @@ typedef struct buftag
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileNodeBackendEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 509f4b7..3315fa0 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -205,7 +205,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 4ef6d8d..bac7a31 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -229,6 +229,13 @@ typedef PageHeaderData *PageHeader;
#define PageIsNew(page) (((PageHeader) (page))->pd_upper == 0)
/*
+ * Check whether a page of a global temporary relation is not yet initialized
+ */
+#define GlobalTempRelationPageIsNotInitialized(rel, page) \
+ ((rel)->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+
+
+/*
* PageGetItemId
* Returns an item identifier of a page.
*/
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether it is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether it is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether it is a local temporary relation, which exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
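As a quick sanity check of how the three predicates above partition relations,
here is a standalone sketch with the macros restated over plain backend ids
(rather than RelFileNodeBackend) and invented values:

#include <assert.h>

#define InvalidBackendId (-1)
#define SessionRelFirstBackendId 0x40000000
#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)

#define BackendIsTemp(backend)       ((backend) != InvalidBackendId)
#define BackendIsGlobalTemp(backend) IsSessionRelationBackendId(backend)
#define BackendIsLocalTemp(backend)  (BackendIsTemp(backend) && !BackendIsGlobalTemp(backend))

int
main(void)
{
    int permanent   = InvalidBackendId;               /* regular or unlogged table */
    int local_temp  = 7;                              /* local temp table of backend 7 */
    int global_temp = 7 + SessionRelFirstBackendId;   /* session table of backend 7 */

    assert(!BackendIsTemp(permanent));
    assert(BackendIsLocalTemp(local_temp) && !BackendIsGlobalTemp(local_temp));
    assert(BackendIsGlobalTemp(global_temp) && BackendIsTemp(global_temp));
    return 0;
}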
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b0fe19e..b361851 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -328,6 +328,17 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * Relation persistence is either TEMP or SESSION
+ */
+#define IsLocalRelpersistence(relpersistence) \
+ ((relpersistence) == RELPERSISTENCE_TEMP || (relpersistence) == RELPERSISTENCE_SESSION)
+
+/*
+ * Relation is either a global or a local temp table
+ */
+#define RelationHasSessionScope(relation) \
+ IsLocalRelpersistence(((relation)->rd_rel->relpersistence))
/*
* ViewOptions
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 74b5077..44df4e0 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -85,3 +85,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test with small changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to temporary child relation
+# of another session, and blocks when scanning the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..b7bf067
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,323 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest2;
+DROP TABLE global_temptest1;
+-- Unsupported ON COMMIT and foreign key combination
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
+-- Test two phase commit
+CREATE TABLE global_temptest_persistent(col int);
+CREATE GLOBAL TEMP TABLE global_temptest(col int);
+INSERT INTO global_temptest VALUES (1);
+BEGIN;
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+PREPARE TRANSACTION 'global_temp1';
+-- We can't see anything from an uncommitted transaction
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+BEGIN;
+INSERT INTO global_temptest VALUES (3);
+INSERT INTO global_temptest_persistent SELECT * FROM global_temptest;
+PREPARE TRANSACTION 'global_temp2';
+COMMIT PREPARED 'global_temp1';
+-- 1, 2
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+-- Nothing
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+\c
+-- The temp table is empty now.
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+-- Still nothing in global_temptest_persistent table;
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+INSERT INTO global_temptest VALUES (4);
+COMMIT PREPARED 'global_temp2';
+-- Only 4
+SELECT * FROM global_temptest;
+ col
+-----
+ 4
+(1 row)
+
+-- 1, 3
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+ 1
+ 3
+(2 rows)
+
+\c
+DROP TABLE global_temptest;
+DROP TABLE global_temptest_persistent;
diff --git a/src/test/regress/expected/global_temp_0.out b/src/test/regress/expected/global_temp_0.out
new file mode 100644
index 0000000..934e751
--- /dev/null
+++ b/src/test/regress/expected/global_temp_0.out
@@ -0,0 +1,326 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest2;
+DROP TABLE global_temptest1;
+-- Unsupported ON COMMIT and foreign key combination
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
+-- Test two phase commit
+CREATE TABLE global_temptest_persistent(col int);
+CREATE GLOBAL TEMP TABLE global_temptest(col int);
+INSERT INTO global_temptest VALUES (1);
+BEGIN;
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+PREPARE TRANSACTION 'global_temp1';
+ERROR: prepared transactions are disabled
+HINT: Set max_prepared_transactions to a nonzero value.
+-- We can't see anything from an uncommitted transaction
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+BEGIN;
+INSERT INTO global_temptest VALUES (3);
+INSERT INTO global_temptest_persistent SELECT * FROM global_temptest;
+PREPARE TRANSACTION 'global_temp2';
+ERROR: prepared transactions are disabled
+HINT: Set max_prepared_transactions to a nonzero value.
+COMMIT PREPARED 'global_temp1';
+ERROR: prepared transaction with identifier "global_temp1" does not exist
+-- 1, 2
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+-- Nothing
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+\c
+-- The temp table is empty now.
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+-- Still nothing in global_temptest_persistent table;
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+INSERT INTO global_temptest VALUES (4);
+COMMIT PREPARED 'global_temp2';
+ERROR: prepared transaction with identifier "global_temp2" does not exist
+-- Only 4
+SELECT * FROM global_temptest;
+ col
+-----
+ 4
+(1 row)
+
+-- 1, 3
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+\c
+DROP TABLE global_temptest;
+DROP TABLE global_temptest_persistent;
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fc0f141..507cf7d 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 68ac56a..3890777 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..4d2da8d
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,191 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+DROP TABLE global_temptest2;
+DROP TABLE global_temptest1;
+
+-- Unsupported ON COMMIT and foreign key combination
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
+
+-- Test two phase commit
+CREATE TABLE global_temptest_persistent(col int);
+CREATE GLOBAL TEMP TABLE global_temptest(col int);
+INSERT INTO global_temptest VALUES (1);
+
+BEGIN;
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+PREPARE TRANSACTION 'global_temp1';
+-- We can't see anything from an uncommitted transaction
+SELECT * FROM global_temptest;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (3);
+INSERT INTO global_temptest_persistent SELECT * FROM global_temptest;
+PREPARE TRANSACTION 'global_temp2';
+COMMIT PREPARED 'global_temp1';
+-- 1, 2
+SELECT * FROM global_temptest;
+-- Nothing
+SELECT * FROM global_temptest_persistent;
+\c
+-- The temp table is empty now.
+SELECT * FROM global_temptest;
+-- Still nothing in global_temptest_persistent table;
+SELECT * FROM global_temptest_persistent;
+INSERT INTO global_temptest VALUES (4);
+COMMIT PREPARED 'global_temp2';
+-- Only 4
+SELECT * FROM global_temptest;
+-- 1, 3
+SELECT * FROM global_temptest_persistent;
+\c
+DROP TABLE global_temptest;
+DROP TABLE global_temptest_persistent;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
On Wed, Sep 18, 2019 at 12:04, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
On 21.08.2019 11:54, Konstantin Knizhnik wrote:
On 20.08.2019 20:01, Pavel Stehule wrote:
Another solution is to wait for ZHeap storage; a replica can then have its
own UNDO log.

I thought about implementing a special table access method for temporary
tables.

+1
Unfortunately, implementing a special table access method for temporary
tables doesn't solve all the problems.
XID generation is not part of the table access method API.
So we still need to assign to a write transaction at replica some XID
which will not conflict with the XIDs received from the master.
Actually, only global temp tables can be updated at replica, so the assigned
XIDs can be stored only in tuples of such relations.
But I am still not sure that we can use an arbitrary XID for such
transactions at replica.

I am also upset by the amount of functionality which has to be reimplemented
for global temp tables if we really want to provide an access method for them:
1. CLOG
2. vacuum
3. MVCC visibility

And still it is not possible to encapsulate all the changes needed to support
writes to temp tables at replica inside a table access method:
XID assignment, transaction commit and abort, subtransactions - all these
places need to be patched.

I was able to fully support work with global temp tables at replica
(including subtransactions).
The patch is attached. Also you can find this version in
https://github.com/postgrespro/postgresql.builtin_pool/tree/global_temp_hotRight now transactions at replica updating global temp table are assigned
special kind of GIDs which are not related with XIDs received from master.
So special visibility rules are used for such tables at replica. Also I
have to patch TransactionIdIsInProgress, TransactionIdDidCommit,
TransactionIdGetCurrent
functions to correctly handle such XIDs. In principle it is possible to
implement global temp tables as special heap access method. But it will
require copying a lot of code (heapam.c)
so I prefer to add few checks to existed functions.There are still some limitations:
- Number of transactions at replica which update temp tables is limited by
2^32 (wraparound problem is not addressed).
- I have to maintain in-memory analog of CLOG for such transactions which
is also not cropped. It means that for 2^32 transaction size of bitmap can
grow up to 0.5Gb.I try to understand what are the following steps in global temp tables
support.
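
To illustrate the intended behaviour, here is a minimal sketch of a
standby session with the patch applied (table and column names are made
up for the example; the table itself must have been created on the
primary):

-- on the primary, once:
--   CREATE GLOBAL TEMP TABLE report_cache(id int, payload text);
-- then, in a read-only standby session:
BEGIN;
INSERT INTO report_cache VALUES (1, 'computed on replica');
SELECT * FROM report_cache;  -- the row is visible only to this backend
ROLLBACK;  -- the backend-local pseudo-XID is marked aborted in the
           -- in-memory bitmap, so the row becomes invisible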
I am trying to understand what the next steps in global temp table support
should be.
This is why I want to run a short survey - what do people expect from
global temp tables:
1. I do not need them at all.
2. Eliminate catalog bloating.
3. Mostly needed for compatibility with Oracle (simplify porting, ...).
4. Parallel query execution.
5. Can be used at replica.
6. More efficient use of resources (first of all memory).
There can be another point important for the cloud. Inside some clouds
there are usually two types of disks - persistent (slow) and ephemeral
(fast). We used temp tables effectively there because we moved the temp
tablespace to the ephemeral disks.
I am missing one point in your list - developer comfort - using temp
tables is just much more comfortable: you don't need to create them again
and again. Due to this behaviour it is possible to reduce @2, and @3 can
be a nice side effect. If you reduce @2 to zero, then @5 should be
possible without anything else.
Pavel
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 20.09.2019 19:43, Pavel Stehule wrote:
1. I do not need them at all.
2. Eliminate catalog bloating.
3. Mostly needed for compatibility with Oracle (simplify porting, ...).
4. Parallel query execution.
5. Can be used at replica.
6. More efficient use of resources (first of all memory).
There can be another point important for the cloud. Inside some clouds
there are usually two types of disks - persistent (slow) and ephemeral
(fast). We used temp tables effectively there because we moved the temp
tablespace to the ephemeral disks.
Yes, I have already heard this argument and I agree with it.
I just want to note two things:
1. My assumption is that in most cases the data of a temporary table can
fit in memory (certainly if we are not limiting it by temp_buffers = 8MB
but storing it in shared buffers), so there is no need to write it to
persistent media at all.
2. Global temp tables do not replace local temp tables, which are accessed
through local buffers. So if you want to use temporary storage, you will
always have a way to do it.
The question is whether we need to support two kinds of global temp
tables (with shared or private buffers) or just implement one of them.
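
For reference, with the proposed grammar both kinds coexist - a minimal
sketch (table names are made up):

CREATE TEMP TABLE scratch_local(v int);         -- local buffers, private metadata
CREATE GLOBAL TEMP TABLE scratch_shared(v int); -- shared buffers, shared metadata
CREATE SESSION TABLE scratch_session(v int);    -- synonym for GLOBAL TEMP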
I am missing one point in your list - developer comfort - using temp
tables is just much more comfortable: you don't need to create them again
and again. Due to this behaviour it is possible to reduce @2, and @3 can
be a nice side effect. If you reduce @2 to zero, then @5 should be
possible without anything else.
Sorry, I do not completely understand your point here.
You can use a normal (permanent) table and you will not have to create it
again and again. It is also possible to use it for storing temporary
data - you just need to truncate the table when the data is not needed
any more.
Certainly you cannot use the same table in more than one backend. Here
is the main advantage of temp tables - you can have storage for
per-session data and not worry about possible name conflicts.
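
A small illustration of that point (object names are made up):

-- session A:
CREATE GLOBAL TEMP TABLE job_state(step int);
INSERT INTO job_state VALUES (1);
-- session B: the same table name, no conflict - it sees its own
-- (initially empty) private data:
SELECT count(*) FROM job_state;  -- returns 0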
From the other side: there are many cases where the format of the
temporary data is not statically known: it is determined dynamically
during program execution.
In this case a local temp table provides the most convenient mechanism
for working with such data.
This is why I think that we need to have both local and global temp
tables.
Also, I do not agree with your statement "If you reduce @2 to zero, then
@5 should be possible without any other".
In the solution implemented by Aleksander Alekseev the metadata of
temporary tables is kept in memory and does not affect the catalog at
all. But they still cannot be used at a replica.
There are still some serious problems which need to be fixed to enable
it: allow insert/update/delete statements for read-only transactions,
somehow assign XIDs to them, and implement savepoints and rollback of
such transactions.
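
For context, this is what happens on a standby today without the patch
(table name is made up):

INSERT INTO some_temp_table VALUES (1);
-- ERROR:  cannot execute INSERT in a read-only transaction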
All this was done in the last version of my patch.
And yes, it doesn't depend on whether we are using shared or private
buffers for temporary tables: the same approach can be implemented for
both of them.
The question is whether we really need temp tables at the replica and, if
so, whether we need full transaction support for them, including rollbacks
and subtransactions.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Mon, 23 Sep 2019 at 9:57, Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> wrote:
On 20.09.2019 19:43, Pavel Stehule wrote:
1. I do not need them at all.
2. Eliminate catalog bloating.
3. Mostly needed for compatibility with Oracle (simplify porting, ...).
4. Parallel query execution.
5. Can be used at replica.
6. More efficient use of resources (first of all memory).
There can be another point important for the cloud. Inside some clouds
there are usually two types of disks - persistent (slow) and ephemeral
(fast). We used temp tables effectively there because we moved the temp
tablespace to the ephemeral disks.
Yes, I have already heard this argument and I agree with it.
I just want to note two things:
1. My assumption is that in most cases the data of a temporary table can
fit in memory (certainly if we are not limiting it by temp_buffers = 8MB
but storing it in shared buffers), so there is no need to write it to
persistent media at all.
2. Global temp tables do not replace local temp tables, which are accessed
through local buffers. So if you want to use temporary storage, you will
always have a way to do it.
The question is whether we need to support two kinds of global temp
tables (with shared or private buffers) or just implement one of them.
This is valid only for OLTP; the OLAP world is totally different.
Moreover, if all users use temporary tables - and you should plan for
that, it is one reason for global temp tables - then you need to multiply
the size by max_connections (for example, 64MB of temp-table data per
backend with max_connections = 1000 already means up to 64GB).
It is hard to say what is best from the implementation perspective, but it
would be unfortunate if global temporary tables had different performance
characteristics and configuration than local temporary tables.
I am missing one point in your list - developer comfort - using temp
tables is just much more comfortable: you don't need to create them again
and again. Due to this behaviour it is possible to reduce @2, and @3 can
be a nice side effect. If you reduce @2 to zero, then @5 should be
possible without anything else.
Sorry, I do not completely understand your point here.
You can use a normal (permanent) table and you will not have to create it
again and again. It is also possible to use it for storing temporary
data - you just need to truncate the table when the data is not needed
any more.
Certainly you cannot use the same table in more than one backend. Here is
the main advantage of temp tables - you can have storage for per-session
data and not worry about possible name conflicts.
You use temporary tables because you know you never share data between
sessions. I don't remember any situation where I designed temp tables
with a different schema for different sessions.
Using a regular table in place of a temp table is not effective - you
work with large tables and you need to use DELETE, etc., so you cannot
use a classic table like a temp table effectively.
From the other side: there are many cases where the format of the
temporary data is not statically known: it is determined dynamically
during program execution.
In this case a local temp table provides the most convenient mechanism
for working with such data.
This is why I think that we need to have both local and global temp
tables.
Also, I do not agree with your statement "If you reduce @2 to zero, then
@5 should be possible without any other".
In the solution implemented by Aleksander Alekseev the metadata of
temporary tables is kept in memory and does not affect the catalog at
all. But they still cannot be used at a replica.
There are still some serious problems which need to be fixed to enable
it: allow insert/update/delete statements for read-only transactions,
somehow assign XIDs to them, and implement savepoints and rollback of
such transactions.
All this was done in the last version of my patch.
And yes, it doesn't depend on whether we are using shared or private
buffers for temporary tables: the same approach can be implemented for
both of them.
The question is whether we really need temp tables at the replica and, if
so, whether we need full transaction support for them, including rollbacks
and subtransactions.
Temporary tables (of any type) on a replica are an interesting feature
that opens up some possibilities. Some queries cannot be optimized as a
whole and should be divided, with intermediate results stored in temporary
tables, analysed (to get correct statistics) and maybe indexed; after that
the calculation can continue. Now you can do this only on the master.
Moreover, on hot standby the data is read-only and there is no direct
impact on the master (production), so you can run some heavier
calculations there. And temporary tables are a well-known technique for
fixing estimation errors. A sketch of this workflow is shown below.
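
(A rough sketch only; big_fact, dim and step1 are made-up names, and the
global temp table must have been created on the primary beforehand:)

-- on the primary, once:
--   CREATE GLOBAL TEMP TABLE step1(key int, agg numeric);
-- in a read-only standby session, materialize the divided step:
INSERT INTO step1 SELECT key, sum(val) FROM big_fact GROUP BY key;
-- and continue the calculation against the materialized result:
SELECT d.name, s.agg FROM step1 s JOIN dim d USING (key);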
I don't think subtransactions, transactions and rollbacks are strictly
necessary for these tables. On the other hand, without them it is a
half-cooked feature and can look pretty strange in the Postgres
environment.
I am very happy about how much work you are doing in this area - I did
not have the courage to start this job. But I don't think this work can
be reduced to just some supported scenarios, and I hope a correct
implementation is possible, although it is not simple work.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
This broke recently. Can you please rebase?
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 25.09.2019 23:28, Alvaro Herrera wrote:
This broke recently. Can you please rebase?
Rebased version of the patch is attached.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
global_shared_temp_replica-3.patch (text/x-patch)
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 1bd579f..2d93f6f 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -153,9 +153,9 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
buf_state = LockBufHdr(bufHdr);
fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
- fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
- fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
- fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
+ fctx->record[i].relfilenode = bufHdr->tag.rnode.node.relNode;
+ fctx->record[i].reltablespace = bufHdr->tag.rnode.node.spcNode;
+ fctx->record[i].reldatabase = bufHdr->tag.rnode.node.dbNode;
fctx->record[i].forknum = bufHdr->tag.forkNum;
fctx->record[i].blocknum = bufHdr->tag.blockNum;
fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 38ae240..8a04954 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -608,9 +608,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
if (buf_state & BM_TAG_VALID &&
((buf_state & BM_PERMANENT) || dump_unlogged))
{
- block_info_array[num_blocks].database = bufHdr->tag.rnode.dbNode;
- block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.spcNode;
- block_info_array[num_blocks].filenode = bufHdr->tag.rnode.relNode;
+ block_info_array[num_blocks].database = bufHdr->tag.rnode.node.dbNode;
+ block_info_array[num_blocks].tablespace = bufHdr->tag.rnode.node.spcNode;
+ block_info_array[num_blocks].filenode = bufHdr->tag.rnode.node.relNode;
block_info_array[num_blocks].forknum = bufHdr->tag.forkNum;
block_info_array[num_blocks].blocknum = bufHdr->tag.blockNum;
++num_blocks;
diff --git a/contrib/pgrowlocks/pgrowlocks.c b/contrib/pgrowlocks/pgrowlocks.c
index a2c44a9..43b4c66 100644
--- a/contrib/pgrowlocks/pgrowlocks.c
+++ b/contrib/pgrowlocks/pgrowlocks.c
@@ -158,7 +158,8 @@ pgrowlocks(PG_FUNCTION_ARGS)
/* must hold a buffer lock to call HeapTupleSatisfiesUpdate */
LockBuffer(hscan->rs_cbuf, BUFFER_LOCK_SHARE);
- htsu = HeapTupleSatisfiesUpdate(tuple,
+ htsu = HeapTupleSatisfiesUpdate(mydata->rel,
+ tuple,
GetCurrentCommandId(false),
hscan->rs_cbuf);
xmax = HeapTupleHeaderGetRawXmax(tuple->t_data);
diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index 70af43e..9cce720 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -349,7 +349,7 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
/* must hold a buffer lock to call HeapTupleSatisfiesVisibility */
LockBuffer(hscan->rs_cbuf, BUFFER_LOCK_SHARE);
- if (HeapTupleSatisfiesVisibility(tuple, &SnapshotDirty, hscan->rs_cbuf))
+ if (HeapTupleSatisfiesVisibility(rel, tuple, &SnapshotDirty, hscan->rs_cbuf))
{
stat.tuple_len += tuple->t_len;
stat.tuple_count++;
diff --git a/src/backend/access/brin/brin_revmap.c b/src/backend/access/brin/brin_revmap.c
index 647350c..ca5f22d 100644
--- a/src/backend/access/brin/brin_revmap.c
+++ b/src/backend/access/brin/brin_revmap.c
@@ -25,6 +25,7 @@
#include "access/brin_revmap.h"
#include "access/brin_tuple.h"
#include "access/brin_xlog.h"
+#include "access/brin.h"
#include "access/rmgr.h"
#include "access/xloginsert.h"
#include "miscadmin.h"
@@ -79,6 +80,11 @@ brinRevmapInitialize(Relation idxrel, BlockNumber *pagesPerRange,
meta = ReadBuffer(idxrel, BRIN_METAPAGE_BLKNO);
LockBuffer(meta, BUFFER_LOCK_SHARE);
page = BufferGetPage(meta);
+
+ if (GlobalTempRelationPageIsNotInitialized(idxrel, page))
+ brin_metapage_init(page, BrinGetPagesPerRange(idxrel),
+ BRIN_CURRENT_VERSION);
+
TestForOldSnapshot(snapshot, idxrel, page);
metadata = (BrinMetaPageData *) PageGetContents(page);
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 439a91b..8a6ac71 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -241,6 +241,16 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
metapage = BufferGetPage(metabuffer);
+ if (GlobalTempRelationPageIsNotInitialized(index, metapage))
+ {
+ Buffer rootbuffer = ReadBuffer(index, GIN_ROOT_BLKNO);
+ LockBuffer(rootbuffer, BUFFER_LOCK_EXCLUSIVE);
+ GinInitMetabuffer(metabuffer);
+ GinInitBuffer(rootbuffer, GIN_LEAF);
+ MarkBufferDirty(rootbuffer);
+ UnlockReleaseBuffer(rootbuffer);
+ }
+
/*
* An insertion to the pending list could logically belong anywhere in the
* tree, so it conflicts with all serializable scans. All scans acquire a
diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c
index b18ae2b..41bab5d 100644
--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -1750,7 +1750,7 @@ collectMatchesForHeapRow(IndexScanDesc scan, pendingPosition *pos)
/*
* Collect all matched rows from pending list into bitmap.
*/
-static void
+static bool
scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
{
GinScanOpaque so = (GinScanOpaque) scan->opaque;
@@ -1774,6 +1774,12 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
LockBuffer(metabuffer, GIN_SHARE);
page = BufferGetPage(metabuffer);
TestForOldSnapshot(scan->xs_snapshot, scan->indexRelation, page);
+
+ if (GlobalTempRelationPageIsNotInitialized(scan->indexRelation, page))
+ {
+ UnlockReleaseBuffer(metabuffer);
+ return false;
+ }
blkno = GinPageGetMeta(page)->head;
/*
@@ -1784,7 +1790,7 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
{
/* No pending list, so proceed with normal scan */
UnlockReleaseBuffer(metabuffer);
- return;
+ return true;
}
pos.pendingBuffer = ReadBuffer(scan->indexRelation, blkno);
@@ -1840,6 +1846,7 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
}
pfree(pos.hasMatchKey);
+ return true;
}
@@ -1875,7 +1882,8 @@ gingetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
* to scan the main index before the pending list, since concurrent
* cleanup could then make us miss entries entirely.
*/
- scanPendingInsert(scan, tbm, &ntids);
+ if (!scanPendingInsert(scan, tbm, &ntids))
+ return 0;
/*
* Now scan the main index.
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index c945b28..14d4e48 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -95,13 +95,13 @@ ginRedoInsertEntry(Buffer buffer, bool isLeaf, BlockNumber rightblkno, void *rda
if (PageAddItem(page, (Item) itup, IndexTupleSize(itup), offset, false, false) == InvalidOffsetNumber)
{
- RelFileNode node;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
- BufferGetTag(buffer, &node, &forknum, &blknum);
+ BufferGetTag(buffer, &rnode, &forknum, &blknum);
elog(ERROR, "failed to add item to index page in %u/%u/%u",
- node.spcNode, node.dbNode, node.relNode);
+ rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
}
}
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 0cc8791..3215b6f 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -677,7 +677,10 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace,
if (!xlocked)
{
LockBuffer(stack->buffer, GIST_SHARE);
- gistcheckpage(state.r, stack->buffer);
+ if (stack->blkno == GIST_ROOT_BLKNO && GlobalTempRelationPageIsNotInitialized(state.r, BufferGetPage(stack->buffer)))
+ GISTInitBuffer(stack->buffer, F_LEAF);
+ else
+ gistcheckpage(state.r, stack->buffer);
}
stack->page = (Page) BufferGetPage(stack->buffer);
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 22d790d..4c52dbe 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -344,7 +344,10 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem,
buffer = ReadBuffer(scan->indexRelation, pageItem->blkno);
LockBuffer(buffer, GIST_SHARE);
PredicateLockPage(r, BufferGetBlockNumber(buffer), scan->xs_snapshot);
- gistcheckpage(scan->indexRelation, buffer);
+ if (pageItem->blkno == GIST_ROOT_BLKNO && GlobalTempRelationPageIsNotInitialized(r, BufferGetPage(buffer)))
+ GISTInitBuffer(buffer, F_LEAF);
+ else
+ gistcheckpage(scan->indexRelation, buffer);
page = BufferGetPage(buffer);
TestForOldSnapshot(scan->xs_snapshot, r, page);
opaque = GistPageGetOpaque(page);
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 45804d7..50b306a 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -1028,7 +1028,7 @@ gistGetFakeLSN(Relation rel)
{
static XLogRecPtr counter = FirstNormalUnloggedLSN;
- if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ if (RelationHasSessionScope(rel))
{
/*
* Temporary relations are only accessible in our session, so a simple
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index 838ee68..4f794b3 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -75,13 +75,20 @@ _hash_getbuf(Relation rel, BlockNumber blkno, int access, int flags)
buf = ReadBuffer(rel, blkno);
- if (access != HASH_NOLOCK)
- LockBuffer(buf, access);
-
/* ref count and lock type are correct */
- _hash_checkpage(rel, buf, flags);
-
+ if (blkno == HASH_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ {
+ _hash_init(rel, 0, MAIN_FORKNUM);
+ if (access != HASH_NOLOCK)
+ LockBuffer(buf, access);
+ }
+ else
+ {
+ if (access != HASH_NOLOCK)
+ LockBuffer(buf, access);
+ _hash_checkpage(rel, buf, flags);
+ }
return buf;
}
@@ -339,7 +346,7 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
bool use_wal;
/* safety check */
- if (RelationGetNumberOfBlocksInFork(rel, forkNum) != 0)
+ if (rel->rd_rel->relpersistence != RELPERSISTENCE_SESSION && RelationGetNumberOfBlocksInFork(rel, forkNum) != 0)
elog(ERROR, "cannot initialize non-empty hash index \"%s\"",
RelationGetRelationName(rel));
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index e954482..2f5f5ef 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -444,7 +444,7 @@ heapgetpage(TableScanDesc sscan, BlockNumber page)
if (all_visible)
valid = true;
else
- valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(scan->rs_base.rs_rd, &loctup, snapshot, buffer);
CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
&loctup, buffer, snapshot);
@@ -664,7 +664,8 @@ heapgettup(HeapScanDesc scan,
/*
* if current tuple qualifies, return it.
*/
- valid = HeapTupleSatisfiesVisibility(tuple,
+ valid = HeapTupleSatisfiesVisibility(scan->rs_base.rs_rd,
+ tuple,
snapshot,
scan->rs_cbuf);
@@ -1474,7 +1475,7 @@ heap_fetch(Relation relation,
/*
* check tuple visibility, then release lock
*/
- valid = HeapTupleSatisfiesVisibility(tuple, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(relation, tuple, snapshot, buffer);
if (valid)
PredicateLockTuple(relation, tuple, snapshot);
@@ -1609,7 +1610,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
if (!skip)
{
/* If it's visible per the snapshot, we must return it */
- valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(relation, heapTuple, snapshot, buffer);
CheckForSerializableConflictOut(valid, relation, heapTuple,
buffer, snapshot);
@@ -1749,7 +1750,7 @@ heap_get_latest_tid(TableScanDesc sscan,
* Check tuple visibility; if visible, set it as the new result
* candidate.
*/
- valid = HeapTupleSatisfiesVisibility(&tp, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(relation, &tp, snapshot, buffer);
CheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
if (valid)
*tid = ctid;
@@ -1846,6 +1847,14 @@ ReleaseBulkInsertStatePin(BulkInsertState bistate)
}
+static TransactionId
+GetTransactionId(Relation relation)
+{
+ return relation->rd_rel->relpersistence == RELPERSISTENCE_SESSION && RecoveryInProgress()
+ ? GetReplicaTransactionId()
+ : GetCurrentTransactionId();
+}
+
/*
* heap_insert - insert tuple into a heap
*
@@ -1868,7 +1877,7 @@ void
heap_insert(Relation relation, HeapTuple tup, CommandId cid,
int options, BulkInsertState bistate)
{
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
HeapTuple heaptup;
Buffer buffer;
Buffer vmbuffer = InvalidBuffer;
@@ -2105,7 +2114,7 @@ void
heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
CommandId cid, int options, BulkInsertState bistate)
{
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
HeapTuple *heaptuples;
int i;
int ndone;
@@ -2444,7 +2453,7 @@ heap_delete(Relation relation, ItemPointer tid,
TM_FailureData *tmfd, bool changingPart)
{
TM_Result result;
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
ItemId lp;
HeapTupleData tp;
Page page;
@@ -2509,7 +2518,7 @@ heap_delete(Relation relation, ItemPointer tid,
tp.t_self = *tid;
l1:
- result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
+ result = HeapTupleSatisfiesUpdate(relation, &tp, cid, buffer);
if (result == TM_Invisible)
{
@@ -2628,7 +2637,7 @@ l1:
if (crosscheck != InvalidSnapshot && result == TM_Ok)
{
/* Perform additional check for transaction-snapshot mode RI updates */
- if (!HeapTupleSatisfiesVisibility(&tp, crosscheck, buffer))
+ if (!HeapTupleSatisfiesVisibility(relation, &tp, crosscheck, buffer))
result = TM_Updated;
}
@@ -2895,7 +2904,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
TM_FailureData *tmfd, LockTupleMode *lockmode)
{
TM_Result result;
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
Bitmapset *hot_attrs;
Bitmapset *key_attrs;
Bitmapset *id_attrs;
@@ -3065,7 +3074,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
l2:
checked_lockers = false;
locker_remains = false;
- result = HeapTupleSatisfiesUpdate(&oldtup, cid, buffer);
+ result = HeapTupleSatisfiesUpdate(relation, &oldtup, cid, buffer);
/* see below about the "no wait" case */
Assert(result != TM_BeingModified || wait);
@@ -3258,7 +3267,7 @@ l2:
if (crosscheck != InvalidSnapshot && result == TM_Ok)
{
/* Perform additional check for transaction-snapshot mode RI updates */
- if (!HeapTupleSatisfiesVisibility(&oldtup, crosscheck, buffer))
+ if (!HeapTupleSatisfiesVisibility(relation, &oldtup, crosscheck, buffer))
{
result = TM_Updated;
Assert(!ItemPointerEquals(&oldtup.t_self, &oldtup.t_data->t_ctid));
@@ -4014,7 +4023,7 @@ heap_lock_tuple(Relation relation, HeapTuple tuple,
tuple->t_tableOid = RelationGetRelid(relation);
l3:
- result = HeapTupleSatisfiesUpdate(tuple, cid, *buffer);
+ result = HeapTupleSatisfiesUpdate(relation, tuple, cid, *buffer);
if (result == TM_Invisible)
{
@@ -4189,7 +4198,7 @@ l3:
TM_Result res;
res = heap_lock_updated_tuple(relation, tuple, &t_ctid,
- GetCurrentTransactionId(),
+ GetTransactionId(relation),
mode);
if (res != TM_Ok)
{
@@ -4437,7 +4446,7 @@ l3:
TM_Result res;
res = heap_lock_updated_tuple(relation, tuple, &t_ctid,
- GetCurrentTransactionId(),
+ GetTransactionId(relation),
mode);
if (res != TM_Ok)
{
@@ -4546,7 +4555,7 @@ failed:
* state if multixact.c elogs.
*/
compute_new_xmax_infomask(xmax, old_infomask, tuple->t_data->t_infomask2,
- GetCurrentTransactionId(), mode, false,
+ GetTransactionId(relation), mode, false,
&xid, &new_infomask, &new_infomask2);
START_CRIT_SECTION();
@@ -5566,7 +5575,7 @@ heap_finish_speculative(Relation relation, ItemPointer tid)
void
heap_abort_speculative(Relation relation, ItemPointer tid)
{
- TransactionId xid = GetCurrentTransactionId();
+ TransactionId xid = GetTransactionId(relation);
ItemId lp;
HeapTupleData tp;
Page page;
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 2dd8821..67a553a 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -226,7 +226,8 @@ heapam_tuple_satisfies_snapshot(Relation rel, TupleTableSlot *slot,
* Caller should be holding pin, but not lock.
*/
LockBuffer(bslot->buffer, BUFFER_LOCK_SHARE);
- res = HeapTupleSatisfiesVisibility(bslot->base.tuple, snapshot,
+
+ res = HeapTupleSatisfiesVisibility(rel, bslot->base.tuple, snapshot,
bslot->buffer);
LockBuffer(bslot->buffer, BUFFER_LOCK_UNLOCK);
@@ -673,6 +674,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
@@ -2162,7 +2164,7 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,
loctup.t_len = ItemIdGetLength(lp);
loctup.t_tableOid = scan->rs_rd->rd_id;
ItemPointerSet(&loctup.t_self, page, offnum);
- valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
+ valid = HeapTupleSatisfiesVisibility(scan->rs_rd, &loctup, snapshot, buffer);
if (valid)
{
hscan->rs_vistuples[ntup++] = offnum;
@@ -2482,7 +2484,7 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
else
{
/* Otherwise, we have to check the tuple individually. */
- return HeapTupleSatisfiesVisibility(tuple, scan->rs_snapshot,
+ return HeapTupleSatisfiesVisibility(scan->rs_rd, tuple, scan->rs_snapshot,
buffer);
}
}
diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c
index 537e681..3076f6a 100644
--- a/src/backend/access/heap/heapam_visibility.c
+++ b/src/backend/access/heap/heapam_visibility.c
@@ -77,6 +77,7 @@
#include "utils/combocid.h"
#include "utils/snapmgr.h"
+static bool TempTupleSatisfiesVisibility(HeapTuple htup, CommandId curcid, Buffer buffer);
/*
* SetHintBits()
@@ -454,7 +455,7 @@ HeapTupleSatisfiesToast(HeapTuple htup, Snapshot snapshot,
* test for it themselves.)
*/
TM_Result
-HeapTupleSatisfiesUpdate(HeapTuple htup, CommandId curcid,
+HeapTupleSatisfiesUpdate(Relation relation, HeapTuple htup, CommandId curcid,
Buffer buffer)
{
HeapTupleHeader tuple = htup->t_data;
@@ -462,6 +463,13 @@ HeapTupleSatisfiesUpdate(HeapTuple htup, CommandId curcid,
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
+ if (relation->rd_rel->relpersistence == RELPERSISTENCE_SESSION && RecoveryInProgress())
+ {
+ AccessTempRelationAtReplica = true;
+ return TempTupleSatisfiesVisibility(htup, curcid, buffer) ? TM_Ok : TM_Invisible;
+ }
+ AccessTempRelationAtReplica = false;
+
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -1677,6 +1685,59 @@ HeapTupleSatisfiesHistoricMVCC(HeapTuple htup, Snapshot snapshot,
}
/*
+ * TempTupleSatisfiesVisibility
+ * True iff global temp table tuple is visible for the current transaction.
+ *
+ * Temporary tables are visible only for current backend, so there is no need to
+ * handle cases with tuples committed by other backends. We only need to exclude
+ * modifications done by aborted transactions or after start of table scan.
+ *
+ */
+static bool
+TempTupleSatisfiesVisibility(HeapTuple htup, CommandId curcid, Buffer buffer)
+{
+ HeapTupleHeader tuple = htup->t_data;
+ TransactionId xmin;
+ TransactionId xmax;
+
+ Assert(ItemPointerIsValid(&htup->t_self));
+ Assert(htup->t_tableOid != InvalidOid);
+
+ if (HeapTupleHeaderXminInvalid(tuple))
+ return false;
+
+ xmin = HeapTupleHeaderGetRawXmin(tuple);
+
+ if (IsReplicaTransactionAborted(xmin))
+ return false;
+
+ if (IsReplicaCurrentTransactionId(xmin)
+ && HeapTupleHeaderGetCmin(tuple) >= curcid)
+ {
+ return false; /* inserted after scan started */
+ }
+
+ if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid */
+ return true;
+
+ if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask)) /* not deleter */
+ return true;
+
+ xmax = (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
+ ? HeapTupleGetUpdateXid(tuple)
+ : HeapTupleHeaderGetRawXmax(tuple);
+
+ if (IsReplicaTransactionAborted(xmax))
+ return true; /* updating subtransaction aborted */
+
+ if (!IsReplicaCurrentTransactionId(xmax))
+ return false; /* updating transaction committed */
+
+ return (HeapTupleHeaderGetCmax(tuple) >= curcid); /* updated after scan started */
+}
+
+
+/*
* HeapTupleSatisfiesVisibility
* True iff heap tuple satisfies a time qual.
*
@@ -1687,8 +1748,15 @@ HeapTupleSatisfiesHistoricMVCC(HeapTuple htup, Snapshot snapshot,
* if so, the indicated buffer is marked dirty.
*/
bool
-HeapTupleSatisfiesVisibility(HeapTuple tup, Snapshot snapshot, Buffer buffer)
+HeapTupleSatisfiesVisibility(Relation relation, HeapTuple tup, Snapshot snapshot, Buffer buffer)
{
+ if (relation->rd_rel->relpersistence == RELPERSISTENCE_SESSION && RecoveryInProgress())
+ {
+ AccessTempRelationAtReplica = true;
+ return TempTupleSatisfiesVisibility(tup, snapshot->curcid, buffer);
+ }
+ AccessTempRelationAtReplica = false;
+
switch (snapshot->snapshot_type)
{
case SNAPSHOT_MVCC:
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 268f869..ed3ab70 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -763,7 +763,11 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* A session temporary relation may not yet be initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ _bt_initmetapage(BufferGetPage(buf), P_NONE, 0);
+ else
+ _bt_checkpage(rel, buf);
}
else
{
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 45472db..a8497a2 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -106,6 +106,7 @@ spgGetCache(Relation index)
spgConfigIn in;
FmgrInfo *procinfo;
Buffer metabuffer;
+ Page metapage;
SpGistMetaPageData *metadata;
cache = MemoryContextAllocZero(index->rd_indexcxt,
@@ -155,12 +156,32 @@ spgGetCache(Relation index)
metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
LockBuffer(metabuffer, BUFFER_LOCK_SHARE);
- metadata = SpGistPageGetMeta(BufferGetPage(metabuffer));
+ metapage = BufferGetPage(metabuffer);
+ metadata = SpGistPageGetMeta(metapage);
if (metadata->magicNumber != SPGIST_MAGIC_NUMBER)
- elog(ERROR, "index \"%s\" is not an SP-GiST index",
- RelationGetRelationName(index));
+ {
+ if (GlobalTempRelationPageIsNotInitialized(index, metapage))
+ {
+ Buffer rootbuffer = ReadBuffer(index, SPGIST_ROOT_BLKNO);
+ Buffer nullbuffer = ReadBuffer(index, SPGIST_NULL_BLKNO);
+
+ SpGistInitMetapage(metapage);
+
+ LockBuffer(rootbuffer, BUFFER_LOCK_EXCLUSIVE);
+ SpGistInitPage(BufferGetPage(rootbuffer), SPGIST_LEAF);
+ MarkBufferDirty(rootbuffer);
+ UnlockReleaseBuffer(rootbuffer);
+ LockBuffer(nullbuffer, BUFFER_LOCK_EXCLUSIVE);
+ SpGistInitPage(BufferGetPage(nullbuffer), SPGIST_LEAF | SPGIST_NULLS);
+ MarkBufferDirty(nullbuffer);
+ UnlockReleaseBuffer(nullbuffer);
+ }
+ else
+ elog(ERROR, "index \"%s\" is not an SP-GiST index",
+ RelationGetRelationName(index));
+ }
cache->lastUsedPages = metadata->lastUsedPages;
UnlockReleaseBuffer(metabuffer);
diff --git a/src/backend/access/transam/transam.c b/src/backend/access/transam/transam.c
index 365ddfb..bce9c4a 100644
--- a/src/backend/access/transam/transam.c
+++ b/src/backend/access/transam/transam.c
@@ -22,6 +22,7 @@
#include "access/clog.h"
#include "access/subtrans.h"
#include "access/transam.h"
+#include "access/xact.h"
#include "utils/snapmgr.h"
/*
@@ -126,6 +127,9 @@ TransactionIdDidCommit(TransactionId transactionId)
{
XidStatus xidstatus;
+ if (AccessTempRelationAtReplica)
+ return !IsReplicaCurrentTransactionId(transactionId) && !IsReplicaTransactionAborted(transactionId);
+
xidstatus = TransactionLogFetch(transactionId);
/*
@@ -182,6 +186,9 @@ TransactionIdDidAbort(TransactionId transactionId)
{
XidStatus xidstatus;
+ if (AccessTempRelationAtReplica)
+ return IsReplicaTransactionAborted(transactionId);
+
xidstatus = TransactionLogFetch(transactionId);
/*
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index b18eee4..fb3bebc 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -71,7 +71,7 @@ GetNewTransactionId(bool isSubXact)
/* safety check, we should never get this far in a HS standby */
if (RecoveryInProgress())
- elog(ERROR, "cannot assign TransactionIds during recovery");
+ elog(ERROR, "cannot assign TransactionIds during recovery");
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 9162286..cf6bf4e 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -192,6 +192,7 @@ typedef struct TransactionStateData
int parallelModeLevel; /* Enter/ExitParallelMode counter */
bool chain; /* start a new block after this one */
struct TransactionStateData *parent; /* back link to parent */
+ TransactionId replicaTransactionId; /* pseudo XID for inserting data in global temp tables at replica */
} TransactionStateData;
typedef TransactionStateData *TransactionState;
@@ -286,6 +287,12 @@ typedef struct XactCallbackItem
static XactCallbackItem *Xact_callbacks = NULL;
+static TransactionId replicaTransIdCount = FirstNormalTransactionId;
+static TransactionId replicaTopTransId;
+static Bitmapset* replicaAbortedXids;
+
+bool AccessTempRelationAtReplica;
+
/*
* List of add-on start- and end-of-subxact callbacks
*/
@@ -443,6 +450,48 @@ GetCurrentTransactionIdIfAny(void)
}
/*
+ * Transactions at replica can update only global temporary tables.
+ * They are assigned backend-local XIDs which are independent of the normal XIDs received from the primary node.
+ * So tuples of temporary tables at replica require special visibility rules.
+ *
+ * XIDs for such transactions at replica are created on demand (when a tuple of a temp table is updated).
+ * XID wrap-around and adjusting the XID horizon are not supported. So the number of such transactions at replica is
+ * limited by 2^32, and marking aborted transactions requires an in-memory bitmap of up to 2^29 bytes.
+ */
+TransactionId
+GetReplicaTransactionId(void)
+{
+ TransactionState s = CurrentTransactionState;
+ if (!TransactionIdIsValid(s->replicaTransactionId))
+ s->replicaTransactionId = ++replicaTransIdCount;
+ return s->replicaTransactionId;
+}
+
+/*
+ * At replica a transaction can update only temporary tables,
+ * and such transactions are assigned special XIDs (not related to the normal XIDs received from the primary node).
+ * Since we only ever see our own transactions, it is not necessary to mark committed transactions.
+ * Marking aborted ones is enough. All transactions which are not marked as aborted are treated as
+ * committed or as our own in-progress transactions.
+ */
+bool
+IsReplicaTransactionAborted(TransactionId xid)
+{
+ return bms_is_member(xid, replicaAbortedXids);
+}
+
+/*
+ * Since XIDs for transactions at replica are generated individually by each backend,
+ * we can check whether an XID belongs to the current transaction or any of its subtransactions by
+ * just comparing it with the XID of the top transaction.
+ */
+bool
+IsReplicaCurrentTransactionId(TransactionId xid)
+{
+ return xid > replicaTopTransId;
+}
+
+/*
* GetTopFullTransactionId
*
* This will return the FullTransactionId of the main transaction, assigning
@@ -855,6 +904,9 @@ TransactionIdIsCurrentTransactionId(TransactionId xid)
{
TransactionState s;
+ if (AccessTempRelationAtReplica)
+ return IsReplicaCurrentTransactionId(xid);
+
/*
* We always say that BootstrapTransactionId is "not my transaction ID"
* even when it is (ie, during bootstrap). Along with the fact that
@@ -1206,7 +1258,7 @@ static TransactionId
RecordTransactionCommit(void)
{
TransactionId xid = GetTopTransactionIdIfAny();
- bool markXidCommitted = TransactionIdIsValid(xid);
+ bool markXidCommitted = TransactionIdIsNormal(xid);
TransactionId latestXid = InvalidTransactionId;
int nrels;
RelFileNode *rels;
@@ -1624,7 +1676,7 @@ RecordTransactionAbort(bool isSubXact)
* rels to delete (note that this routine is not responsible for actually
* deleting 'em). We cannot have any child XIDs, either.
*/
- if (!TransactionIdIsValid(xid))
+ if (!TransactionIdIsNormal(xid))
{
/* Reset XactLastRecEnd until the next transaction writes something */
if (!isSubXact)
@@ -1892,6 +1944,8 @@ StartTransaction(void)
s = &TopTransactionStateData;
CurrentTransactionState = s;
+ replicaTopTransId = replicaTransIdCount;
+
Assert(!FullTransactionIdIsValid(XactTopFullTransactionId));
/* check the current transaction state */
@@ -1905,6 +1959,7 @@ StartTransaction(void)
*/
s->state = TRANS_START;
s->fullTransactionId = InvalidFullTransactionId; /* until assigned */
+ s->replicaTransactionId = InvalidTransactionId; /* until assigned */
/* Determine if statements are logged in this transaction */
xact_is_sampled = log_xact_sample_rate != 0 &&
@@ -2570,6 +2625,14 @@ AbortTransaction(void)
/* Prevent cancel/die interrupt while cleaning up */
HOLD_INTERRUPTS();
+ /* Mark transactions involved global temp table at replica as aborted */
+ if (TransactionIdIsValid(s->replicaTransactionId))
+ {
+ MemoryContext ctx = MemoryContextSwitchTo(TopMemoryContext);
+ replicaAbortedXids = bms_add_member(replicaAbortedXids, s->replicaTransactionId);
+ MemoryContextSwitchTo(ctx);
+ }
+
/* Make sure we have a valid memory context and resource owner */
AtAbort_Memory();
AtAbort_ResourceOwner();
@@ -2991,6 +3054,9 @@ CommitTransactionCommand(void)
* and then clean up.
*/
case TBLOCK_ABORT_PENDING:
+ if (GetCurrentTransactionIdIfAny() == FrozenTransactionId)
+ elog(FATAL, "Transaction is aborted at standby");
+
AbortTransaction();
CleanupTransaction();
s->blockState = TBLOCK_DEFAULT;
@@ -4880,6 +4946,14 @@ AbortSubTransaction(void)
/* Prevent cancel/die interrupt while cleaning up */
HOLD_INTERRUPTS();
+ /* Mark transactions involved global temp table at replica as aborted */
+ if (TransactionIdIsValid(s->replicaTransactionId))
+ {
+ MemoryContext ctx = MemoryContextSwitchTo(TopMemoryContext);
+ replicaAbortedXids = bms_add_member(replicaAbortedXids, s->replicaTransactionId);
+ MemoryContextSwitchTo(ctx);
+ }
+
/* Make sure we have a valid memory context and resource owner */
AtSubAbort_Memory();
AtSubAbort_ResourceOwner();
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 3ec67d4..edec8ca 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -213,6 +213,7 @@ void
XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
{
registered_buffer *regbuf;
+ RelFileNodeBackend rnode;
/* NO_IMAGE doesn't make sense with FORCE_IMAGE */
Assert(!((flags & REGBUF_FORCE_IMAGE) && (flags & (REGBUF_NO_IMAGE))));
@@ -227,7 +228,8 @@ XLogRegisterBuffer(uint8 block_id, Buffer buffer, uint8 flags)
regbuf = &registered_buffers[block_id];
- BufferGetTag(buffer, &regbuf->rnode, &regbuf->forkno, &regbuf->block);
+ BufferGetTag(buffer, &rnode, &regbuf->forkno, &regbuf->block);
+ regbuf->rnode = rnode.node;
regbuf->page = BufferGetPage(buffer);
regbuf->flags = flags;
regbuf->rdata_tail = (XLogRecData *) &regbuf->rdata_head;
@@ -919,7 +921,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
int flags;
PGAlignedBlock copied_buffer;
char *origdata = (char *) BufferGetBlock(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blkno;
@@ -948,7 +950,7 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
flags |= REGBUF_STANDARD;
BufferGetTag(buffer, &rnode, &forkno, &blkno);
- XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data, flags);
+ XLogRegisterBlock(0, &rnode.node, forkno, blkno, copied_buffer.data, flags);
recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI_FOR_HINT);
}
@@ -1009,7 +1011,7 @@ XLogRecPtr
log_newpage_buffer(Buffer buffer, bool page_std)
{
Page page = BufferGetPage(buffer);
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forkNum;
BlockNumber blkno;
@@ -1018,7 +1020,7 @@ log_newpage_buffer(Buffer buffer, bool page_std)
BufferGetTag(buffer, &rnode, &forkNum, &blkno);
- return log_newpage(&rnode, forkNum, blkno, page, page_std);
+ return log_newpage(&rnode.node, forkNum, blkno, page, page_std);
}
/*
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 1af31c2..e60bdb7 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -402,6 +402,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 098732c..cb35c72 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3641,7 +3641,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index b8c9b6f..b06ff28 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index a23128d..5d131a7 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1400,7 +1400,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index a13322b..be661a4 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (GlobalTempRelationPageIsNotInitialized(rel, page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 05593f3..73be898 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -586,7 +586,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && !IsLocalRelpersistence(stmt->relation->relpersistence))
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -1771,7 +1771,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -7677,6 +7678,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14109,6 +14116,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -14596,14 +14610,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ea4b586..9cd8361 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -787,6 +787,9 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
if (isTempNamespace(get_rel_namespace(rte->relid)))
continue;
+ if (get_rel_persistence(rte->relid) == RELPERSISTENCE_SESSION)
+ continue;
+
PreventCommandIfReadOnly(CreateCommandTag((Node *) plannedstmt));
}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index cf17614..d673bd7 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -127,7 +127,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
relation = table_open(relationObjectId, NoLock);
/* Temporary and unlogged relations are inaccessible during recovery. */
- if (!RelationNeedsWAL(relation) && RecoveryInProgress())
+ if (!RelationNeedsWAL(relation) && RecoveryInProgress() && relation->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot access temporary or unlogged relations during recovery")));
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3f67aaf..565c868 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3266,20 +3266,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index ee47547..ea7fe4c 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why should we not always use the persistence of the parent table?
+ * Because although unlogged sequences are prohibited,
+ * unlogged tables with SERIAL fields are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 073f313..5760a9c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2154,7 +2154,7 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
continue;
relid = classForm->oid;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 8ce28ad..a5d2de9 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3487,6 +3487,7 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
{
ReorderBufferTupleCidKey key;
ReorderBufferTupleCidEnt *ent;
+ RelFileNodeBackend rnode;
ForkNumber forkno;
BlockNumber blockno;
bool updated_mapping = false;
@@ -3500,7 +3501,8 @@ ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
* get relfilenode from the buffer, no convenient way to access it other
* than that.
*/
- BufferGetTag(buffer, &key.relnode, &forkno, &blockno);
+ BufferGetTag(buffer, &rnode, &forkno, &blockno);
+ key.relnode = rnode.node;
/* tuples can only be in the main fork */
Assert(forkno == MAIN_FORKNUM);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 483f705..cdba076 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -556,7 +556,7 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
int buf_id;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode.node,
+ INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode,
forkNum, blockNum);
/* determine its hash code and partition lock ID */
@@ -710,7 +710,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
Block bufBlock;
bool found;
bool isExtend;
- bool isLocalBuf = SmgrIsTemp(smgr);
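+ /*
+ * Session (global temp) relations are temp at the smgr level but are
+ * accessed through shared buffers, so only local temp relations may
+ * use local buffers here.
+ */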
+ bool isLocalBuf = SmgrIsTemp(smgr) && relpersistence == RELPERSISTENCE_TEMP;
*hit = false;
@@ -1010,7 +1010,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
uint32 buf_state;
/* create a tag so we can lookup the buffer */
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
@@ -1532,7 +1532,8 @@ ReleaseAndReadBuffer(Buffer buffer,
{
bufHdr = GetLocalBufferDescriptor(-buffer - 1);
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
ResourceOwnerForgetBuffer(CurrentResourceOwner, buffer);
@@ -1543,7 +1544,8 @@ ReleaseAndReadBuffer(Buffer buffer,
bufHdr = GetBufferDescriptor(buffer - 1);
/* we have pin, so it's ok to examine tag without spinlock */
if (bufHdr->tag.blockNum == blockNum &&
- RelFileNodeEquals(bufHdr->tag.rnode, relation->rd_node) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, relation->rd_node) &&
+ bufHdr->tag.rnode.backend == relation->rd_backend &&
bufHdr->tag.forkNum == forkNum)
return buffer;
UnpinBuffer(bufHdr, true);
@@ -1845,8 +1847,8 @@ BufferSync(int flags)
item = &CkptBufferIds[num_to_scan++];
item->buf_id = buf_id;
- item->tsId = bufHdr->tag.rnode.spcNode;
- item->relNode = bufHdr->tag.rnode.relNode;
+ item->tsId = bufHdr->tag.rnode.node.spcNode;
+ item->relNode = bufHdr->tag.rnode.node.relNode;
item->forkNum = bufHdr->tag.forkNum;
item->blockNum = bufHdr->tag.blockNum;
}
@@ -2559,7 +2561,7 @@ PrintBufferLeakWarning(Buffer buffer)
}
/* theoretically we should lock the bufhdr here */
- path = relpathbackend(buf->tag.rnode, backend, buf->tag.forkNum);
+ path = relpathbackend(buf->tag.rnode.node, backend, buf->tag.forkNum);
buf_state = pg_atomic_read_u32(&buf->state);
elog(WARNING,
"buffer refcount leak: [%03d] "
@@ -2631,7 +2633,7 @@ BufferGetBlockNumber(Buffer buffer)
* a buffer.
*/
void
-BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
+BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode, ForkNumber *forknum,
BlockNumber *blknum)
{
BufferDesc *bufHdr;
@@ -2696,7 +2698,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/* Find smgr relation for buffer */
if (reln == NULL)
- reln = smgropen(buf->tag.rnode, InvalidBackendId);
+ reln = smgropen(buf->tag.rnode.node, buf->tag.rnode.backend);
TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
buf->tag.blockNum,
@@ -2931,7 +2933,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber *forkNum,
int j;
/* If it's a local relation, it's localbuf.c's problem. */
- if (RelFileNodeBackendIsTemp(rnode))
+ if (RelFileNodeBackendIsLocalTemp(rnode))
{
if (rnode.backend == MyBackendId)
{
@@ -2963,14 +2965,13 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber *forkNum,
* We could check forkNum and blockNum as well as the rnode, but the
* incremental win from doing so seems small.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+ if (!RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode))
continue;
buf_state = LockBufHdr(bufHdr);
-
for (j = 0; j < nforks; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, rnode) &&
bufHdr->tag.forkNum == forkNum[j] &&
bufHdr->tag.blockNum >= firstDelBlock[j])
{
@@ -2997,24 +2998,24 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
{
int i,
n = 0;
- RelFileNode *nodes;
+ RelFileNodeBackend *nodes;
bool use_bsearch;
if (nnodes == 0)
return;
- nodes = palloc(sizeof(RelFileNode) * nnodes); /* non-local relations */
+ nodes = palloc(sizeof(RelFileNodeBackend) * nnodes); /* non-local relations */
/* If it's a local relation, it's localbuf.c's problem. */
for (i = 0; i < nnodes; i++)
{
- if (RelFileNodeBackendIsTemp(rnodes[i]))
+ if (RelFileNodeBackendIsLocalTemp(rnodes[i]))
{
if (rnodes[i].backend == MyBackendId)
DropRelFileNodeAllLocalBuffers(rnodes[i].node);
}
else
- nodes[n++] = rnodes[i].node;
+ nodes[n++] = rnodes[i];
}
/*
@@ -3037,11 +3038,11 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
/* sort the list of rnodes if necessary */
if (use_bsearch)
- pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+ pg_qsort(nodes, n, sizeof(RelFileNodeBackend), rnode_comparator);
for (i = 0; i < NBuffers; i++)
{
- RelFileNode *rnode = NULL;
+ RelFileNodeBackend *rnode = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint32 buf_state;
@@ -3056,7 +3057,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
for (j = 0; j < n; j++)
{
- if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, nodes[j]))
{
rnode = &nodes[j];
break;
@@ -3066,7 +3067,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
else
{
rnode = bsearch((const void *) &(bufHdr->tag.rnode),
- nodes, n, sizeof(RelFileNode),
+ nodes, n, sizeof(RelFileNodeBackend),
rnode_comparator);
}
@@ -3075,7 +3076,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
continue;
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
+ if (RelFileNodeBackendEquals(bufHdr->tag.rnode, (*rnode)))
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3114,11 +3115,11 @@ DropDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid)
+ if (bufHdr->tag.rnode.node.dbNode == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
else
UnlockBufHdr(bufHdr, buf_state);
@@ -3148,7 +3149,7 @@ PrintBufferDescs(void)
"[%02d] (freeNext=%d, rel=%s, "
"blockNum=%u, flags=0x%x, refcount=%u %d)",
i, buf->freeNext,
- relpathbackend(buf->tag.rnode, InvalidBackendId, buf->tag.forkNum),
+ relpath(buf->tag.rnode, buf->tag.forkNum),
buf->tag.blockNum, buf->flags,
buf->refcount, GetPrivateRefCount(b));
}
@@ -3216,7 +3217,8 @@ FlushRelationBuffers(Relation rel)
uint32 buf_state;
bufHdr = GetLocalBufferDescriptor(i);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
((buf_state = pg_atomic_read_u32(&bufHdr->state)) &
(BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
@@ -3263,13 +3265,15 @@ FlushRelationBuffers(Relation rel)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
+ if (!RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node)
+ || bufHdr->tag.rnode.backend != rel->rd_backend)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
+ if (RelFileNodeEquals(bufHdr->tag.rnode.node, rel->rd_node) &&
+ bufHdr->tag.rnode.backend == rel->rd_backend &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -3317,13 +3321,13 @@ FlushDatabaseBuffers(Oid dbid)
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
- if (bufHdr->tag.rnode.dbNode != dbid)
+ if (bufHdr->tag.rnode.node.dbNode != dbid)
continue;
ReservePrivateRefCountEntry();
buf_state = LockBufHdr(bufHdr);
- if (bufHdr->tag.rnode.dbNode == dbid &&
+ if (bufHdr->tag.rnode.node.dbNode == dbid &&
(buf_state & (BM_VALID | BM_DIRTY)) == (BM_VALID | BM_DIRTY))
{
PinBuffer_Locked(bufHdr);
@@ -4063,7 +4067,7 @@ AbortBufferIO(void)
/* Buffer is pinned, so we can read tag without spinlock */
char *path;
- path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
+ path = relpath(buf->tag.rnode, buf->tag.forkNum);
ereport(WARNING,
(errcode(ERRCODE_IO_ERROR),
errmsg("could not write block %u of %s",
@@ -4087,7 +4091,7 @@ shared_buffer_write_error_callback(void *arg)
/* Buffer is pinned, so we can read the tag without locking the spinlock */
if (bufHdr != NULL)
{
- char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+ char *path = relpath(bufHdr->tag.rnode, bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
@@ -4105,7 +4109,7 @@ local_buffer_write_error_callback(void *arg)
if (bufHdr != NULL)
{
- char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ char *path = relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum);
errcontext("writing block %u of relation %s",
@@ -4120,22 +4124,27 @@ local_buffer_write_error_callback(void *arg)
static int
rnode_comparator(const void *p1, const void *p2)
{
- RelFileNode n1 = *(const RelFileNode *) p1;
- RelFileNode n2 = *(const RelFileNode *) p2;
+ RelFileNodeBackend n1 = *(const RelFileNodeBackend *) p1;
+ RelFileNodeBackend n2 = *(const RelFileNodeBackend *) p2;
+
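+ /* compare relNode first, since it is most likely to differ; backend id last */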
+ if (n1.node.relNode < n2.node.relNode)
+ return -1;
+ else if (n1.node.relNode > n2.node.relNode)
+ return 1;
- if (n1.relNode < n2.relNode)
+ if (n1.node.dbNode < n2.node.dbNode)
return -1;
- else if (n1.relNode > n2.relNode)
+ else if (n1.node.dbNode > n2.node.dbNode)
return 1;
- if (n1.dbNode < n2.dbNode)
+ if (n1.node.spcNode < n2.node.spcNode)
return -1;
- else if (n1.dbNode > n2.dbNode)
+ else if (n1.node.spcNode > n2.node.spcNode)
return 1;
- if (n1.spcNode < n2.spcNode)
+ if (n1.backend < n2.backend)
return -1;
- else if (n1.spcNode > n2.spcNode)
+ else if (n1.backend > n2.backend)
return 1;
else
return 0;
@@ -4371,7 +4380,7 @@ IssuePendingWritebacks(WritebackContext *context)
next = &context->pending_writebacks[i + ahead + 1];
/* different file, stop */
- if (!RelFileNodeEquals(cur->tag.rnode, next->tag.rnode) ||
+ if (!RelFileNodeBackendEquals(cur->tag.rnode, next->tag.rnode) ||
cur->tag.forkNum != next->tag.forkNum)
break;
@@ -4390,7 +4399,7 @@ IssuePendingWritebacks(WritebackContext *context)
i += ahead;
/* and finally tell the kernel to write the data to storage */
- reln = smgropen(tag.rnode, InvalidBackendId);
+ reln = smgropen(tag.rnode.node, tag.rnode.backend);
smgrwriteback(reln, tag.forkNum, tag.blockNum, nblocks);
}
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index f5f6a29..6bd5ecb 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ LocalPrefetchBuffer(SMgrRelation smgr, ForkNumber forkNum,
BufferTag newTag; /* identity of requested block */
LocalBufferLookupEnt *hresult;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -111,7 +111,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
bool found;
uint32 buf_state;
- INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
+ INIT_BUFFERTAG(newTag, smgr->smgr_rnode, forkNum, blockNum);
/* Initialize local buffers if first request in this session */
if (LocalBufHash == NULL)
@@ -209,7 +209,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
Page localpage = (char *) LocalBufHdrGetBlock(bufHdr);
/* Find smgr relation for buffer */
- oreln = smgropen(bufHdr->tag.rnode, MyBackendId);
+ oreln = smgropen(bufHdr->tag.rnode.node, MyBackendId);
PageSetChecksumInplace(localpage, bufHdr->tag.blockNum);
@@ -331,14 +331,14 @@ DropRelFileNodeLocalBuffers(RelFileNode rnode, ForkNumber forkNum,
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode) &&
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode) &&
bufHdr->tag.forkNum == forkNum &&
bufHdr->tag.blockNum >= firstDelBlock)
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
@@ -377,12 +377,12 @@ DropRelFileNodeAllLocalBuffers(RelFileNode rnode)
buf_state = pg_atomic_read_u32(&bufHdr->state);
if ((buf_state & BM_TAG_VALID) &&
- RelFileNodeEquals(bufHdr->tag.rnode, rnode))
+ RelFileNodeEquals(bufHdr->tag.rnode.node, rnode))
{
if (LocalRefCount[i] != 0)
elog(ERROR, "block %u of %s is still referenced (local %u)",
bufHdr->tag.blockNum,
- relpathbackend(bufHdr->tag.rnode, MyBackendId,
+ relpathbackend(bufHdr->tag.rnode.node, MyBackendId,
bufHdr->tag.forkNum),
LocalRefCount[i]);
/* Remove entry from hashtable */
diff --git a/src/backend/storage/freespace/fsmpage.c b/src/backend/storage/freespace/fsmpage.c
index cf7f03f..65eb422 100644
--- a/src/backend/storage/freespace/fsmpage.c
+++ b/src/backend/storage/freespace/fsmpage.c
@@ -268,13 +268,13 @@ restart:
*
* Fix the corruption and restart.
*/
- RelFileNode rnode;
+ RelFileNodeBackend rnode;
ForkNumber forknum;
BlockNumber blknum;
BufferGetTag(buf, &rnode, &forknum, &blknum);
elog(DEBUG1, "fixing corrupt FSM block %u, relation %u/%u/%u",
- blknum, rnode.spcNode, rnode.dbNode, rnode.relNode);
+ blknum, rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode);
/* make sure we hold an exclusive lock */
if (!exclusive_lock_held)
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 8abcfdf..07739e2 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -991,6 +991,9 @@ TransactionIdIsInProgress(TransactionId xid)
int i,
j;
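+ /*
+ * When temporary relations are accessed at a replica, transaction state
+ * is tracked locally, so consult it instead of scanning the proc array.
+ */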
+ if (AccessTempRelationAtReplica)
+ return IsReplicaCurrentTransactionId(xid) && !IsReplicaTransactionAborted(xid);
+
/*
* Don't bother checking a transaction older than RecentXmin; it could not
* possibly still be running. (Note: in particular, this guarantees that
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 07f3c93..204c4cb 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -33,6 +33,7 @@
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
@@ -87,6 +88,18 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to collect the session relations created by this backend.
+ * Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +165,48 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Remove relation pages from shared buffers */
+ DropRelFileNodesAllBuffers(&rel->rnode, 1);
+
+ /* Delete relation files */
+ mdunlink(rel->rnode, InvalidForkNumber, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or when it is opened and its file doesn't exist yet.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We use the shared memory exit hook because
+ * cleanup must be performed before the backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +273,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln);
pfree(path);
@@ -465,6 +522,19 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * When a session relation is accessed, this backend may not yet have any files for it.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +546,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
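+ /* at this point fd refers to an open, possibly just created, segment file */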
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +723,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* For a session relation we must write the zeroed page so that a subsequent mdnblocks returns the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -738,12 +814,18 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
+ /*
+ * When accessing a session relation, this backend may not have any files for it yet.
+ * Pass EXTENSION_RETURN_NULL to make mdopenfork return NULL in this case instead of reporting an error.
+ */
+ MdfdVec *v = mdopenfork(reln, forknum, RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode)
+ ? EXTENSION_RETURN_NULL : EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 585dcee..ce8852c 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1098,6 +1098,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3301,6 +3305,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f01fea5..4162976 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15602,8 +15602,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * the SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract
+ * SessionRelFirstBackendId to avoid overly long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary relations
+ * so that they can be cleaned up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6..2f16c58 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -195,9 +195,9 @@ extern void heap_vacuum_rel(Relation onerel,
struct VacuumParams *params, BufferAccessStrategy bstrategy);
/* in heap/heapam_visibility.c */
-extern bool HeapTupleSatisfiesVisibility(HeapTuple stup, Snapshot snapshot,
+extern bool HeapTupleSatisfiesVisibility(Relation relation, HeapTuple stup, Snapshot snapshot,
Buffer buffer);
-extern TM_Result HeapTupleSatisfiesUpdate(HeapTuple stup, CommandId curcid,
+extern TM_Result HeapTupleSatisfiesUpdate(Relation relation, HeapTuple stup, CommandId curcid,
Buffer buffer);
extern HTSV_Result HeapTupleSatisfiesVacuum(HeapTuple stup, TransactionId OldestXmin,
Buffer buffer);
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index d714551..cbe6760 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -41,6 +41,9 @@
extern int DefaultXactIsoLevel;
extern PGDLLIMPORT int XactIsoLevel;
+extern bool AccessTempRelationAtReplica;
+
+
/*
* We implement three isolation levels internally.
* The two stronger ones use one snapshot per database transaction;
@@ -440,4 +443,8 @@ extern void EnterParallelMode(void);
extern void ExitParallelMode(void);
extern bool IsInParallelMode(void);
+extern TransactionId GetReplicaTransactionId(void);
+extern bool IsReplicaTransactionAborted(TransactionId xid);
+extern bool IsReplicaCurrentTransactionId(TransactionId xid);
+
#endif /* XACT_H */
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..f226e7c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add a special bias value to the backend id (since
+ * the maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,10 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
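+/* Backend id used to tag session (global temporary) relations of this backend. */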
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 6ffe184..efaf362 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -90,16 +90,17 @@
*/
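+/*
+ * Pages of session (global temporary) relations live in shared buffers, so
+ * the tag must carry the owning backend id to keep each backend's data apart.
+ */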
typedef struct buftag
{
- RelFileNode rnode; /* physical relation identifier */
+ RelFileNodeBackend rnode; /* physical relation identifier */
ForkNumber forkNum;
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
#define CLEAR_BUFFERTAG(a) \
( \
- (a).rnode.spcNode = InvalidOid, \
- (a).rnode.dbNode = InvalidOid, \
- (a).rnode.relNode = InvalidOid, \
+ (a).rnode.node.spcNode = InvalidOid, \
+ (a).rnode.node.dbNode = InvalidOid, \
+ (a).rnode.node.relNode = InvalidOid, \
+ (a).rnode.backend = InvalidBackendId, \
(a).forkNum = InvalidForkNumber, \
(a).blockNum = InvalidBlockNumber \
)
@@ -113,7 +114,7 @@ typedef struct buftag
#define BUFFERTAGS_EQUAL(a,b) \
( \
- RelFileNodeEquals((a).rnode, (b).rnode) && \
+ RelFileNodeBackendEquals((a).rnode, (b).rnode) && \
(a).blockNum == (b).blockNum && \
(a).forkNum == (b).forkNum \
)
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 17b97f7..1b0ce65 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -205,7 +205,7 @@ extern XLogRecPtr BufferGetLSNAtomic(Buffer buffer);
extern void PrintPinnedBufs(void);
#endif
extern Size BufferShmemSize(void);
-extern void BufferGetTag(Buffer buffer, RelFileNode *rnode,
+extern void BufferGetTag(Buffer buffer, RelFileNodeBackend *rnode,
ForkNumber *forknum, BlockNumber *blknum);
extern void MarkBufferDirtyHint(Buffer buffer, bool buffer_std);
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 4ef6d8d..bac7a31 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -229,6 +229,13 @@ typedef PageHeaderData *PageHeader;
#define PageIsNew(page) (((PageHeader) (page))->pd_upper == 0)
/*
+ * Check whether a page of a global temporary relation has not been initialized yet
+ */
+#define GlobalTempRelationPageIsNotInitialized(rel, page) \
+ ((rel)->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+
+
+/*
* PageGetItemId
* Returns an item identifier of a page.
*/
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether it is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether it is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether it is a local temporary relation that exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index a5cf804..4bcfea1 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -327,6 +327,18 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * Relation persistence is either TEMP or SESSION
+ */
+#define IsLocalRelpersistence(relpersistence) \
+ ((relpersistence) == RELPERSISTENCE_TEMP || (relpersistence) == RELPERSISTENCE_SESSION)
+
+/*
+ * Relation is either a global or a local temp table
+ */
+#define RelationHasSessionScope(relation) \
+ IsLocalRelpersistence(((relation)->rd_rel->relpersistence))
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 69ae227..95919f8 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -87,3 +87,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test with small changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to the temporary child relation
+# of another session, but blocks when that session scans the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..b7bf067
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,323 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest2;
+DROP TABLE global_temptest1;
+-- Unsupported ON COMMIT and foreign key combination
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
+-- Test two phase commit
+CREATE TABLE global_temptest_persistent(col int);
+CREATE GLOBAL TEMP TABLE global_temptest(col int);
+INSERT INTO global_temptest VALUES (1);
+BEGIN;
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+PREPARE TRANSACTION 'global_temp1';
+-- We can't see anything from an uncommitted transaction
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+BEGIN;
+INSERT INTO global_temptest VALUES (3);
+INSERT INTO global_temptest_persistent SELECT * FROM global_temptest;
+PREPARE TRANSACTION 'global_temp2';
+COMMIT PREPARED 'global_temp1';
+-- 1, 2
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+-- Nothing
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+\c
+-- The temp table is empty now.
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+-- Still nothing in global_temptest_persistent table;
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+INSERT INTO global_temptest VALUES (4);
+COMMIT PREPARED 'global_temp2';
+-- Only 4
+SELECT * FROM global_temptest;
+ col
+-----
+ 4
+(1 row)
+
+-- 1, 3
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+ 1
+ 3
+(2 rows)
+
+\c
+DROP TABLE global_temptest;
+DROP TABLE global_temptest_persistent;
diff --git a/src/test/regress/expected/global_temp_0.out b/src/test/regress/expected/global_temp_0.out
new file mode 100644
index 0000000..934e751
--- /dev/null
+++ b/src/test/regress/expected/global_temp_0.out
@@ -0,0 +1,326 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest2;
+DROP TABLE global_temptest1;
+-- Unsupported ON COMMIT and foreign key combination
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
+-- Test two phase commit
+CREATE TABLE global_temptest_persistent(col int);
+CREATE GLOBAL TEMP TABLE global_temptest(col int);
+INSERT INTO global_temptest VALUES (1);
+BEGIN;
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+PREPARE TRANSACTION 'global_temp1';
+ERROR: prepared transactions are disabled
+HINT: Set max_prepared_transactions to a nonzero value.
+-- We can't see anything from an uncommitted transaction
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+BEGIN;
+INSERT INTO global_temptest VALUES (3);
+INSERT INTO global_temptest_persistent SELECT * FROM global_temptest;
+PREPARE TRANSACTION 'global_temp2';
+ERROR: prepared transactions are disabled
+HINT: Set max_prepared_transactions to a nonzero value.
+COMMIT PREPARED 'global_temp1';
+ERROR: prepared transaction with identifier "global_temp1" does not exist
+-- 1, 2
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+-- Nothing
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+\c
+-- The temp table is empty now.
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+-- Still nothing in global_temptest_persistent table;
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+INSERT INTO global_temptest VALUES (4);
+COMMIT PREPARED 'global_temp2';
+ERROR: prepared transaction with identifier "global_temp2" does not exist
+-- Only 4
+SELECT * FROM global_temptest;
+ col
+-----
+ 4
+(1 row)
+
+-- 1, 3
+SELECT * FROM global_temptest_persistent;
+ col
+-----
+(0 rows)
+
+\c
+DROP TABLE global_temptest;
+DROP TABLE global_temptest_persistent;
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fc0f141..507cf7d 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 68ac56a..3890777 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..4d2da8d
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,191 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+DROP TABLE global_temptest2;
+DROP TABLE global_temptest1;
+
+-- Unsupported ON COMMIT and foreign key combination
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
+
+-- Test two phase commit
+CREATE TABLE global_temptest_persistent(col int);
+CREATE GLOBAL TEMP TABLE global_temptest(col int);
+INSERT INTO global_temptest VALUES (1);
+
+BEGIN;
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+PREPARE TRANSACTION 'global_temp1';
+-- We can't see anything from an uncommitted transaction
+SELECT * FROM global_temptest;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (3);
+INSERT INTO global_temptest_persistent SELECT * FROM global_temptest;
+PREPARE TRANSACTION 'global_temp2';
+COMMIT PREPARED 'global_temp1';
+-- 1, 2
+SELECT * FROM global_temptest;
+-- Nothing
+SELECT * FROM global_temptest_persistent;
+\c
+-- The temp table is empty now.
+SELECT * FROM global_temptest;
+-- Still nothing in global_temptest_persistent table;
+SELECT * FROM global_temptest_persistent;
+INSERT INTO global_temptest VALUES (4);
+COMMIT PREPARED 'global_temp2';
+-- Only 4
+SELECT * FROM global_temptest;
+-- 1, 3
+SELECT * FROM global_temptest_persistent;
+\c
+DROP TABLE global_temptest;
+DROP TABLE global_temptest_persistent;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
Since both Robert and Pavel think that the aspects of using GTT in
parallel queries and at replica should be considered separately,
I have prepared the simplest version of the patch for GTT, which
introduces minimal differences with current (local) temporary tables.
So GTT are stored in private buffers and can not be accessed at replica,
in prepared transactions or in parallel queries.
But they support all existing built-in indexes (hash, nbtree, brin, gin,
gist, spgist) and per-backend statistics.
There are no DDL limitations for GTT.
Also I have not yet introduced a pg_statistic view (as proposed by Pavel):
I am afraid that it may break compatibility with some existing extensions
and applications.
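Below is a minimal usage sketch of the behavior this patch implements
(the table name and values are just examples, mirroring the regression
tests above):

create global temp table gtt_example(id int primary key, val text);
insert into gtt_example values (1, 'one'); -- data is private to this backend
select * from gtt_example;                 -- returns one row
\c
select * from gtt_example;                 -- after reconnect: shared metadata survives, but data is gone (0 rows)
drop table gtt_example;                    -- removes the definition for all sessions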
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
global_private_temp-5.patch (text/x-patch)
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index ae7b729..485c068 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -672,7 +672,7 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo)
/*
* We expect to be called exactly once for any index relation.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
@@ -681,9 +681,17 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* whole relation will be rolled back.
*/
- meta = ReadBuffer(index, P_NEW);
- Assert(BufferGetBlockNumber(meta) == BRIN_METAPAGE_BLKNO);
- LockBuffer(meta, BUFFER_LOCK_EXCLUSIVE);
+ if (index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ {
+ meta = ReadBuffer(index, P_NEW);
+ Assert(BufferGetBlockNumber(meta) == BRIN_METAPAGE_BLKNO);
+ LockBuffer(meta, BUFFER_LOCK_EXCLUSIVE);
+ }
+ else
+ {
+ meta = ReadBuffer(index, BRIN_METAPAGE_BLKNO);
+ LockBuffer(meta, BUFFER_LOCK_SHARE);
+ }
brin_metapage_init(BufferGetPage(meta), BrinGetPagesPerRange(index),
BRIN_CURRENT_VERSION);
diff --git a/src/backend/access/brin/brin_revmap.c b/src/backend/access/brin/brin_revmap.c
index 647350c..62e5212 100644
--- a/src/backend/access/brin/brin_revmap.c
+++ b/src/backend/access/brin/brin_revmap.c
@@ -25,6 +25,7 @@
#include "access/brin_revmap.h"
#include "access/brin_tuple.h"
#include "access/brin_xlog.h"
+#include "access/brin.h"
#include "access/rmgr.h"
#include "access/xloginsert.h"
#include "miscadmin.h"
@@ -79,6 +80,13 @@ brinRevmapInitialize(Relation idxrel, BlockNumber *pagesPerRange,
meta = ReadBuffer(idxrel, BRIN_METAPAGE_BLKNO);
LockBuffer(meta, BUFFER_LOCK_SHARE);
page = BufferGetPage(meta);
+
+ if (GlobalTempRelationPageIsNotInitialized(idxrel, page))
+ {
+ Relation heap = RelationIdGetRelation(idxrel->rd_index->indrelid);
+ brinbuild(heap, idxrel, BuildIndexInfo(idxrel));
+ RelationClose(heap);
+ }
TestForOldSnapshot(snapshot, idxrel, page);
metadata = (BrinMetaPageData *) PageGetContents(page);
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 439a91b..51fe102 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -241,6 +241,13 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
metapage = BufferGetPage(metabuffer);
+ if (GlobalTempRelationPageIsNotInitialized(index, metapage))
+ {
+ Relation heap = RelationIdGetRelation(index->rd_index->indrelid);
+ ginbuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ }
+
/*
* An insertion to the pending list could logically belong anywhere in the
* tree, so it conflicts with all serializable scans. All scans acquire a
diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c
index b18ae2b..975c186 100644
--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -1759,7 +1759,8 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
match;
int i;
pendingPosition pos;
- Buffer metabuffer = ReadBuffer(scan->indexRelation, GIN_METAPAGE_BLKNO);
+ Relation index = scan->indexRelation;
+ Buffer metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
Page page;
BlockNumber blkno;
@@ -1769,11 +1770,19 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
* Acquire predicate lock on the metapage, to conflict with any fastupdate
* insertions.
*/
- PredicateLockPage(scan->indexRelation, GIN_METAPAGE_BLKNO, scan->xs_snapshot);
+ PredicateLockPage(index, GIN_METAPAGE_BLKNO, scan->xs_snapshot);
LockBuffer(metabuffer, GIN_SHARE);
page = BufferGetPage(metabuffer);
- TestForOldSnapshot(scan->xs_snapshot, scan->indexRelation, page);
+ TestForOldSnapshot(scan->xs_snapshot, index, page);
+
+ if (GlobalTempRelationPageIsNotInitialized(index, page))
+ {
+ Relation heap = RelationIdGetRelation(index->rd_index->indrelid);
+ ginbuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ UnlockReleaseBuffer(metabuffer);
+ }
blkno = GinPageGetMeta(page)->head;
/*
@@ -1784,10 +1793,10 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
{
/* No pending list, so proceed with normal scan */
UnlockReleaseBuffer(metabuffer);
- return;
+ return true;
}
- pos.pendingBuffer = ReadBuffer(scan->indexRelation, blkno);
+ pos.pendingBuffer = ReadBuffer(index, blkno);
LockBuffer(pos.pendingBuffer, GIN_SHARE);
pos.firstOffset = FirstOffsetNumber;
UnlockReleaseBuffer(metabuffer);
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 55eab14..d6739f3 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -328,7 +328,7 @@ ginbuild(Relation heap, Relation index, IndexInfo *indexInfo)
MemoryContext oldCtx;
OffsetNumber attnum;
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
@@ -337,7 +337,15 @@ ginbuild(Relation heap, Relation index, IndexInfo *indexInfo)
memset(&buildstate.buildStats, 0, sizeof(GinStatsData));
/* initialize the meta page */
- MetaBuffer = GinNewBuffer(index);
+ if (index->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ {
+ MetaBuffer = ReadBuffer(index, 0);
+ LockBuffer(MetaBuffer, GIN_SHARE);
+ }
+ else
+ {
+ MetaBuffer = GinNewBuffer(index);
+ }
/* initialize the root page */
RootBuffer = GinNewBuffer(index);
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 0cc8791..bcde5ea 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -16,6 +16,7 @@
#include "access/gist_private.h"
#include "access/gistscan.h"
+#include "catalog/index.h"
#include "catalog/pg_collation.h"
#include "miscadmin.h"
#include "storage/lmgr.h"
@@ -677,7 +678,10 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace,
if (!xlocked)
{
LockBuffer(stack->buffer, GIST_SHARE);
- gistcheckpage(state.r, stack->buffer);
+ if (stack->blkno == GIST_ROOT_BLKNO && GlobalTempRelationPageIsNotInitialized(state.r, BufferGetPage(stack->buffer)))
+ gistbuild(heapRel, r, BuildIndexInfo(r));
+ else
+ gistcheckpage(state.r, stack->buffer);
}
stack->page = (Page) BufferGetPage(stack->buffer);
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index 2f4543d..8d194c8 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -156,7 +156,7 @@ gistbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* We expect to be called exactly once for any index relation. If that's
* not the case, big trouble's what we have.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
@@ -171,8 +171,16 @@ gistbuild(Relation heap, Relation index, IndexInfo *indexInfo)
buildstate.giststate->tempCxt = createTempGistContext();
/* initialize the root page */
- buffer = gistNewBuffer(index);
- Assert(BufferGetBlockNumber(buffer) == GIST_ROOT_BLKNO);
+ if (index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ {
+ buffer = gistNewBuffer(index);
+ Assert(BufferGetBlockNumber(buffer) == GIST_ROOT_BLKNO);
+ }
+ else
+ {
+ buffer = ReadBuffer(index, GIST_ROOT_BLKNO);
+ LockBuffer(buffer, GIST_SHARE);
+ }
page = BufferGetPage(buffer);
START_CRIT_SECTION();
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 22d790d..5560a41 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -17,8 +17,10 @@
#include "access/genam.h"
#include "access/gist_private.h"
#include "access/relscan.h"
+#include "catalog/index.h"
#include "miscadmin.h"
#include "storage/lmgr.h"
+#include "storage/freespace.h"
#include "storage/predicate.h"
#include "pgstat.h"
#include "lib/pairingheap.h"
@@ -344,7 +346,10 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem,
buffer = ReadBuffer(scan->indexRelation, pageItem->blkno);
LockBuffer(buffer, GIST_SHARE);
PredicateLockPage(r, BufferGetBlockNumber(buffer), scan->xs_snapshot);
- gistcheckpage(scan->indexRelation, buffer);
+ if (pageItem->blkno == GIST_ROOT_BLKNO && GlobalTempRelationPageIsNotInitialized(r, BufferGetPage(buffer)))
+ gistbuild(scan->heapRelation, r, BuildIndexInfo(r));
+ else
+ gistcheckpage(scan->indexRelation, buffer);
page = BufferGetPage(buffer);
TestForOldSnapshot(scan->xs_snapshot, r, page);
opaque = GistPageGetOpaque(page);
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 45804d7..50b306a 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -1028,7 +1028,7 @@ gistGetFakeLSN(Relation rel)
{
static XLogRecPtr counter = FirstNormalUnloggedLSN;
- if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ if (RelationHasSessionScope(rel))
{
/*
* Temporary relations are only accessible in our session, so a simple
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 5cc30da..1b228db 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -119,7 +119,7 @@ hashbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* We expect to be called exactly once for any index relation. If that's
* not the case, big trouble's what we have.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index 838ee68..00ba123 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -75,13 +75,22 @@ _hash_getbuf(Relation rel, BlockNumber blkno, int access, int flags)
buf = ReadBuffer(rel, blkno);
- if (access != HASH_NOLOCK)
- LockBuffer(buf, access);
-
/* ref count and lock type are correct */
- _hash_checkpage(rel, buf, flags);
-
+ if (blkno == HASH_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ {
+ Relation heap = RelationIdGetRelation(rel->rd_index->indrelid);
+ hashbuild(heap, rel, BuildIndexInfo(rel));
+ RelationClose(heap);
+ if (access != HASH_NOLOCK)
+ LockBuffer(buf, access);
+ }
+ else
+ {
+ if (access != HASH_NOLOCK)
+ LockBuffer(buf, access);
+ _hash_checkpage(rel, buf, flags);
+ }
return buf;
}
@@ -339,7 +348,7 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
bool use_wal;
/* safety check */
- if (RelationGetNumberOfBlocksInFork(rel, forkNum) != 0)
+ if (rel->rd_rel->relpersistence != RELPERSISTENCE_SESSION && RelationGetNumberOfBlocksInFork(rel, forkNum) != 0)
elog(ERROR, "cannot initialize non-empty hash index \"%s\"",
RelationGetRelationName(rel));
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 2dd8821..92df373 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -673,6 +673,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 268f869..eff9e10 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -27,8 +27,10 @@
#include "access/transam.h"
#include "access/xlog.h"
#include "access/xloginsert.h"
+#include "catalog/index.h"
#include "miscadmin.h"
#include "storage/indexfsm.h"
+#include "storage/buf_internals.h"
#include "storage/lmgr.h"
#include "storage/predicate.h"
#include "utils/snapmgr.h"
@@ -762,8 +764,22 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
{
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
- LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* A session temporary relation may not yet be initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ {
+ Relation heap = RelationIdGetRelation(rel->rd_index->indrelid);
+ ReleaseBuffer(buf);
+ DropRelFileNodeLocalBuffers(rel->rd_node, MAIN_FORKNUM, blkno);
+ btbuild(heap, rel, BuildIndexInfo(rel));
+ RelationClose(heap);
+ buf = ReadBuffer(rel, blkno);
+ LockBuffer(buf, access);
+ }
+ else
+ {
+ LockBuffer(buf, access);
+ _bt_checkpage(rel, buf);
+ }
}
else
{
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index ab19692..227bc19 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -330,7 +330,7 @@ btbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* We expect to be called exactly once for any index relation. If that's
* not the case, big trouble's what we have.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index b40bd44..f44bec7 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -81,21 +81,32 @@ spgbuild(Relation heap, Relation index, IndexInfo *indexInfo)
rootbuffer,
nullbuffer;
- if (RelationGetNumberOfBlocks(index) != 0)
- elog(ERROR, "index \"%s\" already contains data",
- RelationGetRelationName(index));
-
- /*
- * Initialize the meta page and root pages
- */
- metabuffer = SpGistNewBuffer(index);
- rootbuffer = SpGistNewBuffer(index);
- nullbuffer = SpGistNewBuffer(index);
-
- Assert(BufferGetBlockNumber(metabuffer) == SPGIST_METAPAGE_BLKNO);
- Assert(BufferGetBlockNumber(rootbuffer) == SPGIST_ROOT_BLKNO);
- Assert(BufferGetBlockNumber(nullbuffer) == SPGIST_NULL_BLKNO);
-
+ if (index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ {
+ if (RelationGetNumberOfBlocks(index) != 0)
+ elog(ERROR, "index \"%s\" already contains data",
+ RelationGetRelationName(index));
+
+ /*
+ * Initialize the meta page and root pages
+ */
+ metabuffer = SpGistNewBuffer(index);
+ rootbuffer = SpGistNewBuffer(index);
+ nullbuffer = SpGistNewBuffer(index);
+
+ Assert(BufferGetBlockNumber(metabuffer) == SPGIST_METAPAGE_BLKNO);
+ Assert(BufferGetBlockNumber(rootbuffer) == SPGIST_ROOT_BLKNO);
+ Assert(BufferGetBlockNumber(nullbuffer) == SPGIST_NULL_BLKNO);
+ }
+ else
+ {
+ metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
+ rootbuffer = ReadBuffer(index, SPGIST_ROOT_BLKNO);
+ nullbuffer = ReadBuffer(index, SPGIST_NULL_BLKNO);
+ LockBuffer(metabuffer, BUFFER_LOCK_SHARE);
+ LockBuffer(rootbuffer, BUFFER_LOCK_SHARE);
+ LockBuffer(nullbuffer, BUFFER_LOCK_SHARE);
+ }
START_CRIT_SECTION();
SpGistInitMetapage(BufferGetPage(metabuffer));
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 45472db..ea15964 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -21,6 +21,7 @@
#include "access/spgist_private.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "catalog/index.h"
#include "catalog/pg_amop.h"
#include "storage/bufmgr.h"
#include "storage/indexfsm.h"
@@ -106,6 +107,7 @@ spgGetCache(Relation index)
spgConfigIn in;
FmgrInfo *procinfo;
Buffer metabuffer;
+ Page metapage;
SpGistMetaPageData *metadata;
cache = MemoryContextAllocZero(index->rd_indexcxt,
@@ -155,12 +157,21 @@ spgGetCache(Relation index)
metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
LockBuffer(metabuffer, BUFFER_LOCK_SHARE);
- metadata = SpGistPageGetMeta(BufferGetPage(metabuffer));
+ metapage = BufferGetPage(metabuffer);
+ metadata = SpGistPageGetMeta(metapage);
if (metadata->magicNumber != SPGIST_MAGIC_NUMBER)
- elog(ERROR, "index \"%s\" is not an SP-GiST index",
- RelationGetRelationName(index));
-
+ {
+ if (GlobalTempRelationPageIsNotInitialized(index, metapage))
+ {
+ Relation heap = RelationIdGetRelation(index->rd_index->indrelid);
+ spgbuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ }
+ else
+ elog(ERROR, "index \"%s\" is not an SP-GiST index",
+ RelationGetRelationName(index));
+ }
cache->lastUsedPages = metadata->lastUsedPages;
UnlockReleaseBuffer(metabuffer);
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 1af31c2..e60bdb7 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -402,6 +402,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index f6c31cc..d943b57 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3652,7 +3652,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 625af8d..1e192fa 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 7accb95..9f2ea48 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -102,7 +102,7 @@ static int acquire_inherited_sample_rows(Relation onerel, int elevel,
HeapTuple *rows, int targrows,
double *totalrows, double *totaldeadrows);
static void update_attstats(Oid relid, bool inh,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats, bool is_global_temp);
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
@@ -318,6 +318,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
Oid save_userid;
int save_sec_context;
int save_nestlevel;
+ bool is_global_temp = onerel->rd_rel->relpersistence == RELPERSISTENCE_SESSION;
if (inh)
ereport(elevel,
@@ -575,14 +576,14 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
* pg_statistic for columns we didn't process, we leave them alone.)
*/
update_attstats(RelationGetRelid(onerel), inh,
- attr_cnt, vacattrstats);
+ attr_cnt, vacattrstats, is_global_temp);
for (ind = 0; ind < nindexes; ind++)
{
AnlIndexData *thisdata = &indexdata[ind];
update_attstats(RelationGetRelid(Irel[ind]), false,
- thisdata->attr_cnt, thisdata->vacattrstats);
+ thisdata->attr_cnt, thisdata->vacattrstats, is_global_temp);
}
/*
@@ -1425,7 +1426,7 @@ acquire_inherited_sample_rows(Relation onerel, int elevel,
* by taking a self-exclusive lock on the relation in analyze_rel().
*/
static void
-update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
+update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats, bool is_global_temp)
{
Relation sd;
int attno;
@@ -1527,30 +1528,42 @@ update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
}
}
- /* Is there already a pg_statistic tuple for this attribute? */
- oldtup = SearchSysCache3(STATRELATTINH,
- ObjectIdGetDatum(relid),
- Int16GetDatum(stats->attr->attnum),
- BoolGetDatum(inh));
-
- if (HeapTupleIsValid(oldtup))
+ if (is_global_temp)
{
- /* Yes, replace it */
- stup = heap_modify_tuple(oldtup,
- RelationGetDescr(sd),
- values,
- nulls,
- replaces);
- ReleaseSysCache(oldtup);
- CatalogTupleUpdate(sd, &stup->t_self, stup);
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ InsertSysCache(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh),
+ 0,
+ stup);
}
else
{
- /* No, insert new tuple */
- stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
- CatalogTupleInsert(sd, stup);
- }
+ /* Is there already a pg_statistic tuple for this attribute? */
+ oldtup = SearchSysCache3(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh));
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ CatalogTupleUpdate(sd, &stup->t_self, stup);
+ }
+ else
+ {
+ /* No, insert new tuple */
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ CatalogTupleInsert(sd, stup);
+ }
+ }
heap_freetuple(stup);
}
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index a23128d..5d131a7 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1400,7 +1400,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index a13322b..be661a4 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (GlobalTempRelationPageIsNotInitialized(rel, page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 8d25d14..50d0402 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -587,7 +587,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && !IsLocalRelpersistence(stmt->relation->relpersistence))
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -1772,7 +1772,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -7708,6 +7709,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14140,6 +14147,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -14627,14 +14641,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index db3a68a..60212b0 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -48,6 +48,7 @@
#include "partitioning/partprune.h"
#include "rewrite/rewriteManip.h"
#include "utils/lsyscache.h"
+#include "utils/rel.h"
/* results of subquery_is_pushdown_safe */
@@ -618,7 +619,7 @@ set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
* the rest of the necessary infrastructure right now anyway. So
* for now, bail out if we see a temporary table.
*/
- if (get_rel_persistence(rte->relid) == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(get_rel_persistence(rte->relid)))
return;
/*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 17c5f08..7c83e7b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6307,7 +6307,7 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
* Furthermore, any index predicate or index expressions must be parallel
* safe.
*/
- if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
+ if (RelationHasSessionScope(heap) ||
!is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index)) ||
!is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index)))
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3f67aaf..565c868 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3266,20 +3266,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index ee47547..ea7fe4c 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why should we not always use the persistence of the parent table?
+ * Because unlogged sequences are prohibited, even though
+ * unlogged tables with SERIAL columns are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1dd816..dcfc134 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2157,7 +2157,7 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
continue;
relid = classForm->oid;
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 07f3c93..5db79ec 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -33,6 +33,7 @@
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
@@ -87,6 +88,18 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to collect information about session relations created by this backend.
+ * Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +165,45 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Delete relation files */
+ mdunlink(rel->rnode, InvalidForkNumber, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or opened while its file doesn't exist yet.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We are using a shared-memory hook because
+ * cleanup should be performed before the backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +270,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln);
pfree(path);
@@ -465,6 +519,19 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * In case of session relation access, this backend may not yet have files for the relation.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +543,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +720,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* In case of a session relation we need to write the zero page so that subsequent mdnblocks calls return the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -738,12 +811,18 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
BlockNumber
mdnblocks(SMgrRelation reln, ForkNumber forknum)
{
- MdfdVec *v = mdopenfork(reln, forknum, EXTENSION_FAIL);
+ /*
+ * If we access a session relation, this backend may not yet have any files for it.
+ * Pass EXTENSION_RETURN_NULL to make mdopenfork return NULL in this case instead of reporting an error.
+ */
+ MdfdVec *v = mdopenfork(reln, forknum, RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode)
+ ? EXTENSION_RETURN_NULL : EXTENSION_FAIL);
BlockNumber nblocks;
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index c3e7d94..6d86c28 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1191,6 +1191,111 @@ SearchCatCache4(CatCache *cache,
return SearchCatCacheInternal(cache, 4, v1, v2, v3, v4);
}
+
+void InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple)
+{
+ Datum arguments[CATCACHE_MAXKEYS];
+ uint32 hashValue;
+ Index hashIndex;
+ CatCTup *ct;
+ dlist_iter iter;
+ dlist_head *bucket;
+ int nkeys = cache->cc_nkeys;
+ MemoryContext oldcxt;
+ int i;
+
+ /*
+ * one-time startup overhead for each cache
+ */
+ if (unlikely(cache->cc_tupdesc == NULL))
+ CatalogCacheInitializeCache(cache);
+
+ /* Initialize local parameter array */
+ arguments[0] = v1;
+ arguments[1] = v2;
+ arguments[2] = v3;
+ arguments[3] = v4;
+ /*
+ * find the hash bucket in which to look for the tuple
+ */
+ hashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
+ hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);
+
+ /*
+ * scan the hash bucket until we find a match or exhaust our tuples
+ *
+ * Note: it's okay to use dlist_foreach here, even though we modify the
+ * dlist within the loop, because we don't continue the loop afterwards.
+ */
+ bucket = &cache->cc_bucket[hashIndex];
+ dlist_foreach(iter, bucket)
+ {
+ ct = dlist_container(CatCTup, cache_elem, iter.cur);
+
+ if (ct->dead)
+ continue; /* ignore dead entries */
+
+ if (ct->hash_value != hashValue)
+ continue; /* quickly skip entry if wrong hash val */
+
+ if (!CatalogCacheCompareTuple(cache, nkeys, ct->keys, arguments))
+ continue;
+
+ /*
+ * If it's a positive entry, bump its refcount and return it. If it's
+ * negative, we can report failure to the caller.
+ */
+ if (ct->tuple.t_len == tuple->t_len)
+ {
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ return;
+ }
+ dlist_delete(&ct->cache_elem);
+ pfree(ct);
+ cache->cc_ntup -= 1;
+ CacheHdr->ch_ntup -= 1;
+ break;
+ }
+ /* Allocate memory for CatCTup and the cached tuple in one go */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+
+ ct = (CatCTup *) palloc(sizeof(CatCTup) +
+ MAXIMUM_ALIGNOF + tuple->t_len);
+ ct->tuple.t_len = tuple->t_len;
+ ct->tuple.t_self = tuple->t_self;
+ ct->tuple.t_tableOid = tuple->t_tableOid;
+ ct->tuple.t_data = (HeapTupleHeader)
+ MAXALIGN(((char *) ct) + sizeof(CatCTup));
+ /* copy tuple contents */
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ ct->ct_magic = CT_MAGIC;
+ ct->my_cache = cache;
+ ct->c_list = NULL;
+ ct->refcount = 1; /* pinned*/
+ ct->dead = false;
+ ct->negative = false;
+ ct->hash_value = hashValue;
+ dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
+ memcpy(ct->keys, arguments, nkeys*sizeof(Datum));
+
+ cache->cc_ntup++;
+ CacheHdr->ch_ntup++;
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * If the hash table has become too full, enlarge the buckets array. Quite
+ * arbitrarily, we enlarge when fill factor > 2.
+ */
+ if (cache->cc_ntup > cache->cc_nbuckets * 2)
+ RehashCatCache(cache);
+}
+
/*
* Work-horse for SearchCatCache/SearchCatCacheN.
*/
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 585dcee..ce8852c 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1098,6 +1098,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3301,6 +3305,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 16297a5..e7a4d3c 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -1164,6 +1164,16 @@ SearchSysCache4(int cacheId,
return SearchCatCache4(SysCache[cacheId], key1, key2, key3, key4);
}
+void
+InsertSysCache(int cacheId,
+ Datum key1, Datum key2, Datum key3, Datum key4,
+ HeapTuple value)
+{
+ Assert(cacheId >= 0 && cacheId < SysCacheSize &&
+ PointerIsValid(SysCache[cacheId]));
+ InsertCatCache(SysCache[cacheId], key1, key2, key3, key4, value);
+}
+
/*
* ReleaseSysCache
* Release previously grabbed reference count on a tuple
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index bf69adc..fa7479c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15637,8 +15637,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract SessionRelFirstBackendId
+ * to avoid too long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary relations
+ * to make it possible to clean them up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..f226e7c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add some special bias value to the backend id (since
+ * the maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,10 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 4ef6d8d..bac7a31 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -229,6 +229,13 @@ typedef PageHeaderData *PageHeader;
#define PageIsNew(page) (((PageHeader) (page))->pd_upper == 0)
/*
+ * Check whether a page of a global temporary relation has not yet been initialized for this backend
+ */
+#define GlobalTempRelationPageIsNotInitialized(rel, page) \
+ ((rel)->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+
+
+/*
* PageGetItemId
* Returns an item identifier of a page.
*/
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether it is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether it is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether it is a local temporary relation, which exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index ff1faba..31f615d 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -228,4 +228,8 @@ extern void PrepareToInvalidateCacheTuple(Relation relation,
extern void PrintCatCacheLeakWarning(HeapTuple tuple);
extern void PrintCatCacheListLeakWarning(CatCList *list);
+extern void InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
+
#endif /* CATCACHE_H */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index a5cf804..d42830f 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -327,6 +327,18 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * Relation persistence is either TEMP or SESSION
+ */
+#define IsLocalRelpersistence(relpersistence) \
+ ((relpersistence) == RELPERSISTENCE_TEMP || (relpersistence) == RELPERSISTENCE_SESSION)
+
+/*
+ * Relation is either a global or a local temp table
+ */
+#define RelationHasSessionScope(relation) \
+ IsLocalRelpersistence(((relation)->rd_rel->relpersistence))
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
@@ -335,6 +347,7 @@ typedef enum ViewOptCheckOption
VIEW_OPTION_CHECK_OPTION_CASCADED
} ViewOptCheckOption;
+
/*
* ViewOptions
* Contents of rd_options for views
@@ -526,7 +539,7 @@ typedef struct ViewOptions
* True if relation's pages are stored in local buffers.
*/
#define RelationUsesLocalBuffers(relation) \
- ((relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ RelationHasSessionScope(relation)
/*
* RELATION_IS_LOCAL
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 918765c..5b1598b 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -216,4 +216,8 @@ extern bool RelationSupportsSysCache(Oid relid);
#define ReleaseSysCacheList(x) ReleaseCatCacheList(x)
+
+extern void InsertSysCache(int cacheId,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
#endif /* SYSCACHE_H */
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index a2fa192..ef7aa85 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -88,3 +88,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test with small changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree blocks another session's scans of both the parent
+# and its global temporary child relation.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..ae1adb6
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,247 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fc0f141..507cf7d 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 68ac56a..3890777 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..3058b9b
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,151 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
Yet another version of my GTT patch addressing issues reported by
曾文旌(义从) <wenjing.zwj@alibaba-inc.com>:
* Bug in TRUNCATE is fixed,
* ON COMMIT DELETE ROWS option is supported,
* ALTER TABLE is correctly handled.
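To make these items concrete, here is a minimal usage sketch (the name
gtt_demo is illustrative, not from the patch); it follows the behavior
exercised by the regression and isolation tests in the patch:

    create global temp table gtt_demo(id serial, note text) on commit delete rows;
    begin;
    insert into gtt_demo(note) values ('kept until commit');
    select count(*) from gtt_demo;              -- 1
    commit;                                     -- ON COMMIT DELETE ROWS empties the table
    select count(*) from gtt_demo;              -- 0
    alter table gtt_demo add column extra int;  -- allowed only while no other backend uses the table
    truncate gtt_demo;                          -- immediate, non-rollbackable truncation
    drop table gtt_demo;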
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
global_private_temp-6.patch (text/x-patch)
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index ae7b729..485c068 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -672,7 +672,7 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo)
/*
* We expect to be called exactly once for any index relation.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
@@ -681,9 +681,17 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* whole relation will be rolled back.
*/
- meta = ReadBuffer(index, P_NEW);
- Assert(BufferGetBlockNumber(meta) == BRIN_METAPAGE_BLKNO);
- LockBuffer(meta, BUFFER_LOCK_EXCLUSIVE);
+ if (index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ {
+ meta = ReadBuffer(index, P_NEW);
+ Assert(BufferGetBlockNumber(meta) == BRIN_METAPAGE_BLKNO);
+ LockBuffer(meta, BUFFER_LOCK_EXCLUSIVE);
+ }
+ else
+ {
+ meta = ReadBuffer(index, BRIN_METAPAGE_BLKNO);
+ LockBuffer(meta, BUFFER_LOCK_SHARE);
+ }
brin_metapage_init(BufferGetPage(meta), BrinGetPagesPerRange(index),
BRIN_CURRENT_VERSION);
diff --git a/src/backend/access/brin/brin_revmap.c b/src/backend/access/brin/brin_revmap.c
index 647350c..62e5212 100644
--- a/src/backend/access/brin/brin_revmap.c
+++ b/src/backend/access/brin/brin_revmap.c
@@ -25,6 +25,7 @@
#include "access/brin_revmap.h"
#include "access/brin_tuple.h"
#include "access/brin_xlog.h"
+#include "access/brin.h"
#include "access/rmgr.h"
#include "access/xloginsert.h"
#include "miscadmin.h"
@@ -79,6 +80,13 @@ brinRevmapInitialize(Relation idxrel, BlockNumber *pagesPerRange,
meta = ReadBuffer(idxrel, BRIN_METAPAGE_BLKNO);
LockBuffer(meta, BUFFER_LOCK_SHARE);
page = BufferGetPage(meta);
+
+ if (GlobalTempRelationPageIsNotInitialized(idxrel, page))
+ {
+ Relation heap = RelationIdGetRelation(idxrel->rd_index->indrelid);
+ brinbuild(heap, idxrel, BuildIndexInfo(idxrel));
+ RelationClose(heap);
+ }
TestForOldSnapshot(snapshot, idxrel, page);
metadata = (BrinMetaPageData *) PageGetContents(page);
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index b5072c0..650f31a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -158,6 +158,19 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ /*
+ * For global temp tables only;
+ * use ShareUpdateExclusiveLock to ensure safety
+ */
+ {
+ {
+ "on_commit_delete_rows",
+ "global temp table on commit options",
+ RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1478,6 +1491,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
StdRdOptions *rdopts;
int numoptions;
static const relopt_parse_elt tab[] = {
+ {"on_commit_delete_rows", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, on_commit_delete_rows)},
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
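With the reloption registered above, ON COMMIT DELETE ROWS and the
on_commit_delete_rows storage parameter become two spellings of the same
setting for global temp tables (the mapping itself is done in the
DefineRelation() changes further down). A minimal sketch, assuming the
WITH-clause spelling is accepted as wired up here (gtt_a/gtt_b are
illustrative names):

    create global temp table gtt_a(x int) on commit delete rows;
    create global temp table gtt_b(x int) with (on_commit_delete_rows = true);  -- intended to be equivalent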
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 439a91b..51fe102 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -241,6 +241,13 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
metapage = BufferGetPage(metabuffer);
+ if (GlobalTempRelationPageIsNotInitialized(index, metapage))
+ {
+ Relation heap = RelationIdGetRelation(index->rd_index->indrelid);
+ ginbuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ }
+
/*
* An insertion to the pending list could logically belong anywhere in the
* tree, so it conflicts with all serializable scans. All scans acquire a
diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c
index b18ae2b..975c186 100644
--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -1759,7 +1759,8 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
match;
int i;
pendingPosition pos;
- Buffer metabuffer = ReadBuffer(scan->indexRelation, GIN_METAPAGE_BLKNO);
+ Relation index = scan->indexRelation;
+ Buffer metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
Page page;
BlockNumber blkno;
@@ -1769,11 +1770,19 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
* Acquire predicate lock on the metapage, to conflict with any fastupdate
* insertions.
*/
- PredicateLockPage(scan->indexRelation, GIN_METAPAGE_BLKNO, scan->xs_snapshot);
+ PredicateLockPage(index, GIN_METAPAGE_BLKNO, scan->xs_snapshot);
LockBuffer(metabuffer, GIN_SHARE);
page = BufferGetPage(metabuffer);
- TestForOldSnapshot(scan->xs_snapshot, scan->indexRelation, page);
+ TestForOldSnapshot(scan->xs_snapshot, index, page);
+
+ if (GlobalTempRelationPageIsNotInitialized(index, page))
+ {
+ Relation heap = RelationIdGetRelation(index->rd_index->indrelid);
+ ginbuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ UnlockReleaseBuffer(metabuffer);
+ }
blkno = GinPageGetMeta(page)->head;
/*
@@ -1784,10 +1793,10 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
{
/* No pending list, so proceed with normal scan */
UnlockReleaseBuffer(metabuffer);
- return;
+ return true;
}
- pos.pendingBuffer = ReadBuffer(scan->indexRelation, blkno);
+ pos.pendingBuffer = ReadBuffer(index, blkno);
LockBuffer(pos.pendingBuffer, GIN_SHARE);
pos.firstOffset = FirstOffsetNumber;
UnlockReleaseBuffer(metabuffer);
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 55eab14..d6739f3 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -328,7 +328,7 @@ ginbuild(Relation heap, Relation index, IndexInfo *indexInfo)
MemoryContext oldCtx;
OffsetNumber attnum;
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
@@ -337,7 +337,15 @@ ginbuild(Relation heap, Relation index, IndexInfo *indexInfo)
memset(&buildstate.buildStats, 0, sizeof(GinStatsData));
/* initialize the meta page */
- MetaBuffer = GinNewBuffer(index);
+ if (index->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ {
+ MetaBuffer = ReadBuffer(index, 0);
+ LockBuffer(MetaBuffer, GIN_SHARE);
+ }
+ else
+ {
+ MetaBuffer = GinNewBuffer(index);
+ }
/* initialize the root page */
RootBuffer = GinNewBuffer(index);
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 0cc8791..bcde5ea 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -16,6 +16,7 @@
#include "access/gist_private.h"
#include "access/gistscan.h"
+#include "catalog/index.h"
#include "catalog/pg_collation.h"
#include "miscadmin.h"
#include "storage/lmgr.h"
@@ -677,7 +678,10 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace,
if (!xlocked)
{
LockBuffer(stack->buffer, GIST_SHARE);
- gistcheckpage(state.r, stack->buffer);
+ if (stack->blkno == GIST_ROOT_BLKNO && GlobalTempRelationPageIsNotInitialized(state.r, BufferGetPage(stack->buffer)))
+ gistbuild(heapRel, r, BuildIndexInfo(r));
+ else
+ gistcheckpage(state.r, stack->buffer);
}
stack->page = (Page) BufferGetPage(stack->buffer);
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index 2f4543d..8d194c8 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -156,7 +156,7 @@ gistbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* We expect to be called exactly once for any index relation. If that's
* not the case, big trouble's what we have.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
@@ -171,8 +171,16 @@ gistbuild(Relation heap, Relation index, IndexInfo *indexInfo)
buildstate.giststate->tempCxt = createTempGistContext();
/* initialize the root page */
- buffer = gistNewBuffer(index);
- Assert(BufferGetBlockNumber(buffer) == GIST_ROOT_BLKNO);
+ if (index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ {
+ buffer = gistNewBuffer(index);
+ Assert(BufferGetBlockNumber(buffer) == GIST_ROOT_BLKNO);
+ }
+ else
+ {
+ buffer = ReadBuffer(index, GIST_ROOT_BLKNO);
+ LockBuffer(buffer, GIST_SHARE);
+ }
page = BufferGetPage(buffer);
START_CRIT_SECTION();
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 22d790d..5560a41 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -17,8 +17,10 @@
#include "access/genam.h"
#include "access/gist_private.h"
#include "access/relscan.h"
+#include "catalog/index.h"
#include "miscadmin.h"
#include "storage/lmgr.h"
+#include "storage/freespace.h"
#include "storage/predicate.h"
#include "pgstat.h"
#include "lib/pairingheap.h"
@@ -344,7 +346,10 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem,
buffer = ReadBuffer(scan->indexRelation, pageItem->blkno);
LockBuffer(buffer, GIST_SHARE);
PredicateLockPage(r, BufferGetBlockNumber(buffer), scan->xs_snapshot);
- gistcheckpage(scan->indexRelation, buffer);
+ if (pageItem->blkno == GIST_ROOT_BLKNO && GlobalTempRelationPageIsNotInitialized(r, BufferGetPage(buffer)))
+ gistbuild(scan->heapRelation, r, BuildIndexInfo(r));
+ else
+ gistcheckpage(scan->indexRelation, buffer);
page = BufferGetPage(buffer);
TestForOldSnapshot(scan->xs_snapshot, r, page);
opaque = GistPageGetOpaque(page);
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 45804d7..50b306a 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -1028,7 +1028,7 @@ gistGetFakeLSN(Relation rel)
{
static XLogRecPtr counter = FirstNormalUnloggedLSN;
- if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ if (RelationHasSessionScope(rel))
{
/*
* Temporary relations are only accessible in our session, so a simple
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 5cc30da..1b228db 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -119,7 +119,7 @@ hashbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* We expect to be called exactly once for any index relation. If that's
* not the case, big trouble's what we have.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index 838ee68..00ba123 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -75,13 +75,22 @@ _hash_getbuf(Relation rel, BlockNumber blkno, int access, int flags)
buf = ReadBuffer(rel, blkno);
- if (access != HASH_NOLOCK)
- LockBuffer(buf, access);
-
/* ref count and lock type are correct */
- _hash_checkpage(rel, buf, flags);
-
+ if (blkno == HASH_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ {
+ Relation heap = RelationIdGetRelation(rel->rd_index->indrelid);
+ hashbuild(heap, rel, BuildIndexInfo(rel));
+ RelationClose(heap);
+ if (access != HASH_NOLOCK)
+ LockBuffer(buf, access);
+ }
+ else
+ {
+ if (access != HASH_NOLOCK)
+ LockBuffer(buf, access);
+ _hash_checkpage(rel, buf, flags);
+ }
return buf;
}
@@ -339,7 +348,7 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
bool use_wal;
/* safety check */
- if (RelationGetNumberOfBlocksInFork(rel, forkNum) != 0)
+ if (rel->rd_rel->relpersistence != RELPERSISTENCE_SESSION && RelationGetNumberOfBlocksInFork(rel, forkNum) != 0)
elog(ERROR, "cannot initialize non-empty hash index \"%s\"",
RelationGetRelationName(rel));
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 2dd8821..92df373 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -673,6 +673,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 268f869..eff9e10 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -27,8 +27,10 @@
#include "access/transam.h"
#include "access/xlog.h"
#include "access/xloginsert.h"
+#include "catalog/index.h"
#include "miscadmin.h"
#include "storage/indexfsm.h"
+#include "storage/buf_internals.h"
#include "storage/lmgr.h"
#include "storage/predicate.h"
#include "utils/snapmgr.h"
@@ -762,8 +764,22 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
{
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
- LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* Session temporary relation may not yet be initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ {
+ Relation heap = RelationIdGetRelation(rel->rd_index->indrelid);
+ ReleaseBuffer(buf);
+ DropRelFileNodeLocalBuffers(rel->rd_node, MAIN_FORKNUM, blkno);
+ btbuild(heap, rel, BuildIndexInfo(rel));
+ RelationClose(heap);
+ buf = ReadBuffer(rel, blkno);
+ LockBuffer(buf, access);
+ }
+ else
+ {
+ LockBuffer(buf, access);
+ _bt_checkpage(rel, buf);
+ }
}
else
{
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index ab19692..227bc19 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -330,7 +330,7 @@ btbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* We expect to be called exactly once for any index relation. If that's
* not the case, big trouble's what we have.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index b40bd44..f44bec7 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -81,21 +81,32 @@ spgbuild(Relation heap, Relation index, IndexInfo *indexInfo)
rootbuffer,
nullbuffer;
- if (RelationGetNumberOfBlocks(index) != 0)
- elog(ERROR, "index \"%s\" already contains data",
- RelationGetRelationName(index));
-
- /*
- * Initialize the meta page and root pages
- */
- metabuffer = SpGistNewBuffer(index);
- rootbuffer = SpGistNewBuffer(index);
- nullbuffer = SpGistNewBuffer(index);
-
- Assert(BufferGetBlockNumber(metabuffer) == SPGIST_METAPAGE_BLKNO);
- Assert(BufferGetBlockNumber(rootbuffer) == SPGIST_ROOT_BLKNO);
- Assert(BufferGetBlockNumber(nullbuffer) == SPGIST_NULL_BLKNO);
-
+ if (index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ {
+ if (RelationGetNumberOfBlocks(index) != 0)
+ elog(ERROR, "index \"%s\" already contains data",
+ RelationGetRelationName(index));
+
+ /*
+ * Initialize the meta page and root pages
+ */
+ metabuffer = SpGistNewBuffer(index);
+ rootbuffer = SpGistNewBuffer(index);
+ nullbuffer = SpGistNewBuffer(index);
+
+ Assert(BufferGetBlockNumber(metabuffer) == SPGIST_METAPAGE_BLKNO);
+ Assert(BufferGetBlockNumber(rootbuffer) == SPGIST_ROOT_BLKNO);
+ Assert(BufferGetBlockNumber(nullbuffer) == SPGIST_NULL_BLKNO);
+ }
+ else
+ {
+ metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
+ rootbuffer = ReadBuffer(index, SPGIST_ROOT_BLKNO);
+ nullbuffer = ReadBuffer(index, SPGIST_NULL_BLKNO);
+ LockBuffer(metabuffer, BUFFER_LOCK_SHARE);
+ LockBuffer(rootbuffer, BUFFER_LOCK_SHARE);
+ LockBuffer(nullbuffer, BUFFER_LOCK_SHARE);
+ }
START_CRIT_SECTION();
SpGistInitMetapage(BufferGetPage(metabuffer));
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 45472db..ea15964 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -21,6 +21,7 @@
#include "access/spgist_private.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "catalog/index.h"
#include "catalog/pg_amop.h"
#include "storage/bufmgr.h"
#include "storage/indexfsm.h"
@@ -106,6 +107,7 @@ spgGetCache(Relation index)
spgConfigIn in;
FmgrInfo *procinfo;
Buffer metabuffer;
+ Page metapage;
SpGistMetaPageData *metadata;
cache = MemoryContextAllocZero(index->rd_indexcxt,
@@ -155,12 +157,21 @@ spgGetCache(Relation index)
metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
LockBuffer(metabuffer, BUFFER_LOCK_SHARE);
- metadata = SpGistPageGetMeta(BufferGetPage(metabuffer));
+ metapage = BufferGetPage(metabuffer);
+ metadata = SpGistPageGetMeta(metapage);
if (metadata->magicNumber != SPGIST_MAGIC_NUMBER)
- elog(ERROR, "index \"%s\" is not an SP-GiST index",
- RelationGetRelationName(index));
-
+ {
+ if (GlobalTempRelationPageIsNotInitialized(index, metapage))
+ {
+ Relation heap = RelationIdGetRelation(index->rd_index->indrelid);
+ spgbuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ }
+ else
+ elog(ERROR, "index \"%s\" is not an SP-GiST index",
+ RelationGetRelationName(index));
+ }
cache->lastUsedPages = metadata->lastUsedPages;
UnlockReleaseBuffer(metabuffer);
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 1af31c2..e60bdb7 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -402,6 +402,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index f6c31cc..d943b57 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3652,7 +3652,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 625af8d..1e192fa 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 7accb95..9f2ea48 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -102,7 +102,7 @@ static int acquire_inherited_sample_rows(Relation onerel, int elevel,
HeapTuple *rows, int targrows,
double *totalrows, double *totaldeadrows);
static void update_attstats(Oid relid, bool inh,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats, bool is_global_temp);
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
@@ -318,6 +318,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
Oid save_userid;
int save_sec_context;
int save_nestlevel;
+ bool is_global_temp = onerel->rd_rel->relpersistence == RELPERSISTENCE_SESSION;
if (inh)
ereport(elevel,
@@ -575,14 +576,14 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
* pg_statistic for columns we didn't process, we leave them alone.)
*/
update_attstats(RelationGetRelid(onerel), inh,
- attr_cnt, vacattrstats);
+ attr_cnt, vacattrstats, is_global_temp);
for (ind = 0; ind < nindexes; ind++)
{
AnlIndexData *thisdata = &indexdata[ind];
update_attstats(RelationGetRelid(Irel[ind]), false,
- thisdata->attr_cnt, thisdata->vacattrstats);
+ thisdata->attr_cnt, thisdata->vacattrstats, is_global_temp);
}
/*
@@ -1425,7 +1426,7 @@ acquire_inherited_sample_rows(Relation onerel, int elevel,
* by taking a self-exclusive lock on the relation in analyze_rel().
*/
static void
-update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
+update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats, bool is_global_temp)
{
Relation sd;
int attno;
@@ -1527,30 +1528,42 @@ update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
}
}
- /* Is there already a pg_statistic tuple for this attribute? */
- oldtup = SearchSysCache3(STATRELATTINH,
- ObjectIdGetDatum(relid),
- Int16GetDatum(stats->attr->attnum),
- BoolGetDatum(inh));
-
- if (HeapTupleIsValid(oldtup))
+ if (is_global_temp)
{
- /* Yes, replace it */
- stup = heap_modify_tuple(oldtup,
- RelationGetDescr(sd),
- values,
- nulls,
- replaces);
- ReleaseSysCache(oldtup);
- CatalogTupleUpdate(sd, &stup->t_self, stup);
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ InsertSysCache(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh),
+ 0,
+ stup);
}
else
{
- /* No, insert new tuple */
- stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
- CatalogTupleInsert(sd, stup);
- }
+ /* Is there already a pg_statistic tuple for this attribute? */
+ oldtup = SearchSysCache3(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh));
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ CatalogTupleUpdate(sd, &stup->t_self, stup);
+ }
+ else
+ {
+ /* No, insert new tuple */
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ CatalogTupleInsert(sd, stup);
+ }
+ }
heap_freetuple(stup);
}
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index a23128d..5d131a7 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1400,7 +1400,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index a13322b..be661a4 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (GlobalTempRelationPageIsNotInitialized(rel, page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 8d25d14..21d5a30 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -12,6 +12,9 @@
*
*-------------------------------------------------------------------------
*/
+#include <sys/stat.h>
+#include <unistd.h>
+
#include "postgres.h"
#include "access/genam.h"
@@ -533,6 +536,23 @@ static List *GetParentedForeignKeyRefs(Relation partition);
static void ATDetachCheckNoForeignKeyRefs(Relation partition);
+static bool
+has_oncommit_option(List *options)
+{
+ ListCell *listptr;
+
+ foreach(listptr, options)
+ {
+ DefElem *def = (DefElem *) lfirst(listptr);
+
+ if (pg_strcasecmp(def->defname, "on_commit_delete_rows") == 0)
+ return true;
+ }
+
+ return false;
+}
+
+
/* ----------------------------------------------------------------
* DefineRelation
* Creates a new relation.
@@ -576,6 +596,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
LOCKMODE parentLockmode;
const char *accessMethod = NULL;
Oid accessMethodId = InvalidOid;
+ bool has_oncommit_clause = false;
/*
* Truncate relname to appropriate length (probably a waste of time, as
@@ -587,7 +608,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && !IsLocalRelpersistence(stmt->relation->relpersistence))
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -613,17 +634,6 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
RangeVarGetAndCheckCreationNamespace(stmt->relation, NoLock, NULL);
/*
- * Security check: disallow creating temp tables from security-restricted
- * code. This is needed because calling code might not expect untrusted
- * tables to appear in pg_temp at the front of its search path.
- */
- if (stmt->relation->relpersistence == RELPERSISTENCE_TEMP
- && InSecurityRestrictedOperation())
- ereport(ERROR,
- (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- errmsg("cannot create temporary table within security-restricted operation")));
-
- /*
* Determine the lockmode to use when scanning parents. A self-exclusive
* lock is needed here.
*
@@ -718,6 +728,38 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
/*
* Parse and validate reloptions, if any.
*/
+ /* global temp table */
+ has_oncommit_clause = has_oncommit_option(stmt->options);
+ if (stmt->relation->relpersistence == RELPERSISTENCE_SESSION)
+ {
+ if (has_oncommit_clause)
+ {
+ if (stmt->oncommit != ONCOMMIT_NOOP)
+ elog(ERROR, "can not defeine global temp table with on commit and with clause at same time");
+ }
+ else if (stmt->oncommit != ONCOMMIT_NOOP)
+ {
+ DefElem *opt = makeNode(DefElem);
+
+ opt->type = T_DefElem;
+ opt->defnamespace = NULL;
+ opt->defname = "on_commit_delete_rows";
+ opt->defaction = DEFELEM_UNSPEC;
+
+ /* use reloptions to remember on commit clause */
+ if (stmt->oncommit == ONCOMMIT_DELETE_ROWS)
+ opt->arg = (Node *)makeString("true");
+ else if (stmt->oncommit == ONCOMMIT_PRESERVE_ROWS)
+ opt->arg = (Node *)makeString("false");
+ else
+ elog(ERROR, "global temp table not support on commit drop clause");
+
+ stmt->options = lappend(stmt->options, opt);
+ }
+ }
+ else if (has_oncommit_clause)
+ elog(ERROR, "regular table cannot specifie on_commit_delete_rows");
+
reloptions = transformRelOptions((Datum) 0, stmt->options, NULL, validnsps,
true, false);
@@ -1772,7 +1814,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -3449,6 +3492,26 @@ AlterTableLookupRelation(AlterTableStmt *stmt, LOCKMODE lockmode)
(void *) stmt);
}
+
+static bool
+CheckGlobalTempTableNotInUse(Relation rel)
+{
+ int id;
+ for (id = 1; id <= MaxBackends; id++)
+ {
+ if (id != MyBackendId)
+ {
+ struct stat fst;
+ char* path = relpathbackend(rel->rd_node, id, MAIN_FORKNUM);
+ int rc = stat(path, &fst);
+ pfree(path);
+ if (rc == 0 && fst.st_size != 0)
+ return false;
+ }
+ }
+ return true;
+}
+
/*
* AlterTable
* Execute ALTER TABLE, which can be a list of subcommands
@@ -3500,6 +3563,9 @@ AlterTable(Oid relid, LOCKMODE lockmode, AlterTableStmt *stmt)
rel = relation_open(relid, NoLock);
CheckTableNotInUse(rel, "ALTER TABLE");
+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION
+ && !CheckGlobalTempTableNotInUse(rel))
+ elog(ERROR, "Global temp table used by active backends can not be altered");
ATController(stmt, rel, stmt->cmds, stmt->relation->inh, lockmode);
}
@@ -7708,6 +7774,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14140,6 +14212,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -14627,14 +14706,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
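With the DefineRelation() changes above, an explicit ON COMMIT clause on a global temp table is remembered as the on_commit_delete_rows reloption, so the following two statements should be equivalent (a sketch, assuming the patch is applied):

    create global temp table t1(a int) on commit delete rows;
    create global temp table t2(a int) with (on_commit_delete_rows = true);

Specifying both the clause and the reloption at the same time is rejected, as is ON COMMIT DROP; regular tables cannot use the reloption at all.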
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index db3a68a..60212b0 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -48,6 +48,7 @@
#include "partitioning/partprune.h"
#include "rewrite/rewriteManip.h"
#include "utils/lsyscache.h"
+#include "utils/rel.h"
/* results of subquery_is_pushdown_safe */
@@ -618,7 +619,7 @@ set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
* the rest of the necessary infrastructure right now anyway. So
* for now, bail out if we see a temporary table.
*/
- if (get_rel_persistence(rte->relid) == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(get_rel_persistence(rte->relid)))
return;
/*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 17c5f08..7c83e7b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6307,7 +6307,7 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
* Furthermore, any index predicate or index expressions must be parallel
* safe.
*/
- if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
+ if (RelationHasSessionScope(heap) ||
!is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index)) ||
!is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index)))
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3f67aaf..565c868 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3266,20 +3266,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
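After this grammar change, GLOBAL TEMP[ORARY] maps to the new session persistence instead of being a deprecated alias for local temp tables, and the shorter SESSION spelling is accepted as well. All of the following forms should create the same kind of table (a syntax sketch):

    create global temporary table t1(a int);
    create global temp table t2(a int);
    create session table t3(a int);
    create session temp table t4(a int);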
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index ee47547..ea7fe4c 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why shouldn't we always use the persistence of the parent table?
+ * Because although unlogged sequences are prohibited,
+ * unlogged tables with SERIAL fields are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
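This makes the sequence created for a SERIAL column inherit the persistence of its parent table (except for unlogged tables, since unlogged sequences are prohibited). So the sequence backing a SERIAL column of a global temp table is expected to become session-scoped too, for example:

    create global temp table orders(id serial, payload text);
    -- orders_id_seq should get session persistence as well, so each
    -- session numbers its private rows independently, starting from 1.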
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1dd816..dcfc134 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2157,7 +2157,7 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
continue;
relid = classForm->oid;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 483f705..1129dc3 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -2933,7 +2933,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber *forkNum,
/* If it's a local relation, it's localbuf.c's problem. */
if (RelFileNodeBackendIsTemp(rnode))
{
- if (rnode.backend == MyBackendId)
+ if (GetRelationBackendId(rnode.backend) == MyBackendId)
{
for (j = 0; j < nforks; j++)
DropRelFileNodeLocalBuffers(rnode.node, forkNum[j],
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 07f3c93..8cf06f6 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -28,17 +28,20 @@
#include "miscadmin.h"
#include "access/xlogutils.h"
#include "access/xlog.h"
+#include "commands/tablecmds.h"
#include "commands/tablespace.h"
#include "pgstat.h"
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
#include "storage/sync.h"
#include "utils/hsearch.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
#include "pg_trace.h"
/*
@@ -87,6 +90,19 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to keep track of session relations created by this backend.
+ * Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ ForkNumber forknum;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +168,60 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Delete relation files */
+ mdunlink(rel->rnode, rel->forknum, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or when it is opened and its file doesn't exist yet.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln, ForkNumber forknum)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We use an on_shmem_exit hook because
+ * the cleanup must be performed before the backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->forknum = forknum;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
+static void
+RegisterOnCommitAction(SMgrRelation reln, ForkNumber forknum)
+{
+ if (reln->smgr_owner && forknum == MAIN_FORKNUM)
+ {
+ Relation rel = (Relation)((char*)reln->smgr_owner - offsetof(RelationData, rd_smgr));
+ if (rel->rd_options
+ && ((StdRdOptions *)rel->rd_options)->on_commit_delete_rows)
+ {
+ register_on_commit_action(rel->rd_id, ONCOMMIT_DELETE_ROWS);
+ }
+ }
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +288,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln, forkNum);
pfree(path);
@@ -465,6 +537,21 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * When a session relation is accessed, this backend may not yet have files for it.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln, forknum);
+ if (!(behavior & EXTENSION_RETURN_NULL))
+ RegisterOnCommitAction(reln, forknum);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +563,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +740,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* For a session relation we need to write the zeroed page so that subsequent mdnblocks calls return the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -743,7 +836,8 @@ mdnblocks(SMgrRelation reln, ForkNumber forknum)
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
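pg_relation_filepath() now reports the per-backend path for session tables, using the biased backend id introduced in backendid.h below. An illustrative call (the actual numbers depend on the database, backend and relfilenode):

    select pg_relation_filepath('my_session_table');
    -- e.g. base/13593/t3_16390, i.e. t<backendId>_<relNode>,
    -- the same naming scheme as for local temp tables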
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index c3e7d94..6d86c28 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1191,6 +1191,111 @@ SearchCatCache4(CatCache *cache,
return SearchCatCacheInternal(cache, 4, v1, v2, v3, v4);
}
+
+void InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple)
+{
+ Datum arguments[CATCACHE_MAXKEYS];
+ uint32 hashValue;
+ Index hashIndex;
+ CatCTup *ct;
+ dlist_iter iter;
+ dlist_head *bucket;
+ int nkeys = cache->cc_nkeys;
+ MemoryContext oldcxt;
+ int i;
+
+ /*
+ * one-time startup overhead for each cache
+ */
+ if (unlikely(cache->cc_tupdesc == NULL))
+ CatalogCacheInitializeCache(cache);
+
+ /* Initialize local parameter array */
+ arguments[0] = v1;
+ arguments[1] = v2;
+ arguments[2] = v3;
+ arguments[3] = v4;
+ /*
+ * find the hash bucket in which to look for the tuple
+ */
+ hashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
+ hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);
+
+ /*
+ * scan the hash bucket until we find a match or exhaust our tuples
+ *
+ * Note: it's okay to use dlist_foreach here, even though we modify the
+ * dlist within the loop, because we don't continue the loop afterwards.
+ */
+ bucket = &cache->cc_bucket[hashIndex];
+ dlist_foreach(iter, bucket)
+ {
+ ct = dlist_container(CatCTup, cache_elem, iter.cur);
+
+ if (ct->dead)
+ continue; /* ignore dead entries */
+
+ if (ct->hash_value != hashValue)
+ continue; /* quickly skip entry if wrong hash val */
+
+ if (!CatalogCacheCompareTuple(cache, nkeys, ct->keys, arguments))
+ continue;
+
+ /*
+ * If there is already an entry with the same keys and the same tuple
+ * length, overwrite its contents in place; otherwise remove the stale
+ * entry and insert a fresh one below.
+ */
+ if (ct->tuple.t_len == tuple->t_len)
+ {
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ return;
+ }
+ dlist_delete(&ct->cache_elem);
+ pfree(ct);
+ cache->cc_ntup -= 1;
+ CacheHdr->ch_ntup -= 1;
+ break;
+ }
+ /* Allocate memory for CatCTup and the cached tuple in one go */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+
+ ct = (CatCTup *) palloc(sizeof(CatCTup) +
+ MAXIMUM_ALIGNOF + tuple->t_len);
+ ct->tuple.t_len = tuple->t_len;
+ ct->tuple.t_self = tuple->t_self;
+ ct->tuple.t_tableOid = tuple->t_tableOid;
+ ct->tuple.t_data = (HeapTupleHeader)
+ MAXALIGN(((char *) ct) + sizeof(CatCTup));
+ /* copy tuple contents */
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ ct->ct_magic = CT_MAGIC;
+ ct->my_cache = cache;
+ ct->c_list = NULL;
+ ct->refcount = 1; /* pinned */
+ ct->dead = false;
+ ct->negative = false;
+ ct->hash_value = hashValue;
+ dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
+ memcpy(ct->keys, arguments, nkeys*sizeof(Datum));
+
+ cache->cc_ntup++;
+ CacheHdr->ch_ntup++;
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * If the hash table has become too full, enlarge the buckets array. Quite
+ * arbitrarily, we enlarge when fill factor > 2.
+ */
+ if (cache->cc_ntup > cache->cc_nbuckets * 2)
+ RehashCatCache(cache);
+}
+
/*
* Work-horse for SearchCatCache/SearchCatCacheN.
*/
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 585dcee..ce8852c 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1098,6 +1098,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3301,6 +3305,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 16297a5..e7a4d3c 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -1164,6 +1164,16 @@ SearchSysCache4(int cacheId,
return SearchCatCache4(SysCache[cacheId], key1, key2, key3, key4);
}
+void
+InsertSysCache(int cacheId,
+ Datum key1, Datum key2, Datum key3, Datum key4,
+ HeapTuple value)
+{
+ Assert(cacheId >= 0 && cacheId < SysCacheSize &&
+ PointerIsValid(SysCache[cacheId]));
+ InsertCatCache(SysCache[cacheId], key1, key2, key3, key4, value);
+}
+
/*
* ReleaseSysCache
* Release previously grabbed reference count on a tuple
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index bf69adc..fa7479c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15637,8 +15637,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
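With this pg_dump change, session tables keep their persistence in dumps; the emitted DDL is expected to look roughly like:

    CREATE SESSION TABLE public.my_session_table (
        a integer
    );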
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * the SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract
+ * SessionRelFirstBackendId here to avoid overly long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary
+ * relations, so that they can be cleaned up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..11b4b89 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add a special bias value to the backend id (since
+ * the maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,12 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
+#define GetRelationBackendId(id) ((id) & ~SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 4ef6d8d..bac7a31 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -229,6 +229,13 @@ typedef PageHeaderData *PageHeader;
#define PageIsNew(page) (((PageHeader) (page))->pd_upper == 0)
/*
+ * Check whether a page of a global temporary relation is not yet initialized in this session
+ */
+#define GlobalTempRelationPageIsNotInitialized(rel, page) \
+ ((rel)->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+
+
+/*
* PageGetItemId
* Returns an item identifier of a page.
*/
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether this is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether this is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether this is a local temporary relation, which exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index ff1faba..31f615d 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -228,4 +228,8 @@ extern void PrepareToInvalidateCacheTuple(Relation relation,
extern void PrintCatCacheLeakWarning(HeapTuple tuple);
extern void PrintCatCacheListLeakWarning(CatCList *list);
+extern void InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
+
#endif /* CATCACHE_H */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index a5cf804..a30137f 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -272,6 +272,7 @@ typedef struct StdRdOptions
int parallel_workers; /* max number of parallel workers */
bool vacuum_index_cleanup; /* enables index vacuuming and cleanup */
bool vacuum_truncate; /* enables vacuum to truncate a relation */
+ bool on_commit_delete_rows; /* global temp table */
} StdRdOptions;
#define HEAP_MIN_FILLFACTOR 10
@@ -327,6 +328,18 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * Relation persistence is either TEMP or SESSION
+ */
+#define IsLocalRelpersistence(relpersistence) \
+ ((relpersistence) == RELPERSISTENCE_TEMP || (relpersistence) == RELPERSISTENCE_SESSION)
+
+/*
+ * Relation is either a global or a local temp table
+ */
+#define RelationHasSessionScope(relation) \
+ IsLocalRelpersistence(((relation)->rd_rel->relpersistence))
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
@@ -335,6 +348,7 @@ typedef enum ViewOptCheckOption
VIEW_OPTION_CHECK_OPTION_CASCADED
} ViewOptCheckOption;
+
/*
* ViewOptions
* Contents of rd_options for views
@@ -526,7 +540,7 @@ typedef struct ViewOptions
* True if relation's pages are stored in local buffers.
*/
#define RelationUsesLocalBuffers(relation) \
- ((relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ RelationHasSessionScope(relation)
/*
* RELATION_IS_LOCAL
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 918765c..5b1598b 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -216,4 +216,8 @@ extern bool RelationSupportsSysCache(Oid relid);
#define ReleaseSysCacheList(x) ReleaseCatCacheList(x)
+
+extern void InsertSysCache(int cacheId,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
#endif /* SYSCACHE_H */
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index a2fa192..ef7aa85 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -88,3 +88,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test with small changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to temporary child relation
+# of another session, and blocks when scanning the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..ae1adb6
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,247 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fc0f141..507cf7d 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 68ac56a..3890777 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..3058b9b
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,151 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
A pg_gtt_statistic view is now provided for global temp tables.
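For example, session-local statistics might be inspected like this (the column name is an assumption here, by analogy with pg_statistic; see the patch for the actual view definition):

    select * from pg_gtt_statistic
     where starelid = 'my_session_table'::regclass;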
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
global_private_temp-7.patch (text/x-patch)
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index ae7b729..485c068 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -672,7 +672,7 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo)
/*
* We expect to be called exactly once for any index relation.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
@@ -681,9 +681,17 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* whole relation will be rolled back.
*/
- meta = ReadBuffer(index, P_NEW);
- Assert(BufferGetBlockNumber(meta) == BRIN_METAPAGE_BLKNO);
- LockBuffer(meta, BUFFER_LOCK_EXCLUSIVE);
+ if (index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ {
+ meta = ReadBuffer(index, P_NEW);
+ Assert(BufferGetBlockNumber(meta) == BRIN_METAPAGE_BLKNO);
+ LockBuffer(meta, BUFFER_LOCK_EXCLUSIVE);
+ }
+ else
+ {
+ meta = ReadBuffer(index, BRIN_METAPAGE_BLKNO);
+ LockBuffer(meta, BUFFER_LOCK_SHARE);
+ }
brin_metapage_init(BufferGetPage(meta), BrinGetPagesPerRange(index),
BRIN_CURRENT_VERSION);
diff --git a/src/backend/access/brin/brin_revmap.c b/src/backend/access/brin/brin_revmap.c
index 647350c..d432fec 100644
--- a/src/backend/access/brin/brin_revmap.c
+++ b/src/backend/access/brin/brin_revmap.c
@@ -25,8 +25,10 @@
#include "access/brin_revmap.h"
#include "access/brin_tuple.h"
#include "access/brin_xlog.h"
+#include "access/brin.h"
#include "access/rmgr.h"
#include "access/xloginsert.h"
+#include "catalog/index.h"
#include "miscadmin.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
@@ -79,6 +81,13 @@ brinRevmapInitialize(Relation idxrel, BlockNumber *pagesPerRange,
meta = ReadBuffer(idxrel, BRIN_METAPAGE_BLKNO);
LockBuffer(meta, BUFFER_LOCK_SHARE);
page = BufferGetPage(meta);
+
+ if (GlobalTempRelationPageIsNotInitialized(idxrel, page))
+ {
+ Relation heap = RelationIdGetRelation(idxrel->rd_index->indrelid);
+ brinbuild(heap, idxrel, BuildIndexInfo(idxrel));
+ RelationClose(heap);
+ }
TestForOldSnapshot(snapshot, idxrel, page);
metadata = (BrinMetaPageData *) PageGetContents(page);
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index b5072c0..650f31a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -158,6 +158,19 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ /*
+ * For global temp tables only:
+ * use AccessExclusiveLock to ensure safety
+ */
+ {
+ {
+ "on_commit_delete_rows",
+ "global temp table on commit options",
+ RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1478,6 +1491,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
StdRdOptions *rdopts;
int numoptions;
static const relopt_parse_elt tab[] = {
+ {"on_commit_delete_rows", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, on_commit_delete_rows)},
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 439a91b..06fb671 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -23,6 +23,7 @@
#include "access/xloginsert.h"
#include "access/xlog.h"
#include "commands/vacuum.h"
+#include "catalog/index.h"
#include "catalog/pg_am.h"
#include "miscadmin.h"
#include "utils/memutils.h"
@@ -241,6 +242,13 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
metapage = BufferGetPage(metabuffer);
+ if (GlobalTempRelationPageIsNotInitialized(index, metapage))
+ {
+ Relation heap = RelationIdGetRelation(index->rd_index->indrelid);
+ ginbuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ }
+
/*
* An insertion to the pending list could logically belong anywhere in the
* tree, so it conflicts with all serializable scans. All scans acquire a
diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c
index b18ae2b..a7ad4c6 100644
--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -16,6 +16,7 @@
#include "access/gin_private.h"
#include "access/relscan.h"
+#include "catalog/index.h"
#include "miscadmin.h"
#include "storage/predicate.h"
#include "utils/datum.h"
@@ -1759,7 +1760,8 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
match;
int i;
pendingPosition pos;
- Buffer metabuffer = ReadBuffer(scan->indexRelation, GIN_METAPAGE_BLKNO);
+ Relation index = scan->indexRelation;
+ Buffer metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
Page page;
BlockNumber blkno;
@@ -1769,11 +1771,19 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
* Acquire predicate lock on the metapage, to conflict with any fastupdate
* insertions.
*/
- PredicateLockPage(scan->indexRelation, GIN_METAPAGE_BLKNO, scan->xs_snapshot);
+ PredicateLockPage(index, GIN_METAPAGE_BLKNO, scan->xs_snapshot);
LockBuffer(metabuffer, GIN_SHARE);
page = BufferGetPage(metabuffer);
- TestForOldSnapshot(scan->xs_snapshot, scan->indexRelation, page);
+ TestForOldSnapshot(scan->xs_snapshot, index, page);
+
+ if (GlobalTempRelationPageIsNotInitialized(index, page))
+ {
+ Relation heap = RelationIdGetRelation(index->rd_index->indrelid);
+ UnlockReleaseBuffer(metabuffer);
+ ginbuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
+ LockBuffer(metabuffer, GIN_SHARE);
+ page = BufferGetPage(metabuffer);
+ }
blkno = GinPageGetMeta(page)->head;
/*
@@ -1784,10 +1794,9 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
{
/* No pending list, so proceed with normal scan */
UnlockReleaseBuffer(metabuffer);
return;
}
- pos.pendingBuffer = ReadBuffer(scan->indexRelation, blkno);
+ pos.pendingBuffer = ReadBuffer(index, blkno);
LockBuffer(pos.pendingBuffer, GIN_SHARE);
pos.firstOffset = FirstOffsetNumber;
UnlockReleaseBuffer(metabuffer);
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 55eab14..d6739f3 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -328,7 +328,7 @@ ginbuild(Relation heap, Relation index, IndexInfo *indexInfo)
MemoryContext oldCtx;
OffsetNumber attnum;
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
@@ -337,7 +337,15 @@ ginbuild(Relation heap, Relation index, IndexInfo *indexInfo)
memset(&buildstate.buildStats, 0, sizeof(GinStatsData));
/* initialize the meta page */
- MetaBuffer = GinNewBuffer(index);
+ if (index->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ {
+ MetaBuffer = ReadBuffer(index, 0);
+ LockBuffer(MetaBuffer, GIN_SHARE);
+ }
+ else
+ {
+ MetaBuffer = GinNewBuffer(index);
+ }
/* initialize the root page */
RootBuffer = GinNewBuffer(index);
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 0cc8791..bcde5ea 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -16,6 +16,7 @@
#include "access/gist_private.h"
#include "access/gistscan.h"
+#include "catalog/index.h"
#include "catalog/pg_collation.h"
#include "miscadmin.h"
#include "storage/lmgr.h"
@@ -677,7 +678,10 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace,
if (!xlocked)
{
LockBuffer(stack->buffer, GIST_SHARE);
- gistcheckpage(state.r, stack->buffer);
+ if (stack->blkno == GIST_ROOT_BLKNO && GlobalTempRelationPageIsNotInitialized(state.r, BufferGetPage(stack->buffer)))
+ gistbuild(heapRel, r, BuildIndexInfo(r));
+ else
+ gistcheckpage(state.r, stack->buffer);
}
stack->page = (Page) BufferGetPage(stack->buffer);
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index 2f4543d..8d194c8 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -156,7 +156,7 @@ gistbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* We expect to be called exactly once for any index relation. If that's
* not the case, big trouble's what we have.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
@@ -171,8 +171,16 @@ gistbuild(Relation heap, Relation index, IndexInfo *indexInfo)
buildstate.giststate->tempCxt = createTempGistContext();
/* initialize the root page */
- buffer = gistNewBuffer(index);
- Assert(BufferGetBlockNumber(buffer) == GIST_ROOT_BLKNO);
+ if (index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ {
+ buffer = gistNewBuffer(index);
+ Assert(BufferGetBlockNumber(buffer) == GIST_ROOT_BLKNO);
+ }
+ else
+ {
+ buffer = ReadBuffer(index, GIST_ROOT_BLKNO);
+ LockBuffer(buffer, GIST_SHARE);
+ }
page = BufferGetPage(buffer);
START_CRIT_SECTION();
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 22d790d..5560a41 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -17,8 +17,10 @@
#include "access/genam.h"
#include "access/gist_private.h"
#include "access/relscan.h"
+#include "catalog/index.h"
#include "miscadmin.h"
#include "storage/lmgr.h"
+#include "storage/freespace.h"
#include "storage/predicate.h"
#include "pgstat.h"
#include "lib/pairingheap.h"
@@ -344,7 +346,10 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem,
buffer = ReadBuffer(scan->indexRelation, pageItem->blkno);
LockBuffer(buffer, GIST_SHARE);
PredicateLockPage(r, BufferGetBlockNumber(buffer), scan->xs_snapshot);
- gistcheckpage(scan->indexRelation, buffer);
+ if (pageItem->blkno == GIST_ROOT_BLKNO && GlobalTempRelationPageIsNotInitialized(r, BufferGetPage(buffer)))
+ gistbuild(scan->heapRelation, r, BuildIndexInfo(r));
+ else
+ gistcheckpage(scan->indexRelation, buffer);
page = BufferGetPage(buffer);
TestForOldSnapshot(scan->xs_snapshot, r, page);
opaque = GistPageGetOpaque(page);
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 45804d7..50b306a 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -1028,7 +1028,7 @@ gistGetFakeLSN(Relation rel)
{
static XLogRecPtr counter = FirstNormalUnloggedLSN;
- if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ if (RelationHasSessionScope(rel))
{
/*
* Temporary relations are only accessible in our session, so a simple
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 5cc30da..1b228db 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -119,7 +119,7 @@ hashbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* We expect to be called exactly once for any index relation. If that's
* not the case, big trouble's what we have.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index 838ee68..544d01b 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -30,6 +30,8 @@
#include "access/hash.h"
#include "access/hash_xlog.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
#include "miscadmin.h"
#include "storage/lmgr.h"
#include "storage/smgr.h"
@@ -75,13 +77,22 @@ _hash_getbuf(Relation rel, BlockNumber blkno, int access, int flags)
buf = ReadBuffer(rel, blkno);
- if (access != HASH_NOLOCK)
- LockBuffer(buf, access);
-
/* ref count and lock type are correct */
- _hash_checkpage(rel, buf, flags);
-
+ if (blkno == HASH_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ {
+ Relation heap = RelationIdGetRelation(rel->rd_index->indrelid);
+ hashbuild(heap, rel, BuildIndexInfo(rel));
+ RelationClose(heap);
+ if (access != HASH_NOLOCK)
+ LockBuffer(buf, access);
+ }
+ else
+ {
+ if (access != HASH_NOLOCK)
+ LockBuffer(buf, access);
+ _hash_checkpage(rel, buf, flags);
+ }
return buf;
}
@@ -339,7 +350,7 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
bool use_wal;
/* safety check */
- if (RelationGetNumberOfBlocksInFork(rel, forkNum) != 0)
+ if (rel->rd_rel->relpersistence != RELPERSISTENCE_SESSION && RelationGetNumberOfBlocksInFork(rel, forkNum) != 0)
elog(ERROR, "cannot initialize non-empty hash index \"%s\"",
RelationGetRelationName(rel));
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 2dd8821..92df373 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -673,6 +673,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 268f869..eff9e10 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -27,8 +27,10 @@
#include "access/transam.h"
#include "access/xlog.h"
#include "access/xloginsert.h"
+#include "catalog/index.h"
#include "miscadmin.h"
#include "storage/indexfsm.h"
+#include "storage/buf_internals.h"
#include "storage/lmgr.h"
#include "storage/predicate.h"
#include "utils/snapmgr.h"
@@ -762,8 +764,22 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
{
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
- LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* A session temporary relation may not yet be initialized in this backend. */
+ if (blkno == BTREE_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ {
+ Relation heap = RelationIdGetRelation(rel->rd_index->indrelid);
+ ReleaseBuffer(buf);
+ DropRelFileNodeLocalBuffers(rel->rd_node, MAIN_FORKNUM, blkno);
+ btbuild(heap, rel, BuildIndexInfo(rel));
+ RelationClose(heap);
+ buf = ReadBuffer(rel, blkno);
+ LockBuffer(buf, access);
+ }
+ else
+ {
+ LockBuffer(buf, access);
+ _bt_checkpage(rel, buf);
+ }
}
else
{
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index ab19692..227bc19 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -330,7 +330,7 @@ btbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* We expect to be called exactly once for any index relation. If that's
* not the case, big trouble's what we have.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index b40bd44..f44bec7 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -81,21 +81,32 @@ spgbuild(Relation heap, Relation index, IndexInfo *indexInfo)
rootbuffer,
nullbuffer;
- if (RelationGetNumberOfBlocks(index) != 0)
- elog(ERROR, "index \"%s\" already contains data",
- RelationGetRelationName(index));
-
- /*
- * Initialize the meta page and root pages
- */
- metabuffer = SpGistNewBuffer(index);
- rootbuffer = SpGistNewBuffer(index);
- nullbuffer = SpGistNewBuffer(index);
-
- Assert(BufferGetBlockNumber(metabuffer) == SPGIST_METAPAGE_BLKNO);
- Assert(BufferGetBlockNumber(rootbuffer) == SPGIST_ROOT_BLKNO);
- Assert(BufferGetBlockNumber(nullbuffer) == SPGIST_NULL_BLKNO);
-
+ if (index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ {
+ if (RelationGetNumberOfBlocks(index) != 0)
+ elog(ERROR, "index \"%s\" already contains data",
+ RelationGetRelationName(index));
+
+ /*
+ * Initialize the meta page and root pages
+ */
+ metabuffer = SpGistNewBuffer(index);
+ rootbuffer = SpGistNewBuffer(index);
+ nullbuffer = SpGistNewBuffer(index);
+
+ Assert(BufferGetBlockNumber(metabuffer) == SPGIST_METAPAGE_BLKNO);
+ Assert(BufferGetBlockNumber(rootbuffer) == SPGIST_ROOT_BLKNO);
+ Assert(BufferGetBlockNumber(nullbuffer) == SPGIST_NULL_BLKNO);
+ }
+ else
+ {
+ metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
+ rootbuffer = ReadBuffer(index, SPGIST_ROOT_BLKNO);
+ nullbuffer = ReadBuffer(index, SPGIST_NULL_BLKNO);
+ LockBuffer(metabuffer, BUFFER_LOCK_SHARE);
+ LockBuffer(rootbuffer, BUFFER_LOCK_SHARE);
+ LockBuffer(nullbuffer, BUFFER_LOCK_SHARE);
+ }
START_CRIT_SECTION();
SpGistInitMetapage(BufferGetPage(metabuffer));
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 45472db..ea15964 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -21,6 +21,7 @@
#include "access/spgist_private.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "catalog/index.h"
#include "catalog/pg_amop.h"
#include "storage/bufmgr.h"
#include "storage/indexfsm.h"
@@ -106,6 +107,7 @@ spgGetCache(Relation index)
spgConfigIn in;
FmgrInfo *procinfo;
Buffer metabuffer;
+ Page metapage;
SpGistMetaPageData *metadata;
cache = MemoryContextAllocZero(index->rd_indexcxt,
@@ -155,12 +157,21 @@ spgGetCache(Relation index)
metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
LockBuffer(metabuffer, BUFFER_LOCK_SHARE);
- metadata = SpGistPageGetMeta(BufferGetPage(metabuffer));
+ metapage = BufferGetPage(metabuffer);
+ metadata = SpGistPageGetMeta(metapage);
if (metadata->magicNumber != SPGIST_MAGIC_NUMBER)
- elog(ERROR, "index \"%s\" is not an SP-GiST index",
- RelationGetRelationName(index));
-
+ {
+ if (GlobalTempRelationPageIsNotInitialized(index, metapage))
+ {
+ Relation heap = RelationIdGetRelation(index->rd_index->indrelid);
+ spgbuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ }
+ else
+ elog(ERROR, "index \"%s\" is not an SP-GiST index",
+ RelationGetRelationName(index));
+ }
cache->lastUsedPages = metadata->lastUsedPages;
UnlockReleaseBuffer(metabuffer);
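
All of the index AMs patched above handle first access by a backend in the
same way: shared metadata says the index exists, but this backend's file may
still be empty, so the index is rebuilt lazily. The common pattern, condensed
(ambuild stands for btbuild/ginbuild/gistbuild/hashbuild/spgbuild):

    if (GlobalTempRelationPageIsNotInitialized(index, page))
    {
        Relation heap = RelationIdGetRelation(index->rd_index->indrelid);
        ambuild(heap, index, BuildIndexInfo(index));
        RelationClose(heap);
    }
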
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 1af31c2..e60bdb7 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -402,6 +402,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index f6c31cc..d943b57 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3652,7 +3652,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 625af8d..1e192fa 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -93,6 +93,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 9fe4a47..46b07c4 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1327,7 +1327,17 @@ LANGUAGE INTERNAL
STRICT STABLE PARALLEL SAFE
AS 'jsonb_path_query_first_tz';
+
+--
+-- Statistics for global temporary tables
--
+
+CREATE VIEW pg_sequence_params AS select s.* from pg_class c,pg_sequence_parameters(c.oid) s where c.relkind='S';
+
+CREATE VIEW pg_gtt_statistic AS
+ SELECT s.* from pg_class c,pg_gtt_statistic_for_relation(c.oid) s where c.relpersistence='s';
+
+
-- The default permissions for functions mean that anyone can execute them.
-- A number of functions shouldn't be executable by just anyone, but rather
-- than use explicit 'superuser()' checks in those functions, we use the GRANT
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 7accb95..fb11f26 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -39,6 +39,7 @@
#include "commands/vacuum.h"
#include "executor/executor.h"
#include "foreign/fdwapi.h"
+#include "funcapi.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "parser/parse_oper.h"
@@ -102,7 +103,7 @@ static int acquire_inherited_sample_rows(Relation onerel, int elevel,
HeapTuple *rows, int targrows,
double *totalrows, double *totaldeadrows);
static void update_attstats(Oid relid, bool inh,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats, bool is_global_temp);
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
@@ -318,6 +319,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
Oid save_userid;
int save_sec_context;
int save_nestlevel;
+ bool is_global_temp = onerel->rd_rel->relpersistence == RELPERSISTENCE_SESSION;
if (inh)
ereport(elevel,
@@ -575,14 +577,14 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
* pg_statistic for columns we didn't process, we leave them alone.)
*/
update_attstats(RelationGetRelid(onerel), inh,
- attr_cnt, vacattrstats);
+ attr_cnt, vacattrstats, is_global_temp);
for (ind = 0; ind < nindexes; ind++)
{
AnlIndexData *thisdata = &indexdata[ind];
update_attstats(RelationGetRelid(Irel[ind]), false,
- thisdata->attr_cnt, thisdata->vacattrstats);
+ thisdata->attr_cnt, thisdata->vacattrstats, is_global_temp);
}
/*
@@ -1425,7 +1427,7 @@ acquire_inherited_sample_rows(Relation onerel, int elevel,
* by taking a self-exclusive lock on the relation in analyze_rel().
*/
static void
-update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
+update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats, bool is_global_temp)
{
Relation sd;
int attno;
@@ -1527,30 +1529,42 @@ update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
}
}
- /* Is there already a pg_statistic tuple for this attribute? */
- oldtup = SearchSysCache3(STATRELATTINH,
- ObjectIdGetDatum(relid),
- Int16GetDatum(stats->attr->attnum),
- BoolGetDatum(inh));
-
- if (HeapTupleIsValid(oldtup))
+ if (is_global_temp)
{
- /* Yes, replace it */
- stup = heap_modify_tuple(oldtup,
- RelationGetDescr(sd),
- values,
- nulls,
- replaces);
- ReleaseSysCache(oldtup);
- CatalogTupleUpdate(sd, &stup->t_self, stup);
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ InsertSysCache(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh),
+ 0,
+ stup);
}
else
{
- /* No, insert new tuple */
- stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
- CatalogTupleInsert(sd, stup);
- }
+ /* Is there already a pg_statistic tuple for this attribute? */
+ oldtup = SearchSysCache3(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh));
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ CatalogTupleUpdate(sd, &stup->t_self, stup);
+ }
+ else
+ {
+ /* No, insert new tuple */
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ CatalogTupleInsert(sd, stup);
+ }
+ }
heap_freetuple(stup);
}
@@ -2859,3 +2873,114 @@ analyze_mcv_list(int *mcv_counts,
}
return num_mcv;
}
+
+PG_FUNCTION_INFO_V1(pg_gtt_statistic_for_relation);
+
+typedef struct
+{
+ int staattnum;
+ bool stainherit;
+} PgTempStatIteratorCtx;
+
+Datum
+pg_gtt_statistic_for_relation(PG_FUNCTION_ARGS)
+{
+ Oid starelid = PG_GETARG_OID(0);
+#if 1
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ Tuplestorestate *tupstore;
+ MemoryContext per_query_ctx;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ bool stainherit = false;
+
+ /* determine the result row type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));
+
+ /* Build tuplestore to hold the result rows */
+ per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+ oldcontext = MemoryContextSwitchTo(per_query_ctx);
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ do
+ {
+ int staattnum = 0;
+ while (true)
+ {
+ HeapTuple statup = SearchSysCacheCopy3(STATRELATTINH,
+ ObjectIdGetDatum(starelid),
+ Int16GetDatum(++staattnum),
+ BoolGetDatum(stainherit));
+ if (statup != NULL)
+ tuplestore_puttuple(tupstore, statup);
+ else
+ break;
+ }
+ stainherit = !stainherit;
+ } while (stainherit);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ tuplestore_donestoring(tupstore);
+
+ return (Datum) 0;
+#else
+ FuncCallContext *funcctx;
+ PgTempStatIteratorCtx *it;
+ HeapTuple statup;
+
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ it = palloc0(sizeof(PgTempStatIteratorCtx));
+ funcctx->user_fctx = (void *)it;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+ else
+ {
+ funcctx = SRF_PERCALL_SETUP();
+ it = (PgTempStatIteratorCtx*)funcctx->user_fctx;
+ }
+ while (true)
+ {
+ it->staattnum += 1;
+ statup = SearchSysCacheCopy3(STATRELATTINH,
+ ObjectIdGetDatum(starelid),
+ Int16GetDatum(it->staattnum),
+ BoolGetDatum(it->stainherit));
+ if (statup != NULL)
+ SRF_RETURN_NEXT(funcctx, statup);
+
+ if (it->stainherit)
+ SRF_RETURN_DONE(funcctx);
+
+ it->stainherit = true;
+ it->staattnum = 0;
+ }
+#endif
+}
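
Together with the pg_gtt_statistic view added to system_views.sql above, this
function lets each session inspect its own statistics for a global temp table,
for example (my_gtt is a hypothetical table):

    analyze my_gtt;
    select starelid::regclass, staattnum, stanullfrac, stawidth, stadistinct
      from pg_gtt_statistic
     where starelid = 'my_gtt'::regclass;
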
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index a23128d..b1b786d 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -392,6 +392,13 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
errmsg("cannot vacuum temporary tables of other sessions")));
}
+ /* clustering of global temp tables is not supported yet */
+ if (OldHeap->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("clustering global temporary tables is not supported yet")));
+
+
/*
* Also check for active uses of the relation in the current transaction,
* including open scans and pending AFTER trigger events.
@@ -1400,7 +1407,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index a13322b..be661a4 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (GlobalTempRelationPageIsNotInitialized(rel, page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 8d25d14..21d5a30 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -12,6 +12,9 @@
*
*-------------------------------------------------------------------------
*/
+#include <sys/stat.h>
+#include <unistd.h>
+
#include "postgres.h"
#include "access/genam.h"
@@ -533,6 +536,23 @@ static List *GetParentedForeignKeyRefs(Relation partition);
static void ATDetachCheckNoForeignKeyRefs(Relation partition);
+static bool
+has_oncommit_option(List *options)
+{
+ ListCell *listptr;
+
+ foreach(listptr, options)
+ {
+ DefElem *def = (DefElem *) lfirst(listptr);
+
+ if (pg_strcasecmp(def->defname, "on_commit_delete_rows") == 0)
+ return true;
+ }
+
+ return false;
+}
+
+
/* ----------------------------------------------------------------
* DefineRelation
* Creates a new relation.
@@ -576,6 +596,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
LOCKMODE parentLockmode;
const char *accessMethod = NULL;
Oid accessMethodId = InvalidOid;
+ bool has_oncommit_clause = false;
/*
* Truncate relname to appropriate length (probably a waste of time, as
@@ -587,7 +608,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && !IsLocalRelpersistence(stmt->relation->relpersistence))
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -613,17 +634,6 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
RangeVarGetAndCheckCreationNamespace(stmt->relation, NoLock, NULL);
/*
- * Security check: disallow creating temp tables from security-restricted
- * code. This is needed because calling code might not expect untrusted
- * tables to appear in pg_temp at the front of its search path.
- */
- if (stmt->relation->relpersistence == RELPERSISTENCE_TEMP
- && InSecurityRestrictedOperation())
- ereport(ERROR,
- (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- errmsg("cannot create temporary table within security-restricted operation")));
-
- /*
* Determine the lockmode to use when scanning parents. A self-exclusive
* lock is needed here.
*
@@ -718,6 +728,38 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
/*
* Parse and validate reloptions, if any.
*/
+ /* Global temp table: remember ON COMMIT clause in reloptions */
+ has_oncommit_clause = has_oncommit_option(stmt->options);
+ if (stmt->relation->relpersistence == RELPERSISTENCE_SESSION)
+ {
+ if (has_oncommit_clause)
+ {
+ if (stmt->oncommit != ONCOMMIT_NOOP)
+ elog(ERROR, "can not defeine global temp table with on commit and with clause at same time");
+ }
+ else if (stmt->oncommit != ONCOMMIT_NOOP)
+ {
+ DefElem *opt = makeNode(DefElem);
+
+ opt->type = T_DefElem;
+ opt->defnamespace = NULL;
+ opt->defname = "on_commit_delete_rows";
+ opt->defaction = DEFELEM_UNSPEC;
+
+ /* use reloptions to remember on commit clause */
+ if (stmt->oncommit == ONCOMMIT_DELETE_ROWS)
+ opt->arg = (Node *)makeString("true");
+ else if (stmt->oncommit == ONCOMMIT_PRESERVE_ROWS)
+ opt->arg = (Node *)makeString("false");
+ else
+ elog(ERROR, "global temp table not support on commit drop clause");
+
+ stmt->options = lappend(stmt->options, opt);
+ }
+ }
+ else if (has_oncommit_clause)
+ elog(ERROR, "regular table cannot specifie on_commit_delete_rows");
+
reloptions = transformRelOptions((Datum) 0, stmt->options, NULL, validnsps,
true, false);
@@ -1772,7 +1814,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -3449,6 +3492,26 @@ AlterTableLookupRelation(AlterTableStmt *stmt, LOCKMODE lockmode)
(void *) stmt);
}
+
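+/*
+ * A global temp table is considered in use if some other backend has a
+ * non-empty relation file for it: each backend stores its data in its own
+ * file, so an empty or missing file means that backend doesn't use the table.
+ */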
+static bool
+CheckGlobalTempTableNotInUse(Relation rel)
+{
+ int id;
+ for (id = 1; id <= MaxBackends; id++)
+ {
+ if (id != MyBackendId)
+ {
+ struct stat fst;
+ char* path = relpathbackend(rel->rd_node, id, MAIN_FORKNUM);
+ int rc = stat(path, &fst);
+ pfree(path);
+ if (rc == 0 && fst.st_size != 0)
+ return false;
+ }
+ }
+ return true;
+}
+
/*
* AlterTable
* Execute ALTER TABLE, which can be a list of subcommands
@@ -3500,6 +3563,9 @@ AlterTable(Oid relid, LOCKMODE lockmode, AlterTableStmt *stmt)
rel = relation_open(relid, NoLock);
CheckTableNotInUse(rel, "ALTER TABLE");
+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION
+ && !CheckGlobalTempTableNotInUse(rel))
+ elog(ERROR, "Global temp table used by active backends can not be altered");
ATController(stmt, rel, stmt->cmds, stmt->relation->inh, lockmode);
}
@@ -7708,6 +7774,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14140,6 +14212,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -14627,14 +14706,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index db3a68a..60212b0 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -48,6 +48,7 @@
#include "partitioning/partprune.h"
#include "rewrite/rewriteManip.h"
#include "utils/lsyscache.h"
+#include "utils/rel.h"
/* results of subquery_is_pushdown_safe */
@@ -618,7 +619,7 @@ set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
* the rest of the necessary infrastructure right now anyway. So
* for now, bail out if we see a temporary table.
*/
- if (get_rel_persistence(rte->relid) == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(get_rel_persistence(rte->relid)))
return;
/*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 17c5f08..7c83e7b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6307,7 +6307,7 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
* Furthermore, any index predicate or index expressions must be parallel
* safe.
*/
- if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
+ if (RelationHasSessionScope(heap) ||
!is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index)) ||
!is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index)))
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3f67aaf..565c868 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3266,20 +3266,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
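
With this grammar change the following spellings are accepted (GLOBAL TEMP is
no longer a deprecated synonym of plain TEMP):

    create global temp table gt(x integer);  -- session table: shared metadata
    create session table st(x integer);      -- same persistence, new spelling
    create local temp table lt(x integer);   -- old backend-local temp table
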
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index ee47547..ea7fe4c 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why should we not always use the persistence of the parent table?
+ * Because although unlogged sequences are prohibited,
+ * unlogged tables with SERIAL fields are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1dd816..dcfc134 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2157,7 +2157,7 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
continue;
relid = classForm->oid;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 483f705..1129dc3 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -2933,7 +2933,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber *forkNum,
/* If it's a local relation, it's localbuf.c's problem. */
if (RelFileNodeBackendIsTemp(rnode))
{
- if (rnode.backend == MyBackendId)
+ if (GetRelationBackendId(rnode.backend) == MyBackendId)
{
for (j = 0; j < nforks; j++)
DropRelFileNodeLocalBuffers(rnode.node, forkNum[j],
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 07f3c93..8cf06f6 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -28,17 +28,20 @@
#include "miscadmin.h"
#include "access/xlogutils.h"
#include "access/xlog.h"
+#include "commands/tablecmds.h"
#include "commands/tablespace.h"
#include "pgstat.h"
#include "postmaster/bgwriter.h"
#include "storage/fd.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
#include "storage/smgr.h"
#include "storage/sync.h"
#include "utils/hsearch.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
#include "pg_trace.h"
/*
@@ -87,6 +90,19 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to collect information created by this backend.
+ * Data of this related should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ ForkNumber forknum;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +168,60 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Delete relation files */
+ mdunlink(rel->rnode, rel->forknum, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or when it
+ * is opened and its file doesn't exist yet. This procedure guarantees that
+ * each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln, ForkNumber forknum)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We use a shared-memory
+ * exit hook, because cleanup should be performed before the backend is
+ * disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->forknum = forknum;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
+static void
+RegisterOnCommitAction(SMgrRelation reln, ForkNumber forknum)
+{
+ if (reln->smgr_owner && forknum == MAIN_FORKNUM)
+ {
+ Relation rel = (Relation)((char*)reln->smgr_owner - offsetof(RelationData, rd_smgr));
+ if (rel->rd_options
+ && ((StdRdOptions *)rel->rd_options)->on_commit_delete_rows)
+ {
+ register_on_commit_action(rel->rd_id, ONCOMMIT_DELETE_ROWS);
+ }
+ }
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +288,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln, forkNum);
pfree(path);
@@ -465,6 +537,21 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * When accessing a session relation, this backend may not yet have files
+ * for it. If so, create the file and register the session relation for
+ * cleanup on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln, forknum);
+ if (!(behavior & EXTENSION_RETURN_NULL))
+ RegisterOnCommitAction(reln, forknum);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +563,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +740,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* For a session relation we need to write the zeroed page so that a subsequent mdnblocks() gives the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -743,7 +836,8 @@ mdnblocks(SMgrRelation reln, ForkNumber forknum)
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
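
To summarize the md.c changes: the first touch of a session relation in a
backend creates its file on demand; RegisterSessionRelation links the relation
into the SessionRelations list and installs the on_shmem_exit hook once; on
backend exit TruncateSessionRelations unlinks every registered file. A sketch
of the lifecycle:

    mdcreate()/mdopenfork()   -> RegisterSessionRelation()
    RegisterSessionRelation() -> on_shmem_exit(TruncateSessionRelations, 0)
    backend exit              -> TruncateSessionRelations() -> mdunlink()
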
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index c3e7d94..720dd52 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1191,6 +1191,110 @@ SearchCatCache4(CatCache *cache,
return SearchCatCacheInternal(cache, 4, v1, v2, v3, v4);
}
+
+void InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple)
+{
+ Datum arguments[CATCACHE_MAXKEYS];
+ uint32 hashValue;
+ Index hashIndex;
+ CatCTup *ct;
+ dlist_iter iter;
+ dlist_head *bucket;
+ int nkeys = cache->cc_nkeys;
+ MemoryContext oldcxt;
+
+ /*
+ * one-time startup overhead for each cache
+ */
+ if (unlikely(cache->cc_tupdesc == NULL))
+ CatalogCacheInitializeCache(cache);
+
+ /* Initialize local parameter array */
+ arguments[0] = v1;
+ arguments[1] = v2;
+ arguments[2] = v3;
+ arguments[3] = v4;
+ /*
+ * find the hash bucket in which to look for the tuple
+ */
+ hashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
+ hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);
+
+ /*
+ * scan the hash bucket until we find a match or exhaust our tuples
+ *
+ * Note: it's okay to use dlist_foreach here, even though we modify the
+ * dlist within the loop, because we don't continue the loop afterwards.
+ */
+ bucket = &cache->cc_bucket[hashIndex];
+ dlist_foreach(iter, bucket)
+ {
+ ct = dlist_container(CatCTup, cache_elem, iter.cur);
+
+ if (ct->dead)
+ continue; /* ignore dead entries */
+
+ if (ct->hash_value != hashValue)
+ continue; /* quickly skip entry if wrong hash val */
+
+ if (!CatalogCacheCompareTuple(cache, nkeys, ct->keys, arguments))
+ continue;
+
+ /*
+ * If it's a positive entry, bump its refcount and return it. If it's
+ * negative, we can report failure to the caller.
+ */
+ if (ct->tuple.t_len == tuple->t_len)
+ {
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ return;
+ }
+ dlist_delete(&ct->cache_elem);
+ pfree(ct);
+ cache->cc_ntup -= 1;
+ CacheHdr->ch_ntup -= 1;
+ break;
+ }
+ /* Allocate memory for CatCTup and the cached tuple in one go */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+
+ ct = (CatCTup *) palloc(sizeof(CatCTup) +
+ MAXIMUM_ALIGNOF + tuple->t_len);
+ ct->tuple.t_len = tuple->t_len;
+ ct->tuple.t_self = tuple->t_self;
+ ct->tuple.t_tableOid = tuple->t_tableOid;
+ ct->tuple.t_data = (HeapTupleHeader)
+ MAXALIGN(((char *) ct) + sizeof(CatCTup));
+ /* copy tuple contents */
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ ct->ct_magic = CT_MAGIC;
+ ct->my_cache = cache;
+ ct->c_list = NULL;
+ ct->refcount = 1; /* pinned*/
+ ct->dead = false;
+ ct->negative = false;
+ ct->hash_value = hashValue;
+ dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
+ memcpy(ct->keys, arguments, nkeys*sizeof(Datum));
+
+ cache->cc_ntup++;
+ CacheHdr->ch_ntup++;
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * If the hash table has become too full, enlarge the buckets array. Quite
+ * arbitrarily, we enlarge when fill factor > 2.
+ */
+ if (cache->cc_ntup > cache->cc_nbuckets * 2)
+ RehashCatCache(cache);
+}
+
/*
* Work-horse for SearchCatCache/SearchCatCacheN.
*/
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 585dcee..ce8852c 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1098,6 +1098,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3301,6 +3305,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 16297a5..e7a4d3c 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -1164,6 +1164,16 @@ SearchSysCache4(int cacheId,
return SearchCatCache4(SysCache[cacheId], key1, key2, key3, key4);
}
+void
+InsertSysCache(int cacheId,
+ Datum key1, Datum key2, Datum key3, Datum key4,
+ HeapTuple value)
+{
+ Assert(cacheId >= 0 && cacheId < SysCacheSize &&
+ PointerIsValid(SysCache[cacheId]));
+ InsertCatCache(SysCache[cacheId], key1, key2, key3, key4, value);
+}
+
/*
* ReleaseSysCache
* Release previously grabbed reference count on a tuple
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 4688fbc..9e49d4e 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -18,6 +18,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_statistic_d.h"
#include "funcapi.h"
#include "nodes/nodeFuncs.h"
#include "parser/parse_coerce.h"
@@ -341,7 +342,8 @@ internal_get_result_type(Oid funcid,
if (resolve_polymorphic_tupdesc(tupdesc,
&procform->proargtypes,
- call_expr))
+ call_expr) ||
+ funcid == GttStatisticFunctionId)
{
if (tupdesc->tdtypeid == RECORDOID &&
tupdesc->tdtypmod < 0)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index bf69adc..fa7479c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15637,8 +15637,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * the SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just
+ * subtract SessionRelFirstBackendId to avoid overly long file names.
+ * Segments of session relations get the same prefix (t%d_) as local
+ * temporary relations, so they can be cleaned up in the same way as local
+ * temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 58ea5b9..082c380 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5476,7 +5476,14 @@
proname => 'pg_stat_get_xact_function_self_time', provolatile => 'v',
proparallel => 'r', prorettype => 'float8', proargtypes => 'oid',
prosrc => 'pg_stat_get_xact_function_self_time' },
-
+{ oid => '3434',
+ descr => 'show local statistics for global temp table',
+ proname => 'pg_gtt_statistic_for_relation', provolatile => 'v', proparallel => 'u',
+ prorettype => 'record', proretset => 't', proargtypes => 'oid',
+ proallargtypes => '{oid,oid,int2,bool,float4,int4,float4,int2,int2,int2,int2,int2,oid,oid,oid,oid,oid,oid,oid,oid,oid,oid,_float4,_float4,_float4,_float4,_float4,anyarray,anyarray,anyarray,anyarray,anyarray}',
+ proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{relid,starelid,staattnum,stainherit,stanullfrac,stawidth,stadistinct,stakind1,stakind2,stakind3,stakind4,stakind5,staop1,staop2,staop3,staop4,staop5,stacoll1,stacoll2,stacoll3,stacoll4,stacoll5,stanumbers1,stanumbers2,stanumbers3,stanumbers4,stanumbers5,stavalues1,stavalues2,stavalues3,stavalues4,stavalues5}',
+ prosrc => 'pg_gtt_statistic_for_relation' },
{ oid => '3788',
descr => 'statistics: timestamp of the current statistics snapshot',
proname => 'pg_stat_get_snapshot_timestamp', provolatile => 's',
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..11b4b89 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by their
+ * RelFileNodeBackend. The least invasive change is to add a special bias
+ * value to the backend id (since the maximal number of backends is limited
+ * by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,12 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
+#define GetRelationBackendId(id) ((id) & ~SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
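
A worked example of the bias arithmetic, assuming MyBackendId == 5 and no
parallel leader:

    BackendIdForSessionRelations()   /* returns 5 + 0x40000000 */
    IsSessionRelationBackendId(id)   /* true for such an id */
    GetRelationBackendId(id)         /* yields 5 again */

GetRelationPath() then produces base/<dbNode>/t5_<relNode>, the same naming
scheme as a local temp relation of backend 5.
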
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 4ef6d8d..bac7a31 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -229,6 +229,13 @@ typedef PageHeaderData *PageHeader;
#define PageIsNew(page) (((PageHeader) (page))->pd_upper == 0)
/*
+ * Check whether a page of a global temp relation is not yet initialized in this backend
+ */
+#define GlobalTempRelationPageIsNotInitialized(rel, page) \
+ ((rel)->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+
+
+/*
* PageGetItemId
* Returns an item identifier of a page.
*/
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether it is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether it is a global temporary relation, whose metadata is shared
+ * by all sessions but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether it is a local temporary relation, which exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index ff1faba..31f615d 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -228,4 +228,8 @@ extern void PrepareToInvalidateCacheTuple(Relation relation,
extern void PrintCatCacheLeakWarning(HeapTuple tuple);
extern void PrintCatCacheListLeakWarning(CatCList *list);
+extern void InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
+
#endif /* CATCACHE_H */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index a5cf804..a30137f 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -272,6 +272,7 @@ typedef struct StdRdOptions
int parallel_workers; /* max number of parallel workers */
bool vacuum_index_cleanup; /* enables index vacuuming and cleanup */
bool vacuum_truncate; /* enables vacuum to truncate a relation */
+ bool on_commit_delete_rows; /* global temp table */
} StdRdOptions;
#define HEAP_MIN_FILLFACTOR 10
@@ -327,6 +328,18 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * Relation persistence is either TEMP or SESSION
+ */
+#define IsLocalRelpersistence(relpersistence) \
+ ((relpersistence) == RELPERSISTENCE_TEMP || (relpersistence) == RELPERSISTENCE_SESSION)
+
+/*
+ * Relation is either a global or a local temp table
+ */
+#define RelationHasSessionScope(relation) \
+ IsLocalRelpersistence(((relation)->rd_rel->relpersistence))
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
@@ -335,6 +348,7 @@ typedef enum ViewOptCheckOption
VIEW_OPTION_CHECK_OPTION_CASCADED
} ViewOptCheckOption;
+
/*
* ViewOptions
* Contents of rd_options for views
@@ -526,7 +540,7 @@ typedef struct ViewOptions
* True if relation's pages are stored in local buffers.
*/
#define RelationUsesLocalBuffers(relation) \
- ((relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ RelationHasSessionScope(relation)
/*
* RELATION_IS_LOCAL
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 918765c..5b1598b 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -216,4 +216,8 @@ extern bool RelationSupportsSysCache(Oid relid);
#define ReleaseSysCacheList(x) ReleaseCatCacheList(x)
+
+extern void InsertSysCache(int cacheId,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
#endif /* SYSCACHE_H */
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index a2fa192..ef7aa85 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -88,3 +88,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test with minor changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to the temporary child relation
+# of another session, but it does block that session's scans of the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..ae1adb6
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,247 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fc0f141..507cf7d 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 68ac56a..3890777 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..3058b9b
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,151 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
On Wed, Nov 20, 2019 at 07:32:14PM +0300, Konstantin Knizhnik wrote:
Now pg_gtt_statistic view is provided for global temp tables.
Latest patch fails to apply, per Mr Robot's report. Could you please
rebase and send an updated version? For now I have moved the patch to
next CF, waiting on author.
--
Michael
On 01.12.2019 4:56, Michael Paquier wrote:
On Wed, Nov 20, 2019 at 07:32:14PM +0300, Konstantin Knizhnik wrote:
Now pg_gtt_statistic view is provided for global temp tables.
Latest patch fails to apply, per Mr Robot's report. Could you please
rebase and send an updated version? For now I have moved the patch to
next CF, waiting on author.
--
Michael
Rebased version of the patch is attached.
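For reference, a minimal usage sketch of the view (the table and data here are
just an illustration; output rows follow the pg_statistic format):
create global temp table gtt(x integer);
insert into gtt values (generate_series(1,1000));
analyze gtt;
select * from pg_gtt_statistic;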
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
global_private_temp-8.patch (text/x-patch)
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 294ffa6..4e3eceb 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -672,7 +672,7 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo)
/*
* We expect to be called exactly once for any index relation.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
@@ -681,9 +681,17 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* whole relation will be rolled back.
*/
- meta = ReadBuffer(index, P_NEW);
- Assert(BufferGetBlockNumber(meta) == BRIN_METAPAGE_BLKNO);
- LockBuffer(meta, BUFFER_LOCK_EXCLUSIVE);
+ if (index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ {
+ meta = ReadBuffer(index, P_NEW);
+ Assert(BufferGetBlockNumber(meta) == BRIN_METAPAGE_BLKNO);
+ LockBuffer(meta, BUFFER_LOCK_EXCLUSIVE);
+ }
+ else
+ {
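+ /* Global temp index: block 0 already exists as an all-zero page, so reuse it instead of extending the relation */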
+ meta = ReadBuffer(index, BRIN_METAPAGE_BLKNO);
+ LockBuffer(meta, BUFFER_LOCK_SHARE);
+ }
brin_metapage_init(BufferGetPage(meta), BrinGetPagesPerRange(index),
BRIN_CURRENT_VERSION);
diff --git a/src/backend/access/brin/brin_revmap.c b/src/backend/access/brin/brin_revmap.c
index 647350c..d432fec 100644
--- a/src/backend/access/brin/brin_revmap.c
+++ b/src/backend/access/brin/brin_revmap.c
@@ -25,8 +25,10 @@
#include "access/brin_revmap.h"
#include "access/brin_tuple.h"
#include "access/brin_xlog.h"
+#include "access/brin.h"
#include "access/rmgr.h"
#include "access/xloginsert.h"
+#include "catalog/index.h"
#include "miscadmin.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
@@ -79,6 +81,13 @@ brinRevmapInitialize(Relation idxrel, BlockNumber *pagesPerRange,
meta = ReadBuffer(idxrel, BRIN_METAPAGE_BLKNO);
LockBuffer(meta, BUFFER_LOCK_SHARE);
page = BufferGetPage(meta);
+
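+ /* First access to a global temp index in this session: build it lazily */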
+ if (GlobalTempRelationPageIsNotInitialized(idxrel, page))
+ {
+ Relation heap = RelationIdGetRelation(idxrel->rd_index->indrelid);
+ brinbuild(heap, idxrel, BuildIndexInfo(idxrel));
+ RelationClose(heap);
+ }
TestForOldSnapshot(snapshot, idxrel, page);
metadata = (BrinMetaPageData *) PageGetContents(page);
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 48377ac..3b3bbc1 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -158,6 +158,19 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ /*
+ * For global temp tables only:
+ * use AccessExclusiveLock to ensure safety
+ */
+ {
+ {
+ "on_commit_delete_rows",
+ "global temp table on commit options",
+ RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1486,6 +1499,8 @@ bytea *
default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{
static const relopt_parse_elt tab[] = {
+ {"on_commit_delete_rows", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, on_commit_delete_rows)},
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
@@ -1586,13 +1601,17 @@ build_reloptions(Datum reloptions, bool validate,
bytea *
partitioned_table_reloptions(Datum reloptions, bool validate)
{
+ static const relopt_parse_elt tab[] = {
+ {"on_commit_delete_rows", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, on_commit_delete_rows)}
+ };
/*
* There are no options for partitioned tables yet, but this is able to do
* some validation.
*/
return (bytea *) build_reloptions(reloptions, validate,
RELOPT_KIND_PARTITIONED,
- 0, NULL, 0);
+ sizeof(StdRdOptions), tab, lengthof(tab));
}
/*
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 3d2153b..06c7f7a 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -22,6 +22,8 @@
#include "access/ginxlog.h"
#include "access/xlog.h"
#include "access/xloginsert.h"
+#include "commands/vacuum.h"
+#include "catalog/index.h"
#include "catalog/pg_am.h"
#include "commands/vacuum.h"
#include "miscadmin.h"
@@ -241,6 +243,13 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
metapage = BufferGetPage(metabuffer);
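+ /* A global temp index may be uninitialized in this session: build it on first access */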
+ if (GlobalTempRelationPageIsNotInitialized(index, metapage))
+ {
+ Relation heap = RelationIdGetRelation(index->rd_index->indrelid);
+ ginbuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ }
+
/*
* An insertion to the pending list could logically belong anywhere in the
* tree, so it conflicts with all serializable scans. All scans acquire a
diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c
index b18ae2b..2e447be 100644
--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -16,6 +16,7 @@
#include "access/gin_private.h"
#include "access/relscan.h"
+#include "catalog/index.h"
#include "miscadmin.h"
#include "storage/predicate.h"
#include "utils/datum.h"
@@ -1759,7 +1760,8 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
match;
int i;
pendingPosition pos;
- Buffer metabuffer = ReadBuffer(scan->indexRelation, GIN_METAPAGE_BLKNO);
+ Relation index = scan->indexRelation;
+ Buffer metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
Page page;
BlockNumber blkno;
@@ -1769,11 +1771,19 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
* Acquire predicate lock on the metapage, to conflict with any fastupdate
* insertions.
*/
- PredicateLockPage(scan->indexRelation, GIN_METAPAGE_BLKNO, scan->xs_snapshot);
+ PredicateLockPage(index, GIN_METAPAGE_BLKNO, scan->xs_snapshot);
LockBuffer(metabuffer, GIN_SHARE);
page = BufferGetPage(metabuffer);
- TestForOldSnapshot(scan->xs_snapshot, scan->indexRelation, page);
+ TestForOldSnapshot(scan->xs_snapshot, index, page);
+
+ if (GlobalTempRelationPageIsNotInitialized(index, page))
+ {
+ Relation heap = RelationIdGetRelation(index->rd_index->indrelid);
+ ginbuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ /* Re-read the metapage after the index has been built */
+ UnlockReleaseBuffer(metabuffer);
+ metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
+ LockBuffer(metabuffer, GIN_SHARE);
+ page = BufferGetPage(metabuffer);
+ }
blkno = GinPageGetMeta(page)->head;
/*
@@ -1787,7 +1797,7 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
return;
}
- pos.pendingBuffer = ReadBuffer(scan->indexRelation, blkno);
+ pos.pendingBuffer = ReadBuffer(index, blkno);
LockBuffer(pos.pendingBuffer, GIN_SHARE);
pos.firstOffset = FirstOffsetNumber;
UnlockReleaseBuffer(metabuffer);
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 1ad6228..49d245c 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -329,7 +329,7 @@ ginbuild(Relation heap, Relation index, IndexInfo *indexInfo)
MemoryContext oldCtx;
OffsetNumber attnum;
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
@@ -338,7 +338,15 @@ ginbuild(Relation heap, Relation index, IndexInfo *indexInfo)
memset(&buildstate.buildStats, 0, sizeof(GinStatsData));
/* initialize the meta page */
- MetaBuffer = GinNewBuffer(index);
+ if (index->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ {
+ MetaBuffer = ReadBuffer(index, 0);
+ LockBuffer(MetaBuffer, GIN_SHARE);
+ }
+ else
+ {
+ MetaBuffer = GinNewBuffer(index);
+ }
/* initialize the root page */
RootBuffer = GinNewBuffer(index);
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 3800f58..e9d1218 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -16,6 +16,7 @@
#include "access/gist_private.h"
#include "access/gistscan.h"
+#include "catalog/index.h"
#include "catalog/pg_collation.h"
#include "miscadmin.h"
#include "nodes/execnodes.h"
@@ -676,7 +677,10 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace,
if (!xlocked)
{
LockBuffer(stack->buffer, GIST_SHARE);
- gistcheckpage(state.r, stack->buffer);
+ if (stack->blkno == GIST_ROOT_BLKNO && GlobalTempRelationPageIsNotInitialized(state.r, BufferGetPage(stack->buffer)))
+ gistbuild(heapRel, r, BuildIndexInfo(r));
+ else
+ gistcheckpage(state.r, stack->buffer);
}
stack->page = (Page) BufferGetPage(stack->buffer);
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index 739846a..f35072b 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -156,7 +156,7 @@ gistbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* We expect to be called exactly once for any index relation. If that's
* not the case, big trouble's what we have.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
@@ -171,8 +171,16 @@ gistbuild(Relation heap, Relation index, IndexInfo *indexInfo)
buildstate.giststate->tempCxt = createTempGistContext();
/* initialize the root page */
- buffer = gistNewBuffer(index);
- Assert(BufferGetBlockNumber(buffer) == GIST_ROOT_BLKNO);
+ if (index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ {
+ buffer = gistNewBuffer(index);
+ Assert(BufferGetBlockNumber(buffer) == GIST_ROOT_BLKNO);
+ }
+ else
+ {
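+ /* Global temp index: the root block already exists in this backend's file, just share-lock it */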
+ buffer = ReadBuffer(index, GIST_ROOT_BLKNO);
+ LockBuffer(buffer, GIST_SHARE);
+ }
page = BufferGetPage(buffer);
START_CRIT_SECTION();
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 98b6892..90d4eaa 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -17,10 +17,12 @@
#include "access/genam.h"
#include "access/gist_private.h"
#include "access/relscan.h"
+#include "catalog/index.h"
#include "lib/pairingheap.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "storage/lmgr.h"
+#include "storage/freespace.h"
#include "storage/predicate.h"
#include "utils/float.h"
#include "utils/memutils.h"
@@ -344,7 +346,10 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem,
buffer = ReadBuffer(scan->indexRelation, pageItem->blkno);
LockBuffer(buffer, GIST_SHARE);
PredicateLockPage(r, BufferGetBlockNumber(buffer), scan->xs_snapshot);
- gistcheckpage(scan->indexRelation, buffer);
+ if (pageItem->blkno == GIST_ROOT_BLKNO && GlobalTempRelationPageIsNotInitialized(r, BufferGetPage(buffer)))
+ gistbuild(scan->heapRelation, r, BuildIndexInfo(r));
+ else
+ gistcheckpage(scan->indexRelation, buffer);
page = BufferGetPage(buffer);
TestForOldSnapshot(scan->xs_snapshot, r, page);
opaque = GistPageGetOpaque(page);
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 553a6d6..581f4c2 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -1013,7 +1013,7 @@ gistGetFakeLSN(Relation rel)
{
static XLogRecPtr counter = FirstNormalUnloggedLSN;
- if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ if (RelationHasSessionScope(rel))
{
/*
* Temporary relations are only accessible in our session, so a simple
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index a0597a0..56bd161 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -117,7 +117,7 @@ hashbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* We expect to be called exactly once for any index relation. If that's
* not the case, big trouble's what we have.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index f84cee8..1948ef7 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -30,6 +30,8 @@
#include "access/hash.h"
#include "access/hash_xlog.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
#include "miscadmin.h"
#include "storage/lmgr.h"
#include "storage/predicate.h"
@@ -74,13 +76,22 @@ _hash_getbuf(Relation rel, BlockNumber blkno, int access, int flags)
buf = ReadBuffer(rel, blkno);
- if (access != HASH_NOLOCK)
- LockBuffer(buf, access);
-
/* ref count and lock type are correct */
- _hash_checkpage(rel, buf, flags);
-
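+ /* The metapage of a global temp hash index may be uninitialized in this session: build the index on first access */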
+ if (blkno == HASH_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ {
+ Relation heap = RelationIdGetRelation(rel->rd_index->indrelid);
+ hashbuild(heap, rel, BuildIndexInfo(rel));
+ RelationClose(heap);
+ if (access != HASH_NOLOCK)
+ LockBuffer(buf, access);
+ }
+ else
+ {
+ if (access != HASH_NOLOCK)
+ LockBuffer(buf, access);
+ _hash_checkpage(rel, buf, flags);
+ }
return buf;
}
@@ -338,7 +349,7 @@ _hash_init(Relation rel, double num_tuples, ForkNumber forkNum)
bool use_wal;
/* safety check */
- if (RelationGetNumberOfBlocksInFork(rel, forkNum) != 0)
+ if (rel->rd_rel->relpersistence != RELPERSISTENCE_SESSION && RelationGetNumberOfBlocksInFork(rel, forkNum) != 0)
elog(ERROR, "cannot initialize non-empty hash index \"%s\"",
RelationGetRelationName(rel));
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 92073fe..54533a3 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -670,6 +670,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 268f869..eff9e10 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -27,8 +27,10 @@
#include "access/transam.h"
#include "access/xlog.h"
#include "access/xloginsert.h"
+#include "catalog/index.h"
#include "miscadmin.h"
#include "storage/indexfsm.h"
+#include "storage/buf_internals.h"
#include "storage/lmgr.h"
#include "storage/predicate.h"
#include "utils/snapmgr.h"
@@ -762,8 +764,22 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access)
{
/* Read an existing block of the relation */
buf = ReadBuffer(rel, blkno);
- LockBuffer(buf, access);
- _bt_checkpage(rel, buf);
+ /* Session temporary relation may not yet be initialized for this backend. */
+ if (blkno == BTREE_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
+ {
+ Relation heap = RelationIdGetRelation(rel->rd_index->indrelid);
+ ReleaseBuffer(buf);
+ DropRelFileNodeLocalBuffers(rel->rd_node, MAIN_FORKNUM, blkno);
+ btbuild(heap, rel, BuildIndexInfo(rel));
+ RelationClose(heap);
+ buf = ReadBuffer(rel, blkno);
+ LockBuffer(buf, access);
+ }
+ else
+ {
+ LockBuffer(buf, access);
+ _bt_checkpage(rel, buf);
+ }
}
else
{
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 1dd39a9..785b665 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -321,7 +321,7 @@ btbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* We expect to be called exactly once for any index relation. If that's
* not the case, big trouble's what we have.
*/
- if (RelationGetNumberOfBlocks(index) != 0)
+ if (RelationGetNumberOfBlocks(index) != 0 && index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
elog(ERROR, "index \"%s\" already contains data",
RelationGetRelationName(index));
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index dd90887..305f171 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -81,21 +81,32 @@ spgbuild(Relation heap, Relation index, IndexInfo *indexInfo)
rootbuffer,
nullbuffer;
- if (RelationGetNumberOfBlocks(index) != 0)
- elog(ERROR, "index \"%s\" already contains data",
- RelationGetRelationName(index));
-
- /*
- * Initialize the meta page and root pages
- */
- metabuffer = SpGistNewBuffer(index);
- rootbuffer = SpGistNewBuffer(index);
- nullbuffer = SpGistNewBuffer(index);
-
- Assert(BufferGetBlockNumber(metabuffer) == SPGIST_METAPAGE_BLKNO);
- Assert(BufferGetBlockNumber(rootbuffer) == SPGIST_ROOT_BLKNO);
- Assert(BufferGetBlockNumber(nullbuffer) == SPGIST_NULL_BLKNO);
-
+ if (index->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ {
+ if (RelationGetNumberOfBlocks(index) != 0)
+ elog(ERROR, "index \"%s\" already contains data",
+ RelationGetRelationName(index));
+
+ /*
+ * Initialize the meta page and root pages
+ */
+ metabuffer = SpGistNewBuffer(index);
+ rootbuffer = SpGistNewBuffer(index);
+ nullbuffer = SpGistNewBuffer(index);
+
+ Assert(BufferGetBlockNumber(metabuffer) == SPGIST_METAPAGE_BLKNO);
+ Assert(BufferGetBlockNumber(rootbuffer) == SPGIST_ROOT_BLKNO);
+ Assert(BufferGetBlockNumber(nullbuffer) == SPGIST_NULL_BLKNO);
+ }
+ else
+ {
+ metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
+ rootbuffer = ReadBuffer(index, SPGIST_ROOT_BLKNO);
+ nullbuffer = ReadBuffer(index, SPGIST_NULL_BLKNO);
+ LockBuffer(metabuffer, BUFFER_LOCK_SHARE);
+ LockBuffer(rootbuffer, BUFFER_LOCK_SHARE);
+ LockBuffer(nullbuffer, BUFFER_LOCK_SHARE);
+ }
START_CRIT_SECTION();
SpGistInitMetapage(BufferGetPage(metabuffer));
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index e2d391e..42137de 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -21,6 +21,7 @@
#include "access/spgist_private.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "catalog/index.h"
#include "catalog/pg_amop.h"
#include "storage/bufmgr.h"
#include "storage/indexfsm.h"
@@ -106,6 +107,7 @@ spgGetCache(Relation index)
spgConfigIn in;
FmgrInfo *procinfo;
Buffer metabuffer;
+ Page metapage;
SpGistMetaPageData *metadata;
cache = MemoryContextAllocZero(index->rd_indexcxt,
@@ -155,12 +157,21 @@ spgGetCache(Relation index)
metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
LockBuffer(metabuffer, BUFFER_LOCK_SHARE);
- metadata = SpGistPageGetMeta(BufferGetPage(metabuffer));
+ metapage = BufferGetPage(metabuffer);
+ metadata = SpGistPageGetMeta(metapage);
if (metadata->magicNumber != SPGIST_MAGIC_NUMBER)
- elog(ERROR, "index \"%s\" is not an SP-GiST index",
- RelationGetRelationName(index));
-
+ {
+ if (GlobalTempRelationPageIsNotInitialized(index, metapage))
+ {
+ Relation heap = RelationIdGetRelation(index->rd_index->indrelid);
+ spgbuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ }
+ else
+ elog(ERROR, "index \"%s\" is not an SP-GiST index",
+ RelationGetRelationName(index));
+ }
cache->lastUsedPages = metadata->lastUsedPages;
UnlockReleaseBuffer(metabuffer);
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 6b10469..88fca55 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -401,6 +401,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index e995570..1522109 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3699,7 +3699,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 056ea3d..317574a 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -92,6 +92,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
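+ /* Session relation data is backend-private and is not WAL-logged */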
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f7800f0..23ab456 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1332,7 +1332,15 @@ LANGUAGE INTERNAL
STRICT STABLE PARALLEL SAFE
AS 'jsonb_path_query_first_tz';
+
+--
+-- Statistics for global temporary tables
--
+
+CREATE VIEW pg_gtt_statistic AS
+ SELECT s.* from pg_class c,pg_gtt_statistic_for_relation(c.oid) s where c.relpersistence='s';
+
+
-- The default permissions for functions mean that anyone can execute them.
-- A number of functions shouldn't be executable by just anyone, but rather
-- than use explicit 'superuser()' checks in those functions, we use the GRANT
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 71372ce..324a249 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -39,6 +39,7 @@
#include "commands/vacuum.h"
#include "executor/executor.h"
#include "foreign/fdwapi.h"
+#include "funcapi.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "parser/parse_oper.h"
@@ -102,7 +103,7 @@ static int acquire_inherited_sample_rows(Relation onerel, int elevel,
HeapTuple *rows, int targrows,
double *totalrows, double *totaldeadrows);
static void update_attstats(Oid relid, bool inh,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats, bool is_global_temp);
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
@@ -318,6 +319,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
Oid save_userid;
int save_sec_context;
int save_nestlevel;
+ bool is_global_temp = onerel->rd_rel->relpersistence == RELPERSISTENCE_SESSION;
if (inh)
ereport(elevel,
@@ -575,14 +577,14 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
* pg_statistic for columns we didn't process, we leave them alone.)
*/
update_attstats(RelationGetRelid(onerel), inh,
- attr_cnt, vacattrstats);
+ attr_cnt, vacattrstats, is_global_temp);
for (ind = 0; ind < nindexes; ind++)
{
AnlIndexData *thisdata = &indexdata[ind];
update_attstats(RelationGetRelid(Irel[ind]), false,
- thisdata->attr_cnt, thisdata->vacattrstats);
+ thisdata->attr_cnt, thisdata->vacattrstats, is_global_temp);
}
/*
@@ -1425,7 +1427,7 @@ acquire_inherited_sample_rows(Relation onerel, int elevel,
* by taking a self-exclusive lock on the relation in analyze_rel().
*/
static void
-update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
+update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats, bool is_global_temp)
{
Relation sd;
int attno;
@@ -1527,30 +1529,42 @@ update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
}
}
- /* Is there already a pg_statistic tuple for this attribute? */
- oldtup = SearchSysCache3(STATRELATTINH,
- ObjectIdGetDatum(relid),
- Int16GetDatum(stats->attr->attnum),
- BoolGetDatum(inh));
-
- if (HeapTupleIsValid(oldtup))
+ if (is_global_temp)
{
- /* Yes, replace it */
- stup = heap_modify_tuple(oldtup,
- RelationGetDescr(sd),
- values,
- nulls,
- replaces);
- ReleaseSysCache(oldtup);
- CatalogTupleUpdate(sd, &stup->t_self, stup);
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ InsertSysCache(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh),
+ 0,
+ stup);
}
else
{
- /* No, insert new tuple */
- stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
- CatalogTupleInsert(sd, stup);
- }
+ /* Is there already a pg_statistic tuple for this attribute? */
+ oldtup = SearchSysCache3(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh));
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ CatalogTupleUpdate(sd, &stup->t_self, stup);
+ }
+ else
+ {
+ /* No, insert new tuple */
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ CatalogTupleInsert(sd, stup);
+ }
+ }
heap_freetuple(stup);
}
@@ -2859,3 +2873,72 @@ analyze_mcv_list(int *mcv_counts,
}
return num_mcv;
}
+
+PG_FUNCTION_INFO_V1(pg_gtt_statistic_for_relation);
+
+typedef struct
+{
+ int staattnum;
+ bool stainherit;
+} PgTempStatIteratorCtx;
+
+Datum
+pg_gtt_statistic_for_relation(PG_FUNCTION_ARGS)
+{
+ Oid starelid = PG_GETARG_OID(0);
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ Tuplestorestate *tupstore;
+ MemoryContext per_query_ctx;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ bool stainherit = false;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));
+
+ /* Build tuplestore to hold the result rows */
+ per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+ oldcontext = MemoryContextSwitchTo(per_query_ctx);
+
+ /* Return the tuplestore in SFRM_Materialize mode */
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
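+ /* Collect cached statistic tuples for all attributes, both with and without inheritance */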
+ do
+ {
+ int staattnum = 0;
+ while (true)
+ {
+ HeapTuple statup = SearchSysCacheCopy3(STATRELATTINH,
+ ObjectIdGetDatum(starelid),
+ Int16GetDatum(++staattnum),
+ BoolGetDatum(stainherit));
+ if (statup != NULL)
+ tuplestore_puttuple(tupstore, statup);
+ else
+ break;
+ }
+ stainherit = !stainherit;
+ } while (stainherit);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ tuplestore_donestoring(tupstore);
+
+ return (Datum) 0;
+}
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index b8c349f..80b7fb4 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -391,6 +391,13 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
errmsg("cannot vacuum temporary tables of other sessions")));
}
+ /* CLUSTER is not supported for global temp tables yet */
+ if (OldHeap->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("not support cluster global temporary tables yet")));
+
+
/*
* Also check for active uses of the relation in the current transaction,
* including open scans and pending AFTER trigger events.
@@ -1399,7 +1406,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index a13322b..be661a4 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (GlobalTempRelationPageIsNotInitialized(rel, page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 5440eb9..6aec01d 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -12,6 +12,9 @@
*
*-------------------------------------------------------------------------
*/
+#include <sys/stat.h>
+#include <unistd.h>
+
#include "postgres.h"
#include "access/genam.h"
@@ -531,6 +534,23 @@ static List *GetParentedForeignKeyRefs(Relation partition);
static void ATDetachCheckNoForeignKeyRefs(Relation partition);
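+/*
+ * Check whether the WITH options list contains the on_commit_delete_rows option.
+ */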
+static bool
+has_oncommit_option(List *options)
+{
+ ListCell *listptr;
+
+ foreach(listptr, options)
+ {
+ DefElem *def = (DefElem *) lfirst(listptr);
+
+ if (pg_strcasecmp(def->defname, "on_commit_delete_rows") == 0)
+ return true;
+ }
+
+ return false;
+}
+
+
/* ----------------------------------------------------------------
* DefineRelation
* Creates a new relation.
@@ -574,6 +594,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
LOCKMODE parentLockmode;
const char *accessMethod = NULL;
Oid accessMethodId = InvalidOid;
+ bool has_oncommit_clause = false;
/*
* Truncate relname to appropriate length (probably a waste of time, as
@@ -585,7 +606,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && !IsLocalRelpersistence(stmt->relation->relpersistence))
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -611,17 +632,6 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
RangeVarGetAndCheckCreationNamespace(stmt->relation, NoLock, NULL);
/*
- * Security check: disallow creating temp tables from security-restricted
- * code. This is needed because calling code might not expect untrusted
- * tables to appear in pg_temp at the front of its search path.
- */
- if (stmt->relation->relpersistence == RELPERSISTENCE_TEMP
- && InSecurityRestrictedOperation())
- ereport(ERROR,
- (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- errmsg("cannot create temporary table within security-restricted operation")));
-
- /*
* Determine the lockmode to use when scanning parents. A self-exclusive
* lock is needed here.
*
@@ -716,6 +726,38 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
/*
* Parse and validate reloptions, if any.
*/
+ /* global temp table */
+ has_oncommit_clause = has_oncommit_option(stmt->options);
+ if (stmt->relation->relpersistence == RELPERSISTENCE_SESSION)
+ {
+ if (has_oncommit_clause)
+ {
+ if (stmt->oncommit != ONCOMMIT_NOOP)
+ elog(ERROR, "can not defeine global temp table with on commit and with clause at same time");
+ }
+ else if (stmt->oncommit != ONCOMMIT_NOOP)
+ {
+ DefElem *opt = makeNode(DefElem);
+
+ opt->type = T_DefElem;
+ opt->defnamespace = NULL;
+ opt->defname = "on_commit_delete_rows";
+ opt->defaction = DEFELEM_UNSPEC;
+
+ /* use reloptions to remember on commit clause */
+ if (stmt->oncommit == ONCOMMIT_DELETE_ROWS)
+ opt->arg = (Node *)makeString("true");
+ else if (stmt->oncommit == ONCOMMIT_PRESERVE_ROWS)
+ opt->arg = (Node *)makeString("false");
+ else
+ elog(ERROR, "global temp table not support on commit drop clause");
+
+ stmt->options = lappend(stmt->options, opt);
+ }
+ }
+ else if (has_oncommit_clause)
+ elog(ERROR, "regular table cannot specifie on_commit_delete_rows");
+
reloptions = transformRelOptions((Datum) 0, stmt->options, NULL, validnsps,
true, false);
@@ -1777,7 +1819,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -3456,6 +3499,26 @@ AlterTableLookupRelation(AlterTableStmt *stmt, LOCKMODE lockmode)
(void *) stmt);
}
+
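+/*
+ * Check that the global temp table is not in use by some other active backend,
+ * by probing per-backend relation files on disk: a non-empty main-fork file
+ * belonging to another backend id means that backend has data in the table.
+ */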
+static bool
+CheckGlobalTempTableNotInUse(Relation rel)
+{
+ int id;
+ for (id = 1; id <= MaxBackends; id++)
+ {
+ if (id != MyBackendId)
+ {
+ struct stat fst;
+ char* path = relpathbackend(rel->rd_node, id, MAIN_FORKNUM);
+ int rc = stat(path, &fst);
+ pfree(path);
+ if (rc == 0 && fst.st_size != 0)
+ return false;
+ }
+ }
+ return true;
+}
+
/*
* AlterTable
* Execute ALTER TABLE, which can be a list of subcommands
@@ -3507,6 +3570,9 @@ AlterTable(Oid relid, LOCKMODE lockmode, AlterTableStmt *stmt)
rel = relation_open(relid, NoLock);
CheckTableNotInUse(rel, "ALTER TABLE");
+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION
+ && !CheckGlobalTempTableNotInUse(rel))
+ elog(ERROR, "Global temp table used by active backends can not be altered");
ATController(stmt, rel, stmt->cmds, stmt->relation->inh, lockmode);
}
@@ -7715,6 +7781,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14149,6 +14221,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -14636,14 +14715,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index db3a68a..60212b0 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -48,6 +48,7 @@
#include "partitioning/partprune.h"
#include "rewrite/rewriteManip.h"
#include "utils/lsyscache.h"
+#include "utils/rel.h"
/* results of subquery_is_pushdown_safe */
@@ -618,7 +619,7 @@ set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
* the rest of the necessary infrastructure right now anyway. So
* for now, bail out if we see a temporary table.
*/
- if (get_rel_persistence(rte->relid) == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(get_rel_persistence(rte->relid)))
return;
/*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 7fe11b5..4f99402 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6306,7 +6306,7 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
* Furthermore, any index predicate or index expressions must be parallel
* safe.
*/
- if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
+ if (RelationHasSessionScope(heap) ||
!is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index)) ||
!is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index)))
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c508684..c9e228b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3268,20 +3268,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index ee47547..ea7fe4c 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why not always use the persistence of the parent table?
+ * Because, although unlogged sequences are prohibited,
+ * unlogged tables with SERIAL fields are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1dd816..dcfc134 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2157,7 +2157,7 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
continue;
relid = classForm->oid;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 7ad1073..a085893 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -2933,7 +2933,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber *forkNum,
/* If it's a local relation, it's localbuf.c's problem. */
if (RelFileNodeBackendIsTemp(rnode))
{
- if (rnode.backend == MyBackendId)
+ if (GetRelationBackendId(rnode.backend) == MyBackendId)
{
for (j = 0; j < nforks; j++)
DropRelFileNodeLocalBuffers(rnode.node, forkNum[j],
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 8a9eaf6..997d331 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -27,12 +27,14 @@
#include "access/xlog.h"
#include "access/xlogutils.h"
+#include "commands/tablecmds.h"
#include "commands/tablespace.h"
#include "miscadmin.h"
#include "pg_trace.h"
#include "pgstat.h"
#include "postmaster/bgwriter.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/fd.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
@@ -40,6 +42,7 @@
#include "storage/sync.h"
#include "utils/hsearch.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
/*
* The magnetic disk storage manager keeps track of open file
@@ -87,6 +90,19 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to collect information about session relations created by this backend.
+ * Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ ForkNumber forknum;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +168,60 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete the files of all session relations created by this backend.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Delete relation files */
+ mdunlink(rel->rnode, rel->forknum, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or when it is opened and its file does not yet exist.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln, ForkNumber forknum)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We use a shared-memory exit hook because
+ * cleanup must be performed before the backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->forknum = forknum;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
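+/*
+ * If the relation's reloptions request ON COMMIT DELETE ROWS, register the
+ * corresponding on-commit action for the owning relation.
+ */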
+static void
+RegisterOnCommitAction(SMgrRelation reln, ForkNumber forknum)
+{
+ if (reln->smgr_owner && forknum == MAIN_FORKNUM)
+ {
+ Relation rel = (Relation)((char*)reln->smgr_owner - offsetof(RelationData, rd_smgr));
+ if (rel->rd_options
+ && ((StdRdOptions *)rel->rd_options)->on_commit_delete_rows)
+ {
+ register_on_commit_action(rel->rd_id, ONCOMMIT_DELETE_ROWS);
+ }
+ }
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +288,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln, forkNum);
pfree(path);
@@ -465,6 +537,21 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * When a session relation is accessed, files of this relation may not yet exist for this backend.
+ * If so, create the file and register the session relation for cleanup on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln, forknum);
+ if (!(behavior & EXTENSION_RETURN_NULL))
+ RegisterOnCommitAction(reln, forknum);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +563,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -652,8 +740,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* For a session relation we need to write the zeroed page so that a subsequent mdnblocks returns the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -743,7 +836,8 @@ mdnblocks(SMgrRelation reln, ForkNumber forknum)
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index a87e721..2401361 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index c3e7d94..720dd52 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1191,6 +1191,110 @@ SearchCatCache4(CatCache *cache,
return SearchCatCacheInternal(cache, 4, v1, v2, v3, v4);
}
+
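+/*
+ * InsertCatCache
+ *	Insert a tuple into the catalog cache, replacing a matching entry
+ *	if one already exists.
+ */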
+void
+InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple)
+{
+ Datum arguments[CATCACHE_MAXKEYS];
+ uint32 hashValue;
+ Index hashIndex;
+ CatCTup *ct;
+ dlist_iter iter;
+ dlist_head *bucket;
+ int nkeys = cache->cc_nkeys;
+ MemoryContext oldcxt;
+
+ /*
+ * one-time startup overhead for each cache
+ */
+ if (unlikely(cache->cc_tupdesc == NULL))
+ CatalogCacheInitializeCache(cache);
+
+ /* Initialize local parameter array */
+ arguments[0] = v1;
+ arguments[1] = v2;
+ arguments[2] = v3;
+ arguments[3] = v4;
+ /*
+ * find the hash bucket in which to look for the tuple
+ */
+ hashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
+ hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);
+
+ /*
+ * scan the hash bucket until we find a match or exhaust our tuples
+ *
+ * Note: it's okay to use dlist_foreach here, even though we modify the
+ * dlist within the loop, because we don't continue the loop afterwards.
+ */
+ bucket = &cache->cc_bucket[hashIndex];
+ dlist_foreach(iter, bucket)
+ {
+ ct = dlist_container(CatCTup, cache_elem, iter.cur);
+
+ if (ct->dead)
+ continue; /* ignore dead entries */
+
+ if (ct->hash_value != hashValue)
+ continue; /* quickly skip entry if wrong hash val */
+
+ if (!CatalogCacheCompareTuple(cache, nkeys, ct->keys, arguments))
+ continue;
+
+ /*
+ * If an existing entry matches and the tuple lengths are equal, overwrite
+ * it in place; otherwise remove it so a fresh entry is built below.
+ */
+ if (ct->tuple.t_len == tuple->t_len)
+ {
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ return;
+ }
+ dlist_delete(&ct->cache_elem);
+ pfree(ct);
+ cache->cc_ntup -= 1;
+ CacheHdr->ch_ntup -= 1;
+ break;
+ }
+ /* Allocate memory for CatCTup and the cached tuple in one go */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+
+ ct = (CatCTup *) palloc(sizeof(CatCTup) +
+ MAXIMUM_ALIGNOF + tuple->t_len);
+ ct->tuple.t_len = tuple->t_len;
+ ct->tuple.t_self = tuple->t_self;
+ ct->tuple.t_tableOid = tuple->t_tableOid;
+ ct->tuple.t_data = (HeapTupleHeader)
+ MAXALIGN(((char *) ct) + sizeof(CatCTup));
+ /* copy tuple contents */
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ ct->ct_magic = CT_MAGIC;
+ ct->my_cache = cache;
+ ct->c_list = NULL;
+ ct->refcount = 1; /* pinned*/
+ ct->dead = false;
+ ct->negative = false;
+ ct->hash_value = hashValue;
+ dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
+ memcpy(ct->keys, arguments, nkeys*sizeof(Datum));
+
+ cache->cc_ntup++;
+ CacheHdr->ch_ntup++;
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * If the hash table has become too full, enlarge the buckets array. Quite
+ * arbitrarily, we enlarge when fill factor > 2.
+ */
+ if (cache->cc_ntup > cache->cc_nbuckets * 2)
+ RehashCatCache(cache);
+}
+
/*
* Work-horse for SearchCatCache/SearchCatCacheN.
*/
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 50f8912..d9e0230 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1097,6 +1097,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3300,6 +3304,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index d69c0ff..87d273e 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -1156,6 +1156,16 @@ SearchSysCache4(int cacheId,
return SearchCatCache4(SysCache[cacheId], key1, key2, key3, key4);
}
+void
+InsertSysCache(int cacheId,
+ Datum key1, Datum key2, Datum key3, Datum key4,
+ HeapTuple value)
+{
+ Assert(cacheId >= 0 && cacheId < SysCacheSize &&
+ PointerIsValid(SysCache[cacheId]));
+ InsertCatCache(SysCache[cacheId], key1, key2, key3, key4, value);
+}
+
/*
* ReleaseSysCache
* Release previously grabbed reference count on a tuple
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 4688fbc..e191511 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -18,6 +18,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_statistic_d.h"
#include "funcapi.h"
#include "nodes/nodeFuncs.h"
#include "parser/parse_coerce.h"
@@ -30,6 +31,13 @@
#include "utils/syscache.h"
#include "utils/typcache.h"
+/*
+ * TODO: find a less ugly way to declare a core function returning pg_statistic tuples.
+ * This is the OID of pg_gtt_statistic_for_relation. The function needs special handling because it returns a set of
+ * pg_statistic rows containing attributes of anyarray type. The type of those attributes cannot be deduced from the
+ * input parameters, which prevents building a tuple descriptor in the usual way.
+ */
+#define GttStatisticFunctionId 3434
static void shutdown_MultiFuncCall(Datum arg);
static TypeFuncClass internal_get_result_type(Oid funcid,
@@ -341,7 +349,8 @@ internal_get_result_type(Oid funcid,
if (resolve_polymorphic_tupdesc(tupdesc,
&procform->proargtypes,
- call_expr))
+ call_expr) ||
+ funcid == GttStatisticFunctionId)
{
if (tupdesc->tdtypeid == RECORDOID &&
tupdesc->tdtypmod < 0)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 08658c8..f2adc5a 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15635,8 +15635,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 62b9553..cef99d2 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -166,7 +166,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract SessionRelFirstBackendId
+ * to avoid overly long file names.
+ * Segments of session relations get the same prefix (t%d_) as local temporary relations
+ * so that they can be cleaned up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 090b6ba..6a39663 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index ac8f64b..a683ef0 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5476,7 +5476,14 @@
proname => 'pg_stat_get_xact_function_self_time', provolatile => 'v',
proparallel => 'r', prorettype => 'float8', proargtypes => 'oid',
prosrc => 'pg_stat_get_xact_function_self_time' },
-
+{ oid => '3434',
+ descr => 'show local statistics for global temp table',
+ proname => 'pg_gtt_statistic_for_relation', provolatile => 'v', proparallel => 'u',
+ prorettype => 'record', proretset => 't', prorows => '100', proargtypes => 'oid',
+ proallargtypes => '{oid,oid,int2,bool,float4,int4,float4,int2,int2,int2,int2,int2,oid,oid,oid,oid,oid,oid,oid,oid,oid,oid,_float4,_float4,_float4,_float4,_float4,anyarray,anyarray,anyarray,anyarray,anyarray}',
+ proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{relid,starelid,staattnum,stainherit,stanullfrac,stawidth,stadistinct,stakind1,stakind2,stakind3,stakind4,stakind5,staop1,staop2,staop3,staop4,staop5,stacoll1,stacoll2,stacoll3,stacoll4,stacoll5,stanumbers1,stanumbers2,stanumbers3,stanumbers4,stanumbers5,stavalues1,stavalues2,stavalues3,stavalues4,stavalues5}',
+ prosrc => 'pg_gtt_statistic_for_relation' },
{ oid => '3788',
descr => 'statistics: timestamp of the current statistics snapshot',
proname => 'pg_stat_get_snapshot_timestamp', provolatile => 's',
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 70ef8eb..11b4b89 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add a special bias value to the backend id (since
+ * the maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,12 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
+#define GetRelationBackendId(id) ((id) & ~SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 4ef6d8d..bac7a31 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -229,6 +229,13 @@ typedef PageHeaderData *PageHeader;
#define PageIsNew(page) (((PageHeader) (page))->pd_upper == 0)
/*
+ * Check whether a page of a global temporary relation is not yet initialized
+ */
+#define GlobalTempRelationPageIsNotInitialized(rel, page) \
+ ((rel)->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+
+
+/*
* PageGetItemId
* Returns an item identifier of a page.
*/
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 586500a..20aec72 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether this is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether this is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether this is a local temporary relation, which exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index ff1faba..31f615d 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -228,4 +228,8 @@ extern void PrepareToInvalidateCacheTuple(Relation relation,
extern void PrintCatCacheLeakWarning(HeapTuple tuple);
extern void PrintCatCacheListLeakWarning(CatCList *list);
+extern void InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
+
#endif /* CATCACHE_H */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 31d8a1a..171717e 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -271,6 +271,7 @@ typedef struct StdRdOptions
int parallel_workers; /* max number of parallel workers */
bool vacuum_index_cleanup; /* enables index vacuuming and cleanup */
bool vacuum_truncate; /* enables vacuum to truncate a relation */
+ bool on_commit_delete_rows; /* global temp table */
} StdRdOptions;
#define HEAP_MIN_FILLFACTOR 10
@@ -326,6 +327,18 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * Relation persistence is either TEMP or SESSION
+ */
+#define IsLocalRelpersistence(relpersistence) \
+ ((relpersistence) == RELPERSISTENCE_TEMP || (relpersistence) == RELPERSISTENCE_SESSION)
+
+/*
+ * Relation is either a global or a local temp table
+ */
+#define RelationHasSessionScope(relation) \
+ IsLocalRelpersistence(((relation)->rd_rel->relpersistence))
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
@@ -334,6 +347,7 @@ typedef enum ViewOptCheckOption
VIEW_OPTION_CHECK_OPTION_CASCADED
} ViewOptCheckOption;
+
/*
* ViewOptions
* Contents of rd_options for views
@@ -529,7 +543,7 @@ typedef struct ViewOptions
* True if relation's pages are stored in local buffers.
*/
#define RelationUsesLocalBuffers(relation) \
- ((relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ RelationHasSessionScope(relation)
/*
* RELATION_IS_LOCAL
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 918765c..5b1598b 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -216,4 +216,8 @@ extern bool RelationSupportsSysCache(Oid relid);
#define ReleaseSysCacheList(x) ReleaseCatCacheList(x)
+
+extern void InsertSysCache(int cacheId,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
#endif /* SYSCACHE_H */
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index a2fa192..ef7aa85 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -88,3 +88,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test with minor changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to temporary child relation
+# of another session, and blocks when scanning the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..ae1adb6
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,247 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index c9cc569..8c21b00 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1348,6 +1348,40 @@ pg_group| SELECT pg_authid.rolname AS groname,
WHERE (pg_auth_members.roleid = pg_authid.oid)) AS grolist
FROM pg_authid
WHERE (NOT pg_authid.rolcanlogin);
+pg_gtt_statistic| SELECT s.starelid,
+ s.staattnum,
+ s.stainherit,
+ s.stanullfrac,
+ s.stawidth,
+ s.stadistinct,
+ s.stakind1,
+ s.stakind2,
+ s.stakind3,
+ s.stakind4,
+ s.stakind5,
+ s.staop1,
+ s.staop2,
+ s.staop3,
+ s.staop4,
+ s.staop5,
+ s.stacoll1,
+ s.stacoll2,
+ s.stacoll3,
+ s.stacoll4,
+ s.stacoll5,
+ s.stanumbers1,
+ s.stanumbers2,
+ s.stanumbers3,
+ s.stanumbers4,
+ s.stanumbers5,
+ s.stavalues1,
+ s.stavalues2,
+ s.stavalues3,
+ s.stavalues4,
+ s.stavalues5
+ FROM pg_class c,
+ LATERAL pg_gtt_statistic_for_relation(c.oid) s(starelid, staattnum, stainherit, stanullfrac, stawidth, stadistinct, stakind1, stakind2, stakind3, stakind4, stakind5, staop1, staop2, staop3, staop4, staop5, stacoll1, stacoll2, stacoll3, stacoll4, stacoll5, stanumbers1, stanumbers2, stanumbers3, stanumbers4, stanumbers5, stavalues1, stavalues2, stavalues3, stavalues4, stavalues5)
+ WHERE (c.relpersistence = 's'::"char");
pg_hba_file_rules| SELECT a.line_number,
a.type,
a.database,
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index d33a4e1..bda6854 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index f86f5c5..f94b32b 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..3058b9b
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,151 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
Hi all,
I don't know enough about the Postgres internals to give advice about the implementation.
But my feeling is that there is another big benefit to this feature: it simplifies the Oracle-to-PostgreSQL migration of applications that use global temporary tables, which is quite common when stored procedures are involved. In such cases we currently need to modify the logic of the code, always implementing an ugly workaround: either add CREATE TEMP TABLE statements everywhere they are needed, or use a regular table with additional TRUNCATE statements if we can ensure that only a single connection uses the table at a time (see the sketch below).
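A minimal sketch of what this means in practice (all table and procedure names here are hypothetical, purely for illustration):

-- With global temp tables, the Oracle definition ports directly:
CREATE GLOBAL TEMP TABLE report_stage(id int, val text) ON COMMIT DELETE ROWS;

-- Workaround 1 today: every procedure must (re)create a local temp table.
CREATE OR REPLACE PROCEDURE load_report() LANGUAGE plpgsql AS $$
BEGIN
    CREATE TEMP TABLE IF NOT EXISTS report_stage_local(id int, val text)
        ON COMMIT DELETE ROWS;
    INSERT INTO report_stage_local VALUES (1, 'example');
END $$;

-- Workaround 2 today: a shared regular table plus TRUNCATE, which is only
-- safe if a single connection uses the table at a time.
CREATE TABLE report_stage_shared(id int, val text);
TRUNCATE report_stage_shared;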
So, Konstantin and all, thanks in advance for all that can be done on this feature :-)
Best regards.
Hi,
this patch was marked as waiting on author since the beginning of the
CF, most likely because it no longer applies (not sure). As there has
been very little activity since then, I've marked it as returned with
feedback. Feel free to re-submit an updated patch for 2020-03.
This definitely does not mean the feature is not desirable, but my
feeling is most of the discussion happens on the other thread dealing
with global temp tables [1] so maybe we should keep just that one and
combine the efforts.
[1]: https://commitfest.postgresql.org/26/2349/
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 01.02.2020 14:49, Tomas Vondra wrote:
Hi,
this patch was marked as waiting on author since the beginning of the
CF, most likely because it no longer applies (not sure). As there has
been very little activity since then, I've marked it as returned with
feedback. Feel free to re-submit an updated patch for 2020-03.
This definitely does not mean the feature is not desirable, but my
feeling is most of the discussion happens on the other thread dealing
with global temp tables [1] so maybe we should keep just that one and
combine the efforts.
A new version of the patch, with a new method of GTT index construction, is
attached.
GTT indexes are now checked before query execution and initialized using
the index AM's build method.
So GTT now works with all indexes, including custom ones.
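For illustration, a hypothetical session trace (not part of the regression tests): because each backend has its own private index file, an uninitialized index is detected before query execution and rebuilt from the session's own rows via the index AM's ambuild method:

-- session 1
CREATE GLOBAL TEMP TABLE gtt(x int);
CREATE INDEX gtt_x_idx ON gtt(x);

-- session 2: its private copy of the index file is still empty, so the
-- index is built here on first use, from this session's data
INSERT INTO gtt SELECT generate_series(1, 10000);
SELECT count(*) FROM gtt WHERE x = 42;  -- can use gtt_x_idx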
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
global_private_temp-9.patch (text/x-patch; charset=UTF-8)
diff --git a/contrib/pg_prewarm/pg_prewarm.c b/contrib/pg_prewarm/pg_prewarm.c
index 33e2d28b27..93059ef581 100644
--- a/contrib/pg_prewarm/pg_prewarm.c
+++ b/contrib/pg_prewarm/pg_prewarm.c
@@ -178,7 +178,7 @@ pg_prewarm(PG_FUNCTION_ARGS)
for (block = first_block; block <= last_block; ++block)
{
CHECK_FOR_INTERRUPTS();
- smgrread(rel->rd_smgr, forkNumber, block, blockbuffer.data);
+ smgrread(rel->rd_smgr, forkNumber, block, blockbuffer.data, false);
++blocks_done;
}
}
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 79430d2b7b..39baddc743 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -158,6 +158,19 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ /*
+ * For global temp tables only:
+ * ShareUpdateExclusiveLock is used to ensure safety.
+ */
+ {
+ {
+ "on_commit_delete_rows",
+ "global temp table on commit options",
+ RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1486,6 +1499,8 @@ bytea *
default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{
static const relopt_parse_elt tab[] = {
+ {"on_commit_delete_rows", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, on_commit_delete_rows)},
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
@@ -1586,13 +1601,17 @@ build_reloptions(Datum reloptions, bool validate,
bytea *
partitioned_table_reloptions(Datum reloptions, bool validate)
{
+ static const relopt_parse_elt tab[] = {
+ {"on_commit_delete_rows", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, on_commit_delete_rows)}
+ };
/*
* There are no options for partitioned tables yet, but this is able to do
* some validation.
*/
return (bytea *) build_reloptions(reloptions, validate,
RELOPT_KIND_PARTITIONED,
- 0, NULL, 0);
+ sizeof(StdRdOptions), tab, lengthof(tab));
}
/*
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 3fa4b766db..a86de5046f 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -670,6 +670,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 7d6acaed92..7c48e5c2ae 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -396,6 +396,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8880586c37..22ce8953fd 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3707,7 +3707,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index fddfbf1d8c..97478352ea 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -92,6 +92,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
@@ -367,7 +371,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
/* If we got a cancel signal during the copy of the data, quit */
CHECK_FOR_INTERRUPTS();
- smgrread(src, forkNum, blkno, buf.data);
+ smgrread(src, forkNum, blkno, buf.data, false);
if (!PageIsVerified(page, blkno))
ereport(ERROR,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index c9e6060035..1f5e52b54a 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1369,7 +1369,15 @@ LANGUAGE INTERNAL
STRICT STABLE PARALLEL SAFE
AS 'jsonb_path_query_first_tz';
+
+--
+-- Statistics for global temporary tables
--
+
+CREATE VIEW pg_gtt_statistic AS
+ SELECT s.* FROM pg_class c, pg_gtt_statistic_for_relation(c.oid) s WHERE c.relpersistence = 's';
+
+
-- The default permissions for functions mean that anyone can execute them.
-- A number of functions shouldn't be executable by just anyone, but rather
-- than use explicit 'superuser()' checks in those functions, we use the GRANT
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index c4420ddd7f..85d8f04eeb 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -40,6 +40,7 @@
#include "commands/vacuum.h"
#include "executor/executor.h"
#include "foreign/fdwapi.h"
+#include "funcapi.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "parser/parse_oper.h"
@@ -103,7 +104,7 @@ static int acquire_inherited_sample_rows(Relation onerel, int elevel,
HeapTuple *rows, int targrows,
double *totalrows, double *totaldeadrows);
static void update_attstats(Oid relid, bool inh,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats, bool is_global_temp);
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
@@ -323,6 +324,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
Oid save_userid;
int save_sec_context;
int save_nestlevel;
+ bool is_global_temp = onerel->rd_rel->relpersistence == RELPERSISTENCE_SESSION;
if (inh)
ereport(elevel,
@@ -586,14 +588,14 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
* pg_statistic for columns we didn't process, we leave them alone.)
*/
update_attstats(RelationGetRelid(onerel), inh,
- attr_cnt, vacattrstats);
+ attr_cnt, vacattrstats, is_global_temp);
for (ind = 0; ind < nindexes; ind++)
{
AnlIndexData *thisdata = &indexdata[ind];
update_attstats(RelationGetRelid(Irel[ind]), false,
- thisdata->attr_cnt, thisdata->vacattrstats);
+ thisdata->attr_cnt, thisdata->vacattrstats, is_global_temp);
}
/*
@@ -1456,7 +1458,7 @@ acquire_inherited_sample_rows(Relation onerel, int elevel,
* by taking a self-exclusive lock on the relation in analyze_rel().
*/
static void
-update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
+update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats, bool is_global_temp)
{
Relation sd;
int attno;
@@ -1558,30 +1560,42 @@ update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
}
}
- /* Is there already a pg_statistic tuple for this attribute? */
- oldtup = SearchSysCache3(STATRELATTINH,
- ObjectIdGetDatum(relid),
- Int16GetDatum(stats->attr->attnum),
- BoolGetDatum(inh));
-
- if (HeapTupleIsValid(oldtup))
+ if (is_global_temp)
{
- /* Yes, replace it */
- stup = heap_modify_tuple(oldtup,
- RelationGetDescr(sd),
- values,
- nulls,
- replaces);
- ReleaseSysCache(oldtup);
- CatalogTupleUpdate(sd, &stup->t_self, stup);
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ InsertSysCache(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh),
+ 0,
+ stup);
}
else
{
- /* No, insert new tuple */
- stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
- CatalogTupleInsert(sd, stup);
- }
+ /* Is there already a pg_statistic tuple for this attribute? */
+ oldtup = SearchSysCache3(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh));
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ CatalogTupleUpdate(sd, &stup->t_self, stup);
+ }
+ else
+ {
+ /* No, insert new tuple */
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ CatalogTupleInsert(sd, stup);
+ }
+ }
heap_freetuple(stup);
}
@@ -2890,3 +2904,72 @@ analyze_mcv_list(int *mcv_counts,
}
return num_mcv;
}
+
+PG_FUNCTION_INFO_V1(pg_gtt_statistic_for_relation);
+
+typedef struct
+{
+ int staattnum;
+ bool stainherit;
+} PgTempStatIteratorCtx;
+
+Datum
+pg_gtt_statistic_for_relation(PG_FUNCTION_ARGS)
+{
+ Oid starelid = PG_GETARG_OID(0);
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ Tuplestorestate *tupstore;
+ MemoryContext per_query_ctx;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ bool stainherit = false;
+
+ /* verify that the function is declared to return a row type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));
+
+ /* Build tuplestore to hold the result rows */
+ per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+ oldcontext = MemoryContextSwitchTo(per_query_ctx);
+
+ /* Materialize the result set into a tuplestore */
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ do
+ {
+ int staattnum = 0;
+ while (true)
+ {
+ HeapTuple statup = SearchSysCacheCopy3(STATRELATTINH,
+ ObjectIdGetDatum(starelid),
+ Int16GetDatum(++staattnum),
+ BoolGetDatum(stainherit));
+ if (statup != NULL)
+ tuplestore_puttuple(tupstore, statup);
+ else
+ break;
+ }
+ stainherit = !stainherit;
+ } while (stainherit);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ tuplestore_donestoring(tupstore);
+
+ return (Datum) 0;
+}
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index e9d7a7ff79..a22a77aa5e 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -391,6 +391,13 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
errmsg("cannot vacuum temporary tables of other sessions")));
}
+ /* CLUSTER is not supported for global temp tables yet */
+ if (OldHeap->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("not support cluster global temporary tables yet")));
+
+
/*
* Also check for active uses of the relation in the current transaction,
* including open scans and pending AFTER trigger events.
@@ -1399,7 +1406,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 6aab73bfd4..bc3c986096 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (GlobalTempRelationPageIsNotInitialized(rel, page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f599393473..1e4a52ee3f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -12,6 +12,9 @@
*
*-------------------------------------------------------------------------
*/
+#include <sys/stat.h>
+#include <unistd.h>
+
#include "postgres.h"
#include "access/attmap.h"
@@ -555,6 +558,23 @@ static List *GetParentedForeignKeyRefs(Relation partition);
static void ATDetachCheckNoForeignKeyRefs(Relation partition);
+static bool
+has_oncommit_option(List *options)
+{
+ ListCell *listptr;
+
+ foreach(listptr, options)
+ {
+ DefElem *def = (DefElem *) lfirst(listptr);
+
+ if (pg_strcasecmp(def->defname, "on_commit_delete_rows") == 0)
+ return true;
+ }
+
+ return false;
+}
+
+
/* ----------------------------------------------------------------
* DefineRelation
* Creates a new relation.
@@ -598,6 +618,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
LOCKMODE parentLockmode;
const char *accessMethod = NULL;
Oid accessMethodId = InvalidOid;
+ bool has_oncommit_clause = false;
/*
* Truncate relname to appropriate length (probably a waste of time, as
@@ -609,7 +630,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && !IsLocalRelpersistence(stmt->relation->relpersistence))
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -634,17 +655,6 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
namespaceId =
RangeVarGetAndCheckCreationNamespace(stmt->relation, NoLock, NULL);
- /*
- * Security check: disallow creating temp tables from security-restricted
- * code. This is needed because calling code might not expect untrusted
- * tables to appear in pg_temp at the front of its search path.
- */
- if (stmt->relation->relpersistence == RELPERSISTENCE_TEMP
- && InSecurityRestrictedOperation())
- ereport(ERROR,
- (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- errmsg("cannot create temporary table within security-restricted operation")));
-
/*
* Determine the lockmode to use when scanning parents. A self-exclusive
* lock is needed here.
@@ -740,6 +750,38 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
/*
* Parse and validate reloptions, if any.
*/
+ /* global temp table */
+ has_oncommit_clause = has_oncommit_option(stmt->options);
+ if (stmt->relation->relpersistence == RELPERSISTENCE_SESSION)
+ {
+ if (has_oncommit_clause)
+ {
+ if (stmt->oncommit != ONCOMMIT_NOOP)
+ elog(ERROR, "can not defeine global temp table with on commit and with clause at same time");
+ }
+ else if (stmt->oncommit != ONCOMMIT_NOOP)
+ {
+ DefElem *opt = makeNode(DefElem);
+
+ opt->type = T_DefElem;
+ opt->defnamespace = NULL;
+ opt->defname = "on_commit_delete_rows";
+ opt->defaction = DEFELEM_UNSPEC;
+
+ /* use reloptions to remember on commit clause */
+ if (stmt->oncommit == ONCOMMIT_DELETE_ROWS)
+ opt->arg = (Node *)makeString("true");
+ else if (stmt->oncommit == ONCOMMIT_PRESERVE_ROWS)
+ opt->arg = (Node *)makeString("false");
+ else
+ elog(ERROR, "global temp table not support on commit drop clause");
+
+ stmt->options = lappend(stmt->options, opt);
+ }
+ }
+ else if (has_oncommit_clause)
+ elog(ERROR, "regular table cannot specifie on_commit_delete_rows");
+
reloptions = transformRelOptions((Datum) 0, stmt->options, NULL, validnsps,
true, false);
@@ -1824,7 +1866,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -3511,6 +3554,26 @@ AlterTableLookupRelation(AlterTableStmt *stmt, LOCKMODE lockmode)
(void *) stmt);
}
+
+static bool
+CheckGlobalTempTableNotInUse(Relation rel)
+{
+ int id;
+ for (id = 1; id <= MaxBackends; id++)
+ {
+ if (id != MyBackendId)
+ {
+ struct stat fst;
+ char* path = relpathbackend(rel->rd_node, id, MAIN_FORKNUM);
+ int rc = stat(path, &fst);
+ pfree(path);
+ if (rc == 0 && fst.st_size != 0)
+ return false;
+ }
+ }
+ return true;
+}
+
/*
* AlterTable
* Execute ALTER TABLE, which can be a list of subcommands
@@ -3568,6 +3631,9 @@ AlterTable(AlterTableStmt *stmt, LOCKMODE lockmode,
rel = relation_open(context->relid, NoLock);
CheckTableNotInUse(rel, "ALTER TABLE");
+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION
+ && !CheckGlobalTempTableNotInUse(rel))
+ elog(ERROR, "Global temp table used by active backends can not be altered");
ATController(stmt, rel, stmt->cmds, stmt->relation->inh, lockmode, context);
}
@@ -8169,6 +8235,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14629,6 +14701,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -15116,14 +15195,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 8286d9cf34..7a12635e9c 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -48,6 +48,7 @@
#include "partitioning/partprune.h"
#include "rewrite/rewriteManip.h"
#include "utils/lsyscache.h"
+#include "utils/rel.h"
/* results of subquery_is_pushdown_safe */
@@ -618,7 +619,7 @@ set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
* the rest of the necessary infrastructure right now anyway. So
* for now, bail out if we see a temporary table.
*/
- if (get_rel_persistence(rte->relid) == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(get_rel_persistence(rte->relid)))
return;
/*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d6f2153593..fd4e713646 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6312,7 +6312,7 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
* Furthermore, any index predicate or index expressions must be parallel
* safe.
*/
- if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
+ if (RelationHasSessionScope(heap) ||
!is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index)) ||
!is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index)))
{
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index d82fc5ab8b..95062ae344 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -28,6 +28,7 @@
#include "catalog/catalog.h"
#include "catalog/dependency.h"
#include "catalog/heap.h"
+#include "catalog/index.h"
#include "catalog/pg_am.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_statistic_ext.h"
@@ -46,6 +47,7 @@
#include "rewrite/rewriteManip.h"
#include "statistics/statistics.h"
#include "storage/bufmgr.h"
+#include "storage/buf_internals.h"
#include "utils/builtins.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
@@ -80,6 +82,28 @@ static void set_baserel_partition_key_exprs(Relation relation,
static void set_baserel_partition_constraint(Relation relation,
RelOptInfo *rel);
+static bool
+is_index_valid(Relation index)
+{
+ if (!index->rd_index->indisvalid)
+ return false;
+
+ if (index->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ {
+ Buffer metapage = ReadBuffer(index, 0);
+ bool isNew = PageIsNew(BufferGetPage(metapage));
+ ReleaseBuffer(metapage);
+ if (isNew)
+ {
+ Relation heap;
+ DropRelFileNodeAllLocalBuffers(index->rd_smgr->smgr_rnode.node);
+ heap = RelationIdGetRelation(index->rd_index->indrelid);
+ index->rd_indam->ambuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ }
+ }
+ return true;
+}
/*
* get_relation_info -
@@ -205,7 +229,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
* still needs to insert into "invalid" indexes, if they're marked
* indisready.
*/
- if (!index->indisvalid)
+ if (!is_index_valid(indexRelation))
{
index_close(indexRelation, NoLock);
continue;
@@ -704,7 +728,7 @@ infer_arbiter_indexes(PlannerInfo *root)
idxRel = index_open(indexoid, rte->rellockmode);
idxForm = idxRel->rd_index;
- if (!idxForm->indisvalid)
+ if (!is_index_valid(idxRel))
goto next;
/*
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 1b0edf5d3d..787de8329a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3288,20 +3288,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index ee2d2b54a1..e7f3a20fc4 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -437,6 +437,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->sequence = makeRangeVar(snamespace, sname, -1);
seqstmt->options = seqoptions;
+ /*
+ * Why don't we always use the persistence of the parent table?
+ * Although it is prohibited to have unlogged sequences,
+ * unlogged tables with SERIAL fields are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
/*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 6d1f28c327..4074344030 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2152,7 +2152,7 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
continue;
relid = classForm->oid;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index aba3960481..6322984c45 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -429,7 +429,7 @@ ForgetPrivateRefCountEntry(PrivateRefCountEntry *ref)
)
-static Buffer ReadBuffer_common(SMgrRelation reln, char relpersistence,
+static Buffer ReadBuffer_common(SMgrRelation reln, char relpersistence, char relkind,
ForkNumber forkNum, BlockNumber blockNum,
ReadBufferMode mode, BufferAccessStrategy strategy,
bool *hit);
@@ -663,7 +663,7 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
* miss.
*/
pgstat_count_buffer_read(reln);
- buf = ReadBuffer_common(reln->rd_smgr, reln->rd_rel->relpersistence,
+ buf = ReadBuffer_common(reln->rd_smgr, reln->rd_rel->relpersistence, reln->rd_rel->relkind,
forkNum, blockNum, mode, strategy, &hit);
if (hit)
pgstat_count_buffer_hit(reln);
@@ -691,7 +691,7 @@ ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
Assert(InRecovery);
- return ReadBuffer_common(smgr, RELPERSISTENCE_PERMANENT, forkNum, blockNum,
+ return ReadBuffer_common(smgr, RELPERSISTENCE_PERMANENT, RELKIND_RELATION, forkNum, blockNum,
mode, strategy, &hit);
}
@@ -702,7 +702,7 @@ ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
* *hit is set to true if the request was satisfied from shared buffer cache.
*/
static Buffer
-ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
+ReadBuffer_common(SMgrRelation smgr, char relpersistence, char relkind, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy, bool *hit)
{
@@ -895,7 +895,8 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
if (track_io_timing)
INSTR_TIME_SET_CURRENT(io_start);
- smgrread(smgr, forkNum, blockNum, (char *) bufBlock);
+ smgrread(smgr, forkNum, blockNum, (char *) bufBlock,
+ relkind == RELKIND_INDEX);
if (track_io_timing)
{
@@ -2943,7 +2944,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber *forkNum,
/* If it's a local relation, it's localbuf.c's problem. */
if (RelFileNodeBackendIsTemp(rnode))
{
- if (rnode.backend == MyBackendId)
+ if (GetRelationBackendId(rnode.backend) == MyBackendId)
{
for (j = 0; j < nforks; j++)
DropRelFileNodeLocalBuffers(rnode.node, forkNum[j],
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index c5b771c531..4400b211f8 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -27,12 +27,14 @@
#include "access/xlog.h"
#include "access/xlogutils.h"
+#include "commands/tablecmds.h"
#include "commands/tablespace.h"
#include "miscadmin.h"
#include "pg_trace.h"
#include "pgstat.h"
#include "postmaster/bgwriter.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/fd.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
@@ -40,6 +42,7 @@
#include "storage/sync.h"
#include "utils/hsearch.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
/*
* The magnetic disk storage manager keeps track of open file
@@ -87,6 +90,19 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to track session relations created by this backend.
+ * The data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ ForkNumber forknum;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +168,60 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Delete relation files */
+ mdunlink(rel->rnode, rel->forknum, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or opened when its file doesn't yet exist.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln, ForkNumber forknum)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We are using shared memory hook, because
+ * cleanup should be performed before backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->forknum = forknum;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
+static void
+RegisterOnCommitAction(SMgrRelation reln, ForkNumber forknum)
+{
+ if (reln->smgr_owner && forknum == MAIN_FORKNUM)
+ {
+ Relation rel = (Relation)((char*)reln->smgr_owner - offsetof(RelationData, rd_smgr));
+ if (rel->rd_options
+ && ((StdRdOptions *)rel->rd_options)->on_commit_delete_rows)
+ {
+ register_on_commit_action(rel->rd_id, ONCOMMIT_DELETE_ROWS);
+ }
+ }
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +288,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln, forkNum);
pfree(path);
@@ -465,6 +537,21 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * When a session relation is accessed, this backend may not yet have files for it.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln, forknum);
+ if (!(behavior & EXTENSION_RETURN_NULL))
+ RegisterOnCommitAction(reln, forknum);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +563,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -599,7 +687,7 @@ mdwriteback(SMgrRelation reln, ForkNumber forknum,
*/
void
mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
- char *buffer)
+ char *buffer, bool skipInit)
{
off_t seekpos;
int nbytes;
@@ -644,8 +732,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* For a session relation, write the zeroed page so that a subsequent mdnblocks returns the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode) && !skipInit)
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -735,7 +828,8 @@ mdnblocks(SMgrRelation reln, ForkNumber forknum)
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index 360b5bf5bf..a7b491b8d5 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -52,7 +52,7 @@ typedef struct f_smgr
void (*smgr_prefetch) (SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum);
void (*smgr_read) (SMgrRelation reln, ForkNumber forknum,
- BlockNumber blocknum, char *buffer);
+ BlockNumber blocknum, char *buffer, bool skipInit);
void (*smgr_write) (SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
void (*smgr_writeback) (SMgrRelation reln, ForkNumber forknum,
@@ -506,9 +506,9 @@ smgrprefetch(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum)
*/
void
smgrread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
- char *buffer)
+ char *buffer, bool skipInit)
{
- smgrsw[reln->smgr_which].smgr_read(reln, forknum, blocknum, buffer);
+ smgrsw[reln->smgr_which].smgr_read(reln, forknum, blocknum, buffer, skipInit);
}
/*
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 840664429e..0416549679 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 64776e3209..4996d8855c 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1191,6 +1191,110 @@ SearchCatCache4(CatCache *cache,
return SearchCatCacheInternal(cache, 4, v1, v2, v3, v4);
}
+
+void InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple)
+{
+ Datum arguments[CATCACHE_MAXKEYS];
+ uint32 hashValue;
+ Index hashIndex;
+ CatCTup *ct;
+ dlist_iter iter;
+ dlist_head *bucket;
+ int nkeys = cache->cc_nkeys;
+ MemoryContext oldcxt;
+
+ /*
+ * one-time startup overhead for each cache
+ */
+ if (unlikely(cache->cc_tupdesc == NULL))
+ CatalogCacheInitializeCache(cache);
+
+ /* Initialize local parameter array */
+ arguments[0] = v1;
+ arguments[1] = v2;
+ arguments[2] = v3;
+ arguments[3] = v4;
+ /*
+ * find the hash bucket in which to look for the tuple
+ */
+ hashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
+ hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);
+
+ /*
+ * scan the hash bucket until we find a match or exhaust our tuples
+ *
+ * Note: it's okay to use dlist_foreach here, even though we modify the
+ * dlist within the loop, because we don't continue the loop afterwards.
+ */
+ bucket = &cache->cc_bucket[hashIndex];
+ dlist_foreach(iter, bucket)
+ {
+ ct = dlist_container(CatCTup, cache_elem, iter.cur);
+
+ if (ct->dead)
+ continue; /* ignore dead entries */
+
+ if (ct->hash_value != hashValue)
+ continue; /* quickly skip entry if wrong hash val */
+
+ if (!CatalogCacheCompareTuple(cache, nkeys, ct->keys, arguments))
+ continue;
+
+ /*
+ * An entry with the same keys already exists: if the new tuple has the
+ * same length, overwrite the cached copy in place; else replace the entry.
+ */
+ if (ct->tuple.t_len == tuple->t_len)
+ {
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ return;
+ }
+ dlist_delete(&ct->cache_elem);
+ pfree(ct);
+ cache->cc_ntup -= 1;
+ CacheHdr->ch_ntup -= 1;
+ break;
+ }
+ /* Allocate memory for CatCTup and the cached tuple in one go */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+
+ ct = (CatCTup *) palloc(sizeof(CatCTup) +
+ MAXIMUM_ALIGNOF + tuple->t_len);
+ ct->tuple.t_len = tuple->t_len;
+ ct->tuple.t_self = tuple->t_self;
+ ct->tuple.t_tableOid = tuple->t_tableOid;
+ ct->tuple.t_data = (HeapTupleHeader)
+ MAXALIGN(((char *) ct) + sizeof(CatCTup));
+ /* copy tuple contents */
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ ct->ct_magic = CT_MAGIC;
+ ct->my_cache = cache;
+ ct->c_list = NULL;
+ ct->refcount = 1; /* pinned*/
+ ct->dead = false;
+ ct->negative = false;
+ ct->hash_value = hashValue;
+ dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
+ memcpy(ct->keys, arguments, nkeys*sizeof(Datum));
+
+ cache->cc_ntup++;
+ CacheHdr->ch_ntup++;
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * If the hash table has become too full, enlarge the buckets array. Quite
+ * arbitrarily, we enlarge when fill factor > 2.
+ */
+ if (cache->cc_ntup > cache->cc_nbuckets * 2)
+ RehashCatCache(cache);
+}
+
/*
* Work-horse for SearchCatCache/SearchCatCacheN.
*/
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index df025a5a30..dd0b1ff32f 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1092,6 +1092,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3303,6 +3307,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 53d9ddf159..f263b8318c 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -1156,6 +1156,16 @@ SearchSysCache4(int cacheId,
return SearchCatCache4(SysCache[cacheId], key1, key2, key3, key4);
}
+void
+InsertSysCache(int cacheId,
+ Datum key1, Datum key2, Datum key3, Datum key4,
+ HeapTuple value)
+{
+ Assert(cacheId >= 0 && cacheId < SysCacheSize &&
+ PointerIsValid(SysCache[cacheId]));
+ InsertCatCache(SysCache[cacheId], key1, key2, key3, key4, value);
+}
+
/*
* ReleaseSysCache
* Release previously grabbed reference count on a tuple
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index b7eee3da1d..afe22b2b9f 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -18,6 +18,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_statistic_d.h"
#include "funcapi.h"
#include "nodes/nodeFuncs.h"
#include "parser/parse_coerce.h"
@@ -30,6 +31,13 @@
#include "utils/syscache.h"
#include "utils/typcache.h"
+/*
+ * TODO: find a less ugly way to declare a core function returning pg_statistic rows.
+ * This is the OID of pg_gtt_statistic_for_relation. It needs special handling because it returns a set of pg_statistic
+ * rows containing attributes of anyarray type. The type of these attributes cannot be deduced from the input parameters,
+ * which prevents using a tuple descriptor in this case.
+ */
+#define GttStatisticFunctionId 3434
static void shutdown_MultiFuncCall(Datum arg);
static TypeFuncClass internal_get_result_type(Oid funcid,
@@ -341,7 +349,8 @@ internal_get_result_type(Oid funcid,
if (resolve_polymorphic_tupdesc(tupdesc,
&procform->proargtypes,
- call_expr))
+ call_expr) ||
+ funcid == GttStatisticFunctionId)
{
if (tupdesc->tdtypeid == RECORDOID &&
tupdesc->tdtypmod < 0)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index ec3e2c63b0..4c15822d5c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15635,8 +15635,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index ad733d1363..be38d1728b 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -169,7 +169,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract SessionRelFirstBackendId
+ * to avoid too long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary relations
+ * to make it possible to clean them up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index a12fc1fc46..89c3645c39 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2228256907..6757491d35 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5503,7 +5503,14 @@
proname => 'pg_stat_get_xact_function_self_time', provolatile => 'v',
proparallel => 'r', prorettype => 'float8', proargtypes => 'oid',
prosrc => 'pg_stat_get_xact_function_self_time' },
-
+{ oid => '3434',
+ descr => 'show local statistics for global temp table',
+ proname => 'pg_gtt_statistic_for_relation', provolatile => 'v', proparallel => 'u',
+ prorettype => 'record', proretset => 't', prorows => '100', proargtypes => 'oid',
+ proallargtypes => '{oid,oid,int2,bool,float4,int4,float4,int2,int2,int2,int2,int2,oid,oid,oid,oid,oid,oid,oid,oid,oid,oid,_float4,_float4,_float4,_float4,_float4,anyarray,anyarray,anyarray,anyarray,anyarray}',
+ proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{relid,starelid,staattnum,stainherit,stanullfrac,stawidth,stadistinct,stakind1,stakind2,stakind3,stakind4,stakind5,staop1,staop2,staop3,staop4,staop5,stacoll1,stacoll2,stacoll3,stacoll4,stacoll5,stanumbers1,stanumbers2,stanumbers3,stanumbers4,stanumbers5,stavalues1,stavalues2,stavalues3,stavalues4,stavalues5}',
+ prosrc => 'pg_gtt_statistic_for_relation' },
{ oid => '3788',
descr => 'statistics: timestamp of the current statistics snapshot',
proname => 'pg_stat_get_snapshot_timestamp', provolatile => 's',
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 0c776a3e6c..124fc3c8fb 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add a special bias value to the backend id (since
+ * the maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,12 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
+#define GetRelationBackendId(id) ((id) & ~SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 3f88683a05..7ecef10e41 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -228,6 +228,13 @@ typedef PageHeaderData *PageHeader;
*/
#define PageIsNew(page) (((PageHeader) (page))->pd_upper == 0)
+/*
+ * Page of temporary relation is not initialized
+ */
+#define GlobalTempRelationPageIsNotInitialized(rel, page) \
+ ((rel)->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+
+
/*
* PageGetItemId
* Returns an item identifier of a page.
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index ec7630ce3b..56838831b7 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -31,7 +31,7 @@ extern void mdextend(SMgrRelation reln, ForkNumber forknum,
extern void mdprefetch(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum);
extern void mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
- char *buffer);
+ char *buffer, bool skipInit);
extern void mdwrite(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
extern void mdwriteback(SMgrRelation reln, ForkNumber forknum,
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 4de9fc1e69..c45040b768 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,9 +75,24 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether this is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
+/*
+ * Check whether this is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether this is a local temporary relation, which exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index 243822137c..a4a2da2e0b 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -95,7 +95,7 @@ extern void smgrextend(SMgrRelation reln, ForkNumber forknum,
extern void smgrprefetch(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum);
extern void smgrread(SMgrRelation reln, ForkNumber forknum,
- BlockNumber blocknum, char *buffer);
+ BlockNumber blocknum, char *buffer, bool skipInit);
extern void smgrwrite(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
extern void smgrwriteback(SMgrRelation reln, ForkNumber forknum,
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index f4aa316604..365b02a9ba 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -228,4 +228,8 @@ extern void PrepareToInvalidateCacheTuple(Relation relation,
extern void PrintCatCacheLeakWarning(HeapTuple tuple);
extern void PrintCatCacheListLeakWarning(CatCList *list);
+extern void InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
+
#endif /* CATCACHE_H */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 44ed04dd3f..ae56427cba 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -277,6 +277,7 @@ typedef struct StdRdOptions
int parallel_workers; /* max number of parallel workers */
bool vacuum_index_cleanup; /* enables index vacuuming and cleanup */
bool vacuum_truncate; /* enables vacuum to truncate a relation */
+ bool on_commit_delete_rows; /* ON COMMIT DELETE ROWS for global temp table */
} StdRdOptions;
#define HEAP_MIN_FILLFACTOR 10
@@ -332,6 +333,18 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * Relation persistence is either TEMP or SESSION
+ */
+#define IsLocalRelpersistence(relpersistence) \
+ ((relpersistence) == RELPERSISTENCE_TEMP || (relpersistence) == RELPERSISTENCE_SESSION)
+
+/*
+ * Relation is either a global or a local temp table
+ */
+#define RelationHasSessionScope(relation) \
+ IsLocalRelpersistence(((relation)->rd_rel->relpersistence))
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
@@ -340,6 +353,7 @@ typedef enum ViewOptCheckOption
VIEW_OPTION_CHECK_OPTION_CASCADED
} ViewOptCheckOption;
+
/*
* ViewOptions
* Contents of rd_options for views
@@ -535,7 +549,7 @@ typedef struct ViewOptions
* True if relation's pages are stored in local buffers.
*/
#define RelationUsesLocalBuffers(relation) \
- ((relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ RelationHasSessionScope(relation)
/*
* RELATION_IS_LOCAL
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index f27b73d76d..eaf21d987d 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -216,4 +216,8 @@ extern bool RelationSupportsSysCache(Oid relid);
#define ReleaseSysCacheList(x) ReleaseCatCacheList(x)
+
+extern void InsertSysCache(int cacheId,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
#endif /* SYSCACHE_H */
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000000..6114f8c091
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index a2fa19230d..ef7aa85706 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -88,3 +88,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000000..5e95dd6f85
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test with minor changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to temporary child relation
+# of another session, and blocks when scanning the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000000..ae1adb6673
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,247 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2ab2115fa1..7538601870 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1349,6 +1349,40 @@ pg_group| SELECT pg_authid.rolname AS groname,
WHERE (pg_auth_members.roleid = pg_authid.oid)) AS grolist
FROM pg_authid
WHERE (NOT pg_authid.rolcanlogin);
+pg_gtt_statistic| SELECT s.starelid,
+ s.staattnum,
+ s.stainherit,
+ s.stanullfrac,
+ s.stawidth,
+ s.stadistinct,
+ s.stakind1,
+ s.stakind2,
+ s.stakind3,
+ s.stakind4,
+ s.stakind5,
+ s.staop1,
+ s.staop2,
+ s.staop3,
+ s.staop4,
+ s.staop5,
+ s.stacoll1,
+ s.stacoll2,
+ s.stacoll3,
+ s.stacoll4,
+ s.stacoll5,
+ s.stanumbers1,
+ s.stanumbers2,
+ s.stanumbers3,
+ s.stanumbers4,
+ s.stanumbers5,
+ s.stavalues1,
+ s.stavalues2,
+ s.stavalues3,
+ s.stavalues4,
+ s.stavalues5
+ FROM pg_class c,
+ LATERAL pg_gtt_statistic_for_relation(c.oid) s(starelid, staattnum, stainherit, stanullfrac, stawidth, stadistinct, stakind1, stakind2, stakind3, stakind4, stakind5, staop1, staop2, staop3, staop4, staop5, stacoll1, stacoll2, stacoll3, stacoll4, stacoll5, stanumbers1, stanumbers2, stanumbers3, stanumbers4, stanumbers5, stavalues1, stavalues2, stavalues3, stavalues4, stavalues5)
+ WHERE (c.relpersistence = 's'::"char");
pg_hba_file_rules| SELECT a.line_number,
a.type,
a.database,
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000000..1b9b3f4d20
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index d2b17dd3ea..71c8ca4f20 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index acba391332..71abe08e4e 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000000..3058b9b2c1
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,151 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000000..c6663dc89b
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
Fix GTT index initialization.
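To illustrate the fix: index metadata of a global temp table is shared, but each backend has its own storage, so when a new session first touches the table its index file is still empty and must be initialized lazily on first access. A minimal sketch of the scenario, modeled on the session_table test above (table and index names are mine, not from the patch):

create global temp table gtt_demo(x integer);
create index gtt_demo_x_idx on gtt_demo(x);
insert into gtt_demo values (generate_series(1,1000));
\c
-- new session: the data is gone and the index storage is uninitialized
insert into gtt_demo values (generate_series(1,1000));
-- first access to the index initializes (rebuilds) it for this backend
select * from gtt_demo where x = 500;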
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
global_private_temp-10.patch (text/x-patch)
diff --git a/contrib/pg_prewarm/pg_prewarm.c b/contrib/pg_prewarm/pg_prewarm.c
index 33e2d28..93059ef 100644
--- a/contrib/pg_prewarm/pg_prewarm.c
+++ b/contrib/pg_prewarm/pg_prewarm.c
@@ -178,7 +178,7 @@ pg_prewarm(PG_FUNCTION_ARGS)
for (block = first_block; block <= last_block; ++block)
{
CHECK_FOR_INTERRUPTS();
- smgrread(rel->rd_smgr, forkNumber, block, blockbuffer.data);
+ smgrread(rel->rd_smgr, forkNumber, block, blockbuffer.data, false);
++blocks_done;
}
}
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 79430d2..39baddc 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -158,6 +158,19 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ /*
+ * For global temp tables only;
+ * a self-exclusive lock is used to ensure safety.
+ */
+ {
+ {
+ "on_commit_delete_rows",
+ "global temp table on commit options",
+ RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1486,6 +1499,8 @@ bytea *
default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{
static const relopt_parse_elt tab[] = {
+ {"on_commit_delete_rows", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, on_commit_delete_rows)},
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
@@ -1586,13 +1601,17 @@ build_reloptions(Datum reloptions, bool validate,
bytea *
partitioned_table_reloptions(Datum reloptions, bool validate)
{
+ static const relopt_parse_elt tab[] = {
+ {"on_commit_delete_rows", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, on_commit_delete_rows)}
+ };
- /*
- * There are no options for partitioned tables yet, but this is able to do
- * some validation.
- */
+ /*
+ * on_commit_delete_rows is so far the only option for partitioned
+ * tables; build_reloptions() also performs some validation.
+ */
return (bytea *) build_reloptions(reloptions, validate,
RELOPT_KIND_PARTITIONED,
- 0, NULL, 0);
+ sizeof(StdRdOptions), tab, lengthof(tab));
}
/*
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 3fa4b76..a86de50 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -670,6 +670,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 7d6acae..7c48e5c 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -396,6 +396,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8880586..22ce895 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3707,7 +3707,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index fddfbf1..9747835 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -92,6 +92,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
@@ -367,7 +371,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
/* If we got a cancel signal during the copy of the data, quit */
CHECK_FOR_INTERRUPTS();
- smgrread(src, forkNum, blkno, buf.data);
+ smgrread(src, forkNum, blkno, buf.data, false);
if (!PageIsVerified(page, blkno))
ereport(ERROR,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index c9e6060..1f5e52b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1369,7 +1369,15 @@ LANGUAGE INTERNAL
STRICT STABLE PARALLEL SAFE
AS 'jsonb_path_query_first_tz';
+
+--
+-- Statistics for global temporary tables
--
+
+CREATE VIEW pg_gtt_statistic AS
+ SELECT s.* FROM pg_class c, pg_gtt_statistic_for_relation(c.oid) s WHERE c.relpersistence = 's';
+
+
-- The default permissions for functions mean that anyone can execute them.
-- A number of functions shouldn't be executable by just anyone, but rather
-- than use explicit 'superuser()' checks in those functions, we use the GRANT
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index c4420dd..85d8f04 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -40,6 +40,7 @@
#include "commands/vacuum.h"
#include "executor/executor.h"
#include "foreign/fdwapi.h"
+#include "funcapi.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "parser/parse_oper.h"
@@ -103,7 +104,7 @@ static int acquire_inherited_sample_rows(Relation onerel, int elevel,
HeapTuple *rows, int targrows,
double *totalrows, double *totaldeadrows);
static void update_attstats(Oid relid, bool inh,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats, bool is_global_temp);
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
@@ -323,6 +324,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
Oid save_userid;
int save_sec_context;
int save_nestlevel;
+ bool is_global_temp = onerel->rd_rel->relpersistence == RELPERSISTENCE_SESSION;
if (inh)
ereport(elevel,
@@ -586,14 +588,14 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
* pg_statistic for columns we didn't process, we leave them alone.)
*/
update_attstats(RelationGetRelid(onerel), inh,
- attr_cnt, vacattrstats);
+ attr_cnt, vacattrstats, is_global_temp);
for (ind = 0; ind < nindexes; ind++)
{
AnlIndexData *thisdata = &indexdata[ind];
update_attstats(RelationGetRelid(Irel[ind]), false,
- thisdata->attr_cnt, thisdata->vacattrstats);
+ thisdata->attr_cnt, thisdata->vacattrstats, is_global_temp);
}
/*
@@ -1456,7 +1458,7 @@ acquire_inherited_sample_rows(Relation onerel, int elevel,
* by taking a self-exclusive lock on the relation in analyze_rel().
*/
static void
-update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
+update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats, bool is_global_temp)
{
Relation sd;
int attno;
@@ -1558,30 +1560,42 @@ update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
}
}
- /* Is there already a pg_statistic tuple for this attribute? */
- oldtup = SearchSysCache3(STATRELATTINH,
- ObjectIdGetDatum(relid),
- Int16GetDatum(stats->attr->attnum),
- BoolGetDatum(inh));
-
- if (HeapTupleIsValid(oldtup))
+ if (is_global_temp)
{
- /* Yes, replace it */
- stup = heap_modify_tuple(oldtup,
- RelationGetDescr(sd),
- values,
- nulls,
- replaces);
- ReleaseSysCache(oldtup);
- CatalogTupleUpdate(sd, &stup->t_self, stup);
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ InsertSysCache(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh),
+ 0,
+ stup);
}
else
{
- /* No, insert new tuple */
- stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
- CatalogTupleInsert(sd, stup);
- }
+ /* Is there already a pg_statistic tuple for this attribute? */
+ oldtup = SearchSysCache3(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh));
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ CatalogTupleUpdate(sd, &stup->t_self, stup);
+ }
+ else
+ {
+ /* No, insert new tuple */
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ CatalogTupleInsert(sd, stup);
+ }
+ }
heap_freetuple(stup);
}
@@ -2890,3 +2904,72 @@ analyze_mcv_list(int *mcv_counts,
}
return num_mcv;
}
+
+PG_FUNCTION_INFO_V1(pg_gtt_statistic_for_relation);
+
+typedef struct
+{
+ int staattnum;
+ bool stainherit;
+} PgTempStatIteratorCtx;
+
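+/*
+ * Return all pg_statistic tuples for the given relation that are visible
+ * through the catalog cache. For a global temp table these tuples are
+ * inserted directly into the backend-local cache by update_attstats, so
+ * each session sees its own statistics.
+ */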
+Datum
+pg_gtt_statistic_for_relation(PG_FUNCTION_ARGS)
+{
+ Oid starelid = PG_GETARG_OID(0);
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ Tuplestorestate *tupstore;
+ MemoryContext per_query_ctx;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ bool stainherit = false;
+
+ /* determine the tuple descriptor for the result row type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));
+
+ /* Build tuplestore to hold the result rows */
+ per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+ oldcontext = MemoryContextSwitchTo(per_query_ctx);
+
+ /* Set up the tuplestore and result-set info for materialize mode */
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ do
+ {
+ int staattnum = 0;
+ while (true)
+ {
+ HeapTuple statup = SearchSysCacheCopy3(STATRELATTINH,
+ ObjectIdGetDatum(starelid),
+ Int16GetDatum(++staattnum),
+ BoolGetDatum(stainherit));
+ if (statup != NULL)
+ tuplestore_puttuple(tupstore, statup);
+ else
+ break;
+ }
+ stainherit = !stainherit;
+ } while (stainherit);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ tuplestore_donestoring(tupstore);
+
+ return (Datum) 0;
+}
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index e9d7a7f..a22a77a 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -391,6 +391,13 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
errmsg("cannot vacuum temporary tables of other sessions")));
}
+ /* CLUSTER on global temp tables is not supported yet */
+ if (OldHeap->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("not support cluster global temporary tables yet")));
+
+
/*
* Also check for active uses of the relation in the current transaction,
* including open scans and pending AFTER trigger events.
@@ -1399,7 +1406,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 6aab73b..bc3c986 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (GlobalTempRelationPageIsNotInitialized(rel, page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f599393..1e4a52e 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -12,6 +12,9 @@
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
+
+#include <sys/stat.h>
+#include <unistd.h>
#include "access/attmap.h"
@@ -555,6 +558,23 @@ static List *GetParentedForeignKeyRefs(Relation partition);
static void ATDetachCheckNoForeignKeyRefs(Relation partition);
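+
+/*
+ * Check whether the relation's WITH option list contains
+ * on_commit_delete_rows.
+ */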
+static bool
+has_oncommit_option(List *options)
+{
+ ListCell *listptr;
+
+ foreach(listptr, options)
+ {
+ DefElem *def = (DefElem *) lfirst(listptr);
+
+ if (pg_strcasecmp(def->defname, "on_commit_delete_rows") == 0)
+ return true;
+ }
+
+ return false;
+}
+
+
/* ----------------------------------------------------------------
* DefineRelation
* Creates a new relation.
@@ -598,6 +618,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
LOCKMODE parentLockmode;
const char *accessMethod = NULL;
Oid accessMethodId = InvalidOid;
+ bool has_oncommit_clause = false;
/*
* Truncate relname to appropriate length (probably a waste of time, as
@@ -609,7 +630,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && !IsLocalRelpersistence(stmt->relation->relpersistence))
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -635,17 +656,6 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
RangeVarGetAndCheckCreationNamespace(stmt->relation, NoLock, NULL);
/*
- * Security check: disallow creating temp tables from security-restricted
- * code. This is needed because calling code might not expect untrusted
- * tables to appear in pg_temp at the front of its search path.
- */
- if (stmt->relation->relpersistence == RELPERSISTENCE_TEMP
- && InSecurityRestrictedOperation())
- ereport(ERROR,
- (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- errmsg("cannot create temporary table within security-restricted operation")));
-
- /*
* Determine the lockmode to use when scanning parents. A self-exclusive
* lock is needed here.
*
@@ -740,6 +750,38 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
/*
* Parse and validate reloptions, if any.
*/
+ /* Map the ON COMMIT clause of a global temp table onto the on_commit_delete_rows reloption */
+ has_oncommit_clause = has_oncommit_option(stmt->options);
+ if (stmt->relation->relpersistence == RELPERSISTENCE_SESSION)
+ {
+ if (has_oncommit_clause)
+ {
+ if (stmt->oncommit != ONCOMMIT_NOOP)
+ elog(ERROR, "can not defeine global temp table with on commit and with clause at same time");
+ }
+ else if (stmt->oncommit != ONCOMMIT_NOOP)
+ {
+ DefElem *opt = makeNode(DefElem);
+
+ opt->type = T_DefElem;
+ opt->defnamespace = NULL;
+ opt->defname = "on_commit_delete_rows";
+ opt->defaction = DEFELEM_UNSPEC;
+
+ /* use reloptions to remember on commit clause */
+ if (stmt->oncommit == ONCOMMIT_DELETE_ROWS)
+ opt->arg = (Node *)makeString("true");
+ else if (stmt->oncommit == ONCOMMIT_PRESERVE_ROWS)
+ opt->arg = (Node *)makeString("false");
+ else
+ elog(ERROR, "global temp table not support on commit drop clause");
+
+ stmt->options = lappend(stmt->options, opt);
+ }
+ }
+ else if (has_oncommit_clause)
+ elog(ERROR, "regular table cannot specifie on_commit_delete_rows");
+
reloptions = transformRelOptions((Datum) 0, stmt->options, NULL, validnsps,
true, false);
@@ -1824,7 +1866,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -3511,6 +3554,26 @@ AlterTableLookupRelation(AlterTableStmt *stmt, LOCKMODE lockmode)
(void *) stmt);
}
+
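+/*
+ * Check that no other backend is using this global temp table: look for
+ * non-empty per-backend relation files of the table created by other
+ * backends and report false if any is found.
+ */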
+static bool
+CheckGlobalTempTableNotInUse(Relation rel)
+{
+ int id;
+ for (id = 1; id <= MaxBackends; id++)
+ {
+ if (id != MyBackendId)
+ {
+ struct stat fst;
+ char* path = relpathbackend(rel->rd_node, id, MAIN_FORKNUM);
+ int rc = stat(path, &fst);
+ pfree(path);
+ if (rc == 0 && fst.st_size != 0)
+ return false;
+ }
+ }
+ return true;
+}
+
/*
* AlterTable
* Execute ALTER TABLE, which can be a list of subcommands
@@ -3568,6 +3631,9 @@ AlterTable(AlterTableStmt *stmt, LOCKMODE lockmode,
rel = relation_open(context->relid, NoLock);
CheckTableNotInUse(rel, "ALTER TABLE");
+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION
+ && !CheckGlobalTempTableNotInUse(rel))
+ elog(ERROR, "Global temp table used by active backends can not be altered");
ATController(stmt, rel, stmt->cmds, stmt->relation->inh, lockmode, context);
}
@@ -8169,6 +8235,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14629,6 +14701,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -15116,14 +15195,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 59d1a31..0442565 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -354,6 +354,7 @@ ExecInsert(ModifyTableState *mtstate,
ModifyTable *node = (ModifyTable *) mtstate->ps.plan;
OnConflictAction onconflict = node->onConflictAction;
+ if (resultRelInfo->ri_NumIndices > 0)
ExecMaterializeSlot(slot);
/*
@@ -361,7 +362,14 @@ ExecInsert(ModifyTableState *mtstate,
*/
resultRelInfo = estate->es_result_relation_info;
resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
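+ /*
+ * For a global temp table, make sure all of its indexes are
+ * initialized in this backend before inserting any tuples.
+ */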
+ if (resultRelationDesc->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ {
+ int i;
+ for (i = 0; i < resultRelInfo->ri_NumIndices; i++)
+ {
+ InitGTTIndexes(resultRelInfo->ri_IndexRelationDescs[i]);
+ }
+ }
/*
* BEFORE ROW INSERT Triggers.
*
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 8286d9c..7a12635 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -48,6 +48,7 @@
#include "partitioning/partprune.h"
#include "rewrite/rewriteManip.h"
#include "utils/lsyscache.h"
+#include "utils/rel.h"
/* results of subquery_is_pushdown_safe */
@@ -618,7 +619,7 @@ set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
* the rest of the necessary infrastructure right now anyway. So
* for now, bail out if we see a temporary table.
*/
- if (get_rel_persistence(rte->relid) == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(get_rel_persistence(rte->relid)))
return;
/*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d6f2153..fd4e713 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6312,7 +6312,7 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
* Furthermore, any index predicate or index expressions must be parallel
* safe.
*/
- if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
+ if (RelationHasSessionScope(heap) ||
!is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index)) ||
!is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index)))
{
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index d82fc5a..619ed96 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -80,6 +80,15 @@ static void set_baserel_partition_key_exprs(Relation relation,
static void set_baserel_partition_constraint(Relation relation,
RelOptInfo *rel);
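+/*
+ * Consider an index valid if indisvalid is set; for an index on a global
+ * temp table, additionally make sure its backend-local storage is
+ * initialized before the planner uses it.
+ */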
+static bool
+is_index_valid(Relation index)
+{
+ if (!index->rd_index->indisvalid)
+ return false;
+ if (index->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ InitGTTIndexes(index);
+ return true;
+}
/*
* get_relation_info -
@@ -205,7 +214,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
* still needs to insert into "invalid" indexes, if they're marked
* indisready.
*/
- if (!index->indisvalid)
+ if (!is_index_valid(indexRelation))
{
index_close(indexRelation, NoLock);
continue;
@@ -704,7 +713,7 @@ infer_arbiter_indexes(PlannerInfo *root)
idxRel = index_open(indexoid, rte->rellockmode);
idxForm = idxRel->rd_index;
- if (!idxForm->indisvalid)
+ if (!is_index_valid(idxRel))
goto next;
/*
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 1b0edf5..787de83 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3288,20 +3288,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index ee2d2b5..e7f3a20 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -438,6 +438,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why not always use the parent table's persistence? Because
+ * unlogged sequences are prohibited, while unlogged tables with
+ * SERIAL columns are accepted.
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 6d1f28c..4074344 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2152,7 +2152,7 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
continue;
relid = classForm->oid;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index aba3960..2d88cc9 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -33,9 +33,11 @@
#include <sys/file.h>
#include <unistd.h>
+#include "access/amapi.h"
#include "access/tableam.h"
#include "access/xlog.h"
#include "catalog/catalog.h"
+#include "catalog/index.h"
#include "catalog/storage.h"
#include "executor/instrument.h"
#include "lib/binaryheap.h"
@@ -429,7 +431,7 @@ ForgetPrivateRefCountEntry(PrivateRefCountEntry *ref)
)
-static Buffer ReadBuffer_common(SMgrRelation reln, char relpersistence,
+static Buffer ReadBuffer_common(SMgrRelation reln, char relpersistence, char relkind,
ForkNumber forkNum, BlockNumber blockNum,
ReadBufferMode mode, BufferAccessStrategy strategy,
bool *hit);
@@ -663,7 +665,7 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
* miss.
*/
pgstat_count_buffer_read(reln);
- buf = ReadBuffer_common(reln->rd_smgr, reln->rd_rel->relpersistence,
+ buf = ReadBuffer_common(reln->rd_smgr, reln->rd_rel->relpersistence, reln->rd_rel->relkind,
forkNum, blockNum, mode, strategy, &hit);
if (hit)
pgstat_count_buffer_hit(reln);
@@ -691,7 +693,7 @@ ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
Assert(InRecovery);
- return ReadBuffer_common(smgr, RELPERSISTENCE_PERMANENT, forkNum, blockNum,
+ return ReadBuffer_common(smgr, RELPERSISTENCE_PERMANENT, RELKIND_RELATION, forkNum, blockNum,
mode, strategy, &hit);
}
@@ -702,7 +704,7 @@ ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
* *hit is set to true if the request was satisfied from shared buffer cache.
*/
static Buffer
-ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
+ReadBuffer_common(SMgrRelation smgr, char relpersistence, char relkind, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy, bool *hit)
{
@@ -895,7 +897,8 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
if (track_io_timing)
INSTR_TIME_SET_CURRENT(io_start);
- smgrread(smgr, forkNum, blockNum, (char *) bufBlock);
+ smgrread(smgr, forkNum, blockNum, (char *) bufBlock,
+ relkind == RELKIND_INDEX);
if (track_io_timing)
{
@@ -2943,7 +2946,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber *forkNum,
/* If it's a local relation, it's localbuf.c's problem. */
if (RelFileNodeBackendIsTemp(rnode))
{
- if (rnode.backend == MyBackendId)
+ if (GetRelationBackendId(rnode.backend) == MyBackendId)
{
for (j = 0; j < nforks; j++)
DropRelFileNodeLocalBuffers(rnode.node, forkNum[j],
@@ -4423,3 +4426,19 @@ TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
(errcode(ERRCODE_SNAPSHOT_TOO_OLD),
errmsg("snapshot too old")));
}
+
+/*
+ * Initialize the session-private file of a global temp table index on first
+ * access: if the metapage is still zeroed, rebuild the index from this
+ * session's heap.
+ */
+void InitGTTIndexes(Relation index)
+{
+ Buffer metapage = ReadBuffer(index, 0);
+ bool isNew = PageIsNew(BufferGetPage(metapage));
+ Assert(index->rd_rel->relpersistence == RELPERSISTENCE_SESSION);
+ ReleaseBuffer(metapage);
+ if (isNew)
+ {
+ Relation heap;
+ /* Discard the zeroed metapage and build the index from scratch. */
+ DropRelFileNodeAllLocalBuffers(index->rd_smgr->smgr_rnode.node);
+ heap = RelationIdGetRelation(index->rd_index->indrelid);
+ index->rd_indam->ambuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ }
+}
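
To make the scenario InitGTTIndexes() handles concrete, a hedged SQL sketch (object names are illustrative): the index definition is shared, but each session has its own index file, built lazily on first access:

-- session 1 creates the shared definition:
CREATE GLOBAL TEMP TABLE gtt (id int);
CREATE INDEX gtt_id_idx ON gtt (id);
-- a brand-new session 2 has no file for gtt_id_idx yet; its first access
-- sees a zeroed metapage, and InitGTTIndexes() rebuilds the index from
-- this session's (initially empty) heap before it is used:
INSERT INTO gtt VALUES (1);
SELECT id FROM gtt WHERE id = 1;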
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index c5b771c..4400b21 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -27,12 +27,14 @@
#include "access/xlog.h"
#include "access/xlogutils.h"
+#include "commands/tablecmds.h"
#include "commands/tablespace.h"
#include "miscadmin.h"
#include "pg_trace.h"
#include "pgstat.h"
#include "postmaster/bgwriter.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/fd.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
@@ -40,6 +42,7 @@
#include "storage/sync.h"
#include "utils/hsearch.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
/*
* The magnetic disk storage manager keeps track of open file
@@ -87,6 +90,19 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to collect information about session relations created by
+ * this backend. Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ ForkNumber forknum;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +168,60 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete all data of session relations and remove their pages from shared buffers.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Delete relation files */
+ mdunlink(rel->rnode, rel->forknum, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or when it
+ * is opened and its file does not yet exist.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln, ForkNumber forknum)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We use a shared-memory
+ * exit hook because cleanup must happen before the backend is disconnected
+ * from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->forknum = forknum;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
+static void
+RegisterOnCommitAction(SMgrRelation reln, ForkNumber forknum)
+{
+ if (reln->smgr_owner && forknum == MAIN_FORKNUM)
+ {
+ Relation rel = (Relation)((char*)reln->smgr_owner - offsetof(RelationData, rd_smgr));
+ if (rel->rd_options
+ && ((StdRdOptions *)rel->rd_options)->on_commit_delete_rows)
+ {
+ register_on_commit_action(rel->rd_id, ONCOMMIT_DELETE_ROWS);
+ }
+ }
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +288,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln, forkNum);
pfree(path);
@@ -465,6 +537,21 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * When a session relation is accessed, its files may not yet exist for this
+ * backend. If so, create the file and register the session relation for
+ * truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln, forknum);
+ if (!(behavior & EXTENSION_RETURN_NULL))
+ RegisterOnCommitAction(reln, forknum);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +563,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -599,7 +687,7 @@ mdwriteback(SMgrRelation reln, ForkNumber forknum,
*/
void
mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
- char *buffer)
+ char *buffer, bool skipInit)
{
off_t seekpos;
int nbytes;
@@ -644,8 +732,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* For a session relation we must write the zeroed page so that a subsequent mdnblocks() returns the correct result */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode) && !skipInit)
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -735,7 +828,8 @@ mdnblocks(SMgrRelation reln, ForkNumber forknum)
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index 360b5bf..a7b491b 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -52,7 +52,7 @@ typedef struct f_smgr
void (*smgr_prefetch) (SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum);
void (*smgr_read) (SMgrRelation reln, ForkNumber forknum,
- BlockNumber blocknum, char *buffer);
+ BlockNumber blocknum, char *buffer, bool skipInit);
void (*smgr_write) (SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
void (*smgr_writeback) (SMgrRelation reln, ForkNumber forknum,
@@ -506,9 +506,9 @@ smgrprefetch(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum)
*/
void
smgrread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
- char *buffer)
+ char *buffer, bool skipInit)
{
- smgrsw[reln->smgr_which].smgr_read(reln, forknum, blocknum, buffer);
+ smgrsw[reln->smgr_which].smgr_read(reln, forknum, blocknum, buffer, skipInit);
}
/*
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 8406644..0416549 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 64776e3..4996d88 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1191,6 +1191,110 @@ SearchCatCache4(CatCache *cache,
return SearchCatCacheInternal(cache, 4, v1, v2, v3, v4);
}
+
+void InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple)
+{
+ Datum arguments[CATCACHE_MAXKEYS];
+ uint32 hashValue;
+ Index hashIndex;
+ CatCTup *ct;
+ dlist_iter iter;
+ dlist_head *bucket;
+ int nkeys = cache->cc_nkeys;
+ MemoryContext oldcxt;
+
+ /*
+ * one-time startup overhead for each cache
+ */
+ if (unlikely(cache->cc_tupdesc == NULL))
+ CatalogCacheInitializeCache(cache);
+
+ /* Initialize local parameter array */
+ arguments[0] = v1;
+ arguments[1] = v2;
+ arguments[2] = v3;
+ arguments[3] = v4;
+ /*
+ * find the hash bucket in which to look for the tuple
+ */
+ hashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
+ hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);
+
+ /*
+ * scan the hash bucket until we find a match or exhaust our tuples
+ *
+ * Note: it's okay to use dlist_foreach here, even though we modify the
+ * dlist within the loop, because we don't continue the loop afterwards.
+ */
+ bucket = &cache->cc_bucket[hashIndex];
+ dlist_foreach(iter, bucket)
+ {
+ ct = dlist_container(CatCTup, cache_elem, iter.cur);
+
+ if (ct->dead)
+ continue; /* ignore dead entries */
+
+ if (ct->hash_value != hashValue)
+ continue; /* quickly skip entry if wrong hash val */
+
+ if (!CatalogCacheCompareTuple(cache, nkeys, ct->keys, arguments))
+ continue;
+
+ /*
+ * An entry with the same keys already exists. If the tuple size matches,
+ * overwrite its contents in place; otherwise remove the stale entry and
+ * insert a fresh copy below.
+ */
+ if (ct->tuple.t_len == tuple->t_len)
+ {
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ return;
+ }
+ dlist_delete(&ct->cache_elem);
+ pfree(ct);
+ cache->cc_ntup -= 1;
+ CacheHdr->ch_ntup -= 1;
+ break;
+ }
+ /* Allocate memory for CatCTup and the cached tuple in one go */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+
+ ct = (CatCTup *) palloc(sizeof(CatCTup) +
+ MAXIMUM_ALIGNOF + tuple->t_len);
+ ct->tuple.t_len = tuple->t_len;
+ ct->tuple.t_self = tuple->t_self;
+ ct->tuple.t_tableOid = tuple->t_tableOid;
+ ct->tuple.t_data = (HeapTupleHeader)
+ MAXALIGN(((char *) ct) + sizeof(CatCTup));
+ /* copy tuple contents */
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ ct->ct_magic = CT_MAGIC;
+ ct->my_cache = cache;
+ ct->c_list = NULL;
+ ct->refcount = 1; /* pinned */
+ ct->dead = false;
+ ct->negative = false;
+ ct->hash_value = hashValue;
+ dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
+ memcpy(ct->keys, arguments, nkeys*sizeof(Datum));
+
+ cache->cc_ntup++;
+ CacheHdr->ch_ntup++;
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * If the hash table has become too full, enlarge the buckets array. Quite
+ * arbitrarily, we enlarge when fill factor > 2.
+ */
+ if (cache->cc_ntup > cache->cc_nbuckets * 2)
+ RehashCatCache(cache);
+}
+
/*
* Work-horse for SearchCatCache/SearchCatCacheN.
*/
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index df025a5..dd0b1ff 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1092,6 +1092,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3303,6 +3307,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 53d9ddf..f263b83 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -1156,6 +1156,16 @@ SearchSysCache4(int cacheId,
return SearchCatCache4(SysCache[cacheId], key1, key2, key3, key4);
}
+void
+InsertSysCache(int cacheId,
+ Datum key1, Datum key2, Datum key3, Datum key4,
+ HeapTuple value)
+{
+ Assert(cacheId >= 0 && cacheId < SysCacheSize &&
+ PointerIsValid(SysCache[cacheId]));
+ InsertCatCache(SysCache[cacheId], key1, key2, key3, key4, value);
+}
+
/*
* ReleaseSysCache
* Release previously grabbed reference count on a tuple
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index b7eee3d..afe22b2 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -18,6 +18,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_statistic_d.h"
#include "funcapi.h"
#include "nodes/nodeFuncs.h"
#include "parser/parse_coerce.h"
@@ -30,6 +31,13 @@
#include "utils/syscache.h"
#include "utils/typcache.h"
+/*
+ * TODO: find a less ugly way to declare a core function returning pg_statistic rows.
+ * This is the OID of pg_gtt_statistic_for_relation. That function needs
+ * special handling because it returns a set of pg_statistic rows containing
+ * attributes of anyarray type; their types cannot be deduced from the input
+ * parameters, which prevents using a tuple descriptor here.
+ */
+#define GttStatisticFunctionId 3434
static void shutdown_MultiFuncCall(Datum arg);
static TypeFuncClass internal_get_result_type(Oid funcid,
@@ -341,7 +349,8 @@ internal_get_result_type(Oid funcid,
if (resolve_polymorphic_tupdesc(tupdesc,
&procform->proargtypes,
- call_expr))
+ call_expr) ||
+ funcid == GttStatisticFunctionId)
{
if (tupdesc->tdtypeid == RECORDOID &&
tupdesc->tdtypmod < 0)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index ec3e2c6..4c15822 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15635,8 +15635,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
diff --git a/src/common/relpath.c b/src/common/relpath.c
index ad733d1..be38d17 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -169,7 +169,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by the
+ * SessionRelFirstBackendId offset added to their backendId.
+ * There is no need to separate them at the file system level, so just
+ * subtract SessionRelFirstBackendId to avoid overly long file names.
+ * Segments of session relations get the same prefix (t%d_) as local
+ * temporary relations, so that they can be cleaned up in the same way as
+ * local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
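
A worked example of the naming scheme (the OIDs and backend id are illustrative assumptions): for a backend with MyBackendId = 5, a session relation carries backend id 0x40000005; GetRelationPath() subtracts the bias again, so pg_relation_filepath() reports an ordinary temp-style name:

CREATE SESSION TABLE path_demo (id int);
SELECT pg_relation_filepath('path_demo');
-- e.g. base/13593/t5_16385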
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index a12fc1f..89c3645 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2228256..6757491 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5503,7 +5503,14 @@
proname => 'pg_stat_get_xact_function_self_time', provolatile => 'v',
proparallel => 'r', prorettype => 'float8', proargtypes => 'oid',
prosrc => 'pg_stat_get_xact_function_self_time' },
-
+{ oid => '3434',
+ descr => 'show local statistics for global temp table',
+ proname => 'pg_gtt_statistic_for_relation', provolatile => 'v', proparallel => 'u',
+ prorettype => 'record', proretset => 't', prorows => '100', proargtypes => 'oid',
+ proallargtypes => '{oid,oid,int2,bool,float4,int4,float4,int2,int2,int2,int2,int2,oid,oid,oid,oid,oid,oid,oid,oid,oid,oid,_float4,_float4,_float4,_float4,_float4,anyarray,anyarray,anyarray,anyarray,anyarray}',
+ proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{relid,starelid,staattnum,stainherit,stanullfrac,stawidth,stadistinct,stakind1,stakind2,stakind3,stakind4,stakind5,staop1,staop2,staop3,staop4,staop5,stacoll1,stacoll2,stacoll3,stacoll4,stacoll5,stanumbers1,stanumbers2,stanumbers3,stanumbers4,stanumbers5,stavalues1,stavalues2,stavalues3,stavalues4,stavalues5}',
+ prosrc => 'pg_gtt_statistic_for_relation' },
{ oid => '3788',
descr => 'statistics: timestamp of the current statistics snapshot',
proname => 'pg_stat_get_snapshot_timestamp', provolatile => 's',
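
Given this signature (one oid in, a set of pg_statistic-shaped rows out), per-session statistics could be inspected roughly as follows (a sketch; the table name is illustrative, and the pg_gtt_statistic view added in system_views.sql wraps the same function):

ANALYZE my_session_table;
SELECT staattnum, stanullfrac, stawidth, stadistinct
FROM pg_gtt_statistic_for_relation('my_session_table'::regclass);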
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 0c776a3..124fc3c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by their
+ * RelFileNodeBackend. The least invasive change is to add a special bias
+ * value to the backend id (the maximal number of backends is limited by
+ * MaxBackends, so biased values cannot collide with real backend ids).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,12 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
+#define GetRelationBackendId(id) ((id) & ~SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 73c7e9b..467134c 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -231,6 +231,7 @@ extern void TestForOldSnapshot_impl(Snapshot snapshot, Relation relation);
extern BufferAccessStrategy GetAccessStrategy(BufferAccessStrategyType btype);
extern void FreeAccessStrategy(BufferAccessStrategy strategy);
+extern void InitGTTIndexes(Relation rel);
/* inline functions */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 3f88683..7ecef10 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -229,6 +229,13 @@ typedef PageHeaderData *PageHeader;
#define PageIsNew(page) (((PageHeader) (page))->pd_upper == 0)
/*
+ * Check whether a page of a global temp relation has not yet been initialized in this session
+ */
+#define GlobalTempRelationPageIsNotInitialized(rel, page) \
+ ((rel)->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+
+
+/*
* PageGetItemId
* Returns an item identifier of a page.
*/
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index ec7630c..5683883 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -31,7 +31,7 @@ extern void mdextend(SMgrRelation reln, ForkNumber forknum,
extern void mdprefetch(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum);
extern void mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
- char *buffer);
+ char *buffer, bool skipInit);
extern void mdwrite(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
extern void mdwriteback(SMgrRelation reln, ForkNumber forknum,
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 4de9fc1..c45040b 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether this is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether this is a global temporary relation, whose metadata is
+ * shared by all sessions but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether this is a local temporary relation that exists only within this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index 2438221..a4a2da2 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -95,7 +95,7 @@ extern void smgrextend(SMgrRelation reln, ForkNumber forknum,
extern void smgrprefetch(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum);
extern void smgrread(SMgrRelation reln, ForkNumber forknum,
- BlockNumber blocknum, char *buffer);
+ BlockNumber blocknum, char *buffer, bool skipInit);
extern void smgrwrite(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
extern void smgrwriteback(SMgrRelation reln, ForkNumber forknum,
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index f4aa316..365b02a 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -228,4 +228,8 @@ extern void PrepareToInvalidateCacheTuple(Relation relation,
extern void PrintCatCacheLeakWarning(HeapTuple tuple);
extern void PrintCatCacheListLeakWarning(CatCList *list);
+extern void InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
+
#endif /* CATCACHE_H */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 44ed04d..ae56427 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -277,6 +277,7 @@ typedef struct StdRdOptions
int parallel_workers; /* max number of parallel workers */
bool vacuum_index_cleanup; /* enables index vacuuming and cleanup */
bool vacuum_truncate; /* enables vacuum to truncate a relation */
+ bool on_commit_delete_rows; /* global temp table */
} StdRdOptions;
#define HEAP_MIN_FILLFACTOR 10
@@ -332,6 +333,18 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * Relation persistence is either TEMP or SESSION
+ */
+#define IsLocalRelpersistence(relpersistence) \
+ ((relpersistence) == RELPERSISTENCE_TEMP || (relpersistence) == RELPERSISTENCE_SESSION)
+
+/*
+ * Relation is either a global or a local temp table
+ */
+#define RelationHasSessionScope(relation) \
+ IsLocalRelpersistence(((relation)->rd_rel->relpersistence))
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
@@ -340,6 +353,7 @@ typedef enum ViewOptCheckOption
VIEW_OPTION_CHECK_OPTION_CASCADED
} ViewOptCheckOption;
+
/*
* ViewOptions
* Contents of rd_options for views
@@ -535,7 +549,7 @@ typedef struct ViewOptions
* True if relation's pages are stored in local buffers.
*/
#define RelationUsesLocalBuffers(relation) \
- ((relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ RelationHasSessionScope(relation)
/*
* RELATION_IS_LOCAL
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index f27b73d..eaf21d9 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -216,4 +216,8 @@ extern bool RelationSupportsSysCache(Oid relid);
#define ReleaseSysCacheList(x) ReleaseCatCacheList(x)
+
+extern void InsertSysCache(int cacheId,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
#endif /* SYSCACHE_H */
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index a2fa192..ef7aa85 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -88,3 +88,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test, with small changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to temporary child relation
+# of another session, and blocks when scanning the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..ae1adb6
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,247 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2ab2115..7538601 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1349,6 +1349,40 @@ pg_group| SELECT pg_authid.rolname AS groname,
WHERE (pg_auth_members.roleid = pg_authid.oid)) AS grolist
FROM pg_authid
WHERE (NOT pg_authid.rolcanlogin);
+pg_gtt_statistic| SELECT s.starelid,
+ s.staattnum,
+ s.stainherit,
+ s.stanullfrac,
+ s.stawidth,
+ s.stadistinct,
+ s.stakind1,
+ s.stakind2,
+ s.stakind3,
+ s.stakind4,
+ s.stakind5,
+ s.staop1,
+ s.staop2,
+ s.staop3,
+ s.staop4,
+ s.staop5,
+ s.stacoll1,
+ s.stacoll2,
+ s.stacoll3,
+ s.stacoll4,
+ s.stacoll5,
+ s.stanumbers1,
+ s.stanumbers2,
+ s.stanumbers3,
+ s.stanumbers4,
+ s.stanumbers5,
+ s.stavalues1,
+ s.stavalues2,
+ s.stavalues3,
+ s.stavalues4,
+ s.stavalues5
+ FROM pg_class c,
+ LATERAL pg_gtt_statistic_for_relation(c.oid) s(starelid, staattnum, stainherit, stanullfrac, stawidth, stadistinct, stakind1, stakind2, stakind3, stakind4, stakind5, staop1, staop2, staop3, staop4, staop5, stacoll1, stacoll2, stacoll3, stacoll4, stacoll5, stanumbers1, stanumbers2, stanumbers3, stanumbers4, stanumbers5, stavalues1, stavalues2, stavalues3, stavalues4, stavalues5)
+ WHERE (c.relpersistence = 's'::"char");
pg_hba_file_rules| SELECT a.line_number,
a.type,
a.database,
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index d2b17dd..71c8ca4 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index acba391..71abe08 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..3058b9b
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,151 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
Sorry, small typo in the last patch.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
global_private_temp-11.patch (text/x-patch)
diff --git a/contrib/pg_prewarm/pg_prewarm.c b/contrib/pg_prewarm/pg_prewarm.c
index 33e2d28..93059ef 100644
--- a/contrib/pg_prewarm/pg_prewarm.c
+++ b/contrib/pg_prewarm/pg_prewarm.c
@@ -178,7 +178,7 @@ pg_prewarm(PG_FUNCTION_ARGS)
for (block = first_block; block <= last_block; ++block)
{
CHECK_FOR_INTERRUPTS();
- smgrread(rel->rd_smgr, forkNumber, block, blockbuffer.data);
+ smgrread(rel->rd_smgr, forkNumber, block, blockbuffer.data, false);
++blocks_done;
}
}
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 79430d2..39baddc 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -158,6 +158,19 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ /*
+ * For global temp tables only.
+ * Use ShareUpdateExclusiveLock to ensure safety.
+ */
+ {
+ {
+ "on_commit_delete_rows",
+ "global temp table on commit options",
+ RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1486,6 +1499,8 @@ bytea *
default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{
static const relopt_parse_elt tab[] = {
+ {"on_commit_delete_rows", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, on_commit_delete_rows)},
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
@@ -1586,13 +1601,17 @@ build_reloptions(Datum reloptions, bool validate,
bytea *
partitioned_table_reloptions(Datum reloptions, bool validate)
{
+ static const relopt_parse_elt tab[] = {
+ {"on_commit_delete_rows", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, on_commit_delete_rows)}
+ };
/*
* There are no options for partitioned tables yet, but this is able to do
* some validation.
*/
return (bytea *) build_reloptions(reloptions, validate,
RELOPT_KIND_PARTITIONED,
- 0, NULL, 0);
+ sizeof(StdRdOptions), tab, lengthof(tab));
}
/*
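
Since the ON COMMIT action of a global temp table is represented here as an ordinary boolean reloption, the following two statements should be equivalent under this patch (an assumption drawn from the reloption table above; the posted tests only exercise the first form):

CREATE GLOBAL TEMP TABLE t1 (a int) ON COMMIT DELETE ROWS;
CREATE GLOBAL TEMP TABLE t2 (a int) WITH (on_commit_delete_rows = true);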
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 3fa4b76..a86de50 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -670,6 +670,7 @@ heapam_relation_copy_data(Relation rel, const RelFileNode *newrnode)
* init fork of an unlogged relation.
*/
if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ||
(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
forkNum == INIT_FORKNUM))
log_smgrcreate(newrnode, forkNum);
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 7d6acae..7c48e5c 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -396,6 +396,9 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
case RELPERSISTENCE_TEMP:
backend = BackendIdForTempRelations();
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8880586..22ce895 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3707,7 +3707,7 @@ reindex_relation(Oid relid, int flags, int options)
if (flags & REINDEX_REL_FORCE_INDEXES_UNLOGGED)
persistence = RELPERSISTENCE_UNLOGGED;
else if (flags & REINDEX_REL_FORCE_INDEXES_PERMANENT)
- persistence = RELPERSISTENCE_PERMANENT;
+ persistence = rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION ? RELPERSISTENCE_SESSION : RELPERSISTENCE_PERMANENT;
else
persistence = rel->rd_rel->relpersistence;
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index fddfbf1..9747835 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -92,6 +92,10 @@ RelationCreateStorage(RelFileNode rnode, char relpersistence)
backend = InvalidBackendId;
needs_wal = false;
break;
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ needs_wal = false;
+ break;
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
needs_wal = true;
@@ -367,7 +371,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
/* If we got a cancel signal during the copy of the data, quit */
CHECK_FOR_INTERRUPTS();
- smgrread(src, forkNum, blkno, buf.data);
+ smgrread(src, forkNum, blkno, buf.data, false);
if (!PageIsVerified(page, blkno))
ereport(ERROR,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index c9e6060..1f5e52b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1369,7 +1369,15 @@ LANGUAGE INTERNAL
STRICT STABLE PARALLEL SAFE
AS 'jsonb_path_query_first_tz';
+
+--
+-- Statistics for global temporary tables
--
+
+CREATE VIEW pg_gtt_statistic AS
+ SELECT s.* FROM pg_class c, pg_gtt_statistic_for_relation(c.oid) s WHERE c.relpersistence = 's';
+
+
-- The default permissions for functions mean that anyone can execute them.
-- A number of functions shouldn't be executable by just anyone, but rather
-- than use explicit 'superuser()' checks in those functions, we use the GRANT
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index c4420dd..85d8f04 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -40,6 +40,7 @@
#include "commands/vacuum.h"
#include "executor/executor.h"
#include "foreign/fdwapi.h"
+#include "funcapi.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "parser/parse_oper.h"
@@ -103,7 +104,7 @@ static int acquire_inherited_sample_rows(Relation onerel, int elevel,
HeapTuple *rows, int targrows,
double *totalrows, double *totaldeadrows);
static void update_attstats(Oid relid, bool inh,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats, bool is_global_temp);
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
@@ -323,6 +324,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
Oid save_userid;
int save_sec_context;
int save_nestlevel;
+ bool is_global_temp = onerel->rd_rel->relpersistence == RELPERSISTENCE_SESSION;
if (inh)
ereport(elevel,
@@ -586,14 +588,14 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
* pg_statistic for columns we didn't process, we leave them alone.)
*/
update_attstats(RelationGetRelid(onerel), inh,
- attr_cnt, vacattrstats);
+ attr_cnt, vacattrstats, is_global_temp);
for (ind = 0; ind < nindexes; ind++)
{
AnlIndexData *thisdata = &indexdata[ind];
update_attstats(RelationGetRelid(Irel[ind]), false,
- thisdata->attr_cnt, thisdata->vacattrstats);
+ thisdata->attr_cnt, thisdata->vacattrstats, is_global_temp);
}
/*
@@ -1456,7 +1458,7 @@ acquire_inherited_sample_rows(Relation onerel, int elevel,
* by taking a self-exclusive lock on the relation in analyze_rel().
*/
static void
-update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
+update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats, bool is_global_temp)
{
Relation sd;
int attno;
@@ -1558,30 +1560,42 @@ update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
}
}
- /* Is there already a pg_statistic tuple for this attribute? */
- oldtup = SearchSysCache3(STATRELATTINH,
- ObjectIdGetDatum(relid),
- Int16GetDatum(stats->attr->attnum),
- BoolGetDatum(inh));
-
- if (HeapTupleIsValid(oldtup))
+ if (is_global_temp)
{
- /* Yes, replace it */
- stup = heap_modify_tuple(oldtup,
- RelationGetDescr(sd),
- values,
- nulls,
- replaces);
- ReleaseSysCache(oldtup);
- CatalogTupleUpdate(sd, &stup->t_self, stup);
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ InsertSysCache(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh),
+ 0,
+ stup);
}
else
{
- /* No, insert new tuple */
- stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
- CatalogTupleInsert(sd, stup);
- }
+ /* Is there already a pg_statistic tuple for this attribute? */
+ oldtup = SearchSysCache3(STATRELATTINH,
+ ObjectIdGetDatum(relid),
+ Int16GetDatum(stats->attr->attnum),
+ BoolGetDatum(inh));
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ CatalogTupleUpdate(sd, &stup->t_self, stup);
+ }
+ else
+ {
+ /* No, insert new tuple */
+ stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
+ CatalogTupleInsert(sd, stup);
+ }
+ }
heap_freetuple(stup);
}
@@ -2890,3 +2904,72 @@ analyze_mcv_list(int *mcv_counts,
}
return num_mcv;
}
+
+PG_FUNCTION_INFO_V1(pg_gtt_statistic_for_relation);
+
+typedef struct
+{
+ int staattnum;
+ bool stainherit;
+} PgTempStatIteratorCtx;
+
+Datum
+pg_gtt_statistic_for_relation(PG_FUNCTION_ARGS)
+{
+ Oid starelid = PG_GETARG_OID(0);
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ Tuplestorestate *tupstore;
+ MemoryContext per_query_ctx;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ bool stainherit = false;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));
+
+ /* Build tuplestore to hold the result rows */
+ per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+ oldcontext = MemoryContextSwitchTo(per_query_ctx);
+
+ /* Build a tuple descriptor for our result type */
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ do
+ {
+ int staattnum = 0;
+ while (true)
+ {
+ HeapTuple statup = SearchSysCacheCopy3(STATRELATTINH,
+ ObjectIdGetDatum(starelid),
+ Int16GetDatum(++staattnum),
+ BoolGetDatum(stainherit));
+ if (statup != NULL)
+ tuplestore_puttuple(tupstore, statup);
+ else
+ break;
+ }
+ stainherit = !stainherit;
+ } while (stainherit);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ tuplestore_donestoring(tupstore);
+
+ return (Datum) 0;
+}
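For example, per-backend statistics gathered by ANALYZE on a global temp table can then be inspected through the new view (a sketch; names and numbers are mine):

    create global temp table gtt(x int);
    insert into gtt select generate_series(1, 1000);
    analyze gtt;
    select starelid::regclass, staattnum, stanullfrac, stawidth, stadistinct
    from pg_gtt_statistic;
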
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index e9d7a7f..a22a77a 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -391,6 +391,13 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
errmsg("cannot vacuum temporary tables of other sessions")));
}
+ /* clustering global temp tables is not supported yet */
+ if (OldHeap->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("clustering global temporary tables is not supported yet")));
+
+
/*
* Also check for active uses of the relation in the current transaction,
* including open scans and pending AFTER trigger events.
@@ -1399,7 +1406,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
*/
if (newrelpersistence == RELPERSISTENCE_UNLOGGED)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_UNLOGGED;
- else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
+ else if (newrelpersistence != RELPERSISTENCE_TEMP)
reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
/* Report that we are now reindexing relations */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 6aab73b..bc3c986 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -94,7 +94,7 @@ static HTAB *seqhashtab = NULL; /* hash table for SeqTable items */
*/
static SeqTableData *last_used_seq = NULL;
-static void fill_seq_with_data(Relation rel, HeapTuple tuple);
+static void fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf);
static Relation lock_and_open_sequence(SeqTable seq);
static void create_seq_hashtable(void);
static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel);
@@ -222,7 +222,7 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
/* now initialize the sequence's data */
tuple = heap_form_tuple(tupDesc, value, null);
- fill_seq_with_data(rel, tuple);
+ fill_seq_with_data(rel, tuple, InvalidBuffer);
/* process OWNED BY if given */
if (owned_by)
@@ -327,7 +327,7 @@ ResetSequence(Oid seq_relid)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seq_rel, tuple);
+ fill_seq_with_data(seq_rel, tuple, InvalidBuffer);
/* Clear local cache so that we don't think we have cached numbers */
/* Note that we do not change the currval() state */
@@ -340,18 +340,21 @@ ResetSequence(Oid seq_relid)
* Initialize a sequence's relation with the specified tuple as content
*/
static void
-fill_seq_with_data(Relation rel, HeapTuple tuple)
+fill_seq_with_data(Relation rel, HeapTuple tuple, Buffer buf)
{
- Buffer buf;
Page page;
sequence_magic *sm;
OffsetNumber offnum;
+ bool lockBuffer = false;
/* Initialize first page of relation with special magic number */
- buf = ReadBuffer(rel, P_NEW);
- Assert(BufferGetBlockNumber(buf) == 0);
-
+ if (buf == InvalidBuffer)
+ {
+ buf = ReadBuffer(rel, P_NEW);
+ Assert(BufferGetBlockNumber(buf) == 0);
+ lockBuffer = true;
+ }
page = BufferGetPage(buf);
PageInit(page, BufferGetPageSize(buf), sizeof(sequence_magic));
@@ -360,7 +363,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
/* Now insert sequence tuple */
- LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+ if (lockBuffer)
+ LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
/*
* Since VACUUM does not process sequences, we have to force the tuple to
@@ -410,7 +414,8 @@ fill_seq_with_data(Relation rel, HeapTuple tuple)
END_CRIT_SECTION();
- UnlockReleaseBuffer(buf);
+ if (lockBuffer)
+ UnlockReleaseBuffer(buf);
}
/*
@@ -502,7 +507,7 @@ AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
/*
* Insert the modified tuple into the new storage file.
*/
- fill_seq_with_data(seqrel, newdatatuple);
+ fill_seq_with_data(seqrel, newdatatuple, InvalidBuffer);
}
/* process OWNED BY if given */
@@ -1178,6 +1183,17 @@ read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
page = BufferGetPage(*buf);
+ if (GlobalTempRelationPageIsNotInitialized(rel, page))
+ {
+ /* Initialize sequence for global temporary tables */
+ Datum value[SEQ_COL_LASTCOL] = {0};
+ bool null[SEQ_COL_LASTCOL] = {false};
+ HeapTuple tuple;
+ value[SEQ_COL_LASTVAL-1] = Int64GetDatumFast(1); /* start sequence with 1 */
+ tuple = heap_form_tuple(RelationGetDescr(rel), value, null);
+ fill_seq_with_data(rel, tuple, *buf);
+ }
+
sm = (sequence_magic *) PageGetSpecialPointer(page);
if (sm->magic != SEQ_MAGIC)
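Together with the parse_utilcmd.c change below, this lazy initialization gives each backend its own private sequence state; the expected behavior (a sketch, session labels are mine):

    -- session 1
    create global temp table gtt_seq(id serial, v text);
    insert into gtt_seq(v) values ('a'), ('b');   -- ids 1, 2
    -- session 2
    insert into gtt_seq(v) values ('c');          -- id 1 again: sequence data is per-session
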
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f599393..1e4a52e 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -12,6 +12,9 @@
*
*-------------------------------------------------------------------------
*/
+#include <sys/stat.h>
+#include <unistd.h>
+
#include "postgres.h"
#include "access/attmap.h"
@@ -555,6 +558,23 @@ static List *GetParentedForeignKeyRefs(Relation partition);
static void ATDetachCheckNoForeignKeyRefs(Relation partition);
+static bool
+has_oncommit_option(List *options)
+{
+ ListCell *listptr;
+
+ foreach(listptr, options)
+ {
+ DefElem *def = (DefElem *) lfirst(listptr);
+
+ if (pg_strcasecmp(def->defname, "on_commit_delete_rows") == 0)
+ return true;
+ }
+
+ return false;
+}
+
+
/* ----------------------------------------------------------------
* DefineRelation
* Creates a new relation.
@@ -598,6 +618,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
LOCKMODE parentLockmode;
const char *accessMethod = NULL;
Oid accessMethodId = InvalidOid;
+ bool has_oncommit_clause = false;
/*
* Truncate relname to appropriate length (probably a waste of time, as
@@ -609,7 +630,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
* Check consistency of arguments
*/
if (stmt->oncommit != ONCOMMIT_NOOP
- && stmt->relation->relpersistence != RELPERSISTENCE_TEMP)
+ && !IsLocalRelpersistence(stmt->relation->relpersistence))
ereport(ERROR,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("ON COMMIT can only be used on temporary tables")));
@@ -635,17 +656,6 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
RangeVarGetAndCheckCreationNamespace(stmt->relation, NoLock, NULL);
/*
- * Security check: disallow creating temp tables from security-restricted
- * code. This is needed because calling code might not expect untrusted
- * tables to appear in pg_temp at the front of its search path.
- */
- if (stmt->relation->relpersistence == RELPERSISTENCE_TEMP
- && InSecurityRestrictedOperation())
- ereport(ERROR,
- (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- errmsg("cannot create temporary table within security-restricted operation")));
-
- /*
* Determine the lockmode to use when scanning parents. A self-exclusive
* lock is needed here.
*
@@ -740,6 +750,38 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
/*
* Parse and validate reloptions, if any.
*/
+ /* global temp table */
+ has_oncommit_clause = has_oncommit_option(stmt->options);
+ if (stmt->relation->relpersistence == RELPERSISTENCE_SESSION)
+ {
+ if (has_oncommit_clause)
+ {
+ if (stmt->oncommit != ONCOMMIT_NOOP)
+ elog(ERROR, "can not defeine global temp table with on commit and with clause at same time");
+ }
+ else if (stmt->oncommit != ONCOMMIT_NOOP)
+ {
+ DefElem *opt = makeNode(DefElem);
+
+ opt->type = T_DefElem;
+ opt->defnamespace = NULL;
+ opt->defname = "on_commit_delete_rows";
+ opt->defaction = DEFELEM_UNSPEC;
+
+ /* use reloptions to remember on commit clause */
+ if (stmt->oncommit == ONCOMMIT_DELETE_ROWS)
+ opt->arg = (Node *)makeString("true");
+ else if (stmt->oncommit == ONCOMMIT_PRESERVE_ROWS)
+ opt->arg = (Node *)makeString("false");
+ else
+ elog(ERROR, "global temp table not support on commit drop clause");
+
+ stmt->options = lappend(stmt->options, opt);
+ }
+ }
+ else if (has_oncommit_clause)
+ elog(ERROR, "regular table cannot specifie on_commit_delete_rows");
+
reloptions = transformRelOptions((Datum) 0, stmt->options, NULL, validnsps,
true, false);
@@ -1824,7 +1866,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
* table or the current physical file to be thrown away anyway.
*/
if (rel->rd_createSubid == mySubid ||
- rel->rd_newRelfilenodeSubid == mySubid)
+ rel->rd_newRelfilenodeSubid == mySubid ||
+ rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
{
/* Immediate, non-rollbackable truncation is OK */
heap_truncate_one_rel(rel);
@@ -3511,6 +3554,26 @@ AlterTableLookupRelation(AlterTableStmt *stmt, LOCKMODE lockmode)
(void *) stmt);
}
+
+static bool
+CheckGlobalTempTableNotInUse(Relation rel)
+{
+ int id;
+ for (id = 1; id <= MaxBackends; id++)
+ {
+ if (id != MyBackendId)
+ {
+ struct stat fst;
+ char* path = relpathbackend(rel->rd_node, id, MAIN_FORKNUM);
+ int rc = stat(path, &fst);
+ pfree(path);
+ if (rc == 0 && fst.st_size != 0)
+ return false;
+ }
+ }
+ return true;
+}
+
/*
* AlterTable
* Execute ALTER TABLE, which can be a list of subcommands
@@ -3568,6 +3631,9 @@ AlterTable(AlterTableStmt *stmt, LOCKMODE lockmode,
rel = relation_open(context->relid, NoLock);
CheckTableNotInUse(rel, "ALTER TABLE");
+ if (rel->rd_rel->relpersistence == RELPERSISTENCE_SESSION
+ && !CheckGlobalTempTableNotInUse(rel))
+ elog(ERROR, "Global temp table used by active backends can not be altered");
ATController(stmt, rel, stmt->cmds, stmt->relation->inh, lockmode, context);
}
@@ -8169,6 +8235,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
errmsg("constraints on unlogged tables may reference only permanent or unlogged tables")));
break;
+ case RELPERSISTENCE_SESSION:
+ if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_SESSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("constraints on session tables may reference only session tables")));
+ break;
case RELPERSISTENCE_TEMP:
if (pkrel->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
ereport(ERROR,
@@ -14629,6 +14701,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
RelationGetRelationName(rel)),
errtable(rel)));
break;
+ case RELPERSISTENCE_SESSION:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot change logged status of session table \"%s\"",
+ RelationGetRelationName(rel)),
+ errtable(rel)));
+ break;
case RELPERSISTENCE_PERMANENT:
if (toLogged)
/* nothing to do */
@@ -15116,14 +15195,7 @@ PreCommit_on_commit_actions(void)
/* Do nothing (there shouldn't be such entries, actually) */
break;
case ONCOMMIT_DELETE_ROWS:
-
- /*
- * If this transaction hasn't accessed any temporary
- * relations, we can skip truncating ON COMMIT DELETE ROWS
- * tables, as they must still be empty.
- */
- if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
- oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
+ oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
break;
case ONCOMMIT_DROP:
oids_to_drop = lappend_oid(oids_to_drop, oc->relid);
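The intended effect of CheckGlobalTempTableNotInUse (a sketch; names are mine): ALTER is refused while another backend has already materialized its private data for the table:

    -- session 1
    create global temp table gtt_alter(x int);
    insert into gtt_alter values (1);
    -- session 2
    alter table gtt_alter add column y int;  -- expected to fail while session 1 holds data
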
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 59d1a31..fe3283a 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -361,7 +361,14 @@ ExecInsert(ModifyTableState *mtstate,
*/
resultRelInfo = estate->es_result_relation_info;
resultRelationDesc = resultRelInfo->ri_RelationDesc;
-
+ if (resultRelationDesc->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ {
+ int i;
+ for (i = 0; i < resultRelInfo->ri_NumIndices; i++)
+ {
+ InitGTTIndexes(resultRelInfo->ri_IndexRelationDescs[i]);
+ }
+ }
/*
* BEFORE ROW INSERT Triggers.
*
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 8286d9c..7a12635 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -48,6 +48,7 @@
#include "partitioning/partprune.h"
#include "rewrite/rewriteManip.h"
#include "utils/lsyscache.h"
+#include "utils/rel.h"
/* results of subquery_is_pushdown_safe */
@@ -618,7 +619,7 @@ set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
* the rest of the necessary infrastructure right now anyway. So
* for now, bail out if we see a temporary table.
*/
- if (get_rel_persistence(rte->relid) == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(get_rel_persistence(rte->relid)))
return;
/*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d6f2153..fd4e713 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6312,7 +6312,7 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
* Furthermore, any index predicate or index expressions must be parallel
* safe.
*/
- if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
+ if (RelationHasSessionScope(heap) ||
!is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index)) ||
!is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index)))
{
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index d82fc5a..619ed96 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -80,6 +80,15 @@ static void set_baserel_partition_key_exprs(Relation relation,
static void set_baserel_partition_constraint(Relation relation,
RelOptInfo *rel);
+static bool
+is_index_valid(Relation index)
+{
+ if (!index->rd_index->indisvalid)
+ return false;
+ if (index->rd_rel->relpersistence == RELPERSISTENCE_SESSION)
+ InitGTTIndexes(index);
+ return true;
+}
/*
* get_relation_info -
@@ -205,7 +214,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
* still needs to insert into "invalid" indexes, if they're marked
* indisready.
*/
- if (!index->indisvalid)
+ if (!is_index_valid(indexRelation))
{
index_close(indexRelation, NoLock);
continue;
@@ -704,7 +713,7 @@ infer_arbiter_indexes(PlannerInfo *root)
idxRel = index_open(indexoid, rte->rellockmode);
idxForm = idxRel->rd_index;
- if (!idxForm->indisvalid)
+ if (!is_index_valid(idxRel))
goto next;
/*
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 1b0edf5..787de83 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3288,20 +3288,11 @@ OptTemp: TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| TEMP { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMPORARY { $$ = RELPERSISTENCE_TEMP; }
| LOCAL TEMP { $$ = RELPERSISTENCE_TEMP; }
- | GLOBAL TEMPORARY
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
- | GLOBAL TEMP
- {
- ereport(WARNING,
- (errmsg("GLOBAL is deprecated in temporary table creation"),
- parser_errposition(@1)));
- $$ = RELPERSISTENCE_TEMP;
- }
+ | GLOBAL TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | GLOBAL TEMP { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMPORARY { $$ = RELPERSISTENCE_SESSION; }
+ | SESSION TEMP { $$ = RELPERSISTENCE_SESSION; }
| UNLOGGED { $$ = RELPERSISTENCE_UNLOGGED; }
| /*EMPTY*/ { $$ = RELPERSISTENCE_PERMANENT; }
;
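So all of the following spellings are expected to produce a session table (illustrative):

    create global temp table gt1(x int);
    create global temporary table gt2(x int);
    create session table gt3(x int);
    create session temp table gt4(x int);
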
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index ee2d2b5..e7f3a20 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -438,6 +438,14 @@ generateSerialExtraStmts(CreateStmtContext *cxt, ColumnDef *column,
seqstmt->options = seqoptions;
/*
+ * Why should we not always use the persistence of the parent table?
+ * Because, although unlogged sequences are prohibited,
+ * unlogged tables with SERIAL fields are accepted!
+ */
+ if (cxt->relation->relpersistence != RELPERSISTENCE_UNLOGGED)
+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;
+
+ /*
* If a sequence data type was specified, add it to the options. Prepend
* to the list rather than append; in case a user supplied their own AS
* clause, the "redundant options" error will point to their occurrence,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 6d1f28c..4074344 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2152,7 +2152,7 @@ do_autovacuum(void)
/*
* We cannot safely process other backends' temp tables, so skip 'em.
*/
- if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+ if (IsLocalRelpersistence(classForm->relpersistence))
continue;
relid = classForm->oid;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index aba3960..2d88cc9 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -33,9 +33,11 @@
#include <sys/file.h>
#include <unistd.h>
+#include "access/amapi.h"
#include "access/tableam.h"
#include "access/xlog.h"
#include "catalog/catalog.h"
+#include "catalog/index.h"
#include "catalog/storage.h"
#include "executor/instrument.h"
#include "lib/binaryheap.h"
@@ -429,7 +431,7 @@ ForgetPrivateRefCountEntry(PrivateRefCountEntry *ref)
)
-static Buffer ReadBuffer_common(SMgrRelation reln, char relpersistence,
+static Buffer ReadBuffer_common(SMgrRelation reln, char relpersistence, char relkind,
ForkNumber forkNum, BlockNumber blockNum,
ReadBufferMode mode, BufferAccessStrategy strategy,
bool *hit);
@@ -663,7 +665,7 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
* miss.
*/
pgstat_count_buffer_read(reln);
- buf = ReadBuffer_common(reln->rd_smgr, reln->rd_rel->relpersistence,
+ buf = ReadBuffer_common(reln->rd_smgr, reln->rd_rel->relpersistence, reln->rd_rel->relkind,
forkNum, blockNum, mode, strategy, &hit);
if (hit)
pgstat_count_buffer_hit(reln);
@@ -691,7 +693,7 @@ ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
Assert(InRecovery);
- return ReadBuffer_common(smgr, RELPERSISTENCE_PERMANENT, forkNum, blockNum,
+ return ReadBuffer_common(smgr, RELPERSISTENCE_PERMANENT, RELKIND_RELATION, forkNum, blockNum,
mode, strategy, &hit);
}
@@ -702,7 +704,7 @@ ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
* *hit is set to true if the request was satisfied from shared buffer cache.
*/
static Buffer
-ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
+ReadBuffer_common(SMgrRelation smgr, char relpersistence, char relkind, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy, bool *hit)
{
@@ -895,7 +897,8 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
if (track_io_timing)
INSTR_TIME_SET_CURRENT(io_start);
- smgrread(smgr, forkNum, blockNum, (char *) bufBlock);
+ smgrread(smgr, forkNum, blockNum, (char *) bufBlock,
+ relkind == RELKIND_INDEX);
if (track_io_timing)
{
@@ -2943,7 +2946,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber *forkNum,
/* If it's a local relation, it's localbuf.c's problem. */
if (RelFileNodeBackendIsTemp(rnode))
{
- if (rnode.backend == MyBackendId)
+ if (GetRelationBackendId(rnode.backend) == MyBackendId)
{
for (j = 0; j < nforks; j++)
DropRelFileNodeLocalBuffers(rnode.node, forkNum[j],
@@ -4423,3 +4426,19 @@ TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
(errcode(ERRCODE_SNAPSHOT_TOO_OLD),
errmsg("snapshot too old")));
}
+
+void InitGTTIndexes(Relation index)
+{
+ Buffer metapage = ReadBuffer(index, 0);
+ bool isNew = PageIsNew(BufferGetPage(metapage));
+ Assert(index->rd_rel->relpersistence == RELPERSISTENCE_SESSION);
+ ReleaseBuffer(metapage);
+ if (isNew)
+ {
+ Relation heap;
+ DropRelFileNodeAllLocalBuffers(index->rd_smgr->smgr_rnode.node);
+ heap = RelationIdGetRelation(index->rd_index->indrelid);
+ index->rd_indam->ambuild(heap, index, BuildIndexInfo(index));
+ RelationClose(heap);
+ }
+}
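A sketch of the behavior this enables (names are mine): an index created on a global temp table in one session is usable in another session, where the first access rebuilds it from that session's private heap:

    -- session 1
    create global temp table gtt_i(x int);
    create index on gtt_i(x);
    -- session 2
    insert into gtt_i select generate_series(1, 10000);
    select * from gtt_i where x = 42;  -- index is built on first use in this session
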
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index c5b771c..4400b21 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -27,12 +27,14 @@
#include "access/xlog.h"
#include "access/xlogutils.h"
+#include "commands/tablecmds.h"
#include "commands/tablespace.h"
#include "miscadmin.h"
#include "pg_trace.h"
#include "pgstat.h"
#include "postmaster/bgwriter.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "storage/fd.h"
#include "storage/md.h"
#include "storage/relfilenode.h"
@@ -40,6 +42,7 @@
#include "storage/sync.h"
#include "utils/hsearch.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
/*
* The magnetic disk storage manager keeps track of open file
@@ -87,6 +90,19 @@ typedef struct _MdfdVec
static MemoryContext MdCxt; /* context for all MdfdVec objects */
+/*
+ * Structure used to collect information about session relations created by this backend.
+ * Data of these relations should be deleted on backend exit.
+ */
+typedef struct SessionRelation
+{
+ RelFileNodeBackend rnode;
+ ForkNumber forknum;
+ struct SessionRelation* next;
+} SessionRelation;
+
+
+static SessionRelation* SessionRelations;
/* Populate a file tag describing an md.c segment file. */
#define INIT_MD_FILETAG(a,xx_rnode,xx_forknum,xx_segno) \
@@ -152,6 +168,60 @@ mdinit(void)
ALLOCSET_DEFAULT_SIZES);
}
+
+/*
+ * Delete the files of all session relations created by this backend.
+ * This function is called on backend exit.
+ */
+static void
+TruncateSessionRelations(int code, Datum arg)
+{
+ SessionRelation* rel;
+ for (rel = SessionRelations; rel != NULL; rel = rel->next)
+ {
+ /* Delete relation files */
+ mdunlink(rel->rnode, rel->forknum, false);
+ }
+}
+
+/*
+ * Maintain information about session relations accessed by this backend.
+ * This list is needed to perform cleanup on backend exit.
+ * A session relation is linked into this list when it is created, or opened while its file doesn't yet exist.
+ * This procedure guarantees that each relation is linked into the list only once.
+ */
+static void
+RegisterSessionRelation(SMgrRelation reln, ForkNumber forknum)
+{
+ SessionRelation* rel = (SessionRelation*)MemoryContextAlloc(TopMemoryContext, sizeof(SessionRelation));
+
+ /*
+ * Perform session relation cleanup on backend exit. We are using shared memory hook, because
+ * cleanup should be performed before backend is disconnected from shared memory.
+ */
+ if (SessionRelations == NULL)
+ on_shmem_exit(TruncateSessionRelations, 0);
+
+ rel->rnode = reln->smgr_rnode;
+ rel->forknum = forknum;
+ rel->next = SessionRelations;
+ SessionRelations = rel;
+}
+
+static void
+RegisterOnCommitAction(SMgrRelation reln, ForkNumber forknum)
+{
+ if (reln->smgr_owner && forknum == MAIN_FORKNUM)
+ {
+ Relation rel = (Relation)((char*)reln->smgr_owner - offsetof(RelationData, rd_smgr));
+ if (rel->rd_options
+ && ((StdRdOptions *)rel->rd_options)->on_commit_delete_rows)
+ {
+ register_on_commit_action(rel->rd_id, ONCOMMIT_DELETE_ROWS);
+ }
+ }
+}
+
/*
* mdexists() -- Does the physical file exist?
*
@@ -218,6 +288,8 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
errmsg("could not create file \"%s\": %m", path)));
}
}
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ RegisterSessionRelation(reln, forkNum);
pfree(path);
@@ -465,6 +537,21 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
if (fd < 0)
{
+ /*
+ * In case of session relation access, this backend may not yet have files for the relation.
+ * If so, create the file and register the session relation for truncation on backend exit.
+ */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
+ fd = PathNameOpenFile(path, O_RDWR | PG_BINARY | O_CREAT);
+ if (fd >= 0)
+ {
+ RegisterSessionRelation(reln, forknum);
+ if (!(behavior & EXTENSION_RETURN_NULL))
+ RegisterOnCommitAction(reln, forknum);
+ goto NewSegment;
+ }
+ }
if ((behavior & EXTENSION_RETURN_NULL) &&
FILE_POSSIBLY_DELETED(errno))
{
@@ -476,6 +563,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
errmsg("could not open file \"%s\": %m", path)));
}
+ NewSegment:
pfree(path);
_fdvec_resize(reln, forknum, 1);
@@ -599,7 +687,7 @@ mdwriteback(SMgrRelation reln, ForkNumber forknum,
*/
void
mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
- char *buffer)
+ char *buffer, bool skipInit)
{
off_t seekpos;
int nbytes;
@@ -644,8 +732,13 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
* complaining. This allows, for example, the case of trying to
* update a block that was later truncated away.
*/
- if (zero_damaged_pages || InRecovery)
+ if (zero_damaged_pages || InRecovery || RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode))
+ {
MemSet(buffer, 0, BLCKSZ);
+ /* In case of a session relation we need to write the zeroed page to provide a correct result for subsequent mdnblocks */
+ if (RelFileNodeBackendIsGlobalTemp(reln->smgr_rnode) && !skipInit)
+ mdwrite(reln, forknum, blocknum, buffer, true);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
@@ -735,7 +828,8 @@ mdnblocks(SMgrRelation reln, ForkNumber forknum)
BlockNumber segno = 0;
/* mdopen has opened the first segment */
- Assert(reln->md_num_open_segs[forknum] > 0);
+ if (reln->md_num_open_segs[forknum] == 0)
+ return 0;
/*
* Start from the last open segments, to avoid redundant seeks. We have
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index 360b5bf..a7b491b 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -52,7 +52,7 @@ typedef struct f_smgr
void (*smgr_prefetch) (SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum);
void (*smgr_read) (SMgrRelation reln, ForkNumber forknum,
- BlockNumber blocknum, char *buffer);
+ BlockNumber blocknum, char *buffer, bool skipInit);
void (*smgr_write) (SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
void (*smgr_writeback) (SMgrRelation reln, ForkNumber forknum,
@@ -506,9 +506,9 @@ smgrprefetch(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum)
*/
void
smgrread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
- char *buffer)
+ char *buffer, bool skipInit)
{
- smgrsw[reln->smgr_which].smgr_read(reln, forknum, blocknum, buffer);
+ smgrsw[reln->smgr_which].smgr_read(reln, forknum, blocknum, buffer, skipInit);
}
/*
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 8406644..0416549 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -994,6 +994,9 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
/* Determine owning backend. */
switch (relform->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ backend = BackendIdForSessionRelations();
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
backend = InvalidBackendId;
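So pg_relation_filepath reports a backend-private path for session tables, e.g. (a sketch; database oid, backend id and relfilenode are illustrative):

    create session table gtt_path(x int);
    select pg_relation_filepath('gtt_path');
    -- base/13593/t3_16384
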
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 64776e3..4996d88 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1191,6 +1191,110 @@ SearchCatCache4(CatCache *cache,
return SearchCatCacheInternal(cache, 4, v1, v2, v3, v4);
}
+
+void InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple)
+{
+ Datum arguments[CATCACHE_MAXKEYS];
+ uint32 hashValue;
+ Index hashIndex;
+ CatCTup *ct;
+ dlist_iter iter;
+ dlist_head *bucket;
+ int nkeys = cache->cc_nkeys;
+ MemoryContext oldcxt;
+
+ /*
+ * one-time startup overhead for each cache
+ */
+ if (unlikely(cache->cc_tupdesc == NULL))
+ CatalogCacheInitializeCache(cache);
+
+ /* Initialize local parameter array */
+ arguments[0] = v1;
+ arguments[1] = v2;
+ arguments[2] = v3;
+ arguments[3] = v4;
+ /*
+ * find the hash bucket in which to look for the tuple
+ */
+ hashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
+ hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);
+
+ /*
+ * scan the hash bucket until we find a match or exhaust our tuples
+ *
+ * Note: it's okay to use dlist_foreach here, even though we modify the
+ * dlist within the loop, because we don't continue the loop afterwards.
+ */
+ bucket = &cache->cc_bucket[hashIndex];
+ dlist_foreach(iter, bucket)
+ {
+ ct = dlist_container(CatCTup, cache_elem, iter.cur);
+
+ if (ct->dead)
+ continue; /* ignore dead entries */
+
+ if (ct->hash_value != hashValue)
+ continue; /* quickly skip entry if wrong hash val */
+
+ if (!CatalogCacheCompareTuple(cache, nkeys, ct->keys, arguments))
+ continue;
+
+ /*
+ * If an entry with the same keys already exists and the new tuple
+ * has the same length, overwrite its contents in place. Otherwise
+ * remove the stale entry and insert a fresh one below.
+ */
+ if (ct->tuple.t_len == tuple->t_len)
+ {
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ return;
+ }
+ dlist_delete(&ct->cache_elem);
+ pfree(ct);
+ cache->cc_ntup -= 1;
+ CacheHdr->ch_ntup -= 1;
+ break;
+ }
+ /* Allocate memory for CatCTup and the cached tuple in one go */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+
+ ct = (CatCTup *) palloc(sizeof(CatCTup) +
+ MAXIMUM_ALIGNOF + tuple->t_len);
+ ct->tuple.t_len = tuple->t_len;
+ ct->tuple.t_self = tuple->t_self;
+ ct->tuple.t_tableOid = tuple->t_tableOid;
+ ct->tuple.t_data = (HeapTupleHeader)
+ MAXALIGN(((char *) ct) + sizeof(CatCTup));
+ /* copy tuple contents */
+ memcpy((char *) ct->tuple.t_data,
+ (const char *) tuple->t_data,
+ tuple->t_len);
+ ct->ct_magic = CT_MAGIC;
+ ct->my_cache = cache;
+ ct->c_list = NULL;
+ ct->refcount = 1; /* pinned*/
+ ct->dead = false;
+ ct->negative = false;
+ ct->hash_value = hashValue;
+ dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
+ memcpy(ct->keys, arguments, nkeys*sizeof(Datum));
+
+ cache->cc_ntup++;
+ CacheHdr->ch_ntup++;
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * If the hash table has become too full, enlarge the buckets array. Quite
+ * arbitrarily, we enlarge when fill factor > 2.
+ */
+ if (cache->cc_ntup > cache->cc_nbuckets * 2)
+ RehashCatCache(cache);
+}
+
/*
* Work-horse for SearchCatCache/SearchCatCacheN.
*/
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index df025a5..dd0b1ff 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1092,6 +1092,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
switch (relation->rd_rel->relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ relation->rd_backend = BackendIdForSessionRelations();
+ relation->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
relation->rd_backend = InvalidBackendId;
@@ -3303,6 +3307,10 @@ RelationBuildLocalRelation(const char *relname,
rel->rd_rel->relpersistence = relpersistence;
switch (relpersistence)
{
+ case RELPERSISTENCE_SESSION:
+ rel->rd_backend = BackendIdForSessionRelations();
+ rel->rd_islocaltemp = false;
+ break;
case RELPERSISTENCE_UNLOGGED:
case RELPERSISTENCE_PERMANENT:
rel->rd_backend = InvalidBackendId;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 53d9ddf..f263b83 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -1156,6 +1156,16 @@ SearchSysCache4(int cacheId,
return SearchCatCache4(SysCache[cacheId], key1, key2, key3, key4);
}
+void
+InsertSysCache(int cacheId,
+ Datum key1, Datum key2, Datum key3, Datum key4,
+ HeapTuple value)
+{
+ Assert(cacheId >= 0 && cacheId < SysCacheSize &&
+ PointerIsValid(SysCache[cacheId]));
+ InsertCatCache(SysCache[cacheId], key1, key2, key3, key4, value);
+}
+
/*
* ReleaseSysCache
* Release previously grabbed reference count on a tuple
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index b7eee3d..afe22b2 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -18,6 +18,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_statistic_d.h"
#include "funcapi.h"
#include "nodes/nodeFuncs.h"
#include "parser/parse_coerce.h"
@@ -30,6 +31,13 @@
#include "utils/syscache.h"
#include "utils/typcache.h"
+/*
+ * TODO: find a less ugly way to declare a core function returning pg_statistic.
+ * OID of pg_gtt_statistic_for_relation. This function has to be handled in a special way because it returns
+ * a set of pg_statistic rows containing attributes of anyarray type. The type of these attributes cannot be
+ * deduced from the input parameters, which prevents using a tuple descriptor in this case.
+ */
+#define GttStatisticFunctionId 3434
static void shutdown_MultiFuncCall(Datum arg);
static TypeFuncClass internal_get_result_type(Oid funcid,
@@ -341,7 +349,8 @@ internal_get_result_type(Oid funcid,
if (resolve_polymorphic_tupdesc(tupdesc,
&procform->proargtypes,
- call_expr))
+ call_expr) ||
+ funcid == GttStatisticFunctionId)
{
if (tupdesc->tdtypeid == RECORDOID &&
tupdesc->tdtypmod < 0)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index ec3e2c6..4c15822 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -15635,8 +15635,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
tbinfo->dobj.catId.oid, false);
appendPQExpBuffer(q, "CREATE %s%s %s",
- tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
- "UNLOGGED " : "",
+ tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ? "UNLOGGED "
+ : tbinfo->relpersistence == RELPERSISTENCE_SESSION ? "SESSION " : "",
reltypename,
qualrelname);
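With this change pg_dump is expected to emit the SESSION keyword for such tables (illustrative output, table name is mine):

    CREATE SESSION TABLE public.gtt_dump (
        x integer
    );
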
diff --git a/src/common/relpath.c b/src/common/relpath.c
index ad733d1..be38d17 100644
--- a/src/common/relpath.c
+++ b/src/common/relpath.c
@@ -169,7 +169,18 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
}
else
{
- if (forkNumber != MAIN_FORKNUM)
+ /*
+ * Session relations are distinguished from local temp relations by adding
+ * SessionRelFirstBackendId offset to backendId.
+ * There is no need to separate them at the file system level, so just subtract SessionRelFirstBackendId
+ * to avoid too long file names.
+ * Segments of session relations have the same prefix (t%d_) as local temporary relations
+ * to make it possible to clean them up in the same way as local temporary relation files.
+ */
+ if (backendId >= SessionRelFirstBackendId)
+ backendId -= SessionRelFirstBackendId;
+
+ if (forkNumber != MAIN_FORKNUM)
path = psprintf("base/%u/t%d_%u_%s",
dbNode, backendId, relNode,
forkNames[forkNumber]);
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index a12fc1f..89c3645 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -165,6 +165,7 @@ typedef FormData_pg_class *Form_pg_class;
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
#define RELPERSISTENCE_TEMP 't' /* temporary table */
+#define RELPERSISTENCE_SESSION 's' /* session table */
/* default selection for replica identity (primary key or nothing) */
#define REPLICA_IDENTITY_DEFAULT 'd'
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2228256..6757491 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5503,7 +5503,14 @@
proname => 'pg_stat_get_xact_function_self_time', provolatile => 'v',
proparallel => 'r', prorettype => 'float8', proargtypes => 'oid',
prosrc => 'pg_stat_get_xact_function_self_time' },
-
+{ oid => '3434',
+ descr => 'show local statistics for global temp table',
+ proname => 'pg_gtt_statistic_for_relation', provolatile => 'v', proparallel => 'u',
+ prorettype => 'record', proretset => 't', prorows => '100', proargtypes => 'oid',
+ proallargtypes => '{oid,oid,int2,bool,float4,int4,float4,int2,int2,int2,int2,int2,oid,oid,oid,oid,oid,oid,oid,oid,oid,oid,_float4,_float4,_float4,_float4,_float4,anyarray,anyarray,anyarray,anyarray,anyarray}',
+ proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{relid,starelid,staattnum,stainherit,stanullfrac,stawidth,stadistinct,stakind1,stakind2,stakind3,stakind4,stakind5,staop1,staop2,staop3,staop4,staop5,stacoll1,stacoll2,stacoll3,stacoll4,stacoll5,stanumbers1,stanumbers2,stanumbers3,stanumbers4,stanumbers5,stavalues1,stavalues2,stavalues3,stavalues4,stavalues5}',
+ prosrc => 'pg_gtt_statistic_for_relation' },
{ oid => '3788',
descr => 'statistics: timestamp of the current statistics snapshot',
proname => 'pg_stat_get_snapshot_timestamp', provolatile => 's',
diff --git a/src/include/storage/backendid.h b/src/include/storage/backendid.h
index 0c776a3..124fc3c 100644
--- a/src/include/storage/backendid.h
+++ b/src/include/storage/backendid.h
@@ -22,6 +22,13 @@ typedef int BackendId; /* unique currently active backend identifier */
#define InvalidBackendId (-1)
+/*
+ * We need to distinguish local and global temporary relations by RelFileNodeBackend.
+ * The least invasive change is to add some special bias value to backend id (since
+ * maximal number of backends is limited by MaxBackends).
+ */
+#define SessionRelFirstBackendId (0x40000000)
+
extern PGDLLIMPORT BackendId MyBackendId; /* backend id of this backend */
/* backend id of our parallel session leader, or InvalidBackendId if none */
@@ -34,4 +41,12 @@ extern PGDLLIMPORT BackendId ParallelMasterBackendId;
#define BackendIdForTempRelations() \
(ParallelMasterBackendId == InvalidBackendId ? MyBackendId : ParallelMasterBackendId)
+
+#define BackendIdForSessionRelations() \
+ (BackendIdForTempRelations() + SessionRelFirstBackendId)
+
+#define IsSessionRelationBackendId(id) ((id) >= SessionRelFirstBackendId)
+
+#define GetRelationBackendId(id) ((id) & ~SessionRelFirstBackendId)
+
#endif /* BACKENDID_H */
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 73c7e9b..467134c 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -231,6 +231,7 @@ extern void TestForOldSnapshot_impl(Snapshot snapshot, Relation relation);
extern BufferAccessStrategy GetAccessStrategy(BufferAccessStrategyType btype);
extern void FreeAccessStrategy(BufferAccessStrategy strategy);
+extern void InitGTTIndexes(Relation rel);
/* inline functions */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 3f88683..7ecef10 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -229,6 +229,13 @@ typedef PageHeaderData *PageHeader;
#define PageIsNew(page) (((PageHeader) (page))->pd_upper == 0)
/*
+ * Page of temporary relation is not initialized
+ */
+#define GlobalTempRelationPageIsNotInitialized(rel, page) \
+ ((rel)->rd_rel->relpersistence == RELPERSISTENCE_SESSION && PageIsNew(page))
+
+
+/*
* PageGetItemId
* Returns an item identifier of a page.
*/
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index ec7630c..5683883 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -31,7 +31,7 @@ extern void mdextend(SMgrRelation reln, ForkNumber forknum,
extern void mdprefetch(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum);
extern void mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
- char *buffer);
+ char *buffer, bool skipInit);
extern void mdwrite(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
extern void mdwriteback(SMgrRelation reln, ForkNumber forknum,
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index 4de9fc1..c45040b 100644
--- a/src/include/storage/relfilenode.h
+++ b/src/include/storage/relfilenode.h
@@ -75,10 +75,25 @@ typedef struct RelFileNodeBackend
BackendId backend;
} RelFileNodeBackend;
+/*
+ * Check whether it is a local or global temporary relation, whose data belongs to only one backend.
+ */
#define RelFileNodeBackendIsTemp(rnode) \
((rnode).backend != InvalidBackendId)
/*
+ * Check whether it is a global temporary relation, whose metadata is shared by all sessions
+ * but whose data is private to the current session.
+ */
+#define RelFileNodeBackendIsGlobalTemp(rnode) IsSessionRelationBackendId((rnode).backend)
+
+/*
+ * Check whether it is a local temporary relation that exists only in this backend.
+ */
+#define RelFileNodeBackendIsLocalTemp(rnode) \
+ (RelFileNodeBackendIsTemp(rnode) && !RelFileNodeBackendIsGlobalTemp(rnode))
+
+/*
* Note: RelFileNodeEquals and RelFileNodeBackendEquals compare relNode first
* since that is most likely to be different in two unequal RelFileNodes. It
* is probably redundant to compare spcNode if the other fields are found equal,
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index 2438221..a4a2da2 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -95,7 +95,7 @@ extern void smgrextend(SMgrRelation reln, ForkNumber forknum,
extern void smgrprefetch(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum);
extern void smgrread(SMgrRelation reln, ForkNumber forknum,
- BlockNumber blocknum, char *buffer);
+ BlockNumber blocknum, char *buffer, bool skipInit);
extern void smgrwrite(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, char *buffer, bool skipFsync);
extern void smgrwriteback(SMgrRelation reln, ForkNumber forknum,
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index f4aa316..365b02a 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -228,4 +228,8 @@ extern void PrepareToInvalidateCacheTuple(Relation relation,
extern void PrintCatCacheLeakWarning(HeapTuple tuple);
extern void PrintCatCacheListLeakWarning(CatCList *list);
+extern void InsertCatCache(CatCache *cache,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
+
#endif /* CATCACHE_H */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 44ed04d..ae56427 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -277,6 +277,7 @@ typedef struct StdRdOptions
int parallel_workers; /* max number of parallel workers */
bool vacuum_index_cleanup; /* enables index vacuuming and cleanup */
bool vacuum_truncate; /* enables vacuum to truncate a relation */
+ bool on_commit_delete_rows; /* global temp table */
} StdRdOptions;
#define HEAP_MIN_FILLFACTOR 10
@@ -332,6 +333,18 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * Relation persistence is either TEMP or SESSION
+ */
+#define IsLocalRelpersistence(relpersistence) \
+ ((relpersistence) == RELPERSISTENCE_TEMP || (relpersistence) == RELPERSISTENCE_SESSION)
+
+/*
+ * Relation is either a global or a local temp table
+ */
+#define RelationHasSessionScope(relation) \
+ IsLocalRelpersistence(((relation)->rd_rel->relpersistence))
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
@@ -340,6 +353,7 @@ typedef enum ViewOptCheckOption
VIEW_OPTION_CHECK_OPTION_CASCADED
} ViewOptCheckOption;
+
/*
* ViewOptions
* Contents of rd_options for views
@@ -535,7 +549,7 @@ typedef struct ViewOptions
* True if relation's pages are stored in local buffers.
*/
#define RelationUsesLocalBuffers(relation) \
- ((relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+ RelationHasSessionScope(relation)
/*
* RELATION_IS_LOCAL
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index f27b73d..eaf21d9 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -216,4 +216,8 @@ extern bool RelationSupportsSysCache(Oid relid);
#define ReleaseSysCacheList(x) ReleaseCatCacheList(x)
+
+extern void InsertSysCache(int cacheId,
+ Datum v1, Datum v2, Datum v3, Datum v4,
+ HeapTuple tuple);
#endif /* SYSCACHE_H */
diff --git a/src/test/isolation/expected/inherit-global-temp.out b/src/test/isolation/expected/inherit-global-temp.out
new file mode 100644
index 0000000..6114f8c
--- /dev/null
+++ b/src/test/isolation/expected/inherit-global-temp.out
@@ -0,0 +1,218 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_update_p s1_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_update_p: UPDATE inh_global_parent SET a = 11 WHERE a = 1;
+step s1_update_c: UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+4
+13
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+4
+13
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+2
+11
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_update_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_update_c: UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+6
+15
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+6
+15
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_delete_p s1_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_delete_p: DELETE FROM inh_global_parent WHERE a = 2;
+step s1_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+3
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_delete_c s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_delete_c: DELETE FROM inh_global_parent WHERE a IN (4, 6);
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+1
+2
+5
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+5
+6
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s2_truncate_p s1_select_p s1_select_c s2_select_p s2_select_c
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s2_truncate_p: TRUNCATE inh_global_parent;
+step s1_select_p: SELECT a FROM inh_global_parent;
+a
+
+3
+4
+step s1_select_c: SELECT a FROM inh_global_temp_child_s1;
+a
+
+3
+4
+step s2_select_p: SELECT a FROM inh_global_parent;
+a
+
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2;
+a
+
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_p s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_p: SELECT a FROM inh_global_parent; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_p: <... completed>
+a
+
+5
+6
+
+starting permutation: s1_insert_p s1_insert_c s2_insert_c s1_begin s1_truncate_p s2_select_c s1_commit
+step s1_insert_p: INSERT INTO inh_global_parent VALUES (1), (2);
+step s1_insert_c: INSERT INTO inh_global_temp_child_s1 VALUES (3), (4);
+step s2_insert_c: INSERT INTO inh_global_temp_child_s2 VALUES (5), (6);
+step s1_begin: BEGIN;
+step s1_truncate_p: TRUNCATE inh_global_parent;
+step s2_select_c: SELECT a FROM inh_global_temp_child_s2; <waiting ...>
+step s1_commit: COMMIT;
+step s2_select_c: <... completed>
+a
+
+5
+6
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index a2fa192..ef7aa85 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -88,3 +88,4 @@ test: plpgsql-toast
test: truncate-conflict
test: serializable-parallel
test: serializable-parallel-2
+test: inherit-global-temp
diff --git a/src/test/isolation/specs/inherit-global-temp.spec b/src/test/isolation/specs/inherit-global-temp.spec
new file mode 100644
index 0000000..5e95dd6
--- /dev/null
+++ b/src/test/isolation/specs/inherit-global-temp.spec
@@ -0,0 +1,73 @@
+# This is a copy of the inherit-temp test with minor changes for global temporary tables.
+#
+
+setup
+{
+ CREATE TABLE inh_global_parent (a int);
+}
+
+teardown
+{
+ DROP TABLE inh_global_parent;
+}
+
+# Session 1 executes actions which act directly on both the parent and
+# its child. Abbreviation "c" is used for queries working on the child
+# and "p" on the parent.
+session "s1"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s1 () INHERITS (inh_global_parent);
+}
+step "s1_begin" { BEGIN; }
+step "s1_truncate_p" { TRUNCATE inh_global_parent; }
+step "s1_select_p" { SELECT a FROM inh_global_parent; }
+step "s1_select_c" { SELECT a FROM inh_global_temp_child_s1; }
+step "s1_insert_p" { INSERT INTO inh_global_parent VALUES (1), (2); }
+step "s1_insert_c" { INSERT INTO inh_global_temp_child_s1 VALUES (3), (4); }
+step "s1_update_p" { UPDATE inh_global_parent SET a = 11 WHERE a = 1; }
+step "s1_update_c" { UPDATE inh_global_parent SET a = 13 WHERE a IN (3, 5); }
+step "s1_delete_p" { DELETE FROM inh_global_parent WHERE a = 2; }
+step "s1_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+step "s1_commit" { COMMIT; }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s1;
+}
+
+# Session 2 executes actions on the parent which act only on the child.
+session "s2"
+setup
+{
+ CREATE GLOBAL TEMPORARY TABLE inh_global_temp_child_s2 () INHERITS (inh_global_parent);
+}
+step "s2_truncate_p" { TRUNCATE inh_global_parent; }
+step "s2_select_p" { SELECT a FROM inh_global_parent; }
+step "s2_select_c" { SELECT a FROM inh_global_temp_child_s2; }
+step "s2_insert_c" { INSERT INTO inh_global_temp_child_s2 VALUES (5), (6); }
+step "s2_update_c" { UPDATE inh_global_parent SET a = 15 WHERE a IN (3, 5); }
+step "s2_delete_c" { DELETE FROM inh_global_parent WHERE a IN (4, 6); }
+teardown
+{
+ DROP TABLE inh_global_temp_child_s2;
+}
+
+# Check INSERT behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check UPDATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_update_p" "s1_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_update_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check DELETE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_delete_p" "s1_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_delete_c" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# Check TRUNCATE behavior across sessions
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s2_truncate_p" "s1_select_p" "s1_select_c" "s2_select_p" "s2_select_c"
+
+# TRUNCATE on a parent tree does not block access to temporary child relation
+# of another session, and blocks when scanning the parent.
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_p" "s1_commit"
+permutation "s1_insert_p" "s1_insert_c" "s2_insert_c" "s1_begin" "s1_truncate_p" "s2_select_c" "s1_commit"
diff --git a/src/test/regress/expected/global_temp.out b/src/test/regress/expected/global_temp.out
new file mode 100644
index 0000000..ae1adb6
--- /dev/null
+++ b/src/test/regress/expected/global_temp.out
@@ -0,0 +1,247 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+-- Test ON COMMIT DELETE ROWS
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+ 2
+(2 rows)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+SELECT * FROM global_temptest;
+ col
+-----
+ 1
+(1 row)
+
+COMMIT;
+SELECT * FROM global_temptest;
+ col
+-----
+(0 rows)
+
+DROP TABLE global_temptest;
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+ col
+-----
+ 1
+(1 row)
+
+SELECT * FROM global_temptest2;
+ col
+-----
+(0 rows)
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+ERROR: unsupported ON COMMIT and foreign key combination
+DETAIL: Table "global_temptest4" references "global_temptest3", but they do not have the same ON COMMIT setting.
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+ a
+---
+(0 rows)
+
+DROP TABLE temp_parted_oncommit;
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+ relname
+-----------------------------------
+ global_temp_parted_oncommit_test
+ global_temp_parted_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_parted_oncommit_test;
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+ a
+---
+ 1
+(1 row)
+
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+ relname
+--------------------------------
+ global_temp_inh_oncommit_test
+ global_temp_inh_oncommit_test1
+(2 rows)
+
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ERROR: cannot inherit from temporary relation "global_temp_table"
+ROLLBACK;
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM global_temp_table;
+ a
+---
+ 1
+(1 row)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+ 2
+(3 rows)
+
+COMMIT;
+SELECT * FROM normal_table;
+ a
+---
+ 0
+ 1
+(2 rows)
+
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 1
+(1 row)
+
+\c
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+(0 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 2
+(1 row)
+
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+ aid | bid
+-----+-----
+ 1 | 1
+ 2 | 2
+ 3 | 3
+(3 rows)
+
+SELECT NEXTVAL( 'test_sequence' );
+ nextval
+---------
+ 3
+(1 row)
+
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2ab2115..7538601 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1349,6 +1349,40 @@ pg_group| SELECT pg_authid.rolname AS groname,
WHERE (pg_auth_members.roleid = pg_authid.oid)) AS grolist
FROM pg_authid
WHERE (NOT pg_authid.rolcanlogin);
+pg_gtt_statistic| SELECT s.starelid,
+ s.staattnum,
+ s.stainherit,
+ s.stanullfrac,
+ s.stawidth,
+ s.stadistinct,
+ s.stakind1,
+ s.stakind2,
+ s.stakind3,
+ s.stakind4,
+ s.stakind5,
+ s.staop1,
+ s.staop2,
+ s.staop3,
+ s.staop4,
+ s.staop5,
+ s.stacoll1,
+ s.stacoll2,
+ s.stacoll3,
+ s.stacoll4,
+ s.stacoll5,
+ s.stanumbers1,
+ s.stanumbers2,
+ s.stanumbers3,
+ s.stanumbers4,
+ s.stanumbers5,
+ s.stavalues1,
+ s.stavalues2,
+ s.stavalues3,
+ s.stavalues4,
+ s.stavalues5
+ FROM pg_class c,
+ LATERAL pg_gtt_statistic_for_relation(c.oid) s(starelid, staattnum, stainherit, stanullfrac, stawidth, stadistinct, stakind1, stakind2, stakind3, stakind4, stakind5, staop1, staop2, staop3, staop4, staop5, stacoll1, stacoll2, stacoll3, stacoll4, stacoll5, stanumbers1, stanumbers2, stanumbers3, stanumbers4, stanumbers5, stavalues1, stavalues2, stavalues3, stavalues4, stavalues5)
+ WHERE (c.relpersistence = 's'::"char");
pg_hba_file_rules| SELECT a.line_number,
a.type,
a.database,
diff --git a/src/test/regress/expected/session_table.out b/src/test/regress/expected/session_table.out
new file mode 100644
index 0000000..1b9b3f4
--- /dev/null
+++ b/src/test/regress/expected/session_table.out
@@ -0,0 +1,64 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+ count
+-------
+ 10000
+(1 row)
+
+\c
+select count(*) from my_private_table;
+ count
+-------
+ 0
+(1 row)
+
+select * from my_private_table where x=10001;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select * from my_private_table where y=10001;
+ x | y
+-------+-------
+ 10001 | 10001
+(1 row)
+
+select count(*) from my_private_table;
+ count
+--------
+ 100000
+(1 row)
+
+\c
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+---+---
+(0 rows)
+
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+ x | y
+---+---
+(0 rows)
+
+select * from my_private_table order by y desc limit 1;
+ x | y
+--------+--------
+ 100000 | 100000
+(1 row)
+
+drop table my_private_table;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index d2b17dd..71c8ca4 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -107,7 +107,7 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
# NB: temp.sql does a reconnect which transiently uses 2 connections,
# so keep this parallel group to at most 19 tests
# ----------
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 temp global_temp session_table domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index acba391..71abe08 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -172,6 +172,8 @@ test: limit
test: plpgsql
test: copy2
test: temp
+test: global_temp
+test: session_table
test: domain
test: rangefuncs
test: prepare
diff --git a/src/test/regress/sql/global_temp.sql b/src/test/regress/sql/global_temp.sql
new file mode 100644
index 0000000..3058b9b
--- /dev/null
+++ b/src/test/regress/sql/global_temp.sql
@@ -0,0 +1,151 @@
+--
+-- GLOBAL TEMP
+-- Test global temp relations
+--
+
+-- Test ON COMMIT DELETE ROWS
+
+CREATE GLOBAL TEMP TABLE global_temptest(col int) ON COMMIT DELETE ROWS;
+
+BEGIN;
+INSERT INTO global_temptest VALUES (1);
+INSERT INTO global_temptest VALUES (2);
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest(col) ON COMMIT DELETE ROWS AS SELECT 1;
+
+SELECT * FROM global_temptest;
+COMMIT;
+
+SELECT * FROM global_temptest;
+
+DROP TABLE global_temptest;
+
+-- Test foreign keys
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest1(col int PRIMARY KEY);
+CREATE GLOBAL TEMP TABLE global_temptest2(col int REFERENCES global_temptest1)
+ ON COMMIT DELETE ROWS;
+INSERT INTO global_temptest1 VALUES (1);
+INSERT INTO global_temptest2 VALUES (1);
+COMMIT;
+SELECT * FROM global_temptest1;
+SELECT * FROM global_temptest2;
+
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temptest3(col int PRIMARY KEY) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temptest4(col int REFERENCES global_temptest3);
+COMMIT;
+
+-- For partitioned temp tables, ON COMMIT actions ignore storage-less
+-- partitioned tables.
+BEGIN;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE temp_parted_oncommit_1
+ PARTITION OF temp_parted_oncommit
+ FOR VALUES IN (1) ON COMMIT DELETE ROWS;
+INSERT INTO temp_parted_oncommit VALUES (1);
+COMMIT;
+-- partitions are emptied by the previous commit
+SELECT * FROM temp_parted_oncommit;
+DROP TABLE temp_parted_oncommit;
+
+-- Using ON COMMIT DELETE on a partitioned table does not remove
+-- all rows if partitions preserve their data.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test (a int)
+ PARTITION BY LIST (a) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_parted_oncommit_test1
+ PARTITION OF global_temp_parted_oncommit_test
+ FOR VALUES IN (1) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_parted_oncommit_test VALUES (1);
+COMMIT;
+-- Data from the remaining partition is still here as its rows are
+-- preserved.
+SELECT * FROM global_temp_parted_oncommit_test;
+-- two relations remain in this case.
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_parted_oncommit_test%';
+DROP TABLE global_temp_parted_oncommit_test;
+
+-- Check dependencies between ON COMMIT actions with inheritance trees.
+-- Data on the parent is removed, and the child goes away.
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_inh_oncommit_test1 ()
+ INHERITS(global_temp_inh_oncommit_test) ON COMMIT PRESERVE ROWS;
+INSERT INTO global_temp_inh_oncommit_test1 VALUES (1);
+INSERT INTO global_temp_inh_oncommit_test VALUES (1);
+COMMIT;
+SELECT * FROM global_temp_inh_oncommit_test;
+-- two relations remain
+SELECT relname FROM pg_class WHERE relname LIKE 'global_temp_inh_oncommit_test%';
+DROP TABLE global_temp_inh_oncommit_test1;
+DROP TABLE global_temp_inh_oncommit_test;
+
+-- Global temp table cannot inherit from temporary relation
+BEGIN;
+CREATE TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE GLOBAL TEMP TABLE global_temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+ROLLBACK;
+
+-- Temp table can inherit from global temporary relation
+BEGIN;
+CREATE GLOBAL TEMP TABLE global_temp_table (a int) ON COMMIT DELETE ROWS;
+CREATE TEMP TABLE temp_table1 ()
+ INHERITS(global_temp_table) ON COMMIT PRESERVE ROWS;
+CREATE TEMP TABLE temp_table2 ()
+ INHERITS(global_temp_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO global_temp_table VALUES (0);
+SELECT * FROM global_temp_table;
+COMMIT;
+SELECT * FROM global_temp_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE global_temp_table;
+
+-- Global temp table can inherit from normal relation
+BEGIN;
+CREATE TABLE normal_table (a int);
+CREATE GLOBAL TEMP TABLE temp_table1 ()
+ INHERITS(normal_table) ON COMMIT PRESERVE ROWS;
+CREATE GLOBAL TEMP TABLE temp_table2 ()
+ INHERITS(normal_table) ON COMMIT DELETE ROWS;
+INSERT INTO temp_table2 VALUES (2);
+INSERT INTO temp_table1 VALUES (1);
+INSERT INTO normal_table VALUES (0);
+SELECT * FROM normal_table;
+COMMIT;
+SELECT * FROM normal_table;
+DROP TABLE temp_table2;
+DROP TABLE temp_table1;
+DROP TABLE normal_table;
+
+-- Check SERIAL and BIGSERIAL pseudo-types
+CREATE GLOBAL TEMP TABLE global_temp_table ( aid BIGSERIAL, bid SERIAL );
+CREATE SEQUENCE test_sequence;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+\c
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+INSERT INTO global_temp_table DEFAULT VALUES;
+SELECT * FROM global_temp_table;
+SELECT NEXTVAL( 'test_sequence' );
+DROP TABLE global_temp_table;
+DROP SEQUENCE test_sequence;
diff --git a/src/test/regress/sql/session_table.sql b/src/test/regress/sql/session_table.sql
new file mode 100644
index 0000000..c6663dc
--- /dev/null
+++ b/src/test/regress/sql/session_table.sql
@@ -0,0 +1,18 @@
+create session table my_private_table(x integer primary key, y integer);
+insert into my_private_table values (generate_series(1,10000), generate_series(1,10000));
+select count(*) from my_private_table;
+\c
+select count(*) from my_private_table;
+select * from my_private_table where x=10001;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+create index on my_private_table(y);
+select * from my_private_table where x=10001;
+select * from my_private_table where y=10001;
+select count(*) from my_private_table;
+\c
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+insert into my_private_table values (generate_series(1,100000), generate_series(1,100000));
+select * from my_private_table where x=100001;
+select * from my_private_table order by y desc limit 1;
+drop table my_private_table;
Hi,
I am very interested in this feature, which will conform to the SQL standard, and I read the following exchange:
Session 1:
create global temp table gtt(x integer);
insert into gtt values (generate_series(1,100000));
Session 2:
insert into gtt values (generate_series(1,200000));
Session 1:
create index on gtt(x);
explain select * from gtt where x = 1;
Session2:
explain select * from gtt where x = 1;
??? Should we use index here?
My answer is: yes.
Just because:
- Such behavior is consistent with regular tables, so it will not
confuse users and doesn't require complex explanations.
- It is compatible with Oracle.
There is some confusion here. Sadly, it does not work like that at all in Oracle; their implementation is buggy in my opinion.
Here is a very simple test case to prove it with the latest version (January 2020):
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.6.0.0.0
-- session 1
create global temporary table gtt(x integer);
Table created.
-- session 2
insert into gtt SELECT level FROM dual CONNECT BY LEVEL <= 100000;
100000 rows created.
-- session 1
create index igtt on gtt(x);
Index created.
-- session 2
select * from gtt where x = 9;
no rows selected
select /*+ FULL(gtt) */ * from gtt where x = 9;
X
----------
9
What happened? Through dynamic sampling, the optimizer (planner) knows that the new index igtt can be efficient, so igtt is used at execution time... but it is NOT populated with session 2's rows. By default I got no rows. If I force a full scan of the table with the /*+ FULL */ hint, I get my row 9 back. Different results under different execution plans: that is a WRONG RESULT bug, the worst kind of bug.
Please don't take Oracle as a reference for your implementation. I am 100% sure you can implement and document this better than Oracle: for example, the index is populated and considered only for transactions that started after the index creation, or something like that. It would be far better than this misleading behaviour.
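To make the suggestion concrete, here is a minimal sketch of the session flow such a rule would produce, in Postgres syntax. This is purely hypothetical behaviour under my assumption, not what the current patch implements, and the index name gtt_x_idx is illustrative:

-- Session 1:
create global temp table gtt(x integer);
insert into gtt values (generate_series(1,100000));

-- Session 2: private rows inserted before any index exists
insert into gtt values (generate_series(1,200000));

-- Session 1:
create index gtt_x_idx on gtt(x);

-- Session 2, in a transaction started before the index creation:
-- the planner ignores gtt_x_idx and keeps the sequential scan,
-- so the result stays complete (hypothetical rule, see above)
explain select * from gtt where x = 1;

-- Session 2, in a new transaction: gtt_x_idx is first built over this
-- session's private rows and only then considered by the planner,
-- so an index scan also returns the correct row
explain select * from gtt where x = 1;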
Regards,
Phil
________________________________
From: Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
Sent: Monday, February 10, 2020 5:48:29 PM
To: Tomas Vondra <tomas.vondra@2ndquadrant.com>; Philippe BEAUDOIN <phb07@apra.asso.fr>
Cc: pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>; Konstantin Knizhnik <knizhnik@garret.ru>
Subject: Re: Global temporary tables
Sorry, small typo in the last patch.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company